Improve memory usage and cache overlaps for `system.jemalloc_profile_text` #99121
antonio2368 merged 4 commits into master
Two bugs in the symbolization cache caused cross-mode cache hits to produce wrong output:

1. Inconsistent frame storage order: symbolized mode cached frames in callback order (inline-first), while collapsed mode reversed frames before caching (main-first). Cross-mode cache hits silently produced wrong inline ordering or double-reversed stacks.
2. Missing `symbolize_with_inline` in cache key: a query with `symbolize_with_inline=false` cached a 1-element vector, and a later `=true` query silently reused it, dropping all inline frames.

Fix by:
- Changing the cache key to `(address, symbolize_with_inline)` so different inline settings get separate cache entries.
- Using `shared_ptr<const vector<string>>` as the value type to avoid copying the symbol vector on every cache hit.
- Introducing a single `resolveAddress` helper as the sole cache writer, always storing frames in callback order. Both symbolized and collapsed modes now reverse only at output time.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
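To make the cache fix above concrete, here is a minimal sketch, not the actual `JemallocProfileSource` code. It assumes a plain `std::map` in place of the real cache and a caller-supplied `Symbolizer` callback in place of the real symbolizer. `SymbolCache`, `Symbolizer`, and `framesMainFirst` are hypothetical names; only `resolveAddress` and `symbolize_with_inline` come from the commit message.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration; a std::map stands in for the real cache.
using Frames = std::vector<std::string>;
using FramesPtr = std::shared_ptr<const Frames>;
using CacheKey = std::pair<std::uintptr_t, bool>; // (address, symbolize_with_inline)

// Stand-in for the real symbolizer: returns frames in callback order (inline-first).
using Symbolizer = std::function<Frames(std::uintptr_t, bool)>;

class SymbolCache
{
public:
    explicit SymbolCache(Symbolizer symbolizer_) : symbolizer(std::move(symbolizer_)) {}

    // Sole cache writer: frames are always stored in callback order.
    FramesPtr resolveAddress(std::uintptr_t address, bool symbolize_with_inline)
    {
        const CacheKey key{address, symbolize_with_inline};
        if (auto it = cache.find(key); it != cache.end())
            return it->second; // cache hit shares the vector instead of copying it

        auto frames = std::make_shared<const Frames>(symbolizer(address, symbolize_with_inline));
        cache.emplace(key, frames);
        return frames;
    }

private:
    Symbolizer symbolizer;
    std::map<CacheKey, FramesPtr> cache;
};

// Callers reverse only at output time, so the cached vector is never mutated.
inline Frames framesMainFirst(const FramesPtr & frames)
{
    return Frames(frames->rbegin(), frames->rend());
}
```

Because the inline flag is part of the key, a lookup with `symbolize_with_inline=true` can never return the 1-element vector cached by an earlier `=false` lookup, and the `const` element type keeps any caller from reordering a shared entry in place.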
- Eliminate `profile_lines` vector: instead of storing all file lines in memory during the address-collection pass, re-read the file in the heap-output phase. This halves peak memory for large profiles.
- Eliminate `collapsed_lines` vector: stream collapsed output directly from the `stack_to_metric` map via an iterator, wrapped in a `CollapsedState` struct. The map is released once streaming completes.
- Use `WriteBufferFromString` for collapsed stack assembly instead of quadratic `operator+=` loop.
- Extract `parseStackAddresses` helper to deduplicate address parsing between `collectAddresses` and `generateCollapsed`.
- Sort collected addresses for deterministic symbolized output.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
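The address-parsing and sorting items above can be illustrated with a small standalone sketch. It assumes that stack lines in the profile look like `@ 0x1a2b 0x3c4d ...` and that every other line carries no addresses. The signatures below are hypothetical and are not taken from the PR; only the names `parseStackAddresses` and `collectAddresses` appear in the commit message.

```cpp
#include <cstdint>
#include <istream>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper. Assumes stack lines look like "@ 0x1a2b 0x3c4d ...";
// any line that does not start with "@" yields an empty result.
// std::stoull throws std::invalid_argument on a token that is not hexadecimal.
std::vector<std::uintptr_t> parseStackAddresses(const std::string & line)
{
    std::vector<std::uintptr_t> addresses;
    std::istringstream in(line);
    std::string token;
    if (!(in >> token) || token != "@")
        return addresses;
    while (in >> token)
        addresses.push_back(static_cast<std::uintptr_t>(std::stoull(token, nullptr, 16)));
    return addresses;
}

// Reads the profile line by line without retaining the lines themselves.
// std::set keeps the addresses unique and sorted, so later output is deterministic.
std::set<std::uintptr_t> collectAddresses(std::istream & profile)
{
    std::set<std::uintptr_t> unique;
    std::string line;
    while (std::getline(profile, line))
        for (auto address : parseStackAddresses(line))
            unique.insert(address);
    return unique;
}
```

Only the address set survives the first pass here, which is why a second pass has to re-read the input to emit the heap section, as the first item in the commit message describes.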
Workflow [PR], commit [fb39b3a5] Summary: ❌
Cursor Bugbot has reviewed your changes and found 1 potential issue.
…state

Remove the `aggregated` flag and instead make `CollapsedState` non-copyable and non-movable to prevent accidental iterator invalidation. Ensure `collapsed_state.reset()` is called on all completion paths to free memory promptly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
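A minimal sketch of what a non-copyable, non-movable state can look like follows. It is an assumption-laden illustration rather than the PR's code: the field names `stack_to_metric` and `iter` follow the review snippet, while the constructor and the use of `std::optional` as the holder are hypothetical.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical sketch: the state owns the map and an iterator into that same map.
struct CollapsedState
{
    using Map = std::map<std::string, std::uint64_t>;

    Map stack_to_metric;
    Map::const_iterator iter;

    explicit CollapsedState(Map aggregated)
        : stack_to_metric(std::move(aggregated)), iter(stack_to_metric.begin())
    {
    }

    // A copy would carry an iterator that still points into the original
    // object's map, so copying and moving are forbidden outright.
    CollapsedState(const CollapsedState &) = delete;
    CollapsedState & operator=(const CollapsedState &) = delete;
    CollapsedState(CollapsedState &&) = delete;
    CollapsedState & operator=(CollapsedState &&) = delete;
};

static_assert(!std::is_copy_constructible_v<CollapsedState>);
static_assert(!std::is_move_constructible_v<CollapsedState>);
```

Such a type can still be held in a `std::optional<CollapsedState>`: `emplace` constructs it in place, and `reset()` destroys it and releases the map as soon as streaming completes.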
azat left a comment:
Looks good
Fix a bug in `generateCollapsed` where `collapsed_state.reset()` without setting `is_finished` caused re-aggregation and duplicate output rows.
What was the problem? Wasn't it some intermediate version?
    if (state.iter == state.stack_to_metric.end())
        is_finished = true;
Do we need this second is_finished = true?
We know it's the last chunk, so why not set it to true immediately?
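For readers following this thread, here is a rough standalone sketch of the pattern under discussion. `CollapsedSource`, `generate`, and the row format are hypothetical simplifications, not the real source class. The point is that `is_finished` is set in the same call that emits the last chunk, before the state is released.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical source that emits "stack metric" rows in chunks of at most max_block_size.
class CollapsedSource
{
    using Map = std::map<std::string, std::uint64_t>;

    struct State
    {
        Map stack_to_metric;
        Map::const_iterator iter;
        explicit State(const Map & m) : stack_to_metric(m), iter(stack_to_metric.begin()) {}
    };

public:
    CollapsedSource(Map input_, std::size_t max_block_size_)
        : input(std::move(input_)), max_block_size(max_block_size_)
    {
    }

    // Returns an empty chunk once the source is finished.
    std::vector<std::string> generate()
    {
        if (is_finished)
            return {};

        if (!state)
            state = std::make_unique<State>(input); // stands in for the one-time aggregation

        std::vector<std::string> chunk;
        while (state->iter != state->stack_to_metric.end() && chunk.size() < max_block_size)
        {
            chunk.push_back(state->iter->first + " " + std::to_string(state->iter->second));
            ++state->iter;
        }

        if (state->iter == state->stack_to_metric.end())
        {
            // Mark completion together with the reset. Resetting alone would make
            // the next call rebuild the state and emit every row a second time.
            is_finished = true;
            state.reset();
        }
        return chunk;
    }

private:
    Map input;
    std::size_t max_block_size;
    std::unique_ptr<State> state;
    bool is_finished = false;
};
```

With a block size of 1 and three aggregated stacks, this sketch returns three one-row chunks and then only empty chunks, which is the behaviour the `count() = uniqExact(line)` check in the test plan is meant to guard.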
LLVM Coverage Report
PR changed-lines coverage: 13.50% (22/163)
Cherry pick #99121 to 26.2: Improve memory usage and cache overlaps for `system.jemalloc_profile_text`
Backport #99121 to 26.2: Improve memory usage and cache overlaps for `system.jemalloc_profile_text`
Summary
- Reduce memory usage in `JemallocProfileSource` by storing cached symbol vectors as `shared_ptr<const vector<string>>` instead of copying on every cache hit.
- Include the `symbolize_with_inline` flag in the cache key so that queries with different inline settings don't silently reuse each other's results.
- Add a `resolveAddress` helper to deduplicate symbolization code between `generateSymbolized` and `generateCollapsed`.
- Add a `parseStackAddresses` helper to deduplicate address parsing logic.

Test plan
- Run `03925_jemalloc_profile_system_table.sh` to verify no duplicate lines in collapsed format using `count() = uniqExact(line)` with `max_block_size = 1`.

Changelog category (leave one):

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Reduce memory usage and fix potential duplicate output in `system.jemalloc_profile_text` collapsed format.

🤖 Generated with Claude Code
Note
Medium Risk
Touches profiling output generation and caching/streaming state; mistakes could change symbolization results or produce incomplete/duplicated profile text under chunked reads.
Overview
Reduces memory usage and improves determinism in `system.jemalloc_profile_text` processing by refactoring `JemallocProfileSource` symbolization and parsing logic.

Symbolization now uses an LRU keyed by `(address, symbolize_with_inline)` and stores cached frames as `shared_ptr` via a new `resolveAddress()` helper, avoiding vector copies and cross-setting cache pollution. Symbolized mode no longer buffers heap lines in memory; it re-reads the profile file when streaming the heap section, and address collection is deduplicated via `parseStackAddresses()` with sorted output.

Collapsed mode now aggregates once into a persisted `CollapsedState` and streams directly from the map, fixing a bug that could re-aggregate and emit duplicate rows; the test asserts uniqueness with `max_block_size=1`.

Written by Cursor Bugbot for commit fb39b3a.