tp: scope gpu_counter custom-groups cache to per-sequence state #5742

Merged: dreveman merged 1 commit into main from dev/reveman/gpu-counter-group-fix on May 6, 2026

@dreveman
Collaborator

@dreveman dreveman commented May 6, 2026

gpu_custom_groups_inserted_ in GpuEventParser was a global FlatHashMap<uint64_t, bool> keyed solely by counter_descriptor_iid, used to dedupe GpuCounterBlock insertions for the interned descriptor path. Interned ids are scoped to a packet sequence's incremental-state interval, not global, so two producers that legitimately share an iid on different packet sequences collide: whichever arrives first flips the cache, and the second producer's InsertCustomCounterGroups call is silently short-circuited. If the first producer's descriptor has no counter_groups, the second producer's blocks never reach gpu_counter_group_table and the UI renders those counters flat instead of grouped.
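
The collision described above can be reproduced in a few lines. This is a simplified sketch, not the trace processor code: `CounterDescriptor`, `GlobalCacheParser`, and the `group_table` vector are hypothetical stand-ins, and `std::unordered_map` is used in place of `FlatHashMap`.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an interned GPU counter descriptor.
struct CounterDescriptor {
  std::vector<std::string> counter_groups;
};

// Simplified model of the old behaviour: one dedupe cache shared by every
// packet sequence, keyed by the interned id alone.
struct GlobalCacheParser {
  std::unordered_map<uint64_t, bool> inserted;
  // Stand-in for gpu_counter_group_table.
  std::vector<std::string> group_table;

  void InsertCustomCounterGroups(uint64_t iid, const CounterDescriptor& desc) {
    // The first caller for a given iid wins, regardless of which sequence
    // the iid belongs to; later callers are silently skipped.
    if (inserted.count(iid) > 0) {
      return;
    }
    inserted[iid] = true;
    for (const std::string& group : desc.counter_groups) {
      group_table.push_back(group);
    }
  }
};
```

If a first producer calls this with iid 1 and no groups, and a second producer on another sequence then calls it with iid 1 and one group, the table stays empty, which is the flat-rendering symptom.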

Move the cache into a new GpuCounterSequenceState : CustomState on PacketSequenceStateGeneration, alongside the interned-data table the descriptor was looked up from. The cache key stays a plain counter_descriptor_iid; correctness comes from CustomState being intrinsically scoped per sequence and per incremental-state interval, which matches where the iids are valid. Same pattern as AndroidKernelWakelockState.
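
A minimal sketch of the per-sequence scoping, under the assumption that each packet sequence generation owns its own dedupe set. `InternedDescriptor`, `SequenceState`, `SequenceGeneration`, and `SequenceScopedParser` are hypothetical names for illustration; the real change uses GpuCounterSequenceState on PacketSequenceStateGeneration, whose interfaces are not reproduced here.

```cpp
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an interned GPU counter descriptor.
struct InternedDescriptor {
  std::vector<std::string> counter_groups;
};

// Hypothetical per-sequence state holding the dedupe cache.
struct SequenceState {
  std::unordered_set<uint64_t> groups_inserted;
};

// Stand-in for one incremental-state interval of one packet sequence.
// A new interval would get a fresh SequenceGeneration, so the cache is
// dropped together with the interned ids it refers to.
struct SequenceGeneration {
  SequenceState gpu_counter_state;
};

struct SequenceScopedParser {
  // Stand-in for gpu_counter_group_table.
  std::vector<std::string> group_table;

  void InsertCustomCounterGroups(SequenceGeneration& generation,
                                 uint64_t iid,
                                 const InternedDescriptor& desc) {
    // insert() reports whether the iid was new for this generation only.
    if (!generation.gpu_counter_state.groups_inserted.insert(iid).second) {
      return;
    }
    for (const std::string& group : desc.counter_groups) {
      group_table.push_back(group);
    }
  }
};
```

With two generations, the same iid on each is handled independently, while a repeated call on the same generation is still deduplicated.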

@dreveman dreveman requested a review from LalitMaganti May 6, 2026 04:11
@dreveman dreveman requested a review from a team as a code owner May 6, 2026 04:11
@github-actions

github-actions Bot commented May 6, 2026

🎨 Perfetto UI Builds

@dreveman dreveman merged commit afc643c into main May 6, 2026
23 checks passed
@dreveman dreveman deleted the dev/reveman/gpu-counter-group-fix branch May 6, 2026 12:36
LalitMaganti added a commit that referenced this pull request May 6, 2026
tp: scope gpu_counter custom-groups cache to per-sequence state (#5742)
tp: extract IncrementalState to fix CustomState UAF (#5593)