admission: garbage collect resource manager groups less frequently#170630
Merged
Conversation
Contributor
|
😎 Merged successfully - details. |
Member
dc8fda5 to
8e120aa
Compare
tbg
approved these changes
May 26, 2026
gcGroupsResetUsedAndUpdateEstimators previously dropped any non-built-in groupInfo after one second of inactivity. That could cause groupInfoPool churn for tenants whose work arrived in bursts. It also left holes in per-tenant metric series: releaseGroupInfo Unlinks the group's per-group AggCounter children, so at typical 10-30s scrape intervals their labeled time series could disappear before ever being observed. Here, we decouple GC from the used reset. Each groupInfo now carries an idleSince that the GC loop refreshes on activity. A group is dropped only once (now - idleSince) >= groupGCIdleThreshold, currently one minute. Built-ins remain exempt. The fair-share used reset still happens every tick. Epic: none Release note: None Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
8e120aa to
a9fcaf9
Compare
Collaborator
Author
|
/trunk merge |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
gcGroupsResetUsedAndUpdateEstimatorspreviously dropped any non-built-ingroupInfoafter one second of inactivity. That could causegroupInfoPoolchurn for tenants whose work arrived in bursts, and — more importantly — left holes in per-tenant metric series:releaseGroupInfoUnlinks the group's per-groupAggCounterchildren, so at typical 10–30s scrape intervals their labeled time series could disappear before ever being observed.This change decouples GC from the used reset. Each
groupInfonow carries anidleSincethat the GC loop refreshes on activity. A group is dropped only once(now - idleSince) >= groupGCIdleThreshold, currently one minute. Built-ins remain exempt. The fair-shareusedreset still happens every tick.TestGCKeepsMetricChildVisibleAcrossScrapesexercises the scenario directly: admits work for a tenant, simulates several sub-threshold GC ticks while scraping the per-group parent, and confirms the labeled child stays visible until the threshold elapses.Epic: none