refactor: remove eviction batching #6060
2148 tests run: 2064 passed, 0 failed, 84 skipped (full report)
Code coverage (full report)
The comment gets automatically updated with the latest test results
0b4109b at 2023-12-11T18:17:04.756Z
Force-pushed: f8ea50a to 7769901
Force-pushed: 7769901 to 835ee7a
The disk-usage-based eviction batched by timeline.
IIRC one goal behind that was to only update the IndexPart once, instead of on each individual eviction.
I guess we already lost that property when we introduced the struct Layer?
Also, I feel like the per-timeline eviction code could also benefit from the slightly advanced joinset acrobatics you're doing in the disk-usage-based eviction. Not sure if it's easy to extract that into a common function.
If you can't extract it into a common function, please add a comment explaining the acrobatics. My understanding is that it's to limit the number of pending evict_and_wait tasks?
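For readers unfamiliar with the pattern being asked about, here is a minimal sketch of keeping only a bounded number of evictions in flight. It is not the PR's code: it uses std threads and a queue of join handles as a stand-in for a tokio JoinSet, and `evict_one`, `evict_bounded`, and the "every 7th layer fails" rule are hypothetical names and behavior chosen for illustration.

```rust
use std::collections::VecDeque;
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for one layer eviction (NOT the pageserver's
/// evict_and_wait). Fails for ids divisible by 7 so both outcomes show up.
fn evict_one(layer_id: u32) -> Result<u32, String> {
    thread::sleep(Duration::from_millis(5));
    if layer_id % 7 == 0 {
        Err(format!("layer {layer_id} failed"))
    } else {
        Ok(layer_id)
    }
}

/// Runs all evictions with at most `limit` in flight at once.
/// Returns (evicted, failed) counts.
fn evict_bounded(layers: Vec<u32>, limit: usize) -> (usize, usize) {
    assert!(limit > 0, "limit must be at least 1");
    let mut in_flight: VecDeque<thread::JoinHandle<Result<u32, String>>> = VecDeque::new();
    let (mut evicted, mut failed) = (0usize, 0usize);

    // A panicked worker (outer Err) is counted as a failure too.
    let mut record = |res: thread::Result<Result<u32, String>>| match res {
        Ok(Ok(_)) => evicted += 1,
        _ => failed += 1,
    };

    for id in layers {
        // At the limit: wait for the oldest pending eviction before
        // starting a new one. A JoinSet would instead wait for whichever
        // task finishes first.
        if in_flight.len() >= limit {
            let oldest = in_flight.pop_front().expect("queue is non-empty");
            record(oldest.join());
        }
        in_flight.push_back(thread::spawn(move || evict_one(id)));
    }

    // Drain whatever is still pending.
    for handle in in_flight {
        record(handle.join());
    }
    (evicted, failed)
}
```

Calling `evict_bounded((1..=10).collect(), 3)` never has more than three evictions pending; under the assumed failure rule only layer 7 fails.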
No. The unlinking is already done at the end of compaction or GC; that was perhaps introduced in #5645.
I did not remember that, but compaction is still scheduling 1-2 updates even with the inverted l0=>l1 vs. image layer ordering going to prod soon (#5950), and gc schedules one. My motivation for this PR is recently gained absence of
One could say we should have at least one test asserting how many index part updates we do, just in case.
Noted thing:
This failure looks weird, but it is not caused by this PR.
Force-pushed: 931511a to 0b4109b
…rics (#6131)

Because of bugs evictions could hang and pause disk usage eviction task. One such bug is known and fixed #6928. Guard each layer eviction with a modest timeout deeming timeouted evictions as failures, to be conservative.

In addition, add logging and metrics recording on each eviction iteration:
- log collection completed with duration and amount of layers
- per tenant collection time is observed in a new histogram
- per tenant layer count is observed in a new histogram
- record metric for collected, selected and evicted layer counts
- log if eviction takes more than 10s
- log eviction completion with eviction duration

Additionally remove dead code for which no dead code warnings appeared in earlier PR.

Follow-up to: #6060.
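The timeout guard described in that commit message can be illustrated roughly as follows. This is a sketch under assumptions, not the commit's implementation: it uses a std thread plus `mpsc::Receiver::recv_timeout` rather than an async timeout, and `EvictionOutcome` and `evict_with_timeout` are hypothetical names.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical outcome type for illustration only.
#[derive(Debug, PartialEq)]
enum EvictionOutcome {
    Evicted,
    Failed,
    TimedOut,
}

/// Runs `evict` on a worker thread and waits at most `timeout` for it.
/// A timed-out eviction is reported separately so the caller can count it
/// as a failure, which is the conservative choice.
fn evict_with_timeout<F>(evict: F, timeout: Duration) -> EvictionOutcome
where
    F: FnOnce() -> Result<(), String> + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone after a timeout; ignore that.
        let _ = tx.send(evict());
    });
    match rx.recv_timeout(timeout) {
        Ok(Ok(())) => EvictionOutcome::Evicted,
        Ok(Err(_)) => EvictionOutcome::Failed,
        // Either the deadline passed or the worker panicked before sending;
        // this sketch treats both the same way.
        Err(_) => EvictionOutcome::TimedOut,
    }
}
```

Note that the worker thread is not cancelled on timeout in this sketch; it simply stops being waited on, so a hung eviction no longer blocks the loop that drives the others.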
We no longer have layer_removal_cs since #5108, so we no longer need batching.