Perform L0 compaction before creating new image layers #5950
Conversation
Interesting. The idea with the current code is that if you're about to create image layers anyway, it's a waste of time and unnecessary write amplification to compact the deltas, because subsequent getpage requests will make use of the image layers and won't look at the deltas anymore. But if it greatly speeds up the image layer creation, maybe we should indeed do it as a quick fix. Another approach would be to optimize the getpage requests during image layer creation. We currently reconstruct the pages with individual "retail" getpage calls, but if you know you're going to call getpage for every page, you could surely do it more efficiently. In other words, we could merge the WAL records from different L0 layers "on-the-fly", as part of image layer creation, similar to how the L0 compaction does.
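To illustrate that "merge on-the-fly" idea, here is a minimal sketch of a k-way merge over several sorted L0 delta streams. `Key`, `Lsn`, and `WalRecord` are simplified stand-ins, not the pageserver's real types, and a real implementation would stream from layer files rather than in-memory vectors:

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

// Simplified stand-ins for the pageserver's real key/LSN/record types.
type Key = u64;
type Lsn = u64;

#[derive(Debug, Clone)]
struct WalRecord {
    key: Key,
    lsn: Lsn,
    payload: Vec<u8>,
}

/// Merge several L0 delta layers (each already sorted by (key, lsn))
/// into one (key, lsn)-ordered stream, so image layer creation can
/// consume all records for a key at once instead of issuing a
/// "retail" getpage call per page.
fn merge_l0_streams(layers: Vec<Vec<WalRecord>>) -> impl Iterator<Item = WalRecord> {
    // Min-heap of (key, lsn, layer index, position within layer).
    let mut heap = BinaryHeap::new();
    for (idx, layer) in layers.iter().enumerate() {
        if let Some(rec) = layer.first() {
            heap.push(Reverse((rec.key, rec.lsn, idx, 0usize)));
        }
    }
    std::iter::from_fn(move || {
        let Reverse((_, _, layer_idx, pos)) = heap.pop()?;
        let rec = layers[layer_idx][pos].clone();
        // Advance the layer we just consumed from.
        if let Some(next) = layers[layer_idx].get(pos + 1) {
            heap.push(Reverse((next.key, next.lsn, layer_idx, pos + 1)));
        }
        Some(rec)
    })
}

fn main() {
    let l0_a = vec![
        WalRecord { key: 1, lsn: 10, payload: vec![] },
        WalRecord { key: 2, lsn: 12, payload: vec![] },
    ];
    let l0_b = vec![WalRecord { key: 1, lsn: 11, payload: vec![] }];
    for rec in merge_l0_streams(vec![l0_a, l0_b]) {
        println!("key={} lsn={}", rec.key, rec.lsn);
    }
}
```

Image layer creation could then apply all records for one key in a single walredo call as they arrive from the merged stream.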
Yep, we'd like to do this at some point in the medium-term future: #4979. In the near term, the measurements in the PR description are quite persuasive... conversely, we might need to think a bit and see if we can identify write patterns where this model becomes particularly painful.
@hlinnaka, thank you. Is it correct to assume that when a PITR horizon is set, we will perform L0 to L1 compaction anyway?
Interesting. By the way, even after L0 -> L1 compaction, creating new images is a long process: it's CPU-bound to a single core while disk usage stays low. Should compaction be reworked for multi-threading, or is there still potential for further optimization on a single core?
We probably won't want to parallelize compaction itself. If we got to the point where our I/O was all nice and efficient and the actual walredo (pure CPU) part became the bottleneck, then I would perhaps look to do a localized worker pool of walredo processes (currently we have one per tenant), and/or make pure-Rust implementations of certain common WAL operations (but I'm told this is harder than it sounds).
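For concreteness, here is a minimal sketch of what such a localized walredo worker pool could look like, assuming a simple channel-based design. `RedoRequest` and the byte-concatenation "redo" are hypothetical placeholders; the real walredo interface and process management are considerably more involved:

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

// Hypothetical request shape: which page to reconstruct, from what
// records, and where to send the resulting page image.
struct RedoRequest {
    page_id: u64,
    records: Vec<Vec<u8>>,
    reply: mpsc::Sender<Vec<u8>>,
}

/// Spawn a fixed pool of walredo workers (each would own one redo
/// process) and return a handle for submitting requests.
fn spawn_redo_pool(n_workers: usize) -> mpsc::Sender<RedoRequest> {
    let (tx, rx) = mpsc::channel::<RedoRequest>();
    let rx = Arc::new(Mutex::new(rx));
    for worker_id in 0..n_workers {
        let rx = Arc::clone(&rx);
        thread::spawn(move || loop {
            // Take one request; the lock is released before the CPU work.
            let req = match rx.lock().unwrap().recv() {
                Ok(req) => req,
                Err(_) => break, // sender dropped: pool shutdown
            };
            // Placeholder "redo": a real worker would feed the records
            // to its dedicated walredo process and read back the page.
            let page: Vec<u8> = req.records.concat();
            println!("worker {worker_id} reconstructed page {}", req.page_id);
            let _ = req.reply.send(page);
        });
    }
    tx
}

fn main() {
    let pool = spawn_redo_pool(4);
    let (reply_tx, reply_rx) = mpsc::channel();
    pool.send(RedoRequest {
        page_id: 42,
        records: vec![vec![1, 2], vec![3]],
        reply: reply_tx,
    })
    .unwrap();
    println!("got {} bytes", reply_rx.recv().unwrap().len());
    // Dropping `pool` closes the channel and lets the workers exit.
}
```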
2436 tests run: 2339 passed, 0 failed, 97 skipped (full report). Code coverage (full report).
This comment gets automatically updated with the latest test results.
dd76c94 at 2023-12-04T11:47:32.842Z
Force-pushed from e017283 to b6e5ca2.
Force-pushed from b6e5ca2 to a4a939c.
The change makes sense to me; I cannot see any reason not to merge this.
Force-pushed from a4a939c to 3772812.
Great! Glad to hear that. I fixed the flaky test. @koivunej, could you re-run CI please?
Force-pushed from dee3095 to dd76c94.
Problem
If there are too many L0 layers before compaction, the compaction process becomes slow because `timeline::get` is slow. As a result of the slowdown, the pageserver generates even more L0 layers by the next iteration, further exacerbating the poor performance.
Summary of changes
Perform L0 -> L1 compaction before creating new images. This simple change speeds up compaction and `timeline::get` by up to 5x, since `timeline::get` is faster on top of L1 layers. A minimal sketch of the reordering follows the charts below.

For the layer map:

[layer map screenshot]

Reconstruct-data latency before:

[latency chart]

and after:

[latency chart]
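A minimal sketch of the reordering, with `compact_level0` and `create_image_layers` as hypothetical stand-ins for the pageserver's real methods (the actual code is async and threads through cancellation and context arguments):

```rust
// Hypothetical, simplified driver; not the pageserver's real code.
struct Timeline;

impl Timeline {
    fn compact_level0(&self) {
        // Merge-sort the L0 deltas into partitioned L1 layers first,
        // so later page reconstructions read few, well-sorted layers.
        println!("L0 -> L1 compaction");
    }

    fn create_image_layers(&self) {
        // Materialize page images on top of the L1 layers; this is
        // where `timeline::get` is exercised for every page.
        println!("creating image layers");
    }

    /// The change in this PR, in spirit: run L0 -> L1 compaction
    /// *before* image creation instead of after it.
    fn compaction_iteration(&self) {
        self.compact_level0();
        self.create_image_layers();
    }
}

fn main() {
    Timeline.compaction_iteration();
}
```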
Checklist before requesting a review
Checklist before merging