-
Notifications
You must be signed in to change notification settings - Fork 6.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cut output files at compaction cursors #10227
Cut output files at compaction cursors #10227
Conversation
cf949b1
to
0ecab03
Compare
0ecab03
to
15f02dd
Compare
15f02dd
to
d7f26fe
Compare
d7f26fe
to
83ce16f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@littlepig2013 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
const InternalKey output_split_key = c->GetOutputSplitKey(); | ||
if (output_split_key.Valid()) { | ||
output_split_user_key = ExtractUserKey(output_split_key.Encode()); | ||
bounds.emplace_back(output_split_key.Encode()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confusingly, InternalKey::Encode()
does not make a copy. I think that makes it unsafe for use in bounds
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace output_split_key
in 648 with c->GetOutputSplitKey()
should be fine?
Summary: Update HISTORY.md for CompactionPri::kRoundRobin. Detailed implementation can be found in [PR10107](#10107), [PR10227](#10227), [PR10250](#10250), [PR10278](#10278), [PR10316](#10316), and [PR10341](#10341) Pull Request resolved: #10421 Reviewed By: ajkr Differential Revision: D38194070 Pulled By: littlepig2013 fbshipit-source-id: 4ce153dc0bf22cd865d09c5429955023dbc90f37
Summary:
The files behind the compaction cursor contain newer data than the files ahead of it. If a compaction writes a file that spans from before its output level’s cursor to after it, then data before the cursor will be contaminated with the old timestamp from the data after the cursor. To avoid this, we can split the output file into two – one entirely before the cursor and one entirely after the cursor. Note that, in rare cases, we DO NOT need to cut the file if it is a trivial move since the file will not be contaminated by older files. In such case, the compact cursor is not guaranteed to be the boundary of the file, but it does not hurt the round-robin selection process.
Test Plan: Add 'RoundRobinCutOutputAtCompactCursor' unit test in
db_compaction_test
Task: T122216351