-
Notifications
You must be signed in to change notification settings - Fork 6.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add basic kRoundRobin compaction policy #10107
Conversation
b42b229
to
16481d9
Compare
05f177f
to
6e6bc24
Compare
a6adabe
to
62d20b5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mind reminding me why we thought the compaction cursor was updated too early? Was there some test failure (perhaps triggered by an additional code change)?
Oh, there was no testing failure on that. The motivation is that we want to update the cursor only when the compaction succeeds (suppose that it selects the first file), while the current implementation just blindly updates (advances) the cursor no matter whether the following compaction succeeds or not. If we want to implement this, we may need to update the cursor here (after Run() status is ok) rocksdb/db/db_impl/db_impl_compaction_flush.cc Line 3406 in ff32346
|
62d20b5
to
a8f3379
Compare
8c28b0c
to
3ccb840
Compare
55e09d1
to
5911b87
Compare
5911b87
to
ae7f11e
Compare
@@ -448,6 +448,10 @@ bool LevelCompactionBuilder::PickFileToCompact() { | |||
// do not pick a file to compact if it is being compacted | |||
// from n-1 level. | |||
if (f->being_compacted) { | |||
if (ioptions_.compaction_pri == kRoundRobin) { | |||
// Advance the cursor if a file is being compacted | |||
vstorage_->UpdateCompactCursor(start_level_, cmp_idx + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few design questions:
-
I think we should only update cursor when the compaction is done;
-
for one level, only one compaction (the one starts from the cursor) is able to update the cursor when it's done, which can be blocked by:
- there's already another compaction starts from that level, or
- a high level compaction, for example, in your "compacting" test case, L1 file
[100, 240]
is compacting L2 which overlaps with current cursor;
In both cases, the "next" file may likely also be overlapping with the ongoing compaction, in the following example, ifT1
is compacting to the next level,T2
is the next file, which is unable to compact, because the next level file is compacting:
So we could continue select the next next file T3, until we find one. Or another strategy is just fall back tokMinOverlappingRatio
, because anyway these compaction won't update cursor (hopefully these compactions are only happening because of high compaction pressure)
- ideally, we should use subcompaction to increase the compaction parallelism instead of having multiple compactions, but deciding how many input files (subcompaction number) is challenge, as if we choose too many it may delay other levels compaction. We don't need to handle it for now, it will be the next step for the project.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should only update cursor when the compaction is done;
I may have suggested the opposite to Zichen. Do we know the compaction reason at compaction done time? We only want to advance for CompactionReason::kLevelMaxLevelSize
compactions not all compactions. This is known when we pick the compaction and it sounds like we don't plan to use the cursor again until after that compaction finishes regardless of where the update happens. The only difference then is whether the advance happens in case compaction fails. I am neutral on this matter so prefer whatever's simpler to implement. Note that LevelDB actually considered it a benefit to advance the cursor despite failure -- https://github.com/google/leveldb/blob/d019e3605f222ebc5a3a2484a2cb29db537551dd/db/version_set.cc#L1440-L1445.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is known when we pick the compaction and it sounds like we don't plan to use the cursor again until after that compaction finishes regardless of where the update happens
Actually I'd take that back and guess it's OK to let picking try the next file. It'll almost certainly conflict due to output-level files in common with the concurrent compaction. But it doesn't seem bad to let it be picked if it somehow works out.
d8b5feb
to
df23188
Compare
bac58f8
to
59b8fb5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the progress. Seems we have a good grasp of the approach now. Had some suggestions for finishing this.
I think we should call SetCompactCursor()
when constructing the VersionEdit
for installing a compaction result. I found three places where that happens:
InstallCompactionResults()
BackgroundCompaction()
'sc->deletion_compaction()
branchBackgroundCompaction()
'sc->IsTrivialMove()
branch
The intuition for updating at this layer is we have the relevant info like knowledge of compaction input level (we only want to advance the input level's cursor), and compaction reason may be useful too. In the initial implementation we could always advance input-level cursor when reason is CompactionReason::kLevelMaxLevelSize
, input level is one or higher, and compaction priority is kRoundRobin
.
// To ensure every files is selcted in a round-robin manner, we cannot | ||
// skip the current file. So we return false and wait for the next time | ||
// we can pick this file to compact | ||
if (ioptions_.compaction_pri == kRoundRobin) { | ||
return false; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we intend to remove these early returns eventually? If so perhaps add TODOs.
For this condition, |
Oh ok. Yeah it does not apply so is fine to omit. |
59b8fb5
to
fa20019
Compare
fa20019
to
ad5d855
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, few more things.
ad5d855
to
155aee6
Compare
155aee6
to
7de1c28
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!!
7de1c28
to
832d6aa
Compare
97f2606
to
8e8d1bd
Compare
8e8d1bd
to
9fcc1c5
Compare
@littlepig2013 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, feel free to land.
Summary: Update HISTORY.md for CompactionPri::kRoundRobin. Detailed implementation can be found in [PR10107](#10107), [PR10227](#10227), [PR10250](#10250), [PR10278](#10278), [PR10316](#10316), and [PR10341](#10341) Pull Request resolved: #10421 Reviewed By: ajkr Differential Revision: D38194070 Pulled By: littlepig2013 fbshipit-source-id: 4ce153dc0bf22cd865d09c5429955023dbc90f37
Summary:
Add
kRoundRobin
as a compaction priority. The implementation is as follows.vector<InternalKey> compact_cursor_
intoVersionStorageInfo
where each element (InternalKey
) incompact_cursor_
represents a cursor. In round-robin compaction policy, we just need to select the first file (assuming files are sorted) and also has the smallest InternalKey larger than/equal to the cursor. After a file is chosen, we create a newFsize
vector which puts the selected file is placed at the first position intemp
, the next cursor is then updated as the smallest InternalKey in successor of the selected file (the above logic is implemented inSortFileByRoundRobin
).InstallCompactionResults()
, we choose the next cursor for the input level and save it toedit
. When callingLogAndApply
, we save the next cursor with its level into some local variable and finally apply the change tovstorage
inSaveTo
function.EncodeTo
so that they can be reconstructed when reopening. An empty cursor will not be encoded to MANIFESTTest Plan: add unit test (
CompactionPriRoundRobin
) incompaction_picker_test
, addkRoundRobin
priority inCompactionPriTest
fromdb_compaction_test
, and addPersistRoundRobinCompactCursor
indb_compaction_test