Consolidate file cutting logic in compaction loop #1832
Conversation
@ajkr updated the pull request - view changes |
@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
The drawback is that we no longer call ShouldStopBefore() on the very first key. It shouldn't return true there, but it does update some internal counters. Edit: this is no longer the case due to my update below. |
@ajkr updated the pull request - view changes - changes since last import |
5b2c0e6 to 37c5262
@ajkr updated the pull request - view changes - changes since last import |
ping |
Come on, review it already |
You guys are too much, haha |
I started reviewing it but didn't finish.
db/compaction_job.cc
Outdated
// ShouldStopBefore() maintains state based on keys processed so far. The
// compaction loop always calls it on the "next" key, thus won't tell it the
// first key. So we do that here.
bool should_stop = sub_compact->ShouldStopBefore(
I'm not sure this function can be safely called if sub_compact->compaction->output_level() == 0
I didn't find the safety issue, but it's unnecessary to call it in the case you described, so I'll update the conditional.
db/compaction_job.cc
Outdated
// first key. So we do that here.
bool should_stop = sub_compact->ShouldStopBefore(
    c_iter->key(), sub_compact->current_output_file_size);
assert(!should_stop);
Are you sure about the assert?
I think so, because ShouldStopBefore() checks the amount of overlap with grandparent level, excluding the current key. In this case, the current key is the first key, so there'll be no overlap.
Even if ShouldStopBefore() changes in the future to do more checks, I still think it'd be a bug for it to try ending a file before (i.e., excluding) the first key.
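The invariant being argued here can be sketched with a toy model. This is not RocksDB's actual implementation; `ToySubcompaction`, `AddKey`, and the byte counts are illustrative stand-ins. The point is only that the decision uses overlap accumulated from keys already added, excluding the key passed in, so the first key can never trigger a cut:

```cpp
#include <cstdint>
#include <string>

// Toy model (not RocksDB's real code) of the invariant under discussion:
// ShouldStopBefore() decides from grandparent overlap accumulated by keys
// *already added* to the current output file, excluding the key passed in.
// With nothing added yet, accumulated overlap is zero, so a call on the
// first key cannot ask to cut the file.
struct ToySubcompaction {
  uint64_t overlapped_bytes = 0;      // grandparent overlap of keys added so far
  uint64_t max_compaction_bytes = 0;  // threshold for cutting an output file

  bool ShouldStopBefore(const std::string& /*key*/) const {
    // The decision uses only previously accumulated overlap; the current
    // key contributes nothing until it is actually added.
    return overlapped_bytes > max_compaction_bytes;
  }

  void AddKey(uint64_t grandparent_overlap) {
    overlapped_bytes += grandparent_overlap;
  }
};
```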
It may not be a bug. It's just we didn't expect it to be called before the first key.
Oh, would you rather ignore the return value? Because I think there's no reasonable way for us to handle terminating files before keys are added, while there are still keys remaining in the compaction input.
@ajkr yes I think it feels better to ignore this output. I can't say we won't hit any issue either, but it feels a better workflow to me.
I also don't get why it is simpler after the refactoring. It feels more complicated to me. |
The compaction loop used to have file-terminating logic both at the top and the bottom. The one at the top was exercised only in rare situations, like when max_compaction_bytes is small and grandparent overlap is high. So if someone wants to change file-terminating logic (like for deleterange or compression dictionary), they have to remember to change it in two places, one of which is easy to miss. So after making this mistake in both deleterange and compression dictionary, I did this refactoring to make such mistakes impossible. |
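The loop structure being described can be sketched as follows. This is a simplified, hypothetical model, not the code in db/compaction_job.cc: `RunCompactionLoop`, `ToyOutput`, and the keys-per-file threshold (standing in for the real ShouldStopBefore() and file-size checks) are all illustrative. The structural point is that the iterator advances first, then a single consolidated cut check runs at the bottom of the loop, rather than duplicating the cut logic at both the top and the bottom:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy output file: just the keys it would contain.
struct ToyOutput {
  std::vector<std::string> keys;
};

// Processes `input` keys in order, cutting a new output file after every
// `max_keys_per_file` keys (a stand-in for the real file-cutting checks).
std::vector<ToyOutput> RunCompactionLoop(const std::vector<std::string>& input,
                                         size_t max_keys_per_file) {
  std::vector<ToyOutput> files(1);
  size_t i = 0;
  while (i < input.size()) {
    files.back().keys.push_back(input[i]);  // add current key to the output
    ++i;                                    // the c_iter->Next() equivalent
    // Single consolidated cut point: decide whether to finish the current
    // output file before the next key is added.
    if (i < input.size() && files.back().keys.size() >= max_keys_per_file) {
      files.emplace_back();  // the FinishCompactionOutputFile() equivalent
    }
  }
  return files;
}
```

With one cut point, a future change to the cut condition (e.g. for DeleteRange or the compression dictionary) only needs to be made in one place.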
@ajkr I must have missed something. I saw you just moved one FinishCompactionOutputFile() to another place. So the number of FinishCompactionOutputFile() calls didn't change. |
db/compaction_job.cc
Outdated
CompactionIterationStats range_del_out_stats;
status = FinishCompactionOutputFile(input->status(), sub_compact,
                                    range_del_agg.get(),
                                    &range_del_out_stats, &key);
I'm not sure I understand correctly.
Here we predict the grandparent overlap for the file as it would be after adding the key, before we actually add it. After removing this check, we only check after inserting the key. Is that correct?
Yes, the existing behavior is as you described. But it should be the same afterwards since I moved the "c_iter->Next();" just above the ShouldStopBefore(), so it is still doing the check before the next key has been added.
@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
@ajkr updated the pull request - view changes - changes since last import |
It's hard to think of all the possible problems. If we need to add unit tests to verify it, please add them. Also, maybe observe the behavior of a normal benchmark run to make sure the behavior is the same.
It's hard to write unit tests because the goal is no functional change. Thanks for reviewing; I will run a few benchmarks. |
@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
It was really annoying to have two places (top and bottom of compaction loop) where we cut output files. I had bugs in both DeleteRange and dictionary compression due to updating only one of the two. This diff consolidates the file-cutting logic to the bottom of the compaction loop.
Keep in mind that my goal with input_status is to be consistent with the past behavior, even though I'm not sure it's ideal.
Test Plan:
Commands:
bpl=10485760; overlap=10; mcz=2; del=300000000; levels=2; ctrig=10000000; \
delay=10000000; stop=10000000; wbn=2; mbc=20; mb=1073741824; wbs=640000000; \
dds=1; sync=0; r=10000000; t=1; vs=800; bs=65536; cs=1048576; of=500000; si=1000000
./db_bench --benchmarks=fillrandom --disable_seek_compaction=1 --mmap_read=0 \
  --num=$r --threads=$t --value_size=$vs --block_size=$bs --cache_size=$cs \
  --bloom_bits=10 --cache_numshardbits=4 --open_files=$of --verify_checksum=1 \
  --db=/data/del_range_bench --sync=$sync --disable_wal=1 --compression_type=zlib \
  --compression_ratio=0.5 --disable_data_sync=$dds --write_buffer_size=$wbs \
  --target_file_size_base=$mb --max_write_buffer_number=$wbn \
  --max_background_compactions=$mbc --level0_file_num_compaction_trigger=$ctrig \
  --level0_slowdown_writes_trigger=$delay --level0_stop_writes_trigger=$stop \
  --num_levels=$levels --delete_obsolete_files_period_micros=$del \
  --min_level_to_compress=$mcz --max_bytes_for_level_base=$bpl \
  --memtablerep=vector --use_existing_db=0 --disable_auto_compactions=1
bpl=10485760; overlap=10; mcz=2; del=300000000; levels=2; ctrig=10000000; \
delay=10000000; stop=10000000; wbn=30; mbc=20; mb=1073741824; wbs=268435456; \
dds=1; sync=0; r=10000000; t=1; vs=800; bs=65536; cs=1048576; of=500000; si=1000000
/usr/bin/time ./db_bench --benchmarks=compact --disable_seek_compaction=1 --mmap_read=0 \
  --num=$r --threads=$t --value_size=$vs --block_size=$bs --cache_size=$cs \
  --bloom_bits=10 --cache_numshardbits=4 --open_files=$of --verify_checksum=1 \
  --db=/data/del_range_bench --sync=$sync --disable_wal=1 --compression_type=zlib \
  --compression_ratio=0.5 --disable_data_sync=$dds --write_buffer_size=$wbs \
  --target_file_size_base=$mb --max_write_buffer_number=$wbn \
  --max_background_compactions=$mbc --level0_file_num_compaction_trigger=$ctrig \
  --level0_slowdown_writes_trigger=$delay --level0_stop_writes_trigger=$stop \
  --num_levels=$levels --delete_obsolete_files_period_micros=$del \
  --min_level_to_compress=$mcz --max_bytes_for_level_base=$bpl \
  --memtablerep=vector --use_existing_db=1 --disable_auto_compactions=1
Results before this diff:
6.60user 6.64system 0:27.43elapsed 48%CPU
Results after this diff:
6.48user 6.95system 0:21.99elapsed 61%CPU