Reduce overhead of SortFileByOverlappingRatio() #10161
Conversation
@siying has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
db/version_set.cc (Outdated)
// We don't pick last level files based on compaction priority,
// so we don't need to do the sorting.
if (level != num_levels() - 1) {
The level should never be bottommost here:
https://github.com/facebook/rocksdb/blob/48ce44240c398850a752d13d1a27c9849ae6fad2/db/version_set.cc#L3216-L3217
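For context, a paraphrased sketch of what that call site implies (not the verbatim RocksDB source; num_levels() is stubbed here): the loop bound already excludes the bottommost level, so the guard above is always satisfied, making it redundant.

#include <cstdio>

int num_levels() { return 7; }  // stand-in for VersionStorageInfo::num_levels()

int main() {
  // The bound stops one short of the last level, so a bottommost level
  // never reaches the per-level sorting in the first place.
  for (int level = 0; level < num_levels() - 1; level++) {
    std::printf("sorting files for level %d\n", level);  // never prints level 6
  }
  return 0;
}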
Ahh. Got it.
db/version_set.cc (Outdated)
// need to pick many files, so we limit files for this partial order.
// In case we use up all the sorted files, we essentially pick files
// in random order, but it should be rare.
const size_t kTotalSortedElements = 8;
For kByCompensatedSize, it sorts the top n (50) files:
https://github.com/facebook/rocksdb/blob/48ce44240c398850a752d13d1a27c9849ae6fad2/db/version_set.cc#L3229-L3233
Can we reuse that and apply it to both kByCompensatedSize and kMinOverlappingRatio, or even to all the other strategies?
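A minimal sketch of what such a shared helper could look like (SortTopNOnly and its parameters are hypothetical names, not existing RocksDB API):

#include <algorithm>
#include <vector>

// Hypothetical shared helper: order only the first n entries under cmp
// and leave the rest unsorted. kByCompensatedSize could pass n = 50 and
// kMinOverlappingRatio n = 8, or the strategies could share one cutoff.
template <typename T, typename Compare>
void SortTopNOnly(std::vector<T>* files, size_t n, Compare cmp) {
  const size_t num_to_sort = std::min(files->size(), n);
  std::partial_sort(files->begin(), files->begin() + num_to_sort,
                    files->end(), cmp);
}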
db/version_set.cc (Outdated)
std::nth_element(temp->begin(), temp->begin() + num_to_sort, temp->end(),
                 comp_func);
std::sort(temp->begin(), temp->begin() + num_to_sort, comp_func);
I think std::partial_sort() would be better.
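For illustration, a standalone comparison of the two approaches on plain ints (the values and comp_func here are made up): std::partial_sort() does in one call what the nth_element()-then-sort() pair in the diff does in two.

#include <algorithm>
#include <cstdio>
#include <vector>

int main() {
  std::vector<int> temp = {42, 7, 19, 3, 88, 54, 1, 26, 70, 11};
  const size_t num_to_sort = 4;
  auto comp_func = [](int a, int b) { return a < b; };

  // Two-step version from the diff: nth_element() partitions so the
  // smallest num_to_sort elements come first, then sort() orders them:
  //   std::nth_element(temp.begin(), temp.begin() + num_to_sort, temp.end(), comp_func);
  //   std::sort(temp.begin(), temp.begin() + num_to_sort, comp_func);

  // Single-call version: sorts the first num_to_sort positions and
  // leaves the remaining elements in unspecified order.
  std::partial_sort(temp.begin(), temp.begin() + num_to_sort, temp.end(),
                    comp_func);

  for (size_t i = 0; i < num_to_sort; ++i) {
    std::printf("%d ", temp[i]);  // prints: 1 3 7 11
  }
  std::printf("\n");
  return 0;
}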
Summary: Currently SortFileByOverlappingRatio() is O(n log n). That is usually fine, but when there are a lot of files in an LSM-tree, SortFileByOverlappingRatio() can take a non-trivial amount of time. The problem is severe when the user is loading keys in sorted order, where every compaction is a trivial move, so this operation becomes the bottleneck and limits total throughput. This commit does two things: (1) changes SortFileByOverlappingRatio() to find only the top 8 files based on score; 8 files are usually enough for the parallel compactions needed for the level, and if they are not, we fall back to random order, which should be acceptable; (2) skips sorting files in the last level.

Test Plan: Run a fillseq that generates a lot of files and observe that throughput improves (although stalls are not yet eliminated). The command run:

TEST_TMPDIR=/dev/shm/ ./db_bench_sort --benchmarks=fillseq --compression_type=lz4 --write_buffer_size=5000000 --num=100000000 --value_size=1000

The throughput improved by 11%.
Force-pushed from 48ce442 to 25330e2.
@siying has updated the pull request. You must reimport the pull request before landing.
@siying has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@siying has updated the pull request. You must reimport the pull request before landing.
@siying has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
LGTM
Summary: Currently SortFileByOverlappingRatio() is O(n log n). That is usually fine, but when there are a lot of files in an LSM-tree, SortFileByOverlappingRatio() can take a non-trivial amount of time. The problem is severe when the user is loading keys in sorted order, where every compaction is a trivial move, so this operation becomes the bottleneck and limits total throughput. This commit makes SortFileByOverlappingRatio() find only the top 50 files based on score; 50 files are usually enough for the parallel compactions needed for the level, and if they are not, we fall back to random order, which should be acceptable.

Test Plan:
Run a fillseq that generates a lot of files and observe that throughput improves (although stalls are not yet eliminated). The command run:

TEST_TMPDIR=/dev/shm/ ./db_bench_sort --benchmarks=fillseq --compression_type=lz4 --write_buffer_size=5000000 --num=100000000 --value_size=1000

The throughput improved by 11%.