
Reduce overhead of SortFileByOverlappingRatio() #10161

Closed
wants to merge 3 commits

Conversation

siying
Contributor

@siying siying commented Jun 14, 2022

Summary: Currently SortFileByOverlappingRatio() is O(n log n). That is usually fine, but when there are a lot of files in an LSM-tree, SortFileByOverlappingRatio() can take a non-trivial amount of time. The problem is severe when the user is loading keys in sorted order, where compaction consists only of trivial moves and this operation becomes the bottleneck that limits total throughput. This commit makes SortFileByOverlappingRatio() find only the top 50 files based on score. 50 files are usually enough for the parallel compactions needed at the level, and if they are not enough, we fall back to random order, which should be acceptable.

Test Plan:
Ran a fillseq workload that generates a lot of files and observed improved throughput (although stalls are not yet eliminated). The command run:

TEST_TMPDIR=/dev/shm/ ./db_bench_sort --benchmarks=fillseq --compression_type=lz4 --write_buffer_size=5000000 --num=100000000 --value_size=1000

The throughput improved by 11%.
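
To make the idea concrete, here is a minimal sketch of replacing the full sort with a top-k selection, assuming a simple per-file score (FileAndScore, OrderTopCandidates, and the comparator are illustrative names, not the actual RocksDB code):

#include <algorithm>
#include <cstddef>
#include <vector>

// Illustrative only: FileAndScore and OrderTopCandidates are hypothetical
// names, not RocksDB identifiers.
struct FileAndScore {
  std::size_t file_index;
  double score;  // e.g. overlapping ratio; lower means a better candidate
};

// Order only the best top_n candidates instead of sorting all n entries.
// Average cost drops from O(n log n) to roughly O(n + k log k); files beyond
// the first top_n stay in arbitrary order, which is the "fall back to random"
// case described above.
void OrderTopCandidates(std::vector<FileAndScore>* temp, std::size_t top_n) {
  const std::size_t num_to_sort = std::min(top_n, temp->size());
  auto comp_func = [](const FileAndScore& a, const FileAndScore& b) {
    return a.score < b.score;
  };
  std::nth_element(temp->begin(), temp->begin() + num_to_sort, temp->end(),
                   comp_func);
  std::sort(temp->begin(), temp->begin() + num_to_sort, comp_func);
}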

@facebook-github-bot
Contributor

@siying has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Comment on lines 3234 to 3236
// We don't pick last level files based on compaction priority,
// so we don't need to do the sorting.
if (level != num_levels() - 1) {
Contributor

Contributor Author

Ahh. Got it.

// need to pick many files, so we limit files for this partial order.
// In case we use up all the sorted files, we essentially pick files
// in random order, but it should be rare.
const size_t kTotalSortedElements = 8;
Contributor

For kByCompensatedSize, it sorts the top n (50) files:
https://github.com/facebook/rocksdb/blob/48ce44240c398850a752d13d1a27c9849ae6fad2/db/version_set.cc#L3229-L3233

Can we reuse that and apply it to both kByCompensatedSize and kMinOverlappingRatio, or even to all the other strategies?
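
One possible shape for that reuse, sketched under the assumption of a generic element type and a strategy-specific comparator (SortTopNFiles is a hypothetical helper, not existing RocksDB code):

#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical shared helper: each compaction_pri strategy passes its own
// comparator, so the "order only the top n files" logic lives in one place.
template <typename T, typename Comparator>
void SortTopNFiles(std::vector<T>* temp, std::size_t top_n, Comparator cmp) {
  const std::size_t num_to_sort = std::min(top_n, temp->size());
  // Orders the best num_to_sort entries; the tail is left unordered.
  std::partial_sort(temp->begin(), temp->begin() + num_to_sort, temp->end(),
                    cmp);
}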

Comment on lines 3202 to 3204
std::nth_element(temp->begin(), temp->begin() + num_to_sort, temp->end(),
comp_func);
std::sort(temp->begin(), temp->begin() + num_to_sort, comp_func);
Contributor

I think std::partial_sort() would be better.
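
For comparison, a small sketch of the two forms with a generic element type and comparator (not the RocksDB code itself); both leave the best num_to_sort elements sorted at the front and the remaining elements unordered:

#include <algorithm>
#include <cstddef>
#include <vector>

// Two-step form from the diff: nth_element places the cutoff element, then
// std::sort fully orders the prefix in front of it.
template <typename T, typename Comp>
void TwoStepForm(std::vector<T>* temp, std::size_t num_to_sort,
                 Comp comp_func) {
  std::nth_element(temp->begin(), temp->begin() + num_to_sort, temp->end(),
                   comp_func);
  std::sort(temp->begin(), temp->begin() + num_to_sort, comp_func);
}

// Single-call form suggested above: std::partial_sort reaches the same end
// state for the prefix in one call.
template <typename T, typename Comp>
void PartialSortForm(std::vector<T>* temp, std::size_t num_to_sort,
                     Comp comp_func) {
  std::partial_sort(temp->begin(), temp->begin() + num_to_sort, temp->end(),
                    comp_func);
}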

Summary: Currently SortFileByOverlappingRatio() is O(n log n). That is usually fine, but when there are a lot of files in an LSM-tree, SortFileByOverlappingRatio() can take a non-trivial amount of time. The problem is severe when the user is loading keys in sorted order, where compaction consists only of trivial moves and this operation becomes the bottleneck that limits total throughput. This commit does two things:
(1) SortFileByOverlappingRatio() now finds only the top 8 files based on score. 8 files are usually enough for the parallel compactions needed at the level, and if they are not enough, we fall back to random order, which should be acceptable.
(2) Don't sort files in the last level.

Test Plan:
Ran a fillseq workload that generates a lot of files and observed improved throughput (although stalls are not yet eliminated). The command run:

TEST_TMPDIR=/dev/shm/ ./db_bench_sort --benchmarks=fillseq --compression_type=lz4 --write_buffer_size=5000000 --num=100000000 --value_size=1000

The throughput improved by 11%.
@facebook-github-bot
Contributor

@siying has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Contributor

@siying has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@siying has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Contributor

@siying has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Contributor

@jay-zhuang jay-zhuang left a comment

LGTM
