
Add an option to trigger flush when the number of range deletions reach a threshold #11358

Closed
vrdhn wants to merge 16 commits

Conversation

Contributor

@vrdhn vrdhn commented Apr 7, 2023

Add a mutable column family option memtable_max_range_deletions. When non-zero, RocksDB will try to flush the current memtable after it has at least memtable_max_range_deletions range deletions. Java API is added and crash test is updated accordingly to randomly enable this option.
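
For reference, a minimal sketch of enabling the option, both at open time and at runtime via SetOptions(); the database path and the threshold of 100 are illustrative, not values from this PR:

```cpp
#include <cassert>

#include "rocksdb/db.h"
#include "rocksdb/options.h"

int main() {
  rocksdb::Options options;
  options.create_if_missing = true;
  // Try to flush the active memtable once it contains at least this many
  // range deletions; 0 (the default) disables the trigger.
  options.memtable_max_range_deletions = 100;

  rocksdb::DB* db = nullptr;
  rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/rangedel_demo", &db);
  assert(s.ok());

  // The option is mutable, so it can also be adjusted on a live DB.
  s = db->SetOptions({{"memtable_max_range_deletions", "200"}});
  assert(s.ok());

  delete db;
  return 0;
}
```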

Test plan:

  • New unit test: `DBRangeDelTest.MemtableMaxRangeDeletions`
  • Ran crash test `python3 ./tools/db_crashtest.py whitebox --simple --memtable_max_range_deletions=20` and saw logs showing flushed memtables usually with 20 range deletions.

@vrdhn vrdhn force-pushed the vrdhn-upstream-range-del-read branch 2 times, most recently from 2287f58 to 857b8eb on April 7, 2023 at 17:18
@vrdhn vrdhn changed the title from "Improve performance the mix of range-delete and read workloads." to "Improve performance of mix of range-delete and read workloads." on Apr 8, 2023
include/rocksdb/options.h (outdated, resolved)
@yao-xiao-github yao-xiao-github left a comment

Thanks for making the change; it's very helpful for our use case (#11407)!

db/memtable.cc (resolved)
options/cf_options.h (outdated, resolved)
@@ -170,6 +172,11 @@ size_t MemTable::ApproximateMemoryUsage() {
}

bool MemTable::ShouldFlushNow() {


Is it possible to log the reason for the flush? Maybe consider adding the reason to FlushReason.

Contributor Author


Okay, let me see if I can identify where to generate the flush reason.

@vrdhn vrdhn force-pushed the vrdhn-upstream-range-del-read branch from 857b8eb to b44ef52 on April 28, 2023 at 17:49
@vrdhn
Contributor Author

vrdhn commented Apr 28, 2023

Hi @yao-xiao-github, @cbi42, I made the requested changes. I couldn't figure out where to set the FlushReason.
Any hints would be highly appreciated.
Thanks.

@vrdhn vrdhn force-pushed the vrdhn-upstream-range-del-read branch from b44ef52 to e7949f0 on May 2, 2023 at 01:56
@yao-xiao-github

> Hi @yao-xiao-github, @cbi42, I made the requested changes. I couldn't figure out where to set the FlushReason.
> Any hints would be highly appreciated.
> Thanks.

Thanks for making the changes. I also tried to expose the FlushReason but didn't find a good way; all the memtable-related flushes seem to be put under the same reason. @cbi42, could you provide some hints? Or is it possible to put an info log here?

@cbi42
Member

cbi42 commented May 8, 2023

> > Hi @yao-xiao-github, @cbi42, I made the requested changes. I couldn't figure out where to set the FlushReason.
> > Any hints would be highly appreciated.
> > Thanks.
>
> Thanks for making the changes. I also tried to expose the FlushReason but didn't find a good way; all the memtable-related flushes seem to be put under the same reason. @cbi42, could you provide some hints? Or is it possible to put an info log here?

Hi - I also don't see an easy way to add a FlushReason for this change, especially because ShouldFlushNow() has so far been used exclusively to check whether a memtable is full. Maybe we can just log the number of range tombstones here for now:

<< GetFlushReasonString(flush_reason_);

Also could you add a unit test to check that flush is triggered correctly based on the new option?
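
For illustration, a hedged sketch of what such a unit test might look like, using common RocksDB test-harness idioms (CurrentOptions(), DestroyAndReopen(), Key(), TEST_WaitForFlushMemTable()); the exact test that landed as DBRangeDelTest.MemtableMaxRangeDeletions may differ:

```cpp
// Sketch only: assumes the DBRangeDelTest fixture from db/db_range_del_test.cc
// and the option added by this PR.
TEST_F(DBRangeDelTest, MemtableMaxRangeDeletionsSketch) {
  Options options = CurrentOptions();
  options.memtable_max_range_deletions = 5;
  DestroyAndReopen(options);

  // Write enough range tombstones to cross the configured threshold.
  for (int i = 0; i < 5; i++) {
    ASSERT_OK(db_->DeleteRange(WriteOptions(), db_->DefaultColumnFamily(),
                               Key(i * 10), Key(i * 10 + 5)));
  }
  // The flush check runs on the write path, so a subsequent write should
  // see the threshold exceeded and schedule a flush.
  ASSERT_OK(Put(Key(100), "val"));
  ASSERT_OK(dbfull()->TEST_WaitForFlushMemTable());
  ASSERT_GE(NumTableFilesAtLevel(0), 1);
}
```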

@saintstack saintstack left a comment

Don't have perms to +1 this PR but LGTM.

Patched an 8.1.1 checkout, then ran against a dataset that was stuck in recovery, OOM'ing every time because of too many range deletes (Yao's case). Ran the patched binary with all defaults and still hit OOM. Then I set memtable_max_range_deletions=1000, recompiled, and retried. Lots of flushes, but I was able to open the database.

@@ -331,6 +331,11 @@ struct ColumnFamilyOptions : public AdvancedColumnFamilyOptions {
// Default: nullptr
std::shared_ptr<SstPartitionerFactory> sst_partitioner_factory = nullptr;

// Automatic flush after range deletions count in memtable hits this limit.
// helps with workloads having lot of range deletes.


Suggestion: if making another edit, you might say more than 'helps'; this option is a guard against accumulating so many range deletes in the memtable that you get into a state where you cannot flush without OOM'ing.

options/options.cc (outdated, resolved)
@vrdhn vrdhn force-pushed the vrdhn-upstream-range-del-read branch 2 times, most recently from 2222b8e to a1e333c on May 24, 2023 at 17:05
@yao-xiao-github

@cbi42 Could you please take another look? Thanks!

@cbi42 cbi42 self-assigned this Jun 8, 2023
@cbi42
Member

cbi42 commented Jun 8, 2023

> @cbi42 Could you please take another look? Thanks!

Hi - there are some requested changes, like testing and logging, that are not done yet. I have made these changes locally, but realized this PR is not open for maintainers to edit. @vrdhn - can you update this PR to allow maintainers to edit? Thanks.

@bremac
Contributor

bremac commented Jun 8, 2023

> > @cbi42 Could you please take another look? Thanks!
>
> Hi - there are some requested changes, like testing and logging, that are not done yet. I have made these changes locally, but realized this PR is not open for maintainers to edit. @vrdhn - can you update this PR to allow maintainers to edit? Thanks.

Hi @cbi42. Apparently GitHub forbids opening PRs for a maintainer to edit on repos associated with an organization. At @vrdhn's request I did make you a contributor on the repo. Let me know if that works for you. Otherwise, we can create a new PR from a repo associated with a personal account instead.

@cbi42 cbi42 force-pushed the vrdhn-upstream-range-del-read branch 2 times, most recently from 0a99266 to 956cb69 on June 9, 2023 at 01:25
@facebook-github-bot
Contributor

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@cbi42 cbi42 requested a review from ajkr June 9, 2023 01:30
@facebook-github-bot
Contributor

@vrdhn has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Contributor

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@vrdhn has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Contributor

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

db/memtable.h (outdated)

    if (memtable_max_range_deletions_ > 0 &&
        memtable_max_range_deletions_reached_ == false &&
        val >= (uint64_t)memtable_max_range_deletions_) {
      memtable_max_range_deletions_reached_ = true;
Contributor


This feels like a data race. What about comparing num_range_deletes_ against memtable_max_range_deletions_ in ShouldFlushNow()?

Contributor

Related (similar scenario where the flag started with one value and could only ever flip once to the other value): #4801

Member

Updated to check num_range_deletes_ against memtable_max_range_deletions_ in ShouldFlushNow().
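
For concreteness, a hedged sketch of the shape of that check; the member names follow the PR discussion (num_range_deletes_, memtable_max_range_deletions_), but the exact types and the surrounding size-based logic are assumptions, not the code that landed:

```cpp
bool MemTable::ShouldFlushNow() {
  // Flush once the configured range-deletion threshold is reached;
  // 0 disables the trigger. num_range_deletes_ is assumed to be a
  // std::atomic<uint64_t> bumped on each range deletion added to the
  // memtable, so there is no once-only flag to race on.
  if (memtable_max_range_deletions_ > 0 &&
      num_range_deletes_.load(std::memory_order_relaxed) >=
          static_cast<uint64_t>(memtable_max_range_deletions_)) {
    return true;
  }
  // ... the pre-existing memory-usage-based checks continue below ...
  return false;
}
```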

@cbi42 cbi42 changed the title from "Improve performance of mix of range-delete and read workloads." to "Add option to trigger flush when the number of range deletions reach a threshold" on Jul 26, 2023
@cbi42 cbi42 changed the title from "Add option to trigger flush when the number of range deletions reach a threshold" to "Add an option to trigger flush when the number of range deletions reach a threshold" on Jul 27, 2023
@cbi42 cbi42 force-pushed the vrdhn-upstream-range-del-read branch from 4c6d625 to bdfe899 on July 31, 2023 at 22:16
@facebook-github-bot
Contributor

@vrdhn has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Contributor

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@vrdhn has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Contributor

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Contributor

@ajkr ajkr left a comment

Looks great!

@facebook-github-bot
Contributor

@vrdhn has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Contributor

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@cbi42 merged this pull request in 87a21d0.

rockeet pushed a commit to topling/toplingdb that referenced this pull request Dec 18, 2023
Add an option to trigger flush when the number of range deletions reach a threshold (#11358)

Summary:
Add a mutable column family option `memtable_max_range_deletions`. When non-zero, RocksDB will try to flush the current memtable after it has at least `memtable_max_range_deletions` range deletions. Java API is added and crash test is updated accordingly to randomly enable this option.

Pull Request resolved: facebook/rocksdb#11358

Test Plan:
* New unit test: `DBRangeDelTest.MemtableMaxRangeDeletions`
* Ran crash test `python3 ./tools/db_crashtest.py whitebox --simple --memtable_max_range_deletions=20` and saw logs showing flushed memtables usually with 20 range deletions.

Reviewed By: ajkr

Differential Revision: D46582680

Pulled By: cbi42

fbshipit-source-id: f23d6fa8d8264ecf0a18d55c113ba03f5e2504da
7 participants