-
Notifications
You must be signed in to change notification settings - Fork 6.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix memtable-only iterator regression #10705
Conversation
@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
table/merging_iterator.cc
Outdated
if (to_add_memtable_range_tombstone_iter_) { | ||
merge_iter->AddRangeTombstoneIterator(nullptr); | ||
to_add_memtable_range_tombstone_iter_ = false; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assumptions on the caller are reasonable but adding up to be somewhat complicated IMO.
What if we add bool ignore_range_deletions
to the MergeIteratorBuilder
constructor, and then have a common SwitchToMergingIterator()
function that does everything when needed - create the MergingIterator, add first_iter as first point key iterator, add nullptr as first range del iterator (when !ignore_range_deletions AND we aren't switching due to non-nullptr AddMemtableRangeTombstoneIterator())?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps a better design is having a single AddComponentIterators() function that takes both point and range iterator pointers. It could internally decide when to upgrade to MergingIterator (upon first call when range tombstone iter is non-null or second call otherwise). It could also have some logic to only call AddRangeTombstoneIterator()
upon the first non-nullptr range iterator provided - first calling AddRangeTombstoneIterator(nullptr)
for all the previous components for which we have not yet added a range tombstone iterator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review. I agree with the complexity involved in MergeIteratorBuilder
, and updated the API to AddPointAndTombstoneIterator
that adds a pair of point and range tombstone iters at a time. I did not consolidate AddPointAndTombstoneIterator
and AddIterator
yet to separate use cases that may or may not involve range tombstones. I'm thinking to consolidate them (or decide on not to consolidate them) as a next step after this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think improving the design of the MergeIteratorBuilder would be nice to have, though we can still land this if there's no easy way right now
table/merging_iterator.cc
Outdated
if (to_add_memtable_range_tombstone_iter_) { | ||
merge_iter->AddRangeTombstoneIterator(nullptr); | ||
to_add_memtable_range_tombstone_iter_ = false; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps a better design is having a single AddComponentIterators() function that takes both point and range iterator pointers. It could internally decide when to upgrade to MergingIterator (upon first call when range tombstone iter is non-null or second call otherwise). It could also have some logic to only call AddRangeTombstoneIterator()
upon the first non-nullptr range iterator provided - first calling AddRangeTombstoneIterator(nullptr)
for all the previous components for which we have not yet added a range tombstone iterator.
@cbi42 has updated the pull request. You must reimport the pull request before landing. |
@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
Summary: when there is a single memtable without range tombstones and no SST files in the database, DBIter should wrap memtable iterator directly. Currently we create a merging iterator on top of the memtable iterator, and have DBIter wrap around it. This causes iterator regression and this PR fixes this issue. Pull Request resolved: facebook#10705 Test Plan: - `make check` - Performance: - Set up: `./db_bench -benchmarks=filluniquerandom -write_buffer_size=$((1 << 30)) -num=10000` - Benchmark: `./db_bench -benchmarks=seekrandom -use_existing_db=true -avoid_flush_during_recovery=true -write_buffer_size=$((1 << 30)) -num=10000 -threads=16 -duration=60 -seek_nexts=$seek_nexts` ``` seek_nexts main op/sec facebook#10705 RocksDB v7.6 0 5746568 5749033 5786180 30 2411690 3006466 2837699 1000 102556 128902 124667 ``` Reviewed By: ajkr Differential Revision: D39644221 Pulled By: cbi42 fbshipit-source-id: 8063ff611ba31b0e5670041da3927c8c54b2097d
auto t = sv->mem->NewRangeTombstoneIterator( | ||
read_options_, latest_seq, false /* immutable_memtable */); | ||
if (!t || t->empty()) { | ||
if (memtable_range_tombstone_iter_) { | ||
*memtable_range_tombstone_iter_ = nullptr; | ||
} | ||
delete t; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about delete *memtable_range_tombstone_iter_
? Could *memtable_range_tombstone_iter_
have been non-null if the SV changed after the sv_number_ != cur_sv_number
check picked the else branch, and the mutable memtable in the old SV had range tombstones while the mutable memtable in the new SV does not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this could happen. Thanks for catching this, fixing it in #10716
Summary: Fix potential memory leak in ArenaWrappedDBIter::Refresh() introduced in #10705. See #10705 (comment) for detail. Pull Request resolved: #10716 Test Plan: make check Reviewed By: ajkr Differential Revision: D39698561 Pulled By: cbi42 fbshipit-source-id: dc0d0c6e3878eaa84f87623fbe4916b9b08b077a
…10716) Summary: Fix potential memory leak in ArenaWrappedDBIter::Refresh() introduced in facebook#10705. See facebook#10705 (comment) for detail. Pull Request resolved: facebook#10716 Test Plan: make check Reviewed By: ajkr Differential Revision: D39698561 Pulled By: cbi42 fbshipit-source-id: dc0d0c6e3878eaa84f87623fbe4916b9b08b077a
Summary: when there is a single memtable without range tombstones and no SST files in the database, DBIter should wrap memtable iterator directly. Currently we create a merging iterator on top of the memtable iterator, and have DBIter wrap around it. This causes iterator regression and this PR fixes this issue. Pull Request resolved: #10705 Test Plan: - `make check` - Performance: - Set up: `./db_bench -benchmarks=filluniquerandom -write_buffer_size=$((1 << 30)) -num=10000` - Benchmark: `./db_bench -benchmarks=seekrandom -use_existing_db=true -avoid_flush_during_recovery=true -write_buffer_size=$((1 << 30)) -num=10000 -threads=16 -duration=60 -seek_nexts=$seek_nexts` ``` seek_nexts main op/sec #10705 RocksDB v7.6 0 5746568 5749033 5786180 30 2411690 3006466 2837699 1000 102556 128902 124667 ``` Reviewed By: ajkr Differential Revision: D39644221 Pulled By: cbi42 fbshipit-source-id: 8063ff611ba31b0e5670041da3927c8c54b2097d
Summary: Fix potential memory leak in ArenaWrappedDBIter::Refresh() introduced in #10705. See #10705 (comment) for detail. Pull Request resolved: #10716 Test Plan: make check Reviewed By: ajkr Differential Revision: D39698561 Pulled By: cbi42 fbshipit-source-id: dc0d0c6e3878eaa84f87623fbe4916b9b08b077a
Summary: when there is a single memtable without range tombstones and no SST files in the database, DBIter should wrap memtable iterator directly. Currently we create a merging iterator on top of the memtable iterator, and have DBIter wrap around it. This causes iterator regression and this PR fixes this issue.
Test plan:
make check
./db_bench -benchmarks=filluniquerandom -write_buffer_size=$((1 << 30)) -num=10000
./db_bench -benchmarks=seekrandom -use_existing_db=true -avoid_flush_during_recovery=true -write_buffer_size=$((1 << 30)) -num=10000 -threads=16 -duration=60 -seek_nexts=$seek_nexts