-
Notifications
You must be signed in to change notification settings - Fork 6.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merging iterator to avoid child iterator reseek for some cases #5286
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. A few minor comments.
PERF_TIMER_GUARD(seek_child_seek_time); | ||
child.Seek(target); | ||
PERF_COUNTER_ADD(seek_child_seek_count, 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be interesting to have a stat for how many were skipped.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean adding a new counter just for this? But reseek might not be common in most use cases. Is it worth it?
table/internal_iterator.h
Outdated
@@ -130,6 +132,7 @@ class InternalIteratorBase : public Cleanable { | |||
Prev(); | |||
} | |||
} | |||
bool is_mutable_ = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Can this be default initialized in the constructor for sake of consistency?
Summary: When reseek happens in merging iterator, reseeking a child iterator can be avoided if: (1) the iterator represents imutable data (2) reseek() to a larger key than the current key (3) the current key of the child iterator is larger than the seek key because it is guaranteed that the result will fall into the same position. This optimization will be useful for use cases where users keep seeking to keys nearby in ascending order. Test Plan: Add a unit test and run existing tests. Move InternalIteratorBase::is_mutable_ initaliziation to constructors only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
…ook#5286) Summary: When reseek happens in merging iterator, reseeking a child iterator can be avoided if: (1) the iterator represents imutable data (2) reseek() to a larger key than the current key (3) the current key of the child iterator is larger than the seek key because it is guaranteed that the result will fall into the same position. This optimization will be useful for use cases where users keep seeking to keys nearby in ascending order. Pull Request resolved: facebook#5286 Differential Revision: D15283635 Pulled By: siying fbshipit-source-id: 35f79ffd5ce3609146faa8cd55f2bfd733502f83
facebook#5286)" This reverts commit 9fad3e2. Iterator verification in stress tests sometimes fail for assertion table/block_based/block_based_table_reader.cc:2973: void rocksdb::BlockBasedTableIterator<TBlockIter, TValue>::FindBlockForward() [with TBlockIter = rocksdb::DataBlockIter; TValue = rocksdb::Slice]: Assertion `!next_block_is_out_of_bound || user_comparator_.Compare(*read_options_.iterate_upper_bound, index_iter_->user_key()) <= 0' failed. It is likely to be linked to facebook#5286 together with facebook#5468 as the former PR makes some child ierator's seek being avoided, so that upper bound condition fails to be updated there. Stricly speaking, the former PR was merged before the latter one, but the later one feels a more important improvement so I choose to revert the former one for now.
#5286)" (#5871) Summary: This reverts commit 9fad3e2. Iterator verification in stress tests sometimes fail for assertion table/block_based/block_based_table_reader.cc:2973: void rocksdb::BlockBasedTableIterator<TBlockIter, TValue>::FindBlockForward() [with TBlockIter = rocksdb::DataBlockIter; TValue = rocksdb::Slice]: Assertion `!next_block_is_out_of_bound || user_comparator_.Compare(*read_options_.iterate_upper_bound, index_iter_->user_key()) <= 0' failed. It is likely to be linked to #5286 together with #5468 as the former PR makes some child iterator's seek being avoided, so that upper bound condition fails to be updated there. Strictly speaking, the former PR was merged before the latter one, but the latter one feels a more important improvement so I choose to revert the former one for now. Pull Request resolved: #5871 Differential Revision: D17689196 fbshipit-source-id: 4ded5be68f67bee2782d31a29cb72ea68f59dd8c
#5286)" (#5871) Summary: This reverts commit 9fad3e2. Iterator verification in stress tests sometimes fail for assertion table/block_based/block_based_table_reader.cc:2973: void rocksdb::BlockBasedTableIterator<TBlockIter, TValue>::FindBlockForward() [with TBlockIter = rocksdb::DataBlockIter; TValue = rocksdb::Slice]: Assertion `!next_block_is_out_of_bound || user_comparator_.Compare(*read_options_.iterate_upper_bound, index_iter_->user_key()) <= 0' failed. It is likely to be linked to #5286 together with #5468 as the former PR makes some child iterator's seek being avoided, so that upper bound condition fails to be updated there. Strictly speaking, the former PR was merged before the latter one, but the latter one feels a more important improvement so I choose to revert the former one for now. Pull Request resolved: #5871 Differential Revision: D17689196 fbshipit-source-id: 4ded5be68f67bee2782d31a29cb72ea68f59dd8c
#5286)" (#5871) Summary: This reverts commit 9fad3e2. Iterator verification in stress tests sometimes fail for assertion table/block_based/block_based_table_reader.cc:2973: void rocksdb::BlockBasedTableIterator<TBlockIter, TValue>::FindBlockForward() [with TBlockIter = rocksdb::DataBlockIter; TValue = rocksdb::Slice]: Assertion `!next_block_is_out_of_bound || user_comparator_.Compare(*read_options_.iterate_upper_bound, index_iter_->user_key()) <= 0' failed. It is likely to be linked to #5286 together with #5468 as the former PR makes some child iterator's seek being avoided, so that upper bound condition fails to be updated there. Strictly speaking, the former PR was merged before the latter one, but the latter one feels a more important improvement so I choose to revert the former one for now. Pull Request resolved: #5871 Differential Revision: D17689196 fbshipit-source-id: 4ded5be68f67bee2782d31a29cb72ea68f59dd8c
facebook#5286)" (facebook#5871) Summary: This reverts commit 9fad3e2. Iterator verification in stress tests sometimes fail for assertion table/block_based/block_based_table_reader.cc:2973: void rocksdb::BlockBasedTableIterator<TBlockIter, TValue>::FindBlockForward() [with TBlockIter = rocksdb::DataBlockIter; TValue = rocksdb::Slice]: Assertion `!next_block_is_out_of_bound || user_comparator_.Compare(*read_options_.iterate_upper_bound, index_iter_->user_key()) <= 0' failed. It is likely to be linked to facebook#5286 together with facebook#5468 as the former PR makes some child iterator's seek being avoided, so that upper bound condition fails to be updated there. Strictly speaking, the former PR was merged before the latter one, but the latter one feels a more important improvement so I choose to revert the former one for now. Pull Request resolved: facebook#5871 Differential Revision: D17689196 fbshipit-source-id: 4ded5be68f67bee2782d31a29cb72ea68f59dd8c
facebook#5286)" (facebook#5871) Summary: This reverts commit 9fad3e2. Iterator verification in stress tests sometimes fail for assertion table/block_based/block_based_table_reader.cc:2973: void rocksdb::BlockBasedTableIterator<TBlockIter, TValue>::FindBlockForward() [with TBlockIter = rocksdb::DataBlockIter; TValue = rocksdb::Slice]: Assertion `!next_block_is_out_of_bound || user_comparator_.Compare(*read_options_.iterate_upper_bound, index_iter_->user_key()) <= 0' failed. It is likely to be linked to facebook#5286 together with facebook#5468 as the former PR makes some child iterator's seek being avoided, so that upper bound condition fails to be updated there. Strictly speaking, the former PR was merged before the latter one, but the latter one feels a more important improvement so I choose to revert the former one for now. Pull Request resolved: facebook#5871 Differential Revision: D17689196 fbshipit-source-id: 4ded5be68f67bee2782d31a29cb72ea68f59dd8c
Summary: When reseek happens in merging iterator, reseeking a child iterator can be avoided if:
(1) the iterator represents imutable data
(2) reseek() to a larger key than the current key
(3) the current key of the child iterator is larger than the seek key
because it is guaranteed that the result will fall into the same position.
This optimization will be useful for use cases where users keep seeking to keys nearby in ascending order.
Test Plan: Add a unit test and run existing tests.