-
Notifications
You must be signed in to change notification settings - Fork 6.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ReadOptions.auto_prefix_mode #6314
Conversation
84d63ef
to
5ad3b3a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One general question: there are other places in the code where the behavior depends on whether total_order_seek
is set; do we need any changes there considering that auto_prefix_mode
is supposed to have the same semantics? Examples would be DBIter
and DBImpl::NewInternalIterator
.
HISTORY.md
Outdated
@@ -25,6 +25,7 @@ | |||
* It is now possible to enable periodic compactions for the base DB when using BlobDB. | |||
* BlobDB now garbage collects non-TTL blobs when `enable_garbage_collection` is set to `true` in `BlobDBOptions`. Garbage collection is performed during compaction: any valid blobs located in the oldest N files (where N is the number of non-TTL blob files multiplied by the value of `BlobDBOptions::garbage_collection_cutoff`) encountered during compaction get relocated to new blob files, and old blob files are dropped once they are no longer needed. Note: we recommend enabling periodic compactions for the base DB when using this feature to deal with the case when some old blob files are kept alive by SSTs that otherwise do not get picked for compaction. | |||
* `db_bench` now supports the `garbage_collection_cutoff` option for BlobDB. | |||
* Introduce ReadOptions.auto_prefix_mode. When set to true, iterator will return the same result as total order seek, but choose to use prefix seek internally based on seek key and iterator upper bound. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... "but may choose to use prefix seek" ?
// The same queries without recreating iterator | ||
{ | ||
ub_str = "b9"; | ||
ro.iterate_upper_bound = &ub; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ub = Slice(ub_str);
is missing here and a couple more places below.
@@ -899,9 +909,7 @@ Status StressTest::TestIterate(ThreadState* thread, | |||
std::unique_ptr<Iterator> cmp_iter(db_->NewIterator(cmp_ro, cmp_cfh)); | |||
bool diverged = false; | |||
|
|||
bool support_seek_first_or_last = | |||
(options_.prefix_extractor.get() != nullptr) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a question: was this broken previously? (Should it have been options_.prefix_extractor.get() == nullptr
all along?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good point...
@@ -2785,7 +2785,7 @@ void BlockBasedTableIterator<TBlockIter, TValue>::SeekImpl( | |||
const Slice* target) { | |||
is_out_of_bound_ = false; | |||
is_at_first_key_from_index_ = false; | |||
if (target && !CheckPrefixMayMatch(*target)) { | |||
if (target && !CheckPrefixMayMatch(*target, true)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/* is_forward_direction */ true
?
if (!CheckPrefixMayMatch(target)) { | ||
// For now totally disable prefix seek in auto prefix mode because we don't | ||
// have logic | ||
if (!CheckPrefixMayMatch(target, false)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here with false
.
bool CheckPrefixMayMatch(const Slice& ikey) { | ||
bool CheckPrefixMayMatch(const Slice& ikey, bool is_forward_direction) { | ||
if (need_upper_bound_check_ && !is_forward_direction) { | ||
// Upper bound check isn't sufficnet for backward direction to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: typo in "sufficient".
@ltamasi that's a good point. I will modify DBIter and treat auto_prefix_mode the same as total_order_seek. |
@@ -724,7 +724,13 @@ class BlockBasedTableIterator : public InternalIteratorBase<TValue> { | |||
block_iter_points_to_real_block_; | |||
} | |||
|
|||
bool CheckPrefixMayMatch(const Slice& ikey) { | |||
bool CheckPrefixMayMatch(const Slice& ikey, bool is_forward_direction) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change bool is_forward_direction
to Direction direction
and compare it with kForward or kReverse?
Actually, it seems to me that |
@ltamasi i hope |
@siying has updated the pull request. Re-import the pull request |
db/db_iter.cc
Outdated
@@ -73,7 +73,8 @@ DBIter::DBIter(Env* _env, const ReadOptions& read_options, | |||
? read_options.prefix_same_as_start | |||
: false), | |||
pin_thru_lifetime_(read_options.pin_data), | |||
total_order_seek_(read_options.total_order_seek), | |||
expect_total_order_inner_iter_(read_options.total_order_seek || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could include the prefix_extractor_
check in the value of this boolean to simplify all the places where it's used, right? (That is, expect_total_order_inner_iter_
could be defined as prefix_extractor_ == nullptr || ro.total_order_seek || ro.auto_prefix_mode
.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... otherwise LGTM, thanks!
@siying has updated the pull request. Re-import the pull request |
Summary: Add a new option ReadOptions.auto_prefix_mode. When set to true, iterator should return the same result as total order seek, but may choose to do prefix seek internally, based on iterator upper bounds. Also fix two previous bugs when handling prefix extrator changes: (1) reverse iterator should not rely on upper bound to determine prefix. Fix it with skipping prefix check. (2) block-based filter is not handled properly. Test Plan: (1) add a unit test; (2) add the check to stress test and run see whether it can pass at least one run. DBIter changes Address comments. Address comments.
7d5ca6a
to
d92c68e
Compare
@siying has updated the pull request. Re-import the pull request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Although it shows that "Facebook Internal - Builds & Tests" failed, inside the link it says the tests finished. Tests passed except the two diffcoverage ones, which are not related to the PR. |
This pull request has been merged in 8f2bee6. |
@@ -3202,6 +3204,7 @@ InternalIterator* BlockBasedTable::NewIterator( | |||
size_t compaction_readahead_size) { | |||
BlockCacheLookupContext lookup_context{caller}; | |||
bool need_upper_bound_check = | |||
read_options.auto_prefix_mode || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When auto_prefix_mode=true and PrefixExtractorChanged=false ,user set rep_->index_type == BlockBasedTableOptions::kHashSearch, then user will not to able to use prefix seek. It seems to me that auto_prefix_mode and kHashSearch can't be used at the same time, Can you help me explain why?
Summary:
Add a new option ReadOptions.auto_prefix_mode. When set to true, iterator should return the same result as total order seek, but may choose to do prefix seek internally, based on iterator upper bounds. Also fix two previous bugs when handling prefix extrator changes: (1) reverse iterator should not rely on upper bound to determine prefix. Fix it with skipping prefix check. (2) block-based filter is not handled properly.
Test Plan: (1) add a unit test; (2) add the check to stress test and run see whether it can pass at least one run.