Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ReadOptions.auto_prefix_mode #6314

Closed
wants to merge 1 commit into from

Conversation

siying
Copy link
Contributor

@siying siying commented Jan 18, 2020

Summary:
Add a new option ReadOptions.auto_prefix_mode. When set to true, iterator should return the same result as total order seek, but may choose to do prefix seek internally, based on iterator upper bounds. Also fix two previous bugs when handling prefix extrator changes: (1) reverse iterator should not rely on upper bound to determine prefix. Fix it with skipping prefix check. (2) block-based filter is not handled properly.

Test Plan: (1) add a unit test; (2) add the check to stress test and run see whether it can pass at least one run.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@ltamasi ltamasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One general question: there are other places in the code where the behavior depends on whether total_order_seek is set; do we need any changes there considering that auto_prefix_mode is supposed to have the same semantics? Examples would be DBIter and DBImpl::NewInternalIterator.

HISTORY.md Outdated
@@ -25,6 +25,7 @@
* It is now possible to enable periodic compactions for the base DB when using BlobDB.
* BlobDB now garbage collects non-TTL blobs when `enable_garbage_collection` is set to `true` in `BlobDBOptions`. Garbage collection is performed during compaction: any valid blobs located in the oldest N files (where N is the number of non-TTL blob files multiplied by the value of `BlobDBOptions::garbage_collection_cutoff`) encountered during compaction get relocated to new blob files, and old blob files are dropped once they are no longer needed. Note: we recommend enabling periodic compactions for the base DB when using this feature to deal with the case when some old blob files are kept alive by SSTs that otherwise do not get picked for compaction.
* `db_bench` now supports the `garbage_collection_cutoff` option for BlobDB.
* Introduce ReadOptions.auto_prefix_mode. When set to true, iterator will return the same result as total order seek, but choose to use prefix seek internally based on seek key and iterator upper bound.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... "but may choose to use prefix seek" ?

// The same queries without recreating iterator
{
ub_str = "b9";
ro.iterate_upper_bound = &ub;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ub = Slice(ub_str); is missing here and a couple more places below.

@@ -899,9 +909,7 @@ Status StressTest::TestIterate(ThreadState* thread,
std::unique_ptr<Iterator> cmp_iter(db_->NewIterator(cmp_ro, cmp_cfh));
bool diverged = false;

bool support_seek_first_or_last =
(options_.prefix_extractor.get() != nullptr) ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a question: was this broken previously? (Should it have been options_.prefix_extractor.get() == nullptr all along?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good point...

@@ -2785,7 +2785,7 @@ void BlockBasedTableIterator<TBlockIter, TValue>::SeekImpl(
const Slice* target) {
is_out_of_bound_ = false;
is_at_first_key_from_index_ = false;
if (target && !CheckPrefixMayMatch(*target)) {
if (target && !CheckPrefixMayMatch(*target, true)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/* is_forward_direction */ true ?

if (!CheckPrefixMayMatch(target)) {
// For now totally disable prefix seek in auto prefix mode because we don't
// have logic
if (!CheckPrefixMayMatch(target, false)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here with false.

bool CheckPrefixMayMatch(const Slice& ikey) {
bool CheckPrefixMayMatch(const Slice& ikey, bool is_forward_direction) {
if (need_upper_bound_check_ && !is_forward_direction) {
// Upper bound check isn't sufficnet for backward direction to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: typo in "sufficient".

@siying
Copy link
Contributor Author

siying commented Jan 23, 2020

@ltamasi that's a good point. I will modify DBIter and treat auto_prefix_mode the same as total_order_seek.

@@ -724,7 +724,13 @@ class BlockBasedTableIterator : public InternalIteratorBase<TValue> {
block_iter_points_to_real_block_;
}

bool CheckPrefixMayMatch(const Slice& ikey) {
bool CheckPrefixMayMatch(const Slice& ikey, bool is_forward_direction) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change bool is_forward_direction to Direction direction and compare it with kForward or kReverse?

@ltamasi
Copy link
Contributor

ltamasi commented Jan 24, 2020

@ltamasi that's a good point. I will modify DBIter and treat auto_prefix_mode the same as total_order_seek.

Actually, it seems to me that auto_prefix_mode is currently meant to be used with total_order_seek==false? I'm wondering if it would make sense to change that so it's to be used in conjunction with total_order_seek==true, considering that the results are supposed to be the same? (That is, auto_prefix_mode would be a "submode" or optimized version of total_order_seek.) In that case, DBIter and friends probably would not need to be modified.

@siying
Copy link
Contributor Author

siying commented Jan 24, 2020

@ltamasi i hope auto_prefix_mode can return the same result as total_order_seek. Right now it is meant to be used with total_order_seek==false but after the feature is stable, we may consider to make it work with total_order_seek==true too. I'll try to make changes in all the places.

@facebook-github-bot
Copy link
Contributor

@siying has updated the pull request. Re-import the pull request

db/db_iter.cc Outdated
@@ -73,7 +73,8 @@ DBIter::DBIter(Env* _env, const ReadOptions& read_options,
? read_options.prefix_same_as_start
: false),
pin_thru_lifetime_(read_options.pin_data),
total_order_seek_(read_options.total_order_seek),
expect_total_order_inner_iter_(read_options.total_order_seek ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could include the prefix_extractor_ check in the value of this boolean to simplify all the places where it's used, right? (That is, expect_total_order_inner_iter_ could be defined as prefix_extractor_ == nullptr || ro.total_order_seek || ro.auto_prefix_mode.)

Copy link
Contributor

@ltamasi ltamasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... otherwise LGTM, thanks!

@facebook-github-bot
Copy link
Contributor

@siying has updated the pull request. Re-import the pull request

Summary:
Add a new option ReadOptions.auto_prefix_mode. When set to true, iterator should return the same result as total order seek, but may choose to do prefix seek internally, based on iterator upper bounds. Also fix two previous bugs when handling prefix extrator changes: (1) reverse iterator should not rely on upper bound to determine prefix. Fix it with skipping prefix check. (2) block-based filter is not handled properly.

Test Plan: (1) add a unit test; (2) add the check to stress test and run see whether it can pass at least one run.

DBIter changes

Address comments.

Address comments.
@facebook-github-bot
Copy link
Contributor

@siying has updated the pull request. Re-import the pull request

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@siying
Copy link
Contributor Author

siying commented Jan 28, 2020

Although it shows that "Facebook Internal - Builds & Tests" failed, inside the link it says the tests finished. Tests passed except the two diffcoverage ones, which are not related to the PR.

@facebook-github-bot
Copy link
Contributor

This pull request has been merged in 8f2bee6.

@@ -3202,6 +3204,7 @@ InternalIterator* BlockBasedTable::NewIterator(
size_t compaction_readahead_size) {
BlockCacheLookupContext lookup_context{caller};
bool need_upper_bound_check =
read_options.auto_prefix_mode ||
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When auto_prefix_mode=true and PrefixExtractorChanged=false ,user set rep_->index_type == BlockBasedTableOptions::kHashSearch, then user will not to able to use prefix seek. It seems to me that auto_prefix_mode and kHashSearch can't be used at the same time, Can you help me explain why?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants