Add ReadOptions.auto_prefix_mode #6314

siying · 2020-01-18T00:30:54Z

Summary:
Add a new option ReadOptions.auto_prefix_mode. When set to true, iterator should return the same result as total order seek, but may choose to do prefix seek internally, based on iterator upper bounds. Also fix two previous bugs when handling prefix extrator changes: (1) reverse iterator should not rely on upper bound to determine prefix. Fix it with skipping prefix check. (2) block-based filter is not handled properly.

Test Plan: (1) add a unit test; (2) add the check to stress test and run see whether it can pass at least one run.

facebook-github-bot

@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

ltamasi

One general question: there are other places in the code where the behavior depends on whether total_order_seek is set; do we need any changes there considering that auto_prefix_mode is supposed to have the same semantics? Examples would be DBIter and DBImpl::NewInternalIterator.

ltamasi · 2020-01-23T19:11:01Z

HISTORY.md

@@ -25,6 +25,7 @@
 * It is now possible to enable periodic compactions for the base DB when using BlobDB.
 * BlobDB now garbage collects non-TTL blobs when `enable_garbage_collection` is set to `true` in `BlobDBOptions`. Garbage collection is performed during compaction: any valid blobs located in the oldest N files (where N is the number of non-TTL blob files multiplied by the value of `BlobDBOptions::garbage_collection_cutoff`) encountered during compaction get relocated to new blob files, and old blob files are dropped once they are no longer needed. Note: we recommend enabling periodic compactions for the base DB when using this feature to deal with the case when some old blob files are kept alive by SSTs that otherwise do not get picked for compaction.
 * `db_bench` now supports the `garbage_collection_cutoff` option for BlobDB.
+* Introduce ReadOptions.auto_prefix_mode. When set to true, iterator will return the same result as total order seek, but choose to use prefix seek internally based on seek key and iterator upper bound.


... "but may choose to use prefix seek" ?

ltamasi · 2020-01-23T19:21:29Z

db/db_test2.cc

+  // The same queries without recreating iterator
+  {
+    ub_str = "b9";
+    ro.iterate_upper_bound = &ub;


ub = Slice(ub_str); is missing here and a couple more places below.

ltamasi · 2020-01-23T20:28:26Z

db_stress_tool/db_stress_test_base.cc

@@ -899,9 +909,7 @@ Status StressTest::TestIterate(ThreadState* thread,
    std::unique_ptr<Iterator> cmp_iter(db_->NewIterator(cmp_ro, cmp_cfh));
    bool diverged = false;

-    bool support_seek_first_or_last =
-        (options_.prefix_extractor.get() != nullptr) ||


Just a question: was this broken previously? (Should it have been options_.prefix_extractor.get() == nullptr all along?)

Yes, good point...

ltamasi · 2020-01-23T20:30:35Z

table/block_based/block_based_table_reader.cc

@@ -2785,7 +2785,7 @@ void BlockBasedTableIterator<TBlockIter, TValue>::SeekImpl(
    const Slice* target) {
  is_out_of_bound_ = false;
  is_at_first_key_from_index_ = false;
-  if (target && !CheckPrefixMayMatch(*target)) {
+  if (target && !CheckPrefixMayMatch(*target, true)) {


/* is_forward_direction */ true ?

ltamasi · 2020-01-23T20:31:30Z

table/block_based/block_based_table_reader.cc

-  if (!CheckPrefixMayMatch(target)) {
+  // For now totally disable prefix seek in auto prefix mode because we don't
+  // have logic
+  if (!CheckPrefixMayMatch(target, false)) {


Same here with false.

ltamasi · 2020-01-23T20:33:24Z

table/block_based/block_based_table_reader.h

-  bool CheckPrefixMayMatch(const Slice& ikey) {
+  bool CheckPrefixMayMatch(const Slice& ikey, bool is_forward_direction) {
+    if (need_upper_bound_check_ && !is_forward_direction) {
+      // Upper bound check isn't sufficnet for backward direction to


Nit: typo in "sufficient".

siying · 2020-01-23T23:59:41Z

@ltamasi that's a good point. I will modify DBIter and treat auto_prefix_mode the same as total_order_seek.

riversand963 · 2020-01-24T00:23:15Z

table/block_based/block_based_table_reader.h

@@ -724,7 +724,13 @@ class BlockBasedTableIterator : public InternalIteratorBase<TValue> {
           block_iter_points_to_real_block_;
  }

-  bool CheckPrefixMayMatch(const Slice& ikey) {
+  bool CheckPrefixMayMatch(const Slice& ikey, bool is_forward_direction) {


Change bool is_forward_direction to Direction direction and compare it with kForward or kReverse?

ltamasi · 2020-01-24T00:51:07Z

@ltamasi that's a good point. I will modify DBIter and treat auto_prefix_mode the same as total_order_seek.

Actually, it seems to me that auto_prefix_mode is currently meant to be used with total_order_seek==false? I'm wondering if it would make sense to change that so it's to be used in conjunction with total_order_seek==true, considering that the results are supposed to be the same? (That is, auto_prefix_mode would be a "submode" or optimized version of total_order_seek.) In that case, DBIter and friends probably would not need to be modified.

siying · 2020-01-24T01:06:43Z

@ltamasi i hope auto_prefix_mode can return the same result as total_order_seek. Right now it is meant to be used with total_order_seek==false but after the feature is stable, we may consider to make it work with total_order_seek==true too. I'll try to make changes in all the places.

facebook-github-bot · 2020-01-24T02:16:28Z

@siying has updated the pull request. Re-import the pull request

ltamasi · 2020-01-24T17:53:57Z

db/db_iter.cc

@@ -73,7 +73,8 @@ DBIter::DBIter(Env* _env, const ReadOptions& read_options,
                                ? read_options.prefix_same_as_start
                                : false),
      pin_thru_lifetime_(read_options.pin_data),
-      total_order_seek_(read_options.total_order_seek),
+      expect_total_order_inner_iter_(read_options.total_order_seek ||


You could include the prefix_extractor_ check in the value of this boolean to simplify all the places where it's used, right? (That is, expect_total_order_inner_iter_ could be defined as prefix_extractor_ == nullptr || ro.total_order_seek || ro.auto_prefix_mode.)

ltamasi

... otherwise LGTM, thanks!

facebook-github-bot · 2020-01-27T22:49:12Z

@siying has updated the pull request. Re-import the pull request

Summary: Add a new option ReadOptions.auto_prefix_mode. When set to true, iterator should return the same result as total order seek, but may choose to do prefix seek internally, based on iterator upper bounds. Also fix two previous bugs when handling prefix extrator changes: (1) reverse iterator should not rely on upper bound to determine prefix. Fix it with skipping prefix check. (2) block-based filter is not handled properly. Test Plan: (1) add a unit test; (2) add the check to stress test and run see whether it can pass at least one run. DBIter changes Address comments. Address comments.

facebook-github-bot · 2020-01-28T01:19:19Z

@siying has updated the pull request. Re-import the pull request

facebook-github-bot

@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

siying · 2020-01-28T22:18:22Z

Although it shows that "Facebook Internal - Builds & Tests" failed, inside the link it says the tests finished. Tests passed except the two diffcoverage ones, which are not related to the PR.

facebook-github-bot · 2020-01-28T23:40:16Z

This pull request has been merged in 8f2bee6.

ohhzj · 2021-09-02T14:18:54Z

table/block_based/block_based_table_reader.cc

@@ -3202,6 +3204,7 @@ InternalIterator* BlockBasedTable::NewIterator(
    size_t compaction_readahead_size) {
  BlockCacheLookupContext lookup_context{caller};
  bool need_upper_bound_check =
+      read_options.auto_prefix_mode ||


When auto_prefix_mode=true and PrefixExtractorChanged=false ,user set rep_->index_type == BlockBasedTableOptions::kHashSearch, then user will not to able to use prefix seek. It seems to me that auto_prefix_mode and kHashSearch can't be used at the same time, Can you help me explain why?

facebook-github-bot added the CLA Signed label Jan 18, 2020

siying force-pushed the automatic_prefix branch from 84d63ef to 5ad3b3a Compare January 18, 2020 00:34

facebook-github-bot reviewed Jan 18, 2020

View reviewed changes

ltamasi reviewed Jan 23, 2020

View reviewed changes

riversand963 reviewed Jan 24, 2020

View reviewed changes

ltamasi reviewed Jan 24, 2020

View reviewed changes

ltamasi approved these changes Jan 24, 2020

View reviewed changes

siying force-pushed the automatic_prefix branch from 7d5ca6a to d92c68e Compare January 28, 2020 01:19

facebook-github-bot reviewed Jan 28, 2020

View reviewed changes

facebook-github-bot closed this in 8f2bee6 Jan 28, 2020

facebook-github-bot added the Merged label Jan 28, 2020

ltamasi mentioned this pull request Mar 19, 2020

Introduce unified seek mode to automatically detect compatibility of bloom filters #5764

Closed

ohhzj reviewed Sep 2, 2021

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add ReadOptions.auto_prefix_mode #6314

Add ReadOptions.auto_prefix_mode #6314

siying commented Jan 18, 2020

facebook-github-bot left a comment

ltamasi left a comment

ltamasi Jan 23, 2020

ltamasi Jan 23, 2020

ltamasi Jan 23, 2020

siying Jan 24, 2020

ltamasi Jan 23, 2020

ltamasi Jan 23, 2020

ltamasi Jan 23, 2020

siying commented Jan 23, 2020

riversand963 Jan 24, 2020

ltamasi commented Jan 24, 2020

siying commented Jan 24, 2020

facebook-github-bot commented Jan 24, 2020

ltamasi Jan 24, 2020

ltamasi left a comment

facebook-github-bot commented Jan 27, 2020

facebook-github-bot commented Jan 28, 2020

facebook-github-bot left a comment

siying commented Jan 28, 2020

facebook-github-bot commented Jan 28, 2020

ohhzj Sep 2, 2021

Add ReadOptions.auto_prefix_mode #6314

Add ReadOptions.auto_prefix_mode #6314

Conversation

siying commented Jan 18, 2020

facebook-github-bot left a comment

Choose a reason for hiding this comment

ltamasi left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

siying commented Jan 23, 2020

Choose a reason for hiding this comment

ltamasi commented Jan 24, 2020

siying commented Jan 24, 2020

facebook-github-bot commented Jan 24, 2020

Choose a reason for hiding this comment

ltamasi left a comment

Choose a reason for hiding this comment

facebook-github-bot commented Jan 27, 2020

facebook-github-bot commented Jan 28, 2020

facebook-github-bot left a comment

Choose a reason for hiding this comment

siying commented Jan 28, 2020

facebook-github-bot commented Jan 28, 2020

Choose a reason for hiding this comment