De-template block based table iterator #6531

siying · 2020-03-13T05:35:09Z

Summary:
Right now, the block based table iterator is used both for iterating over data in a block based table and as the index iterator for a partitioned index. This was initially convenient for introducing a new iterator and block type for the new index format while keeping the code change small. However, the two usages do not fit together well. For example, Prev() is never called on the partitioned index iterator, and block based iterators maintain other complexity that the index iterator does not need but that maintainers must always reason about. Furthermore, the template usage does not follow the Google C++ Style that we follow, and it tangles a large chunk of code together. This commit separates the two iterators. Here is what is done so far:

1. Copy the block based iterator code into the partitioned index iterator, and de-template both.
2. Remove code not needed for the partitioned index. The upper bound check and the related tricks are removed. We never tested the performance of those tricks with partitioned index enabled in the first place. Removing them is unlikely to cause a performance regression, as creating a new partitioned index block is much rarer than creating data blocks.
3. Separate out the prefetch logic into a helper class that both classes call.
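
As a rough illustration of step 3, the sketch below shows two non-template iterator classes that each own a small helper object instead of sharing code through a template parameter. All names here (`ReadaheadHelper`, `DataBlockIter`, `IndexPartitionIter`) and the placeholder readahead policy are assumptions made for the example, not the classes or logic in this PR.

```cpp
#include <cstddef>

// Hypothetical helper that owns the readahead bookkeeping both iterators need.
class ReadaheadHelper {
 public:
  explicit ReadaheadHelper(std::size_t compaction_readahead_size)
      : compaction_readahead_size_(compaction_readahead_size) {}

  // Returns how many bytes to read ahead for the next block.
  // Placeholder policy: a fixed size for compaction reads, none otherwise.
  std::size_t ReadaheadFor(bool is_for_compaction) {
    ++num_requests_;
    return is_for_compaction ? compaction_readahead_size_ : 0;
  }

  std::size_t num_requests() const { return num_requests_; }

 private:
  std::size_t compaction_readahead_size_;
  std::size_t num_requests_ = 0;
};

// Iterator over data blocks: delegates prefetch decisions to the helper.
class DataBlockIter {
 public:
  explicit DataBlockIter(std::size_t compaction_readahead_size)
      : prefetcher_(compaction_readahead_size) {}

  std::size_t LoadNextBlock(bool is_for_compaction) {
    return prefetcher_.ReadaheadFor(is_for_compaction);
  }
  std::size_t blocks_loaded() const { return prefetcher_.num_requests(); }

 private:
  ReadaheadHelper prefetcher_;
};

// Iterator over index partitions: a separate class with a smaller surface
// (for example, no Prev()), reusing the same helper by composition.
class IndexPartitionIter {
 public:
  explicit IndexPartitionIter(std::size_t compaction_readahead_size)
      : prefetcher_(compaction_readahead_size) {}

  std::size_t LoadNextPartition(bool is_for_compaction) {
    return prefetcher_.ReadaheadFor(is_for_compaction);
  }
  std::size_t partitions_loaded() const { return prefetcher_.num_requests(); }

 private:
  ReadaheadHelper prefetcher_;
};
```

With composition, each iterator can drop the members it does not need while the shared logic still lives in one place.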

This commit will enable future follow-ups. One direction is that we might separate the iterator interface for data blocks from the one for index blocks, as they are quite different.

Test Plan: Build using make and cmake, and build release.


ltamasi · 2020-03-13T16:42:29Z

table/block_based/block_based_table_iterator.cc

+  CheckOutOfBound();
+
+  if (target) {
+    assert(!Valid() || ((block_type_ == BlockType::kIndex &&


block_type_ can no longer be kIndex in this class, right?

ltamasi · 2020-03-13T16:57:15Z

table/block_based/block_prefetcher.h

+namespace ROCKSDB_NAMESPACE {
+class BlockPrefetcher {
+ public:
+  BlockPrefetcher(size_t compaction_readahead_size)


We probably should make this ctor explicit.
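
For readers unfamiliar with the concern, here is a minimal sketch of what `explicit` changes for a single-argument constructor. `Prefetcher` and `ReadaheadOf` are hypothetical stand-ins for illustration, not the `BlockPrefetcher` code under review.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical class with a single-argument constructor.
class Prefetcher {
 public:
  // `explicit` prevents a bare size_t from silently converting to a Prefetcher.
  explicit Prefetcher(std::size_t compaction_readahead_size)
      : compaction_readahead_size_(compaction_readahead_size) {}

  std::size_t compaction_readahead_size() const {
    return compaction_readahead_size_;
  }

 private:
  std::size_t compaction_readahead_size_;
};

// Hypothetical function taking the class by const reference.
inline std::size_t ReadaheadOf(const Prefetcher& prefetcher) {
  return prefetcher.compaction_readahead_size();
}

// Direct construction still works, but implicit conversion does not.
static_assert(std::is_constructible<Prefetcher, std::size_t>::value,
              "Prefetcher can be built from a size_t");
static_assert(!std::is_convertible<std::size_t, Prefetcher>::value,
              "a size_t does not implicitly convert to Prefetcher");

// ReadaheadOf(4096);              // would not compile with `explicit`
// ReadaheadOf(Prefetcher(4096));  // fine: the conversion is spelled out
```

Without `explicit`, the commented-out `ReadaheadOf(4096)` call would compile and create a temporary object, which is the kind of accidental conversion this comment is guarding against.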

ltamasi · 2020-03-13T16:59:36Z

table/block_based/partitioned_index_iterator.h

+        user_comparator_(icomp.user_comparator()),
+        index_iter_(index_iter),
+        block_iter_points_to_real_block_(false),
+        block_type_(block_type),


... on the contrary, block_type_ can only be kIndex in this class, right? If so, we could remove this member altogether.

ltamasi · 2020-03-13T17:03:15Z

src.mk

@@ -124,8 +124,10 @@ LIB_SOURCES =                                                   \
  table/block_based/block_based_filter_block.cc                 \
  table/block_based/block_based_table_builder.cc                \
  table/block_based/block_based_table_factory.cc                \
+  table/block_based/block_based_table_iterator.cc               \


Could you please run the buckifier to update TARGETS as well?

ltamasi · 2020-03-13T17:05:51Z

table/block_based/block_based_table_iterator.cc

+    bool is_for_compaction =
+        lookup_context_.caller == TableReaderCaller::kCompaction;
+    // Prefetch additional data for range scans (iterators). Enabled only for
+    // user reads.


I don't think this is true anymore?

ltamasi · 2020-03-13T17:07:09Z

table/block_based/block_based_table_iterator.h

@@ -39,7 +39,7 @@ class BlockBasedTableIterator : public InternalIteratorBase<TValue> {
        prefix_extractor_(prefix_extractor),
        block_type_(block_type),
        lookup_context_(caller),
-        compaction_readahead_size_(compaction_readahead_size) {}
+        block_prefetcher_(compaction_readahead_size) {}

  ~BlockBasedTableIterator() { delete index_iter_; }


We could use unique_ptr instead of explicitly calling delete.
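
A minimal sketch of that suggestion, using hypothetical stand-in types (`IndexIter`, `TableIter`) rather than the real iterator classes. It assumes the owning class can take the pointer as a `std::unique_ptr`, whereas the constructor in this PR takes a raw pointer.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-in for the index iterator; it tracks how many
// instances are alive so that ownership can be observed from outside.
struct IndexIter {
  explicit IndexIter(int* live_count) : live_count_(live_count) {
    ++*live_count_;
  }
  ~IndexIter() { --*live_count_; }
  IndexIter(const IndexIter&) = delete;
  IndexIter& operator=(const IndexIter&) = delete;

  int* live_count_;
};

// Hypothetical owner: no user-written destructor is needed because
// unique_ptr deletes the index iterator when TableIter is destroyed.
class TableIter {
 public:
  explicit TableIter(std::unique_ptr<IndexIter> index_iter)
      : index_iter_(std::move(index_iter)) {}

  bool has_index_iter() const { return index_iter_ != nullptr; }

 private:
  std::unique_ptr<IndexIter> index_iter_;
};
```

Ownership also becomes visible in the constructor signature, so callers can see that the iterator takes over the pointer.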

ltamasi · 2020-03-13T17:10:24Z

table/block_based/partitioned_index_iterator.h

+        lookup_context_(caller),
+        block_prefetcher_(compaction_readahead_size) {}
+
+  ~ParititionedIndexIterator() { delete index_iter_; }


We could make index_iter_ a unique_ptr here as well.

ltamasi

LGTM. Thanks!

facebook-github-bot

@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot · 2020-03-16T22:39:23Z

This pull request has been merged in d669080.

Summary: #6531 removed some code in the partitioned index seek logic. By mistake, the logic that stores the previous index offset was removed while the logic that uses it was preserved, so the code might use a wrong value to decide whether to reseek. The bug is triggered if a Seek() that does not go to the last block is followed by SeekToLast(), and then by a Seek() that should position the cursor at the block before the one SeekToLast() landed on.

Pull Request resolved: #6551

Test Plan: Add a unit test that reproduces the bug. The same unit test also covers some reseek cases to avoid regression.

Reviewed By: pdillinger

Differential Revision: D20493990

fbshipit-source-id: 3919aa4861c0481ec96844e053048da1a934b91d

Fix regression bug in partitioned index reseek caused by facebook#6531 (facebook#6551)

siying requested a review from ltamasi March 13, 2020 05:35

facebook-github-bot added the CLA Signed label Mar 13, 2020

ltamasi reviewed Mar 13, 2020


Address comments.

7bfb9cd

ltamasi approved these changes Mar 16, 2020


facebook-github-bot reviewed Mar 16, 2020


facebook-github-bot closed this in d669080 Mar 16, 2020

facebook-github-bot added the Merged label Mar 16, 2020

ltamasi mentioned this pull request Mar 17, 2020

Fix regression bug in partitioned index reseek caused by #6531 #6551

Closed

sthagen mentioned this pull request Mar 18, 2020

Fix regression bug in partitioned index reseek caused by #6531 (#6551) sthagen/facebook-rocksdb#139

Merged

sthagen added a commit to sthagen/facebook-rocksdb that referenced this pull request Mar 18, 2020

Merge pull request #139 from facebook/master

5cbfa06

