Fix a bug in format_version 3 + partition filters + prefix search #5835

maysamyabandeh · 2019-09-20T23:47:48Z

Partitioned filters make use of a top-level index to find the partition in which the filter resides. The top-level index has a key per partition. The key is guaranteed to be larger or equal than any key in that partition. When used with format_version 3, which excludes the sequence number form index keys, the separator key in the index could be equal to the prefix of the keys in the next partition. In this way, when searching for the key, the top-level index will lead us to the previous partition, which has no key with that prefix. The prefix bloom test thus returns false, although the prefix exists in the bloom of the next partition.
The patch fixes that by always adding the prefix of the first key of the next partition to the bloom of the current partition. In this way, in the corner cases that the index will lead us to the previous partition, we still can find the bloom filter there.

Here is an example: Keys are P1K1, and P2K2, each in a separate block. The separator in the index is P2. When seek to prefix P2, the index matches with separator P2 and points to the block that contains P1K1, where there used to be no bloom of prefix P2.
This is not an issue with format_version 2, since separator will be <P2, MaxSeq> while the search key will be <P2, snapshot_seq>, and since in rocksdb keys are sorted in reverse order of their sequence number we have <P2, snapshot_seq> larger than <P2, MaxSeq>, and thus the index will lead the search to the next 2nd block, where we have filter for P2K2 and its prefix P2.

facebook-github-bot

@maysamyabandeh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

siying · 2019-09-21T00:07:05Z

table/block_based/full_filter_block.h

@@ -59,12 +59,13 @@ class FullFilterBlockBuilder : public FilterBlockBuilder {
  virtual void AddKey(const Slice& key);
  std::unique_ptr<FilterBitsBuilder> filter_bits_builder_;
  virtual void Reset();
+  const SliceTransform* prefix_extractor_;


Instead of making it public, can you add an accesser instead?

siying · 2019-09-21T00:15:27Z

I think the bug is more than in format_version 3, so I suggest we modify the word in HISTORY.md and PR title.

siying · 2019-09-21T00:16:05Z

Btw, I don't think the fix is a hack. It's a solid solution.

facebook-github-bot · 2019-09-24T01:09:05Z

@maysamyabandeh has updated the pull request. Re-import the pull request

facebook-github-bot · 2019-09-24T17:33:50Z

@maysamyabandeh has updated the pull request. Re-import the pull request

facebook-github-bot

@maysamyabandeh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

maysamyabandeh · 2019-09-24T17:59:49Z

Discussed offline. The description is updated to explain why the bug does not manifest with format version 2.

facebook-github-bot · 2019-09-25T00:40:51Z

@maysamyabandeh merged this pull request in 6652c94.

…cebook#5835) Summary: Partitioned filters make use of a top-level index to find the partition in which the filter resides. The top-level index has a key per partition. The key is guaranteed to be larger or equal than any key in that partition. When used with format_version 3, which excludes the sequence number form index keys, the separator key in the index could be equal to the prefix of the keys in the next partition. In this way, when searching for the key, the top-level index will lead us to the previous partition, which has no key with that prefix. The prefix bloom test thus returns false, although the prefix exists in the bloom of the next partition. The patch fixes that by a hack: It always adds the prefix of the first key of the next partition to the bloom of the current partition. In this way, in the corner cases that the index will lead us to the previous partition, we still can find the bloom filter there. Pull Request resolved: facebook#5835 Differential Revision: D17513585 Pulled By: maysamyabandeh fbshipit-source-id: e2d1ff26c759e6e03875c4d57f4228316ecf50e9

Summary: Basically, the fix in facebook#8137 was incomplete (and I missed it in the review), because if `whole_key_filtering` is false, then `last_prefix_str_` will never be set to non-empty and the fix doesn't work. Also related to facebook#5835. This is intended as a safe, simple fix that will regress CPU efficiency slightly (for `whole_key_filtering=false` cases, because of extra prefix string copies during flush & compaction). An efficient fix is not possible without some substantial refactoring. Also in this PR: new test DBBloomFilterTest.FilterNumEntriesCoalesce tests an adjacent code path that was previously untested for its effect of ensuring the number of unique prefixes and keys is tracked properly when both prefixes and whole keys are going into a filter. (Test fails when either of the two code segments checking for duplicates is disabled.) In addition, the same test would fail before the main bug fix here because the code would inappropriately add the empty string to the filter (because of unmodified `last_prefix_str_`). Test Plan: In addition to DBBloomFilterTest.FilterNumEntriesCoalesce, extended DBBloomFilterTest.SeekForPrevWithPartitionedFilters to cover the broken case. (Mostly whitespace change.)

…2872) Summary: Basically, the fix in #8137 was incomplete (and I missed it in the review), because if `whole_key_filtering` is false, then `last_prefix_str_` will never be set to non-empty and the fix doesn't work. Also related to #5835. This is intended as a safe, simple fix that will regress CPU efficiency slightly (for `whole_key_filtering=false` cases, because of extra prefix string copies during flush & compaction). An efficient fix is not possible without some substantial refactoring. Also in this PR: new test DBBloomFilterTest.FilterNumEntriesCoalesce tests an adjacent code path that was previously untested for its effect of ensuring the number of unique prefixes and keys is tracked properly when both prefixes and whole keys are going into a filter. (Test fails when either of the two code segments checking for duplicates is disabled.) In addition, the same test would fail before the main bug fix here because the code would inappropriately add the empty string to the filter (because of unmodified `last_prefix_str_`). Pull Request resolved: #12872 Test Plan: In addition to DBBloomFilterTest.FilterNumEntriesCoalesce, extended DBBloomFilterTest.SeekForPrevWithPartitionedFilters to cover the broken case. (Mostly whitespace change.) Reviewed By: jowlyzhang Differential Revision: D59873793 Pulled By: pdillinger fbshipit-source-id: 2a7b7f09ca73dc188fb4dab833826ad6da7ebb11

…cebook#12872) Summary: Basically, the fix in facebook#8137 was incomplete (and I missed it in the review), because if `whole_key_filtering` is false, then `last_prefix_str_` will never be set to non-empty and the fix doesn't work. Also related to facebook#5835. This is intended as a safe, simple fix that will regress CPU efficiency slightly (for `whole_key_filtering=false` cases, because of extra prefix string copies during flush & compaction). An efficient fix is not possible without some substantial refactoring. Also in this PR: new test DBBloomFilterTest.FilterNumEntriesCoalesce tests an adjacent code path that was previously untested for its effect of ensuring the number of unique prefixes and keys is tracked properly when both prefixes and whole keys are going into a filter. (Test fails when either of the two code segments checking for duplicates is disabled.) In addition, the same test would fail before the main bug fix here because the code would inappropriately add the empty string to the filter (because of unmodified `last_prefix_str_`). Pull Request resolved: facebook#12872 Test Plan: In addition to DBBloomFilterTest.FilterNumEntriesCoalesce, extended DBBloomFilterTest.SeekForPrevWithPartitionedFilters to cover the broken case. (Mostly whitespace change.) Reviewed By: jowlyzhang Differential Revision: D59873793 Pulled By: pdillinger fbshipit-source-id: 2a7b7f09ca73dc188fb4dab833826ad6da7ebb11

myabandeh added 3 commits September 20, 2019 16:38

the temp fix

00777f0

add unit test

072adb4

update history

9f876a9

maysamyabandeh requested a review from siying September 20, 2019 23:47

facebook-github-bot added the CLA Signed label Sep 20, 2019

facebook-github-bot reviewed Sep 21, 2019

View reviewed changes

siying approved these changes Sep 21, 2019

View reviewed changes

make format

2e7d57f

apply comments

446e830

facebook-github-bot reviewed Sep 24, 2019

View reviewed changes

facebook-github-bot closed this in 6652c94 Sep 24, 2019

facebook-github-bot added the Merged label Sep 25, 2019

pdillinger mentioned this pull request Jul 17, 2024

Fix major bug with prefixes, SeekForPrev, and partitioned filters #12872

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix a bug in format_version 3 + partition filters + prefix search #5835

Fix a bug in format_version 3 + partition filters + prefix search #5835

maysamyabandeh commented Sep 20, 2019 •

edited

Loading

facebook-github-bot left a comment

siying Sep 21, 2019

siying commented Sep 21, 2019

siying commented Sep 21, 2019

facebook-github-bot commented Sep 24, 2019

facebook-github-bot commented Sep 24, 2019

facebook-github-bot left a comment

maysamyabandeh commented Sep 24, 2019

facebook-github-bot commented Sep 25, 2019

Fix a bug in format_version 3 + partition filters + prefix search #5835

Fix a bug in format_version 3 + partition filters + prefix search #5835

Conversation

maysamyabandeh commented Sep 20, 2019 • edited Loading

facebook-github-bot left a comment

Choose a reason for hiding this comment

siying Sep 21, 2019

Choose a reason for hiding this comment

siying commented Sep 21, 2019

siying commented Sep 21, 2019

facebook-github-bot commented Sep 24, 2019

facebook-github-bot commented Sep 24, 2019

facebook-github-bot left a comment

Choose a reason for hiding this comment

maysamyabandeh commented Sep 24, 2019

facebook-github-bot commented Sep 25, 2019

maysamyabandeh commented Sep 20, 2019 •

edited

Loading