-
Notifications
You must be signed in to change notification settings - Fork 6.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Readers for partition filter #1961
Readers for partition filter #1961
Conversation
@maysamyabandeh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
@maysamyabandeh updated the pull request - view changes - changes since last import |
@maysamyabandeh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
@maysamyabandeh updated the pull request - view changes - changes since last import |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for delay! Haven't finish reading, but here's some comments and questions.
table/filter_block.h
Outdated
virtual size_t ApproximateMemoryUsage() const = 0; | ||
virtual size_t size() const { return size_; } | ||
virtual Statistics* statistics() const { return statistics_; } | ||
|
||
bool whole_key_filtering() const { return whole_key_filtering_; } | ||
|
||
inline int GetLevel() const { return level_; }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove inline keyword?
table/partitioned_filter_block.h
Outdated
Slice GetFilterPartitionHandle(const Slice& entry); | ||
BlockBasedTable::CachableEntry<FilterBlockReader> GetFilterPartition( | ||
Slice* handle, const bool no_io, bool* cached); | ||
std::map<uint64_t, FilterBlockReader*> filter_cache; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: private member names should end with underscore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use std::unordered_map instead?
table/partitioned_filter_block.cc
Outdated
kDisableGlobalSequenceNumber, | ||
0 /* read_amp_bytes_per_bit */, stats)); | ||
BlockIter iter; | ||
idx_on_fltr_blk_->NewIterator(&comparator_, &iter, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's this doing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
must be a mistake. will remote it.
table/partitioned_filter_block.h
Outdated
std::unique_ptr<Block> idx_on_fltr_blk_; | ||
const Comparator& comparator_; | ||
const BlockBasedTable* table_; | ||
Slice GetFilterPartitionHandle(const Slice& entry); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: move methods to before member variables?
if (cached) { | ||
return res; | ||
} | ||
if (LIKELY(filter_partition.IsSet())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why release the cached block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a block cache entries should always be released unless we want to pin it. here we have no intention to pin all the filter partitions in the block cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I don't know why I confuse between release and erase.
const Slice* const const_ikey_ptr) { | ||
assert(const_ikey_ptr != nullptr); | ||
assert(block_offset == kNotValid); | ||
if (!prefix_extractor_) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where is the prefix extractor being use? Do we only need a boolean flag?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is based on the existing impl in the full filter block: https://github.com/facebook/rocksdb/blob/master/table/full_filter_block.cc#L79
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But in full filter block the prefix extractor is actually being use. I'm not familiar with prefix match, so my question is, do you intent to let partitioned filter support prefix match, if so how is it being support right now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Partitioned Filter must be able to all that full filter does. FullFilter does it by implementing PrefixMayMatch, so does partitioned filter. The impl of full filter assumes that prefix has been added at the build time just like a normal key, so does Partitioned Filter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it.
table/partitioned_filter_block.h
Outdated
BlockBasedTable::CachableEntry<FilterBlockReader> GetFilterPartition( | ||
Slice* handle, const bool no_io, bool* cached); | ||
std::map<uint64_t, FilterBlockReader*> filter_cache; | ||
std::vector<Cache::Handle*> handle_list; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can also consider using rocksdb::autovector
table/partitioned_filter_block.cc
Outdated
auto filter = | ||
table_->GetFilter(fltr_blk_handle, is_a_filter_partition, no_io); | ||
if (pin_cached_filters && filter.IsSet()) { | ||
WriteLock wl(&mu); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems to me it is possible some other thread may cache the filter before we hold write lock. Should we double check if it is cached after holding write lock?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would not cause a correctness issue. the filter_cache would simply overwrite the value. due to low probability of this happening it is not a performance issue either. but it would be clear if we do the extra check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Look good to me overall. Will accept after comments are addressed.
@@ -1000,8 +1083,7 @@ BlockBasedTable::CachableEntry<FilterBlockReader> BlockBasedTable::GetFilter( | |||
return {nullptr /* filter */, nullptr /* cache handle */}; | |||
} | |||
|
|||
// we have a pinned filter block | |||
if (rep_->filter_entry.IsSet()) { | |||
if (!is_a_filter_partition && rep_->filter_entry.IsSet()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this go before line 1080? This branch don't depend on block cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
filter_entry refers to an entry in the block cache:
CachableEntry<FilterBlockReader> filter_entry;
if (cached) { | ||
return res; | ||
} | ||
if (LIKELY(filter_partition.IsSet())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I don't know why I confuse between release and erase.
const Slice* const const_ikey_ptr) { | ||
assert(const_ikey_ptr != nullptr); | ||
assert(block_offset == kNotValid); | ||
if (!prefix_extractor_) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But in full filter block the prefix extractor is actually being use. I'm not familiar with prefix match, so my question is, do you intent to let partitioned filter support prefix match, if so how is it being support right now?
cache_ = NewLRUCache(1, 1, false); | ||
table_options_.block_cache = cache_; | ||
table_options_.filter_policy.reset(NewBloomFilterPolicy(10, false)); | ||
table_options_.no_block_cache = true; // Otherwise BlockBasedTable::Close |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are you setting block cache (line 40-41) if you don't want it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch. will remove it.
builder->Add(keys[i]); | ||
CutABlock(pib.get(), keys[i], keys[i + 1]); | ||
i++; | ||
builder->Add(keys[i]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the purpose of adding same key twice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i do not have a particular bug in mind but if the impl of FullFilterBitsBuilder does check for two consecutive duplicate keys so partitioned filter should also be resilient against this case.
@maysamyabandeh updated the pull request - view changes - changes since last import |
This is the last split of this pull request: facebook#1891 which includes the reader part as well as the tests.
40f7b1f
to
c2b25b5
Compare
@maysamyabandeh updated the pull request - view changes - changes since last import |
@maysamyabandeh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
@maysamyabandeh updated the pull request - view changes - changes since last import |
08991eb
to
7c6c318
Compare
@maysamyabandeh updated the pull request - view changes - changes since last import |
7c6c318
to
9e1ef29
Compare
@maysamyabandeh updated the pull request - view changes - changes since last import |
@maysamyabandeh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
@@ -697,6 +732,7 @@ Status BlockBasedTable::Open(const ImmutableCFOptions& ioptions, | |||
if (rep->table_options.pin_l0_filter_and_index_blocks_in_cache && | |||
level == 0) { | |||
rep->filter_entry = filter_entry; | |||
rep->filter_entry.value->SetLevel(level); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this causes a segfault when filter_policy is nullptr and both pin_l0_filter_and_index_blocks_in_cache/cache_index_and_filter_blocks are set.
repro:
TEST_TMPDIR=/dev/shm/ ./db_bench -benchmarks=fillrandom -pin_l0_filter_and_index_blocks_in_cache=true -cache_index_and_filter_blocks=true
Summary: Fixes facebook#1961 which causes a segfault when filter_policy is nullptr and both pin_l0_filter_and_index_blocks_in_cache/cache_index_and_filter_blocks are set. Test Plan: TEST_TMPDIR=/data/rate_limit_bench/ ./db_bench -benchmarks=fillrandom -pin_l0_filter_and_index_blocks_in_cache=true -cache_index_and_filter_blocks=true Reviewers: Subscribers: Tasks: Tags: Blame Revision:
This is the last split of this pull request: #1891 which includes the reader part as well as the tests.