Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Readers for partition filter #1961

Conversation

maysamyabandeh
Copy link
Contributor

This is the last split of this pull request: #1891 which includes the reader part as well as the tests.

@facebook-github-bot
Copy link
Contributor

@maysamyabandeh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@maysamyabandeh updated the pull request - view changes - changes since last import

@facebook-github-bot
Copy link
Contributor

@maysamyabandeh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@maysamyabandeh updated the pull request - view changes - changes since last import

Copy link
Contributor

@yiwu-arbug yiwu-arbug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for delay! Haven't finish reading, but here's some comments and questions.

virtual size_t ApproximateMemoryUsage() const = 0;
virtual size_t size() const { return size_; }
virtual Statistics* statistics() const { return statistics_; }

bool whole_key_filtering() const { return whole_key_filtering_; }

inline int GetLevel() const { return level_; };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: remove inline keyword?

Slice GetFilterPartitionHandle(const Slice& entry);
BlockBasedTable::CachableEntry<FilterBlockReader> GetFilterPartition(
Slice* handle, const bool no_io, bool* cached);
std::map<uint64_t, FilterBlockReader*> filter_cache;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: private member names should end with underscore.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use std::unordered_map instead?

kDisableGlobalSequenceNumber,
0 /* read_amp_bytes_per_bit */, stats));
BlockIter iter;
idx_on_fltr_blk_->NewIterator(&comparator_, &iter, true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's this doing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

must be a mistake. will remote it.

std::unique_ptr<Block> idx_on_fltr_blk_;
const Comparator& comparator_;
const BlockBasedTable* table_;
Slice GetFilterPartitionHandle(const Slice& entry);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: move methods to before member variables?

if (cached) {
return res;
}
if (LIKELY(filter_partition.IsSet())) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why release the cached block?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a block cache entries should always be released unless we want to pin it. here we have no intention to pin all the filter partitions in the block cache.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I don't know why I confuse between release and erase.

const Slice* const const_ikey_ptr) {
assert(const_ikey_ptr != nullptr);
assert(block_offset == kNotValid);
if (!prefix_extractor_) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is the prefix extractor being use? Do we only need a boolean flag?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is based on the existing impl in the full filter block: https://github.com/facebook/rocksdb/blob/master/table/full_filter_block.cc#L79

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But in full filter block the prefix extractor is actually being use. I'm not familiar with prefix match, so my question is, do you intent to let partitioned filter support prefix match, if so how is it being support right now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Partitioned Filter must be able to all that full filter does. FullFilter does it by implementing PrefixMayMatch, so does partitioned filter. The impl of full filter assumes that prefix has been added at the build time just like a normal key, so does Partitioned Filter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it.

BlockBasedTable::CachableEntry<FilterBlockReader> GetFilterPartition(
Slice* handle, const bool no_io, bool* cached);
std::map<uint64_t, FilterBlockReader*> filter_cache;
std::vector<Cache::Handle*> handle_list;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can also consider using rocksdb::autovector

auto filter =
table_->GetFilter(fltr_blk_handle, is_a_filter_partition, no_io);
if (pin_cached_filters && filter.IsSet()) {
WriteLock wl(&mu);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to me it is possible some other thread may cache the filter before we hold write lock. Should we double check if it is cached after holding write lock?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would not cause a correctness issue. the filter_cache would simply overwrite the value. due to low probability of this happening it is not a performance issue either. but it would be clear if we do the extra check.

Copy link
Contributor

@yiwu-arbug yiwu-arbug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look good to me overall. Will accept after comments are addressed.

@@ -1000,8 +1083,7 @@ BlockBasedTable::CachableEntry<FilterBlockReader> BlockBasedTable::GetFilter(
return {nullptr /* filter */, nullptr /* cache handle */};
}

// we have a pinned filter block
if (rep_->filter_entry.IsSet()) {
if (!is_a_filter_partition && rep_->filter_entry.IsSet()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this go before line 1080? This branch don't depend on block cache.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filter_entry refers to an entry in the block cache:

CachableEntry<FilterBlockReader> filter_entry;

if (cached) {
return res;
}
if (LIKELY(filter_partition.IsSet())) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I don't know why I confuse between release and erase.

const Slice* const const_ikey_ptr) {
assert(const_ikey_ptr != nullptr);
assert(block_offset == kNotValid);
if (!prefix_extractor_) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But in full filter block the prefix extractor is actually being use. I'm not familiar with prefix match, so my question is, do you intent to let partitioned filter support prefix match, if so how is it being support right now?

cache_ = NewLRUCache(1, 1, false);
table_options_.block_cache = cache_;
table_options_.filter_policy.reset(NewBloomFilterPolicy(10, false));
table_options_.no_block_cache = true; // Otherwise BlockBasedTable::Close
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are you setting block cache (line 40-41) if you don't want it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch. will remove it.

builder->Add(keys[i]);
CutABlock(pib.get(), keys[i], keys[i + 1]);
i++;
builder->Add(keys[i]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the purpose of adding same key twice?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i do not have a particular bug in mind but if the impl of FullFilterBitsBuilder does check for two consecutive duplicate keys so partitioned filter should also be resilient against this case.

@facebook-github-bot
Copy link
Contributor

@maysamyabandeh updated the pull request - view changes - changes since last import

@facebook-github-bot
Copy link
Contributor

@maysamyabandeh updated the pull request - view changes - changes since last import

@facebook-github-bot
Copy link
Contributor

@maysamyabandeh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@maysamyabandeh updated the pull request - view changes - changes since last import

@facebook-github-bot
Copy link
Contributor

@maysamyabandeh updated the pull request - view changes - changes since last import

@facebook-github-bot
Copy link
Contributor

@maysamyabandeh updated the pull request - view changes - changes since last import

@facebook-github-bot
Copy link
Contributor

@maysamyabandeh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@@ -697,6 +732,7 @@ Status BlockBasedTable::Open(const ImmutableCFOptions& ioptions,
if (rep->table_options.pin_l0_filter_and_index_blocks_in_cache &&
level == 0) {
rep->filter_entry = filter_entry;
rep->filter_entry.value->SetLevel(level);
Copy link
Contributor

@ajkr ajkr Mar 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this causes a segfault when filter_policy is nullptr and both pin_l0_filter_and_index_blocks_in_cache/cache_index_and_filter_blocks are set.

repro:
TEST_TMPDIR=/dev/shm/ ./db_bench -benchmarks=fillrandom -pin_l0_filter_and_index_blocks_in_cache=true -cache_index_and_filter_blocks=true

maysamyabandeh pushed a commit to maysamyabandeh/rocksdb that referenced this pull request Mar 23, 2017
Summary:
Fixes facebook#1961 which causes a segfault when filter_policy is nullptr and both
pin_l0_filter_and_index_blocks_in_cache/cache_index_and_filter_blocks
are set.

Test Plan:
TEST_TMPDIR=/data/rate_limit_bench/ ./db_bench -benchmarks=fillrandom
-pin_l0_filter_and_index_blocks_in_cache=true
-cache_index_and_filter_blocks=true

Reviewers:

Subscribers:

Tasks:

Tags:

Blame Revision:
facebook-github-bot pushed a commit that referenced this pull request Mar 25, 2017
Summary:
Fixes #1961 which causes a segfault when filter_policy is nullptr and both
pin_l0_filter_and_index_blocks_in_cache/cache_index_and_filter_blocks
are set.
Closes #2029

Differential Revision: D4764862

Pulled By: maysamyabandeh

fbshipit-source-id: 05bd695
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants