[Draft] Add BlockBasedTableOptions::detect_filter_construct_corruption to Options::DisableExtraChecks() #9479

Closed
3 changes: 3 additions & 0 deletions HISTORY.md
@@ -24,6 +24,9 @@
### Behavior Changes
* Disallow the combination of DBOptions.use_direct_io_for_flush_and_compaction == true and DBOptions.writable_file_max_buffer_size == 0. This combination can cause WritableFileWriter::Append() to loop forever, and it does not make much sense in direct IO.

## New Features
* Introduced an option `BlockBasedTableOptions::detect_filter_construct_corruption` for detecting corruption during Bloom Filter (format_version >= 5) and Ribbon Filter construction.

## 6.29.0 (01/21/2022)
Note: The next release will be major release 7.0. See https://github.com/facebook/rocksdb/issues/9390 for more info.
### Public API change
312 changes: 281 additions & 31 deletions db/db_bloom_filter_test.cc

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions db_stress_tool/db_stress_common.h
@@ -149,6 +149,7 @@ DECLARE_bool(use_block_based_filter);
DECLARE_int32(ribbon_starting_level);
DECLARE_bool(partition_filters);
DECLARE_bool(optimize_filters_for_memory);
DECLARE_bool(detect_filter_construct_corruption);
DECLARE_int32(index_type);
DECLARE_string(db);
DECLARE_string(secondaries_base);
6 changes: 6 additions & 0 deletions db_stress_tool/db_stress_gflags.cc
@@ -453,6 +453,12 @@ DEFINE_bool(
ROCKSDB_NAMESPACE::BlockBasedTableOptions().optimize_filters_for_memory,
"Minimize memory footprint of filters");

DEFINE_bool(
detect_filter_construct_corruption,
ROCKSDB_NAMESPACE::BlockBasedTableOptions()
.detect_filter_construct_corruption,
"Detect corruption during new Bloom Filter and Ribbon Filter construction");

DEFINE_int32(
index_type,
static_cast<int32_t>(
2 changes: 2 additions & 0 deletions db_stress_tool/db_stress_test_base.cc
@@ -2272,6 +2272,8 @@ void StressTest::Open() {
block_based_options.partition_filters = FLAGS_partition_filters;
block_based_options.optimize_filters_for_memory =
FLAGS_optimize_filters_for_memory;
block_based_options.detect_filter_construct_corruption =
FLAGS_detect_filter_construct_corruption;
block_based_options.index_type =
static_cast<BlockBasedTableOptions::IndexType>(FLAGS_index_type);
block_based_options.prepopulate_block_cache =
31 changes: 31 additions & 0 deletions include/rocksdb/filter_policy.h
@@ -65,6 +65,37 @@ class FilterBitsBuilder {
// The ownership of actual data is set to buf
virtual Slice Finish(std::unique_ptr<const char[]>* buf) = 0;

// Similar to Finish(std::unique_ptr<const char[]>* buf), except that
// for a non-null status pointer argument, it will point to
// Status::Corruption() when there is any corruption during filter
// construction or Status::OK() otherwise.
//
// WARNING: do not use a filter resulting from a corrupted construction
virtual Slice Finish(std::unique_ptr<const char[]>* buf,
Status* /* status */) {
return Finish(buf);
Contributor

Following the typical API convention, a signature that returns a Status would be better:
Status Finish(std::unique_ptr<...>, Slice *slice)

Contributor

I am reworking this signature in changes I am working on. (Also allowing a MemoryAllocator to be specified.)

Contributor Author

@pdillinger Sounds good!
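
The Status-returning shape suggested in this thread could look roughly like the sketch below. `Slice`, `Status`, and `SketchFilterBitsBuilder` are minimal stand-ins defined here for illustration, not the RocksDB classes, and the reworked signature mentioned above may well differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>
#include <utility>

// Stand-in types for illustration only; these are NOT the RocksDB classes.
struct Slice {
  const char* data;
  std::size_t size;
};

class Status {
 public:
  static Status OK() { return Status(true, ""); }
  static Status Corruption(std::string msg) {
    return Status(false, std::move(msg));
  }
  bool ok() const { return ok_; }
  const std::string& message() const { return msg_; }

 private:
  Status(bool ok, std::string msg) : ok_(ok), msg_(std::move(msg)) {}
  bool ok_;
  std::string msg_;
};

// Hypothetical builder. The "filter" is just the concatenated keys, a
// placeholder payload that keeps the focus on the signature.
class SketchFilterBitsBuilder {
 public:
  void AddKey(const std::string& key) { payload_ += key; }
  void SimulateCorruption() { corrupted_ = true; }

  // The outcome is the return value; the filter bytes come back through
  // `filter`, and ownership of the backing buffer moves to `buf`.
  Status Finish(std::unique_ptr<const char[]>* buf, Slice* filter) {
    if (corrupted_) {
      *filter = Slice{nullptr, 0};
      return Status::Corruption("filter construction corrupted");
    }
    std::unique_ptr<char[]> bytes(new char[payload_.size()]);
    std::memcpy(bytes.get(), payload_.data(), payload_.size());
    const char* owned = bytes.release();
    buf->reset(owned);
    *filter = Slice{owned, payload_.size()};
    return Status::OK();
  }

 private:
  std::string payload_;
  bool corrupted_ = false;
};
```

With this shape a caller cannot read the filter without first receiving the status, whereas an optional `Status*` out-parameter can be left null and ignored.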

}

// Verify the filter returned from calling FilterBitsBuilder::Finish.
// The function returns Status::Corruption() if there is any corruption in the
// constructed filter or Status::OK() otherwise.
//
// Implementations should normally consult
// FilterBuildingContext::table_options.detect_filter_construct_corruption
// to determine whether to perform verification or to skip by returning
// Status::OK(). The decision is left to the FilterBitsBuilder so that
// verification prerequisites before PostVerify can be skipped when not
// configured.
//
// RocksDB internal will always call MaybePostVerify() on the filter after
// it is returned from calling FilterBitsBuilder::Finish
// except for FilterBitsBuilder::Finish resulting in a corruption
// status, which indicates the filter is already in a corrupted state and
// there is no need to post-verify
virtual Status MaybePostVerify(const Slice& /* filter_content */) {
return Status::OK();
}

// Approximate the number of keys that can be added and generate a filter
// <= the specified number of bytes. Callers (including RocksDB) should
// only use this result for optimizing performance and not as a guarantee.
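
A minimal sketch of how a builder might honor the two hooks in this hunk, `Finish(buf, status)` and `MaybePostVerify()`. Everything here is an assumption made for illustration: `Slice` and `Status` are stand-ins rather than the RocksDB types, `ToyFilterBitsBuilder` is a toy single-probe filter, and the XOR checksum over buffered hashes is one plausible detection scheme, not necessarily what this PR implements.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Stand-in types for illustration only; these are NOT the RocksDB classes.
struct Slice {
  const char* data;
  std::size_t size;
};

struct Status {
  enum Code { kOk, kCorruption };
  Code code;
  static Status OK() { return Status{kOk}; }
  static Status Corruption() { return Status{kCorruption}; }
  bool ok() const { return code == kOk; }
  bool IsCorruption() const { return code == kCorruption; }
};

constexpr std::size_t kFilterBytes = 64;  // fixed toy filter size

// Toy single-probe Bloom-style builder (hypothetical, not RocksDB's).
class ToyFilterBitsBuilder {
 public:
  explicit ToyFilterBitsBuilder(bool detect_corruption)
      : detect_(detect_corruption) {}

  void AddKey(const std::string& key) {
    const std::uint64_t h = std::hash<std::string>{}(key);
    hashes_.push_back(h);
    if (detect_) xor_checksum_ ^= h;  // running checksum of buffered hashes
  }

  // Test hook that simulates a memory fault in the buffered hashes.
  void FlipBitInBufferedHash(std::size_t i) { hashes_[i] ^= 1u; }

  Slice Finish(std::unique_ptr<const char[]>* buf, Status* status) {
    if (detect_) {
      // Recompute the checksum; a mismatch means a buffered hash changed.
      std::uint64_t x = 0;
      for (std::uint64_t h : hashes_) x ^= h;
      if (x != xor_checksum_) {
        if (status != nullptr) *status = Status::Corruption();
        return Slice{nullptr, 0};
      }
    }
    std::unique_ptr<char[]> bits(new char[kFilterBytes]());  // zeroed
    for (std::uint64_t h : hashes_) {
      const std::size_t bit = static_cast<std::size_t>(h % (kFilterBytes * 8));
      bits[bit / 8] = static_cast<char>(
          static_cast<unsigned char>(bits[bit / 8]) | (1u << (bit % 8)));
    }
    const char* owned = bits.release();
    buf->reset(owned);
    if (status != nullptr) *status = Status::OK();
    return Slice{owned, kFilterBytes};
  }

  // Every buffered hash must be found in the finished filter.
  Status MaybePostVerify(const Slice& filter) const {
    if (!detect_) return Status::OK();  // skipped when not configured
    if (filter.size != kFilterBytes) return Status::Corruption();
    for (std::uint64_t h : hashes_) {
      const std::size_t bit = static_cast<std::size_t>(h % (kFilterBytes * 8));
      const unsigned char byte =
          static_cast<unsigned char>(filter.data[bit / 8]);
      if (((byte >> (bit % 8)) & 1u) == 0) return Status::Corruption();
    }
    return Status::OK();
  }

 private:
  bool detect_;
  std::uint64_t xor_checksum_ = 0;
  std::vector<std::uint64_t> hashes_;
};
```

In this toy the builder keeps its hashes after `Finish` so that `MaybePostVerify` has something to check against; that retained state is the kind of verification prerequisite the comment above says can be skipped when the option is off.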
16 changes: 13 additions & 3 deletions include/rocksdb/table.h
@@ -293,16 +293,16 @@ struct BlockBasedTableOptions {
// the memory, if block cache available.
//
// Charged memory usage includes:
// 1. (new) Bloom Filter and Ribbon Filter construction
// 1. Bloom Filter (format_version >= 5) and Ribbon Filter construction
// 2. More to come...
//
// Note:
// 1. (new) Bloom Filter and Ribbon Filter construction
// 1. Bloom Filter (format_version >= 5) and Ribbon Filter construction
//
// If additional temporary memory of Ribbon Filter uses up too much memory
// relative to the available space left in the block cache
// at some point (i.e, causing a cache full when strict_capacity_limit =
// true), construction will fall back to (new) Bloom Filter.
// true), construction will fall back to Bloom Filter.
//
// Default: false
bool reserve_table_builder_memory = false;
@@ -365,6 +365,16 @@ struct BlockBasedTableOptions {
// This must generally be true for gets to be efficient.
bool whole_key_filtering = true;

// If true, detect corruption during Bloom Filter (format_version >= 5)
// and Ribbon Filter construction.
//
// This is an extra check that is only
// useful in detecting software bugs or CPU+memory malfunction.
// Turning on this feature increases filter construction time by 30%.
Contributor

"... may substantially increase filter construction time"

Contributor Author (@hx235, Feb 15, 2022)

Will fix in a separate tech debt PR

//
// TODO: optimize this performance
bool detect_filter_construct_corruption = false;

// Verify that decompressing the compressed block gives back the input. This
// is a verification mode that we use to detect bugs in compression
// algorithms.
5 changes: 5 additions & 0 deletions options/options.cc
@@ -482,6 +482,11 @@ Options* Options::DisableExtraChecks() {
// By current API contract, not including
// * verify_checksums
// because checking storage data integrity is a more standard practice.

BlockBasedTableOptions* table_options =
table_factory->GetOptions<BlockBasedTableOptions>();
table_options->detect_filter_construct_corruption = false;
Comment on lines +486 to +488
Contributor Author

Hey @pdillinger, I am learning more about how Options::DisableExtraChecks() is used. Must the DB be opened or reopened with the options modified by Options::DisableExtraChecks() for the change to take effect? This relates to my question below for Mark.

Contributor Author (@hx235, Feb 1, 2022)

Hi @mrambacher :) I want to modify one particular field, BlockBasedTableOptions::detect_filter_construct_corruption, in the current options (`this`), but I can't think of a way other than modifying it through the raw pointer and resetting the TableFactory, as in my implementation above. Based on the API doc and a codebase search, this does not seem to be standard practice.

However, if we always open/reopen the DB with the modified options containing the modified BlockBasedTableOptions (that is, YES to my question above to @pdillinger), then modifying through the raw pointer seems fine based on the API doc (is it?).

If not, is there any other way of achieving my goal?

Contributor Author

I found a potential alternative: make detect_filter_construct_corruption mutable, as in #8620, and set it dynamically through SetOptions().

table_factory.reset(new BlockBasedTableFactory(*table_options));
return this;
}
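
The question raised in this thread, whether a change made by `DisableExtraChecks()` reaches anything that already holds the old table factory, can be illustrated with a toy model. `TableOptionsSketch`, `TableFactorySketch`, and `OptionsSketch` are hypothetical stand-ins, not RocksDB types. Unlike the diff above, which writes through the pointer returned by `GetOptions` before resetting the factory, this sketch copies the options first so the old factory is never mutated.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins; these are NOT the RocksDB option classes.
struct TableOptionsSketch {
  bool detect_filter_construct_corruption = false;
};

class TableFactorySketch {
 public:
  explicit TableFactorySketch(const TableOptionsSketch& opts) : opts_(opts) {}
  const TableOptionsSketch& options() const { return opts_; }

 private:
  TableOptionsSketch opts_;  // private copy taken at construction
};

struct OptionsSketch {
  std::shared_ptr<TableFactorySketch> table_factory;

  // Copy the current table options, clear the extra check on the copy, and
  // install a fresh factory. The previous factory is left untouched.
  OptionsSketch* DisableExtraChecks() {
    TableOptionsSketch copy = table_factory->options();
    copy.detect_filter_construct_corruption = false;
    table_factory = std::make_shared<TableFactorySketch>(copy);
    return this;
  }
};
```

In this model, a holder of the old factory (for example, something created before the call) keeps seeing the old value, which is why the options would have to be handed over again for the change to be visible there. Whether RocksDB behaves the same way is exactly what the thread is asking.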

3 changes: 2 additions & 1 deletion options/options_settable_test.cc
@@ -174,7 +174,8 @@ TEST_F(OptionsSettableTest, BlockBasedTableOptionsAllFieldsSettable) {
"partition_filters=false;"
"optimize_filters_for_memory=true;"
"index_block_restart_interval=4;"
"filter_policy=bloomfilter:4:true;whole_key_filtering=1;"
"filter_policy=bloomfilter:4:true;whole_key_filtering=1;detect_filter_"
"construct_corruption=false;"
Comment on lines +177 to +178
Contributor

Nit: This is split in a funny way. Can you put the "detect_filter_construct_corruption=false" on its own line please?

Contributor Author (@hx235, Feb 15, 2022)

Will fix in a separate tech debt PR

"reserve_table_builder_memory=false;"
"format_version=1;"
"hash_index_allow_collision=false;"
4 changes: 3 additions & 1 deletion options/options_test.cc
@@ -856,7 +856,8 @@ TEST_F(OptionsTest, GetBlockBasedTableOptionsFromString) {
"block_size_deviation=8;block_restart_interval=4;"
"format_version=5;whole_key_filtering=1;"
"reserve_table_builder_memory=true;"
"filter_policy=bloomfilter:4.567:false;"
"filter_policy=bloomfilter:4.567:false;detect_filter_construct_"
"corruption=true;"
Comment on lines +859 to +860
Contributor

See the nit above.

Contributor Author (@hx235, Feb 15, 2022)

Will fix in a separate tech debt PR

// A bug caused read_amp_bytes_per_bit to be a large integer in OPTIONS
// file generated by 6.10 to 6.14. Though bug is fixed in these releases,
// we need to handle the case of loading OPTIONS file generated before the
@@ -876,6 +877,7 @@ TEST_F(OptionsTest, GetBlockBasedTableOptionsFromString) {
ASSERT_EQ(new_opt.block_restart_interval, 4);
ASSERT_EQ(new_opt.format_version, 5U);
ASSERT_EQ(new_opt.whole_key_filtering, true);
ASSERT_EQ(new_opt.detect_filter_construct_corruption, true);
ASSERT_EQ(new_opt.reserve_table_builder_memory, true);
ASSERT_TRUE(new_opt.filter_policy != nullptr);
const BloomFilterPolicy* bfp =
17 changes: 16 additions & 1 deletion table/block_based/block_based_table_builder.cc
@@ -1252,6 +1252,15 @@ void BlockBasedTableBuilder::WriteRawBlock(const Slice& block_contents,
uint32_t checksum = ComputeBuiltinChecksumWithLastByte(
r->table_options.checksum, block_contents.data(), block_contents.size(),
/*last_byte*/ type);

if (block_type == BlockType::kFilter) {
Status s = r->filter_builder->MaybePostVerifyFilter(block_contents);
if (!s.ok()) {
r->SetStatus(s);
return;
}
}
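
The gate added above (verify only filter blocks, and stop before anything is written if verification fails) can be sketched in isolation as follows. `RawBlockWriterSketch`, `BlockType`, and `Status` are hypothetical stand-ins defined here, and the verifier is injected as a callback purely so the sketch can be exercised without a real filter builder.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Stand-ins for illustration only; these are NOT the RocksDB types.
enum class BlockType { kData, kFilter };

struct Status {
  bool ok_flag;
  std::string msg;
  bool ok() const { return ok_flag; }
};

class RawBlockWriterSketch {
 public:
  using Verifier = std::function<Status(const std::string&)>;

  explicit RawBlockWriterSketch(Verifier verify_filter)
      : verify_filter_(std::move(verify_filter)) {}

  void WriteRawBlock(const std::string& contents, BlockType type) {
    if (type == BlockType::kFilter) {
      // Post-verify the finished filter before it reaches the file.
      Status s = verify_filter_(contents);
      if (!s.ok()) {
        status_ = s;  // record the failure and skip the write
        return;
      }
    }
    written_.push_back(contents);  // stands in for the actual file append
  }

  const Status& status() const { return status_; }
  const std::vector<std::string>& written() const { return written_; }

 private:
  Verifier verify_filter_;
  Status status_{true, ""};
  std::vector<std::string> written_;
};
```

Non-filter blocks bypass the verifier entirely, so the extra cost is paid only on the filter path.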

EncodeFixed32(trailer.data() + 1, checksum);
TEST_SYNC_POINT_CALLBACK(
"BlockBasedTableBuilder::WriteRawBlock:TamperWithChecksum",
@@ -1552,7 +1561,13 @@ void BlockBasedTableBuilder::WriteFilterBlock(
std::unique_ptr<const char[]> filter_data;
Slice filter_content =
rep_->filter_builder->Finish(filter_block_handle, &s, &filter_data);
assert(s.ok() || s.IsIncomplete());

assert(s.ok() || s.IsIncomplete() || s.IsCorruption());
if (s.IsCorruption()) {
rep_->SetStatus(s);
break;
}
Comment on lines +1565 to +1569
Contributor

Why not put the check for s.IsCorruption() before the assert and leave the original assert unchanged?

Contributor Author (@hx235, Feb 15, 2022)

Will fix in a separate tech debt PR
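
The ordering suggested in this thread, handling corruption first so that the original two-state assert can stay as it was, might look like the sketch below. `Status` and `WriteFilterPartitions` are stand-ins written for this example, and the loop semantics (an incomplete status means more partitions follow) are an assumption rather than something shown in the hunk.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in status type for illustration only; NOT rocksdb::Status.
struct Status {
  enum Code { kOk, kIncomplete, kCorruption };
  Code code;
  bool ok() const { return code == kOk; }
  bool IsIncomplete() const { return code == kIncomplete; }
  bool IsCorruption() const { return code == kCorruption; }
};

// Walks the statuses that successive Finish() calls would return.
// Returns the corruption status if one is seen, otherwise OK.
// *blocks_written counts the partitions accepted before stopping.
Status WriteFilterPartitions(const std::vector<Status>& finish_results,
                             std::size_t* blocks_written) {
  Status result{Status::kOk};
  *blocks_written = 0;
  for (const Status& s : finish_results) {
    if (s.IsCorruption()) {  // handled before the assert
      result = s;
      break;
    }
    assert(s.ok() || s.IsIncomplete());  // original assert, unchanged
    ++*blocks_written;
    if (s.ok()) break;  // assumed: OK marks the final partition
  }
  return result;
}
```

Because the corruption branch exits the loop before the assert runs, the assert keeps documenting the only two states expected on the normal path.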


rep_->props.filter_size += filter_content.size();

// TODO: Refactor code so that BlockType can determine both the C++ type
5 changes: 5 additions & 0 deletions table/block_based/block_based_table_factory.cc
@@ -356,6 +356,11 @@ static std::unordered_map<std::string, OptionTypeInfo>
{offsetof(struct BlockBasedTableOptions, whole_key_filtering),
OptionType::kBoolean, OptionVerificationType::kNormal,
OptionTypeFlags::kNone}},
{"detect_filter_construct_corruption",
{offsetof(struct BlockBasedTableOptions,
detect_filter_construct_corruption),
OptionType::kBoolean, OptionVerificationType::kNormal,
OptionTypeFlags::kNone}},
Comment on lines +362 to +363
Contributor

If you want to change it "on the fly" via SetOptions, it should be marked kMutable.

{"reserve_table_builder_memory",
{offsetof(struct BlockBasedTableOptions, reserve_table_builder_memory),
OptionType::kBoolean, OptionVerificationType::kNormal,
12 changes: 10 additions & 2 deletions table/block_based/filter_block.h
@@ -83,9 +83,17 @@ class FilterBlockBuilder {
,
Status* status, std::unique_ptr<const char[]>* filter_data = nullptr) = 0;

// It is for releasing the memory usage and cache reservation of filter bits
// builder in FullFilter and PartitionedFilter
// This is called when the caller finishes using the FilterBitsBuilder
// in order to release the memory usage and cache reservation
// associated with it in a timely manner
virtual void ResetFilterBitsBuilder() {}

// To optionally post-verify the filter returned from
// FilterBlockBuilder::Finish.
// Return Status::OK() if skipped.
virtual Status MaybePostVerifyFilter(const Slice& /* filter_content */) {
return Status::OK();
}
};

// A FilterBlockReader is used to parse filter from SST table.