-
Notifications
You must be signed in to change notification settings - Fork 6.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Draft] Add BlockBasedTableOptions::filter_construct_corruption to Options::DisableExtraCheck() #9479
[Draft] Add BlockBasedTableOptions::filter_construct_corruption to Options::DisableExtraCheck() #9479
Changes from all commits
614dff4
495437b
49dfa1f
d91c3f8
3935564
8899644
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Large diffs are not rendered by default.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -293,16 +293,16 @@ struct BlockBasedTableOptions { | |
// the memory, if block cache available. | ||
// | ||
// Charged memory usage includes: | ||
// 1. (new) Bloom Filter and Ribbon Filter construction | ||
// 1. Bloom Filter (format_version >= 5) and Ribbon Filter construction | ||
// 2. More to come... | ||
// | ||
// Note: | ||
// 1. (new) Bloom Filter and Ribbon Filter construction | ||
// 1. Bloom Filter (format_version >= 5) and Ribbon Filter construction | ||
// | ||
// If additional temporary memory of Ribbon Filter uses up too much memory | ||
// relative to the avaible space left in the block cache | ||
// at some point (i.e, causing a cache full when strict_capacity_limit = | ||
// true), construction will fall back to (new) Bloom Filter. | ||
// true), construction will fall back to Bloom Filter. | ||
// | ||
// Default: false | ||
bool reserve_table_builder_memory = false; | ||
|
@@ -365,6 +365,16 @@ struct BlockBasedTableOptions { | |
// This must generally be true for gets to be efficient. | ||
bool whole_key_filtering = true; | ||
|
||
// If true, detect corruption during Bloom Filter (format_version >= 5) | ||
// and Ribbon Filter construction. | ||
// | ||
// This is an extra check that is only | ||
// useful in detecting software bugs or CPU+memory malfunction. | ||
// Turning on this feature increases filter construction time by 30%. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. "... may substantially increase filter construction time" There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Will fix in a separate tech debt PR |
||
// | ||
// TODO: optimize this performance | ||
bool detect_filter_construct_corruption = false; | ||
|
||
// Verify that decompressing the compressed block gives back the input. This | ||
// is a verification mode that we use to detect bugs in compression | ||
// algorithms. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -482,6 +482,11 @@ Options* Options::DisableExtraChecks() { | |
// By current API contract, not including | ||
// * verify_checksums | ||
// because checking storage data integrity is a more standard practice. | ||
|
||
BlockBasedTableOptions* table_options = | ||
table_factory->GetOptions<BlockBasedTableOptions>(); | ||
table_options->detect_filter_construct_corruption = false; | ||
Comment on lines
+486
to
+488
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hey @pdillinger, I am learning more about how There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hi @mrambacher :) I want to modify one particular field However, if we always open/reopen the db with the modified option containing the modified BlockBasedTableOptions (that is, YES to my question above to @pdillinger ), then it seems fine to modify through raw pointer based on the API doc(is it?). If not, is there any other way of achieving my goal? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Found myself a potential alternative is to make detect_filter_construct_corruption mutable like #8620 and dynamically set it through SetOptions() |
||
table_factory.reset(new BlockBasedTableFactory(*table_options)); | ||
return this; | ||
} | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -174,7 +174,8 @@ TEST_F(OptionsSettableTest, BlockBasedTableOptionsAllFieldsSettable) { | |
"partition_filters=false;" | ||
"optimize_filters_for_memory=true;" | ||
"index_block_restart_interval=4;" | ||
"filter_policy=bloomfilter:4:true;whole_key_filtering=1;" | ||
"filter_policy=bloomfilter:4:true;whole_key_filtering=1;detect_filter_" | ||
"construct_corruption=false;" | ||
Comment on lines
+177
to
+178
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nit: This split in a funny way. Can you put the "detect_filter_construct_corrupion=false" on its own line please? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Will fix in a separate tech debt PR |
||
"reserve_table_builder_memory=false;" | ||
"format_version=1;" | ||
"hash_index_allow_collision=false;" | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -856,7 +856,8 @@ TEST_F(OptionsTest, GetBlockBasedTableOptionsFromString) { | |
"block_size_deviation=8;block_restart_interval=4;" | ||
"format_version=5;whole_key_filtering=1;" | ||
"reserve_table_builder_memory=true;" | ||
"filter_policy=bloomfilter:4.567:false;" | ||
"filter_policy=bloomfilter:4.567:false;detect_filter_construct_" | ||
"corruption=true;" | ||
Comment on lines
+859
to
+860
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. See above Nit There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Will fix in a separate tech debt PR |
||
// A bug caused read_amp_bytes_per_bit to be a large integer in OPTIONS | ||
// file generated by 6.10 to 6.14. Though bug is fixed in these releases, | ||
// we need to handle the case of loading OPTIONS file generated before the | ||
|
@@ -876,6 +877,7 @@ TEST_F(OptionsTest, GetBlockBasedTableOptionsFromString) { | |
ASSERT_EQ(new_opt.block_restart_interval, 4); | ||
ASSERT_EQ(new_opt.format_version, 5U); | ||
ASSERT_EQ(new_opt.whole_key_filtering, true); | ||
ASSERT_EQ(new_opt.detect_filter_construct_corruption, true); | ||
ASSERT_EQ(new_opt.reserve_table_builder_memory, true); | ||
ASSERT_TRUE(new_opt.filter_policy != nullptr); | ||
const BloomFilterPolicy* bfp = | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1252,6 +1252,15 @@ void BlockBasedTableBuilder::WriteRawBlock(const Slice& block_contents, | |
uint32_t checksum = ComputeBuiltinChecksumWithLastByte( | ||
r->table_options.checksum, block_contents.data(), block_contents.size(), | ||
/*last_byte*/ type); | ||
|
||
if (block_type == BlockType::kFilter) { | ||
Status s = r->filter_builder->MaybePostVerifyFilter(block_contents); | ||
if (!s.ok()) { | ||
r->SetStatus(s); | ||
return; | ||
} | ||
} | ||
|
||
EncodeFixed32(trailer.data() + 1, checksum); | ||
TEST_SYNC_POINT_CALLBACK( | ||
"BlockBasedTableBuilder::WriteRawBlock:TamperWithChecksum", | ||
|
@@ -1552,7 +1561,13 @@ void BlockBasedTableBuilder::WriteFilterBlock( | |
std::unique_ptr<const char[]> filter_data; | ||
Slice filter_content = | ||
rep_->filter_builder->Finish(filter_block_handle, &s, &filter_data); | ||
assert(s.ok() || s.IsIncomplete()); | ||
|
||
assert(s.ok() || s.IsIncomplete() || s.IsCorruption()); | ||
if (s.IsCorruption()) { | ||
rep_->SetStatus(s); | ||
break; | ||
} | ||
Comment on lines
+1565
to
+1569
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why not put the check for s.Corruption before the assert and leave the original? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Will fix in a separate tech debt PR |
||
|
||
rep_->props.filter_size += filter_content.size(); | ||
|
||
// TODO: Refactor code so that BlockType can determine both the C++ type | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -356,6 +356,11 @@ static std::unordered_map<std::string, OptionTypeInfo> | |
{offsetof(struct BlockBasedTableOptions, whole_key_filtering), | ||
OptionType::kBoolean, OptionVerificationType::kNormal, | ||
OptionTypeFlags::kNone}}, | ||
{"detect_filter_construct_corruption", | ||
{offsetof(struct BlockBasedTableOptions, | ||
detect_filter_construct_corruption), | ||
OptionType::kBoolean, OptionVerificationType::kNormal, | ||
OptionTypeFlags::kNone}}, | ||
Comment on lines
+362
to
+363
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If you want to change it "on the fly" via SetOptions, it should be marked kMutable. |
||
{"reserve_table_builder_memory", | ||
{offsetof(struct BlockBasedTableOptions, reserve_table_builder_memory), | ||
OptionType::kBoolean, OptionVerificationType::kNormal, | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on how the typical API signature, a signature returning a Status would be better:
Status Finish(std::unique_ptr<...>, Slice *slice)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am reworking this signature in changes I am working on. (Also allowing a MemoryAllocator to be specified.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pdillinger Sounds good!