-
Notifications
You must be signed in to change notification settings - Fork 6.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Draft] Add BlockBasedTableOptions::filter_construct_corruption to Options::DisableExtraCheck() #9479
[Draft] Add BlockBasedTableOptions::filter_construct_corruption to Options::DisableExtraCheck() #9479
Conversation
…entries corruption return/todo for streaming hash entry verification/keep hash entries in AddAllEntries() by default and use list
Summary: -add // -abstact out BuiltinFilterBitsReader and static cast -specfic API comment related (new) bloom filter -initialize xor_checksum in-place -remove unnecesary large write buffer size in test -bracket initialization of uint64_t from literal -inject corruption in Finish path -more realistic subtle corruption in test
BlockBasedTableOptions* table_options = | ||
table_factory->GetOptions<BlockBasedTableOptions>(); | ||
table_options->detect_filter_construct_corruption = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @pdillinger, I am learning more about how Options::DisableExtraChecks()
is used. Is opening/reopening the db with the modified option from calling Options::DisableExtraChecks() a must in order for Options::DisableExtraChecks() to take effect? This somehow relates to my question below for Mark.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @mrambacher :) I want to modify one particular field BlockBasedTableOptions::detect_filter_construct_corruption
in the current option this
but can't think of ways other than modifying through the raw pointer and reset the TableFactory like my impl above. This does not seem to be a standard practice to me based on the API doc and codebase search.
However, if we always open/reopen the db with the modified option containing the modified BlockBasedTableOptions (that is, YES to my question above to @pdillinger ), then it seems fine to modify through raw pointer based on the API doc(is it?).
If not, is there any other way of achieving my goal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found myself a potential alternative is to make detect_filter_construct_corruption mutable like #8620 and dynamically set it through SetOptions()
// WARNING: do not use a filter resulted from a corrupted construction | ||
virtual Slice Finish(std::unique_ptr<const char[]>* buf, | ||
Status* /* status */) { | ||
return Finish(buf); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on how the typical API signature, a signature returning a Status would be better:
Status Finish(std::unique_ptr<...>, Slice *slice)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am reworking this signature in changes I am working on. (Also allowing a MemoryAllocator to be specified.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pdillinger Sounds good!
// | ||
// This is an extra check that is only | ||
// useful in detecting software bugs or CPU+memory malfunction. | ||
// Turning on this feature increases filter construction time by 30%. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"... may substantially increase filter construction time"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will fix in a separate tech debt PR
"filter_policy=bloomfilter:4:true;whole_key_filtering=1;detect_filter_" | ||
"construct_corruption=false;" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: This split in a funny way. Can you put the "detect_filter_construct_corrupion=false" on its own line please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will fix in a separate tech debt PR
"filter_policy=bloomfilter:4.567:false;detect_filter_construct_" | ||
"corruption=true;" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above Nit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will fix in a separate tech debt PR
assert(s.ok() || s.IsIncomplete() || s.IsCorruption()); | ||
if (s.IsCorruption()) { | ||
rep_->SetStatus(s); | ||
break; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not put the check for s.Corruption before the assert and leave the original?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will fix in a separate tech debt PR
OptionType::kBoolean, OptionVerificationType::kNormal, | ||
OptionTypeFlags::kNone}}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want to change it "on the fly" via SetOptions, it should be marked kMutable.
Will adopt the comment and update the PR later! Thanks for the early review. |
will re-open when coming back to this |
Include code in #9342 since it hasn't merged in. This is just a draft for discussion.
Context/Summary:
Turning on
BlockBasedTableOptions::filter_construct_corruption
can introduce extra cpu cost reflected in 20-30% increase in filter building time (See #9342 pr summary for more). Similar toforce_consistency_checks
in #9363, this PR bringsBlockBasedTableOptions::filter_construct_corruption
underOptions::DisableExtraCheck()
.