Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Draft] Add BlockBasedTableOptions::filter_construct_corruption to Options::DisableExtraCheck() #9479

Closed

Conversation

hx235
Copy link
Contributor

@hx235 hx235 commented Feb 1, 2022

Include code in #9342 since it hasn't merged in. This is just a draft for discussion.

Context/Summary:
Turning on BlockBasedTableOptions::filter_construct_corruption can introduce extra cpu cost reflected in 20-30% increase in filter building time (See #9342 pr summary for more). Similar to force_consistency_checks in #9363, this PR brings BlockBasedTableOptions::filter_construct_corruption under Options::DisableExtraCheck().

…entries corruption return/todo for streaming hash entry verification/keep hash entries in AddAllEntries() by default and use list
Summary:
-add //
-abstact out BuiltinFilterBitsReader and static cast
-specfic API comment related (new) bloom filter
-initialize xor_checksum in-place
-remove unnecesary large write buffer size in test
-bracket initialization of uint64_t from literal
-inject corruption in Finish path
-more realistic subtle corruption in test
Summary:
-Fix AlwaysTrueFilter length
-Static GetBuiltinFilterBitsReader in BuiltinFilterPolicy
-Fix API comment related to DisableExtraChecks

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
Comment on lines +486 to +488
BlockBasedTableOptions* table_options =
table_factory->GetOptions<BlockBasedTableOptions>();
table_options->detect_filter_construct_corruption = false;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @pdillinger, I am learning more about how Options::DisableExtraChecks() is used. Is opening/reopening the db with the modified option from calling Options::DisableExtraChecks() a must in order for Options::DisableExtraChecks() to take effect? This somehow relates to my question below for Mark.

Copy link
Contributor Author

@hx235 hx235 Feb 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @mrambacher :) I want to modify one particular field BlockBasedTableOptions::detect_filter_construct_corruption in the current option this but can't think of ways other than modifying through the raw pointer and reset the TableFactory like my impl above. This does not seem to be a standard practice to me based on the API doc and codebase search.

However, if we always open/reopen the db with the modified option containing the modified BlockBasedTableOptions (that is, YES to my question above to @pdillinger ), then it seems fine to modify through raw pointer based on the API doc(is it?).

If not, is there any other way of achieving my goal?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found myself a potential alternative is to make detect_filter_construct_corruption mutable like #8620 and dynamically set it through SetOptions()

// WARNING: do not use a filter resulted from a corrupted construction
virtual Slice Finish(std::unique_ptr<const char[]>* buf,
Status* /* status */) {
return Finish(buf);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on how the typical API signature, a signature returning a Status would be better:
Status Finish(std::unique_ptr<...>, Slice *slice)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am reworking this signature in changes I am working on. (Also allowing a MemoryAllocator to be specified.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pdillinger Sounds good!

//
// This is an extra check that is only
// useful in detecting software bugs or CPU+memory malfunction.
// Turning on this feature increases filter construction time by 30%.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"... may substantially increase filter construction time"

Copy link
Contributor Author

@hx235 hx235 Feb 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix in a separate tech debt PR

Comment on lines +177 to +178
"filter_policy=bloomfilter:4:true;whole_key_filtering=1;detect_filter_"
"construct_corruption=false;"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This split in a funny way. Can you put the "detect_filter_construct_corrupion=false" on its own line please?

Copy link
Contributor Author

@hx235 hx235 Feb 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix in a separate tech debt PR

Comment on lines +859 to +860
"filter_policy=bloomfilter:4.567:false;detect_filter_construct_"
"corruption=true;"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above Nit

Copy link
Contributor Author

@hx235 hx235 Feb 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix in a separate tech debt PR

Comment on lines +1565 to +1569
assert(s.ok() || s.IsIncomplete() || s.IsCorruption());
if (s.IsCorruption()) {
rep_->SetStatus(s);
break;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not put the check for s.Corruption before the assert and leave the original?

Copy link
Contributor Author

@hx235 hx235 Feb 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix in a separate tech debt PR

Comment on lines +362 to +363
OptionType::kBoolean, OptionVerificationType::kNormal,
OptionTypeFlags::kNone}},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to change it "on the fly" via SetOptions, it should be marked kMutable.

@hx235
Copy link
Contributor Author

hx235 commented Feb 15, 2022

Will adopt the comment and update the PR later! Thanks for the early review.

@hx235
Copy link
Contributor Author

hx235 commented Apr 5, 2022

will re-open when coming back to this

@hx235 hx235 closed this Apr 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants