persist: introduce filter pushdown audit #18648
This is wise! All comments minor and nonblocking.
@@ -348,6 +349,7 @@ where
/// long as necessary to ensure the `SeqNo` isn't garbage collected while a
/// read still depends on it.
pub(crate) leased_seqno: Option<SeqNo>,
pub(crate) filter_pushdown_audit: bool,
Mild preference for this to turn into a wrapping struct in shard_source instead of pushing it in here. (Since it's a source-specific concern, and I doubt this is the last metadata we'll want to pass along.)
I had the same idea when I was typing this up, but that would mean a second copy of the Serde part, the Encoded part, and the Fetched part and I don't think the verbosity is worth it. gonna push back on this one, if that's okay
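For readers weighing the two options in this thread, here is a minimal sketch of the trade-off. Every type name below is a hypothetical stand-in rather than an actual persist type: `LeasedPart` plays the role of the existing part description that gains the new field, and `AuditedPart` is one possible shape for the source-side wrapper suggested above.

```rust
// Hypothetical stand-ins for illustration; not the real persist types.

/// Option A (what the PR does): the flag lives on the part itself, so every
/// existing representation of the part carries it along automatically.
#[derive(Debug, Clone, PartialEq)]
pub struct LeasedPart {
    pub key: String,
    pub filter_pushdown_audit: bool,
}

/// Option B (the suggestion): the part stays free of source-specific
/// concerns...
#[derive(Debug, Clone, PartialEq)]
pub struct PlainPart {
    pub key: String,
}

/// ...and the source wraps it together with its own metadata.
#[derive(Debug, Clone, PartialEq)]
pub struct AuditedPart {
    pub part: PlainPart,
    pub filter_pushdown_audit: bool,
}

impl AuditedPart {
    /// Wraps a part with the audit bit.
    pub fn new(part: PlainPart, filter_pushdown_audit: bool) -> Self {
        AuditedPart { part, filter_pushdown_audit }
    }

    /// Splits the wrapper back into the part and its audit bit.
    pub fn into_parts(self) -> (PlainPart, bool) {
        (self.part, self.filter_pushdown_audit)
    }
}
```

Under Option B, each stage the part passes through (the Serde, Encoded, and Fetched forms mentioned in the reply) would need its own wrapper like `AuditedPart`, which is the verbosity the author is pushing back on.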
@@ -405,15 +407,26 @@ where
// atomically emit all parts here (e.g. no awaits).
let bytes_emitted = {
let mut bytes_emitted = 0;
for part_desc in std::mem::take(&mut batch_parts) {
for mut part_desc in std::mem::take(&mut batch_parts) {
// TODO(mfp): Push the filter down into the Subscribe?
if cfg.dynamic.stats_filter_enabled() {
Should we set the audit bit even if filtering's not enabled? (Might be overkill to assert on it, but a warning would be cool.)
I'd rather not run the logic if the feature flag is off. that would give us an out if e.g. the filtering took more cpu than we expect or something
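A rough sketch of the gating being discussed, assuming the decision can be reduced to three inputs. `PartAction`, `plan_part`, and the `sample_for_audit` closure are hypothetical names for illustration; only `stats_filter_enabled` and `should_fetch` echo names that appear in this PR, and their real signatures are not shown here.

```rust
/// What the source does with a part. Hypothetical enum for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PartAction {
    /// Fetch and process the part like normal.
    Fetch,
    /// The filter rejected the part, but fetch it anyway with the audit bit set.
    FetchForAudit,
    /// The filter rejected the part and it was not sampled for audit.
    Skip,
}

/// Decides what to do with a single part.
///
/// `should_fetch` and `sample_for_audit` are closures so that neither runs
/// when the feature flag is off.
pub fn plan_part(
    stats_filter_enabled: bool,
    should_fetch: impl FnOnce() -> bool,
    sample_for_audit: impl FnOnce() -> bool,
) -> PartAction {
    if !stats_filter_enabled {
        // Flag off: no filter evaluation and no audit logic at all.
        return PartAction::Fetch;
    }
    if should_fetch() {
        return PartAction::Fetch;
    }
    // The filter says this part is unnecessary; sometimes double-check it.
    if sample_for_audit() {
        PartAction::FetchForAudit
    } else {
        PartAction::Skip
    }
}
```

With this shape, turning the flag off removes all of the filtering and audit cost, which is the "out" described above.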
if is_filter_pushdown_audit {
// Ideally we'd be able to include the part stats here, but that
// would require us to exchange them around. It's unclear if that's
// worth it for work that's already known to be unnecessary.
Threading around / logging the PartialBatchKey might be enough to be useful, but I agree this is fine the way it is.
that's a great idea! no guarantee that the part sticks around long enough for us to look at it, but definitely better than nothing
Force-pushed: 0e5e40a to 8cedf09
This is a way to verify the ongoing end-to-end correctness of filter pushdown via "audit". When the filter rejects a part as completely unnecessary, we sometimes mark it with an audit bit. This means we fetch the part like normal and if the MFP keeps anything from it, then something has gone horribly wrong.
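The downstream half of the audit can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `FetchedPart` and `apply_mfp` are hypothetical, rows are plain integers, the MFP is reduced to a predicate, and a failed audit is reported as an `Err` naming the part (the PR may well handle it differently).

```rust
/// Hypothetical fetched part: a key for logging, the audit bit, and its rows.
pub struct FetchedPart {
    pub key: String,
    pub filter_pushdown_audit: bool,
    pub rows: Vec<i64>,
}

/// Applies `mfp` (modeled here as a simple predicate) to every row.
///
/// Returns the rows that survive. If the part was fetched only for audit,
/// the filter claimed nothing in it was needed, so any surviving row means
/// the filter and the MFP disagree and an error naming the part is returned.
pub fn apply_mfp(
    part: &FetchedPart,
    mfp: impl Fn(&i64) -> bool,
) -> Result<Vec<i64>, String> {
    let kept: Vec<i64> = part.rows.iter().copied().filter(|row| mfp(row)).collect();
    if part.filter_pushdown_audit && !kept.is_empty() {
        return Err(format!(
            "filter pushdown audit failed for part {}: {} row(s) survived the MFP",
            part.key,
            kept.len()
        ));
    }
    Ok(kept)
}
```

An audited part whose rows are all dropped passes silently and contributes no output, so a passing audit costs only the extra fetch and decode.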
Force-pushed: 8cedf09 to cfcf4dd
TFTR!
Touches #12684
Motivation
Tips for reviewer
I manually tested this with bin/environmentd and a bug in the impl of should_fetch
Checklist
$T ⇔ Proto$T mapping (possibly in a backwards-incompatible way) and therefore is tagged with a T-proto label.