New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deletes: refactored readers process deletes. #3374
Conversation
This change allows the refactored readers to process deletes conditions. For regular fragments, the delete conditions are processed only if the delete timestamp is greater than the fragment start timestamp. For fragments consolidated with timestamps, the reader needs to generate a new condition including the timestamp of the delete and apply this condition to the fragment. --- TYPE: IMPROVEMENT DESC: Deletes: refactored readers process deletes.
This pull request has been linked to Shortcut Story #19343: Deletes: refactored sparse readers applies deletes.. |
66c3191
to
bff891f
Compare
bff891f
to
201fd15
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor questions/comments inline, but LGTM.
@@ -182,6 +189,12 @@ class ReaderBase : public StrategyBase { | |||
*/ | |||
bool use_timestamps_; | |||
|
|||
/** | |||
* Boolean, per fragment, to specify that we need to load timestamps for | |||
* deletes. This matches the fragments in 'fragment_metadata_' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will also need to store this presence information for updates, right? So may be worth putting both in one struct, either now or later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same boolean can/will be used for both.
return make_timestamped_conditions; | ||
} | ||
|
||
Status ReaderBase::generate_timestamped_conditions() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(no change needed now, but:) For testability, we can consider implementing new functions like this as a transforming version (input -> output) , along with a mutating version (this one) which calls the transform. Since the functions inside return Status
, can use throw_if_not_ok
.
@@ -161,6 +161,13 @@ class ReaderBase : public StrategyBase { | |||
/** The query condition. */ | |||
QueryCondition& condition_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(general comment) I know this is pre-existing code, but seems like it would be clearer if this was held in a std::optional
instead of using condition_.empty()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good idea but major change. I'll tackle in a future refactor. https://app.shortcut.com/tiledb-inc/story/19640/make-query-condition-optional
This change allows the refactored readers to process deletes conditions.
For regular fragments, the delete conditions are processed only if the
delete timestamp is greater than the fragment start timestamp. For
fragments consolidated with timestamps, the reader needs to generate a
new condition including the timestamp of the delete and apply this
condition to the fragment.
TYPE: IMPROVEMENT
DESC: Deletes: refactored readers process deletes.