-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KA/LA sstable reader doesn't emit partition_end and partition_start #4786
Comments
Fixed by 9b8ac5e |
…ss to emit partition_end Currently, if there is a fragment in _ready and _out_of_range was set after row end was consumer, push_ready_fragments() would return without emitting partition_end. This is problematic once we make consume_row_start() emit partiton_start directly, because we will want to assume that all fragments for the previous partition are emitted by then. If they're not, then we'd emit partition_start before partition_end for the previous partition. The fix is to make sure that push_ready_fragments() emits everything. Fixes #4786 (cherry picked from commit 9b8ac5e) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
…ss to emit partition_end Currently, if there is a fragment in _ready and _out_of_range was set after row end was consumer, push_ready_fragments() would return without emitting partition_end. This is problematic once we make consume_row_start() emit partiton_start directly, because we will want to assume that all fragments for the previous partition are emitted by then. If they're not, then we'd emit partition_start before partition_end for the previous partition. The fix is to make sure that push_ready_fragments() emits everything. Fixes #4786 (cherry picked from commit 9b8ac5e) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
…ss to emit partition_end (#4790) Currently, if there is a fragment in _ready and _out_of_range was set after row end was consumer, push_ready_fragments() would return without emitting partition_end. This is problematic once we make consume_row_start() emit partiton_start directly, because we will want to assume that all fragments for the previous partition are emitted by then. If they're not, then we'd emit partition_start before partition_end for the previous partition. The fix is to make sure that push_ready_fragments() emits everything. Fixes #4786 (cherry picked from commit 9b8ac5e) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
@haaawk Can you explain why didn't we run into this bug all the time? |
@tgrabiec It requires partition ending with range tombstone and a row plus it has to happen at the end of the flat mutation reader buffer. The chances are relatively low. Moreover, this bug can go unnoticed if the fragments of the next partition are all after the last fragment of the previous partition. Such situation creates a valid sstable but data inside is wrong as some rows have migrated from one partition to the other. |
@haaawk if a user does not have range tombstones - does that mean he will not hit the bug ? |
@slivne That's my understanding. If there's no range tombstone then consuming row end will emit row that's stored in _in_progress and will leave _ready empty so push_ready_fragments won't pick this branch and instead emit partition_end by picking this branch. |
@haaawk Following Shlomi's question: can range tombstones be generated only by range DELETE operations or there are other possibilities that could lead to appearance of range tombstones, e.g. can a few tombstones of consequent tokens with the same time stamp get merged into a single range tombstone? Or probably a TRUNCATE operation? |
@vladzcloudius I don't think consecutive tombstones are ever combined into range tombstone. Same with truncate. In general, I'm not aware of any other way to create range tombstone but DELETE operation. |
@haaawk Thanks, sir! |
mp_row_consumer_k_l::push_ready_fragments has a bug that when _ready is set it will only call mp_row_consumer_k_l::push_ready_fragments_with_ready_set and no mp_row_consumer_k_l::push_ready_fragments_out_of_range.
That leads to sstable_mutation_reader::fill_buffer reading the content of the next partition without closing the previous one.
The text was updated successfully, but these errors were encountered: