Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KA/LA sstable reader doesn't emit partition_end and partition_start #4786

Closed
haaawk opened this issue Aug 1, 2019 · 8 comments
Closed

KA/LA sstable reader doesn't emit partition_end and partition_start #4786

haaawk opened this issue Aug 1, 2019 · 8 comments

Comments

@haaawk
Copy link
Contributor

haaawk commented Aug 1, 2019

mp_row_consumer_k_l::push_ready_fragments has a bug that when _ready is set it will only call mp_row_consumer_k_l::push_ready_fragments_with_ready_set and no mp_row_consumer_k_l::push_ready_fragments_out_of_range.

That leads to sstable_mutation_reader::fill_buffer reading the content of the next partition without closing the previous one.

@haaawk
Copy link
Contributor Author

haaawk commented Aug 1, 2019

Fixed by 9b8ac5e

@haaawk haaawk closed this as completed Aug 1, 2019
avikivity pushed a commit that referenced this issue Aug 1, 2019
…ss to emit partition_end

Currently, if there is a fragment in _ready and _out_of_range was set
after row end was consumer, push_ready_fragments() would return
without emitting partition_end.

This is problematic once we make consume_row_start() emit
partiton_start directly, because we will want to assume that all
fragments for the previous partition are emitted by then. If they're
not, then we'd emit partition_start before partition_end for the
previous partition. The fix is to make sure that
push_ready_fragments() emits everything.

Fixes #4786

(cherry picked from commit 9b8ac5e)
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
avikivity pushed a commit that referenced this issue Aug 1, 2019
…ss to emit partition_end

Currently, if there is a fragment in _ready and _out_of_range was set
after row end was consumer, push_ready_fragments() would return
without emitting partition_end.

This is problematic once we make consume_row_start() emit
partiton_start directly, because we will want to assume that all
fragments for the previous partition are emitted by then. If they're
not, then we'd emit partition_start before partition_end for the
previous partition. The fix is to make sure that
push_ready_fragments() emits everything.

Fixes #4786

(cherry picked from commit 9b8ac5e)
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
avikivity pushed a commit that referenced this issue Aug 5, 2019
…ss to emit partition_end (#4790)

Currently, if there is a fragment in _ready and _out_of_range was set
after row end was consumer, push_ready_fragments() would return
without emitting partition_end.

This is problematic once we make consume_row_start() emit
partiton_start directly, because we will want to assume that all
fragments for the previous partition are emitted by then. If they're
not, then we'd emit partition_start before partition_end for the
previous partition. The fix is to make sure that
push_ready_fragments() emits everything.

Fixes #4786

(cherry picked from commit 9b8ac5e)
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
@tgrabiec
Copy link
Contributor

tgrabiec commented Aug 5, 2019

@haaawk Can you explain why didn't we run into this bug all the time?

@haaawk
Copy link
Contributor Author

haaawk commented Aug 5, 2019

@tgrabiec It requires partition ending with range tombstone and a row plus it has to happen at the end of the flat mutation reader buffer. The chances are relatively low. Moreover, this bug can go unnoticed if the fragments of the next partition are all after the last fragment of the previous partition. Such situation creates a valid sstable but data inside is wrong as some rows have migrated from one partition to the other.

@slivne
Copy link
Contributor

slivne commented Aug 6, 2019

@haaawk if a user does not have range tombstones - does that mean he will not hit the bug ?

@haaawk
Copy link
Contributor Author

haaawk commented Aug 6, 2019

@slivne That's my understanding. If there's no range tombstone then consuming row end will emit row that's stored in _in_progress and will leave _ready empty so push_ready_fragments won't pick this branch and instead emit partition_end by picking this branch.

@vladzcloudius
Copy link
Contributor

vladzcloudius commented Sep 17, 2019

@haaawk Following Shlomi's question: can range tombstones be generated only by range DELETE operations or there are other possibilities that could lead to appearance of range tombstones, e.g. can a few tombstones of consequent tokens with the same time stamp get merged into a single range tombstone?

Or probably a TRUNCATE operation?

@haaawk
Copy link
Contributor Author

haaawk commented Sep 17, 2019

@vladzcloudius I don't think consecutive tombstones are ever combined into range tombstone. Same with truncate. In general, I'm not aware of any other way to create range tombstone but DELETE operation.

@vladzcloudius
Copy link
Contributor

@haaawk Thanks, sir!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants