New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Commitlog replay can cause abort due to over-extended skip #15269
Labels
area/commitlog
Issues related to the commit log.
Field-Tier2
Important issues as per FieldEngineering request
Milestone
Comments
elcallio
pushed a commit
to elcallio/scylla
that referenced
this issue
Sep 4, 2023
Fixes scylladb#15269 If segment being replayed is corrupted/truncated we can attempt skipping completely bogues byte amounts, which can cause assert (i.e. crash) in file_data_source_impl. This is not a crash-level error, so ensure we range check the distance in the reader.
elcallio
pushed a commit
to elcallio/scylla
that referenced
this issue
Sep 4, 2023
Fixes scylladb#15269 If segment being replayed is corrupted/truncated we can attempt skipping completely bogues byte amounts, which can cause assert (i.e. crash) in file_data_source_impl. This is not a crash-level error, so ensure we range check the distance in the reader.
mykaul
added
Backport candidate
backport/5.2
Issues that should be backported to 5.2 branch once they'll be fixed
Requires-Backport-to-5.1
labels
Sep 4, 2023
elcallio
pushed a commit
to elcallio/scylla
that referenced
this issue
Sep 4, 2023
Fixes scylladb#15269 If segment being replayed is corrupted/truncated we can attempt skipping completely bogues byte amounts, which can cause assert (i.e. crash) in file_data_source_impl. This is not a crash-level error, so ensure we range check the distance in the reader. v2: Add to corrupt_size if trying to skip more than available. The amount added is "wrong", but at least will ensure we log the fact that things are broken
elcallio
pushed a commit
to elcallio/scylla
that referenced
this issue
Sep 27, 2023
Fixes scylladb#15269 If segment being replayed is corrupted/truncated we can attempt skipping completely bogues byte amounts, which can cause assert (i.e. crash) in file_data_source_impl. This is not a crash-level error, so ensure we range check the distance in the reader. v2: Add to corrupt_size if trying to skip more than available. The amount added is "wrong", but at least will ensure we log the fact that things are broken
elcallio
pushed a commit
to elcallio/scylla
that referenced
this issue
Oct 31, 2023
Fixes scylladb#15269 If segment being replayed is corrupted/truncated we can attempt skipping completely bogues byte amounts, which can cause assert (i.e. crash) in file_data_source_impl. This is not a crash-level error, so ensure we range check the distance in the reader. v2: Add to corrupt_size if trying to skip more than available. The amount added is "wrong", but at least will ensure we log the fact that things are broken
mykaul
added
the
backport/5.4
Issues that should be backported to 5.4 branch once they'll be fixed
label
Nov 20, 2023
elcallio
pushed a commit
to elcallio/scylla
that referenced
this issue
Dec 4, 2023
Refs scylladb#15269 Unit test to check that trying to skip past EOF in a borked segment will not crash the process.
elcallio
pushed a commit
to elcallio/scylla
that referenced
this issue
Dec 4, 2023
Refs scylladb#15269 Unit test to check that trying to skip past EOF in a borked segment will not crash the process. file_data_input_impl asserts iff caller tries this.
avikivity
pushed a commit
that referenced
this issue
Dec 4, 2023
Refs #15269 Unit test to check that trying to skip past EOF in a borked segment will not crash the process. file_data_input_impl asserts iff caller tries this.
@scylladb/scylla-maint - please backport to 5.2 and 5.4. |
denesb
pushed a commit
that referenced
this issue
Jan 5, 2024
Fixes #15269 If segment being replayed is corrupted/truncated we can attempt skipping completely bogues byte amounts, which can cause assert (i.e. crash) in file_data_source_impl. This is not a crash-level error, so ensure we range check the distance in the reader. v2: Add to corrupt_size if trying to skip more than available. The amount added is "wrong", but at least will ensure we log the fact that things are broken Closes #15270 (cherry picked from commit 6ffb482)
denesb
pushed a commit
that referenced
this issue
Jan 5, 2024
Fixes #15269 If segment being replayed is corrupted/truncated we can attempt skipping completely bogues byte amounts, which can cause assert (i.e. crash) in file_data_source_impl. This is not a crash-level error, so ensure we range check the distance in the reader. v2: Add to corrupt_size if trying to skip more than available. The amount added is "wrong", but at least will ensure we log the fact that things are broken Closes #15270 (cherry picked from commit 6ffb482)
Backported to 5.4 and 5.2. |
denesb
removed
Backport candidate
backport/5.2
Issues that should be backported to 5.2 branch once they'll be fixed
backport/5.4
Issues that should be backported to 5.4 branch once they'll be fixed
labels
Jan 5, 2024
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
area/commitlog
Issues related to the commit log.
Field-Tier2
Important issues as per FieldEngineering request
If a segment being replayed is truncated/corrupted we can attempt to skip more than available in the file, which causes assertion in
file_data_source_impl
. This being an assert is dubious, but that does not change the fact that we can just as well do this check properly in the replay reader.The text was updated successfully, but these errors were encountered: