Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Commitlog replay can cause abort due to over-extended skip #15269

Closed
elcallio opened this issue Sep 4, 2023 · 2 comments
Closed

Commitlog replay can cause abort due to over-extended skip #15269

elcallio opened this issue Sep 4, 2023 · 2 comments
Assignees
Labels
area/commitlog Issues related to the commit log. Field-Tier2 Important issues as per FieldEngineering request
Milestone

Comments

@elcallio
Copy link
Contributor

elcallio commented Sep 4, 2023

If a segment being replayed is truncated/corrupted we can attempt to skip more than available in the file, which causes assertion in file_data_source_impl. This being an assert is dubious, but that does not change the fact that we can just as well do this check properly in the replay reader.

elcallio pushed a commit to elcallio/scylla that referenced this issue Sep 4, 2023
Fixes scylladb#15269

If segment being replayed is corrupted/truncated we can attempt skipping
completely bogues byte amounts, which can cause assert (i.e. crash) in
file_data_source_impl. This is not a crash-level error, so ensure we
range check the distance in the reader.
elcallio pushed a commit to elcallio/scylla that referenced this issue Sep 4, 2023
Fixes scylladb#15269

If segment being replayed is corrupted/truncated we can attempt skipping
completely bogues byte amounts, which can cause assert (i.e. crash) in
file_data_source_impl. This is not a crash-level error, so ensure we
range check the distance in the reader.
@mykaul mykaul added Backport candidate backport/5.2 Issues that should be backported to 5.2 branch once they'll be fixed Requires-Backport-to-5.1 labels Sep 4, 2023
@mykaul mykaul added this to the 5.4 milestone Sep 4, 2023
@mykaul mykaul added the area/commitlog Issues related to the commit log. label Sep 4, 2023
elcallio pushed a commit to elcallio/scylla that referenced this issue Sep 4, 2023
Fixes scylladb#15269

If segment being replayed is corrupted/truncated we can attempt skipping
completely bogues byte amounts, which can cause assert (i.e. crash) in
file_data_source_impl. This is not a crash-level error, so ensure we
range check the distance in the reader.

v2: Add to corrupt_size if trying to skip more than available. The amount added is "wrong", but at least will
    ensure we log the fact that things are broken
elcallio pushed a commit to elcallio/scylla that referenced this issue Sep 27, 2023
Fixes scylladb#15269

If segment being replayed is corrupted/truncated we can attempt skipping
completely bogues byte amounts, which can cause assert (i.e. crash) in
file_data_source_impl. This is not a crash-level error, so ensure we
range check the distance in the reader.

v2: Add to corrupt_size if trying to skip more than available. The amount added is "wrong", but at least will
    ensure we log the fact that things are broken
elcallio pushed a commit to elcallio/scylla that referenced this issue Oct 31, 2023
Fixes scylladb#15269

If segment being replayed is corrupted/truncated we can attempt skipping
completely bogues byte amounts, which can cause assert (i.e. crash) in
file_data_source_impl. This is not a crash-level error, so ensure we
range check the distance in the reader.

v2: Add to corrupt_size if trying to skip more than available. The amount added is "wrong", but at least will
    ensure we log the fact that things are broken
@mykaul mykaul added the backport/5.4 Issues that should be backported to 5.4 branch once they'll be fixed label Nov 20, 2023
@mykaul mykaul added the Field-Tier2 Important issues as per FieldEngineering request label Dec 3, 2023
elcallio pushed a commit to elcallio/scylla that referenced this issue Dec 4, 2023
Refs scylladb#15269

Unit test to check that trying to skip past EOF in a borked segment
will not crash the process.
elcallio pushed a commit to elcallio/scylla that referenced this issue Dec 4, 2023
Refs scylladb#15269

Unit test to check that trying to skip past EOF in a borked segment
will not crash the process. file_data_input_impl asserts iff caller
tries this.
avikivity pushed a commit that referenced this issue Dec 4, 2023
Refs #15269

Unit test to check that trying to skip past EOF in a borked segment
will not crash the process. file_data_input_impl asserts iff caller
tries this.
@mykaul
Copy link
Contributor

mykaul commented Jan 4, 2024

@scylladb/scylla-maint - please backport to 5.2 and 5.4.

denesb pushed a commit that referenced this issue Jan 5, 2024
Fixes #15269

If segment being replayed is corrupted/truncated we can attempt skipping
completely bogues byte amounts, which can cause assert (i.e. crash) in
file_data_source_impl. This is not a crash-level error, so ensure we
range check the distance in the reader.

v2: Add to corrupt_size if trying to skip more than available. The amount added is "wrong", but at least will
    ensure we log the fact that things are broken

Closes #15270

(cherry picked from commit 6ffb482)
denesb pushed a commit that referenced this issue Jan 5, 2024
Fixes #15269

If segment being replayed is corrupted/truncated we can attempt skipping
completely bogues byte amounts, which can cause assert (i.e. crash) in
file_data_source_impl. This is not a crash-level error, so ensure we
range check the distance in the reader.

v2: Add to corrupt_size if trying to skip more than available. The amount added is "wrong", but at least will
    ensure we log the fact that things are broken

Closes #15270

(cherry picked from commit 6ffb482)
@denesb
Copy link
Contributor

denesb commented Jan 5, 2024

Backported to 5.4 and 5.2.

@denesb denesb removed Backport candidate backport/5.2 Issues that should be backported to 5.2 branch once they'll be fixed backport/5.4 Issues that should be backported to 5.4 branch once they'll be fixed labels Jan 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/commitlog Issues related to the commit log. Field-Tier2 Important issues as per FieldEngineering request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants