Add fuzzing of crash recovery and fix a bunch of bugs #573

Merged
merged 18 commits into from
May 9, 2023

Owner

@cberner cberner commented May 4, 2023

Fixes #547

Fixes #556

Fixes #564

@cberner cberner force-pushed the crash_fuzz branch 8 times, most recently from 672a07c to bf603da Compare May 7, 2023 16:06
@cberner cberner changed the title Add fuzzing of crash recovery Add fuzzing of crash recovery and fix a bunch of bugs May 8, 2023
@cberner cberner force-pushed the crash_fuzz branch 2 times, most recently from 05c936b to be7c0f7 Compare May 8, 2023 16:19
cberner added 13 commits May 8, 2023 14:31
If an I/O error occurred during write(), the dirty page tracking would
leak, causing a panic when the transaction was committed or aborted
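A minimal sketch of the general pattern behind this kind of fix, assuming a tracker that records a page as dirty before writing it. `ToyTracker` and `write_page` are hypothetical names for illustration and are not redb's internals; the actual change in this PR may differ.

```rust
use std::collections::HashSet;
use std::io;

/// Hypothetical stand-in for dirty page tracking (not redb's real type).
#[derive(Default)]
struct ToyTracker {
    dirty: HashSet<u64>,
}

impl ToyTracker {
    /// Marks `page` dirty, then performs the write. If the write fails, the
    /// tracking entry is removed again so that a later commit or abort does
    /// not find an entry for a page that was never written.
    fn write_page<W>(&mut self, page: u64, do_write: W) -> io::Result<()>
    where
        W: FnOnce() -> io::Result<()>,
    {
        self.dirty.insert(page);
        if let Err(e) = do_write() {
            // Undo the tracking on failure instead of leaking it.
            self.dirty.remove(&page);
            return Err(e);
        }
        Ok(())
    }
}
```

With this shape, a failed write leaves `dirty` exactly as it was before the call, so cleanup code can assume every tracked page was actually written.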
This fixes a cache poisoning issue in which a page might be in the
userspace cache after walking the btrees during crash recovery, but then
be freed by the rollback process of recovery. This could then poison the
cache, leading to a crash in the future
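To illustrate the failure mode described above, here is a toy read cache, assuming pages are identified by number and that freeing a page should also drop its cached copy. `ToyCache` and its methods are hypothetical and only model the idea; they are not redb's cache.

```rust
use std::collections::HashMap;

/// Hypothetical userspace page cache (not redb's real type).
#[derive(Default)]
struct ToyCache {
    cached: HashMap<u64, Vec<u8>>,
}

impl ToyCache {
    /// Returns the cached copy if present, otherwise reads from `disk`
    /// and remembers the result.
    fn read(&mut self, page: u64, disk: &HashMap<u64, Vec<u8>>) -> Option<Vec<u8>> {
        if let Some(bytes) = self.cached.get(&page) {
            return Some(bytes.clone());
        }
        let bytes = disk.get(&page)?.clone();
        self.cached.insert(page, bytes.clone());
        Some(bytes)
    }

    /// Freeing a page must also evict it. Otherwise a later reuse of the
    /// same page number would be served the stale cached contents.
    fn free(&mut self, page: u64) {
        self.cached.remove(&page);
    }
}
```

If `free` skipped the eviction, a page cached during the btree walk and then freed by rollback would keep returning its old bytes after being reallocated, which is the poisoning scenario in the commit message.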
Fixes a page leak if an I/O error occurred during commit(), since drop()
would still flush the recovery bit to "clean"
Fixes a corruption issue that could occur if a Durability::None commit
was made, followed by a durable commit, and the durable commit crashed
during the call to commit().

Non-durable commits pushed their pending free pages into the freed
table. The next durable commit then processed these *before* finalizing
its commit. If that durable commit crashed after processing the frees,
but before finalizing, then all the non-durable commits are rolled back,
but the freed pages have already been processed. This left the database
in a corrupted state since those pages could get overwritten as part of
finalizing the durable commit
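The sequence above can be replayed in a toy model. This is a sketch under simplifying assumptions (pages are plain numbers, "disk" is a map, and the crash always happens just before finalization completes); `survives_crash` and all names in it are hypothetical. The safe ordering shown, which defers processing the frees, is one way to avoid the problem and is not necessarily how this PR fixes it.

```rust
use std::collections::HashMap;

/// Toy model, not redb's code: returns true if the page referenced by the
/// last durable commit is still intact after a simulated crash and rollback.
fn survives_crash(process_frees_before_finalize: bool) -> bool {
    // On-disk contents. Page 1 belongs to the last durable commit.
    let mut disk: HashMap<u32, &str> = HashMap::from([(1, "old")]);
    let durable_refs: Vec<u32> = vec![1];

    // A non-durable commit replaces page 1 with page 2 and queues page 1
    // in the (hypothetical) pending-free list.
    disk.insert(2, "new");
    let pending_free: Vec<u32> = vec![1];
    let mut free_list: Vec<u32> = Vec::new();

    // The durable commit starts. In the problematic order it releases the
    // queued pages before it has finalized.
    if process_frees_before_finalize {
        free_list.extend(&pending_free);
    }

    // While finalizing, the commit needs a page for its own data and reuses
    // a freed page if one is available (page 3 is otherwise unused).
    let target = free_list.pop().unwrap_or(3);
    disk.insert(target, "finalize data");

    // Crash here, before finalization completes. Recovery rolls back to the
    // last durable commit, which still references `durable_refs`.
    durable_refs.iter().all(|p| disk.get(p) == Some(&"old"))
}
```

Calling `survives_crash(true)` models the corrupting order: page 1 is overwritten even though the rolled-back state still points at it. `survives_crash(false)` leaves page 1 untouched.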
An I/O error during writeback could cause the page being flushed to be
lost
The root would end up set, but not the path. This led to a panic in the
drop() method
Every issue found has reproduced at < 10k, and the corpus only has items
up to ~50k
If an I/O error occurred during commit(), the transaction would then try
to abort(), which might panic since the data structures were left in an
inconsistent state
@cberner cberner merged commit c6ca509 into master May 9, 2023
@cberner cberner deleted the crash_fuzz branch May 9, 2023 16:02
Development

Successfully merging this pull request may close these issues.

Panic in RawLeafBuilder::Append
Android Probability Crash
panicked at 'Tried to repair an empty database'
1 participant