test(persistence): snapshot crash-scenario fail-points #47

@petrpan26

Depends on the fail-parallel bootstrap (sibling chore(test) issue under area: persistence). Sibling to the WAL crash-coverage issue.

Cover the snapshot write / read crash surface.

Fail-points to add:

  • snapshot::write::post_header_pre_body — header on disk, body interrupted
  • snapshot::write::post_body_pre_rename — partial .tmp exists, atomic rename never happened
  • snapshot::read::pre_crc_check — body read but CRC not yet verified
  • snapshot::read::truncated_body — short read mid-stream
  • recovery::pick_snapshot_vs_wal_tail — both present, simulate ambiguous state
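
As a rough illustration of how these names could be wired into tests, here is a minimal sketch of a fail-point registry. `FailPoints`, `arm`, `disarm`, `check`, and `SNAPSHOT_FAIL_POINTS` are hypothetical stand-ins, not the project's harness; the real tests would presumably use whatever the fail-parallel bootstrap provides. An armed fail-point returns an error here rather than killing the process, which is a simplification.

```rust
use std::collections::HashSet;
use std::io;

/// Fail-point names copied from the list above.
pub const SNAPSHOT_FAIL_POINTS: [&str; 5] = [
    "snapshot::write::post_header_pre_body",
    "snapshot::write::post_body_pre_rename",
    "snapshot::read::pre_crc_check",
    "snapshot::read::truncated_body",
    "recovery::pick_snapshot_vs_wal_tail",
];

/// Hypothetical registry of armed fail-points (assumption: the real
/// harness supplies its own mechanism).
#[derive(Default)]
pub struct FailPoints {
    armed: HashSet<String>,
}

impl FailPoints {
    /// Arm a fail-point so the next `check` at that site fails.
    pub fn arm(&mut self, name: &str) {
        self.armed.insert(name.to_string());
    }

    /// Disarm a fail-point; a no-op if it was not armed.
    pub fn disarm(&mut self, name: &str) {
        self.armed.remove(name);
    }

    /// Returns Err if `name` is armed (simulating a crash at that site),
    /// Ok otherwise.
    pub fn check(&self, name: &str) -> io::Result<()> {
        if self.armed.contains(name) {
            Err(io::Error::new(
                io::ErrorKind::Other,
                format!("fail-point hit: {name}"),
            ))
        } else {
            Ok(())
        }
    }
}
```

In this sketch, the snapshot writer would call `check("snapshot::write::post_body_pre_rename")?` between flushing the body and the rename, so a test can arm exactly one site per scenario.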

For each, an integration test:

  1. Build snapshot state via real pushes.
  2. Trigger the fail-point.
  3. Restart and assert that recovery falls back correctly when the latest snapshot is unusable: either an older snapshot plus WAL replay, or a rebuild from the WAL alone.
  4. Verify no .tmp files leak across restart.
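
The four steps above can be sketched for the `post_body_pre_rename` case as follows. Everything here is an assumption made for illustration: the `snap-<seq>.tmp` / `snap-<seq>.bin` naming, the 4-byte checksum header, the toy checksum standing in for the real CRC, and the choice to delete leftover `.tmp` files during recovery. The simulated crash is an early error return, not a real process kill.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Toy checksum standing in for the real CRC (assumption).
fn checksum(body: &[u8]) -> u32 {
    body.iter()
        .fold(0u32, |acc, b| acc.wrapping_mul(31).wrapping_add(*b as u32))
}

/// Writes `snap-<seq>.tmp`, then renames it to `snap-<seq>.bin`.
/// `crash_before_rename` plays the role of the
/// snapshot::write::post_body_pre_rename fail-point.
pub fn write_snapshot(
    dir: &Path,
    seq: u64,
    body: &[u8],
    crash_before_rename: bool,
) -> io::Result<()> {
    let tmp = dir.join(format!("snap-{seq:020}.tmp"));
    let mut f = fs::File::create(&tmp)?;
    f.write_all(&checksum(body).to_le_bytes())?; // 4-byte header
    f.write_all(body)?;
    f.sync_all()?;
    if crash_before_rename {
        // The .tmp file stays behind, as it would after a real crash.
        return Err(io::Error::new(io::ErrorKind::Other, "simulated crash"));
    }
    fs::rename(&tmp, dir.join(format!("snap-{seq:020}.bin")))
}

/// Restart path: delete leftover .tmp files, then return the body of the
/// newest snapshot whose checksum verifies. `None` means no usable
/// snapshot, i.e. the caller would rebuild from the WAL alone.
pub fn recover(dir: &Path) -> io::Result<Option<Vec<u8>>> {
    let mut snaps: Vec<PathBuf> = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        let ext = path.extension().and_then(|e| e.to_str()).map(str::to_owned);
        match ext.as_deref() {
            Some("tmp") => fs::remove_file(&path)?,
            Some("bin") => snaps.push(path),
            _ => {}
        }
    }
    snaps.sort(); // zero-padded seq: lexical order matches numeric order
    for path in snaps.iter().rev() {
        let bytes = fs::read(path)?;
        if bytes.len() < 4 {
            continue; // too short to hold the header: treat as unusable
        }
        let (head, body) = bytes.split_at(4);
        let stored = u32::from_le_bytes([head[0], head[1], head[2], head[3]]);
        if stored == checksum(body) {
            return Ok(Some(body.to_vec()));
        }
    }
    Ok(None)
}
```

A test in this shape would write snapshot 1 normally, write snapshot 2 with `crash_before_rename = true`, call `recover`, and then assert both that the returned body is snapshot 1's and that the directory contains no `.tmp` entries.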

Pair this with the SNAPSHOT-UPGRADE deferral whenever a real FORMAT_VERSION bump arrives; this issue is the crash-correctness scaffolding that work would build on. Estimated size: ~8 tests, ~400 LOC.

Labels: area: persistence (WAL / snapshot / recovery, beava-persistence)
