docs: document the WAL durability contract end-to-end #48

@petrpan26

Durability behavior under each shutdown path is enforced in code but not documented anywhere user-facing. Users sizing their failure tolerance can't tell what's guaranteed without reading crates/beava-server/src/server.rs:1051-1079 and crates/beava-persistence/src/writer.rs:212-219.

Add a section to CONTRIBUTING.md (or a new docs/durability.md linked from README) covering each case:

  • Ack contract. With acks=1, a successful push response means the event is in the active WAL buffer, not yet on disk. The event becomes durable at the next fsync (per-event sync, periodic tick, or shutdown drain).
  • SIGTERM / SIGINT. Orderly drain + final fsync before exit. Acked events survive; in-flight pushes whose ack hadn't been sent may also survive (apply thread drains before the WAL writer thread joins).
  • SIGKILL. No flush opportunity. Events that were both acked and fsync'd are durable; everything still in the active buffer is lost. Recovery handles orphan segment files (see tests/writer_orphan_segment.rs).
  • Panic. Currently a gap; see the sibling issue labeled area: server. Once the panic hook lands, a panic is equivalent to SIGTERM for durability.
  • std::process::abort() at snapshot_task.rs:145. Skips Drop, skips fsync. Document when this fires and why.
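
For whoever writes the section, the gap between "acked" and "durable" in the cases above can be illustrated with a minimal sketch. This is a toy model under stated assumptions, not the beava-persistence writer: `ToyWal`, `recover`, `demo`, and `demo_path` are hypothetical names, events are newline-delimited strings, and SIGKILL is approximated by dropping the writer without flushing its buffer.

```rust
use std::fs::{self, File, OpenOptions};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Toy WAL writer (hypothetical, for illustration only).
/// `append` only buffers in memory; `sync` writes the buffer and fsyncs.
struct ToyWal {
    file: File,
    buffer: Vec<u8>, // models the "active buffer": acked but not yet durable
}

impl ToyWal {
    fn open(path: &Path) -> io::Result<Self> {
        let file = OpenOptions::new().create(true).append(true).open(path)?;
        Ok(Self { file, buffer: Vec::new() })
    }

    /// Returning Ok models the acks=1 response: the event is buffered, not on disk.
    fn append(&mut self, event: &str) -> io::Result<()> {
        self.buffer.extend_from_slice(event.as_bytes());
        self.buffer.push(b'\n');
        Ok(())
    }

    /// Models the durability point (per-event sync, periodic tick, or shutdown drain).
    fn sync(&mut self) -> io::Result<()> {
        self.file.write_all(&self.buffer)?;
        self.file.sync_all()?; // fsync: data and metadata reach the device
        self.buffer.clear();
        Ok(())
    }
}

/// What recovery would see: only the bytes that reached the file.
fn recover(path: &Path) -> io::Result<Vec<String>> {
    Ok(fs::read_to_string(path)?.lines().map(String::from).collect())
}

/// Two events are synced, a third is only acked, then the writer is discarded.
fn demo(path: &Path) -> io::Result<Vec<String>> {
    let _ = fs::remove_file(path); // start from a clean file
    let mut wal = ToyWal::open(path)?;
    wal.append("e1")?;
    wal.append("e2")?;
    wal.sync()?;
    wal.append("e3")?; // acked, still in the active buffer
    drop(wal); // simulated SIGKILL: nothing flushes the buffer
    recover(path)
}

/// Per-process scratch file in the system temp directory (assumed writable).
fn demo_path() -> PathBuf {
    std::env::temp_dir().join(format!("toy_wal_{}.log", std::process::id()))
}

fn main() -> io::Result<()> {
    let path = demo_path();
    let survived = demo(&path)?;
    println!("survived: {:?}", survived);
    fs::remove_file(&path)?;
    Ok(())
}
```

Running this prints `survived: ["e1", "e2"]`: the third event was acknowledged but never fsync'd, so it is absent after the simulated kill. An orderly SIGTERM / SIGINT drain would correspond to calling `sync` once more before the writer is dropped.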

Done when:

  • The section exists with one subsection per case, each stating: what's guaranteed, what's lost, how recovery handles it.
  • README links to it from the wire-surface section.

Labels: area: docs (repo markdown + docs site under docs/)
