Write-ahead log and durable-tail recovery scan #3

Merged: tamnd merged 1 commit into main from m1b-wal on Jun 20, 2026
@tamnd tamnd commented Jun 20, 2026


Second slice of M1: the WAL that both engines commit through, plus the recovery scan that finds the durable tail.

What this adds

  • wal.WAL: an append-only log over a -wal sidecar. LogBatch + Commit is the durability path; Checkpointed records a fold boundary and rotates the salt.
  • Logical kv-batch frames carry the serialized WriteBatch verbatim, so what is durable and what is applied are byte-identical and redo reuses the runtime Apply path.
  • Chained, salted xxHash64 per frame; wal.Recover walks frames, finds the durable tail, and returns committed batches in LSN order.
  • engine.WriteBatch.Encode / engine.DecodeBatch: the batch owns its wire form.
  • Synchronous levels Off / Normal / Full / Extra mirroring PRAGMA synchronous; sync errors are fatal and non-retryable (fsyncgate).

Design notes

  • No page-LSN bookkeeping. Every mutation is keyed by a unique user_key || ^version || kind internal key, so replaying a committed batch re-inserts already-present keys as a no-op. Redo is idempotent and restartable for free. The package doc records this as a deliberate substitution for ARIES page LSNs.
  • A batch is committed iff a checksum-valid commit frame for it is reached; an uncommitted trailing batch is dropped. A torn or stale-salt frame ends the durable log.
  • The page-image frame type is reserved; torn-write protection will be wired in with the checkpoint fold. Group commit and the on-open replay driver are also reserved for later slices.
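
As a rough illustration of the idempotent-redo note above, the sketch below builds an internal key as user_key || ^version || kind and replays one batch twice into a map. The Kind values, the Mutation and Store types, and the big-endian version encoding are assumptions made for this example; the PR does not show the real layout.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Kind is a hypothetical mutation tag; the PR does not list the real values.
type Kind byte

const (
	KindPut    Kind = 1
	KindDelete Kind = 2
)

// internalKey builds user_key || ^version || kind. Inverting the version
// makes newer versions of the same user key sort first in bytewise order.
// Big-endian encoding is an assumption of this sketch.
func internalKey(userKey []byte, version uint64, kind Kind) string {
	buf := make([]byte, 0, len(userKey)+9)
	buf = append(buf, userKey...)
	buf = binary.BigEndian.AppendUint64(buf, ^version)
	buf = append(buf, byte(kind))
	return string(buf)
}

// Mutation is one entry of a batch (hypothetical shape).
type Mutation struct {
	UserKey []byte
	Version uint64
	Kind    Kind
	Value   []byte
}

// Store is a toy map-backed stand-in for the engine's state.
type Store map[string][]byte

// Apply inserts every mutation under its internal key. Replaying the same
// batch rewrites each entry with identical bytes, so the store is unchanged.
func (s Store) Apply(batch []Mutation) {
	for _, m := range batch {
		s[internalKey(m.UserKey, m.Version, m.Kind)] = m.Value
	}
}

func main() {
	batch := []Mutation{
		{UserKey: []byte("a"), Version: 1, Kind: KindPut, Value: []byte("x")},
		{UserKey: []byte("a"), Version: 2, Kind: KindDelete},
	}
	s := Store{}
	s.Apply(batch)
	first := len(s)
	s.Apply(batch) // simulated redo after a crash during recovery
	fmt.Println(first, len(s))
}
```

Because each key embeds its version, the second Apply cannot create a new entry; this is the property that lets redo skip page-LSN tracking.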

Tests

go test -race ./... is green. WAL tests cover commit/recover round-trip, uncommitted-tail drop, torn-tail detection, crash durability by sync level (Full survives; under Normal, unsynced commits are lost without corrupting the log), salt rotation across a checkpoint, and sync accounting.

Implementation doc: notes/Spec/2059/implementation/05-wal-and-recovery.md.

The WAL logs logical kv-batch frames: the exact serialized WriteBatch
the engine later applies, so what is durable and what is applied are
byte-identical and redo shares one code path with normal operation.

A chained, salted xxHash64 over each frame lets recovery find the exact
durable tail without trusting any external pointer. The first frame that
breaks the chain or carries a stale salt ends the durable log; a batch
counts as committed only when a valid commit frame for it is reached, so
an uncommitted trailing batch is dropped.
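
The scan described above might look roughly like the following. This is a sketch under stated assumptions, not the code in this PR: the Frame struct, appendFrame, and recoverLog are hypothetical names, frames live in memory rather than in a -wal file, and FNV-1a from the standard library stands in for xxHash64, which Go's standard library does not provide.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

type FrameKind byte

const (
	FrameBatch  FrameKind = 1
	FrameCommit FrameKind = 2
)

// Frame is a hypothetical in-memory frame; the real on-disk layout is not
// given in the PR.
type Frame struct {
	Kind    FrameKind
	LSN     uint64
	Salt    uint64
	Payload []byte
	Sum     uint64
}

// chainSum hashes the salt, the previous frame's checksum, and this frame's
// fields. FNV-1a is a stand-in for xxHash64.
func chainSum(salt, prev uint64, f Frame) uint64 {
	var hdr [25]byte
	binary.LittleEndian.PutUint64(hdr[0:], salt)
	binary.LittleEndian.PutUint64(hdr[8:], prev)
	binary.LittleEndian.PutUint64(hdr[16:], f.LSN)
	hdr[24] = byte(f.Kind)
	h := fnv.New64a()
	h.Write(hdr[:])
	h.Write(f.Payload)
	return h.Sum64()
}

// appendFrame seals f against the current tail of the log.
func appendFrame(log []Frame, salt uint64, f Frame) []Frame {
	var prev uint64
	if len(log) > 0 {
		prev = log[len(log)-1].Sum
	}
	f.Salt = salt
	f.Sum = chainSum(salt, prev, f)
	return append(log, f)
}

// recoverLog returns the payloads of committed batches in log order, which
// is LSN order as long as the writer assigns increasing LSNs.
func recoverLog(log []Frame, salt uint64) [][]byte {
	var committed [][]byte
	var pending []byte
	havePending := false
	var prev uint64
	for _, f := range log {
		if f.Salt != salt || f.Sum != chainSum(salt, prev, f) {
			break // torn or stale-salt frame: the durable log ends here
		}
		prev = f.Sum
		switch f.Kind {
		case FrameBatch:
			pending, havePending = f.Payload, true
		case FrameCommit:
			if havePending {
				committed = append(committed, pending)
				havePending = false
			}
		}
	}
	return committed // a pending batch with no commit frame is dropped
}

func main() {
	const salt = 0xfeed
	var log []Frame
	log = appendFrame(log, salt, Frame{Kind: FrameBatch, LSN: 1, Payload: []byte("b1")})
	log = appendFrame(log, salt, Frame{Kind: FrameCommit, LSN: 1})
	log = appendFrame(log, salt, Frame{Kind: FrameBatch, LSN: 2, Payload: []byte("b2")})
	// Simulated crash: the commit frame for LSN 2 never reached the log.
	for _, p := range recoverLog(log, salt) {
		fmt.Printf("%s\n", p)
	}
}
```

Folding the previous checksum into each frame means a corrupted frame also invalidates everything after it, so the scan needs no external tail pointer.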

Redo needs no page-LSN bookkeeping. Every mutation is keyed by a unique
internal key of user-key, inverted version, and kind, so replaying a
committed batch re-inserts keys already present and is a no-op. That
makes recovery idempotent and restartable for free.

Synchronous levels mirror PRAGMA synchronous: Off, Normal (defer the
per-commit sync to checkpoint), Full (fdatasync every commit), and Extra
(also sync the inode on growth). A sync error is fatal and not retried.
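
One way to picture the sync policy is the toy below. It is an assumption-laden sketch: the wal and fakeFile types are invented here, Extra is collapsed into Full (the inode sync is omitted), and the file is reached through a Sync() method, which on *os.File is an fsync rather than the fdatasync named above. The sticky fatal field models "fatal and not retried".

```go
package main

import (
	"errors"
	"fmt"
)

type SyncLevel int

const (
	SyncOff SyncLevel = iota
	SyncNormal
	SyncFull
	SyncExtra
)

// syncer abstracts the log file; *os.File satisfies it.
type syncer interface{ Sync() error }

// wal is a hypothetical stand-in that models only the sync policy.
type wal struct {
	level SyncLevel
	file  syncer
	dirty bool  // commits written since the last successful sync
	fatal error // set once by a failed sync; never cleared
}

func (w *wal) sync() error {
	if err := w.file.Sync(); err != nil {
		w.fatal = fmt.Errorf("wal poisoned by sync error: %w", err)
		return w.fatal
	}
	w.dirty = false
	return nil
}

// Commit syncs immediately at Full and Extra; Off and Normal return at once.
func (w *wal) Commit() error {
	if w.fatal != nil {
		return w.fatal // no retry after a failed sync
	}
	w.dirty = true
	if w.level >= SyncFull {
		return w.sync()
	}
	return nil
}

// Checkpoint performs the sync that Normal deferred.
func (w *wal) Checkpoint() error {
	if w.fatal != nil {
		return w.fatal
	}
	if w.level == SyncNormal && w.dirty {
		return w.sync()
	}
	return nil
}

// fakeFile counts syncs and can be told to fail.
type fakeFile struct {
	syncs int
	fail  bool
}

func (f *fakeFile) Sync() error {
	f.syncs++
	if f.fail {
		return errors.New("EIO")
	}
	return nil
}

func main() {
	for _, lvl := range []SyncLevel{SyncOff, SyncNormal, SyncFull} {
		f := &fakeFile{}
		w := &wal{level: lvl, file: f}
		_ = w.Commit()
		_ = w.Commit()
		_ = w.Checkpoint()
		fmt.Println("level", lvl, "syncs", f.syncs)
	}
}
```

Keeping the error sticky avoids the fsyncgate trap, where a retried fsync can report success even though the earlier dirty pages were already discarded.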

WriteBatch gains Encode/DecodeBatch so the batch owns its wire form and
the WAL stores it verbatim.
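
For a sense of what "the batch owns its wire form" can mean, here is a minimal round-trip sketch. The wire format (a uvarint op count, then a kind byte and length-prefixed key and value per op) and the Op type are assumptions for illustration; the PR does not document the actual encoding behind engine.WriteBatch.Encode and engine.DecodeBatch.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// Op and WriteBatch are hypothetical shapes for this sketch.
type Op struct {
	Kind       byte
	Key, Value []byte
}

type WriteBatch struct{ Ops []Op }

// Encode serializes the batch in the assumed format described above.
func (b *WriteBatch) Encode() []byte {
	out := binary.AppendUvarint(nil, uint64(len(b.Ops)))
	for _, op := range b.Ops {
		out = append(out, op.Kind)
		out = binary.AppendUvarint(out, uint64(len(op.Key)))
		out = append(out, op.Key...)
		out = binary.AppendUvarint(out, uint64(len(op.Value)))
		out = append(out, op.Value...)
	}
	return out
}

var errCorrupt = errors.New("batch: truncated or corrupt encoding")

// readBytes consumes one length-prefixed field and returns it with the rest.
func readBytes(buf []byte) (field, rest []byte, err error) {
	n, w := binary.Uvarint(buf)
	if w <= 0 || n > uint64(len(buf)-w) {
		return nil, nil, errCorrupt
	}
	return buf[w : w+int(n)], buf[w+int(n):], nil
}

// DecodeBatch parses Encode's output. Decoded keys and values alias buf.
func DecodeBatch(buf []byte) (*WriteBatch, error) {
	count, w := binary.Uvarint(buf)
	if w <= 0 {
		return nil, errCorrupt
	}
	buf = buf[w:]
	b := &WriteBatch{}
	for i := uint64(0); i < count; i++ {
		if len(buf) < 1 {
			return nil, errCorrupt
		}
		op := Op{Kind: buf[0]}
		var err error
		if op.Key, buf, err = readBytes(buf[1:]); err != nil {
			return nil, err
		}
		if op.Value, buf, err = readBytes(buf); err != nil {
			return nil, err
		}
		b.Ops = append(b.Ops, op)
	}
	if len(buf) != 0 {
		return nil, errCorrupt // trailing bytes are not part of a valid batch
	}
	return b, nil
}

func main() {
	b := &WriteBatch{Ops: []Op{{Kind: 1, Key: []byte("k"), Value: []byte("v")}}}
	wire := b.Encode()
	back, err := DecodeBatch(wire)
	if err != nil {
		panic(err)
	}
	// Re-encoding the decoded batch reproduces the logged bytes exactly.
	fmt.Println(bytes.Equal(wire, back.Encode()))
}
```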

Group commit, physical page-image torn-write frames, and the replay
driver that feeds engine.Apply on open are reserved for later slices.

tamnd merged commit 115636b into main on Jun 20, 2026 (1 check passed)
tamnd deleted the m1b-wal branch on June 20, 2026 07:24