
v0.7.0 — Hardening & API freeze

Pre-release


@jamesgober released this 08 Jun 11:46

lsm-db v0.7.0 — Hardening & API freeze

Attacked with hostile input, then locked down. v0.7.0 puts the engine
through adversarial property tests and edge cases, adds a fuzz harness, and
freezes the public API: no breaking change until 2.0. The hardening pass
found and fixed a real panic. The on-disk data format, frozen since 0.3, is
unchanged; only the rebuildable bloom sidecar gains an integrity envelope.

What is lsm-db?

A log-structured merge-tree storage engine for Rust — the write path that powers
RocksDB, LevelDB, Cassandra, and ScyllaDB, packaged as a small, audited library.
It is the storage layer the portfolio's database crates (txn-db, Hive DB) build
on, so the durability and read/write contract is implemented and tested once.

What's new in 0.7.0

Hostile-input hardening — and a real bug fixed

Library code must never panic on a corrupted or truncated on-disk file, nor be
tricked into an unbounded allocation by a hostile length prefix. New property
tests (tests/adversarial.rs) apply arbitrary corruption — bit-flips,
truncation, whole-file garbage — to the run file, the manifest, the write-ahead
log, and the bloom sidecar, then reopen the database, asserting only that it
returns a Result: Ok, or a corruption Err, never a panic.
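
As a rough illustration of the two invariants above (no panic, no allocation sized by an unchecked length prefix), the following std-only sketch parses a length-prefixed record defensively, then replays every single-bit flip and every truncation of a valid record through the parser. `parse_record`, `ParseError`, `MAX_RECORD_LEN`, and the record layout are hypothetical and are not lsm-db's format or API; the real suite applies arbitrary corruption through property tests rather than this exhaustive loop.

```rust
use std::panic;

/// Corruption-class errors returned instead of panicking (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum ParseError {
    Truncated,
    LengthTooLarge,
}

/// Hypothetical cap on one record; not lsm-db's actual limit.
const MAX_RECORD_LEN: usize = 1 << 20;

/// Parse one record: a 4-byte little-endian length prefix, then the payload.
pub fn parse_record(bytes: &[u8]) -> Result<&[u8], ParseError> {
    if bytes.len() < 4 {
        return Err(ParseError::Truncated);
    }
    let len = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]) as usize;
    // Reject a hostile prefix before anything is sized from it.
    if len > MAX_RECORD_LEN {
        return Err(ParseError::LengthTooLarge);
    }
    // `get` returns None instead of panicking when the payload is short.
    bytes.get(4..4 + len).ok_or(ParseError::Truncated)
}

/// Flip one bit; the byte index wraps, so any `bit` is valid for non-empty input.
pub fn flip_bit(data: &mut [u8], bit: usize) {
    if data.is_empty() {
        return;
    }
    let i = (bit / 8) % data.len();
    data[i] ^= 1u8 << (bit % 8);
}

/// Returns true only if no bit-flip or truncation of `good` makes the parser panic.
pub fn survives_corruption(good: &[u8]) -> bool {
    panic::catch_unwind(|| {
        for bit in 0..good.len() * 8 {
            let mut bad = good.to_vec();
            flip_bit(&mut bad, bit);
            let _ = parse_record(&bad); // Ok or Err are both acceptable
        }
        for cut in 0..=good.len() {
            let _ = parse_record(&good[..cut]);
        }
    })
    .is_ok()
}
```

The assertion is deliberately weak: any `Ok` or `Err` passes, and only a panic fails the run.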

That pass found a real panic. A bloom sidecar filled with arbitrary bytes
could postcard-deserialize into an internally-inconsistent bloom-lib filter
that then panicked (out-of-bounds) the moment it was queried. The fix wraps the
sidecar in a magic + CRC32C integrity envelope, so only bytes this crate actually
wrote — which always encode a self-consistent filter — are ever handed to the
deserializer. A corrupt or hostile sidecar fails the envelope and is discarded;
the run is consulted directly, with identical results.
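
A minimal sketch of a magic + CRC32C envelope of this kind, using only the standard library. The magic value `BLM1`, the field order, and the names `seal` and `open_sealed` are assumptions made for illustration; the release notes do not specify the actual sidecar layout.

```rust
/// Hypothetical 4-byte magic; the real sidecar magic is not given in these notes.
const MAGIC: [u8; 4] = *b"BLM1";

/// Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
pub fn crc32c(data: &[u8]) -> u32 {
    let mut crc = !0u32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // mask is all ones when the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0x82F6_3B78 & mask);
        }
    }
    !crc
}

/// Assumed layout: magic (4 bytes) | CRC32C of payload (4 bytes, LE) | payload.
pub fn seal(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(8 + payload.len());
    out.extend_from_slice(&MAGIC);
    out.extend_from_slice(&crc32c(payload).to_le_bytes());
    out.extend_from_slice(payload);
    out
}

/// Return the payload only if magic and checksum both match. `None` means
/// "discard the sidecar and consult the run directly".
pub fn open_sealed(bytes: &[u8]) -> Option<&[u8]> {
    if bytes.len() < 8 || bytes[..4] != MAGIC {
        return None;
    }
    let stored = u32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]);
    let payload = &bytes[8..];
    if crc32c(payload) == stored {
        Some(payload)
    } else {
        None
    }
}
```

In this sketch the deserializer would only ever be called on the slice returned by `open_sealed`, so bytes that fail the check never reach it.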

Edge cases

tests/edge_cases.rs covers the awkward inputs: multi-megabyte values that span
many blocks, fifty un-compacted runs the read path must merge across, empty keys
and values, a 64 KiB key, and an I/O failure mid-flush (the database directory
removed under the engine) surfacing as an Error rather than a panic.
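
To make "merge across many runs" concrete, here is a simplified in-memory model of a point read over un-compacted runs, where the newest run that mentions a key wins. It is a sketch only: the `Run` type, the `get` function, and the use of `None` as a delete marker are assumptions for illustration and do not describe lsm-db's internals.

```rust
use std::collections::BTreeMap;

/// One sorted run (hypothetical model): key -> Some(value), or None for a
/// delete marker.
pub type Run = BTreeMap<Vec<u8>, Option<Vec<u8>>>;

/// Point lookup across runs ordered oldest-first: scan newest to oldest and
/// stop at the first run that mentions the key.
pub fn get<'a>(runs: &'a [Run], key: &[u8]) -> Option<&'a [u8]> {
    runs.iter()
        .rev()
        .find_map(|run| run.get(key))
        // A delete marker in the newest matching run hides older values.
        .and_then(|entry| entry.as_deref())
}
```

With fifty runs that each write the same key, `get` returns the value from the last run pushed; an empty key or empty value is an ordinary entry in this model.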

Fuzz harness

A standalone cargo-fuzz harness lives in fuzz/ (its own workspace, never
built by CI). Two targets — recover (the sorted-run parser) and sidecar (the
bloom sidecar) — drive arbitrary bytes through the public Lsm::open
parse/recovery path:

cargo +nightly fuzz run recover
cargo +nightly fuzz run sidecar

API freeze

The public API is frozen as of 0.7.0 — no breaking change until a 2.0 major.
The on-disk run format, manifest, sidecar, and write-ahead-log layouts are
frozen for 1.x. The remaining 0.x releases make only additive, non-breaking
changes; the full frozen surface is recorded in dev/ROADMAP.md.

Testing

  • Adversarial property tests over corrupted run / manifest / WAL / sidecar:
    never panic, never over-allocate.
  • Edge-case tests for large values, many runs, unusual keys, and I/O failure.
  • The fuzz harness type-checks on Linux; all prior suites continue to pass.

Counts at this tag:

  • Default features: 59 unit + 4 integration + 2 compaction + 6 recovery +
    3 property + 4 adversarial + 5 edge-case + 25 doctests.
  • --all-features: 74 unit + 6 bloom + 8 durability + 4 integration +
    2 compaction + 6 recovery + 3 property + 6 adversarial + 5 edge-case +
    25 doctests.
  • loom (under RUSTFLAGS="--cfg loom"): 2 model checks.

All green on stable and MSRV (1.87) across Windows and Linux (WSL2); cargo fmt,
cargo clippy -D warnings, cargo doc -D warnings, cargo deny check, and
cargo audit clean. Zero unsafe (#![forbid(unsafe_code)]).

Breaking changes

None. This release is additive (tests, a fuzz harness, the sidecar integrity
envelope). The sidecar envelope changes the sidecar file format, but a sidecar is
a rebuildable hint — old sidecars are simply discarded and rewritten — so no
data is affected.

What's next

  • 0.8.x → 0.9.x — Alpha / Beta → RC. Integrate against real consumers and fix
    what they surface (additive only); broaden testing; capture final benchmarks;
    doc polish.
  • 1.0.0 — Stable. Definition-of-Done audit and publication. The engine is
    feature-complete, hardened, and API-frozen; what stands between it and 1.0 is
    soak time and the release cut.

Installation

[dependencies]
lsm-db = "0.7"
# Or, for crash-safe writes and/or bloom-filtered point reads, use this line
# instead of the one above (a key may appear only once in a TOML table):
# lsm-db = { version = "0.7", features = ["durability", "bloom"] }

MSRV: Rust 1.87 (2024 edition).

Documentation


Full diff: v0.6.0...v0.7.0.
Changelog: CHANGELOG.md.