v0.4.0 — Durability and Crash Recovery

Pre-release

@jamesgober jamesgober released this 07 Jun 17:17
· 7 commits to main since this release

lsm-db v0.4.0 — Durability and Crash Recovery

No acknowledged write is lost across a crash. v0.4.0 adds an optional
write-ahead log: under the durability feature, every write is logged and
fsynced before it is acknowledged, and the log is replayed on open — so a write
survives a crash even if it never reached a flush. The feature is additive and
the public API is unchanged; with it off, the engine behaves exactly as in 0.3.

What is lsm-db?

A log-structured merge-tree storage engine for Rust: the same write-path design
used by RocksDB, LevelDB, Cassandra, and ScyllaDB, packaged as a small, audited
library. It is the storage layer the portfolio's database crates (txn-db, Hive DB)
build on, so the durability and read/write contract is implemented and tested once.

What's new in 0.4.0

The durability feature

[dependencies]
lsm-db = { version = "0.4", features = ["durability"] }

With it enabled, each put / delete / write is appended to a wal-db
write-ahead log and made durable before the call returns; a batch is logged as a
single atomic record. On open, the log is replayed into the memtable and the
result is checkpointed to a run. Because the log is emptied at every flush,
recovery only ever replays the writes made since the most recent flush.
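
The log-then-acknowledge ordering described above can be sketched with the standard library alone. This is not the wal-db or lsm-db implementation: the `Wal` type, the `replay` function, and the 8-byte length-prefixed framing are assumptions made for illustration. The point is the order of operations (append, sync, then return) and a replay that stops at a torn final record.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Read, Write};
use std::path::Path;

/// Hypothetical append-only log; illustrative only, not the wal-db API.
pub struct Wal {
    file: File,
}

impl Wal {
    /// Open (or create) the log file in append mode.
    pub fn open(path: &Path) -> io::Result<Self> {
        let file = OpenOptions::new().create(true).append(true).open(path)?;
        Ok(Wal { file })
    }

    /// Append one length-prefixed record and sync it to disk before
    /// returning, so the caller may acknowledge the write on `Ok`.
    pub fn append(&mut self, payload: &[u8]) -> io::Result<()> {
        let mut frame = Vec::with_capacity(8 + payload.len());
        frame.extend_from_slice(&(payload.len() as u64).to_le_bytes());
        frame.extend_from_slice(payload);
        self.file.write_all(&frame)?;
        self.file.sync_data()
    }
}

/// Read back every complete record in order. A partial final frame
/// (a write cut short by a crash) is ignored rather than reported.
pub fn replay(path: &Path) -> io::Result<Vec<Vec<u8>>> {
    let mut bytes = Vec::new();
    File::open(path)?.read_to_end(&mut bytes)?;

    let mut records = Vec::new();
    let mut pos = 0usize;
    while bytes.len() - pos >= 8 {
        let mut header = [0u8; 8];
        header.copy_from_slice(&bytes[pos..pos + 8]);
        let len = u64::from_le_bytes(header);
        let start = pos + 8;
        if ((bytes.len() - start) as u64) < len {
            break; // torn tail: the payload never fully reached the file
        }
        let end = start + len as usize;
        records.push(bytes[start..end].to_vec());
        pos = end;
    }
    Ok(records)
}
```

A production log would also carry checksums and sync the containing directory after creating the file; the sketch leaves both out to keep the ordering visible.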

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let dir = tempfile::tempdir()?;
    {
        let db = lsm_db::Lsm::open(dir.path())?;
        db.put(b"k", b"v")?; // durable before this returns (with `durability`)
        // `db` is dropped here with no explicit flush, as if the process had exited
    }
    let db = lsm_db::Lsm::open(dir.path())?;
    assert_eq!(db.get(b"k")?, Some(b"v".to_vec())); // recovered from the log
    Ok(())
}

The same source compiles and runs with or without the feature: the durability
layer is a zero-sized no-op when it is off, so the non-durable path — ideal for
caches and tests — pays nothing.
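
One way a durability layer can cost nothing when it is off is to make the engine generic over its log and give the "off" variant no fields. The sketch below illustrates that idea only; `WriteLog`, `NoLog`, `MemLog`, and `Store` are hypothetical names, not lsm-db types, and `MemLog` stands in for a real log that would sync to disk.

```rust
use std::collections::BTreeMap;

/// Hypothetical seam between the store and its log.
pub trait WriteLog: Default {
    fn record(&mut self, key: &[u8], value: &[u8]);
}

/// Stand-in for the feature-off layer: no fields, so it is zero-sized,
/// and its `record` is an empty function.
#[derive(Default)]
pub struct NoLog;

impl WriteLog for NoLog {
    #[inline]
    fn record(&mut self, _key: &[u8], _value: &[u8]) {}
}

/// Stand-in for the feature-on layer (kept in memory here for brevity).
#[derive(Default)]
pub struct MemLog {
    pub entries: Vec<(Vec<u8>, Vec<u8>)>,
}

impl WriteLog for MemLog {
    fn record(&mut self, key: &[u8], value: &[u8]) {
        self.entries.push((key.to_vec(), value.to_vec()));
    }
}

/// The same store source works with either log type.
pub struct Store<L: WriteLog> {
    pub log: L,
    memtable: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl<L: WriteLog> Store<L> {
    pub fn new() -> Self {
        Store { log: L::default(), memtable: BTreeMap::new() }
    }

    /// Log first, then apply to the in-memory table.
    pub fn put(&mut self, key: &[u8], value: &[u8]) {
        self.log.record(key, value);
        self.memtable.insert(key.to_vec(), value.to_vec());
    }

    pub fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.memtable.get(key).cloned()
    }
}
```

In a crate, a type alias behind `#[cfg(feature = "durability")]` could pick the concrete log type, so calling code would not change between the two builds. `std::mem::size_of::<NoLog>()` is 0, which is what "zero-sized" means here.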

How recovery stays consistent

A flush makes the buffered writes durable in a sorted run, so the log that held
them is rotated (emptied) at that point. Combined with the manifest-based run
recovery from 0.3, a crash at any moment recovers to a consistent state: flushed
runs come from the manifest, and the un-flushed tail comes from the log. A clean
Drop does not flush (it only stops the background compactor), so dropping a
handle leaves the same un-flushed-but-acknowledged state on disk that a crash
would. The recovery tests rely on this: they drop without flushing, then reopen
and check that every write is present.
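
The "runs from the manifest, tail from the log" argument can be modelled in memory. The sketch below is a toy under stated assumptions, not lsm-db's code: `Disk` stands for whatever survives a crash, `ToyEngine` holds a volatile memtable, and all names are hypothetical. Reopening replays the log and checkpoints the result to a run, as the release text describes.

```rust
use std::collections::BTreeMap;

pub type Key = Vec<u8>;
pub type Slot = Option<Vec<u8>>; // `None` marks a delete (tombstone)

/// Everything that survives a crash in this model.
#[derive(Default)]
pub struct Disk {
    pub runs: Vec<BTreeMap<Key, Slot>>, // oldest first, as a manifest would list them
    pub log: Vec<(Key, Slot)>,          // acknowledged writes since the last flush
}

pub struct ToyEngine {
    disk: Disk,
    memtable: BTreeMap<Key, Slot>, // volatile: lost on crash
}

impl ToyEngine {
    /// Recovery: runs come back as-is, the log tail is replayed in order,
    /// and the replayed state is checkpointed to a run.
    pub fn open(disk: Disk) -> Self {
        let mut engine = ToyEngine { disk, memtable: BTreeMap::new() };
        for (key, slot) in &engine.disk.log {
            engine.memtable.insert(key.clone(), slot.clone());
        }
        engine.flush();
        engine
    }

    fn log_and_apply(&mut self, key: &[u8], slot: Slot) {
        self.disk.log.push((key.to_vec(), slot.clone())); // durable first
        self.memtable.insert(key.to_vec(), slot); // then visible
    }

    pub fn put(&mut self, key: &[u8], value: &[u8]) {
        self.log_and_apply(key, Some(value.to_vec()));
    }

    pub fn delete(&mut self, key: &[u8]) {
        self.log_and_apply(key, None);
    }

    /// Flush: persist the memtable as a new run, and only then empty the log.
    pub fn flush(&mut self) {
        if self.memtable.is_empty() {
            return;
        }
        let run = std::mem::take(&mut self.memtable);
        self.disk.runs.push(run);
        self.disk.log.clear();
    }

    /// Newest source wins: the memtable, then runs from newest to oldest.
    pub fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        if let Some(slot) = self.memtable.get(key) {
            return slot.clone();
        }
        for run in self.disk.runs.iter().rev() {
            if let Some(slot) = run.get(key) {
                return slot.clone();
            }
        }
        None
    }

    /// Simulated crash: the memtable vanishes, only `Disk` remains.
    pub fn crash(self) -> Disk {
        self.disk
    }
}
```

Because every acknowledged write is either in a run or still in the log, calling `crash()` at any point and passing the result back to `open` yields the same visible state, and a second reopen finds an empty log and adds nothing.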

Testing

  • Crash recovery (under --all-features): un-flushed writes, overwrites,
    deletes, and a 200-op batch all survive drop-without-flush and reopen; a
    1,000-write / 200-delete workload recovers to the exact live set; reopening
    twice does not duplicate; writes continue and stay durable after recovery.
  • Log codec: encode/decode round-trips, and rejection of truncated or
    trailing-garbage records.
  • The 0.3 suites — compaction property test, concurrent-writer stress, manifest
    crash recovery, corruption detection, and the loom read-versus-compaction
    model — all run again with durability on.
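
To make the log codec item concrete, here is a sketch of a strict record codec that round-trips and rejects both truncated and trailing-garbage input. The record layout (a tag byte followed by `u32` length-prefixed fields) and the names `Op`, `DecodeError`, `encode`, and `decode` are assumptions for illustration; the release notes do not describe the real on-disk format.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Op {
    Put { key: Vec<u8>, value: Vec<u8> },
    Delete { key: Vec<u8> },
}

#[derive(Debug, PartialEq, Eq)]
pub enum DecodeError {
    Truncated,
    TrailingBytes,
    UnknownTag(u8),
}

/// Write a field as a little-endian u32 length followed by the bytes.
/// Assumes each field is shorter than 4 GiB.
fn put_field(out: &mut Vec<u8>, bytes: &[u8]) {
    out.extend_from_slice(&(bytes.len() as u32).to_le_bytes());
    out.extend_from_slice(bytes);
}

pub fn encode(op: &Op) -> Vec<u8> {
    let mut out = Vec::new();
    match op {
        Op::Put { key, value } => {
            out.push(0);
            put_field(&mut out, key);
            put_field(&mut out, value);
        }
        Op::Delete { key } => {
            out.push(1);
            put_field(&mut out, key);
        }
    }
    out
}

/// Split `n` bytes off the front of `input`, or fail if it is too short.
fn take<'a>(input: &mut &'a [u8], n: usize) -> Result<&'a [u8], DecodeError> {
    let current: &'a [u8] = *input;
    if current.len() < n {
        return Err(DecodeError::Truncated);
    }
    let (head, tail) = current.split_at(n);
    *input = tail;
    Ok(head)
}

fn take_field<'a>(input: &mut &'a [u8]) -> Result<&'a [u8], DecodeError> {
    let header = take(input, 4)?;
    let len = u32::from_le_bytes([header[0], header[1], header[2], header[3]]) as usize;
    take(input, len)
}

/// Decode exactly one record; any leftover byte is an error.
pub fn decode(mut input: &[u8]) -> Result<Op, DecodeError> {
    let tag = take(&mut input, 1)?[0];
    let op = match tag {
        0 => {
            let key = take_field(&mut input)?.to_vec();
            let value = take_field(&mut input)?.to_vec();
            Op::Put { key, value }
        }
        1 => Op::Delete { key: take_field(&mut input)?.to_vec() },
        other => return Err(DecodeError::UnknownTag(other)),
    };
    if !input.is_empty() {
        return Err(DecodeError::TrailingBytes);
    }
    Ok(op)
}
```

Refusing trailing bytes, rather than silently ignoring them, means a record that decodes successfully is known to account for every byte it was given.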

Counts at this tag:

  • Default features: 54 unit + 4 integration + 2 compaction + 6 recovery +
    3 property + 23 doctests.
  • --all-features: 59 unit + 8 durability + 4 integration + 2 compaction +
    6 recovery + 3 property + 23 doctests.
  • loom (under RUSTFLAGS="--cfg loom"): 2 model checks.

All green on stable and MSRV (1.85) across Linux, macOS, and Windows; cargo fmt,
cargo clippy -D warnings, cargo doc -D warnings, cargo deny check, and
cargo audit clean. Zero unsafe (#![forbid(unsafe_code)]).

What's next

  • 0.5.0 — Bloom filters + feature freeze. Per-run bloom-lib filters under
    the bloom feature to skip runs that cannot contain a key on negative
    lookups, plus a pluggable comparator for custom key ordering.

Installation

[dependencies]
lsm-db = "0.4"
# or, for crash-safe writes:
lsm-db = { version = "0.4", features = ["durability"] }

MSRV: Rust 1.85 (2024 edition).

Documentation


Full diff: v0.3.0...v0.4.0.
Changelog: CHANGELOG.md.