v0.4.0 — Durability and Crash Recovery
Pre-release
No acknowledged write is lost across a crash. v0.4.0 adds an optional
write-ahead log: under the durability feature, every write is logged and
fsynced before it is acknowledged, and the log is replayed on open — so a write
survives a crash even if it never reached a flush. The feature is additive and
the public API is unchanged; with it off, the engine behaves exactly as in 0.3.
What is lsm-db?
A log-structured merge-tree storage engine for Rust — the write path that powers
RocksDB, LevelDB, Cassandra, and ScyllaDB, packaged as a small, audited library.
It is the storage layer the portfolio's database crates (txn-db, Hive DB) build
on, so the durability and read/write contract is implemented and tested once.
What's new in 0.4.0
The durability feature
[dependencies]
lsm-db = { version = "0.4", features = ["durability"] }
With it enabled, each put / delete / write is appended to a wal-db
write-ahead log and made durable before the call returns; a batch is logged as a
single atomic record. On open, the log is replayed into the memtable and
checkpointed to a run, so recovery only ever replays the writes since the most
recent flush.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let dir = tempfile::tempdir()?;
    {
        let db = lsm_db::Lsm::open(dir.path())?;
        db.put(b"k", b"v")?; // durable before this returns (with `durability`)
        // simulated crash: `db` is dropped here with no explicit flush
    }
    let db = lsm_db::Lsm::open(dir.path())?;
    assert_eq!(db.get(b"k")?, Some(b"v".to_vec())); // recovered from the log
    Ok(())
}
The same source compiles and runs with or without the feature: the durability
layer is a zero-sized no-op when it is off, so the non-durable path, ideal for
caches and tests, pays nothing.
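To make the "log, fsync, then acknowledge" ordering concrete, here is a minimal sketch that uses only the standard library. It is not the wal-db or lsm-db implementation: the ToyWal type, its record layout (two little-endian u32 lengths followed by key and value bytes), and the choice to stop replay at a torn tail are all assumptions made for illustration.

```rust
use std::collections::BTreeMap;
use std::fs::{File, OpenOptions};
use std::io::{self, BufReader, Read, Write};
use std::path::Path;

/// Hypothetical toy store: an in-memory map fronted by an append-only log.
pub struct ToyWal {
    log: File,
    mem: BTreeMap<Vec<u8>, Vec<u8>>,
}

/// Fill `buf` completely, or report `false` if the input ends first.
fn read_or_eof(r: &mut impl Read, buf: &mut [u8]) -> io::Result<bool> {
    match r.read_exact(buf) {
        Ok(()) => Ok(true),
        Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => Ok(false),
        Err(e) => Err(e),
    }
}

impl ToyWal {
    /// Open the log at `path`, replaying every complete record into memory.
    pub fn open(path: &Path) -> io::Result<Self> {
        let mut mem = BTreeMap::new();
        if path.exists() {
            let mut r = BufReader::new(File::open(path)?);
            loop {
                // Record layout (assumed): key_len, val_len (u32 LE), key, value.
                let mut hdr = [0u8; 8];
                if !read_or_eof(&mut r, &mut hdr)? {
                    break; // clean end of log, or a torn header
                }
                let klen = u32::from_le_bytes([hdr[0], hdr[1], hdr[2], hdr[3]]) as usize;
                let vlen = u32::from_le_bytes([hdr[4], hdr[5], hdr[6], hdr[7]]) as usize;
                let mut key = vec![0u8; klen];
                let mut val = vec![0u8; vlen];
                if !read_or_eof(&mut r, &mut key)? || !read_or_eof(&mut r, &mut val)? {
                    break; // torn tail: this record was never acknowledged
                }
                mem.insert(key, val);
            }
        }
        let log = OpenOptions::new().create(true).append(true).open(path)?;
        Ok(Self { log, mem })
    }

    /// Append the record, fsync it, and only then apply it and return.
    pub fn put(&mut self, key: &[u8], value: &[u8]) -> io::Result<()> {
        let mut rec = Vec::with_capacity(8 + key.len() + value.len());
        rec.extend_from_slice(&(key.len() as u32).to_le_bytes());
        rec.extend_from_slice(&(value.len() as u32).to_le_bytes());
        rec.extend_from_slice(key);
        rec.extend_from_slice(value);
        self.log.write_all(&rec)?;
        self.log.sync_data()?; // durable before the caller sees Ok(())
        self.mem.insert(key.to_vec(), value.to_vec());
        Ok(())
    }

    /// Look a key up in the in-memory map rebuilt from the log.
    pub fn get(&self, key: &[u8]) -> Option<&[u8]> {
        self.mem.get(key).map(|v| v.as_slice())
    }
}
```

Dropping a ToyWal and opening the same path again rebuilds the map from the log alone, which is the property the example above relies on.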
How recovery stays consistent
A flush makes the buffered writes durable in a sorted run, so the log that held
them is rotated (emptied) at that point. Combined with the manifest-based run
recovery from 0.3, a crash at any moment recovers to a consistent state: flushed
runs come from the manifest, and the un-flushed tail comes from the log. Because
a clean Drop does not flush — it only stops the background compactor — the
recovery tests reproduce exactly the un-flushed-but-acknowledged state a crash
leaves, then reopen and check every write is present.
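The interplay between flush, log rotation, and recovery can be sketched with a small in-memory model. Everything here is hypothetical (the Disk and Engine types, their fields, and the method names) and says nothing about how lsm-db lays out its manifest or log. It only shows why "flushed runs plus the un-flushed log tail" covers every acknowledged write.

```rust
use std::collections::BTreeMap;

type Key = Vec<u8>;
/// `None` marks a delete (tombstone).
type Value = Option<Vec<u8>>;

/// Hypothetical stand-in for what survives a crash.
#[derive(Default)]
pub struct Disk {
    runs: Vec<BTreeMap<Key, Value>>, // flushed runs, oldest first
    log: Vec<(Key, Value)>,          // writes since the most recent flush
}

/// Durable state plus a volatile memtable that a crash discards.
pub struct Engine {
    disk: Disk,
    memtable: BTreeMap<Key, Value>,
}

impl Engine {
    /// Replay the log tail into a fresh memtable, then checkpoint it to a run.
    pub fn recover(disk: Disk) -> Self {
        let mut e = Engine { disk, memtable: BTreeMap::new() };
        for (k, v) in e.disk.log.clone() {
            e.memtable.insert(k, v);
        }
        e.flush();
        e
    }

    fn write(&mut self, key: &[u8], value: Value) {
        self.disk.log.push((key.to_vec(), value.clone())); // log first
        self.memtable.insert(key.to_vec(), value); // then apply
    }

    pub fn put(&mut self, key: &[u8], value: &[u8]) {
        self.write(key, Some(value.to_vec()));
    }

    pub fn delete(&mut self, key: &[u8]) {
        self.write(key, None);
    }

    /// Persist the memtable as a run, then rotate (empty) the log.
    pub fn flush(&mut self) {
        if self.memtable.is_empty() {
            return;
        }
        let run = std::mem::take(&mut self.memtable);
        self.disk.runs.push(run);
        self.disk.log.clear(); // safe: the run now holds these writes
    }

    /// Newest data wins: memtable first, then runs from newest to oldest.
    pub fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        if let Some(v) = self.memtable.get(key) {
            return v.clone();
        }
        for run in self.disk.runs.iter().rev() {
            if let Some(v) = run.get(key) {
                return v.clone();
            }
        }
        None
    }

    /// Number of records recovery would have to replay right now.
    pub fn pending_log_records(&self) -> usize {
        self.disk.log.len()
    }

    /// Simulate a crash: the memtable is lost, only `Disk` survives.
    pub fn crash(self) -> Disk {
        self.disk
    }
}
```

In this model a crash between a flush and the next write leaves an empty log, and a crash after further writes leaves exactly those writes in the log, so recovery never replays more than the tail since the last flush.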
Testing
- Crash recovery (under --all-features): un-flushed writes, overwrites,
deletes, and a 200-op batch all survive drop-without-flush and reopen; a
1,000-write / 200-delete workload recovers to the exact live set; reopening
twice does not duplicate; writes continue and stay durable after recovery.
- Log codec: encode/decode round-trips, and rejection of truncated or
trailing-garbage records.
- The 0.3 suites (compaction property test, concurrent-writer stress, manifest
crash recovery, corruption detection, and the loom read-versus-compaction
model) all run again with durability on.
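As an illustration of what the log codec tests check, the sketch below shows a strict decoder that round-trips with its encoder and rejects both truncated input and trailing bytes. The record layout, the Record and DecodeError types, and the function names are assumptions for this example, not the actual wal-db format.

```rust
/// Hypothetical log record: a put carries `Some(value)`, a delete carries `None`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Record {
    pub key: Vec<u8>,
    pub value: Option<Vec<u8>>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum DecodeError {
    Truncated,
    BadTag,
    TrailingBytes,
}

/// Assumed layout: key_len (u32 LE), key, tag (0 = delete, 1 = put),
/// and for a put: val_len (u32 LE), value.
pub fn encode(rec: &Record) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&(rec.key.len() as u32).to_le_bytes());
    out.extend_from_slice(&rec.key);
    match &rec.value {
        Some(v) => {
            out.push(1);
            out.extend_from_slice(&(v.len() as u32).to_le_bytes());
            out.extend_from_slice(v);
        }
        None => out.push(0),
    }
    out
}

/// Split `n` bytes off the front of `buf`, or fail if fewer remain.
fn take<'a>(buf: &mut &'a [u8], n: usize) -> Result<&'a [u8], DecodeError> {
    let cur: &'a [u8] = *buf;
    if cur.len() < n {
        return Err(DecodeError::Truncated);
    }
    let (head, tail) = cur.split_at(n);
    *buf = tail;
    Ok(head)
}

fn take_u32(buf: &mut &[u8]) -> Result<usize, DecodeError> {
    let b = take(buf, 4)?;
    Ok(u32::from_le_bytes([b[0], b[1], b[2], b[3]]) as usize)
}

/// Decode exactly one record; the input must contain nothing else.
pub fn decode(mut buf: &[u8]) -> Result<Record, DecodeError> {
    let klen = take_u32(&mut buf)?;
    let key = take(&mut buf, klen)?.to_vec();
    let tag = take(&mut buf, 1)?[0];
    let value = match tag {
        0 => None,
        1 => {
            let vlen = take_u32(&mut buf)?;
            Some(take(&mut buf, vlen)?.to_vec())
        }
        _ => return Err(DecodeError::BadTag),
    };
    if !buf.is_empty() {
        return Err(DecodeError::TrailingBytes);
    }
    Ok(Record { key, value })
}
```

Requiring the decoder to consume the whole buffer is what turns trailing garbage into an error instead of silently ignored bytes.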
Counts at this tag:
- Default features: 54 unit + 4 integration + 2 compaction + 6 recovery +
3 property + 23 doctests.
- --all-features: 59 unit + 8 durability + 4 integration + 2 compaction +
6 recovery + 3 property + 23 doctests.
- loom (under RUSTFLAGS="--cfg loom"): 2 model checks.
All green on stable and MSRV (1.85) across Linux, macOS, and Windows; cargo fmt,
cargo clippy -D warnings, cargo doc -D warnings, cargo deny check, and
cargo audit clean. Zero unsafe (#![forbid(unsafe_code)]).
What's next
- 0.5.0: Bloom filters + feature freeze. Per-run bloom-lib filters under
the bloom feature to skip runs that cannot contain a key on negative
lookups, plus a pluggable comparator for custom key ordering.
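For readers unfamiliar with the idea, a generic Bloom filter can be sketched as follows. This is not the bloom-lib API or the planned 0.5.0 design. The Bloom type, its sizing parameters, and the use of the standard library's DefaultHasher with double hashing are assumptions chosen to keep the example dependency-free.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical per-run filter: `k` probe bits per key in an `nbits`-bit array.
pub struct Bloom {
    bits: Vec<u64>,
    nbits: u64,
    k: u32,
}

impl Bloom {
    pub fn new(nbits: u64, k: u32) -> Self {
        let nbits = nbits.max(64);
        Bloom { bits: vec![0; ((nbits + 63) / 64) as usize], nbits, k }
    }

    /// Two base hashes of the key; probe `i` uses `a + i * b` (double hashing).
    fn hashes(key: &[u8]) -> (u64, u64) {
        let mut h1 = DefaultHasher::new();
        key.hash(&mut h1);
        let mut h2 = DefaultHasher::new();
        (key, 0x9e37u16).hash(&mut h2); // salted so the second hash differs
        (h1.finish(), h2.finish() | 1)
    }

    fn bit(&self, a: u64, b: u64, i: u32) -> u64 {
        a.wrapping_add((i as u64).wrapping_mul(b)) % self.nbits
    }

    /// Set all `k` probe bits for `key`.
    pub fn insert(&mut self, key: &[u8]) {
        let (a, b) = Self::hashes(key);
        for i in 0..self.k {
            let p = self.bit(a, b, i);
            self.bits[(p / 64) as usize] |= 1u64 << (p % 64);
        }
    }

    /// `false` means the key was definitely never inserted, so the run can be
    /// skipped; `true` means it may be present and the run must be read.
    pub fn may_contain(&self, key: &[u8]) -> bool {
        let (a, b) = Self::hashes(key);
        (0..self.k).all(|i| {
            let p = self.bit(a, b, i);
            self.bits[(p / 64) as usize] & (1u64 << (p % 64)) != 0
        })
    }
}
```

A lookup would consult each run's filter before reading the run. A filter can return a false positive, which costs one unnecessary read, but never a false negative, so skipping a run on `false` cannot hide a key.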
Installation
[dependencies]
lsm-db = "0.4"
# or, for crash-safe writes:
lsm-db = { version = "0.4", features = ["durability"] }
MSRV: Rust 1.85 (2024 edition).
Documentation
Full diff: v0.3.0...v0.4.0.
Changelog: CHANGELOG.md.