Releases · feichai0017/holt

13 Jun 14:32

github-actions

v0.7.1

279addf

v0.7.1 (Latest)

Fixed

Durability: an acknowledged write could be lost after a crash. The
checkpoint's WAL-truncate gate (maybe_truncate) checked only the
BufferManager dirty/flushing/pending counters, not the store's own deferred
durability (needs_flush). The I/O worker retires a written-through blob
right after the pwrite but before the data fsync and manifest-delta persist,
so the WAL could be truncated while a just-written blob's new slot mapping
existed only in the in-memory manifest. A crash at that point left the
reopened store with the acknowledged record in neither the WAL nor
manifest.log. The gate now also waits on needs_flush(), mirroring the
existing run_round early-skip guard. Surfaced by the nightly crash-soak;
0.7.0's lazy routing compaction amplified the exposure by rewriting the root
blob every round.

Durability: a torn WAL tail is now truncated on reopen. Previously the
writer reopened with O_APPEND over the torn bytes, turning a partial tail
record into a mid-log torn record at which a later replay would stop,
silently stranding every acknowledged record written after it. replay_wal
now truncates the WAL to the last complete record on open, which is standard
WAL recovery: the torn record was never acknowledged (the crash preceded its
fdatasync), so nothing durable is lost.
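
The torn-tail fix can be pictured with a small sketch. It is not holt's replay_wal: the frame layout ([len: u32 LE][checksum: u32 LE][payload]), the FNV-1a checksum, and every function name below are assumptions made for illustration. The idea is only to scan forward record by record, stop at the first record that is incomplete or fails its checksum, and cut the file back to that offset before any new append.

```rust
use std::fs::OpenOptions;
use std::io::{self, Read};
use std::path::Path;

/// Hypothetical frame layout: [len: u32 LE][checksum: u32 LE][payload; len].
const HEADER_LEN: usize = 8;

/// FNV-1a, standing in for whatever checksum a real WAL would use.
fn fnv1a(data: &[u8]) -> u32 {
    let mut hash: u32 = 0x811c_9dc5;
    for &byte in data {
        hash ^= u32::from(byte);
        hash = hash.wrapping_mul(0x0100_0193);
    }
    hash
}

fn read_u32(bytes: &[u8]) -> u32 {
    u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]])
}

/// Encode one record in the hypothetical frame layout.
fn encode_record(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(HEADER_LEN + payload.len());
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(&fnv1a(payload).to_le_bytes());
    out.extend_from_slice(payload);
    out
}

/// Offset just past the last record that is fully present and passes its checksum.
fn last_complete_offset(log: &[u8]) -> usize {
    let mut offset = 0;
    while log.len() - offset >= HEADER_LEN {
        let len = read_u32(&log[offset..]) as usize;
        let sum = read_u32(&log[offset + 4..]);
        let start = offset + HEADER_LEN;
        let end = match start.checked_add(len) {
            Some(end) if end <= log.len() => end,
            _ => break, // payload runs past end of file: torn tail
        };
        if fnv1a(&log[start..end]) != sum {
            break; // payload present but corrupt: treat as torn
        }
        offset = end;
    }
    offset
}

/// Truncate the file at `path` to its last complete record; returns the new length.
fn truncate_torn_tail(path: &Path) -> io::Result<u64> {
    let mut file = OpenOptions::new().read(true).write(true).open(path)?;
    let mut log = Vec::new();
    file.read_to_end(&mut log)?;
    let valid = last_complete_offset(&log) as u64;
    if valid < log.len() as u64 {
        file.set_len(valid)?;
        file.sync_all()?; // make the truncation durable before appending again
    }
    Ok(valid)
}
```

Appending only after this truncation keeps a partial record from ending up in the middle of the log, which is the failure mode described above.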


12 Jun 07:32

github-actions

v0.5.5

0cf9089

v0.5.5

Fixed

FileBlobStore::open now takes an exclusive flock(2) on
<data_dir>/store.lock and holds it for the lifetime of the
instance. Two live instances on one data directory previously
replayed manifest.log into the same next_slot, assigned the
same slot to different blob GUIDs, and appended conflicting set
deltas — permanently poisoning the manifest (every later open
failed with FileBlobStore::Manifest::duplicate slot) while the
colliding frames overwrote each other in blobs.dat. Since 0.5.0
even read-only snapshots persist frozen root frames, so the
overlap window of a plain handover (store = reopen(path)) was
enough to trip this. Open now waits up to 5 s for the previous
instance to finish dropping, so handover reopen serializes; a
genuinely concurrent second opener fails with a clear
WouldBlock error instead of corrupting the store. Same-process
double-opens are caught too (flock is per open-file-description),
and the kernel releases the lock if the holder crashes.
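
The "wait up to 5 s, then fail with WouldBlock" behaviour can be sketched as a retry loop around a non-blocking lock attempt. This is a model, not FileBlobStore::open: the function name, the polling interval, and the closure standing in for the non-blocking flock(2) call are all assumptions.

```rust
use std::io;
use std::thread;
use std::time::{Duration, Instant};

/// Retry a non-blocking lock attempt until it succeeds or `timeout` elapses.
///
/// `try_lock` is a stand-in for a non-blocking exclusive lock call. It returns
/// Ok(true) when the lock was acquired and Ok(false) when another holder has it.
fn acquire_with_deadline<F>(mut try_lock: F, timeout: Duration, poll: Duration) -> io::Result<()>
where
    F: FnMut() -> io::Result<bool>,
{
    let deadline = Instant::now() + timeout;
    loop {
        if try_lock()? {
            return Ok(());
        }
        if Instant::now() >= deadline {
            // A genuinely concurrent opener gets a clear error instead of blocking forever.
            return Err(io::Error::new(
                io::ErrorKind::WouldBlock,
                "data directory is locked by another instance",
            ));
        }
        thread::sleep(poll);
    }
}
```

A handover reopen succeeds once the previous holder drops its lock inside the window, while a second live instance runs out the deadline and gets the WouldBlock error.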


11 Jun 13:19

github-actions

v0.7.0

f425715

v0.7.0

Added

BlobStore::read_blobs: a batched full-frame read on the public trait.
The default loops over read_blob; stores override it for device
parallelism (Linux io_uring submits one ring batch, the pread store
fans the reads across worker threads). Used by the cold-scan read-ahead.

Page-granular cold reads. A point lookup on routed (write-cold) data fetches
only the header page, the blob's routing region, and the one leaf page its
descent reaches (~18 KB mean) instead of pinning the whole 512 KB frame
(~27× less cold I/O). The routing region is built at compaction.

Per-blob bloom filter at the tail of the routing region (read for free with
it): cold negative lookups answer NotFound without a leaf-page read.
No false negatives. Additive on disk (bloom_len == 0 = no bloom).

Bounded resident routing cache: routing regions for hot blobs are held in a
byte-bounded cache so repeat cold reads skip the routing-region read.

Cold-scan I/O read-ahead: range scans prefetch upcoming child blobs through
pin_scan_many → batched read_blobs, reading them at the device's natural
queue depth instead of one serial round-trip each.
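
The shape of the read_blobs default can be sketched with a toy trait. The trait below is not holt's BlobStore: the BlobId alias, the signatures, the return types, and MemStore are assumptions; only the "default loops over read_blob, stores may override" structure comes from the notes above.

```rust
use std::collections::HashMap;
use std::io;

/// Hypothetical blob identifier; the real trait's key type is not shown here.
type BlobId = u64;

trait BlobStore {
    /// Read one full blob frame.
    fn read_blob(&self, id: BlobId) -> io::Result<Vec<u8>>;

    /// Batched read. The default is a serial loop over `read_blob`, returning
    /// frames in request order; a store with device parallelism would override it.
    fn read_blobs(&self, ids: &[BlobId]) -> io::Result<Vec<Vec<u8>>> {
        ids.iter().map(|&id| self.read_blob(id)).collect()
    }
}

/// Toy in-memory store that relies on the default `read_blobs`.
struct MemStore {
    blobs: HashMap<BlobId, Vec<u8>>,
}

impl BlobStore for MemStore {
    fn read_blob(&self, id: BlobId) -> io::Result<Vec<u8>> {
        self.blobs
            .get(&id)
            .cloned()
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no such blob"))
    }
}
```

Because the batched method has a default, existing implementors keep compiling and only stores that can exploit queue depth need to provide their own version.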

Changed

Breaking (on-disk format). Manifest format v4 → v6 (the blob header now
records the per-blob routing-region geometry). Older manifests are not
migrated: the loader rejects any non-v6 manifest, so a store written by
0.6.x cannot be opened by this release (and a v6 store cannot be opened by
0.6.x). Holt is pre-1.0 with no production deployments, so recreate the
store on upgrade.

Compaction builds (and the read path validates) the routing region + bloom;
structural write-path mutations de-route a blob, and write-cold blobs are
re-routed lazily by maintenance.
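
The "reject, do not migrate" rule amounts to a strict equality check at open time. The sketch below is illustrative only; the error type and function name are hypothetical, and only the version number 6 comes from the notes above.

```rust
/// The only manifest version this sketch accepts (v6, per the release notes).
const MANIFEST_VERSION: u32 = 6;

/// Hypothetical error type for a failed open.
#[derive(Debug, PartialEq)]
enum OpenError {
    UnsupportedManifest { found: u32, expected: u32 },
}

/// Accept exactly the current version; older and newer manifests are both rejected.
fn check_manifest_version(found: u32) -> Result<(), OpenError> {
    if found == MANIFEST_VERSION {
        Ok(())
    } else {
        Err(OpenError::UnsupportedManifest {
            found,
            expected: MANIFEST_VERSION,
        })
    }
}
```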

Removed

Removed the cold.idx cold-read sidecar; the in-blob routing region is now
the sole cold-read path.

Removed the docs/design/ working notes. Rationale for shipped features lives
in commit messages; rationale for rejected paths (io_uring WAL rewrite, the
two blob-fill fixes) lives in git history.


09 Jun 16:39

github-actions

v0.6.0

2122f1c

v0.6.0

Added

Added a shared in-memory WAL byte ring as the only append path. Foreground
writers reserve byte ranges concurrently, copy encoded records directly into
the ring, and a single flusher drains committed byte prefixes to the WAL file.

Added loom coverage and crash-soak validation for the WAL ring's
reserve/publish/flush ordering, including multi-publisher gap-safety checks.
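
The reserve/publish/flush ordering, and the gap-safety property in particular, can be modelled in a few lines. This is a single-threaded bookkeeping model, not the concurrent ring: it tracks offsets only, holds no bytes, and every name in it is hypothetical. It shows why the flusher may drain only a contiguous published prefix: a range published ahead of an earlier, still-unpublished reservation must wait.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq)]
enum ReserveError {
    Empty,    // zero-length record
    TooLarge, // record can never fit in the ring
    Full,     // backpressure: wait for the flusher to drain
}

/// Offset bookkeeping for a byte ring (model only).
struct RingModel {
    capacity: u64,
    reserved: u64,                 // next logical offset to hand out
    flushed: u64,                  // everything below this offset has been drained
    published: BTreeMap<u64, u64>, // start -> len of published, undrained ranges
}

impl RingModel {
    fn new(capacity: u64) -> Self {
        RingModel { capacity, reserved: 0, flushed: 0, published: BTreeMap::new() }
    }

    /// Hand out a byte range; validation happens before any offset is consumed.
    fn reserve(&mut self, len: u64) -> Result<u64, ReserveError> {
        if len == 0 {
            return Err(ReserveError::Empty);
        }
        if len > self.capacity {
            return Err(ReserveError::TooLarge);
        }
        if self.reserved - self.flushed + len > self.capacity {
            return Err(ReserveError::Full);
        }
        let start = self.reserved;
        self.reserved += len;
        Ok(start)
    }

    /// Mark a previously reserved range as fully copied in.
    fn publish(&mut self, start: u64, len: u64) {
        self.published.insert(start, len);
    }

    /// Drain the contiguous published prefix; returns how many bytes were drained.
    fn flush(&mut self) -> u64 {
        let before = self.flushed;
        while let Some(len) = self.published.remove(&self.flushed) {
            self.flushed += len;
        }
        self.flushed - before
    }
}
```

In the model, publishing the second reservation first drains nothing; both ranges drain together once the first is published.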

Changed

Flattened leaf storage and child addressing for the persistent ART layout:
small records can stay inline, child body offsets are stored directly, and
inner-node child scans use compact u16 addressing with SIMD fast paths.

Reworked the WAL group-commit plumbing around ring backpressure instead of a
per-record channel/worker handoff, reducing the concurrent durable write
bottleneck while preserving the existing WAL record format and replay reader.

Tightened journal validation so empty or over-capacity records are rejected
before reservation instead of relying on debug-only assertions.

Removed

Removed the legacy WAL channel backend and its transitional design documents.
The ring-backed journal is now the only implementation.

Removed rejected experiment notes that were no longer part of the supported
architecture.

Validation

cargo test --workspace --all-features --locked
cargo clippy --workspace --all-features --all-targets --locked -- -D warnings
RUSTFLAGS="--cfg loom" cargo test -p holt --lib journal::ring::loom --locked


07 Jun 07:53

feichai0017

v0.5.4

c25f5a9

v0.5.4

Removed

Removed the external-log state-machine surface from holt core:
Durability::StateMachine, DB::commit_durable,
Tree::commit_durable, durable_applied_index, DB::scatter,
DB::scatter_independent, and the file-store DurableManifest
trailer.

Checkpoint images are now pure DB archive/transfer images. They
contain family key/value data and no longer carry an external
applied_index.

Atomic DB/Tree batches always use the exclusive mutation gate again;
holt no longer has a StateMachine-only relaxed batch mode.


06 Jun 14:36

github-actions

v0.5.3

ed9ba55

v0.5.3

Fixed

Preserved checkpoint-owned cache images while copy-on-write snapshot reclaim,
DB-wide GC, direct blob deletes, or write-through paths run concurrently.
This fixes NoKV-style metadata pressure that could otherwise report
snapshot_dirty_versions: dirty entry lost cache image or
write_through_batch: flushing entry lost cache image.

Kept direct write-through from retiring another in-flight checkpoint epoch;
it now clears only unclaimed dirty state and leaves flushing ownership intact.

Validation

cargo fmt --all -- --check
cargo clippy --workspace --all-targets -- -D warnings
cargo test store::buffer_manager::tests -- --nocapture
NoKV sibling FUSE/RustFS/JuiceFS smoke test with a local Holt patch completed
without checkpoint invariant failures.


06 Jun 08:49

feichai0017

v0.5.2

99997b0

v0.5.2

Added

Added CheckpointImage::validate() to validate a full exported DB
checkpoint image before install or archive handoff, not just its header.

Added KeyScanOutcome and KeyRangeBuilder::visit_with_outcome so callers
can distinguish prefix-list cache hits from real ART walks without changing
the stable ScanStats field set.

Added PrefixCount, Tree::prefix_count, and View::prefix_count for
bounded DFS-style prefix cardinality checks. Non-zero limits scan at most one
entry past the limit and report whether the count is exact.

Added DB::scatter_independent for StateMachine-mode independent single-key
fan-out across named families. It rejects duplicate (tree, key) pairs and
applies unrelated writes concurrently through Holt's native per-key paths.
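
The bounded prefix count is easiest to see on a plain sorted map. The sketch below runs over a BTreeMap rather than the ART, and the struct fields, the function signature, and the treatment of a zero limit as "unbounded" are assumptions. What it takes from the notes is the "at most one entry past the limit" rule: seeing limit + 1 matches is enough to know the count is not exact.

```rust
use std::collections::BTreeMap;

/// Hypothetical result shape: the capped count and whether it is exact.
#[derive(Debug, PartialEq)]
struct PrefixCount {
    count: usize,
    exact: bool,
}

/// Count keys starting with `prefix`, scanning at most `limit + 1` matches.
/// A limit of 0 is treated as unbounded here (an assumption of this sketch).
fn prefix_count(map: &BTreeMap<Vec<u8>, Vec<u8>>, prefix: &[u8], limit: usize) -> PrefixCount {
    // Keys are sorted, so all matches form one contiguous run starting at `prefix`.
    let matching = map
        .range(prefix.to_vec()..)
        .take_while(|(key, _)| key.starts_with(prefix));
    if limit == 0 {
        return PrefixCount { count: matching.count(), exact: true };
    }
    // One entry past the limit tells us whether more matches exist.
    let seen = matching.take(limit.saturating_add(1)).count();
    PrefixCount { count: seen.min(limit), exact: seen <= limit }
}
```

With three keys under a prefix, a limit of 2 yields a count of 2 marked inexact, while a limit of 3 or more yields an exact 3.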

Changed

Refactored DB::scatter to share the same single-key apply helper as
scatter_independent, keeping ordered scatter semantics while avoiding a
second implementation of each operation kind.

Clarified DB::install_checkpoint as a fresh/wiped-DB install path; Holt does
not expose online live-DB checkpoint replacement.

Validation

cargo clippy --workspace --all-targets -- -D warnings
cargo test --test scan_stats --test scatter --test checkpoint


06 Jun 07:47

github-actions

v0.5.1

d2a7f31

v0.5.1

Fixed

Added durable recovery coverage for NoKV-style metadata stores using many
named families and multi-family DB::atomic batches under
Durability::StateMachine.

Verified that DB::commit_durable(applied_index) can reopen a
metadata-service-shaped checkpoint without a Holt WAL and retain the durable
applied index needed for external-log replay.

Validation

cargo test --test sm_durable durable_recovers_metadata_store_shaped_workload -- --exact --nocapture
cargo test --release --test sm_durable durable_recovers_metadata_store_shaped_workload -- --exact
NoKV sibling validation with a local Holt patch:
cargo test --config 'patch.crates-io.holt.path="../holt"' -p nokv-meta -p nokv-cluster -p nokv-server


06 Jun 05:58

github-actions

v0.5.0

2324407

v0.5.0

This release adds a two-axis durability model (who owns durability ×
where data lives) and the metadata-shaped fast paths a replicated
metadata service needs, plus crash-consistent on-disk recovery for the
state-machine mode. It contains breaking API and on-disk changes — see
Changed.

Added

Durability policy. Durability::Wal { sync } / Durability::StateMachine
replaces the ad-hoc wal_sync flag and is orthogonal to Storage. Wal is
single-node: holt's own write-ahead log is the durable record.
StateMachine is for a replicated state machine: an external log (e.g. Raft)
owns durability and replay, and holt attaches no WAL.

Durable state-machine recovery. Under Durability::StateMachine with file
storage, DB::commit_durable(applied_index) / Tree::commit_durable write a
crash-consistent on-disk checkpoint without a WAL: a copy-on-write snapshot
plus an atomic manifest rename recording the durable roots, applied_index,
and the resume next_seq. Reopen rehydrates from it and exposes
durable_applied_index(); the external log replays only the tail past that
index. Verified by fault injection and a SIGKILL crash soak.

DB::export_checkpoint / DB::install_checkpoint: a whole-DB
logical-KV snapshot image carrying applied_index, for shipping and
installing state-machine snapshots (Raft InstallSnapshot).

Tree::put_many_if_absent: create every absent key as one atomic batch
(single WAL record), reporting per key whether it was Created or
AlreadyExists.

DB::scatter: independent single-key conditional writes across families
with no cross-family atomic barrier; each runs on its own per-key concurrent
path so unrelated keys never serialize. StateMachine-only (the log owns
write ordering).

ScanStats: per-scan visited / returned / rollup / restarts
accounting on RangeIter / KeyRangeIter (read via .stats()), and the
return type of KeyRangeBuilder::visit. Surfaces work-vs-yield so callers can
spot tombstone-bloated listings.

Copy-on-write snapshots. Tree::snapshot returns a stable
point-in-time Snapshot handle in O(1): only the root frame is
copied; the rest is shared with the live tree and forked
copy-on-write only when a live write would overwrite a frame the
snapshot still references. Reads have 1× amplification and there is no
write overhead while no snapshot is live.

Tree::gc / DB::gc reclaim snapshot frames that a crash left
orphaned because it occurred while a snapshot was still live.
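
The copy-on-write snapshot behaviour can be illustrated with reference counting. This is a model of the sharing rules, not holt's frame layout: it uses Arc and Arc::make_mut where holt copies the root frame, and every type and method name is hypothetical. It shows three properties: taking a snapshot copies no data frames, a write with no live snapshot mutates in place, and a write while a snapshot is live forks only the frames it touches.

```rust
use std::sync::Arc;

/// Hypothetical data frame.
#[derive(Clone)]
struct Frame {
    values: Vec<u32>,
}

/// Hypothetical root: just a list of shared frames.
#[derive(Clone)]
struct Root {
    frames: Vec<Arc<Frame>>,
}

struct Tree {
    root: Arc<Root>,
}

struct Snapshot {
    root: Arc<Root>,
}

impl Tree {
    fn new(frames: Vec<Vec<u32>>) -> Self {
        let frames = frames
            .into_iter()
            .map(|values| Arc::new(Frame { values }))
            .collect();
        Tree { root: Arc::new(Root { frames }) }
    }

    /// O(1): bump one reference count, copy no frame data.
    fn snapshot(&self) -> Snapshot {
        Snapshot { root: Arc::clone(&self.root) }
    }

    /// Write one slot. `Arc::make_mut` clones only when the value is shared,
    /// so nothing is copied while no snapshot references it.
    fn write(&mut self, frame: usize, slot: usize, value: u32) {
        let root = Arc::make_mut(&mut self.root);
        let target = Arc::make_mut(&mut root.frames[frame]);
        target.values[slot] = value;
    }

    fn get(&self, frame: usize, slot: usize) -> u32 {
        self.root.frames[frame].values[slot]
    }
}

impl Snapshot {
    fn get(&self, frame: usize, slot: usize) -> u32 {
        self.root.frames[frame].values[slot]
    }
}
```

After a write under a live snapshot, the snapshot still reads the old value, the tree reads the new one, and untouched frames remain physically shared between the two.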

Changed

Breaking. TreeConfig.wal_sync and TreeBuilder::wal_sync() are removed;
use TreeConfig.durability / TreeBuilder::durability(Durability).

Breaking. KeyRangeBuilder::visit returns ScanStats instead of the
emitted count (use stats.returned + stats.rollup).

Breaking (on-disk). The file-store manifest is v2 (durable trailer); v1
manifests are not migrated.

Under Durability::StateMachine, atomic batches take the mutation gate shared
rather than exclusive: the external log serializes writes, so applies no
longer fence concurrent range scans. view / snapshot capture still fences,
so consistent point-in-time reads are unaffected.

DB::open gates the WAL on durability (attach_wal()), not just on storage,
so a file-backed StateMachine database no longer attaches a holt WAL.

Tree::view / DB::view are reimplemented on copy-on-write
snapshots: same API and point-in-time semantics, but capture is now
O(1) instead of eagerly copying every reachable blob frame, and holds
no second in-memory copy of the captured subtree.
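
Two of these changes can be sketched with stand-in types. Nothing below is holt's API surface: the enums are local models, the bool type of sync, the Storage variants, and the u64 field types are assumptions, and attach_wal is written as a free function for illustration. The first half models "the WAL is gated on both storage and durability"; the second shows how a caller would recover the old emitted count from ScanStats.

```rust
/// Local model of the durability policy; `sync: bool` is an assumed field type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Durability {
    Wal { sync: bool },
    StateMachine,
}

/// Local model of the storage axis; variant names are hypothetical.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Storage {
    Memory,
    File,
}

/// A WAL is attached only for file storage with WAL-owned durability.
fn attach_wal(storage: Storage, durability: Durability) -> bool {
    matches!((storage, durability), (Storage::File, Durability::Wal { .. }))
}

/// Local model of the per-scan counters named in the notes; types are assumed.
struct ScanStats {
    visited: u64,
    returned: u64,
    rollup: u64,
    restarts: u64,
}

/// The count that `visit` used to return, rebuilt from the stats.
fn emitted(stats: &ScanStats) -> u64 {
    stats.returned + stats.rollup
}
```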


02 Jun 01:30

github-actions

v0.4.2

64691b1

v0.4.2

Fixed

Fixed a DB checkpoint race where a concurrent pending delete could
remove an in-flight cache image after the checkpoint worker had
claimed it, causing write_through_batch: flushing entry lost cache image
and blocking crash-safe checkpoint completion.

Kept pending-delete cleanup from reclaiming cache and route-resident
state until the delete has been applied to the inner blob store.

