Releases: shuruheel/mnestic
v0.8.5 — flat parallel index builds (15×), snapshot read path
Sixth fork release — matches the crate published to crates.io and the wheel on PyPI (this tag is the exact publish commit).
Highlights
- Flat in-RAM parallel HNSW bulk build (`::hnsw create`): contiguous vector slab + integer-ID adjacency + per-node-lock parallel insertion (the hnswlib/pgvector/Lucene layout), serialised once into the unchanged on-disk tuple format. Measured: 294 s → 19 s (15×) on the 40k×384 synthetic repro; 89 s → 8.1 s (11×) on a real-embedding corpus, recall unchanged. `MNESTIC_INDEX_BUILD_THREADS` controls workers.
- FTS bulk build (`::fts create`): dead del-pass removed, parallel tokenisation, exact doc-stats seeding (~2×).
- Plain-snapshot read path (RocksDB): read-only scripts skip the pessimistic transaction and read through a plain snapshot (standard MVCC read pattern à la TiKV/CockroachDB). Keyed point read p50 28.5 → 23.9 µs (−16%), p99 −19%. Isolation semantics pinned by tests.
- Batched HNSW neighbour reads: search-path neighbour vectors fetch via one RocksDB `MultiGet` per expansion step instead of serial point gets (wins on cold-cache / larger-than-RAM corpora).
- `::describe` fix: the op existed upstream but was never wired into the grammar; it is now reachable, and read-only-guarded like its sibling sys ops.
- Bulk-build test coverage from a three-review post-ship audit (list-of-vectors columns, F64+Cosine recall guard, multi-column-PK FTS doc-stats equality).
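The flat layout named in the first highlight can be pictured with a small Python sketch. It is not mnestic's code: the function names, the precomputed `edges` list, and the thread-pool shape are all assumptions made for illustration. It only shows the three ingredients the note lists: one contiguous vector slab, adjacency lists keyed by integer node ID, and one lock per node so parallel workers only contend when they touch the same node.

```python
from array import array
from concurrent.futures import ThreadPoolExecutor
import threading


def build_flat_graph(vectors, edges, workers=4):
    """Pack vectors into one slab and fill integer-ID adjacency lists in parallel.

    `edges` is a hypothetical precomputed list of (a, b) node-ID pairs; a real
    HNSW build discovers neighbours by searching the partially built graph.
    """
    dim = len(vectors[0])
    slab = array("f")                      # all vectors stored back to back
    for vec in vectors:
        assert len(vec) == dim
        slab.extend(vec)

    adjacency = [[] for _ in vectors]      # adjacency[i] holds neighbour IDs of node i
    locks = [threading.Lock() for _ in vectors]

    def link(edge):
        a, b = edge
        # Lock one node at a time, so workers touching different nodes never
        # wait on each other and no lock ordering is needed.
        with locks[a]:
            adjacency[a].append(b)
        with locks[b]:
            adjacency[b].append(a)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(link, edges))        # drain the iterator so errors surface
    return dim, slab, adjacency


def vector_of(slab, dim, node_id):
    """Slice node `node_id`'s vector out of the slab."""
    start = node_id * dim
    return list(slab[start:start + dim])
```

Because node IDs are plain integers, a vector lookup is an offset computation into the slab rather than a keyed store read, which is the property the flat layout is chosen for.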
Note: the corrupt-database tooling (::repair_corrupt, panic-free ::index create) landed after this publish and ships in 0.8.6.
Full detail in CHANGELOG-FORK.md.
v0.8.4 — per-leg RRF detail + 0.8.3 concurrency fix
Fifth fork release.
Highlights
- Per-leg retrieval detail: `ReciprocalRankFusion(..., detailed: true)` (and `HybridSearch`) emit one row per (item, contributing list), shaped [item, fused_score, list_id, leg_rank, leg_score], exposing exactly which legs retrieved each result and at what rank. Powers downstream "why retrieved" explanations.
- Fixed a 0.8.3 concurrent-write regression: the durable FTS doc-stats counter introduced for O(1) `avgdl` was a single hot row that serialized all writes to FTS-indexed relations; it's now off the hot path.
- PyPI family: the embedded Python wheel publishes as `mnestic`.
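To make the per-leg row shape concrete, here is a minimal Python sketch of reciprocal rank fusion that emits one row per (item, contributing list). It is an illustration, not the mnestic implementation: the constant `k=60`, 1-based ranks, 0-based `list_id`, and the reading of `leg_score` as the item's raw score within its leg are all assumptions.

```python
def rrf_detailed(legs, k=60):
    """Fuse ranked lists and report each leg's contribution.

    legs: list of ranked lists, each [(item, leg_score), ...] ordered best first.
    Returns rows [item, fused_score, list_id, leg_rank, leg_score], one per
    (item, contributing list), ordered by fused score descending.
    """
    # Standard RRF: each appearance contributes 1 / (k + rank).
    fused = {}
    for ranked in legs:
        for rank, (item, _score) in enumerate(ranked, start=1):
            fused[item] = fused.get(item, 0.0) + 1.0 / (k + rank)

    rows = []
    for list_id, ranked in enumerate(legs):
        for rank, (item, score) in enumerate(ranked, start=1):
            rows.append([item, fused[item], list_id, rank, score])

    # Best fused score first; ties broken by item, then by list for stable output.
    rows.sort(key=lambda r: (-r[1], r[0], r[2]))
    return rows
```

An item retrieved by two legs appears as two rows with the same `fused_score`, which is what lets a caller explain which legs found it and at what rank.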
Full detail in CHANGELOG-FORK.md.
v0.8.3 — BM25 FTS + native 3-way fused recall
Fourth fork release, validated end-to-end on the mnestic-benchmarks hybrid suite (vs SQLite/DuckDB/LanceDB/Kuzu).
Highlights
- Okapi BM25 is the new default FTS scorer (behaviour change; `tf`/`tf_idf` stay selectable, byte-identical to upstream). Adds term-frequency saturation, document-length normalization, and OR-queries that sum per-term contributions instead of taking the max. Fused recall 0.75 → 0.954.
- O(1) `avgdl` via a durable per-index doc-stats counter: removes a per-query full index scan; decomposed-path p50 927 → 175 ms, cold p99 2,900 → 258 ms. Legacy indexes self-migrate on first write.
- Native 3-way fused recall: new typed `GraphLeg` on `HybridSearch` runs vector+FTS+graph in one call/one transaction with bounded-hop min-distance ranking. 41.55 ms p50 (~4× faster than the hand-decomposed path), injection-safe (seeds passed as params).
- Read-path latency baseline (`benches/read_path.rs`): parse/compile is a fixed ~20–85 µs, which is material for point reads and noise for retrieval.
- Python: `hybrid_search` accepts `graph_legs`.
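The first two highlights fit together, and a toy Python index shows how. This is a sketch under stated assumptions rather than mnestic's scorer: the parameters `k1=1.2` and `b=0.75`, the Lucene-style IDF variant, whitespace tokenisation, and the class itself are illustrative choices. It shows term-frequency saturation, length normalisation against `avgdl`, OR-queries that sum per-term scores, and an `avgdl` read from running counters instead of a scan.

```python
import math
from collections import Counter


class Bm25Index:
    """Toy BM25 index; assumes each doc_id is added exactly once."""

    def __init__(self, k1=1.2, b=0.75):
        self.k1, self.b = k1, b
        self.docs = {}             # doc_id -> Counter of term frequencies
        self.doc_len = {}          # doc_id -> token count
        self.df = Counter()        # term -> number of docs containing it
        self.total_len = 0         # running sum of document lengths

    def add(self, doc_id, text):
        tokens = text.lower().split()
        tf = Counter(tokens)
        self.docs[doc_id] = tf
        self.doc_len[doc_id] = len(tokens)
        self.total_len += len(tokens)      # doc-stats counter updated on write
        self.df.update(tf.keys())

    def avgdl(self):
        # O(1): two counters, no pass over the index.
        return self.total_len / len(self.docs)

    def score(self, doc_id, query_terms):
        n = len(self.docs)
        avgdl = self.avgdl()
        tf = self.docs[doc_id]
        dl = self.doc_len[doc_id]
        total = 0.0
        for term in query_terms:
            f = tf.get(term, 0)
            if f == 0:
                continue
            idf = math.log(1 + (n - self.df[term] + 0.5) / (self.df[term] + 0.5))
            # Saturating tf, normalised by document length relative to avgdl.
            norm = f + self.k1 * (1 - self.b + self.b * dl / avgdl)
            total += idf * f * (self.k1 + 1) / norm   # OR-query: sum per term
        return total
```

The trade-off the note describes is visible here: keeping `total_len` current makes every read cheap but adds a counter update to every write, which is why a single shared counter row can become a write hot spot.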
Full detail in CHANGELOG-FORK.md.
v0.8.2 — non-blocking HNSW index builds
Third fork release. Makes HNSW index builds non-blocking for readers.
Non-blocking HNSW index builds
Building/rebuilding an HNSW index (::hnsw create) used to hold the base relation's exclusive write lock for the entire build, so every concurrent read blocked until it finished — in production, 10–20+ minutes (76 min for a 151K × 1536 index). The stall was cozo's per-relation lock, not RocksDB.
The build is now done off-lock on RocksDB: the heavy graph construction runs under a read-only snapshot with no relation lock held, and the lock is taken only briefly to set up and to publish. The finished graph is bulk-published via SstFileWriter/IngestExternalFile, ingested before its metadata is committed (so a reader can never observe an index before its keys exist); rows mutated during the build are folded in by a short reconcile pass under a brief final lock.
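The locking pattern described above can be sketched in Python. Everything here is a hypothetical stand-in: the `Relation` class, its change log, and the `during_build` hook are not mnestic APIs, a dictionary copy plays the role of the RocksDB snapshot, and deletions are left out. The sketch only shows the shape: a brief lock to snapshot, heavy work with no lock, then a brief lock to reconcile and publish.

```python
import threading


class Relation:
    """Toy stand-in for a stored relation guarded by one lock (hypothetical)."""

    def __init__(self, rows):
        self.lock = threading.Lock()
        self.rows = dict(rows)
        self.version = 0
        self.changes = []          # (version, key) for every write
        self.index = None          # published index, visible to readers

    def put(self, key, value):
        with self.lock:
            self.rows[key] = value
            self.version += 1
            self.changes.append((self.version, key))


def build_index_off_lock(rel, index_entry, during_build=None):
    # 1. Brief lock: capture a snapshot and remember where it was taken.
    with rel.lock:
        snapshot = dict(rel.rows)
        snapshot_version = rel.version

    # 2. Heavy work with no lock held; readers and writers proceed freely.
    built = {key: index_entry(value) for key, value in snapshot.items()}
    if during_build is not None:
        during_build()             # test hook simulating a concurrent write

    # 3. Brief lock: fold in rows written since the snapshot, then publish.
    with rel.lock:
        for version, key in rel.changes:
            if version > snapshot_version:
                built[key] = index_entry(rel.rows[key])
        rel.index = built
```

The index becomes visible only in the last assignment, after it is complete, which mirrors the ordering in the release note: data first, then the metadata that makes it reachable.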
Measured (release): a 40,000-vector build takes ~5.6 s, during which 90,507 concurrent reads of the same relation completed, the slowest in 0.8 ms — previously those reads would have queued behind the whole ~5.6 s build.
Default-on and transparent (same ::hnsw create). RocksDB only; other backends keep the in-transaction build unchanged via the new Storage::ingest_sorted fallback. No mnestic-rocks change (stays 0.1.8).
All 169 inherited lib tests + integration/feature suites pass; cargo clippy -p mnestic -- -D warnings clean. See CHANGELOG-FORK.md for full detail.
v0.8.1: mnestic 0.8.1
mnestic 0.8.1
One-call hybrid retrieval, a ~3x faster HNSW index build, the maintained
mnestic-rocks bridge fork, and a blocking clippy CI gate.
New
- HybridSearch: DbInstance::hybrid_search / Db::hybrid_search (+ *_script) run
HNSW + FTS (+ optional graph traversal), fuse with RRF, optionally diversify
with MMR — in one typed call. Read-only; values passed as params, identifiers
validated against injection.
Performance
- HNSW index build ~3.1x faster (20k x 128: 135s -> 43.6s, release): the build
now constructs the graph in the in-RAM temp store + shares one VectorCache
across the build, instead of round-tripping through the transaction's
WriteBatchWithIndex overlay. Built graph is byte-identical.
Bridge
- Forked cozorocks -> mnestic-rocks (v0.1.8); importable name stays
cozorocks.
Maintenance
- Blocking clippy CI gate (-D warnings); document-features future-incompat cleared.
Deferred (designed): lock-free out-of-transaction build + IngestExternalFile
atomic publish; native in-RAM graph; LangChain/LlamaIndex adapter.
Full changelog: CHANGELOG-FORK.md.
mnestic 0.8.0
First release of mnestic, an independently maintained fork of CozoDB tuned as a substrate for agentic memory. Built on upstream 481af05 — 30 commits ahead of cozo 0.7.6. The importable crate name stays cozo, so existing CozoDB code works unchanged.
[dependencies]
mnestic = "0.8.0"
Fixes
- Equality pushdown: `*rel[k, ..], k == <value>` now compiles to a keyed `stored_prefix_join` instead of a full scan (~28–29× faster single-row primary-key lookups, measured at 5k rows). Numeric equalities keep cross-type `op_eq` semantics.
- Parser fix (#281): identifiers that start with a keyword literal (`nullable_column`, `trueValue`, `falsey`) now parse correctly.
- Unreleased upstream fixes for free: the fork point is 30 commits ahead of the published 0.7.6, including the `stored_prefix_join` correctness fix.
- `env_logger` moved to a dev-dependency for a slimmer dependency graph (#287).
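Why equality pushdown helps can be shown with a small Python comparison. This is a conceptual sketch, not the query planner: a sorted list of tuples stands in for a key-ordered store, and both function names are made up for the example.

```python
from bisect import bisect_left


def full_scan(rows, k):
    """Check every row; cost grows with the size of the relation."""
    return [row for row in rows if row[0] == k]


def keyed_prefix_lookup(rows, k):
    """Seek to the first row whose key is k, then read only the matching run.

    Assumes `rows` is sorted by its first column, as in a key-ordered store.
    """
    # (k,) sorts before every (k, ...) row, so this lands on the first match.
    i = bisect_left(rows, (k,))
    out = []
    while i < len(rows) and rows[i][0] == k:
        out.append(rows[i])
        i += 1
    return out
```

Both functions return the same rows; the keyed version does a logarithmic seek plus a read of the matches, while the scan touches all rows regardless of how many match.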
New — hybrid retrieval for agentic memory
Datalog-composable fixed rules:
- `ReciprocalRankFusion` (alias `RRF`): fuse vector (HNSW) + full-text (FTS) + graph-traversal result lists into one ranking.
- `MaximalMarginalRelevance` (alias `MMR`): diversity-aware reranking that avoids near-duplicate recalls.
- `rand_ulid()` / `ulid_timestamp()`: lexicographically-sortable identifiers for time-ordered scans (#296).
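As a rough picture of what diversity-aware reranking means, here is a minimal Python sketch of the classic maximal marginal relevance selection loop. It is not the fixed rule's implementation or signature: the candidate tuple shape, the `lam` trade-off parameter, and the use of cosine similarity are assumptions made for the example.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mmr(candidates, top_k, lam=0.5):
    """Greedy MMR. candidates: list of (item, relevance, vector).

    Each round picks the candidate maximising
    lam * relevance - (1 - lam) * (max similarity to anything already picked).
    Returns the chosen items in pick order.
    """
    remaining = list(candidates)
    picked = []
    while remaining and len(picked) < top_k:
        def gain(candidate):
            _, relevance, vec = candidate
            redundancy = max((cosine(vec, p[2]) for p in picked), default=0.0)
            return lam * relevance - (1 - lam) * redundancy

        best = max(remaining, key=gain)
        picked.append(best)
        remaining.remove(best)
    return [item for item, _, _ in picked]
```

With `lam=1.0` the loop reduces to plain relevance ordering; lowering it penalises candidates that resemble earlier picks, so a near-duplicate of the top hit is passed over in favour of a less relevant but different result.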
Full detail in CHANGELOG-FORK.md. mnestic is not the official CozoDB and is not affiliated with or endorsed by its original authors; all credit for the original design belongs to Ziyang Hu and the Cozo Project Authors.