Skip to content

v0.8.3 — BM25 FTS + native 3-way fused recall

Choose a tag to compare

@shuruheel shuruheel released this 12 Jun 22:10
· 19 commits to main since this release
cf789cd

Fourth fork release, validated end-to-end on the mnestic-benchmarks hybrid suite (vs SQLite/DuckDB/LanceDB/Kuzu).

Highlights

  • Okapi BM25 is the new default FTS scorer (behaviour change — tf/tf_idf stay selectable, byte-identical to upstream). Adds term-frequency saturation, document-length normalization, and OR-queries that sum per-term contributions instead of taking the max. Fused recall 0.75 → 0.954.
  • O(1) avgdl via a durable per-index doc-stats counter — removes a per-query full index scan; decomposed-path p50 927 → 175 ms, cold p99 2,900 → 258 ms. Legacy indexes self-migrate on first write.
  • Native 3-way fused recall: new typed GraphLeg on HybridSearch runs vector+FTS+graph in one call/one transaction with bounded-hop min-distance ranking — 41.55 ms p50 (~4× faster than the hand-decomposed path), injection-safe (seeds passed as params).
  • Read-path latency baseline (benches/read_path.rs): parse/compile is a fixed ~20–85 µs — material for point reads, noise for retrieval.
  • Python: hybrid_search accepts graph_legs.

Full detail in CHANGELOG-FORK.md.