v3.10.39: ADR-147 entity arm + signal provenance
First implementation landed from the dream-cycle research cluster (#2316-#2324). Adds entity matching as a third RRF arm in hybridSearch alongside dense (HNSW/RaBitQ) and sparse (FTS5/BM25), plus per-result signal provenance.
What's new
`@claude-flow/memory` 3.0.0-alpha.20: entity arm + signal provenance in the `hybridSearch` controller:
- `entity-tagger.ts`: regex extractor for emails, URLs, file paths (POSIX + Windows), quoted phrases, proper-noun 2-grams. Deliberately conservative: false negatives are OK, false positives would dilute RRF.
- `hybridSearch` now runs three arms in parallel: dense + sparse + entity (per-token keyword scan, gated on `extractEntities(query).length > 0`). An empty entity set drops the arm rather than passing `[]` to dilute fusion.
- `signals: ('vector' | 'bm25' | 'entity')[]` on every fused result. Computed by pre-fusion set membership; lets callers debug which arms surfaced an entry without re-running the search.
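As a rough illustration of the conservative extraction and the gating described above, here is a minimal sketch. It is not the shipped `entity-tagger.ts`: the patterns, `extractEntitiesSketch`, and `shouldRunEntityArm` are hypothetical, and file-path matching is left out for brevity.

```typescript
// Hypothetical patterns; the real entity-tagger.ts regexes are not shown in these notes.
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const URL_RE = /https?:\/\/[^\s"'<>]+/g;
// Two adjacent capitalized words, e.g. "Alice Smith". A single capitalized
// word is skipped on purpose to avoid matching ordinary sentence starts.
const PROPER_2GRAM = /\b[A-Z][a-z]+ [A-Z][a-z]+\b/g;
// Matches whole quoted spans so quotes are paired left to right.
const QUOTED = /"[^"]*"/g;

// All non-overlapping matches of a global regex, or an empty list.
function allMatches(text: string, re: RegExp): string[] {
  return text.match(re) ?? [];
}

function extractEntitiesSketch(text: string): string[] {
  const found = new Set<string>();
  for (const re of [EMAIL, URL_RE, PROPER_2GRAM]) {
    for (const m of allMatches(text, re)) found.add(m);
  }
  for (const q of allMatches(text, QUOTED)) {
    const phrase = q.slice(1, -1).trim();
    // Drop very short quoted fragments such as "a" or "b".
    if (phrase.length >= 3) found.add(phrase);
  }
  return Array.from(found);
}

// Gate: only run the entity arm when the query yields at least one entity.
function shouldRunEntityArm(query: string): boolean {
  return extractEntitiesSketch(query).length > 0;
}
```

With these assumed patterns, a query like `Alice Smith authentication` enables the arm, while generic prose, `and/or`, and `"a" over "b"` leave it off.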
Capability smoke (end-to-end against built dist)
Corpus: 30 generic "authentication" entries + 1 "Alice Smith" needle. Query: "Alice Smith authentication":
score=0.0477 signals=["vector","bm25","entity"] key=alice-needle ← #1
score=0.0323 signals=["vector","bm25"] key=generic-1
score=0.0323 signals=["vector","bm25"] key=generic-0
score=0.0313 signals=["vector","bm25"] key=generic-3
score=0.0301 signals=["vector","bm25"] key=generic-2
Alice ranks #1 with full triplet provenance; the runners-up only fire on the vector and bm25 (sparse) arms. The entity signal gives a ~47% RRF score boost over the top runner-up (0.0477 vs 0.0323).
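To see why a third arm lifts the score, here is a minimal sketch of reciprocal rank fusion with per-result provenance. It assumes plain RRF with k=60 and unique keys within each arm; `fuseWithProvenance` and the example ranks are hypothetical and do not reproduce the controller's code or the exact scores above.

```typescript
type Signal = 'vector' | 'bm25' | 'entity';

interface Fused {
  key: string;
  score: number;
  signals: Signal[];
}

// Each arm is a ranked list of keys, best first (hypothetical input shape).
function fuseWithProvenance(
  arms: Partial<Record<Signal, string[]>>,
  k = 60,
): Fused[] {
  const order: Signal[] = ['vector', 'bm25', 'entity'];
  const byKey = new Map<string, Fused>();
  for (const signal of order) {
    const ranked = arms[signal];
    // A missing or empty arm is skipped entirely instead of being fused.
    if (!ranked || ranked.length === 0) continue;
    ranked.forEach((key, i) => {
      const entry: Fused = byKey.get(key) ?? { key, score: 0, signals: [] };
      entry.score += 1 / (k + i + 1); // RRF term: 1 / (k + rank), rank is 1-based
      entry.signals.push(signal);     // provenance from arm membership
      byKey.set(key, entry);
    });
  }
  return Array.from(byKey.values()).sort((a, b) => b.score - a.score);
}

const fused = fuseWithProvenance({
  vector: ['generic-1', 'alice-needle', 'generic-0'],
  bm25: ['generic-0', 'alice-needle', 'generic-1'],
  entity: ['alice-needle'],
});
// alice-needle ranks first and carries all three signals.
console.log(fused[0].key, fused[0].signals);
```

An entry that appears in all three arms collects three reciprocal-rank terms instead of two, which is where the needle's margin comes from.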
Packages
| Package | Old | New | Tags |
|---|---|---|---|
| `@claude-flow/memory` | 3.0.0-alpha.19 | 3.0.0-alpha.20 | latest, alpha, v3alpha |
| `@claude-flow/cli` | 3.10.38 | 3.10.39 | latest, alpha, v3alpha |
| `claude-flow` | 3.10.38 | 3.10.39 | latest, alpha, v3alpha |
| `ruflo` | 3.10.38 | 3.10.39 | latest, alpha, v3alpha |
`@claude-flow/cli`'s `@claude-flow/memory` dep is pinned to `^3.0.0-alpha.20` so wrapper users get the entity arm automatically. `v3/pnpm-lock.yaml` regen is included (lesson from #2311: bumping a workspace dep without lockfile regen breaks `pnpm install --frozen-lockfile`).
What this implements vs the dream-cycle ADR
ADR-147 (#2317) split the work as P1 "wire FTS5 + RRF fusion" and P2 "entity arm + provenance". The investigation found that P1 had already shipped in `controller-registry.ts:713` before the ADR was filed: `applyRRF(k=60)` + `applyMMR(λ=0.7)` over dense + sparse was already in. This release lands the actual gap, P2.
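For readers unfamiliar with the MMR step in the already-shipped P1 path, here is a generic maximal marginal relevance loop at λ=0.7. It is a textbook sketch, not the code in `controller-registry.ts`: `Candidate`, `cosine`, and `mmrSketch` are hypothetical names, and relevance is assumed to be normalized to [0, 1].

```typescript
interface Candidate {
  key: string;
  relevance: number;   // assumed normalized to [0, 1]
  embedding: number[];
}

// Cosine similarity; returns 0 when either vector has zero length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// Greedy MMR: repeatedly pick the candidate that maximizes
// lambda * relevance - (1 - lambda) * (max similarity to anything already picked).
function mmrSketch(cands: Candidate[], limit: number, lambda = 0.7): Candidate[] {
  const pool = cands.slice();
  const picked: Candidate[] = [];
  while (picked.length < limit && pool.length > 0) {
    let bestIdx = 0;
    let bestScore = -Infinity;
    pool.forEach((c, i) => {
      // Redundancy is floored at 0, so dissimilar picks never add a bonus.
      const redundancy = picked.reduce(
        (max, p) => Math.max(max, cosine(c.embedding, p.embedding)),
        0,
      );
      const score = lambda * c.relevance - (1 - lambda) * redundancy;
      if (score > bestScore) {
        bestScore = score;
        bestIdx = i;
      }
    });
    picked.push(pool.splice(bestIdx, 1)[0]);
  }
  return picked;
}
```

At λ=0.7 relevance dominates, but a near-duplicate of an already selected entry can still lose its slot to a slightly less relevant, more distinct one.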
Tracking note for the dream-cycle process posted on #2324.
Tests
- 12 new `entity-tagger.test.ts` (regex pinning: generic prose returns empty, `and/or` → empty, `"a" over "b"` → empty, single capitalized words → empty)
- 2 new `graceful-retrieval.test.ts` ADR-147 assertions (signal provenance on every fused result; needle-in-haystack)
- Full memory suite: 416/420 (4 pre-existing Windows-env failures in `agent-memory-scope`, `auto-memory-bridge`, `benchmark`; untouched files)
Out of scope (follow-ups)
- Dedicated SQL entity index: current per-entity `searchKeyword` calls are fine for typical query entity counts (1-3), but unbounded if a query mentions 20+. A future ADR can add an `entity_index` table for hard-bound latency.
- Async writes by default (ADR-147 P3): orthogonal; consolidator already handles HNSW background rebuild.
- LoCoMo benchmark publication (ADR-147 P4): needs harness wiring + dataset access; separate workstream.