feat(retrieval): Reciprocal Rank Fusion for hybrid BM25+vector scoring #32

Merged
emson merged 2 commits into main from feature/rrf-fusion on Apr 26, 2026

emson commented Apr 26, 2026

Summary

  • Replaces naive BM25 deduplication (appending hits with similarity=0.0) with proper Reciprocal Rank Fusion (RRF, Cormack et al. 2009) as a new stage 2c in the hybrid retrieval pipeline
  • Blocks found by both vector and BM25 rankers now receive additive RRF contributions and score higher than blocks found by only one ranker — fixing the previous behaviour where BM25-only hits were actively penalised in composite scoring
  • Fused scores are normalized to [0.0, 1.0], with the top-ranked block at 1.0; RRF_K=60 (the standard damping constant)
  • Full backward compatibility: when BM25 is absent or all scores are zero, the pipeline returns raw cosine similarity unchanged — zero behavioural change for users without rank_bm25 installed
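
The fusion step described above can be sketched roughly as follows. This is a minimal illustration and not the code in retrieval.py: the function name `fuse_candidates_sketch`, the `(block_id, score)` list shape, and the choice to ignore zero-score BM25 hits are all assumptions. Only RRF_K=60, the normalization to [0.0, 1.0], and the vector-only fallback come from the summary.

```python
from typing import Dict, List, Optional, Tuple

RRF_K = 60  # damping constant named in the summary

# Assumed shape: (block_id, score) pairs, best first.
Ranked = List[Tuple[str, float]]


def fuse_candidates_sketch(
    vector_ranked: Ranked, bm25_ranked: Optional[Ranked]
) -> Ranked:
    """Hypothetical RRF fusion of a vector ranking and a BM25 ranking."""
    # Assumption: BM25 hits with a zero score carry no signal and are dropped.
    bm25_hits = [(b, s) for b, s in (bm25_ranked or []) if s > 0.0]

    # Fallback: no BM25 signal, so return the vector ranking untouched.
    if not bm25_hits:
        return vector_ranked

    # Each ranker contributes 1 / (k + rank), with rank starting at 1.
    # A block present in both lists receives both contributions.
    fused: Dict[str, float] = {}
    for ranked in (vector_ranked, bm25_hits):
        for rank, (block_id, _score) in enumerate(ranked, start=1):
            fused[block_id] = fused.get(block_id, 0.0) + 1.0 / (RRF_K + rank)

    # Normalize so the best fused score is exactly 1.0.
    top = max(fused.values())
    return sorted(
        ((block_id, score / top) for block_id, score in fused.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )


vector = [("a", 0.9), ("b", 0.8)]
bm25 = [("b", 3.2), ("c", 1.1)]
result = fuse_candidates_sketch(vector, bm25)
print([block_id for block_id, _ in result])
```

In this example block "b" appears in both lists, so it collects 1/62 + 1/61 and ranks first, while the BM25-only block "c" still receives a non-zero score of 1/62 before normalization instead of entering with similarity 0.0.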

Changes

  • src/elfmem/memory/retrieval.py — new _fuse_candidates() helper; pipeline docstring updated to 7 stages; old inline dedup logic removed
  • CHANGELOG.md — added entry under [Unreleased] → Added

Test plan

  • 559 tests pass, 0 regressions (uv run pytest tests/ -x -q)
  • Fallback path: no BM25 signal → returns original vector_ranked unmodified
  • RRF normalization: top-ranked block always gets score 1.0
  • Unit tests for _fuse_candidates in isolation (not added — covered implicitly by integration suite)

Notes

The one gap is dedicated unit tests for _fuse_candidates (both-ranker block scores higher, all-zero BM25 fallback, normalization). The logic is exercised through the full integration suite but not locked down as explicit invariants. These could be added as a follow-up.
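
As a rough idea of what that follow-up might look like, the three invariants could be written as plain pytest-style functions. The real `_fuse_candidates` signature is not shown in this PR description, so the sketch tests a hypothetical stand-in, `rrf_fuse`, with an assumed `(block_id, score)` list shape. The test names and fixture data are likewise illustrative.

```python
from typing import Dict, List, Optional, Tuple

# Assumed shape: (block_id, score) pairs, best first.
Ranked = List[Tuple[str, float]]


def rrf_fuse(
    vector_ranked: Ranked, bm25_ranked: Optional[Ranked], k: int = 60
) -> Ranked:
    """Hypothetical stand-in for _fuse_candidates; not the project's code."""
    hits = [(b, s) for b, s in (bm25_ranked or []) if s > 0.0]
    if not hits:
        return vector_ranked  # no BM25 signal: keep the vector ranking
    fused: Dict[str, float] = {}
    for ranked in (vector_ranked, hits):
        for rank, (block_id, _score) in enumerate(ranked, start=1):
            fused[block_id] = fused.get(block_id, 0.0) + 1.0 / (k + rank)
    top = max(fused.values())
    return sorted(
        ((b, s / top) for b, s in fused.items()), key=lambda p: -p[1]
    )


# Illustrative fixtures: "b" is found by both rankers, "d" by BM25 only.
VECTOR = [("a", 0.91), ("b", 0.85), ("c", 0.40)]
BM25 = [("b", 7.3), ("d", 2.1)]


def test_block_found_by_both_rankers_scores_highest():
    scores = dict(rrf_fuse(VECTOR, BM25))
    assert all(scores["b"] > scores[other] for other in ("a", "c", "d"))


def test_all_zero_bm25_falls_back_to_vector_ranking():
    assert rrf_fuse(VECTOR, [("b", 0.0), ("d", 0.0)]) == VECTOR
    assert rrf_fuse(VECTOR, None) == VECTOR


def test_scores_are_normalized():
    fused = rrf_fuse(VECTOR, BM25)
    assert fused[0][1] == 1.0
    assert all(0.0 <= score <= 1.0 for _, score in fused)
```

Pointing the same three tests at the real helper would only require swapping `rrf_fuse` for the import and adapting the fixtures to whatever candidate type the pipeline actually passes around.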

🤖 Generated with Claude Code

emson and others added 2 commits April 11, 2026 15:03
BM25-only blocks previously entered the composite scorer with similarity=0.0,
making them unable to compete with vector-found blocks. RRF (k=60, Cormack
et al. 2009) fuses both ranked lists so blocks found by both rankers score
higher, and BM25-only blocks receive proportional relevance scores.

Falls back to raw cosine similarity when BM25 is absent — zero behavioral
change for users without rank_bm25 installed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Design specification for the LearnedMemBench adapter that bridges elfmem
with the external learnedmembench benchmarking framework. Covers:

- Protocol → elfmem API mapping (all operations are direct passthrough)
- Capabilities declaration (supports all 9 LMB capabilities)
- Implementation structure (adapter.py + config.py)
- State introspection API requirements (3 new public methods on MemorySystem)
- Implementation order and patterns

The adapter is thin — elfmem already has direct API equivalents for every
protocol method. Main work is mapping types + exposing state introspection.
emson merged commit 11a5524 into main on Apr 26, 2026
6 checks passed
emson deleted the feature/rrf-fusion branch on April 26, 2026 15:21