feat(retrieval): Reciprocal Rank Fusion for hybrid BM25+vector scoring #31

Merged
emson merged 1 commit into main from feature/rrf-fusion
Apr 11, 2026

emson commented Apr 11, 2026

Summary

  • RRF fusion (stage 2c) replaces the additive union in hybrid_retrieve(). When both vector search and BM25 produce results, Reciprocal Rank Fusion (k=60) merges the two ranked lists into a single relevance score per block. Blocks found by both rankers score higher; BM25-only blocks receive proportional scores instead of similarity=0.0.
  • Zero regression when BM25 is absent: when rank_bm25 is not installed or BM25 returns zero scores, the pipeline falls back to raw cosine similarity — identical to current behavior.
  • No changes to scoring formula, weights, or ScoredBlock datatype: the frozen compute_score() and ScoringWeights are untouched. RRF produces the similarity input; everything downstream is unchanged.
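The fusion step described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the function name `rrf_fuse`, the list-of-block-ids inputs, and the normalization by the best possible score are all assumptions made for the example. Only `k=60` comes from the description.

```python
from collections import defaultdict

RRF_K = 60  # constant cited in the PR (Cormack et al. 2009)


def rrf_fuse(vector_ranked, bm25_ranked, k=RRF_K):
    """Fuse two ranked lists of block ids into {block_id: score}.

    Hypothetical helper: each list is ordered best-first, and a block
    missing from a list simply contributes nothing for that ranker.
    """
    raw = defaultdict(float)
    for ranked in (vector_ranked, bm25_ranked):
        for rank, block_id in enumerate(ranked, start=1):
            raw[block_id] += 1.0 / (k + rank)

    # Assumption: rescale by the best achievable score (rank 1 in both
    # lists) so the result lies in [0, 1]. The PR does not say how the
    # real implementation maps RRF scores into that range.
    max_score = 2.0 / (k + 1)
    return {block_id: score / max_score for block_id, score in raw.items()}


fused = rrf_fuse(["a", "b", "c"], ["b", "d"])
ranking = sorted(fused, key=fused.get, reverse=True)
```

In this sketch, `b` appears in both lists and ranks first, while the BM25-only block `d` gets a non-zero score instead of `0.0`.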

Before vs After

| Scenario | Before | After |
| --- | --- | --- |
| BM25-only block | similarity=0.0 (penalized) | RRF-derived score in [0, 1] |
| Block in both rankers | cosine similarity only | RRF boost from both rank positions |
| BM25 not installed | cosine similarity | cosine similarity (unchanged) |
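For reference, the standard RRF formula with k=60 makes the first two rows easy to check by hand. The symbols below (R for the set of rankers, rank_r(d) for the 1-based position of block d in ranker r) are notation chosen for this note, not names from the codebase.

```latex
\[
  \mathrm{RRF}(d) \;=\; \sum_{r \in R} \frac{1}{k + \mathrm{rank}_r(d)},
  \qquad k = 60
\]
% Block ranked 1st by both vector search and BM25:
\[
  \frac{1}{60 + 1} + \frac{1}{60 + 1} = \frac{2}{61} \approx 0.0328
\]
% Block ranked 1st by BM25 only (no vector term):
\[
  \frac{1}{60 + 1} = \frac{1}{61} \approx 0.0164
\]
```

Raw RRF sums never exceed 2/61 with two rankers, so the [0, 1] range in the table implies some rescaling step. This description does not say which one is used.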

Test plan

  • All 559 existing tests pass (uv run pytest tests/ -x -q)
  • uv run ruff check src/ tests/ — all checks passed
  • uv run mypy --ignore-missing-imports src/elfmem/ — no issues
  • CI passes on this PR

🤖 Generated with Claude Code

BM25-only blocks previously entered the composite scorer with similarity=0.0,
making them unable to compete with vector-found blocks. RRF (k=60, Cormack
et al. 2009) fuses both ranked lists so blocks found by both rankers score
higher, and BM25-only blocks receive proportional relevance scores.

Falls back to raw cosine similarity when BM25 is absent — zero behavioral
change for users without rank_bm25 installed.
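The fallback decision can be pictured with a small sketch like the one below. It is an illustration under assumptions, not the merged code: `bm25_available`, `choose_similarity_mode`, and the `{block_id: score}` input shape are hypothetical names introduced here. Only the two fallback conditions (rank_bm25 missing, or BM25 returning zero scores) come from the PR.

```python
import importlib.util


def bm25_available():
    """Return True if the optional rank_bm25 package can be imported."""
    return importlib.util.find_spec("rank_bm25") is not None


def choose_similarity_mode(bm25_scores, bm25_installed=None):
    """Pick which similarity input to use: "rrf" or "cosine".

    bm25_scores is assumed to be a {block_id: score} mapping.
    bm25_installed can be passed explicitly, which keeps tests
    independent of the local environment.
    """
    if bm25_installed is None:
        bm25_installed = bm25_available()
    # Fallback 1: the package is missing, or BM25 produced nothing at all.
    if not bm25_installed or not bm25_scores:
        return "cosine"
    # Fallback 2: BM25 ran but every score is zero, so it carries no signal.
    if all(score == 0.0 for score in bm25_scores.values()):
        return "cosine"
    return "rrf"
```

Keeping the check separate from the fusion itself is one way to guarantee that the cosine-only path stays untouched when BM25 contributes nothing.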

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
emson merged commit 2a4aa08 into main Apr 11, 2026
6 checks passed
emson deleted the feature/rrf-fusion branch April 11, 2026 13:05
emson restored the feature/rrf-fusion branch April 26, 2026 14:53