# 🔀 Hybrid Search & Reranking

> **Educational Notebook 03**: A deep dive into hybrid retrieval with Reciprocal Rank Fusion (RRF) and Cross-Encoder reranking.

---

## 📋 Why Hybrid Search?

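| Method | Strengths | Weaknesses |
|--------|-----------|------------|
| **Vector** | Semantic similarity; handles synonyms and paraphrases | Can miss exact terms such as names, codes, and numbers |
| **Keyword** | Exact matches on names, numbers, and rare terms | Misses paraphrases and synonyms |
| **Hybrid** | Covers both failure modes | More complex: two retrievers plus a fusion step |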

In [1]:
import sys
sys.path.insert(0, '..')

from src.domain.entities import Chunk, TenantId, DocumentId
from src.application.services.fusion import rrf_fusion, weighted_fusion
from src.application.services.scoring import ScoredChunk

## 📊 Simulating Vector vs Keyword Results

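Let's create sample results from both search methods. Note that the two lists score on different scales: the vector scores are similarities below 1, while the keyword (BM25/FTS) scores are unbounded.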

In [2]:
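# Helper to create chunks (parameter named chunk_id to avoid shadowing the builtin id)
def make_chunk(chunk_id: str, text: str) -> Chunk:
    """Build a demo Chunk with a fixed tenant and document."""
    return Chunk(
        id=chunk_id,
        tenant_id=TenantId("demo"),
        document_id=DocumentId("doc"),
        text=text
    )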

# Simulated vector search results (semantic similarity)
vector_results = [
    ScoredChunk(make_chunk("v1", "Machine learning enables computers to learn from data."), 0.92),
    ScoredChunk(make_chunk("v2", "AI systems can improve through experience."), 0.88),
    ScoredChunk(make_chunk("v3", "Deep learning uses neural networks."), 0.85),
    ScoredChunk(make_chunk("v4", "Data science involves statistical analysis."), 0.80),
]

# Simulated keyword search results (BM25/FTS)
keyword_results = [
    ScoredChunk(make_chunk("k1", "Machine learning algorithms are widely used."), 15.2),
    ScoredChunk(make_chunk("v1", "Machine learning enables computers to learn from data."), 12.8),  # Same chunk as v1 in vector_results
    ScoredChunk(make_chunk("k2", "The term 'machine learning' was coined in 1959."), 10.5),
    ScoredChunk(make_chunk("k3", "Learning rate is an important hyperparameter."), 8.3),
]

print("Vector Results:")
for r in vector_results:
    print(f"  {r.chunk.id}: {r.score:.2f} - {r.chunk.text[:50]}...")

print("\nKeyword Results:")
for r in keyword_results:
    print(f"  {r.chunk.id}: {r.score:.2f} - {r.chunk.text[:50]}...")

Vector Results:
  v1: 0.92 - Machine learning enables computers to learn from d...
  v2: 0.88 - AI systems can improve through experience....
  v3: 0.85 - Deep learning uses neural networks....
  v4: 0.80 - Data science involves statistical analysis....

Keyword Results:
  k1: 15.20 - Machine learning algorithms are widely used....
  v1: 12.80 - Machine learning enables computers to learn from d...
  k2: 10.50 - The term 'machine learning' was coined in 1959....
  k3: 8.30 - Learning rate is an important hyperparameter....
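
Because the two lists above use different score scales, raw scores cannot simply be added together. As a point of comparison with the rank-based approach in the next section, here is a minimal sketch of one score-based alternative: min-max normalization followed by a weighted sum. The function names and the `alpha` parameter are illustrative assumptions, and this is not necessarily what the repo's `weighted_fusion` does.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; if all scores are equal, map them all to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def weighted_sum_fusion(
    vector_scores: Dict[str, float],
    keyword_scores: Dict[str, float],
    alpha: float = 0.5,  # hypothetical weight given to the vector list
) -> List[Tuple[str, float]]:
    """Fuse as alpha * normalized_vector + (1 - alpha) * normalized_keyword."""
    v = min_max_normalize(vector_scores)
    kw = min_max_normalize(keyword_scores)
    fused = {
        doc_id: alpha * v.get(doc_id, 0.0) + (1 - alpha) * kw.get(doc_id, 0.0)
        for doc_id in set(v) | set(kw)
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Scores copied from the sample results above
vector_scores = {"v1": 0.92, "v2": 0.88, "v3": 0.85, "v4": 0.80}
keyword_scores = {"k1": 15.2, "v1": 12.8, "k2": 10.5, "k3": 8.3}

fused_scores = weighted_sum_fusion(vector_scores, keyword_scores)
for doc_id, score in fused_scores:
    print(f"{doc_id}: {score:.3f}")
```

A score-based scheme like this depends on the chosen normalization and on `alpha`, which is the calibration work that rank-based fusion avoids.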


## 🔀 RRF Fusion

**Reciprocal Rank Fusion** (RRF) merges ranked lists using only each document's rank position, so scores on different scales never need to be calibrated against each other:

$$\text{RRF\_score}(d) = \sum_{r \in R} \frac{1}{k + \text{rank}_r(d)}$$

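Where:
- $R$ is the set of result lists being fused (here, vector and keyword)
- $k$ is a smoothing constant (default 60)
- $\text{rank}_r(d)$ is the 1-based rank of document $d$ in result list $r$; a list that does not contain $d$ contributes nothing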
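
The formula above can be sketched in a few lines of plain Python. This is a dependency-free illustration over lists of chunk IDs; the function name `rrf` is an assumption for this sketch and it is not the repo's `rrf_fusion` implementation.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def rrf(ranked_lists: Sequence[Sequence[str]], k: int = 60) -> List[Tuple[str, float]]:
    """Sum 1 / (k + rank) over every list in which a document appears."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in ranked_lists:
        # Ranks are 1-based, matching the formula
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # sorted() is stable, so tied documents keep their first-seen order
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Chunk IDs in rank order, mirroring the sample results in this notebook
vector_ids = ["v1", "v2", "v3", "v4"]
keyword_ids = ["k1", "v1", "k2", "k3"]

rrf_ranking = rrf([vector_ids, keyword_ids])
for doc_id, score in rrf_ranking:
    print(f"{doc_id}: {score:.4f}")
```

Only the order of each list is used; the original similarity and BM25 scores never enter the computation.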

In [3]:
# Apply RRF fusion
fused = rrf_fusion(
    vector_hits=vector_results,
    keyword_hits=keyword_results,
    k=60,
    out_limit=10
)

print("RRF Fused Results:")
print("=" * 60)
for i, r in enumerate(fused, 1):
    print(f"{i}. {r.chunk.id}: RRF={r.score:.4f}")
    print(f"   {r.chunk.text[:60]}...")

RRF Fused Results:
============================================================
1. v1: RRF=0.0325
   Machine learning enables computers to learn from data....
2. k1: RRF=0.0164
   Machine learning algorithms are widely used....
3. v2: RRF=0.0161
   AI systems can improve through experience....
4. v3: RRF=0.0159
   Deep learning uses neural networks....
5. k2: RRF=0.0159
   The term 'machine learning' was coined in 1959....
6. v4: RRF=0.0156
   Data science involves statistical analysis....
7. k3: RRF=0.0156
   Learning rate is an important hyperparameter....


## 🎯 Understanding RRF Scores

Notice that `v1` appears in BOTH result lists, so it collects a contribution from each and ends up with roughly double the score of any single-list result:

In [4]:
# Manual RRF calculation for v1
k = 60

# v1 is rank 1 in vector results
v1_vector_contribution = 1 / (k + 1)

# v1 is rank 2 in keyword results
v1_keyword_contribution = 1 / (k + 2)

v1_total = v1_vector_contribution + v1_keyword_contribution

print("v1 RRF Score Breakdown:")
print(f"  Vector (rank 1): 1/({k}+1) = {v1_vector_contribution:.4f}")
print(f"  Keyword (rank 2): 1/({k}+2) = {v1_keyword_contribution:.4f}")
print(f"  Total: {v1_total:.4f}")

# Compare to k1 (only in keyword, rank 1)
k1_total = 1 / (k + 1)
print(f"\nk1 RRF Score: {k1_total:.4f} (only in keyword list)")
print("\nv1 is higher because it appears in BOTH lists!")

v1 RRF Score Breakdown:
  Vector (rank 1): 1/(60+1) = 0.0164
  Keyword (rank 2): 1/(60+2) = 0.0161
  Total: 0.0325

k1 RRF Score: 0.0164 (only in keyword list)

v1 is higher because it appears in BOTH lists!
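
The constant `k` controls how strongly RRF favours top ranks. The short sketch below compares the contribution of a rank-1 hit with that of a rank-10 hit for a few values; the helper name `rrf_contribution` and the chosen values are illustrative assumptions.

```python
def rrf_contribution(rank: int, k_value: int) -> float:
    """Contribution of a single list in which the document sits at `rank` (1-based)."""
    return 1.0 / (k_value + rank)


for k_value in (1, 10, 60, 1000):
    top = rrf_contribution(1, k_value)
    tenth = rrf_contribution(10, k_value)
    print(f"k={k_value:>4}: rank 1 = {top:.4f}, rank 10 = {tenth:.4f}, ratio = {top / tenth:.2f}")
```

A small `k` lets the top rank dominate, while a large `k` flattens the differences between ranks so that appearing in several lists matters more than the exact position. With `k=60`, a rank-1 hit contributes only about 1.15 times as much as a rank-10 hit.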


## 🎯 Cross-Encoder Reranking

After fusion, we apply a Cross-Encoder to rerank the fused candidates by their estimated relevance to the query.

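**How it works:**
1. Pair the query with each candidate passage: (query, passage)
2. The Cross-Encoder reads each pair jointly and outputs a relevance score
3. Sort the candidates by Cross-Encoder score and keep the top N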
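
The three steps above can be sketched with a pluggable scoring function. Everything here is illustrative: `rerank`, `score_fn`, and `toy_overlap_score` are hypothetical names, and the toy scorer is a crude lexical stand-in, not a real Cross-Encoder. In practice `score_fn` would wrap a trained model, for example the `CrossEncoder.predict` method in sentence-transformers, which scores a list of (query, passage) pairs.

```python
from typing import Callable, List, Sequence, Tuple

STOPWORDS = {"what", "is", "a", "an", "the"}  # tiny illustrative stopword list


def tokenize(text: str) -> set:
    """Lowercase, strip surrounding punctuation, and drop stopwords."""
    tokens = (word.strip(".,?!'\"").lower() for word in text.split())
    return {t for t in tokens if t and t not in STOPWORDS}


def toy_overlap_score(query: str, passage: str) -> float:
    """Stand-in scorer: fraction of query tokens found in the passage."""
    q_tokens = tokenize(query)
    if not q_tokens:
        return 0.0
    return len(q_tokens & tokenize(passage)) / len(q_tokens)


def rerank(
    query: str,
    passages: Sequence[str],
    score_fn: Callable[[str, str], float],
    top_n: int = 5,
) -> List[Tuple[str, float]]:
    pairs = [(query, passage) for passage in passages]      # 1. build pairs
    scores = [score_fn(q, p) for q, p in pairs]             # 2. score each pair
    ranked = sorted(zip(passages, scores), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]                                   # 3. sort, keep top N


passages = [
    "Data science involves statistical analysis.",
    "Learning rate is an important hyperparameter.",
    "Machine learning enables computers to learn from data.",
]
top_passages = rerank("What is machine learning?", passages, toy_overlap_score, top_n=2)
for passage, score in top_passages:
    print(f"{score:.2f}  {passage}")
```

Keeping the scorer behind a plain callable makes it easy to swap the toy function for a real model without touching the sorting logic.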

In [5]:
# Simulate Cross-Encoder scoring
# In production, use CrossEncoderReranker from src/adapters/rerank/

query = "What is machine learning?"

# Simulated Cross-Encoder scores (would come from the model)
cross_encoder_scores = {
    "v1": 0.95,  # Best match for the query
    "k1": 0.88,
    "v2": 0.72,
    "k2": 0.85,  # Historical context, decent match
    "v3": 0.65,
    "k3": 0.40,  # Poor match - about learning rate
    "v4": 0.50,
}

# Rerank by Cross-Encoder score
reranked = sorted(
    fused,
    key=lambda x: cross_encoder_scores.get(x.chunk.id, 0),
    reverse=True
)[:5]  # Top 5

print(f"Query: '{query}'")
print("\nAfter Cross-Encoder Reranking:")
print("=" * 60)
for i, r in enumerate(reranked, 1):
    ce_score = cross_encoder_scores.get(r.chunk.id, 0)
    print(f"{i}. {r.chunk.id}: CE={ce_score:.2f}")
    print(f"   {r.chunk.text[:60]}...")

Query: 'What is machine learning?'

After Cross-Encoder Reranking:
============================================================
1. v1: CE=0.95
   Machine learning enables computers to learn from data....
2. k1: CE=0.88
   Machine learning algorithms are widely used....
3. k2: CE=0.85
   The term 'machine learning' was coined in 1959....
4. v2: CE=0.72
   AI systems can improve through experience....
5. v3: CE=0.65
   Deep learning uses neural networks....


## 📈 Comparison: Before vs After Reranking

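Notice how reranking changes the order of the top 5: `k2` rises from rank 5 to rank 3, while `v2` and `v3` each drop one place because the Cross-Encoder scores them as less relevant to the query.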

In [6]:
print("BEFORE Reranking (RRF order):")
for i, r in enumerate(fused[:5], 1):
    ce_score = cross_encoder_scores.get(r.chunk.id, 0)
    print(f"  {i}. {r.chunk.id} (CE={ce_score:.2f})")

print("\nAFTER Reranking (CE order):")
for i, r in enumerate(reranked, 1):
    ce_score = cross_encoder_scores.get(r.chunk.id, 0)
    print(f"  {i}. {r.chunk.id} (CE={ce_score:.2f})")

BEFORE Reranking (RRF order):
  1. v1 (CE=0.95)
  2. k1 (CE=0.88)
  3. v2 (CE=0.72)
  4. v3 (CE=0.65)
  5. k2 (CE=0.85)

AFTER Reranking (CE order):
  1. v1 (CE=0.95)
  2. k1 (CE=0.88)
  3. k2 (CE=0.85)
  4. v2 (CE=0.72)
  5. v3 (CE=0.65)
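
To make the movement between the two orderings explicit, the small helper below reports how many places each chunk moved. The function name `rank_changes` is an assumption for this sketch; the ID lists are copied from the output above.

```python
from typing import Dict, Optional, Sequence


def rank_changes(before: Sequence[str], after: Sequence[str]) -> Dict[str, Optional[int]]:
    """Positive values mean the item moved up; None means it was not in `before`."""
    before_pos = {doc_id: rank for rank, doc_id in enumerate(before, start=1)}
    changes: Dict[str, Optional[int]] = {}
    for new_rank, doc_id in enumerate(after, start=1):
        old_rank = before_pos.get(doc_id)
        changes[doc_id] = None if old_rank is None else old_rank - new_rank
    return changes


before_ids = ["v1", "k1", "v2", "v3", "k2"]  # RRF order
after_ids = ["v1", "k1", "k2", "v2", "v3"]   # Cross-Encoder order

movement = rank_changes(before_ids, after_ids)
for doc_id, delta in movement.items():
    label = "new" if delta is None else f"{delta:+d}"
    print(f"{doc_id}: {label}")
```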


## 📚 Key Takeaways

1. **Hybrid Search** combines semantic (vector) and lexical (keyword) retrieval
2. **RRF Fusion** merges results without requiring score calibration
3. **Cross-Encoder Reranking** improves precision by scoring (query, passage) pairs
4. Items appearing in BOTH result lists get boosted by RRF
5. Reranking the fused candidates helps put the most relevant passages at the top, which matters for production RAG quality

---

🎉 **Congratulations!** You've completed the RAG Engine Mini educational notebooks.