# 2. RAG-Fusion (with Reciprocal Rank Fusion)

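**What:** Multi-Query retrieval plus rank fusion: the result lists from all query variations are merged into one ranking with Reciprocal Rank Fusion (RRF)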

**Why:** Documents that rank near the top for several query variations get priority over documents that rank well for only one

**When:** Ranking quality is critical (e.g., biomedical, legal, or scientific QA)

**Stage:** Pre-Retrieval (query generation) + Post-Retrieval (re-ranking)

**LLM Calls:** 2 (generate variations + final answer)

---

### RRF Formula

```
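RRF_score(d) = Σ_q [1 / (k + rank_q(d))]  where k=60
rank_q(d) = 0-based position of document d in the results for query q
(summed over the queries whose results contain d)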
```

### Why RRF is Better Than Score Averaging

- Uses **rank position**, not absolute similarity scores
- Robust to **different scoring scales** (e.g., cosine similarity bounded in [-1, 1] vs. unbounded BM25 scores)
- Boosts documents that rank **high across multiple queries**
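
To make the scale argument concrete, here is a minimal sketch with made-up scores. The document ids and all numbers are illustrative assumptions, not results from this notebook. One retriever returns cosine-like scores and the other BM25-like scores; averaging the raw scores simply reproduces the order of the larger-scale retriever, while RRF looks only at positions.

```python
# Hypothetical scores from two retrievers on different scales (illustrative only).
cosine_scores = {"doc_a": 0.92, "doc_b": 0.90, "doc_c": 0.40}
bm25_scores = {"doc_a": 9.0, "doc_b": 8.0, "doc_c": 25.0}


def ranking(scores):
    """Return doc ids ordered best-first by score."""
    return sorted(scores, key=scores.get, reverse=True)


def average_scores(*score_maps):
    """Naive fusion: average the raw scores of each document."""
    docs = score_maps[0].keys()
    return {d: sum(m[d] for m in score_maps) / len(score_maps) for d in docs}


def rrf(rankings, k=60):
    """Rank-based fusion: sum 1 / (k + rank) using 0-based ranks."""
    fused = {}
    for ranked in rankings:
        for rank, doc in enumerate(ranked):
            fused[doc] = fused.get(doc, 0.0) + 1 / (k + rank)
    return fused


avg_order = ranking(average_scores(cosine_scores, bm25_scores))
rrf_order = ranking(rrf([ranking(cosine_scores), ranking(bm25_scores)]))

print("average:", avg_order)  # same order as BM25 alone
print("rrf:    ", rrf_order)  # doc_a first: ranked 1st and 2nd
```

In this toy data `doc_a` is first for cosine and second for BM25, so RRF puts it on top, whereas the averaged scores are dominated by the BM25 value of `doc_c`.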

### Example

A document appears at ranks [0, 2, 0] across 3 queries (ranks are 0-based, so 0 is the top result):

```
RRF = 1/(60+0) + 1/(60+2) + 1/(60+0)
    = 0.0167 + 0.0161 + 0.0167
    = 0.0495
```
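
The arithmetic above can be checked with a few lines of Python. This is a minimal sketch; `rrf_score` is a helper name introduced here for illustration and assumes 0-based ranks, as in the example.

```python
def rrf_score(ranks, k=60):
    """Sum 1 / (k + rank) over the 0-based ranks at which a document appears."""
    return sum(1 / (k + rank) for rank in ranks)


score = rrf_score([0, 2, 0])
single = rrf_score([0])

print(f"{score:.4f}")   # 0.0495
print(f"{single:.4f}")  # 0.0167
```

A document that tops only one list scores about a third of the example document, which is what lets consistently high-ranked documents rise.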

In [None]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

from config import model, setup_vectorstore, get_retriever, format_docs

## Step 1: Generate Query Variations

Same as Multi-Query: generate differently worded versions of the question. The original question is kept as the first query.

In [None]:
import re


def generate_multi_queries(question, num_variations=3):
    """Return the original question followed by up to num_variations LLM rewrites."""

    template = """Generate {num_variations} different versions of the given question.
Each version should ask the same thing but with different words/perspectives.

Original question: {question}

Provide numbered alternatives:"""

    prompt = ChatPromptTemplate.from_template(template)
    chain = prompt | model | StrOutputParser()

    response = chain.invoke({
        "question": question,
        "num_variations": num_variations
    })

    # Keep only numbered lines ("1. ..." or "1) ...") so that any preamble the
    # model adds is not treated as a query, and strip just the leading number
    # so that parentheses or periods inside the question are left intact.
    numbered = re.compile(r"^\s*\d+[.)]\s+(.*\S)")
    queries = [question]
    for line in response.split("\n"):
        match = numbered.match(line)
        if match:
            queries.append(match.group(1))

    return queries[:num_variations + 1]


# Test
queries = generate_multi_queries("What is task decomposition?", 3)
for q in queries:
    print(f"  - {q}")
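
The parsing step can be checked offline, without calling the model. The sketch below runs a small parser over a hand-written response; `parse_numbered_lines` and the canned text are hypothetical and exist only for this illustration.

```python
import re

# Hypothetical LLM response, written by hand for illustration.
canned_response = """Here are some alternatives:
1. What does task decomposition (TD) mean in practice?
2) How is a task broken into subtasks?
3. Can you explain task decomposition?"""


def parse_numbered_lines(text):
    """Return the text of lines that start with '1.' or '1)' style numbering."""
    pattern = re.compile(r"^\s*\d+[.)]\s+(.*\S)")
    parsed = []
    for line in text.splitlines():
        match = pattern.match(line)
        if match:
            parsed.append(match.group(1))
    return parsed


variations = parse_numbered_lines(canned_response)
for v in variations:
    print(v)
```

The preamble line is skipped, and the parenthetical in the first alternative survives because only the leading number is stripped.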

## Step 2: Reciprocal Rank Fusion (RRF)

Combine multiple result lists with rank-based scoring.

**Args:**
- `results_list`: List of document lists from different queries
- `k`: RRF constant (default=60)

**Returns:**
- List of (document, rrf_score) tuples, sorted by score descending

In [None]:
def reciprocal_rank_fusion(results_list, k=60):
    """Combine results using RRF scoring"""
    
    fused_scores = {}
    doc_map = {}
    
    for docs in results_list:
        for rank, doc in enumerate(docs):
            # Use the full text as the document ID; a short prefix can
            # collide when chunks share a header or boilerplate opening.
            doc_id = doc.page_content
            
            if doc_id not in fused_scores:
                fused_scores[doc_id] = 0
                doc_map[doc_id] = doc
            
            # RRF formula
            fused_scores[doc_id] += 1 / (k + rank)
    
    # Sort by score descending
    sorted_docs = sorted(fused_scores.items(), key=lambda x: x[1], reverse=True)
    
    return [(doc_map[doc_id], score) for doc_id, score in sorted_docs]
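
The constant `k` controls how much a single top placement outweighs repeated lower placements. The following is a minimal sketch over plain string ids, using 0-based ranks as in the function above; `rrf_scores` and the result lists are hypothetical and chosen only to show the effect.

```python
def rrf_scores(rankings, k=60):
    """Fuse ranked lists of doc ids with RRF, using 0-based ranks."""
    scores = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical result lists: "solo" is 1st and 2nd in two lists,
# "steady" is 3rd in all three.
rankings = [
    ["solo", "a1", "steady"],
    ["b1", "solo", "steady"],
    ["c1", "c2", "steady"],
]

for k in (1, 60):
    top_id, top_score = rrf_scores(rankings, k=k)[0]
    print(f"k={k:>2}: top={top_id} score={top_score:.4f}")
```

With a small `k` the top positions dominate and `solo` wins; with `k=60` the curve is flatter, so appearing in every list matters more and `steady` wins. Note that with 0-based ranks `k` must be greater than 0, otherwise the top result would divide by zero.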

## Step 3: RAG-Fusion Retrieval

Generate queries → Retrieve for each → Apply RRF

In [None]:
def rag_fusion_retrieve(question, retriever, num_variations=3):
    """RAG-Fusion retrieval with RRF"""
    
    # Generate queries
    queries = generate_multi_queries(question, num_variations)
    
    print("Generated Queries:")
    for i, q in enumerate(queries):
        prefix = "(original)" if i == 0 else f"{i}."
        print(f"  {prefix} {q}")
    
    # Retrieve for each query
    results_list = [retriever.invoke(q) for q in queries]
    
    # Apply RRF
    ranked = reciprocal_rank_fusion(results_list)
    
    print("\nRRF Ranked Results (top 5):")
    for i, (doc, score) in enumerate(ranked[:5]):
        source = doc.metadata.get('filename', 'unknown')
        print(f"  {i+1}. Score: {score:.4f} | {source}")
        print(f"     Preview: {doc.page_content[:60]}...")
    
    return ranked, queries
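
The retrieval step only relies on a retriever object with an `invoke(query)` method that returns documents exposing `page_content` and `metadata`. For offline checks, a stand-in can be sketched as below; `FakeDoc`, `FakeRetriever`, the query strings, and the file names are hypothetical and not part of the notebook's `config` module.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDoc:
    """Stand-in for a retrieved document (hypothetical, for offline checks)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


class FakeRetriever:
    """Stand-in retriever: returns a canned ranked list per query string."""

    def __init__(self, canned):
        self.canned = canned

    def invoke(self, query):
        return self.canned.get(query, [])


d1 = FakeDoc("Task decomposition splits a goal into subtasks.", {"filename": "a.md"})
d2 = FakeDoc("Chain-of-thought prompts stepwise reasoning.", {"filename": "b.md"})

fake_retriever = FakeRetriever({
    "what is task decomposition?": [d1, d2],
    "how are tasks broken down?": [d2, d1],
})

# Same shape as results_list in the retrieval step: one ranked list per query.
results_list = [
    fake_retriever.invoke(q)
    for q in ("what is task decomposition?", "how are tasks broken down?")
]

for docs in results_list:
    print([doc.metadata.get("filename", "unknown") for doc in docs])
```

A list of ranked lists in this shape is what the RRF step consumes, so the fusion logic can be exercised without a vector store or an LLM call.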

## Step 4: Complete RAG-Fusion Pipeline

In [None]:
def rag_fusion_rag(question, retriever, num_variations=3, top_k=5):
    """Complete RAG-Fusion pipeline"""
    
    # Retrieve with RRF
    ranked, queries = rag_fusion_retrieve(question, retriever, num_variations)
    
    # Get top documents
    docs = [doc for doc, _ in ranked[:top_k]]
    context = format_docs(docs)
    
    # Generate answer
    answer_template = """Answer based ONLY on the context.
If not found, say "Information not found in documents."

Context:
{context}

Question: {question}

Answer:"""
    
    prompt = ChatPromptTemplate.from_template(answer_template)
    chain = prompt | model | StrOutputParser()
    
    answer = chain.invoke({"context": context, "question": question})
    
    print(f"\n{'='*70}")
    print("ANSWER:")
    print("="*70)
    print(answer)
    
    return answer

## Test

In [None]:
# Setup
vectorstore = setup_vectorstore()
retriever = get_retriever(vectorstore, k=5)

# Test questions
test_questions = [
    "Where did Otabek study?",
    "How to use DeMask?",
    "What is DMS?",
    "What is the difference between Graph DTA and Graph DF?"
]

for question in test_questions:
    print(f"\n{'='*70}")
    print(f"Question: {question}")
    print("="*70)
    answer = rag_fusion_rag(question, retriever)
    print("\n" + "-"*70)