**Reciprocal Rank Fusion (RRF)** is a technique used in information retrieval to combine the results of multiple ranking systems (e.g., different search algorithms or retrieval models) into a single, improved ranking. It is particularly useful in **hybrid search systems**, where you want to leverage the strengths of multiple retrieval methods (e.g., semantic search and syntactic search) to produce better results.

---

### **How Reciprocal Rank Fusion Works**

RRF assigns each document a score based only on its rank positions in the lists being fused (e.g., from different retrieval models). The retrievers' raw scores are not used, so no score normalization is needed. The key idea is that documents appearing near the top of multiple lists should be ranked higher in the final combined list.

The formula for RRF is:

\[
\text{RRF Score}(d) = \sum_{i=1}^{n} \frac{1}{k + \text{rank}_i(d)}
\]

Where:
- \( d \): A document.
- \( n \): The number of ranked lists being combined.
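- \( \text{rank}_i(d) \): The rank of document \( d \) in the \( i \)-th ranked list, counted from 1 for the top result. If \( d \) does not appear in a list, that list contributes 0 to the sum.
- \( k \): A smoothing constant (typically \( k = 60 \)) that dampens the gap between top-ranked and lower-ranked documents; a larger \( k \) makes the contributions of different ranks more similar.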
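
To see how \( k \) shapes the scores, compare the contribution of a rank-1 document with that of a rank-10 document within a single list. The value \( k = 1 \) below is chosen purely for illustration:

\[
k = 60:\quad \frac{1/(60 + 1)}{1/(60 + 10)} = \frac{70}{61} \approx 1.15
\qquad\qquad
k = 1:\quad \frac{1/(1 + 1)}{1/(1 + 10)} = \frac{11}{2} = 5.5
\]

With \( k = 60 \), the top result counts only about 15% more than the tenth, so agreement across lists matters more than one high placement. With a small \( k \), a single first-place ranking dominates the sum.
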

The final ranking is determined by sorting documents based on their RRF scores in descending order.
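
The formula and the final sort can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `reciprocal_rank_fusion` is hypothetical, each ranked list is assumed to be a list of document IDs ordered best-first, and ties are broken by document ID purely to make the output deterministic.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse ranked lists of document IDs into (doc_id, score) pairs, best first."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        # Ranks start at 1, matching rank_i(d) in the formula.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # A document missing from a list simply adds nothing for that list.
    # Sort by descending score; tie-break on doc ID (an assumption, not part of RRF).
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


fused = reciprocal_rank_fusion([["x", "y"], ["y", "z"]])
print([doc_id for doc_id, _ in fused])
```

Here `"y"` appears in both lists, so it accumulates two reciprocal terms and ends up ahead of `"x"` and `"z"`, which each appear once.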

---

### **Why Use Reciprocal Rank Fusion?**
1. **Combines Strengths of Multiple Models**: RRF allows you to combine results from different retrieval models (e.g., dense retrieval for semantic search and BM25 for syntactic search) to improve overall retrieval quality.
2. **Handles Disagreements Between Models**: If one model ranks a document highly but another ranks it low (or misses it entirely), the document still receives a sizable contribution from the list that favors it and a small one from the other, so a single poor rank does not bury it.
3. **Simple and Effective**: RRF is computationally cheap and requires no training. Its only parameter is \( k \), for which 60 is a common default.

---

### **Example of RRF in Action**

Suppose you have two ranked lists from different retrieval models:

- **List 1 (Semantic Search)**:
  1. Document A
  2. Document B
  3. Document C

- **List 2 (Syntactic Search)**:
  1. Document B
  2. Document A
  3. Document D

Using RRF with \( k = 60 \), the scores for each document are calculated as follows:

- **Document A**:
  - Rank in List 1: 1 → \( \frac{1}{60 + 1} = \frac{1}{61} \)
  - Rank in List 2: 2 → \( \frac{1}{60 + 2} = \frac{1}{62} \)
  - Total RRF Score: \( \frac{1}{61} + \frac{1}{62} \approx 0.0325 \)

- **Document B**:
  - Rank in List 1: 2 → \( \frac{1}{60 + 2} = \frac{1}{62} \)
  - Rank in List 2: 1 → \( \frac{1}{60 + 1} = \frac{1}{61} \)
  - Total RRF Score: \( \frac{1}{62} + \frac{1}{61} \approx 0.0325 \)

- **Document C**:
  - Rank in List 1: 3 → \( \frac{1}{60 + 3} = \frac{1}{63} \)
  - Rank in List 2: Not present → \( 0 \)
  - Total RRF Score: \( \frac{1}{63} + 0 \approx 0.0159 \)

- **Document D**:
  - Rank in List 1: Not present → \( 0 \)
  - Rank in List 2: 3 → \( \frac{1}{60 + 3} = \frac{1}{63} \)
  - Total RRF Score: \( 0 + \frac{1}{63} \approx 0.0159 \)

The final combined ranking (sorted by RRF score) is:
1. Document A (0.0325)
2. Document B (0.0325)
3. Document C (0.0159)
4. Document D (0.0159)

Documents A and B have identical scores, as do C and D, so the order within each pair is arbitrary unless a tie-breaking rule (for example, preferring one retriever) is applied.
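
The arithmetic in this example can be checked with a short script. It is a sketch under stated assumptions: the helper `rrf_scores` is a hypothetical name, documents are represented by their letters, and tied documents are printed in alphabetical order only to keep the output stable.

```python
def rrf_scores(ranked_lists, k=60):
    """Return a dict mapping each document to its summed reciprocal-rank score."""
    scores = {}
    for ranking in ranked_lists:
        for rank, doc in enumerate(ranking, start=1):  # top result has rank 1
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return scores


semantic = ["A", "B", "C"]   # List 1 from the example
syntactic = ["B", "A", "D"]  # List 2 from the example

scores = rrf_scores([semantic, syntactic])
# Sort by descending score; alphabetical tie-break is an assumption for display only.
for doc in sorted(scores, key=lambda d: (-scores[d], d)):
    print(f"Document {doc}: {scores[doc]:.4f}")
# Document A: 0.0325
# Document B: 0.0325
# Document C: 0.0159
# Document D: 0.0159
```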

---

### **When to Use RRF**
- **Hybrid Search Systems**: When combining results from semantic search (e.g., dense retrieval) and syntactic search (e.g., BM25).
- **Ensemble Retrieval**: When using multiple retrieval models to improve robustness and accuracy.
- **Cross-Modal Retrieval**: When combining results from different modalities (e.g., text and image search).

---

### **Advantages of RRF**
1. **No Training Required**: RRF is a simple, unsupervised method that does not require labeled data or training.
2. **Flexible**: Can be used to combine any number of ranked lists.
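3. **Effective**: Frequently matches or outperforms other rank fusion methods (e.g., score averaging or rank averaging), although the gain depends on the dataset and on how complementary the fused retrievers are.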

---

### **Comparison with Other Rank Fusion Methods**

| Method | Description | Pros | Cons |
|----------------------|-----------------------------------------------------------------------------|---------------------------------------|---------------------------------------|
| **Reciprocal Rank Fusion (RRF)** | Sums the reciprocals of each document's ranks, offset by \( k \). | Simple, effective, no training needed | \( k \) may need tuning (60 is a common default) |
| **Score Averaging** | Averages the scores of documents from different models. | Simple to implement | Sensitive to score normalization |
| **Rank Averaging** | Averages the ranks of documents from different models. | Simple to implement | Treats every rank step equally; needs a rule for documents missing from a list |
| **Weighted Fusion** | Combines scores or ranks using weighted sums (e.g., learned weights). | Can optimize for specific tasks | Requires training and labeled data |

---

### **Use Case in RAG**
In a **RAG system**, RRF can be used to combine results from:
- **Dense Retrieval** (semantic search using embeddings).
- **Sparse Retrieval** (syntactic search using BM25).

Fusing the two result lists makes it more likely that documents found by either retriever, and especially those found by both, reach the generative model, which tends to improve the quality of the final response.
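
A minimal sketch of this step is shown below. Everything named here is hypothetical: `fuse_for_rag`, the `top_n` cutoff, and the document IDs are illustrative, and the two hit lists stand in for whatever a dense retriever and a BM25 retriever would actually return (best-first lists of IDs are assumed).

```python
def fuse_for_rag(dense_hits, sparse_hits, top_n=3, k=60):
    """Fuse two best-first ID lists with RRF and keep the top_n IDs for the prompt."""
    scores = {}
    for hits in (dense_hits, sparse_hits):
        for rank, doc_id in enumerate(hits, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Descending score; ID tie-break is an assumption to keep results deterministic.
    ranked = sorted(scores, key=lambda d: (-scores[d], d))
    return ranked[:top_n]


# Placeholder retriever outputs (made-up IDs, not real data).
dense_hits = ["doc7", "doc2", "doc9", "doc4"]
sparse_hits = ["doc2", "doc5", "doc7", "doc1"]

context_ids = fuse_for_rag(dense_hits, sparse_hits)
print(context_ids)  # ['doc2', 'doc7', 'doc5']
```

The documents returned by both retrievers (`doc2`, `doc7`) rise to the top, and the remaining slot goes to the best-ranked document found by only one of them. The texts for `context_ids` would then be looked up and placed in the generator's prompt.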

---

### **Summary**
- **Reciprocal Rank Fusion (RRF)** is a rank fusion technique that combines multiple ranked lists into a single, improved ranking.
- It is particularly useful in hybrid search systems, where you want to leverage the strengths of different retrieval models.
- RRF is simple, effective, and does not require training, making it a popular choice for combining search results in systems like RAG.