# **Solution Notebook: Module 2 - Hybrid Search**

*This notebook contains the solutions for the guided hands-on exercise.*

-----

### **Module 2: Improving Recall with Hybrid Search**

**Objective:**
In our first module, we saw a critical **Recall Failure**. Our basic RAG system, using only semantic search, completely missed the correct document chunk for a query about "share repurchases." It failed to find the right information in the knowledge base.

The objective of this module is to solve that recall problem by implementing a more powerful **Hybrid Search** system. We will combine traditional keyword-based search with the semantic search we've already learned. This will create a much more reliable retriever.

**Learning Objectives:**
By the end of this module, you will be able to:
- Explain the core concept of Hybrid Search and understand the distinct roles of dense (semantic) and sparse (keyword) vectors.
- Implement a hybrid data strategy by creating both dense and sparse embeddings for your documents using open-source models.
- Configure and populate a Qdrant collection that handles a sophisticated hybrid search workload.
- Build a custom retrieval function that performs both dense and sparse searches and fuses the results.
- Diagnose a **Recall Failure** and understand why a narrow search (`k=4`) can cause the system to fail, even with a better algorithm.

**Core Concept: Hybrid Search with Qdrant**
We will create and store two types of vectors for each document chunk:
1.  **Dense Vector (from `bge-m3`):** Captures the *semantic meaning* and conceptual relationships.
2.  **Sparse Vector (from `Splade`):** Captures the *keyword importance*.

When a query comes in, our system will perform two separate searches—one for meaning and one for keywords—and then combine the results. This gives us the best of both worlds, making our system far more robust against the type of keyword-based failure we saw in Module 1.
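
To make the two vector types concrete, here is a minimal sketch in plain Python. The toy vectors, the token indices, and the helper names (`cosine_similarity`, `sparse_dot`) are illustrative assumptions; they are not output from `bge-m3` or SPLADE.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def sparse_dot(a: dict[int, float], b: dict[int, float]) -> float:
    """Dot product of two sparse vectors stored as {token_index: weight}."""
    return sum(weight * b[idx] for idx, weight in a.items() if idx in b)

# Toy dense vectors: every dimension carries a value.
dense_query = [0.1, 0.7, 0.2]
dense_chunk = [0.2, 0.6, 0.1]

# Toy sparse vectors: only a few vocabulary indices are non-zero.
# Index 1012 stands in for a token such as "repurchases" (hypothetical id).
sparse_query = {1012: 1.8, 2044: 0.4}
sparse_chunk = {1012: 2.1, 3170: 0.9}

dense_score = cosine_similarity(dense_query, dense_chunk)
sparse_score = sparse_dot(sparse_query, sparse_chunk)
print(f"dense cosine: {dense_score:.3f}, sparse dot: {sparse_score:.2f}")
```

The sparse score is non-zero only because both vectors share index 1012, which is why a sparse search rewards exact term overlap while a dense search compares whole-vector direction.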


### **Step 1: Install Dependencies**

In [1]:
# Install all required libraries
!pip install -q langchain langchain-community langchain-groq langchain_huggingface qdrant-client pypdf fastembed

# Ignore standard warnings
import warnings
warnings.filterwarnings('ignore')


-----

### **Step 2: Setup API Key & Document Loading**



This step is the same as in Module 1: we reuse our Module 1 API key, load the NVIDIA financial report PDF, and split it into chunks.

In [2]:
import os
from google.colab import userdata
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# --- Setup API Key ---
# Make sure you have added your GROQ_API_KEY to the Colab secrets manager
os.environ["GROQ_API_KEY"] = userdata.get('GROQ_API_KEY')

# --- Load and Split Document ---
# Make sure you have uploaded the NVIDIA Q1 FY26 PDF to your Colab session
pdf_path = "./NVIDIA-Q1-FY26-Financial-Results.pdf"
loader = PyPDFLoader(pdf_path)
documents = loader.load()

# Use the same chunking strategy as Module 1
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
docs = text_splitter.split_documents(documents)

print(f"Document loaded and split into {len(docs)} chunks.")

Document loaded and split into 191 chunks.


-----

### **Step 3: Initialize Qdrant for Hybrid Search**

This is a key step. We will create a Qdrant client and then create a new **collection** that is specifically configured to handle both dense and sparse vectors. This is different from Module 1 where we only had one type of vector.


In [3]:
from qdrant_client import QdrantClient, models

# Initialize an in-memory Qdrant client
client = QdrantClient(location=":memory:")

# Define the collection name
collection_name = "rag_foundations_m2_guided"

# Create the collection with configurations for both dense and sparse vectors
print(f"Creating Qdrant collection '{collection_name}' for hybrid search...")

# SOLUTION
client.recreate_collection(
    collection_name=collection_name,
    vectors_config={
        "dense": models.VectorParams(size=1024, distance=models.Distance.COSINE)
    },
    sparse_vectors_config={
        "text-sparse": models.SparseVectorParams(
            index=models.SparseIndexParams(
                on_disk=False
            )
        )
    }
)

print("Collection created successfully.")

Creating Qdrant collection 'rag_foundations_m2_guided' for hybrid search...
Collection created successfully.


-----

### **Step 4: Embed and Store Documents**


Now we will perform the main data processing. We will loop through every document chunk, create both a dense and a sparse vector for it, and then store them together in our new Qdrant collection.

In [4]:
from langchain_huggingface import HuggingFaceEmbeddings
#from langchain_community.embeddings import HuggingFaceBgeEmbeddings

from fastembed import SparseTextEmbedding
from tqdm.auto import tqdm

print("Initializing local embedding models...")
# 1. Initialize our embedding models
dense_embed_model = HuggingFaceEmbeddings(
    model_name="BAAI/bge-m3", model_kwargs={"device": "cpu"}, encode_kwargs={"normalize_embeddings": True}
)
sparse_embed_model = SparseTextEmbedding(model_name="prithivida/Splade_PP_en_v1")
print("Models initialized.")

# 2. Embed and prepare all documents for upsert
print("Embedding and preparing all documents for upsert...")
points_to_upsert = []
for i, doc in enumerate(tqdm(docs, desc="Processing All Documents")):
    doc_text = doc.page_content

    # SOLUTION (Part 1)
    # Create the dense vector for the doc_text.
    dense_vec = dense_embed_model.embed_query(doc_text)

    # SOLUTION (Part 2)
    # Create the sparse vector for the doc_text.
    sparse_vec = list(sparse_embed_model.embed([doc_text]))[0]

    # SOLUTION (Part 3)
    # Create a Qdrant PointStruct to hold all the data.
    point = models.PointStruct(
        id=i,
        payload={"text": doc_text, **doc.metadata},
        vector={
            "dense": dense_vec,
            "text-sparse": models.SparseVector(
                indices=sparse_vec.indices.tolist(),
                values=sparse_vec.values.tolist()
            ),
        },
    )

    points_to_upsert.append(point)

# 3. Upsert the points to Qdrant
# SOLUTION (Part 4)
# Upload the prepared points to your Qdrant collection.
client.upsert(
    collection_name=collection_name,
    points=points_to_upsert,
    wait=True
)

print(f"Successfully embedded and upserted all {len(docs)} documents.")

Initializing local embedding models...


Models initialized.
Embedding and preparing all documents for upsert...


Successfully embedded and upserted all 191 documents.
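
The cell above sends every point in a single `upsert` call, which is fine for 191 chunks. For larger corpora, a common pattern is to upload in batches. The sketch below is one hedged way to do that; the `batched` helper and the batch size of 64 are assumptions for illustration, and the list of integers stands in for `points_to_upsert`.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")

def batched(items: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# Stand-in for points_to_upsert (191 matches the chunk count reported above).
points = list(range(191))
batch_sizes = [len(batch) for batch in batched(points, 64)]
print(batch_sizes)
```

In the notebook, each batch would be passed to `client.upsert(collection_name=collection_name, points=batch, wait=True)` in place of the single call.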


-----

### **Step 5: Build the Hybrid RAG Chain**

Now we'll build our retrieval function. This function needs to perform two separate searches in Qdrant (one for dense vectors, one for sparse) and then intelligently combine the results before passing them to the LLM.

In [7]:
from langchain_groq import ChatGroq
from langchain.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
from langchain_core.documents import Document

# Initialize the Groq LLM
llm = ChatGroq(temperature=0, model_name="meta-llama/llama-4-scout-17b-16e-instruct")

# --- Helper function to visualize the context ---
def pretty_print_docs(docs):
    print(f"Found {len(docs)} documents to pass to the LLM.\n")
    for i, doc in enumerate(docs):
        source = doc.metadata.get('source', 'Unknown Source')
        page = doc.metadata.get('page', 'Unknown Page')
        print(f"  [{i+1}] Source: {source} (Page: {page})")
        print(f"      Content: '{doc.page_content[:150]}...'")
    print("-" * 50)

# --- Custom Retrieval Function ---
def qdrant_hybrid_retrieve(query: str, top_k=4) -> list[Document]:
    """
    Performs hybrid search and returns a list of LangChain Document objects.
    top_k is deliberately kept small (try 2 or 4) to demonstrate recall failure.
    """
    # SOLUTION (Part 1)
    # Create the dense and sparse vectors for the input 'query'.

    # This small tweak was added after switching embedding wrappers.
    # Earlier, we were using `HuggingFaceBgeEmbeddings`, and the correct answer was retrieved even with k=4.
    # After switching to the newer `HuggingFaceEmbeddings`, the same query failed to retrieve the answer.
    # Adding the "query: " prefix resolved this issue and brought back the correct result at k=4.
    query = f"query: {query}"


    dense_query_vec = dense_embed_model.embed_query(query)
    sparse_query_vec = list(sparse_embed_model.embed([query]))[0]

    # SOLUTION (Part 2)
    # Perform the two separate searches (dense and sparse) using the client.search() method.
    dense_results = client.search(
        collection_name=collection_name,
        query_vector=models.NamedVector(name="dense", vector=dense_query_vec),
        limit=top_k,
        with_payload=True
    )
    sparse_results = client.search(
        collection_name=collection_name,
        query_vector=models.NamedSparseVector(
            name="text-sparse",
            vector=models.SparseVector(indices=sparse_query_vec.indices.tolist(), values=sparse_query_vec.values.tolist())
        ),
        limit=top_k,
        with_payload=True
    )

    # Print dense retrieval results
    print(f"\n--- Dense Search Results (k={top_k}) ---")
    dense_documents = []
    for result in dense_results:
        doc = Document(page_content=result.payload.get('text', ''), metadata={k: v for k, v in result.payload.items() if k != 'text'})
        dense_documents.append(doc)
    pretty_print_docs(dense_documents)

    # Print sparse retrieval results
    print(f"\n--- Sparse Search Results (k={top_k}) ---")
    sparse_documents = []
    for result in sparse_results:
        doc = Document(page_content=result.payload.get('text', ''), metadata={k: v for k, v in result.payload.items() if k != 'text'})
        sparse_documents.append(doc)
    pretty_print_docs(sparse_documents)

    # --- RRF Fusion Logic ---
    rrf_scores = {}
    doc_lookup = {}
    k_constant = 60  # RRF smoothing constant: larger values flatten the score gap between adjacent ranks.


    # --- Process Dense Search Results ---
    # Iterate through each result from the dense (semantic) search, keeping track of its rank.

    for rank, result in enumerate(dense_results):
        # If this is the first time we've seen this document ID, initialize its score and store its content.
        if result.id not in rrf_scores:
            rrf_scores[result.id] = 0
            doc_lookup[result.id] = Document(page_content=result.payload.get('text', ''), metadata={k: v for k, v in result.payload.items() if k != 'text'})
        # Add the RRF score from the dense search results to the document's total score.
        # The score is 1 / (k + rank) with a 1-based rank; enumerate() is 0-based, hence the "+ 1".
        rrf_scores[result.id] += 1 / (k_constant + rank + 1)

    # --- Process Sparse Search Results ---
    # Do the same for the sparse (keyword) search results.
    for rank, result in enumerate(sparse_results):
        # If we see a document for the first time, initialize it.
        if result.id not in rrf_scores:
            rrf_scores[result.id] = 0
            doc_lookup[result.id] = Document(page_content=result.payload.get('text', ''), metadata={k: v for k, v in result.payload.items() if k != 'text'})

        # Add the RRF score from the sparse search results to the document's total score.
        # If a document appeared in both searches, its score will now be the sum of both calculations.
        rrf_scores[result.id] += 1 / (k_constant + rank + 1)


    # Sort documents by RRF score
    sorted_ids = sorted(rrf_scores.keys(), key=lambda x: rrf_scores[x], reverse=True)
    combined_documents = [doc_lookup[doc_id] for doc_id in sorted_ids]

    print(f"\n--- RRF Fusion Results (Hybrid Search with k={top_k}) ---")
    pretty_print_docs(combined_documents)

    return combined_documents

# --- Build the RAG Chain (This part is provided for you) ---
prompt_template = "Answer the question based only on the following context:\n\nContext:\n{context}\n\nQuestion: {question}"
prompt = ChatPromptTemplate.from_template(prompt_template)
rag_chain = (
    {"context": qdrant_hybrid_retrieve, "question": RunnablePassthrough()} | prompt | llm | StrOutputParser()
)
print("RAG chain with Qdrant hybrid retrieval is ready.")

RAG chain with Qdrant hybrid retrieval is ready.


-----

### **Step 6: Test the Hybrid RAG Chain**

This is the moment of truth. First, we will test the query that failed in Module 1 to see if our new hybrid search retriever has solved the problem. Then, we will try a new, more difficult query to see if we can find the limits of our current system.

In [6]:
# --- Run the Test Queries ---
# This part is provided for you

# Query #1: The query that failed in Module 1
module_1_failure_query = "How much did NVIDIA spend on share repurchases in the first quarter of fiscal year 2026?"

# Query #2: Our new, more difficult query for this module
module_2_failure_query = "What was the exact value for \"Tax withholding related to common stock from stock plans\" for the period ending April 27, 2025?"

print("--- Testing Query #1 (The Module 1 Failure) ---")
print(f"Query: {module_1_failure_query}\n")
answer_1 = rag_chain.invoke(module_1_failure_query)
print('\033[92m' + f"Answer: {answer_1}\n" + '\033[0m')
print("-" * 100)


print("\n\n--- Testing Query #2 (Our New Challenge) ---")
print(f"Query: {module_2_failure_query}\n")
answer_2 = rag_chain.invoke(module_2_failure_query)
print('\033[91m' + f"Answer: {answer_2}\n" + '\033[0m')
print("-" * 100)

--- Testing Query #1 (The Module 1 Failure) ---
Query: How much did NVIDIA spend on share repurchases in the first quarter of fiscal year 2026?


--- Dense Search Results (k=4) ---
Found 4 documents to pass to the LLM.

  [1] Source: ./NVIDIA-Q1-FY26-Financial-Results.pdf (Page: 13)
      Content: 'NVIDIA CORPORATION AND SUBSIDIARIESNOTES TO CONDENSED CONSOLIDATED FINANCIAL STATEMENTS (Continued)
(Unaudited)
Property and Equipment:
Property, equi...'
  [2] Source: ./NVIDIA-Q1-FY26-Financial-Results.pdf (Page: 16)
      Content: 'NVIDIA CORPORATION AND SUBSIDIARIESNOTES TO CONDENSED CONSOLIDATED FINANCIAL STATEMENTS (Continued)
(Unaudited)
Total future purchase commitments as o...'
  [3] Source: ./NVIDIA-Q1-FY26-Financial-Results.pdf (Page: 9)
      Content: 'NVIDIA CORPORATION AND SUBSIDIARIESNOTES TO CONDENSED CONSOLIDATED FINANCIAL STATEMENTS (Continued)
(Unaudited)
Note 4 - Income Taxes
Income tax expen...'
  [4] Source: ./NVIDIA-Q1-FY26-Financial-Results.pdf (Page: 20)
      Conten

# Module 2 Conclusion: A Step Forward, and a Critical Failure

After completing the notebook, you should see that the results from this module are a fantastic real-world lesson in building RAG systems.

**1. A Major Success**: Our new Hybrid Search retriever has successfully solved the critical failure from Module 1. For the query about "share repurchases," the system correctly found the relevant chunks and provided the right answer ($14.5 billion).

This proves that by combining dense (semantic) and sparse (keyword) vectors, we can build a system with excellent recall—the ability to find relevant documents even when the query relies on specific keywords.

**2. A New, More Subtle Failure**: However, when we test the system with our second, more difficult query, it fails in a critical way.

The Query: "What was the exact value for 'Tax withholding related to common stock from stock plans' for the period ending April 27, 2025?"

The Result: The system returns the wrong value: $1,752 million (the value from the wrong year).

**The Diagnosis**: A Recall Failure. This is not a case of the LLM getting confused. The root cause is that our retriever, with its narrow search of k=2 or 4, never finds the correct chunk of text in the document. The combination of a basic document loader (PyPDFLoader) that struggles with tables and a small k value means that the correct information from page 6 never reaches the LLM. Instead, the system retrieves other, less relevant chunks that happen to contain the wrong number.
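
One way to confirm a diagnosis like this is to check at which rank the known-correct chunk appears, and whether that rank falls inside the cutoff. The sketch below is a minimal illustration; the chunk ids and the ranking are hypothetical, not values produced by this notebook.

```python
def first_hit_rank(ranked_ids, relevant_ids):
    """Return the 1-based rank of the first relevant id, or None if absent."""
    relevant = set(relevant_ids)
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            return rank
    return None

def hit_at_k(ranked_ids, relevant_ids, k):
    """True if at least one relevant id appears in the top k results."""
    return first_hit_rank(ranked_ids[:k], relevant_ids) is not None

# Hypothetical fused ranking of chunk ids; chunk 64 plays the "correct" chunk.
ranked_ids = [57, 12, 88, 140, 23, 41, 95, 64, 7, 131]
relevant_ids = [64]

for k in (2, 4, 10):
    print(f"k={k}: hit={hit_at_k(ranked_ids, relevant_ids, k)}")
print("first relevant rank:", first_hit_rank(ranked_ids, relevant_ids))
```

If the correct chunk only shows up beyond the cutoff, the LLM never sees it, no matter how good the prompt or the model is.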

### Key Takeaway

**Hybrid Search** is a powerful tool, but it's not a magic bullet. The performance of a RAG pipeline is only as strong as its weakest link. We've just seen that even with a strong search algorithm, table-unaware parsing and chunking combined with an overly narrow retrieval setting (k=4 in this notebook) can cause the entire system to fail. We have not yet built a truly high-recall system capable of handling this difficult query.

### Next Up

**In Module 3**, we'll implement a robust, **two-stage Retrieve and Re-rank** architecture to fix our system's precision issues. First, we'll solve the recall problem by using our **fast retriever** to cast a wider net (increasing k to 10), ensuring the correct documents are found, even if they're buried in noise.

Then, we'll introduce a **Re-Ranker**, an intelligent second stage that analyzes these noisy results, promotes the most relevant chunks to the top, and gives our LLM a much cleaner context for generating an accurate response.

For our learning path, we're tackling the re-ranker first to demonstrate a powerful technique for fixing an imprecise retriever, a common real-world challenge.

However, it's critical to understand that the ideal production-grade solution is to use both a layout-aware parser and a re-ranker. The best practice is always to fix data quality at the source. Therefore, after mastering re-ranking, the perfect next step would be to replace our basic parser with a tool like **LlamaParse or Unstructured.io** to see how a clean data foundation can dramatically improve the entire system's efficiency and precision.

# Appendix A

## One Important Finding While Switching from Deprecated HuggingFaceBgeEmbeddings to the Newer HuggingFaceEmbeddings in LangChain ##

🧪 Issue Summary: Query Retrieval Failure After Changing Embedding Wrapper

Background
  - Initially, I used HuggingFaceBgeEmbeddings with the model "BAAI/bge-m3" and was able to retrieve the correct document even with k=4 during hybrid search
  - Later, I migrated to the newer recommended HuggingFaceEmbeddings wrapper from LangChain using the same model and parameters:

model_name = "BAAI/bge-m3"
model_kwargs = {"device": "cpu"}
encode_kwargs = {"normalize_embeddings": True}


However, after this change, the same query failed to retrieve the correct document even at k=6.

⸻

🔎 Investigation
  - HuggingFaceBgeEmbeddings is a specialized wrapper that formats queries internally. By default it prepends a query instruction (its `query_instruction` field, which for English BGE models defaults to "Represent this question for searching relevant passages: ") before embedding the query.
  - On the other hand, HuggingFaceEmbeddings is a generic wrapper and does not apply any such formatting. It simply passes the query string as-is to the model.
  - Note that the "query: " prefix is the convention of the E5 model family rather than of BGE, and the bge-m3 model card states that bge-m3 does not require a query instruction. The improvement I saw from adding "query: " is therefore best treated as an empirical observation on this document and query, not as a documented requirement of the model.

⸻

✅ Fix

To compensate for the missing query-side formatting, we updated the code to explicitly prepend "query: " before embedding the query:

query = f"query: {query}"


After applying this tweak, the correct document was retrieved again at k=4, restoring similar behavior to the original wrapper (though not guaranteed to be identical).
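
Because this fix is empirical, it can help to keep the prefix configurable rather than hard-coded, so it is easy to compare retrieval with and without it. The helper below is a minimal sketch; `format_query` is a hypothetical name and is not part of LangChain or the notebook.

```python
def format_query(query: str, prefix: str = "query: ") -> str:
    """Prepend `prefix` to the query unless it is already present.

    Passing prefix="" disables the prefix entirely, which makes
    with/without comparisons straightforward.
    """
    if query.startswith(prefix):
        return query
    return f"{prefix}{query}"

raw_query = "How much did NVIDIA spend on share repurchases?"
print(format_query(raw_query))
print(format_query(raw_query, prefix=""))
```

Inside `qdrant_hybrid_retrieve`, the line `query = f"query: {query}"` could then be replaced with a call to this helper, and the same test query run with each setting to see at which rank the correct chunk appears.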

⸻

🧠 **What’s the Difference Between the Two Wrappers?**

HuggingFaceBgeEmbeddings (Specialized Wrapper – Deprecated)

  - Tailored specifically for the BGE family of models from the Beijing Academy of Artificial Intelligence.
  - Includes model-specific behavior: by default it prepends a query instruction (configurable via `query_instruction`), so queries are not embedded as raw text.
  - Optimized for ease-of-use when working with instruction-tuned embedding models.

HuggingFaceEmbeddings (General-Purpose Wrapper – Recommended)

  - Designed to work universally with any SentenceTransformer-compatible model from the Hugging Face Hub.
  - Does not make assumptions about the model’s formatting requirements.
  - Does not add instruction prefixes, making it more flexible but requiring the developer to handle formatting when needed (e.g., for instruction-tuned models like BGE).

## Understanding Reciprocal Rank Fusion (RRF)

### The "Two Movie Critics" Analogy for RRF

Imagine you have two expert movie critics who you trust:
* **Critic A (Our Dense Search):** This critic is great at understanding the *feeling* and *themes* of a movie.
* **Critic B (Our Sparse Search):** This critic is excellent at catching specific details and keywords in the dialogue.

You ask them both to recommend the top 3 movies about "space exploration."

#### Step 1: Get the Two Lists

The critics come back with slightly different ranked lists:

**Critic A (Dense/Semantic) Results:**
1.  *Galaxy Quest* (A great parody, captures the *feeling* of exploration)
2.  *Apollo 13* (About a real mission)
3.  *The Martian* (Focuses on survival)

**Critic B (Sparse/Keyword) Results:**
1.  *Apollo 13* (Uses the exact keyword "space exploration")
2.  *Interstellar* (About exploring new galaxies)
3.  *The Martian* (Also about space)

#### Step 2: The RRF Code in Action

Now, let's see what our RRF code does with these two lists.

1.  **It creates an empty scoreboard (`rrf_scores`) and a library of the movies (`doc_lookup`).**

2.  **It processes Critic A's list:**
    * *Galaxy Quest* is ranked #1, so it gets a high score (e.g., `1 / (60 + 1)`).
    * *Apollo 13* is ranked #2, so it gets a slightly lower score (e.g., `1 / (60 + 2)`).
    * *The Martian* is ranked #3, so it gets an even lower score (e.g., `1 / (60 + 3)`).

3.  **It processes Critic B's list:**
    * *Apollo 13* is ranked #1. It's already on our scoreboard, so we **add** more points to its score. It now has a very high total score!
    * *Interstellar* is ranked #2. It's new, so it gets its first score (e.g., `1 / (60 + 2)`).
    * *The Martian* is ranked #3. It's already on our scoreboard, so we **add** more points to its existing score.

#### Step 3: The Final Fused Ranking

After adding up all the points, our final scoreboard looks something like this (higher score is better):

1.  **Apollo 13:** (High score from Critic A + Highest score from Critic B) -> **Highest Score**
2.  **The Martian:** (Lower score from Critic A + Lower score from Critic B) -> **High Score**
3.  **Galaxy Quest:** (Highest score from Critic A + No score from Critic B) -> **Good Score**
4.  **Interstellar:** (No score from Critic A + High score from Critic B) -> **Good Score**

The code then sorts the movies by this new RRF score. The final, fused list it sends to the LLM would be: `[Apollo 13, The Martian, Galaxy Quest, Interstellar]`.

This shows how RRF intelligently promotes the documents that **both** search methods agree are important, giving us a much more reliable and relevant final ranking.
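
The analogy can be checked with a few lines of Python. This is a standalone sketch of the same scoring rule used in `qdrant_hybrid_retrieve` (1-based ranks, constant 60); the function name `reciprocal_rank_fusion` is an assumption for illustration.

```python
def reciprocal_rank_fusion(ranked_lists, k_constant=60):
    """Fuse several ranked lists; each item earns 1 / (k_constant + rank) per list."""
    scores = {}
    for ranked in ranked_lists:
        for rank, item in enumerate(ranked, start=1):  # rank is 1-based
            scores[item] = scores.get(item, 0.0) + 1.0 / (k_constant + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)

critic_a = ["Galaxy Quest", "Apollo 13", "The Martian"]   # dense / semantic
critic_b = ["Apollo 13", "Interstellar", "The Martian"]   # sparse / keyword

fused = reciprocal_rank_fusion([critic_a, critic_b])
for title, score in fused:
    print(f"{title}: {score:.5f}")
```

Movies that appear in both lists end up with roughly double the score of movies that appear in only one, which is why *The Martian* (third on both lists) still outranks *Galaxy Quest* (first on one list only).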