### 📖 Where We Are

**In the previous sections**, we mastered the end-to-end RAG pipeline using different vector stores (ChromaDB, FAISS, Pinecone). All of those systems relied on a single method for retrieval: **dense (semantic) search**, which finds documents based on their conceptual meaning.

**In this new section on Hybrid Search Strategies**, we will learn how to make retrieval more reliable. This notebook introduces **Hybrid Search**, a technique that combines semantic search with traditional keyword search, so that queries containing exact terms (names, acronyms, identifiers) are matched as dependably as paraphrased ones.

### 1. Hybrid Search - Combining Dense and Sparse Retrievers

In [1]:
# --- LangChain Imports ---
# FAISS for our dense vector store.
from langchain_community.vectorstores import FAISS
# HuggingFace embeddings for our dense retriever.
from langchain_huggingface import HuggingFaceEmbeddings
# BM25Retriever for our sparse, keyword-based retriever (requires the `rank_bm25` package).
from langchain_community.retrievers import BM25Retriever
# EnsembleRetriever to combine the results of multiple retrievers.
from langchain.retrievers import EnsembleRetriever
# Standard Document object (imported from langchain_core, its current home).
from langchain_core.documents import Document

In [2]:
# --- Step 1: Create Sample Documents ---
docs = [
    Document(page_content="LangChain helps build LLM applications."),
    Document(page_content="Pinecone is a vector database for semantic search."),
    Document(page_content="The Eiffel Tower is located in Paris."),
    Document(page_content="Langchain can be used to develop agentic ai application."),
    Document(page_content="Langchain has many types of retrievers.")
]

# --- Step 2: Set up the Dense Retriever ---
# This is the semantic search component.
embedding_model = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
dense_vectorstore = FAISS.from_documents(docs, embedding_model)
dense_retriever = dense_vectorstore.as_retriever(search_kwargs={"k": 3})

#### The Sparse Retriever: BM25

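For our sparse retriever, we'll use **BM25 (Best Matching 25)**, a long-established ranking function that remains a standard baseline for keyword-based search. It refines TF-IDF in two ways: repeated occurrences of a term add less and less to the score (term-frequency saturation), and scores are normalized by document length. The result ranks documents by the terms they share with a query.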
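
To make the scoring concrete, here is a minimal pure-Python sketch of one common BM25 variant. It is an illustration only, not the code that runs inside `BM25Retriever`: the function name `bm25_scores`, the whitespace tokenizer, the smoothed IDF, and the parameter values `k1=1.5` and `b=0.75` are all assumptions chosen for readability.

```python
import math
from collections import Counter


def bm25_scores(query, corpus, k1=1.5, b=0.75):
    """Score every document in `corpus` against `query` with a BM25 variant."""
    # Naive tokenizer (an assumption): lowercase and split on whitespace.
    # Punctuation stays attached to words, e.g. "search." keeps its period.
    docs = [doc.lower().split() for doc in corpus]
    query_terms = query.lower().split()
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term.
    doc_freq = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        term_freq = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in term_freq:
                continue
            # Smoothed inverse document frequency: rare terms weigh more.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            tf = term_freq[term]
            # Saturating term frequency, normalized by document length.
            denom = tf + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf * (k1 + 1) / denom
        scores.append(score)
    return scores


corpus = [
    "LangChain helps build LLM applications.",
    "Pinecone is a vector database for semantic search.",
    "The Eiffel Tower is located in Paris.",
]
scores = bm25_scores("vector database", corpus)
best = max(range(len(corpus)), key=scores.__getitem__)
print(corpus[best])
```

Only the Pinecone sentence contains the query terms, so it is the only document with a non-zero score. A document that paraphrases the query without sharing a token would score zero, which is exactly the gap the dense retriever fills.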

In [3]:
# --- Step 3: Set up the Sparse Retriever ---
# This is the keyword search component.
sparse_retriever = BM25Retriever.from_documents(docs)
sparse_retriever.k = 3  # Set to retrieve the top 3 matching documents.

#### Combining Retrievers with `EnsembleRetriever`

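The `EnsembleRetriever` is a LangChain component that takes a list of retrievers and merges their ranked results using weighted Reciprocal Rank Fusion (RRF). RRF scores each document by its position in every result list rather than by the raw similarity scores, so dense and sparse results can be combined even though their scores live on different scales. This is how we implement the hybrid search formula, with the `weights` parameter playing the role of our **α** value: it sets how much each retriever contributes to the fused ranking.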
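
The fusion step can be sketched in a few lines of plain Python. This is a simplified illustration, not LangChain's source code: the function name `weighted_rrf`, the document IDs, and the two rankings are hypothetical, and the constant `c=60` is the value commonly used for RRF (we assume it matches the library default).

```python
from collections import defaultdict


def weighted_rrf(rankings, weights, c=60):
    """Fuse ranked lists of document IDs with weighted Reciprocal Rank Fusion."""
    if len(rankings) != len(weights):
        raise ValueError("Provide exactly one weight per ranking.")
    fused_scores = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            # Only the position in the list matters, not the raw score.
            fused_scores[doc_id] += weight / (rank + c)
    # Highest fused score first; every document found by any retriever is kept.
    return sorted(fused_scores, key=fused_scores.get, reverse=True)


# Hypothetical top-3 rankings, for illustration only.
dense_ranking = ["doc_a", "doc_b", "doc_c"]
sparse_ranking = ["doc_b", "doc_d", "doc_a"]

fused = weighted_rrf([dense_ranking, sparse_ranking], weights=[0.7, 0.3])
print(fused)
```

Two things are worth noticing. First, the fused list contains four documents even though each input list has three, because the result is the union of both lists. Second, the weights decide close calls: with `[0.7, 0.3]` the dense retriever's first choice wins, while swapping the weights to `[0.3, 0.7]` would put the sparse retriever's first choice on top.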

In [4]:
# --- Step 4: Create the Hybrid Retriever ---
hybrid_retriever = EnsembleRetriever(
    # A list of the retrievers we want to combine.
    retrievers=[dense_retriever, sparse_retriever],
    # The weights determine the balance between the retrievers,
    # in the same order as the `retrievers` list.
    # Here, we're giving more importance to the semantic (dense) search.
    # By convention the weights sum to 1.0 so they read as proportions.
    weights=[0.7, 0.3]
)

In [5]:
# --- Step 5: Query and Get Results ---
query = "How can I build an application using LLMs?"
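# When we invoke the hybrid retriever, it runs both dense and sparse searches,
# combines the results, and returns a single, re-ranked list.
# The list is the deduplicated union of both result sets, so it can hold
# more than k documents (here 4, although each retriever returns 3).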
results = hybrid_retriever.invoke(query)

print(f"Query: {query}")
for i, doc in enumerate(results):
    print(f"🔹 Document {i+1}: {doc.page_content}")

Query: How can I build an application using LLMs?
🔹 Document 1: LangChain helps build LLM applications.
🔹 Document 2: Langchain can be used to develop agentic ai application.
🔹 Document 3: Pinecone is a vector database for semantic search.
🔹 Document 4: Langchain has many types of retrievers.


### 2. RAG Pipeline with the Hybrid Retriever

The beauty of LangChain's modular design is that we can now drop our new `hybrid_retriever` into the same RAG chain structure we've used before with no other changes.
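
The swap works because the chain only needs something that maps a question to a list of documents. The sketch below mimics that contract in plain Python with stub components. It is a conceptual illustration under our own assumptions, not how `create_retrieval_chain` is implemented: `make_rag_chain`, `dense_stub`, `hybrid_stub`, and `stub_llm` are hypothetical names, and no real model is called.

```python
def make_rag_chain(retriever, llm):
    """Return a function that retrieves context, builds a prompt, and calls the LLM."""
    def run(question):
        context = retriever(question)
        prompt = "Context:\n" + "\n".join(context) + f"\nQuestion: {question}"
        # Same output keys as the response used later in this notebook.
        return {"input": question, "context": context, "answer": llm(prompt)}
    return run


# Hypothetical stand-ins for real retrievers: each returns a list of texts.
def dense_stub(question):
    return ["LangChain helps build LLM applications."]


def hybrid_stub(question):
    return [
        "LangChain helps build LLM applications.",
        "Langchain has many types of retrievers.",
    ]


# Hypothetical stand-in for an LLM: reports the prompt size instead of answering.
def stub_llm(prompt):
    return f"Stub answer from a {len(prompt)}-character prompt."


question = "How can I build an app using LLMs?"
dense_result = make_rag_chain(dense_stub, stub_llm)(question)
hybrid_result = make_rag_chain(hybrid_stub, stub_llm)(question)
print(len(dense_result["context"]), len(hybrid_result["context"]))
```

Only the `retriever` argument changes between the two chains. The prompt construction and the LLM call stay untouched, which is the property we rely on in the cells that follow.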

In [6]:
import os
from dotenv import load_dotenv
from langchain_groq import ChatGroq
from langchain.prompts import PromptTemplate
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains.retrieval import create_retrieval_chain

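# Load API keys from a local .env file into the environment.
load_dotenv()
# ChatGroq reads GROQ_API_KEY from the environment, so fail early with a
# clear message instead of assigning None to os.environ (a TypeError).
if not os.getenv("GROQ_API_KEY"):
    raise RuntimeError("GROQ_API_KEY is not set. Add it to your .env file.")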

In [10]:
# Initialize the LLM.
llm = ChatGroq(model="llama-3.1-8b-instant", temperature=0.2)

# Create the prompt template.
prompt = PromptTemplate.from_template(
    """You are an assistant for question-answering tasks. 
Use the following pieces of retrieved context to answer the question. 
If you don't know the answer, just say that you don't know. 
Keep the answer concise.
    Context: {context}
    Question: {input}
    """
)

In [8]:
# Create the two main components of the RAG chain.
document_chain = create_stuff_documents_chain(llm=llm, prompt=prompt)
rag_chain = create_retrieval_chain(
    retriever=hybrid_retriever, # Plug in our new hybrid retriever here!
    combine_docs_chain=document_chain
)

In [11]:
# Ask a question using the full RAG pipeline.
query = {"input": "How can I build an app using LLMs?"}
response = rag_chain.invoke(query)

# Print the results.
print("✅ Answer:", response["answer"])
print("\n📄 Source Documents:")
for i, doc in enumerate(response["context"]):
    print(f"Doc {i+1}: {doc.page_content}")

✅ Answer: To build an app using LLMs (Large Language Models), you can leverage LangChain, a powerful framework that helps you develop agentic AI applications. Here's a step-by-step guide to get you started:

1. **Choose a LangChain Retriever**: LangChain offers various types of retrievers, such as:
	* Embedding retriever: uses vector embeddings to retrieve relevant information.
	* SQL retriever: retrieves data from a SQL database.
	* HTTP retriever: retrieves data from an HTTP endpoint.
	* File retriever: retrieves data from a file.
	* Pinecone retriever: uses Pinecone, a vector database for semantic search.
2. **Select a LangChain Agent**: LangChain agents are the core components that interact with the retriever and generate responses. You can choose from:
	* Chain agent: a basic agent that chains multiple actions together.
	* LLM agent: uses a pre-trained LLM to generate responses.
	* Hybrid agent: combines multiple agents to create a more complex response.
3. **Design your App's Wor

### 🔑 Key Takeaways

* **Hybrid Search is Powerful**: It combines dense (semantic) and sparse (keyword) search to overcome their individual weaknesses, leading to more robust and relevant retrieval.
* **Dense vs. Sparse**: Dense retrieval understands meaning and context, while sparse retrieval excels at finding exact keywords, acronyms, and specific terms.
* **`BM25Retriever` for Keywords**: This is the LangChain component we used for sparse, keyword-based search. It ranks documents with the BM25 scoring function.
* **`EnsembleRetriever` for Combination**: This retriever is the key to implementing hybrid search in LangChain. It merges the ranked results of multiple retrievers using weighted Reciprocal Rank Fusion, with `weights` setting the balance between them.
* **Modular and Flexible**: A hybrid retriever can be used as a drop-in replacement for any other retriever in a LangChain RAG pipeline, demonstrating the framework's flexibility.