# Hybrid Retriever: Combining Dense and Sparse Retrievers

## Hybrid Retrieval: What It Is and Why It Matters

### What Is Hybrid Retrieval?

**Hybrid retrieval** is an information retrieval approach that combines:

- **Sparse (lexical) search**, e.g. BM25
- **Dense (semantic) search** using vector embeddings

The goal is to leverage the strengths of both methods to retrieve more relevant documents than either approach alone.

In practice, hybrid retrieval runs **both searches in parallel** and then **combines or re-ranks** the results using weighted scores or fusion techniques.
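
As a rough illustration of the "run both, then combine" flow, the sketch below runs two searches concurrently and merges their candidate lists. The corpus, `sparse_search`, and `dense_search` are hypothetical stand-ins (not part of any library); a real system would query a BM25 index and a vector index instead.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy corpus; both search functions below are hypothetical stand-ins
# for a real BM25 index and a real vector index.
CORPUS = {
    "d1": "LangChain helps build LLM applications",
    "d2": "Pinecone is a vector database for semantic search",
    "d3": "The Eiffel Tower is located in Paris",
}

def sparse_search(query: str) -> list[str]:
    """Stand-in lexical search: rank documents by the number of shared words."""
    q = set(query.lower().split())
    scored = [(len(q & set(text.lower().split())), doc_id) for doc_id, text in CORPUS.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in ranked if score > 0]

def dense_search(query: str) -> list[str]:
    """Stand-in semantic search: a fixed, made-up ranking for illustration."""
    return ["d1", "d2"]

def hybrid_candidates(query: str) -> list[str]:
    # Run both searches concurrently, then merge while keeping first-seen order.
    with ThreadPoolExecutor(max_workers=2) as pool:
        sparse_future = pool.submit(sparse_search, query)
        dense_future = pool.submit(dense_search, query)
        sparse_hits, dense_hits = sparse_future.result(), dense_future.result()
    return list(dict.fromkeys(dense_hits + sparse_hits))

print(hybrid_candidates("build LLM applications"))
```

This only produces a merged candidate set; how the candidates are scored and ordered is a separate fusion step.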

---

### Why Not Use Just One Method?

#### Sparse (BM25) Only
✅ Great for exact keyword matches  
❌ Weak with synonyms, paraphrasing, and natural language queries  

#### Dense (Semantic) Only
✅ Understands meaning and intent  
❌ Can miss rare or critical keywords  
❌ Less transparent scoring  

Hybrid retrieval addresses these limitations by using **both signals together**.

---

### Benefits of Hybrid Retrieval

#### 1. Improved Recall
- BM25 catches exact keyword matches
- Semantic search captures meaning even when wording differs  
**Result:** fewer relevant documents are missed

---

#### 2. Handles Synonyms and Rephrasing
- Semantic search matches:
  - “create app” → “build LLM system”
- BM25 still ensures exact terms like `"LLM"` or `"agent"` are respected
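
To see why a purely lexical matcher misses the paraphrase above, here is a minimal sketch that counts shared word tokens. The helper `shared_terms` is a hypothetical illustration, not how BM25 is implemented, but BM25 likewise scores only terms that appear in both the query and the document.

```python
def shared_terms(query: str, document: str) -> set[str]:
    """Return the lowercase word tokens that appear in both strings."""
    return set(query.lower().split()) & set(document.lower().split())

# Paraphrase: no tokens in common, so a lexical scorer has nothing to match on.
print(shared_terms("create app", "build LLM system"))

# Exact terms: "llm" and "agent" overlap, so a lexical scorer can rank this document.
print(shared_terms("LLM agent", "build an LLM agent system"))
```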

---

#### 3. More Robust to Query Styles
Supports:
- Short keyword queries: `"LangChain agent"`
- Natural language questions:  
  `"How do I use LangChain to talk to tools?"`

---

#### 4. Preserves Lexical Importance
- BM25 emphasizes **rare and critical terms**
- Essential in:
  - Medical
  - Legal
  - Technical domains  
Example: rare terms like `"osteoporosis"` should strongly influence ranking
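
The emphasis on rare terms comes from the inverse document frequency (IDF) component of BM25. The sketch below uses one common BM25 IDF variant, `ln((N - n + 0.5) / (n + 0.5) + 1)`; the four-document corpus is made up for illustration, and real libraries may use a slightly different formula.

```python
import math

def bm25_idf(term: str, tokenized_docs: list[list[str]]) -> float:
    """IDF in a common BM25 variant: ln((N - n + 0.5) / (n + 0.5) + 1)."""
    n_docs = len(tokenized_docs)
    n_containing = sum(term in doc for doc in tokenized_docs)
    return math.log((n_docs - n_containing + 0.5) / (n_containing + 0.5) + 1)

# Hypothetical corpus: "osteoporosis" appears once, "the" appears in every document.
toy_docs = [
    "the patient has osteoporosis".split(),
    "the patient reports pain".split(),
    "the treatment plan for the patient".split(),
    "the clinic opens at nine".split(),
]

print("osteoporosis:", round(bm25_idf("osteoporosis", toy_docs), 3))
print("the:", round(bm25_idf("the", toy_docs), 3))
```

The rare term receives a much larger weight than the ubiquitous one, so a match on it moves a document further up the ranking.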

---

#### 5. Adapts to Document Diversity
Works well across:
- Structured docs (APIs, specs)
- Unstructured text (blogs, PDFs, notes)

Hybrid retrieval adapts to varying writing styles and formats.

---

#### 6. Easy to Tune
You can control influence with weights:

```text
final_score = 0.7 * dense_score + 0.3 * sparse_score
```
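
Raw dense and sparse scores usually live on different scales (for example, cosine similarity versus unbounded BM25 scores), so a weighted sum like the one above is typically applied after normalization. The sketch below assumes min-max normalization and made-up scores; `min_max` and `weighted_fusion` are illustrative names, not a library API.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1]; if all scores are equal, map them to 0.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}

def weighted_fusion(dense, sparse, w_dense=0.7, w_sparse=0.3):
    """Combine normalized scores; a document missing from one list contributes 0 there."""
    dense_n, sparse_n = min_max(dense), min_max(sparse)
    fused = {
        doc_id: w_dense * dense_n.get(doc_id, 0.0) + w_sparse * sparse_n.get(doc_id, 0.0)
        for doc_id in set(dense_n) | set(sparse_n)
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)

# Hypothetical raw scores: cosine similarities (dense) and BM25 scores (sparse).
dense_scores = {"d1": 0.82, "d2": 0.40, "d3": 0.61}
sparse_scores = {"d1": 2.0, "d3": 6.0, "d4": 4.0}

for doc_id, score in weighted_fusion(dense_scores, sparse_scores):
    print(doc_id, round(score, 3))
```

Raising `w_sparse` favors exact keyword matches; raising `w_dense` favors semantic similarity.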

## When to Use BM25 + Semantic Search Together?

Hybrid retrieval (BM25 + semantic search) is most useful when queries, documents, or users vary in how precisely they express intent. Below are common use cases and why hybrid retrieval helps.

---

### Use Cases and Benefits

#### RAG Pipelines
**Why it helps:**  
Reduces the chance that the LLM answers from missing or irrelevant context, because both **exact keyword matches** (BM25) and **fuzzy semantic matches** (dense) are considered during retrieval.

---

#### Technical Documentation Search
**Why it helps:**  
Developers may search for *"how to use API"* while the documentation says *"API usage"*.  
Combining BM25 with semantic search lets both phrasings match, which tends to improve the hit rate.

---

#### Legal / Medical Question Answering
**Why it helps:**  
Some queries require **precise term matching** (BM25), while others rely on **general semantic understanding** (dense embeddings).  
Hybrid retrieval covers both cases in one pipeline.

---

#### E-commerce / Product Search
**Why it helps:**  
Queries like *"cheap noise-canceling headphones"* can match  
*"affordable ANC earbuds"* via semantic search, while BM25 confirms critical terms like `"ANC"`.

---

#### Multilingual or Cross-Lingual Retrieval
**Why it helps:**  
Semantic models bridge language differences, while BM25 ensures exact matching when queries and documents share the same language.

---

#### Customer Support
**Why it helps:**  
Real users often type vague, keyword-heavy, or inconsistent queries.  
Hybrid retrieval improves reliability in chatbots and FAQ systems.

---



In [1]:
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.retrievers import BM25Retriever
from langchain_classic.retrievers import EnsembleRetriever
from langchain_core.documents import Document


In [2]:
# Step 1: Sample Documents
docs = [
    Document(page_content="Langchain helps build LLM applications."),
    Document(page_content="Pinecone is a vector database for semantic search."),
    Document(page_content="The Eiffel Tower is located in Paris."),
    Document(page_content="Langchain can be used to develop agentic ai application."),
    Document(page_content="Langchain has many types of retrievers."),
]

# Step 2: Dense Retriever (FAISS + HuggingFace)
embedding_model = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
dense_vector_store = FAISS.from_documents(docs, embedding_model)
dense_retriever = dense_vector_store.as_retriever()



In [4]:
# Step 3: Sparse Retriever (BM25)
sparse_retriever = BM25Retriever.from_documents(docs)
sparse_retriever.k = 3  # number of top-ranked documents to return

# Step 4: Combine both with EnsembleRetriever
hybrid_retriever = EnsembleRetriever(
    retrievers=[dense_retriever, sparse_retriever],
    weights=[0.7, 0.3]
)

In [5]:
hybrid_retriever

EnsembleRetriever(retrievers=[VectorStoreRetriever(tags=['FAISS', 'HuggingFaceEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x000002136DB47140>, search_kwargs={}), BM25Retriever(vectorizer=<rank_bm25.BM25Okapi object at 0x0000021323DAC140>, k=3)], weights=[0.7, 0.3])
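
To my understanding, `EnsembleRetriever` fuses results by rank rather than by raw score, using a weighted form of Reciprocal Rank Fusion (RRF). The sketch below is a pure-Python approximation of that idea, not the library's code: the constant `c = 60`, the document IDs, and both rankings are assumptions chosen for illustration.

```python
from collections import defaultdict

def weighted_rrf(rankings: list[list[str]], weights: list[float], c: int = 60) -> list[str]:
    """Fuse ranked lists: each list adds weight / (rank + c) to a document's score."""
    scores: dict[str, float] = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (rank + c)
    # Documents found by both retrievers accumulate two contributions.
    return sorted(scores, key=scores.get, reverse=True)

dense_ranking = ["A", "B", "C", "D"]   # hypothetical dense results (4 hits)
sparse_ranking = ["A", "C", "B"]       # hypothetical BM25 results (k=3)

print(weighted_rrf([dense_ranking, sparse_ranking], weights=[0.7, 0.3]))
```

Because the fused list is the de-duplicated union of both result lists, it can contain more documents than either retriever's own `k`.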

In [None]:
# Step 5: Query and get results
query = "How can I build an application using LLM?"
results = hybrid_retriever.invoke(query)

# Step 6: Print results
for i, doc in enumerate(results):
    print(f"\n üîπ Document {i+1}: \n {doc.page_content}")



 🔹 Document 1: 
 Langchain helps build LLM applications.

 🔹 Document 2: 
 Langchain can be used to develop agentic ai application.

 🔹 Document 3: 
 Langchain has many types of retrievers.

 🔹 Document 4: 
 Pinecone is a vector database for semantic search.


### RAG Pipeline with Hybrid Retriever

In [10]:
from langchain.chat_models import init_chat_model
from langchain_core.prompts import PromptTemplate
from langchain_classic.chains.combine_documents import create_stuff_documents_chain
from langchain_classic.chains.retrieval import create_retrieval_chain

In [11]:
# Step 5: Prompt Template
prompt = PromptTemplate.from_template(
    """Answer the question based on the context below.
    Context:{context}
    Question: {input}                                      
""")

# Step 6: LLM
llm = init_chat_model(
    model="groq:openai/gpt-oss-20b",
    temperature=0.4,
)

In [None]:
# Step 7: Create the stuff-documents chain
document_chain = create_stuff_documents_chain(llm=llm, prompt=prompt)

# Step 8: Create the full RAG chain
rag_chain = create_retrieval_chain(retriever=hybrid_retriever, combine_docs_chain=document_chain)
rag_chain

RunnableBinding(bound=RunnableAssign(mapper={
  context: RunnableBinding(bound=RunnableLambda(lambda x: x['input'])
           | EnsembleRetriever(retrievers=[VectorStoreRetriever(tags=['FAISS', 'HuggingFaceEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x000002136DB47140>, search_kwargs={}), BM25Retriever(vectorizer=<rank_bm25.BM25Okapi object at 0x0000021323DAC140>, k=3)], weights=[0.7, 0.3]), kwargs={}, config={'run_name': 'retrieve_documents'}, config_factories=[])
})
| RunnableAssign(mapper={
    answer: RunnableBinding(bound=RunnableBinding(bound=RunnableAssign(mapper={
              context: RunnableLambda(format_docs)
            }), kwargs={}, config={'run_name': 'format_inputs'}, config_factories=[])
            | PromptTemplate(input_variables=['context', 'input'], input_types={}, partial_variables={}, template='Answer the question based on the context below.\n    Context:{context}\n    Question: {input}                                     
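
Conceptually, the "stuff" step in the chain above concatenates the retrieved documents into a single context string and substitutes it into the prompt. The sketch below mimics that with plain string formatting; it is an approximation under assumptions (the `"\n\n"` separator and the sample texts are illustrative), not the chain's actual implementation.

```python
# Hypothetical retrieved texts, standing in for Document.page_content values.
retrieved = [
    "Langchain helps build LLM applications.",
    "Langchain can be used to develop agentic ai application.",
]

template = (
    "Answer the question based on the context below.\n"
    "Context:{context}\n"
    "Question: {input}\n"
)

def stuff_prompt(texts: list[str], question: str, separator: str = "\n\n") -> str:
    """Join all retrieved texts into one context string and fill the template."""
    return template.format(context=separator.join(texts), input=question)

print(stuff_prompt(retrieved, "How can I build an app using LLMs?"))
```

Since every retrieved document is placed in the prompt, the number of documents the retriever returns directly affects prompt length.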

In [13]:
# Step 9: Ask a question
query = {"input": "How can I build an app using LLMs?"}
response = rag_chain.invoke(query)

# Step 10: Output
print("✔️ Answer: \n", response["answer"])

print("\n📃 Source Documents: ")
for i, doc in enumerate(response["context"]):
    print(f"\n Doc {i+1}: {doc.page_content}")

✔️ Answer: 
 **Building an LLM-powered app with LangChain (and optional Pinecone)**  
Below is a practical, step-by-step guide that shows how you can turn the concepts in the context into a working application.  
Feel free to skip or expand any section depending on your skill level and the complexity you want.

---

## 1. Define the Problem & Scope

| Question | Example |
|----------|---------|
| What is the app’s purpose? | “A chatbot that answers product-related questions.” |
| Who are the users? | Customer support agents. |
| What data do we need? | Product FAQs, manuals, support tickets. |
| What LLM do we want to use? | OpenAI GPT-4, Anthropic Claude, or a local Llama-2. |

Having a clear problem statement lets you pick the right tools and architecture.

---

## 2. Set Up the Development Environment

```bash
# Create a virtual environment
python -m venv llm_app
source llm_app/bin/activate   # Windows: llm_app\Scripts\activate

# Install core libraries
pip ins