## Hybrid Retriever: Combining Dense and Sparse Retrievers

In [14]:
from langchain_core.documents import Document
from langchain_community.vectorstores import FAISS
from langchain_community.retrievers import BM25Retriever
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_classic.retrievers import EnsembleRetriever

In [32]:
# Step 1: Sample documents
docs = [
    Document(page_content="LangChain helps build LLM applications."),
    Document(page_content="Pinecone is a vector database for semantic search."),
    Document(page_content="The Eiffel Tower is located in Paris."),
    Document(page_content="Langchain can be used to develop agentic ai application."),
    Document(page_content="Langchain has many types of retrievers.")
]

# Step 2: Dense Retriever (FAISS + HuggingFace)
embedding_model = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
dense_vectorstore = FAISS.from_documents(docs, embedding_model)
# Return only the top-1 dense match (the sparse retriever below uses k=3)
dense_retriever = dense_vectorstore.as_retriever(search_kwargs={"k":1})

**Why combine dense + sparse retrievers?**
- BM25 (sparse) excels at exact keyword matches; dense vectors capture semantic meaning.
- Blending both with an ensemble reduces missed results: sparse catches exact terms, dense handles paraphrases.
- We weight dense a bit higher (`0.7` vs `0.3`) because semantic similarity matters more for this example, while still keeping keyword recall.
----
- `BM25Retriever` (sparse): scores documents by exact/keyword overlap using BM25 (term frequency, inverse doc frequency, length normalization). Great for precision on literal matches; weak on paraphrases.
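- `EnsembleRetriever`: wrapper that runs multiple retrievers (here dense + BM25) and merges their ranked results using weighted Reciprocal Rank Fusion, with weights you choose. Because it fuses ranks rather than raw scores, FAISS distances and BM25 scores never need to share a scale. This adds semantic coverage from dense vectors while keeping exact-match results from BM25.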
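
The BM25 scoring described in the list above (term frequency, inverse document frequency, length normalization) can be sketched in plain Python. This is an illustration under stated assumptions, not the code `BM25Retriever` runs: the lowercase regex tokenizer, the non-negative IDF variant, and the `k1`/`b` values are all choices made for this example, and the underlying `rank_bm25` library may differ in each of them.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase word tokens. The retriever's own preprocessing may differ.
    return re.findall(r"\w+", text.lower())


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Score every document against the query with an Okapi-style BM25 formula."""
    corpus = [tokenize(doc) for doc in documents]
    n_docs = len(corpus)
    avg_len = sum(len(doc) for doc in corpus) / n_docs
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in corpus for term in set(doc))

    scores = []
    for doc in corpus:
        term_counts = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            tf = term_counts.get(term, 0)
            if tf == 0:
                continue
            # Assumption: a non-negative IDF variant; rarer terms weigh more.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Length normalization: long documents need more hits for the same score.
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


documents = [
    "LangChain helps build LLM applications.",
    "Pinecone is a vector database for semantic search.",
    "The Eiffel Tower is located in Paris.",
    "Langchain can be used to develop agentic ai application.",
    "Langchain has many types of retrievers.",
]
scores = bm25_scores("Where is the Eiffel Tower located?", documents)
best = max(range(len(scores)), key=scores.__getitem__)
print(documents[best])  # The Eiffel Tower is located in Paris.
```

Documents that share no term with the query score exactly zero, which is why BM25 alone struggles with paraphrases.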

In [39]:
# Step 3: Sparse Retriever (BM25)
sparse_retriever = BM25Retriever.from_documents(docs)
sparse_retriever.k = 3  # number of top-k documents to retrieve

# Step 4: Combine with Ensemble Retriever
# EnsembleRetriever has no top_k field (the repr below shows none), so the number
# of results is governed by each underlying retriever's own k.
hybrid_retriever = EnsembleRetriever(
    retrievers=[dense_retriever, sparse_retriever],
    weights=[0.7, 0.3],
)

In [40]:
hybrid_retriever

EnsembleRetriever(retrievers=[VectorStoreRetriever(tags=['FAISS', 'HuggingFaceEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x17bf28320>, search_kwargs={'k': 1}), BM25Retriever(vectorizer=<rank_bm25.BM25Okapi object at 0x17f56d6d0>, k=3)], weights=[0.7, 0.3])

In [48]:
# Step 5: Query and get results
query = "Where is the Eiffel Tower located?"
results = hybrid_retriever.invoke(query)

# Step 6: Print results
for i, doc in enumerate(results):
    print(f"\n🔹 Document {i+1}:\n{doc.page_content}")


🔹 Document 1:
The Eiffel Tower is located in Paris.

🔹 Document 2:
Pinecone is a vector database for semantic search.

🔹 Document 3:
Langchain has many types of retrievers.
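
The ordering above is consistent with weighted Reciprocal Rank Fusion, which is how `EnsembleRetriever` merges results. The sketch below is a simplified illustration, not the library's code: the constant `c=60` and the short document labels are assumptions made for this example. The dense retriever contributes one document (`k=1`) and the sparse retriever contributes three (`k=3`).

```python
from collections import defaultdict


def weighted_rrf(ranked_lists, weights, c=60):
    """Fuse several ranked lists: each hit adds weight / (rank + c) to its document."""
    scores = defaultdict(float)
    for ranking, weight in zip(ranked_lists, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (rank + c)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical labels standing in for the retrieved documents.
dense_ranking = ["eiffel"]                            # dense retriever, k=1
sparse_ranking = ["eiffel", "pinecone", "retrievers"]  # sparse retriever, k=3

fused = weighted_rrf([dense_ranking, sparse_ranking], weights=[0.7, 0.3])
print(fused)  # ['eiffel', 'pinecone', 'retrievers']
```

A document found by both retrievers accumulates score from each list, so it rises to the top, while documents found by only one retriever still appear in the merged result.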


### RAG Pipeline with hybrid retriever

In [49]:
from langchain.chat_models import init_chat_model
from langchain_core.prompts import PromptTemplate
from langchain_classic.chains.combine_documents import create_stuff_documents_chain
from langchain_classic.chains.retrieval import create_retrieval_chain

In [54]:
# Step 7: Prompt template
prompt = PromptTemplate.from_template(
    """
        Answer the question based on the context below.

        Context:
        {context}

        Question: {input}
    """
)

# Step 8: LLM
llm = init_chat_model("openai:gpt-4o-mini", temperature=0.2)
llm

ChatOpenAI(profile={'max_input_tokens': 128000, 'max_output_tokens': 16384, 'image_inputs': True, 'audio_inputs': False, 'video_inputs': False, 'image_outputs': False, 'audio_outputs': False, 'video_outputs': False, 'reasoning_output': False, 'tool_calling': True, 'structured_output': True, 'image_url_inputs': True, 'pdf_inputs': True, 'pdf_tool_message': True, 'image_tool_message': True, 'tool_choice': True}, client=<openai.resources.chat.completions.completions.Completions object at 0x17edcabd0>, async_client=<openai.resources.chat.completions.completions.AsyncCompletions object at 0x17edc84a0>, root_client=<openai.OpenAI object at 0x17edcbef0>, root_async_client=<openai.AsyncOpenAI object at 0x17edca4e0>, model_name='gpt-4o-mini', temperature=0.2, model_kwargs={}, openai_api_key=SecretStr('**********'), stream_usage=True)

In [57]:
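# Create the stuff-documents chain: it places all retrieved documents into {context}
document_chain = create_stuff_documents_chain(llm=llm, prompt=prompt)

# Create the full RAG chain: retrieve with the hybrid retriever, then answer
rag_chain = create_retrieval_chain(retriever=hybrid_retriever,
                                   combine_docs_chain=document_chain)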
rag_chain

RunnableBinding(bound=RunnableAssign(mapper={
  context: RunnableBinding(bound=RunnableLambda(lambda x: x['input'])
           | EnsembleRetriever(retrievers=[VectorStoreRetriever(tags=['FAISS', 'HuggingFaceEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x17bf28320>, search_kwargs={'k': 1}), BM25Retriever(vectorizer=<rank_bm25.BM25Okapi object at 0x17f56d6d0>, k=3)], weights=[0.7, 0.3]), kwargs={}, config={'run_name': 'retrieve_documents'}, config_factories=[])
})
| RunnableAssign(mapper={
    answer: RunnableBinding(bound=RunnableBinding(bound=RunnableAssign(mapper={
              context: RunnableLambda(format_docs)
            }), kwargs={}, config={'run_name': 'format_inputs'}, config_factories=[])
            | PromptTemplate(input_variables=['context', 'input'], input_types={}, partial_variables={}, template='\n        Answer the question based on the context below.\n\n        Context:\n        {context}\n\n        Question: {input}\n    ')
    

In [59]:
# Step 9: Ask a question
query = {"input": "Where is the Eiffel Tower located and how tall is it?"}
response = rag_chain.invoke(query)

# Step 10: Output
print("✅ Answer:\n", response["answer"])

print("\n📄 Source Documents:")
for i, doc in enumerate(response["context"]):
    print(f"\nDoc {i+1}: {doc.page_content}")

✅ Answer:
 The Eiffel Tower is located in Paris. However, the context does not provide information about its height.

📄 Source Documents:

Doc 1: The Eiffel Tower is located in Paris.

Doc 2: Pinecone is a vector database for semantic search.

Doc 3: Langchain has many types of retrievers.
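
The answer above follows from what the model was shown. A "stuff" chain simply concatenates the retrieved documents into the prompt's `{context}` slot. The sketch below is a rough illustration of that idea, not the LangChain implementation: the helper `stuff_prompt` and the `"\n\n"` separator are assumptions made for this example.

```python
def stuff_prompt(template, doc_texts, question, separator="\n\n"):
    """Join all retrieved texts into one context string and fill the template."""
    context = separator.join(doc_texts)
    return template.format(context=context, input=question)


template = (
    "Answer the question based on the context below.\n\n"
    "Context:\n{context}\n\n"
    "Question: {input}"
)

# The three source documents returned by the hybrid retriever.
retrieved = [
    "The Eiffel Tower is located in Paris.",
    "Pinecone is a vector database for semantic search.",
    "Langchain has many types of retrievers.",
]

prompt_text = stuff_prompt(
    template, retrieved, "Where is the Eiffel Tower located and how tall is it?"
)
print(prompt_text)
```

Nothing in the stuffed context mentions the tower's height, so a model that stays within the context can answer only the location part of the question.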
