### Multi-query Retriever

Sometimes a single query might not capture all the ways information is phrased in your documents.

##### For example:
Query: "How can I stay healthy?"

Could mean:
* What should I eat?
* How often should I exercise?
* How can I manage stress?

A simple similarity search might miss documents that cover those topics but never use the word "healthy."

The multi-query retriever addresses this. It:

1. Takes your original query
2. Uses an LLM to generate several alternative phrasings of that query, each approaching it from a different angle
3. Performs retrieval for each sub-query
4. Combines and deduplicates the results
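
The four steps above can be sketched in plain Python, without LangChain. This is only an illustration under stated assumptions: `fake_llm_variants` stands in for the LLM call, `keyword_retrieve` and `TOY_INDEX` stand in for a real vector search, and all three names are hypothetical.

```python
from typing import Callable, Dict, List


def multi_query_retrieve(
    query: str,
    generate_variants: Callable[[str], List[str]],
    retrieve: Callable[[str], List[str]],
) -> List[str]:
    """Retrieve for each query variant and return the de-duplicated union."""
    variants = generate_variants(query)      # step 2: rephrase the query
    seen = set()
    merged: List[str] = []
    for variant in variants:
        for doc in retrieve(variant):        # step 3: retrieve per sub-query
            if doc not in seen:              # step 4: combine and deduplicate
                seen.add(doc)
                merged.append(doc)
    return merged


# Hypothetical stand-in for the LLM: returns fixed rephrasings.
def fake_llm_variants(query: str) -> List[str]:
    return [
        "What should I eat?",
        "How often should I exercise?",
        "How can I manage stress?",
    ]


# Hypothetical stand-in for a vector store: keyword -> matching documents.
TOY_INDEX: Dict[str, List[str]] = {
    "eat": ["Leafy greens and fruits support longevity.", "Water helps maintain energy."],
    "exercise": ["Regular walking boosts heart health.", "Water helps maintain energy."],
    "stress": ["Controlled breathing lowers cortisol."],
}


def keyword_retrieve(query: str) -> List[str]:
    lowered = query.lower()
    return [doc for keyword, docs in TOY_INDEX.items() if keyword in lowered for doc in docs]


results = multi_query_retrieve("How can I stay healthy?", fake_llm_variants, keyword_retrieve)
for doc in results:
    print(doc)
```

The water document matches two sub-queries but appears only once in `results`, which is the deduplication in step 4.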

In [2]:
from langchain_community.vectorstores import FAISS
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_core.documents import Document
from langchain_groq import ChatGroq
from langchain.retrievers.multi_query import MultiQueryRetriever
from dotenv import load_dotenv
load_dotenv()

True

##### Sample documents: health & wellness plus unrelated distractors

In [3]:
all_docs = [
    Document(page_content="Regular walking boosts heart health and can reduce symptoms of depression.", metadata={"source": "H1"}),
    Document(page_content="Consuming leafy greens and fruits helps detox the body and improve longevity.", metadata={"source": "H2"}),
    Document(page_content="Deep sleep is crucial for cellular repair and emotional regulation.", metadata={"source": "H3"}),
    Document(page_content="Mindfulness and controlled breathing lower cortisol and improve mental clarity.", metadata={"source": "H4"}),
    Document(page_content="Drinking sufficient water throughout the day helps maintain metabolism and energy.", metadata={"source": "H5"}),
    Document(page_content="The solar energy system in modern homes helps balance electricity demand.", metadata={"source": "I1"}),
    Document(page_content="Python balances readability with power, making it a popular system design language.", metadata={"source": "I2"}),
    Document(page_content="Photosynthesis enables plants to produce energy by converting sunlight.", metadata={"source": "I3"}),
    Document(page_content="The 2022 FIFA World Cup was held in Qatar and drew global energy and excitement.", metadata={"source": "I4"}),
    Document(page_content="Black holes bend spacetime and store immense gravitational energy.", metadata={"source": "I5"}),
]

##### Initialize Google GenAI embeddings

In [4]:
embedding_model = GoogleGenerativeAIEmbeddings(model="models/embedding-001")

##### Create FAISS vector store

In [5]:
vectorstore = FAISS.from_documents(documents=all_docs, embedding=embedding_model)

##### Create retrievers

In [9]:
similarity_retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 5})

.as_retriever():
* This method wraps the VectorStore object in a VectorStoreRetriever. A retriever is a LangChain component responsible for fetching relevant documents for a given query.

search_type="similarity":
* This argument specifies the type of search to be performed by the retriever. "similarity" indicates that the retriever will find documents most similar to the query based on their vector embeddings (e.g., using cosine similarity). Other common search_type options include "mmr" (Maximal Marginal Relevance) for diversity-aware retrieval and "similarity_score_threshold" to filter results based on a relevance score.

search_kwargs={"k": 5}:
* This argument provides keyword arguments to pass to the underlying search function of the VectorStore. Here, {"k": 5} limits the results to the 5 most similar documents.
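
What a top-k similarity search does can be shown without a vector store. The sketch below is a toy under loud assumptions: the 2-D vectors are made up (real embeddings have many more dimensions), cosine similarity is used as the metric (the metric a real index uses depends on how it is configured), and `cosine_similarity` and `top_k` are hypothetical helper names, not LangChain functions.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: List[float], doc_vecs: Dict[str, List[float]], k: int) -> List[Tuple[str, float]]:
    """Score every document against the query and keep the k best."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up vectors keyed by the "source" labels used in this notebook.
toy_vectors = {"H5": [0.9, 0.1], "H2": [0.7, 0.3], "I5": [0.1, 0.9]}
query_vector = [1.0, 0.0]

top_two = top_k(query_vector, toy_vectors, k=2)
print([name for name, _ in top_two])
```

Raising `k` returns more documents, including lower-scoring ones, which is why an unrelated document can appear near the end of a top-5 list.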

In [10]:
multiquery_retriever = MultiQueryRetriever.from_llm(
    retriever=vectorstore.as_retriever(search_kwargs={"k": 5}),
    llm=ChatGroq(model="qwen/qwen3-32b")
)
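
To see which alternative queries the LLM generated, one option is to raise the log level of the module that defines MultiQueryRetriever. This is a minimal sketch. It assumes the retriever was imported from langchain.retrievers.multi_query, as in the import cell, so the logger name matches that module path. Other package layouts may use a different name.

```python
import logging

# Assumption: the logger is named after the module the retriever was imported from.
RETRIEVER_LOGGER_NAME = "langchain.retrievers.multi_query"

logging.basicConfig()  # attach a default handler so log records are printed
logging.getLogger(RETRIEVER_LOGGER_NAME).setLevel(logging.INFO)
```

With this in place, later calls to multiquery_retriever.invoke(query) should log the generated queries before retrieval runs, which helps explain why a given document was returned.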

In [14]:
# Query
query = "How to improve energy levels and maintain balance in body?"

##### Retrieve results

In [15]:
similarity_results = similarity_retriever.invoke(query)
multiquery_results = multiquery_retriever.invoke(query)

In [16]:
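# Results from the plain similarity retriever
for i, doc in enumerate(similarity_results):
    print(f"\n--- Result {i+1} ---")
    print(doc.page_content)

# Separator between the two result sets
print("*"*150)

# Results from the multi-query retriever
for i, doc in enumerate(multiquery_results):
    print(f"\n--- Result {i+1} ---")
    print(doc.page_content)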


--- Result 1 ---
Drinking sufficient water throughout the day helps maintain metabolism and energy.

--- Result 2 ---
Consuming leafy greens and fruits helps detox the body and improve longevity.

--- Result 3 ---
Mindfulness and controlled breathing lower cortisol and improve mental clarity.

--- Result 4 ---
Deep sleep is crucial for cellular repair and emotional regulation.

--- Result 5 ---
Photosynthesis enables plants to produce energy by converting sunlight.
******************************************************************************************************************************************************

--- Result 1 ---
Black holes bend spacetime and store immense gravitational energy.

--- Result 2 ---
Drinking sufficient water throughout the day helps maintain metabolism and energy.

--- Result 3 ---
Python balances readability with power, making it a popular system design language.

--- Result 4 ---
Mindfulness and controlled breathing lower cortisol and improve mental c