# Multi-Query Retriever

In [1]:
# Build a sample vectorDB
from langchain_chroma import Chroma
from langchain_community.document_loaders import WebBaseLoader
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load blog post
loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
data = loader.load()

# Split
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
splits = text_splitter.split_documents(data)

# VectorDB
embedding = OpenAIEmbeddings()
vectordb = Chroma.from_documents(documents=splits, embedding=embedding)



In [2]:
from langchain.retrievers.multi_query import MultiQueryRetriever
from langchain_openai import ChatOpenAI

question = "What are the approaches to Task Decomposition?"
llm = ChatOpenAI(model="gpt-4.1-nano", temperature=0)
retriever_from_llm = MultiQueryRetriever.from_llm(
    retriever=vectordb.as_retriever(), llm=llm
)

In [3]:
# Set logging for the queries
import logging

logging.basicConfig()
logging.getLogger("langchain.retrievers.multi_query").setLevel(logging.INFO)

In [4]:
unique_docs = retriever_from_llm.invoke(question)
len(unique_docs)

INFO:langchain.retrievers.multi_query:Generated queries: ['1. What are the different methods used for breaking down complex tasks into smaller components?  ', '2. How can tasks be systematically decomposed to improve understanding and execution?  ', '3. What techniques exist for dividing large tasks into manageable sub-tasks in project planning?']


6
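The count above is lower than the total number of hits across the three generated queries, because a chunk retrieved by more than one query is kept only once. Below is a minimal sketch of that merge-and-deduplicate step in plain Python. Strings stand in for documents, and `dedupe_preserving_order` and the sample data are hypothetical names for illustration, not the library's internals.

```python
from typing import Iterable, List


def dedupe_preserving_order(batches: Iterable[List[str]]) -> List[str]:
    """Flatten per-query result lists, keeping the first occurrence of each item."""
    seen = set()
    merged = []
    for batch in batches:
        for doc in batch:
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged


# Hypothetical results for three generated queries (strings stand in for documents).
results_per_query = [
    ["chunk A", "chunk B"],
    ["chunk B", "chunk C"],
    ["chunk A", "chunk D"],
]
merged = dedupe_preserving_order(results_per_query)
print(len(merged))
```

Six raw hits collapse to four unique chunks here, which mirrors how overlapping queries yield fewer documents than the sum of their individual results.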

In [5]:
for i, doc in enumerate(unique_docs, start=1):
    print(f"Document {i}:")
    print(doc.page_content)
    print()

Document 1:
Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.

Document 2:
Component One: Planning#
A complicated task usually involves many steps. An agent needs to know what they are and plan ahead.
Task Decomposition#

Document 3:
Planning

Subgoal and decomposition: The agent breaks down large tasks into smaller, manageable subgoals, enabling efficient handling of complex tasks.
Reflection and refinement: The agent can do self-criticism and self-reflection over past actions, learn from mistakes and refine them for future steps, thereby improving the quality of final results.


Memory

Document 4:
Chain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utiliz

# Customize

In [6]:
from typing import List

from langchain_core.output_parsers import BaseOutputParser
from langchain_core.prompts import PromptTemplate


# Output parser will split the LLM result into a list of queries
class LineListOutputParser(BaseOutputParser[List[str]]):
    """Output parser for a list of lines."""

    def parse(self, text: str) -> List[str]:
        lines = text.strip().split("\n")
        return list(filter(None, lines))  # Remove empty lines


output_parser = LineListOutputParser()

QUERY_PROMPT = PromptTemplate(
    input_variables=["question"],
    template="""You are an AI language model assistant. Your task is to generate 10 
    different versions of the given user question to retrieve relevant documents from a vector 
    database. By generating multiple perspectives on the user question, your goal is to help
    the user overcome some of the limitations of the distance-based similarity search. 
    Provide these alternative questions separated by newlines.
    Original question: {question}""",
)
llm = ChatOpenAI(model="gpt-4.1-nano", temperature=0)

# Chain
llm_chain = QUERY_PROMPT | llm | output_parser

# Other inputs
question = "What are the approaches to Task Decomposition?"

In [7]:
llm_chain.invoke({"question": question})

['1. How can tasks be broken down into smaller components or sub-tasks?',
 '2. What methods are used for decomposing complex tasks into manageable parts?',
 '3. What strategies exist for dividing a task into simpler, more focused steps?',
 '4. How do different approaches facilitate the decomposition of tasks in project management?',
 '5. What techniques are available for analyzing and splitting tasks into sub-tasks?',
 '6. How can task decomposition improve workflow efficiency and clarity?',
 '7. What are common frameworks or models for breaking down tasks in problem-solving?',
 '8. How do various approaches to task decomposition differ across industries or disciplines?',
 '9. What role does task decomposition play in AI and automation processes?',
 '10. What are the best practices for systematically decomposing tasks in software development?']
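The parsed queries above still carry the model's list numbering (`1. `, `2. `, ...), and the logged queries earlier also show trailing spaces. Whether that affects retrieval is not established here, but if cleaner query strings are wanted, the body of `parse` could strip them. A minimal sketch as a standalone function, assuming markers of the form `N.` or `N)`; `clean_query_lines` is a hypothetical helper, not part of LangChain.

```python
import re
from typing import List

# Matches a leading list marker such as "1. " or "10) ".
_LIST_MARKER = re.compile(r"^\s*\d+[.)]\s*")


def clean_query_lines(text: str) -> List[str]:
    """Split text into lines, drop leading list markers, and skip empty lines."""
    lines = (_LIST_MARKER.sub("", line).strip() for line in text.splitlines())
    return [line for line in lines if line]


sample = "1. How can tasks be broken down?  \n\n10. What are best practices?"
print(clean_query_lines(sample))
```

The same logic could replace the two lines inside `LineListOutputParser.parse`; it is kept separate here so it can be tried without an LLM call.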

In [19]:
# Run
retriever = MultiQueryRetriever(
    retriever=vectordb.as_retriever(), llm_chain=llm_chain
)  # llm_chain already returns a list of query strings; the deprecated parser_key argument is not needed

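# Results
# Note: this question is unrelated to the indexed blog post, so the chunks
# returned below are only the nearest neighbours, not relevant answers.
unique_docs = retriever.invoke("What does the course say about regression?")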
len(unique_docs)

INFO:langchain.retrievers.multi_query:Generated queries: ['1. How is regression explained in the course material?  ', '2. What topics related to regression are covered in the course?  ', "3. Can you summarize the course's discussion on regression analysis?  ", '4. What insights does the course provide about different types of regression?  ', '5. How does the course approach teaching regression techniques?  ', '6. What are the key points the course makes about regression models?  ', '7. Does the course include practical examples of regression?  ', "8. What is the course's perspective on the importance of regression in data analysis?  ", '9. How does the course differentiate between various regression methods?  ', '10. What does the course say about the applications and limitations of regression?']


12

In [20]:
for i, doc in enumerate(unique_docs, start=1):
    print(f"Document {i}:")
    print(doc.page_content)
    print()

Document 1:
They did an experiment on fine-tuning LLM to call a calculator, using arithmetic as a test case. Their experiments showed that it was harder to solve verbal math problems than explicitly stated math problems because LLMs (7B Jurassic1-large model) failed to extract the right arguments for the basic arithmetic reliably. The results highlight when the external symbolic tools can work reliably, knowing when to and how to use the tools are crucial, determined by the LLM capability.

Document 2:
\dots \geq r_1$ The process is supervised fine-tuning where the data is a sequence in the form of $\tau_h = (x, z_i, y_i, z_j, y_j, \dots, z_n, y_n)$, where $\leq i \leq j \leq n$. The model is finetuned to only predict $y_n$ where conditioned on the sequence prefix, such that the model can self-reflect to produce better output based on the feedback sequence. The model can optionally receive multiple rounds of instructions with human annotators at test time.

Document 3:
After fine-tunin