# 🧠 Query Decomposition

Query decomposition is the process of taking a complex, multi-part question and breaking it into simpler, atomic sub-questions that can each be retrieved and answered individually.

#### Why use Query Decomposition?
- Complex queries often involve multiple concepts
- LLMs or retrievers may miss parts of the original question
- It enables multi-hop reasoning (answering in steps)
- It allows parallelism, since independent sub-questions can be handled concurrently (especially in multi-agent frameworks)

In [1]:
from langchain.chat_models import init_chat_model
from langchain_core.prompts import PromptTemplate
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_classic.chains.combine_documents import create_stuff_documents_chain
from langchain_core.runnables import RunnableSequence

In [2]:
# Step 1: Load documents
loader=TextLoader("langchain_crewai_dataset.txt")
docs=loader.load()

splitter=RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=50)
chunks=splitter.split_documents(docs)

embedding_model=HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
vectorstore=FAISS.from_documents(chunks,embedding_model)
retriever=vectorstore.as_retriever(search_type="mmr", search_kwargs={"k":4, "lambda_mult":0.7})


In [None]:
# Step 2: Initialize Groq LLM

import os
from dotenv import load_dotenv
load_dotenv()

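# Fail early with a clear message if the key is missing
# (assigning None to os.environ raises a TypeError)
groq_api_key=os.getenv("GROQ_API_KEY")
if not groq_api_key:
    raise RuntimeError("GROQ_API_KEY is not set; add it to your .env file")
os.environ["GROQ_API_KEY"]=groq_api_key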
llm=init_chat_model(model="groq:llama-3.1-8b-instant")


In [4]:
# Step 3: Query decomposition
decomposition_prompt=PromptTemplate.from_template("""
You are an AI assistant. Decompose the following question into 2 to 4 similar sub-questions for better
document retrieval.

Question: "{question}"

Sub-questions:
""")
decomposition_chain=decomposition_prompt | llm | StrOutputParser()

In [5]:
query = "How does LangChain use memory and agents compared to CrewAI?"
decomposition_query=decomposition_chain.invoke({"question": query})
print(decomposition_query)

To facilitate better document retrieval, I can break down the original question into the following sub-questions:

1. **What is LangChain's memory architecture?** This sub-question focuses on understanding how LangChain utilizes memory, which will help in retrieving documents that discuss its memory-related features.

2. **How do LangChain agents interact with the environment?** This sub-question will allow us to retrieve documents that discuss LangChain's agent capabilities, such as decision-making and execution.

3. **What is CrewAI's memory architecture compared to LangChain?** This sub-question compares the memory architectures of LangChain and CrewAI, which might help in retrieving documents that highlight their respective design choices.

4. **How do CrewAI agents interact with the environment in comparison to LangChain?** This sub-question will allow us to retrieve documents that compare the agent capabilities of LangChain and CrewAI.

By breaking down the original question into

In [6]:
# Step 4: QA chain per sub-question
qa_prompt=PromptTemplate.from_template(""" 
Use the context below to answer the question.

Context: {context}

Question: {input}
""")
qa_chain=create_stuff_documents_chain(llm=llm, prompt=qa_prompt)


In [9]:
# Step 5: Complete RAG pipeline logic
def query_decomposition_rag_pipeline(user_query):
    # Decompose the query
    sub_qs_text = decomposition_chain.invoke({"question":user_query})
    sub_questions=[q.strip("-*1234567890. ").strip() for q in sub_qs_text.split("\n") if q.strip()]

    results=[]
    for subq in sub_questions:
        docs=retriever.invoke(subq)
        result=qa_chain.invoke({"input":subq, "context":docs})
        results.append(f"Q: {subq}\nA: {result}")
    
    return "\n\n".join(results)
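
The recorded outputs show a weakness in the line-splitting step: the model wraps its sub-questions in a preamble and per-item commentary, so the pipeline treats the preamble sentence as the first "sub-question". The block below is a minimal sketch of a stricter parser, not part of the original notebook. The helper name `extract_sub_questions` and the `max_questions` parameter are hypothetical, and it assumes the model returns its sub-questions as a numbered or bulleted list, as in the Step 3 output.

```python
import re

# Matches lines that begin with a list marker such as "1.", "2)", "-" or "*".
_LIST_ITEM = re.compile(r"^\s*(?:\d+[.)]|[-*•])\s+(.*)$")

def extract_sub_questions(llm_text: str, max_questions: int = 4) -> list[str]:
    """Keep only list items from the model output and drop trailing commentary."""
    questions = []
    for line in llm_text.splitlines():
        match = _LIST_ITEM.match(line)
        if not match:
            continue  # skip preamble, blank lines, and closing remarks
        item = match.group(1).strip()
        bold = re.search(r"\*\*(.+?)\*\*", item)
        if bold:
            # The model bolded the question itself; keep just that span.
            item = bold.group(1).strip()
        elif "?" in item:
            # Otherwise cut everything after the first question mark.
            item = item[: item.index("?") + 1]
        if item:
            questions.append(item)
    return questions[:max_questions]

# Sample text shaped like the Step 3 output (shortened for illustration).
sample = """To facilitate better document retrieval, I can break down the question:

1. **What is LangChain's memory architecture?** This sub-question focuses on memory.

2. **How do LangChain agents interact with the environment?** This covers agents.

By breaking down the original question into"""

sub_questions = extract_sub_questions(sample)
```

Inside `query_decomposition_rag_pipeline`, a helper like this could stand in for the list comprehension that splits `sub_qs_text` on newlines. Tightening the decomposition prompt so that the model returns one question per line with no commentary would be a complementary fix.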

In [10]:
# Step 6: Execute
query="How does LangChain use memory and agents compared to CrewAI?"
final_answer=query_decomposition_rag_pipeline(query)
print("✅ Final answer:\n")
print(final_answer)

✅ Final answer:

Q: To better retrieve relevant documents, I can decompose the given question into the following sub-questions:
A: It seems like the text is describing a tool called LangChain, which is used for semantic search and retrieval. Based on the provided context, it appears that you're looking to break down a question into sub-questions to better use LangChain for retrieval. 

However, the original text does not explicitly mention how to decompose a question into sub-questions. But in general, the following steps can be taken:

1. Identify the main entities or keywords in the question. 
2. Break down each entity or keyword into its components or synonyms, as this can help in broadening the search scope and catching semantically similar content.
3. Identify any specific requirements or constraints mentioned in the question, such as date ranges, locations, or specific domains.
4. Consider any specific tasks or goals mentioned in the question, such as finding a definition, comp

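#### Major Disadvantage:
Many LLM and retrieval calls: the pipeline makes one LLM call to decompose the query, then one retrieval call and one LLM call for every sub-question, so latency and cost grow with the number of sub-questions.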
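
Because the sub-questions are independent, one way to soften the latency cost is to answer them concurrently. The block below is a rough sketch using only the standard library. `answer_in_parallel` and `fake_answer` are hypothetical names, and `fake_answer` stands in for the notebook's "retrieve, then call the QA chain" step so the example runs without an API key. It does not reduce the number of calls, only the wall-clock time, and provider rate limits may cap how many workers are practical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

def answer_in_parallel(
    sub_questions: list[str],
    answer_one: Callable[[str], str],
    max_workers: int = 4,
) -> str:
    """Answer each sub-question concurrently and keep the original order."""
    if not sub_questions:
        return ""
    workers = min(max_workers, len(sub_questions))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, not completion order.
        answers = list(pool.map(answer_one, sub_questions))
    return "\n\n".join(
        f"Q: {q}\nA: {a}" for q, a in zip(sub_questions, answers)
    )

# Stand-in for retrieval plus the QA chain (assumption, for illustration only).
def fake_answer(subq: str) -> str:
    return f"(answer to: {subq})"

combined = answer_in_parallel(["What is A?", "What is B?"], fake_answer)
```

In the notebook, `answer_one` would be a small function that calls `retriever.invoke(subq)` and then `qa_chain.invoke({"input": subq, "context": docs})`.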