### 🧠 What is Query Decomposition?
Query decomposition is the process of breaking a complex, multi-part question into simpler, atomic sub-questions, each of which can drive its own retrieval step and be answered independently.

#### ✅ Why Use Query Decomposition?

- Complex queries often involve multiple concepts

- LLMs or retrievers may miss parts of the original question

- Decomposition enables multi-hop reasoning (answering in steps)

- Independent sub-questions can be processed in parallel (especially in multi-agent frameworks)
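
The parallelism point can be sketched with the standard library alone. Because sub-questions do not depend on each other, they can be dispatched concurrently. In this sketch, `answer_sub_question` is a hypothetical stand-in for a retrieve-then-answer call; it is not part of LangChain or CrewAI.

```python
from concurrent.futures import ThreadPoolExecutor


def answer_sub_question(sub_question: str) -> str:
    """Hypothetical placeholder for a retrieval + LLM call."""
    return f"(answer to: {sub_question})"


def answer_in_parallel(sub_questions: list[str], max_workers: int = 4) -> list[str]:
    """Answer independent sub-questions concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as the inputs.
        return list(pool.map(answer_sub_question, sub_questions))


sub_questions = [
    "How does LangChain use memory?",
    "How does CrewAI use memory?",
]
answers = answer_in_parallel(sub_questions)
```

Threads suit this case because each real call would spend most of its time waiting on network I/O.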

In [1]:
# Imports use the split packages (langchain_core, langchain_community,
# langchain_text_splitters) rather than the deprecated top-level paths.
from langchain.chat_models import init_chat_model
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter


In [2]:
# Step 1: Load and embed the document
loader = TextLoader("langchain_crewai_dataset.txt")
docs = loader.load()

splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=50)
chunks = splitter.split_documents(docs)

embedding = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
vectorstore = FAISS.from_documents(chunks, embedding)
retriever = vectorstore.as_retriever(search_type="mmr", search_kwargs={"k": 4, "lambda_mult": 0.7})
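
The `search_type="mmr"` setting asks the retriever for Maximal Marginal Relevance: each pick trades relevance to the query against similarity to documents already chosen, with `lambda_mult` weighting the two (1 favors pure relevance, 0 favors diversity). The following is a simplified pure-Python illustration of that idea under toy 2-D vectors; `cosine` and `mmr_select` are hypothetical helpers, not the code FAISS or LangChain actually runs.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query_vec, doc_vecs, k=4, lambda_mult=0.7):
    """Return indices of up to k documents chosen by greedy MMR."""
    selected: list[int] = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:

        def score(i: int) -> float:
            relevance = cosine(query_vec, doc_vecs[i])
            # Redundancy is the highest similarity to anything already picked.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data: documents 0 and 1 are near-duplicates, document 2 is different.
query_vec = [1.0, 0.0]
doc_vecs = [[1.0, 0.0], [0.99, 0.1], [0.6, 0.8]]
relevance_only = mmr_select(query_vec, doc_vecs, k=2, lambda_mult=1.0)  # [0, 1]
diverse = mmr_select(query_vec, doc_vecs, k=2, lambda_mult=0.3)  # [0, 2]
```

With a low `lambda_mult` the near-duplicate is skipped in favor of the less similar document, which is the behavior that helps when several chunks repeat the same passage.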

In [3]:
# Step 2: Load environment variables (such as the API key) and initialize the chat model
from dotenv import load_dotenv

load_dotenv()

llm = init_chat_model(model="gpt-3.5-turbo", temperature=0)
llm

ChatOpenAI(client=<openai.resources.chat.completions.completions.Completions object at 0x7bb9a9779e10>, async_client=<openai.resources.chat.completions.completions.AsyncCompletions object at 0x7bb9a8f679d0>, root_client=<openai.OpenAI object at 0x7bb9a97798d0>, root_async_client=<openai.AsyncOpenAI object at 0x7bb9a8f676d0>, temperature=0.0, model_kwargs={}, openai_api_key=SecretStr('**********'))

In [4]:
# Step 3: Query decomposition
decomposition_prompt = PromptTemplate.from_template("""
You are an AI assistant. Decompose the following complex question into 2 to 4 smaller sub-questions for better document retrieval.

Question: "{question}"

Sub-questions:
""")
decomposition_chain = decomposition_prompt | llm | StrOutputParser()

In [5]:
query = "How does LangChain use memory and agents compared to CrewAI?"
decomposition_question = decomposition_chain.invoke({"question": query})


In [6]:
print(decomposition_question)

1. How does LangChain utilize memory in its operations?
2. How does LangChain employ agents in its system?
3. How does CrewAI utilize memory in its operations?
4. How does CrewAI employ agents in its system?
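
A numbered list like the one above has to be turned into a Python list before each sub-question can be processed. One way to do that, sketched here, is a regex anchored to the start of each line, so that list markers are removed but digits inside the question are kept. A character-set `str.strip` would also eat trailing digits and periods. `parse_sub_questions` is a hypothetical helper, and the second and third sample lines are made up to show the edge cases.

```python
import re

# Matches a leading bullet ("-", "*", "•") or a number followed by "." or ")".
LIST_MARKER = re.compile(r"^\s*(?:[-*•]|\d+[.)])\s*")


def parse_sub_questions(text: str) -> list[str]:
    """Split LLM output into clean sub-questions, one per non-empty line."""
    questions = []
    for line in text.splitlines():
        cleaned = LIST_MARKER.sub("", line).strip()
        if cleaned:
            questions.append(cleaned)
    return questions


sample = """1. How does LangChain utilize memory in its operations?
2. How does CrewAI compare with AutoGen 2
- What changed in version 0.3"""
parsed = parse_sub_questions(sample)
```

For comparison, `"What changed in version 0.3".strip("-•1234567890. ")` returns `"What changed in version"`, silently dropping the version number.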


In [7]:
# Step 4: QA chain per sub-question
qa_prompt = PromptTemplate.from_template("""
Use the context below to answer the question.

Context:
{context}

Question: {input}
""")
qa_chain = create_stuff_documents_chain(llm=llm, prompt=qa_prompt)

In [8]:
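# Step 5: Full RAG pipeline logic
import re

def full_query_decomposition_rag_pipeline(user_query):
    # Decompose the query into sub-questions (one per line)
    sub_qs_text = decomposition_chain.invoke({"question": user_query})
    # Remove only leading list markers ("1.", "2)", "-", "•") so that
    # digits inside a question (versions, years) are not stripped
    sub_questions = [
        re.sub(r"^\s*(?:[-*•]|\d+[.)])\s*", "", line).strip()
        for line in sub_qs_text.splitlines()
        if line.strip()
    ]

    results = []
    for subq in sub_questions:
        # Retrieve context for this sub-question and answer it on its own
        docs = retriever.invoke(subq)
        result = qa_chain.invoke({"input": subq, "context": docs})
        results.append(f"Q: {subq}\nA: {result}")

    return "\n\n".join(results)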

In [9]:
# Step 6: Run
query = "How does LangChain use memory and agents compared to CrewAI?"
final_answer = full_query_decomposition_rag_pipeline(query)
print("✅ Final Answer:\n")
print(final_answer)

✅ Final Answer:

Q: How does LangChain utilize memory in its operations?
A: LangChain utilizes memory modules like ConversationBufferMemory and ConversationSummaryMemory to allow the LLM to maintain awareness of previous conversation turns or summarize long interactions to fit within token limits.

Q: How does LangChain employ agents in its system?
A: LangChain employs agents in its system by using LLMs to reason about which tool to call, what input to provide, and how to process the output. The agents can execute multi-step tasks, integrating with tools like web search, calculators, and code execution. They operate using a planner-executor model, planning out a sequence of tool invocations to achieve a goal, including dynamic decision-making, branching logic, and context-aware memory use across steps.

Q: How does CrewAI utilize memory in its operations?
A: CrewAI utilizes memory in its operations by enabling agents to share context and dynamically communicate with one another. This a

In [10]:
# Step 7: Use an LLM to condense the Q&A pairs into a final answer
final_prompt = PromptTemplate.from_template("""
You are an AI assistant. Given the following series of question and answer pairs, provide a concise summary answer to the original question.
Q&A Pairs:
{input}
Final Answer:
""")
final_chain = final_prompt | llm | StrOutputParser()
final_summary = final_chain.invoke({"input": final_answer})
print("✅ Final Summary:\n")
print(final_summary)

✅ Final Summary:

LangChain utilizes memory modules like ConversationBufferMemory and ConversationSummaryMemory to maintain awareness of previous conversation turns and summarize interactions, while CrewAI enables agents to share context and communicate dynamically to work together efficiently in organized crews. Both systems employ agents to execute multi-step tasks and leverage collective memory for enhanced performance.
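
The synthesis prompt in Step 7 asks for an answer to "the original question" but only receives the Q&A pairs. A possible variation, sketched here with plain string formatting, passes the original question along as well. `SYNTHESIS_TEMPLATE` and `build_synthesis_prompt` are hypothetical names, the `A: ...` text is a placeholder rather than real output, and this variation was not used to produce the summary shown above.

```python
SYNTHESIS_TEMPLATE = """You are an AI assistant. Using only the question and answer pairs below, write a concise answer to the original question.

Original question: {question}

Q&A pairs:
{qa_pairs}

Final answer:"""


def build_synthesis_prompt(question: str, qa_pairs: str) -> str:
    """Fill the template so the model sees the question it must answer."""
    return SYNTHESIS_TEMPLATE.format(question=question, qa_pairs=qa_pairs)


prompt_text = build_synthesis_prompt(
    "How does LangChain use memory and agents compared to CrewAI?",
    "Q: How does LangChain utilize memory in its operations?\nA: ...",  # placeholder answer
)
```

The same template string could be handed to `PromptTemplate.from_template` and invoked with `question` and `qa_pairs` as its input variables.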
