### 🧠 What is Query Decomposition?
Query decomposition is the process of taking a complex, multi-part question and breaking it into simpler, atomic sub-questions, each of which can drive its own retrieval step and be answered individually.

#### ✅ Why Use Query Decomposition?

- Complex queries often involve multiple concepts.

- An LLM or retriever may miss parts of the original question when it is handled in a single pass.

- Decomposition enables multi-hop reasoning (answering in steps).

- Sub-questions can be processed in parallel (especially in multi-agent frameworks).
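The parallelism point can be sketched with the standard library alone. This is a minimal illustration, not part of the pipeline in this notebook: `answer_sub_question` is a hypothetical stand-in for a real retrieve-then-answer call, and the sub-questions are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def answer_sub_question(sub_question: str) -> str:
    # Hypothetical placeholder: a real version would retrieve documents
    # and call an LLM for this one sub-question.
    return f"(answer to: {sub_question})"


def answer_in_parallel(sub_questions, max_workers=4):
    # executor.map runs the calls concurrently and yields results
    # in the same order as the input list.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        answers = list(executor.map(answer_sub_question, sub_questions))
    return list(zip(sub_questions, answers))


sub_questions = [
    "How does LangChain use memory?",
    "How does LangChain use agents?",
    "How does CrewAI use memory and agents?",
]
pairs = answer_in_parallel(sub_questions)
for q, a in pairs:
    print(f"Q: {q}\nA: {a}")
```

Threads suit this case because each sub-question would mostly wait on network calls; whether this helps in practice depends on the rate limits of the model provider.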

In [1]:
from langchain.chat_models import init_chat_model
from langchain_core.prompts import PromptTemplate
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain.chains.combine_documents import create_stuff_documents_chain



In [2]:
# Step 1: Load and embed the document
loader = TextLoader("langchain_crewai_dataset.txt")
docs = loader.load()

splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=50)
chunks = splitter.split_documents(docs)

embedding = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
vectorstore = FAISS.from_documents(chunks, embedding)
retriever = vectorstore.as_retriever(search_type="mmr", search_kwargs={"k": 4, "lambda_mult": 0.7})
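The retriever above uses `search_type="mmr"` with `lambda_mult=0.7`. As a rough illustration of what that parameter trades off, here is a sketch of the standard maximal marginal relevance (MMR) selection rule in plain Python. The names `cosine` and `mmr_select` and the toy 2-D vectors are assumptions made for this example; LangChain's internal implementation may differ in details, such as first narrowing to a candidate pool controlled by `fetch_k`.

```python
import math


def cosine(a, b):
    # Cosine similarity between two equal-length vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mmr_select(query_vec, doc_vecs, k=4, lambda_mult=0.7):
    # Greedily pick documents that are relevant to the query but not
    # redundant with documents that were already picked.
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        best_idx, best_score = None, -math.inf
        for i in candidates:
            relevance = cosine(query_vec, doc_vecs[i])
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        candidates.remove(best_idx)
    return selected


# Toy data: documents 0 and 1 are near-duplicates, document 2 is different.
query = [1.0, 0.0]
docs_2d = [[1.0, 0.1], [1.0, 0.12], [0.6, 0.8]]

favor_relevance = mmr_select(query, docs_2d, k=2, lambda_mult=0.7)
favor_diversity = mmr_select(query, docs_2d, k=2, lambda_mult=0.3)
print(favor_relevance, favor_diversity)
```

With `lambda_mult=0.7` the second pick is the near-duplicate of the first, because relevance dominates; lowering it to 0.3 makes the rule prefer the dissimilar document instead.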

In [3]:
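# Step 2: Initialize the LLM
import os
from dotenv import load_dotenv

# Load GROQ_API_KEY from a local .env file into the environment
load_dotenv()

# Fail early with a clear message if the key is missing
if not os.getenv("GROQ_API_KEY"):
    raise RuntimeError("GROQ_API_KEY is not set; add it to your .env file")

llm = init_chat_model(model="groq:openai/gpt-oss-120b")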
llm

ChatGroq(client=<groq.resources.chat.completions.Completions object at 0x0000025600274470>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x00000256001B9160>, model_name='openai/gpt-oss-120b', model_kwargs={}, groq_api_key=SecretStr('**********'))

In [4]:
# Step 3: Query decomposition
decomposition_prompt = PromptTemplate.from_template("""
You are an AI assistant. Decompose the following complex question into 2 to 4 smaller sub-questions for better document retrieval.

Question: "{question}"

Sub-questions:
""")
decomposition_chain = decomposition_prompt | llm | StrOutputParser()

In [5]:
query = "How does LangChain use memory and agents compared to CrewAI?"
decomposition_question = decomposition_chain.invoke({"question": query})


In [6]:
print(decomposition_question)

**Sub‑questions**

1. What memory architectures and capabilities does LangChain offer for managing conversational context?  
2. How are agents designed and used within the LangChain framework (e.g., tool‑calling, routing, planning)?  
3. How does CrewAI implement memory handling and agent orchestration in its workflow?  
4. In what ways do LangChain’s memory and agent approaches differ from or resemble those of CrewAI (e.g., architecture, flexibility, scalability, integration with external tools)?


In [7]:
# Step 4: QA chain per sub-question
qa_prompt = PromptTemplate.from_template("""
Use the context below to answer the question.

Context:
{context}

Question: {input}
""")
qa_chain = create_stuff_documents_chain(llm=llm, prompt=qa_prompt)

In [8]:
# Step 5: Full RAG pipeline logic
def full_query_decomposition_rag_pipeline(user_query):
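    # Decompose the query; the chain returns the sub-questions as one block of text
    sub_qs_text = decomposition_chain.invoke({"question": user_query})
    # Keep every non-empty line and strip list markers from both ends.
    # Caveat: a heading line such as "**Sub-questions**" is also kept as a sub-question.
    sub_questions = [q.strip("-•1234567890. ").strip() for q in sub_qs_text.split("\n") if q.strip()]
    
    results = []
    for subq in sub_questions:
        # Retrieve context for this sub-question, then answer it on its own
        docs = retriever.invoke(subq)
        result = qa_chain.invoke({"input": subq, "context": docs})
        results.append(f"Q: {subq}\nA: {result}")
    
    # Return the per-sub-question answers as one text block
    return "\n\n".join(results)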
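The pipeline splits the model's reply line by line and keeps every non-empty line, so a markdown heading in the reply ends up being sent to the retriever as if it were a question. One possible way to tighten this, sketched here under the assumption that the model returns a numbered or bulleted list, is to keep only lines that begin with a list marker. The helper name `parse_sub_questions` and the sample text are hypothetical.

```python
import re

# A leading list marker: "1.", "2)", "-", "*", or "•", followed by whitespace
_LIST_MARKER = re.compile(r"^\s*(?:\d+[.)]|[-*•])\s+")


def parse_sub_questions(text: str) -> list[str]:
    questions = []
    for line in text.splitlines():
        # Skip headings, blank lines, and any other non-list text
        if not _LIST_MARKER.match(line):
            continue
        # Remove only the leading marker; keep the rest of the line intact
        questions.append(_LIST_MARKER.sub("", line).strip())
    return questions


sample = """**Sub-questions**

1. What memory architectures does LangChain offer?  
2. How are agents designed in LangChain?
- How does CrewAI handle memory?
"""
parsed = parse_sub_questions(sample)
print(parsed)
```

Unlike `str.strip` with a character set, this removes the marker only at the start of the line, so trailing punctuation and digits inside the question are preserved. If the model replies in a different layout, the pattern would need adjusting.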

In [9]:
# Step 6: Run
query = "How does LangChain use memory and agents compared to CrewAI?"
final_answer = full_query_decomposition_rag_pipeline(query)
print("✅ Final Answer:\n")
print(final_answer)

✅ Final Answer:

Q: **Sub‑questions**
A: Below are several concrete sub‑questions you can ask to explore the concepts mentioned in the context more deeply:

1. **Knowledge Injection**
   - How does injecting external knowledge into the LLM prompt reduce hallucinations?
   - What mechanisms does CrewAI use to fetch, validate, and inject knowledge into prompts?
   - Are there best‑practice patterns for formatting injected knowledge (e.g., JSON, bullet lists, tables)?

2. **Agent Context‑Sharing in CrewAI**
   - What data structures are used to pass intermediate results between agents?
   - How does CrewAI decide which agent should handle a given piece of context (delegation logic)?
   - Can agents request clarification or additional data from one another (consultation flow)?
   - What safeguards exist to prevent circular dependencies or infinite loops in context‑sharing?

3. **Emergent Behaviors (Delegation, Consultation, Review)**
   - How does an agent recognize that a task is beyond i