## 🧠 What is Query Decomposition?
Query decomposition is the process of taking a complex, multipart question and breaking it into simpler, atomic sub-questions, each of which can be used for retrieval and answered on its own.

### ✅ Why use Query Decomposition?
- Complex queries often involve multiple concepts
- LLMs or retrievers may miss parts of the original question
- It enables multi-hop reasoning (answering in steps)
- It allows parallelism (especially in multi-agent frameworks), since independent sub-questions can be processed concurrently
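
The parallelism point can be sketched with the standard library alone. This is a minimal illustration, not part of the pipeline built below: `answer_sub_question` is a hypothetical stand-in for a retrieve-then-answer call, and the sub-questions are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def answer_sub_question(sub_question: str) -> str:
    # Hypothetical stand-in: a real pipeline would call a retriever and an LLM here.
    return f"[answer to: {sub_question}]"


def answer_in_parallel(sub_questions: list[str], max_workers: int = 4) -> list[tuple[str, str]]:
    # Sub-questions are independent of each other, so they can run concurrently.
    # executor.map returns results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        answers = list(executor.map(answer_sub_question, sub_questions))
    return list(zip(sub_questions, answers))


sub_questions = [
    "What memory mechanisms does LangChain offer?",
    "How does LangChain implement agents?",
    "What memory and agent features does CrewAI provide?",
]
pairs = answer_in_parallel(sub_questions)
for question, answer in pairs:
    print(f"Q: {question}\nA: {answer}\n")
```

Threads suit this case because each call would mostly wait on network I/O (retriever and LLM requests) rather than on the CPU.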


In [1]:
from langchain_classic.document_loaders import TextLoader
from langchain_classic.text_splitter import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.chat_models import init_chat_model
from langchain_classic.prompts import PromptTemplate
from langchain_classic.chains.combine_documents import create_stuff_documents_chain
from langchain_classic.chains.retrieval import create_retrieval_chain
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableMap


In [None]:
## Step 1: Set up the retriever
# Load the source text file
loader = TextLoader("langchain.txt", encoding="utf-8", autodetect_encoding=True)
raw_docs = loader.load()

# Split the documents
splitter = RecursiveCharacterTextSplitter(
    chunk_size = 500,
    chunk_overlap = 50
)
chunks = splitter.split_documents(raw_docs)

# Embedding model and vector store
embedding_model = HuggingFaceEmbeddings(model="all-MiniLM-L6-v2")
vector_store = FAISS.from_documents(chunks, embedding_model)

# Build an MMR retriever: return k=4 chunks; lambda_mult trades relevance (1.0) against diversity (0.0)
retriever = vector_store.as_retriever(
    search_type = "mmr",
    search_kwargs = {"k":4, "lambda_mult": 0.7}
)
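
To make the `lambda_mult` setting above more concrete, here is a small pure-Python sketch of maximal marginal relevance (MMR) selection. It follows the usual MMR formulation (relevance to the query minus redundancy with already selected items) and is an illustration only, not the vector store's actual code; `cosine`, `mmr_select`, and the toy vectors are names and data made up for this example.

```python
import math


def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query_vec, doc_vecs, k=4, lambda_mult=0.7):
    # Greedily pick up to k document indices, balancing relevance and diversity.
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        best_idx, best_score = None, -math.inf
        for i in candidates:
            relevance = cosine(query_vec, doc_vecs[i])
            # Redundancy: similarity to the closest already-selected document.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        candidates.remove(best_idx)
    return selected


query = [1.0, 0.0]
docs = [[1.0, 0.0], [0.99, 0.1], [0.6, 0.8]]  # doc 1 is a near-duplicate of doc 0

print(mmr_select(query, docs, k=2, lambda_mult=0.7))  # [0, 1]: relevance dominates
print(mmr_select(query, docs, k=2, lambda_mult=0.3))  # [0, 2]: diversity dominates
```

With a high `lambda_mult` the near-duplicate is still chosen because it is very relevant; lowering it penalizes the duplicate and surfaces the less similar document instead.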

In [None]:
## Step 2: LLM
# LLM and Prompt
from dotenv import load_dotenv
load_dotenv()

llm = init_chat_model("groq:openai/gpt-oss-120b")
llm

ChatGroq(profile={'max_input_tokens': 131072, 'max_output_tokens': 32768, 'image_inputs': False, 'audio_inputs': False, 'video_inputs': False, 'image_outputs': False, 'audio_outputs': False, 'video_outputs': False, 'reasoning_output': True, 'tool_calling': True}, client=<groq.resources.chat.completions.Completions object at 0x000001C055C0EF60>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x000001C055C2BE00>, model_name='openai/gpt-oss-120b', model_kwargs={}, groq_api_key=SecretStr('**********'))

In [9]:
## Step 3:  Query decomposition
decomposition_prompt = PromptTemplate.from_template(
    """
You are an AI assistant. Decompose the following complex question into 2 to 4 smaller sub-questions for better document retrieval.
Question : {question}
Sub-questions:
"""
)
decompostion_chain = decomposition_prompt | llm | StrOutputParser()

In [7]:
query = "How does Langchain use memory and agents compared to CrewAI?"
decompostion_question = decompostion_chain.invoke({"question":query})


In [8]:
print(decompostion_question)

**Sub-questions**

1. What memory mechanisms and types does LangChain offer, and how are they used within LangChain applications?
2. How does LangChain implement and manage agents (e.g., tool-using agents, planning agents, routing agents)?
3. What memory capabilities and agent architectures are provided by CrewAI, and how are they applied in CrewAI workflows?
4. In what ways do the memory handling and agent designs of LangChain differ from those of CrewAI (e.g., architecture, extensibility, integration with LLMs, runtime behavior)?


In [10]:
# Step 4: QA chain per sub-question

qa_prompt = PromptTemplate.from_template(
    """
    Use the context below to answer the question.
    Context: {context}

    Question: {input}
    """
)
qa_chain = create_stuff_documents_chain(llm=llm , prompt=qa_prompt)
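
Conceptually, a "stuff" chain places all retrieved documents into a single prompt. The sketch below approximates that idea in plain Python. It assumes the documents are joined with a blank line between them; `Doc`, `QA_TEMPLATE`, and `stuff_prompt` are hypothetical names for illustration and are not the library's implementation.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    # Hypothetical minimal stand-in for a retrieved document.
    page_content: str


QA_TEMPLATE = (
    "Use the context below to answer the question.\n"
    "Context: {context}\n\n"
    "Question: {input}\n"
)


def stuff_prompt(docs, question, separator="\n\n"):
    # "Stuffing": concatenate every document's text into one context string,
    # then fill the prompt template with the context and the question.
    context = separator.join(d.page_content for d in docs)
    return QA_TEMPLATE.format(context=context, input=question)


docs = [
    Doc("LangChain provides memory classes."),
    Doc("Agents choose tools at run time."),
]
prompt_text = stuff_prompt(docs, "How does LangChain use memory?")
print(prompt_text)
```

Because everything goes into one prompt, this approach only works while the retrieved chunks fit in the model's context window, which is one reason the retriever above is limited to a few small chunks.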


In [17]:
# Step 5: Full RAG pipeline logic
def full_query_decomposition_rag_pipeline(user_query):
    # Decompose the query
    sub_qs_text = decompostion_chain.invoke({"question":user_query})
    # Keep only the numbered lines ("1." to "4.") and strip the list markers
    sub_questions = [
        q.strip("-.1234567890. ").strip()
        for q in sub_qs_text.split("\n")
        if q.strip().startswith(("1.", "2.", "3.", "4."))
    ]
    
    results = []
    for subq in sub_questions:
        docs = retriever.invoke(subq)
        result = qa_chain.invoke({"input":subq, "context":docs})
        results.append(f"Q: {subq} \n : {result}")
    return "\n\n".join(results)
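
The character-based stripping in the pipeline above also removes digits and periods from the end of a sub-question (for example a trailing year) and leaves Markdown bold markers in place. A regex-based parser is one possible alternative. This is a sketch under assumptions: `parse_sub_questions` is a hypothetical helper, and it expects the model to return one numbered item per line.

```python
import re

# Matches lines such as "1. text" or "2) text" and captures the text.
_NUMBERED = re.compile(r"^\s*\d+[.)]\s+(.*\S)\s*$")


def parse_sub_questions(text: str, max_items: int = 4) -> list[str]:
    questions = []
    for line in text.splitlines():
        match = _NUMBERED.match(line)
        if not match:
            continue  # skip headings, blank lines, and other prose
        # Drop Markdown bold markers the model may wrap around the question.
        question = match.group(1).strip("*").strip()
        if question:
            questions.append(question)
    return questions[:max_items]


sample = (
    "**Sub-questions**\n\n"
    "1. What memory mechanisms does LangChain offer?  \n"
    "2) **How does LangChain manage agents?**\n"
    "3. What does CrewAI provide in 2024?\n"
)
parsed = parse_sub_questions(sample)
for q in parsed:
    print(q)
```

Only the leading list marker is removed, so digits and punctuation inside or at the end of a question are left untouched.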



In [18]:
# Step 6: Run
query = "How does Langchain use memory and agents compared to CrewAI?"
final_answer = full_query_decomposition_rag_pipeline(query)
print("✅ Final Answer: \n")
print(final_answer)

✅ Final Answer: 

Q: **What types of memory mechanisms does LangChain provide, and how are they implemented in its workflows?** 
 : **LangChain's memory layer** is the part of the library that lets a chain or an agent keep track of what has happened earlier in the conversation (or in a multi-step workflow) and feed that context back into the LLM on subsequent calls.  
LangChain ships with a handful of ready-made memory classes, each of which implements the same `BaseMemory` interface (methods `load_memory_variables`, `save_context`, and `clear`).  By swapping one of these objects into a chain/agent you change how the historic information is stored, summarized, or retrieved without touching the rest of the workflow.

Below is a concise catalogue of the **main memory mechanisms** that LangChain provides today (as of the 2024 release), together with a short description of **how they are wired into a workflow**.

| Memory class (type) | What it does / when to use it | Key implement