### Query Enhancement – Query Expansion Techniques

In a RAG pipeline, the quality of the query sent to the retriever determines how good the retrieved context is, and therefore how accurate the LLM's final answer will be.

That's where query expansion (a form of query enhancement) comes in.

#### 🎯 What is Query Enhancement?
Query enhancement refers to techniques used to improve or reformulate the user query to retrieve better, more relevant documents from the knowledge base.
It is especially useful when:

- The original query is short, ambiguous, or under-specified
- You want to broaden the scope to catch synonyms, related phrases, or spelling variants
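
As a minimal sketch of the "broaden the scope" idea, the snippet below expands a query using a small hand-written synonym table. The `SYNONYMS` table and the `expand_query` helper are hypothetical names made up for illustration; the notebook itself delegates the expansion to an LLM rather than a lookup table.

```python
# Hypothetical synonym table (an assumption for this sketch);
# a real system might use an LLM or a thesaurus instead.
SYNONYMS = {
    "memory": ["conversation history", "state"],
    "agent": ["assistant", "worker"],
}


def expand_query(query: str) -> str:
    """Append known synonyms for each word in the query (illustrative only)."""
    extra = []
    for word in query.lower().split():
        for synonym in SYNONYMS.get(word, []):
            if synonym not in extra:
                extra.append(synonym)
    # Leave the query untouched when no synonym is known.
    return query if not extra else f"{query} ({' OR '.join(extra)})"


print(expand_query("LangChain memory"))
# prints: LangChain memory (conversation history OR state)
```

A lookup table like this only covers the terms someone thought to list, which is one reason an LLM-based expansion can be attractive for open-ended queries.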

In [4]:
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

from langchain_openai import ChatOpenAI  # or ChatGroq
from langchain_core.prompts import PromptTemplate

from langchain_classic.chains import create_retrieval_chain
from langchain_classic.chains.combine_documents import create_stuff_documents_chain

from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableMap


In [5]:
# Step 1: Load and split the dataset
loader = TextLoader("langchain_crewai_dataset.txt")
raw_docs = loader.load()
# Split into chunks of at most ~300 characters that overlap by up to 50 characters
splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=50)
chunks = splitter.split_documents(raw_docs)
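
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified sketch of overlapping chunks. It uses plain fixed-width character windows, which is an assumption made for clarity: `RecursiveCharacterTextSplitter` prefers to split on separators such as paragraph and line breaks, so its real chunk boundaries will differ. The `sliding_chunks` helper is hypothetical.

```python
def sliding_chunks(text: str, chunk_size: int = 300, chunk_overlap: int = 50) -> list[str]:
    """Fixed-size character windows; each window starts
    (chunk_size - chunk_overlap) characters after the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "".join(chr(65 + i % 26) for i in range(700))  # 700 characters of A..Z
pieces = sliding_chunks(sample)
print([len(p) for p in pieces])           # prints: [300, 300, 200]
print(pieces[0][-50:] == pieces[1][:50])  # prints: True (the 50-character overlap)
```

The overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk, at the cost of some duplicated text in the index.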

In [6]:
# Step 2: Vector store
embedding_model = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
vectorstore = FAISS.from_documents(chunks, embedding_model)

# Step 3: MMR retriever (returns 5 chunks, balancing relevance and diversity)
retriever = vectorstore.as_retriever(search_type="mmr", search_kwargs={"k": 5})
retriever


VectorStoreRetriever(tags=['FAISS', 'HuggingFaceEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x0000025E5D3CD550>, search_type='mmr', search_kwargs={'k': 5})
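
The retriever above uses `search_type="mmr"` (Maximal Marginal Relevance), which trades off relevance to the query against redundancy among the results already picked. The following is a rough, dependency-free sketch of that greedy selection on toy 2-D vectors. The function names and the toy data are this sketch's own assumptions, and LangChain's implementation differs in detail (for example, it first fetches a larger candidate pool from the index).

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(query_vec, doc_vecs, k=2, lambda_mult=0.5):
    """Greedily pick k document indices.
    lambda_mult=1.0 means pure relevance; lower values favour diversity."""
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            # Similarity to the closest already-selected document (0 if none yet).
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy "embeddings": docs 0 and 1 are near-duplicates, doc 2 is different.
query = [1.0, 0.3]
docs = [[0.98, 0.05], [1.0, 0.0], [0.6, 0.8]]

print(mmr(query, docs, k=2, lambda_mult=0.5))  # prints: [0, 2]
print(mmr(query, docs, k=2, lambda_mult=1.0))  # prints: [0, 1]
```

With `lambda_mult=0.5` the near-duplicate document is skipped in favour of the more distinct one; with `lambda_mult=1.0` the selection reduces to plain similarity ranking.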

In [8]:
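# Step 4: LLM and prompt
import os
from dotenv import load_dotenv
from langchain.chat_models import init_chat_model

# load_dotenv() copies variables from a local .env file into os.environ,
# so OPENAI_API_KEY does not need to be reassigned by hand.
load_dotenv()
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set; add it to your environment or .env file")

llm = init_chat_model("openai:o4-mini")
llm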


ChatOpenAI(profile={'max_input_tokens': 200000, 'max_output_tokens': 100000, 'image_inputs': True, 'audio_inputs': False, 'video_inputs': False, 'image_outputs': False, 'audio_outputs': False, 'video_outputs': False, 'reasoning_output': True, 'tool_calling': True, 'structured_output': True, 'image_url_inputs': True, 'pdf_inputs': True, 'pdf_tool_message': True, 'image_tool_message': True, 'tool_choice': True}, client=<openai.resources.chat.completions.completions.Completions object at 0x0000025E5EFF8440>, async_client=<openai.resources.chat.completions.completions.AsyncCompletions object at 0x0000025E5EFF8EC0>, root_client=<openai.OpenAI object at 0x0000025E5EC31550>, root_async_client=<openai.AsyncOpenAI object at 0x0000025E5EFF8C20>, model_name='o4-mini', model_kwargs={}, openai_api_key=SecretStr('**********'), stream_usage=True)

In [9]:
# Query expansion
query_expansion_prompt = PromptTemplate.from_template("""
You are a helpful assistant. Expand the following query to improve document retrieval by adding relevant synonyms, technical terms, and useful context.

Original query: "{query}"

Expanded query:
""")

query_expansion_chain = query_expansion_prompt | llm | StrOutputParser()
query_expansion_chain

PromptTemplate(input_variables=['query'], input_types={}, partial_variables={}, template='\nYou are a helpful assistant. Expand the following query to improve document retrieval by adding relevant synonyms, technical terms, and useful context.\n\nOriginal query: "{query}"\n\nExpanded query:\n')
| ChatOpenAI(profile={'max_input_tokens': 200000, 'max_output_tokens': 100000, 'image_inputs': True, 'audio_inputs': False, 'video_inputs': False, 'image_outputs': False, 'audio_outputs': False, 'video_outputs': False, 'reasoning_output': True, 'tool_calling': True, 'structured_output': True, 'image_url_inputs': True, 'pdf_inputs': True, 'pdf_tool_message': True, 'image_tool_message': True, 'tool_choice': True}, client=<openai.resources.chat.completions.completions.Completions object at 0x0000025E5EFF8440>, async_client=<openai.resources.chat.completions.completions.AsyncCompletions object at 0x0000025E5EFF8EC0>, root_client=<openai.OpenAI object at 0x0000025E5EC31550>, root_async_client=<openai

In [10]:
query_expansion_chain.invoke({"query": "LangChain memory"})

'(LangChain AND (memory OR “memory management” OR “state management” OR “session memory” OR “buffer memory” OR “semantic memory” OR “summary memory” OR “conversation memory” OR “chat context” OR “conversation history” OR “context tracking” OR “state persistence” OR “contextual embeddings” OR “vector memory”)) AND (retriever OR “vector store” OR embeddings OR FAISS OR Pinecone OR Milvus OR Weaviate OR Chroma OR Redis OR “MongoDB” OR “Cassandra”)'
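
Note that the LLM returned a boolean-style search string. An embedding-based retriever such as the FAISS one built earlier embeds the whole string as text, so `AND`, `OR`, and parentheses are treated as ordinary tokens rather than operators. One option, sketched below under that assumption, is to flatten the expansion into plain terms before retrieval. The `flatten_boolean_query` helper is hypothetical, and whether flattening actually improves results on this dataset would need to be checked empirically.

```python
import re


def flatten_boolean_query(expanded: str) -> str:
    """Drop AND/OR operators, parentheses, and quotes, keeping only the terms."""
    # Remove parentheses plus straight and curly double quotes.
    text = re.sub(r"[()\"“”]", " ", expanded)
    # Remove upper-case boolean operators only, so ordinary words survive.
    text = re.sub(r"\b(AND|OR)\b", " ", text)
    # Collapse the leftover runs of whitespace.
    return " ".join(text.split())


example = '(LangChain AND (memory OR “buffer memory”)) AND (retriever OR FAISS)'
print(flatten_boolean_query(example))
# prints: LangChain memory buffer memory retriever FAISS
```

An alternative with the same goal would be to change the expansion prompt so that the model is asked for a plain natural-language query in the first place.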

In [11]:
# RAG answering prompt
answer_prompt = PromptTemplate.from_template("""
Answer the question based on the context below.

Context:
{context}

Question: {input}
""")

document_chain = create_stuff_documents_chain(llm=llm, prompt=answer_prompt)

In [12]:
# Step 5: Full RAG pipeline with query expansion
rag_pipeline = (
    RunnableMap({
        "input": lambda x: x["input"],
        "context": lambda x: retriever.invoke(query_expansion_chain.invoke({"query": x["input"]}))
    })
    | document_chain
)
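
The data flow of this pipeline can be sketched with plain Python functions. The three stubs below are stand-ins (assumptions for illustration) for `query_expansion_chain`, `retriever`, and `document_chain`; they make no LLM or vector-store calls. The point to notice is that the expanded text is used only for retrieval, while the original question is what reaches the answer prompt.

```python
def expand(question: str) -> str:
    # Stand-in for query_expansion_chain.invoke({"query": question})
    return f"{question} (synonyms, related terms)"


def retrieve(search_text: str) -> list[str]:
    # Stand-in for retriever.invoke(search_text)
    return [f"doc matching: {search_text}"]


def answer(inputs: dict) -> str:
    # Stand-in for document_chain.invoke(inputs)
    return f"Q: {inputs['input']} | context: {inputs['context']}"


def rag_with_expansion(question: str) -> str:
    expanded = expand(question)      # 1. rewrite the query
    context = retrieve(expanded)     # 2. search with the expanded text
    # 3. answer the ORIGINAL question using the retrieved context
    return answer({"input": question, "context": context})


print(rag_with_expansion("LangChain memory"))
```

Keeping the original question in the answer prompt avoids asking the LLM to answer a long, synonym-laden rewrite that the user never typed.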

In [13]:
# Step 6: Run query
query = {"input": "What types of memory does LangChain support?"}
# Pass the question string, not the whole dict, to the expansion prompt
print(query_expansion_chain.invoke({"query": query["input"]}))
response = rag_pipeline.invoke(query)
print("✅ Answer:\n", response)

Expanded query:  
“What types of memory does LangChain support, including all its memory modules and capabilities for storing and retrieving context in LLM-based agents? For example:  
• Short-term ‘buffer’ memory (ConversationBufferMemory)  
• Long-term or semantic memory (ConversationSummaryMemory, EntityMemory)  
• Vector-store memory (VectorStoreMemory, VectorStoreRetrieverMemory) using FAISS, Chroma, Weaviate, Redis, etc.  
• CombinedMemory and custom memory chains  
Also consider synonyms and related terms such as cache, state persistence, context window, external memory, RAG (retrieval-augmented generation), memory API, conversation history, knowledge retrieval.”
✅ Answer:
 LangChain today ships at least two core memory implementations:

• ConversationBufferMemory – keeps a full transcript of the back-and-forth in memory.  
• ConversationSummaryMemory – compacts older turns into a running summary so you stay within LLM token limits.


In [15]:
# Run a second query
query = {"input": "CrewAI Agents?"}
# Pass the question string, not the whole dict, to the expansion prompt
print(query_expansion_chain.invoke({"query": query["input"]}))
response = rag_pipeline.invoke(query)
print("✅ Answer:\n", response)

Expanded query:

(
  “CrewAI Agents” OR 
  “Crew AI Agents” OR 
  “AI-driven crew agents” OR 
  “digital crew assistants” OR 
  “virtual crew assistants” OR 
  “autonomous crew scheduling agents” OR 
  “intelligent staffing bots” OR 
  “workforce automation agents” OR 
  “team coordination agents” OR 
  “multi-agent system” OR 
  “agent-based modeling” OR 
  “autonomous agent framework”
)
AND
(
  platform OR framework OR API OR architecture OR documentation OR “best practices” OR tutorial OR “case study”
)
AND
(
  “crew management” OR scheduling OR “resource allocation” OR “task assignment” OR “real-time collaboration” OR “human-AI collaboration” OR “reinforcement learning” OR “deep learning” OR “natural language processing” OR chatbot
)
✅ Answer:
 CrewAI agents are autonomous “crew members” that collaborate in structured workflows to solve complex tasks. Key characteristics include:  
1. Defi