### Query Enhancement – Query Expansion Techniques

In a RAG pipeline, the quality of the query sent to the retriever determines how good the retrieved context is, and therefore how accurate the LLM's final answer will be.

That's where Query Expansion / Enhancement comes in.

#### 🎯 What is Query Enhancement?
Query enhancement refers to techniques used to improve or reformulate the user query to retrieve better, more relevant documents from the knowledge base.
It is especially useful when:

- The original query is short, ambiguous, or under-specified
- You want to broaden the scope to catch synonyms, related phrases, or spelling variants
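
Before bringing in an LLM, the idea can be illustrated with a plain lookup table. This is only a minimal sketch: the `SYNONYMS` table and the `expand_query` helper are hypothetical names invented for illustration, not part of LangChain or of the pipeline built below.

```python
# Hypothetical synonym table (an assumption for illustration); a real system
# might derive related phrases from a thesaurus, query logs, or an LLM.
SYNONYMS = {
    "memory": ["conversation history", "context persistence"],
    "agents": ["autonomous agents", "multi-agent system"],
}


def expand_query(query: str, synonyms: dict[str, list[str]] = SYNONYMS) -> str:
    """Append related phrases for each known word, keeping the original query first."""
    extras: list[str] = []
    for word in query.lower().split():
        key = word.strip("?.,!")  # ignore trailing punctuation such as "agents?"
        for phrase in synonyms.get(key, []):
            if phrase not in extras:
                extras.append(phrase)
    if not extras:
        return query  # nothing to add, so leave the query untouched
    return f"{query} ({', '.join(extras)})"


print(expand_query("LangChain memory"))
```

A short query such as "LangChain memory" picks up related phrases, while a query with no known terms passes through unchanged. The LLM-based chain later in this notebook plays the same role without needing a hand-written table.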

In [1]:
from langchain_classic.document_loaders import TextLoader
from langchain_classic.text_splitter import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.chat_models import init_chat_model
from langchain_classic.prompts import PromptTemplate
from langchain_classic.chains.combine_documents import create_stuff_documents_chain
from langchain_classic.chains.retrieval import create_retrieval_chain
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableMap

In [2]:
# Step 1: Load and split the dataset
loader = TextLoader("langchain_crewai_dataset.txt")
raw_docs = loader.load()
splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=50)
chunks = splitter.split_documents(raw_docs)
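
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified sliding-window splitter. It is a sketch, not how `RecursiveCharacterTextSplitter` works internally: the real splitter prefers to break on separators such as paragraphs and sentences, whereas the hypothetical `sliding_window_chunks` helper below cuts at fixed character offsets.

```python
def sliding_window_chunks(text: str, chunk_size: int = 300, chunk_overlap: int = 50) -> list[str]:
    """Cut text into fixed-size windows where each window repeats the tail of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks


sample = "".join(chr(97 + i % 26) for i in range(700))
pieces = sliding_window_chunks(sample)
print([len(p) for p in pieces])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval on small chunks like the 300-character ones used here.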


In [3]:
# Step 2: Vector store
embedding_model = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
vectorstore = FAISS.from_documents(chunks, embedding_model)

# Step 3: MMR retriever (returns 5 chunks per query)
retriever = vectorstore.as_retriever(search_type="mmr", search_kwargs={"k": 5})
retriever


VectorStoreRetriever(tags=['FAISS', 'HuggingFaceEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x13ff96c00>, search_type='mmr', search_kwargs={'k': 5})
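
`search_type="mmr"` selects documents by maximal marginal relevance, which trades relevance to the query against redundancy with documents already chosen. The sketch below shows the selection rule on toy 2-D vectors in pure Python. The `cosine` and `mmr_select` helpers are hypothetical; `lambda_mult` mirrors the name of the LangChain search keyword, but this is not the library's implementation.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(query_vec, doc_vecs, k=5, lambda_mult=0.5):
    """Greedily pick up to k document indices, balancing relevance and diversity."""
    selected: list[int] = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        def score(i: int) -> float:
            relevance = cosine(query_vec, doc_vecs[i])
            # Similarity to the closest already-selected document (0 if none yet)
            redundancy = max((cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0)
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


docs = [[1.0, 0.9], [1.0, 0.8], [0.0, 1.0]]  # docs 0 and 1 are near-duplicates
print(mmr_select([1.0, 1.0], docs, k=2, lambda_mult=0.5))
print(mmr_select([1.0, 1.0], docs, k=2, lambda_mult=1.0))
```

With `lambda_mult=0.5` the second pick skips the near-duplicate in favour of the more distinct document; with `lambda_mult=1.0` the rule reduces to plain similarity ranking.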

In [4]:
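# Step 4: LLM and prompt

import os
from dotenv import load_dotenv

# Load OPENAI_API_KEY from a local .env file into the process environment
load_dotenv()

# Fail early with a clear message instead of a TypeError when the key is missing
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")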

llm=init_chat_model("openai:o4-mini")
llm


ChatOpenAI(profile={'max_input_tokens': 200000, 'max_output_tokens': 100000, 'image_inputs': True, 'audio_inputs': False, 'video_inputs': False, 'image_outputs': False, 'audio_outputs': False, 'video_outputs': False, 'reasoning_output': True, 'tool_calling': True, 'structured_output': True, 'image_url_inputs': True, 'pdf_inputs': True, 'pdf_tool_message': True, 'image_tool_message': True, 'tool_choice': True}, client=<openai.resources.chat.completions.completions.Completions object at 0x154b73c20>, async_client=<openai.resources.chat.completions.completions.AsyncCompletions object at 0x155068440>, root_client=<openai.OpenAI object at 0x14373c860>, root_async_client=<openai.AsyncOpenAI object at 0x154b99e50>, model_name='o4-mini', model_kwargs={}, openai_api_key=SecretStr('**********'), stream_usage=True)

In [5]:
# Query expansion
query_expansion_prompt = PromptTemplate.from_template("""
You are a helpful assistant. Expand the following query to improve document retrieval by adding relevant synonyms, technical terms, and useful context.

Original query: "{query}"

Expanded query:
""")

query_expansion_chain = query_expansion_prompt | llm | StrOutputParser()
query_expansion_chain

PromptTemplate(input_variables=['query'], input_types={}, partial_variables={}, template='\nYou are a helpful assistant. Expand the following query to improve document retrieval by adding relevant synonyms, technical terms, and useful context.\n\nOriginal query: "{query}"\n\nExpanded query:\n')
| ChatOpenAI(profile={'max_input_tokens': 200000, 'max_output_tokens': 100000, 'image_inputs': True, 'audio_inputs': False, 'video_inputs': False, 'image_outputs': False, 'audio_outputs': False, 'video_outputs': False, 'reasoning_output': True, 'tool_calling': True, 'structured_output': True, 'image_url_inputs': True, 'pdf_inputs': True, 'pdf_tool_message': True, 'image_tool_message': True, 'tool_choice': True}, client=<openai.resources.chat.completions.completions.Completions object at 0x154b73c20>, async_client=<openai.resources.chat.completions.completions.AsyncCompletions object at 0x155068440>, root_client=<openai.OpenAI object at 0x14373c860>, root_async_client=<openai.AsyncOpenAI object a

In [6]:
query_expansion_chain.invoke({"query":"Langchain memory"})

'Expanded query:\n\n“LangChain memory” OR “LangChain memory management” OR “LangChain memory module” OR “LangChain memory store” OR “LLM context persistence” OR “session state persistence” OR “stateful agent memory” OR “conversational memory” OR “chatbot memory” OR “session memory” OR “conversation buffer memory” OR “ConversationBufferMemory” OR “ConversationSummaryMemory” OR “LongTermMemory” OR “KnowledgeGraphMemory” OR “vector store memory” OR “embedding store memory” OR “RAG memory” OR “retrieval-augmented generation memory” OR “context window extension” OR “token caching” OR “in-memory vs. disk-based memory” OR “persistent memory store” OR “RedisMemory” OR “FaissMemory” OR “ChromaMemory” OR “PineconeMemory” OR “MilvusMemory” OR “MongoDBMemory” OR “SQLiteMemory” OR “memory plugin” OR “memory component” OR “memory API”'
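
The model echoes the `Expanded query:` label and writes boolean operators. An embedding-based retriever such as the FAISS one above embeds that whole string as ordinary text and has no boolean query syntax, so the label, `OR`/`AND` and the quotes are just noise. One possible cleanup step is sketched below; `clean_expanded_query` is a hypothetical helper and is not used elsewhere in this notebook.

```python
import re


def clean_expanded_query(text: str) -> str:
    """Strip the echoed label, boolean operators, quotes and parentheses from LLM output."""
    # Drop a leading "Expanded query:" label, including any newlines that follow it
    text = re.sub(r"^\s*expanded query:\s*", "", text, flags=re.IGNORECASE)
    # Turn upper-case boolean operators into plain separators
    text = re.sub(r"\s+(?:OR|AND)\s+", ", ", text)
    # Remove straight quotes, curly quotes and grouping parentheses
    text = re.sub(r"[\"“”()]", "", text)
    # Collapse any leftover runs of whitespace
    return re.sub(r"\s+", " ", text).strip()


raw = 'Expanded query:\n\n“LangChain memory” OR “chatbot memory”'
print(clean_expanded_query(raw))
```

If used, it could sit between `query_expansion_chain` and the retriever. Whether it improves retrieval on a given dataset is something to measure rather than assume.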

In [7]:
# RAG answering prompt
answer_prompt = PromptTemplate.from_template("""
Answer the question based on the context below.

Context:
{context}

Question: {input}
""")

document_chain = create_stuff_documents_chain(llm=llm, prompt=answer_prompt)

In [8]:
# Step 5: Full RAG pipeline with query expansion
rag_pipeline = (
    RunnableMap({
        "input": lambda x: x["input"],
        "context": lambda x: retriever.invoke(query_expansion_chain.invoke({"query": x["input"]}))
    })
    | document_chain
)
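
The pipeline above retrieves with the expanded query only. A common variation is to retrieve with both the original and the expanded query and merge the results, so a poor expansion cannot fully displace what the original wording would have found. The sketch below uses stand-in callables and plain strings instead of the real chain, retriever and `Document` objects; `retrieve_with_expansion`, `fake_index`, `fake_expand` and `fake_retrieve` are all hypothetical.

```python
from typing import Callable


def retrieve_with_expansion(
    query: str,
    expand: Callable[[str], str],
    retrieve: Callable[[str], list[str]],
    k: int = 5,
) -> list[str]:
    """Retrieve for the original and the expanded query, de-duplicating in order."""
    seen: set[str] = set()
    merged: list[str] = []
    for q in (query, expand(query)):  # original query first, so its hits rank first
        for doc in retrieve(q):
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged[:k]


# Stand-ins for the LLM expansion chain and the vector-store retriever (assumptions)
fake_index = {"memory": ["A", "B"], "memory buffer": ["B", "C"]}


def fake_expand(q: str) -> str:
    return q + " buffer"


def fake_retrieve(q: str) -> list[str]:
    return fake_index.get(q, [])


print(retrieve_with_expansion("memory", fake_expand, fake_retrieve))
```

With LangChain `Document` objects, the de-duplication key would need to be something hashable such as `doc.page_content`.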

In [9]:
# Step 6: Run query
query = {"input": "What types of memory does LangChain support?"}
# Pass the question string, not the whole dict, to the expansion chain
print(query_expansion_chain.invoke({"query": query["input"]}))
response = rag_pipeline.invoke(query)
print("✅ Answer:\n", response)

Expanded query:  
“LangChain supported memory types and modules – what memory architectures, storage backends and interfaces does the LangChain framework provide? For example: conversation memory, buffer memory (ConversationBufferMemory), summary memory (ConversationSummaryMemory), windowed or sliding-window/token-window memory, vector-store memory, callback memory, long-term vs short-term memory, in-memory vs persistent (Redis, SQL, file-based) memory stores, custom memory implementations; memory management strategies, memory backends, memory modules and memory interfaces in LangChain.”
✅ Answer:
 LangChain’s built-in memory modules fall into two main flavors:  
1. ConversationBufferMemory – keeps the full history of the back-and-forth in memory, and  
2. ConversationSummaryMemory – rolls up earlier turns into a concise summary so you can stay within token limits.


In [10]:
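# Step 7: Run a second, under-specified query
query = {"input": "CrewAI agents?"}
# Pass the question string, not the whole dict, to the expansion chain
print(query_expansion_chain.invoke({"query": query["input"]}))
response = rag_pipeline.invoke(query)
print("✅ Answer:\n", response)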

Expanded query:

("CrewAI" OR "Crew AI" OR "CrewAI platform" OR "Crew AI agents")  
AND  
("AI agents" OR "autonomous agents" OR "intelligent agents" OR "virtual assistants" OR "digital assistants" OR "software agents" OR "conversational agents" OR "multi-agent system")  
AND  
("features" OR "capabilities" OR "architecture" OR "design patterns" OR "framework" OR "integration" OR "APIs" OR "SDK" OR "technical specifications" OR "documentation")  
AND  
("use cases" OR "applications" OR "workflows" OR "automation" OR "customer support" OR "operations" OR "logistics" OR "team collaboration")  
AND  
("natural language processing" OR "NLP" OR "reinforcement learning" OR "deep learning" OR "multi-modal AI" OR "agent orchestration" OR "decision making" OR "LLM" OR "GPT-4" OR "transformers")  
AND  
("deployment" OR "scalability" OR "security" OR "performance" OR "best practices")
✅ Answer:
 CrewAI agents are autonomous AI “workers” that you assemble into a team (a “crew”) to tackl