### Query Enhancement ‚Äì Query Expansion Techniques

In a RAG pipeline, the quality of the query sent to the retriever determines how good the retrieved context is ‚Äî and therefore, how accurate the LLM‚Äôs final answer will be.

That‚Äôs where Query Expansion / Enhancement comes in.

#### üéØ What is Query Enhancement?
Query enhancement refers to techniques used to improve or reformulate the user query to retrieve better, more relevant documents from the knowledge base.
It is especially useful when:

- The original query is short, ambiguous, or under-specified
- You want to broaden the scope to catch synonyms, related phrases, or spelling variants

In [1]:
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.chat_models import init_chat_model
from langchain.prompts import PromptTemplate
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains.retrieval import create_retrieval_chain
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableMap

  from .autonotebook import tqdm as notebook_tqdm


In [2]:
## step1 : Load and split the dataset
loader = TextLoader("langchain_crewai_dataset.txt")
raw_docs = loader.load()
splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=50)
chunks = splitter.split_documents(raw_docs)


In [3]:
chunks

[Document(metadata={'source': 'langchain_crewai_dataset.txt'}, page_content='LangChain is an open-source framework designed for developing applications powered by large language models (LLMs). It simplifies the process of building, managing, and scaling complex chains of thought by abstracting prompt management, retrieval, memory, and agent orchestration. Developers can use'),
 Document(metadata={'source': 'langchain_crewai_dataset.txt'}, page_content='and agent orchestration. Developers can use LangChain to create end-to-end pipelines that connect LLMs with tools, APIs, vector databases, and other knowledge sources. (v1)'),
 Document(metadata={'source': 'langchain_crewai_dataset.txt'}, page_content='At the heart of LangChain lies the concept of chains, which are sequences of calls to LLMs and other tools. Chains can be simple, such as a single prompt fed to an LLM, or complex, involving multiple conditionally executed steps. LangChain makes it easy to compose and reuse chains using st

In [4]:
### step 2: Vector Store
embedding_model=HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
vectorstore=FAISS.from_documents(chunks,embedding_model)

## step 3:MMR Retriever
retriever=vectorstore.as_retriever(search_type="mmr",search_kwargs={"k":5})
retriever


VectorStoreRetriever(tags=['FAISS', 'HuggingFaceEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x000002702501AED0>, search_type='mmr', search_kwargs={'k': 5})

In [5]:
## step 4 : LLM and Prompt

import os
from dotenv import load_dotenv
load_dotenv()

os.environ["OPENAI_API_KEY"]=os.getenv("OPENAI_API_KEY")

llm=init_chat_model("openai:o4-mini")
llm


ChatOpenAI(client=<openai.resources.chat.completions.completions.Completions object at 0x0000027025B7A7E0>, async_client=<openai.resources.chat.completions.completions.AsyncCompletions object at 0x0000027026115DF0>, root_client=<openai.OpenAI object at 0x0000027025AB4500>, root_async_client=<openai.AsyncOpenAI object at 0x0000027025B7A5D0>, model_name='o4-mini', model_kwargs={}, openai_api_key=SecretStr('**********'), stream_usage=True)

In [6]:
# Query expansion
query_expansion_prompt = PromptTemplate.from_template("""
You are a helpful assistant. Expand the following query to improve document retrieval by adding relevant synonyms, technical terms, and useful context.

Original query: "{query}"

Expanded query:
""")

query_expansion_chain=query_expansion_prompt| llm | StrOutputParser()
query_expansion_chain

PromptTemplate(input_variables=['query'], input_types={}, partial_variables={}, template='\nYou are a helpful assistant. Expand the following query to improve document retrieval by adding relevant synonyms, technical terms, and useful context.\n\nOriginal query: "{query}"\n\nExpanded query:\n')
| ChatOpenAI(client=<openai.resources.chat.completions.completions.Completions object at 0x0000027025B7A7E0>, async_client=<openai.resources.chat.completions.completions.AsyncCompletions object at 0x0000027026115DF0>, root_client=<openai.OpenAI object at 0x0000027025AB4500>, root_async_client=<openai.AsyncOpenAI object at 0x0000027025B7A5D0>, model_name='o4-mini', model_kwargs={}, openai_api_key=SecretStr('**********'), stream_usage=True)
| StrOutputParser()

In [7]:
query_expansion_chain.invoke({"query":"Langchain memory"})

'LangChain memory OR ‚ÄúLang chain‚Äù memory management OR conversational memory in LangChain OR LangChain state management OR long-term memory in LangChain OR short-term memory in LangChain OR context management in LangChain OR memory buffers OR ConversationBufferMemory OR ConversationBufferWindowMemory OR ConversationTokenBufferMemory OR CombinedMemory OR summary memory OR retrieval-augmented generation memory OR memory retriever OR memory store OR embeddings vector store OR vector database integration (Pinecone, Chroma, FAISS, Qdrant, Redis) OR session state persistence OR RAG workflows OR memory module best practices OR code examples (Python SDK, JavaScript API) OR performance optimization, debugging, and use-case patterns.'

In [8]:
# RAG answering prompt
answer_prompt = PromptTemplate.from_template("""
Answer the question based on the context below.

Context:
{context}

Question: {input}
""")

document_chain=create_stuff_documents_chain(llm=llm,prompt=answer_prompt)

In [9]:
# Step 5: Full RAG pipeline with query expansion
rag_pipeline = (
    RunnableMap({
        "input": lambda x: x["input"],
        "context": lambda x: retriever.invoke(query_expansion_chain.invoke({"query": x["input"]}))
    })
    | document_chain
)

In [10]:
# Step 6: Run query
query = {"input": "What types of memory does LangChain support?"}
print(query_expansion_chain.invoke({"query":query}))
response = rag_pipeline.invoke(query)
print("‚úÖ Answer:\n", response)

Expanded query:  
What types of memory does LangChain support for managing LLM context and state‚Äîe.g. built-in memory modules like ConversationBufferMemory, ConversationSummaryMemory, ConversationBufferWindowMemory, EntityMemory or KnowledgeGraphMemory, and retriever-augmented memory‚Äîas well as supported backends (in-memory dicts, RedisMemory, MongoDBMemory, SQLMemory, vector-store memories using Pinecone, FAISS, Chroma, Weaviate, etc.)? How do these map to short-term (working/session memory) vs. long-term (persistent/semantic/episodic) memory patterns in LangChain chains and agents?
‚úÖ Answer:
 LangChain currently ships with two main conversational memory modules:

1. ConversationBufferMemory  
   ‚Äì Keeps a running ‚Äúbuffer‚Äù of all past turns in the session.  
2. ConversationSummaryMemory  
   ‚Äì As the chat grows, it distills older messages into a concise summary so you stay within token limits.


In [11]:
# Step 6: Run query
query = {"input": "CrewAI agents?"}
print(query_expansion_chain.invoke({"query":query}))
response = rag_pipeline.invoke(query)
print("‚úÖ Answer:\n", response)

Here‚Äôs one way to turn  
  {'input': 'CrewAI agents?'}  
into a richer Boolean-style retrieval query that adds synonyms, technical terms and context:

(‚ÄúCrewAI‚Äù OR ‚ÄúCrew AI‚Äù OR ‚ÄúAI Crew‚Äù OR ‚ÄúCrew Intelligence‚Äù OR ‚ÄúCrewAI platform‚Äù OR ‚ÄúCrewAI system‚Äù)  
AND  
(‚Äúagent‚Äù OR ‚Äúagents‚Äù OR ‚Äúintelligent agent‚Äù OR ‚Äúsoftware agent‚Äù OR ‚Äúautonomous agent‚Äù OR ‚ÄúAI assistant‚Äù OR ‚Äúvirtual assistant‚Äù OR ‚Äúdigital agent‚Äù OR ‚Äúbot‚Äù OR ‚Äúautomated assistant‚Äù)  
AND  
(‚Äúmulti-agent system‚Äù OR ‚ÄúMAS‚Äù OR ‚Äúagent orchestration‚Äù OR ‚Äúagent framework‚Äù OR ‚Äúagent architecture‚Äù OR ‚ÄúAPI integration‚Äù OR ‚Äúplatform features‚Äù OR ‚Äúplatform capabilities‚Äù OR ‚Äúdeployment‚Äù OR ‚Äúintegration‚Äù OR ‚Äúuse cases‚Äù OR ‚Äúworkflow automation‚Äù OR ‚Äútask automation‚Äù OR ‚Äúcollaborative AI‚Äù OR ‚Äúteam collaboration‚Äù OR ‚Äúreinforcement learning‚Äù OR ‚Äúnatural language processing‚Äù)

You can tailor the last line (use cases/tec