# **Query Enhancement – Query Expansion Techniques**
In a RAG pipeline, the quality of the query sent to the retriever determines how good the retrieved context is — and therefore, how accurate the LLM’s final answer will be.

That’s where Query Expansion / Enhancement comes in.

**What is Query Enhancement?**
Query enhancement refers to techniques used to improve or reformulate the user query to retrieve better, more relevant documents from the knowledge base.
It is especially useful when:

- The original query is short, ambiguous, or under-specified
- You want to broaden the scope to catch synonyms, related phrases, or spelling variants

In [1]:
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.chat_models import init_chat_model
from langchain_core.prompts import PromptTemplate
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains.retrieval import create_retrieval_chain
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableMap

In [None]:
# Step 1: Load and split the dataset
loader = TextLoader("langchain_crewai_dataset.txt")
raw_docs = loader.load()
splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=50)
chunks = splitter.split_documents(raw_docs)

In [None]:
'''Sample of the resulting chunks:
[Document(metadata={'source': 'langchain_crewai_dataset.txt'}, page_content='LangChain is an open-source framework designed for developing applications powered by large language models (LLMs). It simplifies the process of building, managing, and scaling complex chains of thought by abstracting prompt management, retrieval, memory, and agent orchestration. Developers can use'),
 Document(metadata={'source': 'langchain_crewai_dataset.txt'}, page_content='and agent orchestration. Developers can use LangChain to create end-to-end pipelines that connect LLMs with tools, APIs, vector databases, and other knowledge sources. (v1)'),
 
'''

In [None]:
# Step 2: Vector store
embedding_model = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
vectorstore = FAISS.from_documents(chunks, embedding_model)

# Step 3: MMR retriever
retriever = vectorstore.as_retriever(search_type="mmr", search_kwargs={"k": 5})

In [None]:
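# Step 4: LLM and prompt

import os
from dotenv import load_dotenv

# load_dotenv() copies OPENAI_API_KEY from .env into os.environ,
# so no manual assignment is needed; fail early if the key is missing.
load_dotenv()
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")

llm = init_chat_model("openai:o4-mini")
llm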

ChatOpenAI(client=<openai.resources.chat.completions.completions.Completions object at 0x0000029BF1766510>, async_client=<openai.resources.chat.completions.completions.AsyncCompletions object at 0x0000029BF1766F90>, root_client=<openai.OpenAI object at 0x0000029BF058B620>, root_async_client=<openai.AsyncOpenAI object at 0x0000029BF1766CF0>, model_name='o4-mini', model_kwargs={}, openai_api_key=SecretStr('**********'))

In [6]:
# Query expansion
query_expansion_prompt = PromptTemplate.from_template("""
You are a helpful assistant. Expand the following query to improve document retrieval by adding relevant synonyms, technical terms, and useful context.

Original query: "{query}"

Expanded query:
""")

query_expansion_chain = query_expansion_prompt | llm | StrOutputParser()
query_expansion_chain

PromptTemplate(input_variables=['query'], input_types={}, partial_variables={}, template='\nYou are a helpful assistant. Expand the following query to improve document retrieval by adding relevant synonyms, technical terms, and useful context.\n\nOriginal query: "{query}"\n\nExpanded query:\n')
| ChatOpenAI(client=<openai.resources.chat.completions.completions.Completions object at 0x0000029BF1766510>, async_client=<openai.resources.chat.completions.completions.AsyncCompletions object at 0x0000029BF1766F90>, root_client=<openai.OpenAI object at 0x0000029BF058B620>, root_async_client=<openai.AsyncOpenAI object at 0x0000029BF1766CF0>, model_name='o4-mini', model_kwargs={}, openai_api_key=SecretStr('**********'))
| StrOutputParser()

In [7]:
query_expansion_chain.invoke({"query":"Langchain memory"})

'Expanded query:\n\n("LangChain memory" OR "LangChain memory management" OR "LangChain memory module" OR "persistent conversation memory" OR "embeddings-based memory" OR "stateful chatbots" OR "session context persistence" OR "RAG memory store" OR "LLM context window" OR "prompt memory" OR "memory retriever")  \nAND  \n("VectorStoreMemory" OR "ConversationBufferMemory" OR "ConversationSummaryMemory" OR "RedisMemory" OR "SQLMemory" OR "MemoryChain" OR "MemoryRouter")  \nAND  \n(FAISS OR Chroma OR Pinecone OR Weaviate OR Milvus OR "vector embedding store" OR "document chunking" OR "knowledge retrieval")'

In [8]:
# RAG answering prompt
answer_prompt = PromptTemplate.from_template("""
Answer the question based on the context below.

Context:
{context}

Question: {input}
""")

document_chain = create_stuff_documents_chain(llm=llm, prompt=answer_prompt)

In [9]:
# Step 5: Full RAG pipeline with query expansion
rag_pipeline = (
    RunnableMap({
        "input": lambda x: x["input"],
        "context": lambda x: retriever.invoke(query_expansion_chain.invoke({"query": x["input"]}))
    })
    | document_chain
)

In [10]:
# Step 6: Run query
query = {"input": "What types of memory does LangChain support?"}
# Pass the question string, not the whole dict, to the expansion chain
print(query_expansion_chain.invoke({"query": query["input"]}))
response = rag_pipeline.invoke(query)
print("✅ Answer:\n", response)

Here’s an expanded search query that adds synonyms, technical terms and useful context for better recall of LangChain’s memory features:

“LangChain memory support” OR “LangChain memory types” OR “LangChain memory modules” OR “LangChain memory classes”  
AND (“ConversationBufferMemory” OR “ConversationSummaryMemory” OR “CombinedMemory” OR “DynamicMemory” OR “ConversationTokenBufferMemory”)  
OR (“short-term memory” OR “long-term memory” OR “ephemeral memory” OR “session state” OR “context window”)  
OR (“persistent memory” OR “stateful memory store” OR “memory retriever”)  
OR (“vector store memory” OR “embedding store” OR “semantic memory” OR “RAG”)  
OR (“Chroma” OR “FAISS” OR “Pinecone” OR “Weaviate” OR “Redis” OR “SQLite” OR “PostgreSQL” OR “MongoDB”)  
OR (“memory API” OR “memory variable” OR “memory backend” OR “cache” OR “in-memory” OR “file-based”)  
AND (“LangChain Python” OR “LLMChain” OR “chatbot context management” OR “agent state management”)
✅ Answer:
 LangChain currently

In [12]:
# Step 7: Run a second query
query = {"input": "CrewAI agents?"}
# Pass the question string, not the whole dict, to the expansion chain
print(query_expansion_chain.invoke({"query": query["input"]}))
response = rag_pipeline.invoke(query)
print("✅ Answer:\n", response)

Expanded query:

("CrewAI agents" OR "Crew AI agents" OR "CrewAI bots" OR "Crew AI assistants" OR "autonomous crew agents" OR "AI-driven crew management assistants" OR "virtual crew members" OR "digital crew agents")  
AND  
("crew scheduling" OR "workforce management" OR "staff rostering" OR "resource allocation" OR "employee roster optimization")  
AND  
("multi-agent system" OR "autonomous agents" OR "distributed AI" OR "agent-based modeling")  
AND  
("machine learning" OR "reinforcement learning" OR "predictive analytics" OR "optimization algorithms" OR "real-time planning")  
AND  
("airline" OR "maritime" OR "hospitality" OR "logistics" OR "field service")
✅ Answer:
 CrewAI agents are semi-autonomous, role-specialized AI “workers” that team up in a predefined workflow to tackle complex, multi-step tasks.  Key points:  
1. Defined Roles  
   • Researcher – gathers data and insights  
   • Planner – lays out strategy, timelines, and dependencies  
   • Executor – carries out concr

## **Query Expansion**

> **Query Expansion (QE)** is a technique used in **information retrieval (IR)** and **vector-based search systems** (like FAISS, Pinecone, or Elasticsearch) to improve search accuracy and recall by **broadening the user’s query** with additional, semantically related terms or concepts.

It helps the system retrieve **more relevant documents** — even if they don’t contain the *exact words* used in the original query — by introducing synonyms, related entities, or paraphrased expressions.

---

**1. Why Query Expansion is Needed**

When users search for information, they often use **limited or ambiguous terms**. For example:

| Original Query  | Problem                      | Example Missed Results                                       |
| --------------- | ---------------------------- | ------------------------------------------------------------ |
| “car insurance” | Doesn’t include synonyms     | Misses “automobile insurance”, “vehicle coverage”            |
| “heart attack”  | Doesn’t include medical term | Misses “myocardial infarction”                               |
| “AI jobs”       | Ambiguous                    | Misses “machine learning engineer”, “data science positions” |

Query expansion addresses this by **adding semantically related words**. The goal is higher **recall** (more of the relevant documents are retrieved), ideally with little loss of **precision** (the share of retrieved documents that are actually relevant).

---

**2. Types of Query Expansion**

There are several ways to expand a query depending on the source and intent of expansion:

| Type                             | Description                                                           | Example                                                            |
| -------------------------------- | --------------------------------------------------------------------- | ------------------------------------------------------------------ |
| **Synonym Expansion**            | Adds synonyms or alternate terms                                      | “doctor” → “physician”, “medical practitioner”                     |
| **Semantic Expansion**           | Uses embeddings or knowledge graphs to add conceptually related terms | “AI” → “machine learning”, “deep learning”, “neural networks”      |
| **Stemming / Lemmatization**     | Expands to different forms of the same word                           | “run” → “running”, “ran”                                           |
| **Contextual Expansion**         | Uses LLMs or context understanding to expand query meaningfully       | “COVID vaccines” → “Pfizer”, “Moderna”, “vaccine efficacy”         |
| **Relevance Feedback Expansion** | Expands based on previous user feedback or clicked results            | If user clicked “Tesla”, add “EV”, “electric vehicle”, “Elon Musk” |
| **Statistical Expansion**        | Expands using co-occurrence frequency in corpus                       | “bank” → “river”, “loan”, depending on surrounding words           |
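
As a minimal sketch of the first row (synonym expansion), the snippet below substitutes synonyms from a small hand-written thesaurus and shows the effect on a toy substring search. The `THESAURUS` dictionary, the helper names, and the three sample documents are all illustrative assumptions; a real system would more likely draw on WordNet or a domain glossary.

```python
import re

# Hypothetical hand-written thesaurus (assumption for illustration only).
THESAURUS = {
    "car": ["automobile", "vehicle"],
    "doctor": ["physician", "medical practitioner"],
}

def expand_with_synonyms(query: str, thesaurus: dict[str, list[str]]) -> list[str]:
    """Return the original query plus one variant per synonym substitution."""
    variants = [query]
    for term, synonyms in thesaurus.items():
        # Match the term as a whole word, ignoring case.
        pattern = re.compile(rf"\b{re.escape(term)}\b", flags=re.IGNORECASE)
        if pattern.search(query):
            for synonym in synonyms:
                variants.append(pattern.sub(synonym, query))
    return variants

def keyword_search(queries: list[str], documents: list[str]) -> list[str]:
    """Toy search: keep documents containing any query as a substring."""
    return [doc for doc in documents
            if any(q.lower() in doc.lower() for q in queries)]

documents = [
    "Compare car insurance quotes online.",
    "Automobile insurance rates rose this year.",
    "Home insurance covers fire damage.",
]

variants = expand_with_synonyms("car insurance", THESAURUS)
print(variants)
print(keyword_search(["car insurance"], documents))  # original query only
print(keyword_search(variants, documents))           # expanded query set
```

With the original query alone, only the first document matches; the expanded set also finds the "Automobile insurance" document, which is the vocabulary gap described in the table in section 1.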

---

**3. How Query Expansion Works**

The general workflow of Query Expansion includes:

1. **Receive the original query**
   e.g., “AI in healthcare”

2. **Analyze or embed the query**
   Convert it into an embedding vector or parse keywords.

3. **Generate related terms** using one of the following:

   * Predefined thesaurus (like WordNet)
   * Semantic embedding similarity
   * LLMs (e.g., GPT models)
   * Statistical models or co-occurrence analysis

4. **Expand the query** by adding similar or related terms
   e.g., “AI in healthcare” → “machine learning in medicine”, “deep learning for diagnostics”

5. **Search using the expanded query set**
   Combine results and rank them based on similarity, relevance, or diversity.
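
The workflow above can be sketched end to end with a toy retriever. The notes do not name a specific way to combine results in step 5, so this example assumes reciprocal rank fusion (RRF), one common choice; the token-overlap retriever, the document ids, and the constant `c = 60` are likewise assumptions for illustration.

```python
from collections import defaultdict

def token_overlap_search(query: str, documents: dict[str, str], k: int = 3) -> list[str]:
    """Toy retriever: rank document ids by the number of shared lowercase tokens."""
    query_tokens = set(query.lower().split())
    scored = []
    for doc_id, text in documents.items():
        overlap = len(query_tokens & set(text.lower().split()))
        if overlap > 0:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties broken by id so the order is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]

def reciprocal_rank_fusion(rankings: list[list[str]], c: int = 60) -> list[str]:
    """Fuse ranked lists: each document scores the sum of 1 / (c + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (c + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

documents = {
    "d1": "ai in healthcare improves patient triage",
    "d2": "machine learning in medicine predicts outcomes",
    "d3": "deep learning for diagnostics reads scans",
    "d4": "gardening tips for spring",
}
# Step 4 output: the original query plus its expansions.
expanded_queries = [
    "ai in healthcare",
    "machine learning in medicine",
    "deep learning for diagnostics",
]
# Step 5: search with every variant, then fuse the ranked lists.
rankings = [token_overlap_search(q, documents) for q in expanded_queries]
fused = reciprocal_rank_fusion(rankings)
print(fused)
```

Note that the unrelated gardening document enters the fused list at the bottom because it shares the word "for" with one expansion. This is a small instance of the precision cost that broader queries can bring.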

---

**4. Query Expansion in Vector Databases**

In **vector search systems** like FAISS, Pinecone, or Chroma, query expansion is often done by:

* **Embedding expansion**:
  Embed several semantically similar rewrites of the query, search with each vector, and merge the result lists.

* **Weighted retrieval**:
  Give higher weights to original query terms but still include related embeddings.

* **Re-ranking or MMR integration**:
  Combine expanded query results with MMR (Maximal Marginal Relevance) for diversity and precision.
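
To make the MMR step concrete, here is a minimal greedy MMR selection in plain Python. It is a sketch, not the implementation used by FAISS or LangChain: the 2-D unit vectors stand in for real embeddings, and `mmr_select`, `lam`, and the candidate ids are hypothetical names chosen for this example.

```python
import math

def unit_vector(degrees: float) -> list[float]:
    """2-D unit vector at the given angle; stands in for a real embedding."""
    radians = math.radians(degrees)
    return [math.cos(radians), math.sin(radians)]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def mmr_select(query_vec, candidates, k=2, lam=0.5):
    """Greedy MMR: repeatedly pick the candidate with the highest
    lam * sim(query, doc) - (1 - lam) * max sim(doc, selected doc)."""
    selected: list[str] = []
    remaining = sorted(candidates)  # sorted ids give deterministic tie-breaking

    def mmr_score(doc_id: str) -> float:
        relevance = cosine(query_vec, candidates[doc_id])
        redundancy = max(
            (cosine(candidates[doc_id], candidates[s]) for s in selected),
            default=0.0,
        )
        return lam * relevance - (1 - lam) * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected

query_vec = unit_vector(0)
candidates = {
    "a": unit_vector(10),   # most relevant
    "b": unit_vector(12),   # near-duplicate of "a"
    "c": unit_vector(-40),  # less relevant, but different from "a"
}
diverse = mmr_select(query_vec, candidates, k=2, lam=0.5)
relevance_only = mmr_select(query_vec, candidates, k=2, lam=1.0)
print(diverse, relevance_only)
```

With `lam=0.5` the near-duplicate "b" is skipped in favour of "c"; with `lam=1.0` the redundancy penalty disappears and the selection falls back to plain relevance ranking. Merged results from several expanded queries tend to contain such near-duplicates, which is why MMR pairs well with expansion.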

---

**Example: Query Expansion with LangChain and OpenAI**

```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Step 1: Define the LLM (reads OPENAI_API_KEY from the environment)
llm = ChatOpenAI(model="gpt-4o-mini")

# Step 2: Create a prompt for query expansion
template = """
Expand the following query by adding relevant and semantically related search terms:
Query: {query}
Return a list of related phrases or keywords.
"""
prompt = PromptTemplate.from_template(template)

# Step 3: Build the chain (prompt -> model -> plain string);
# this replaces the deprecated LLMChain
expand_chain = prompt | llm | StrOutputParser()

# Step 4: Run the expansion (invoke replaces the deprecated run)
expanded_query = expand_chain.invoke({"query": "AI in healthcare"})
print(expanded_query)
```

**Output Example:**

```
["artificial intelligence in medicine", "machine learning for diagnosis", 
 "deep learning in healthcare", "medical AI applications", 
 "health data analytics"]
```

Then these terms can be embedded and searched together in a **vector database** like FAISS or Pinecone.
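
Before the terms can be searched, the raw LLM string has to be turned into a list of queries, and models do not always return clean JSON. The helper below is a hedged sketch of that step: `parse_expansions` and its `max_terms` parameter are hypothetical names, and the fallback assumes one term per line with optional bullets or numbering.

```python
import json
import re

def parse_expansions(raw_output: str, original_query: str, max_terms: int = 5) -> list[str]:
    """Turn raw LLM text into a deduplicated list of queries, original first."""
    try:
        parsed = json.loads(raw_output)
        terms = [str(item) for item in parsed] if isinstance(parsed, list) else []
    except json.JSONDecodeError:
        terms = []
    if not terms:
        # Fallback: one term per line, stripping bullets, numbering, and quotes.
        terms = [
            re.sub(r"^\s*(?:[-*•]|\d+[.)])\s*", "", line).strip(' "\',')
            for line in raw_output.splitlines()
        ]
    cleaned = [term.strip() for term in terms if term.strip()]
    queries, seen = [], set()
    for term in [original_query, *cleaned]:
        key = term.lower()
        if key not in seen:  # case-insensitive deduplication
            seen.add(key)
            queries.append(term)
    # Keep the original query plus at most max_terms expansions.
    return queries[: max_terms + 1]

json_output = '["artificial intelligence in medicine", "machine learning for diagnosis"]'
bullet_output = "1. machine learning in medicine\n- deep learning for diagnostics\n\nAI in healthcare"

print(parse_expansions(json_output, "AI in healthcare"))
print(parse_expansions(bullet_output, "AI in healthcare"))
```

Keeping the original query first means the retriever always sees the user's own wording, even if the model returns nothing usable.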

---

**5. Query Expansion Techniques by Source**

| Technique                 | Source                             | Description                                             |
| ------------------------- | ---------------------------------- | ------------------------------------------------------- |
| **Thesaurus-Based**       | WordNet, domain glossary           | Adds linguistic synonyms                                |
| **Embedding-Based**       | OpenAI, BERT, SentenceTransformers | Adds semantically related terms using vector similarity |
| **Knowledge Graph-Based** | Wikidata, ConceptNet               | Adds conceptually linked entities                       |
| **LLM-Based**             | GPT, Claude, Gemini                | Dynamically generates context-aware expansions          |
| **Relevance Feedback**    | User interactions                  | Uses click history to improve expansion accuracy        |

---

**6. Example: FAISS + Query Expansion**

You can use query expansion in FAISS like this:

```python
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

# Create embeddings and index the documents
# (`chunks` is assumed to be the list of split Documents from the notebook above)
embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_documents(chunks, embeddings)

# Original query
query = "AI in healthcare"

# Expanded queries (written manually or generated by an LLM)
expanded_queries = [
    query,
    "machine learning in medicine",
    "deep learning in diagnostics",
]

# Collect search results from every query variant
results = []
for q in expanded_queries:
    docs = vectorstore.similarity_search(q, k=3)
    results.extend(docs)

# Deduplicate by page content (order of first appearance is preserved);
# reranking, if wanted, would follow as a separate step
unique_results = list({doc.page_content: doc for doc in results}.values())
```

---

**7. Benefits of Query Expansion**

- **Improved Recall** — Retrieves more relevant documents.
- **Enhanced Understanding** — Captures broader meaning of user intent.
- **Bridges Vocabulary Gap** — Connects different terminology (e.g., “COVID” vs. “coronavirus”).
- **Adaptive Searching** — Works well with dynamic, domain-specific language.

---

**8. Drawbacks and Challenges**

- **Loss of Precision** — Adding too many terms may include irrelevant results.
- **Computational Overhead** — More queries → higher processing time.
- **Domain Dependence** — Requires domain-specific expansion sources for best accuracy.
- **Ambiguity** — Incorrect expansions can mislead retrieval (e.g., “bank” → “river” instead of “finance”).

---

**9. Balancing Expansion with Relevance**

Often, systems use **weighted query expansion**, where:

* Original query terms are given a **higher weight** (α).
* Expanded terms are given a **lower weight** (β), with α > β.

$$\text{Final Query Vector} = \alpha \times \text{Original Query} + \beta \times \sum(\text{Expanded Terms})$$

This ensures results remain faithful to the user’s intent while exploring related areas.
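
A small numeric sketch of the weighted formula follows, using plain Python lists as stand-in embeddings. The values `alpha=0.7` and `beta=0.3` are arbitrary assumptions, and the final L2 normalisation is an extra step not stated in the formula; it is included because the summed vector grows with the number of expansions, which matters for stores that compare by inner product.

```python
import math

def weighted_query_vector(original, expanded, alpha=0.7, beta=0.3):
    """Compute alpha * original + beta * sum(expanded), element by element."""
    combined = [alpha * value for value in original]
    for vector in expanded:
        for i, value in enumerate(vector):
            combined[i] += beta * value
    return combined

def l2_normalize(vector):
    """Scale to unit length (optional step, not part of the formula above)."""
    norm = math.sqrt(sum(value * value for value in vector))
    return [value / norm for value in vector] if norm else vector

# Toy 3-D "embeddings": one original query and two expanded terms.
original = [1.0, 0.0, 0.0]
expanded = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

final_vector = weighted_query_vector(original, expanded)
print(final_vector)
print(l2_normalize(final_vector))
```

The result still points mostly along the original query's direction, with smaller components contributed by the expanded terms. Raising `beta` relative to `alpha` pulls the search further toward the related concepts.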

---

**10. Real-World Applications**

* **Search Engines** – Expands user queries for better results (e.g., Google “Did you mean?”).
* **RAG Pipelines** – Enhances retrieval coverage for LLMs.
* **Recommendation Systems** – Finds similar items even with different keywords.
* **Healthcare / Legal Search** – Identifies synonymic or technical terminology.
* **Chatbots** – Interprets varied user phrasing for intent matching.

---

**11. Integrating Query Expansion with MMR and Reranking**

A modern **retrieval pipeline** often combines:

1. **Query Expansion** → Improves recall
2. **MMR (Maximal Marginal Relevance)** → Adds diversity
3. **Reranking (Cross-Encoder)** → Improves precision

This results in **highly relevant, diverse, and accurate** retrieval — the foundation for **high-quality RAG systems**.

---

**Key Takeaways**

* Query Expansion enhances retrieval quality by adding **semantically related search terms**.
* It can be implemented using **LLMs, embeddings, or linguistic databases**.
* Works best when **combined with reranking and MMR**.
* Crucial for **semantic search, RAG, and large-scale AI retrieval systems**.

---

**Formula Summary**

$$\text{Expanded Query} = Q + \sum_{i=1}^{n} \text{RelatedTerms}_i$$