#### **Query Enhancement**

Beyond optimizing chunking, we can also improve retrieval with query enhancement: techniques that clarify the query or add detail to it. The better the query, the better the retrieval.

There are three approaches:

- Query rewriting: an LLM reformulates the query to make it more specific and detailed.

- Query expansion: generate a broader query that keeps the intent and semantics of the original query.

- Sub-query decomposition: break a complex query into simpler sub-queries.

**Main Goal** - Getting comprehensive retrieval

#### **LLM used**

In [1]:
from langchain_ollama import ChatOllama 

llm = ChatOllama(
    model='llama3.2',
    temperature=0,
    verbose=True
)

llm.invoke("Hey How are you?")



AIMessage(content="I'm just a language model, so I don't have emotions or feelings like humans do. However, I'm functioning properly and ready to help with any questions or tasks you may have! How can I assist you today?", additional_kwargs={}, response_metadata={'model': 'llama3.2', 'created_at': '2025-12-05T09:59:44.650567Z', 'done': True, 'done_reason': 'stop', 'total_duration': 20997181625, 'load_duration': 3630311333, 'prompt_eval_count': 30, 'prompt_eval_duration': 12778679959, 'eval_count': 46, 'eval_duration': 3182832337, 'logprobs': None, 'model_name': 'llama3.2', 'model_provider': 'ollama'}, id='lc_run--03c27077-a03e-4033-b4a7-c9fe5e37bdf5-0', usage_metadata={'input_tokens': 30, 'output_tokens': 46, 'total_tokens': 76})

#### **Embedding Model**

In [2]:
from langchain_huggingface import HuggingFaceEmbeddings 

embedding_model = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

text = "This is a test document."
query_result = embedding_model.embed_query(text)

# show only the first 100 characters of the stringified vector
print(f"Dimension of embeddings : {len(query_result)}")
print(str(query_result)[:100] + "...")

Dimension of embeddings : 384
[-0.0383385606110096, 0.1234646886587143, -0.02864295430481434, 0.05365273356437683, 0.0088453618809...
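
These 384-dimensional vectors are what retrieval compares: the query embedding is scored against each chunk embedding, commonly with cosine similarity. The sketch below illustrates that scoring step only. The 3-dimensional vectors are made-up stand-ins for real `embed_query` output, and `cosine_similarity` is a hypothetical helper, not something defined in this notebook.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins for embedding_model.embed_query(...) results (assumed values).
query_vec = [1.0, 0.0, 1.0]
doc_vecs = {
    "doc_a": [1.0, 0.0, 1.0],  # same direction as the query
    "doc_b": [0.0, 1.0, 0.0],  # orthogonal to the query
    "doc_c": [1.0, 1.0, 0.0],  # partially aligned with the query
}

# Score every document against the query and rank from most to least similar.
scores = {name: cosine_similarity(query_vec, vec) for name, vec in doc_vecs.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

A better query changes `query_vec`, which is why the enhancement techniques below can change which chunks rank highest.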


---

#### **1. Query Rewriting**

Here we ask the LLM to reformulate the query so that it is more specific and detailed, which should improve retrieval.

In [5]:
from langchain_core.prompts import PromptTemplate
from pydantic import BaseModel, Field 
from typing import Annotated 

# output schema for the LLM
class QueryRewriting(BaseModel):
    """ 
    Return the re-written query from LLM
    """
    rewritten_query: Annotated[str, Field(description="Rewritten query")]

# configuring LLM for this re-written query output 
llm_query_rewriting = llm.with_structured_output(QueryRewriting)

# adjacent string literals are joined as-is, so each piece needs its own trailing space or newline
query_rewriting_prompt = PromptTemplate.from_template(
    "You are an AI assistant tasked with reformulating queries "
    "to improve RAG retrieval. Given the original query, rewrite it to be more specific, detailed, and likely to retrieve relevant information.\n"
    "Original query: {query}\n"
    "Rewritten query: "
)

# making a chain
rewriting_query_chain = query_rewriting_prompt | llm_query_rewriting

In [8]:
## let's try rewriting a query
from pprint import pprint

query = "What are the impacts of climate change on the environment?"
# pass the query string itself as the template variable
response = rewriting_query_chain.invoke({'query': query})
pprint(f"Re-written query : {response.rewritten_query}")

('Re-written query : What are the most significant environmental impacts of '
 'climate change, including changes in temperature, sea level rise, and '
 'extreme weather events, as well as their effects on biodiversity, '
 'ecosystems, and human health?')


---

#### **2. Query Expansion**

Here we generate a broader query that keeps the semantics and intent of the original query.

In [9]:
from langchain_core.prompts import PromptTemplate 
from pydantic import BaseModel, Field 
from typing import Annotated 

class QueryExpansion(BaseModel):
    """
    Holds a new query that keeps the semantics and intent of the original query.
    """
    expanded_query: Annotated[str, Field(description="The new query after applying query expansion.")]

## configuring LLM with the structured output 
llm_query_exp = llm.with_structured_output(QueryExpansion)

## writing system prompt for Query expansion 
query_exp_template = """ 
You are an AI assistant tasked with generating broader, more general queries to improve context retrieval in a RAG system.
Given the original query, generate an expanded query that is more general and can help retrieve relevant background information.

Original query: {original_query}

expanded_query : 
"""

query_expansion_prompt = PromptTemplate(
    template=query_exp_template, 
    input_variables=['original_query']
)

# query expansion chain 
query_expansion_chain = query_expansion_prompt | llm_query_exp 

In [10]:
# let's try this with a sample query
import time

query = "What are the impacts of climate change on the environment?"
print(f"Original_query : {query}")
start = time.time()
expanded_query = query_expansion_chain.invoke({'original_query' : query})
end = time.time()
print(f"Expanded query : {expanded_query}")
print(f"Time taken : {end-start :.2f} sec")

Original_query : What are the impacts of climate change on the environment?
Expanded query : expanded_query='What are the effects of global warming on ecosystems, biodiversity, and natural resources?'
Time taken : 21.39 sec
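
One possible way to use the expanded query is to retrieve with both the original and the expanded query, then merge the results without duplicates. The sketch below assumes a tiny in-memory corpus and a keyword-overlap retriever (`CORPUS`, `STOPWORDS`, `tokenize`, `toy_retrieve`, and `retrieve_with_expansion` are all hypothetical names for illustration). In a real pipeline the retriever would be a vector-store lookup using the embedding model above.

```python
import re

# Hypothetical in-memory corpus; a real pipeline would query a vector store.
CORPUS = [
    "Climate change raises global temperatures and sea levels.",
    "Global warming threatens biodiversity in many ecosystems.",
    "Natural resources such as fresh water are under pressure.",
    "Football is a popular sport worldwide.",
]
STOPWORDS = {"what", "are", "the", "of", "on", "and", "in", "is", "a", "such", "as"}

def tokenize(text):
    """Lowercase, split into alphabetic words, and drop stopwords."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS

def toy_retrieve(query, k=2):
    """Return up to k documents ranked by word overlap with the query."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(doc)), doc) for doc in CORPUS]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable: ties keep corpus order
    return [doc for _, doc in scored[:k]]

def retrieve_with_expansion(original_query, expanded_query, k=2):
    """Retrieve for both queries and merge, dropping duplicates but keeping order."""
    merged = []
    for q in (original_query, expanded_query):
        for doc in toy_retrieve(q, k):
            if doc not in merged:
                merged.append(doc)
    return merged

original = "What are the impacts of climate change on the environment?"
expanded = "What are the effects of global warming on ecosystems, biodiversity, and natural resources?"
results = retrieve_with_expansion(original, expanded)
for doc in results:
    print(doc)
```

In this toy setup the original query alone matches only the first document, while the expanded query also surfaces the biodiversity and natural-resources documents, which is the broader coverage that expansion aims for.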


---

#### **3. Subquery Decomposition** 

If a query is complex, we can use an LLM to break it into simpler sub-queries.

In [16]:
from langchain_core.prompts import PromptTemplate
from pydantic import BaseModel, Field 
from typing import Annotated, List

# output schema the LLM uses to return structured output
class SubqueryDecomposition(BaseModel):
    """
    Returns a list of simple queries broken down from one complex query.
    """
    subqueries: Annotated[List[str], Field(description="List of sub-queries broken down from a complex query.")]

# configuring LLM with data class
llm_for_subquery_decomp = llm.with_structured_output(SubqueryDecomposition)

subquery_decomposition_template = """You are an AI assistant tasked with breaking down complex queries into simpler sub-queries for a RAG system.
Given the original query, decompose it into 2-4 simpler sub-queries that, when answered together, would provide a comprehensive response to the original query.

Original query: {original_query}
subqueries: (You need to generate)

example: What are the impacts of climate change on the environment?

subqueries:["What are the impacts of climate change on biodiversity?", "How does climate change affect the oceans?", "What are the effects of climate change on agriculture?", "What are the impacts of climate change on human health?"]"""

subquery_decomposition_prompt = PromptTemplate(
    template=subquery_decomposition_template,
    input_variables=['original_query']
)

# llm chain for subquery decomposition 
chain_for_subquery_decomp = subquery_decomposition_prompt | llm_for_subquery_decomp

In [17]:
# Let's try this sub-query decomposition
original_query = "What are the impacts of climate change on the environment?"
start = time.time()
response = chain_for_subquery_decomp.invoke({'original_query' : original_query})
end = time.time()

print(f"Sub-queries : {response.subqueries}")
print(f"Time taken : {end-start :.2f} sec")

Sub-queries : ['What is the current state of global greenhouse gas emissions?', 'How do changes in temperature and precipitation patterns affect ecosystems?', 'What are the most significant economic costs associated with climate change?', 'What role do climate change mitigation strategies play in reducing environmental impacts?']
Time taken : 24.67 sec
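
Once the sub-queries exist, each one can be sent to the retriever separately and the results combined. The sketch below is one way to do that while recording which sub-queries surfaced each chunk. `gather_context`, `FAKE_INDEX`, and `fake_retrieve` are hypothetical names, and the canned chunks stand in for a real vector-store lookup.

```python
from typing import Callable, Dict, List

def gather_context(
    subqueries: List[str],
    retrieve: Callable[[str], List[str]],
) -> Dict[str, List[str]]:
    """Map each retrieved chunk to the sub-queries that surfaced it."""
    chunk_sources: Dict[str, List[str]] = {}
    for subquery in subqueries:
        for chunk in retrieve(subquery):
            chunk_sources.setdefault(chunk, []).append(subquery)
    return chunk_sources

# Hypothetical canned results standing in for a vector-store lookup.
FAKE_INDEX = {
    "How does climate change affect the oceans?": [
        "chunk: ocean warming",
        "chunk: sea level rise",
    ],
    "What are the impacts of climate change on biodiversity?": [
        "chunk: habitat loss",
        "chunk: ocean warming",
    ],
}

def fake_retrieve(query: str) -> List[str]:
    """Return the canned chunks for a query, or nothing if it is unknown."""
    return FAKE_INDEX.get(query, [])

context = gather_context(list(FAKE_INDEX), fake_retrieve)
unique_chunks = list(context)  # dicts preserve insertion order, so duplicates collapse
print(unique_chunks)
```

Keeping the chunk-to-sub-query mapping is a design choice: a chunk returned for several sub-queries is a plausible signal that it is relevant to the original complex query.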
