# Query Transformations for Improved Retrieval in RAG Systems

## Overview

This code implements three query transformation techniques to enhance the retrieval process in Retrieval-Augmented Generation (RAG) systems:

1. Query Rewriting
2. Step-back Prompting
3. Sub-query Decomposition

Each technique aims to improve the relevance and comprehensiveness of retrieved information by modifying or expanding the original query.

## Motivation

RAG systems often face challenges in retrieving the most relevant information, especially when dealing with complex or ambiguous queries. These query transformation techniques address this issue by reformulating queries to better match relevant documents or to retrieve more comprehensive information.

## Key Components

1. Query Rewriting: Reformulates queries to be more specific and detailed.
2. Step-back Prompting: Generates broader queries for better context retrieval.
3. Sub-query Decomposition: Breaks down complex queries into simpler sub-queries.

## Method Details

### 1. Query Rewriting

- **Purpose**: To make queries more specific and detailed, improving the likelihood of retrieving relevant information.
- **Implementation**:
  - Uses the GPT-4o model (`gpt-4o`) with a custom prompt template.
  - Takes the original query and reformulates it to be more specific and detailed.

### 2. Step-back Prompting

- **Purpose**: To generate broader, more general queries that can help retrieve relevant background information.
- **Implementation**:
  - Uses the GPT-4o model (`gpt-4o`) with a custom prompt template.
  - Takes the original query and generates a more general "step-back" query.

### 3. Sub-query Decomposition

- **Purpose**: To break down complex queries into simpler sub-queries for more comprehensive information retrieval.
- **Implementation**:
  - Uses the GPT-4o model (`gpt-4o`) with a custom prompt template.
  - Decomposes the original query into 2-4 simpler sub-queries.

## Benefits of these Approaches

1. **Improved Relevance**: Query rewriting helps in retrieving more specific and relevant information.
2. **Better Context**: Step-back prompting allows for retrieval of broader context and background information.
3. **Comprehensive Results**: Sub-query decomposition enables retrieval of information that covers different aspects of a complex query.
4. **Flexibility**: Each technique can be used independently or in combination, depending on the specific use case.

## Implementation Details

- All three techniques use OpenAI's `gpt-4o` model for query transformation.
- Custom prompt templates are used to guide the model in generating appropriate transformations.
- The code provides separate functions for each transformation technique, allowing for easy integration into existing RAG systems.

## Example Use Case

The code demonstrates each technique using the example query:
"What are the impacts of climate change on the environment?"

- **Query Rewriting** expands this to name specific aspects such as biodiversity, ecosystems, weather patterns, and sea levels.
- **Step-back Prompting** generalizes it to "What are the general effects of climate change on natural systems and ecosystems?"
- **Sub-query Decomposition** breaks it down into questions about weather patterns, ecosystems, sea levels, and freshwater resources.

## Conclusion

These query transformation techniques give a RAG system several ways to reformulate a query before retrieval: rewriting targets relevance, step-back prompting targets background context, and sub-query decomposition targets coverage. They are particularly valuable in domains where queries can be complex or multifaceted, such as scientific research, legal analysis, or comprehensive fact-finding tasks.

# Package Installation and Imports

The first cell below lists the packages this notebook requires. The install command is commented out; uncomment it if the packages are not already installed.


In [1]:
# Install required packages
#!uv pip install langchain langchain-openai langchain-core python-dotenv

In [2]:
from langchain_openai import ChatOpenAI
from langchain_core.prompts import PromptTemplate

import os
from dotenv import load_dotenv

# Load environment variables from a .env file
load_dotenv()

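# Fail early with a clear message if the API key is missing
# (assigning the None returned by os.getenv to os.environ raises a TypeError)
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")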

### 1 - Query Rewriting: Reformulating queries to improve retrieval.

In [3]:
re_write_llm = ChatOpenAI(temperature=0, model_name="gpt-4o", max_tokens=4000)

# Create a prompt template for query rewriting
query_rewrite_template = """You are an AI assistant tasked with reformulating user queries to improve retrieval in a RAG system. 
Given the original query, rewrite it to be more specific, detailed, and likely to retrieve relevant information.

Original query: {original_query}

Rewritten query:"""

query_rewrite_prompt = PromptTemplate(
    input_variables=["original_query"],
    template=query_rewrite_template
)

# Compose the prompt and the model into a runnable chain (LCEL pipe syntax)
query_rewriter = query_rewrite_prompt | re_write_llm

def rewrite_query(original_query):
    """
    Rewrite the original query to improve retrieval.
    
    Args:
    original_query (str): The original user query
    
    Returns:
    str: The rewritten query
    """
    response = query_rewriter.invoke(original_query)
    return response.content

### Demonstrate on a use case

In [4]:
# Example query for the "Understanding Climate Change" dataset
original_query = "What are the impacts of climate change on the environment?"
rewritten_query = rewrite_query(original_query)
print("Original query:", original_query)
print("\nRewritten query:", rewritten_query)

Original query: What are the impacts of climate change on the environment?

Rewritten query: How does climate change affect various aspects of the environment, such as biodiversity, ecosystems, weather patterns, and sea levels?


### 2 - Step-back Prompting: Generating broader queries for better context retrieval.



In [5]:
step_back_llm = ChatOpenAI(temperature=0, model_name="gpt-4o", max_tokens=4000)


# Create a prompt template for step-back prompting
step_back_template = """You are an AI assistant tasked with generating broader, more general queries to improve context retrieval in a RAG system.
Given the original query, generate a step-back query that is more general and can help retrieve relevant background information.

Original query: {original_query}

Step-back query:"""

step_back_prompt = PromptTemplate(
    input_variables=["original_query"],
    template=step_back_template
)

# Compose the prompt and the model into a runnable chain (LCEL pipe syntax)
step_back_chain = step_back_prompt | step_back_llm

def generate_step_back_query(original_query):
    """
    Generate a step-back query to retrieve broader context.
    
    Args:
    original_query (str): The original user query
    
    Returns:
    str: The step-back query
    """
    response = step_back_chain.invoke(original_query)
    return response.content

### Demonstrate on a use case

In [6]:
# Example query for the "Understanding Climate Change" dataset
original_query = "What are the impacts of climate change on the environment?"
step_back_query = generate_step_back_query(original_query)
print("Original query:", original_query)
print("\nStep-back query:", step_back_query)

Original query: What are the impacts of climate change on the environment?

Step-back query: What are the general effects of climate change on natural systems and ecosystems?
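
A step-back query is usually run alongside the original query rather than instead of it, so that specific and background documents both reach the generator. The following is a minimal sketch of that merge step. It assumes a hypothetical `retrieve(query, k)` callable returning a list of document strings; `toy_retrieve` is an in-memory stand-in for a real retriever and is not part of this notebook.

```python
from typing import Callable, List


def retrieve_with_step_back(
    original_query: str,
    step_back_query: str,
    retrieve: Callable[[str, int], List[str]],  # hypothetical retriever interface
    k: int = 3,
) -> List[str]:
    """Retrieve for both queries and merge the results, keeping first occurrences."""
    merged: List[str] = []
    seen = set()
    # Specific results come first, followed by broader background documents.
    for query in (original_query, step_back_query):
        for doc in retrieve(query, k):
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged


def toy_retrieve(query: str, k: int) -> List[str]:
    """Toy retriever used only to make the sketch runnable."""
    corpus = {
        "specific": ["doc about sea level rise", "doc about coral bleaching"],
        "general": ["doc about coral bleaching", "doc about the greenhouse effect"],
    }
    key = "general" if "general" in query else "specific"
    return corpus[key][:k]


docs = retrieve_with_step_back(
    "What are the impacts of climate change on the environment?",
    "What are the general effects of climate change on natural systems and ecosystems?",
    toy_retrieve,
)
print(docs)
```

Deduplicating by exact string is the simplest choice; a real pipeline would more likely deduplicate by document ID.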


### 3 - Sub-query Decomposition: Breaking complex queries into simpler sub-queries.

In [7]:
sub_query_llm = ChatOpenAI(temperature=0, model_name="gpt-4o", max_tokens=4000)

# Create a prompt template for sub-query decomposition
subquery_decomposition_template = """You are an AI assistant tasked with breaking down complex queries into simpler sub-queries for a RAG system.
Given the original query, decompose it into 2-4 simpler sub-queries that, when answered together, would provide a comprehensive response to the original query.

Example query: What are the impacts of climate change on the environment?

Example sub-queries:
1. What are the impacts of climate change on biodiversity?
2. How does climate change affect the oceans?
3. What are the effects of climate change on agriculture?
4. What are the impacts of climate change on human health?

Original query: {original_query}

Sub-queries:"""


subquery_decomposition_prompt = PromptTemplate(
    input_variables=["original_query"],
    template=subquery_decomposition_template
)

# Compose the prompt and the model into a runnable chain (LCEL pipe syntax)
subquery_decomposer_chain = subquery_decomposition_prompt | sub_query_llm

def decompose_query(original_query: str):
    """
    Decompose the original query into simpler sub-queries.
    
    Args:
    original_query (str): The original complex query
    
    Returns:
    List[str]: A list of simpler sub-queries
    """
    response = subquery_decomposer_chain.invoke(original_query).content
    sub_queries = [q.strip() for q in response.split('\n') if q.strip() and not q.strip().startswith('Sub-queries:')]
    return sub_queries

### Demonstrate on a use case

In [8]:
# Example query for the "Understanding Climate Change" dataset
original_query = "What are the impacts of climate change on the environment?"
sub_queries = decompose_query(original_query)
print("\nSub-queries:")
for sub_query in sub_queries:
    print(sub_query)


Sub-queries:
1. How does climate change affect weather patterns and extreme weather events?
2. What are the impacts of climate change on ecosystems and wildlife habitats?
3. How does climate change influence sea levels and coastal areas?
4. What are the effects of climate change on freshwater resources and availability?
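
The sub-queries above still carry their list numbering ("1.", "2.", ...), which would be passed to the retriever as part of the query text. As a sketch of one way to clean this up, the hypothetical helper `parse_sub_queries` below (not part of the notebook) strips common list markers and skips the "Sub-queries:" header line.

```python
import re
from typing import List


def parse_sub_queries(response_text: str) -> List[str]:
    """Extract sub-queries from a numbered or bulleted LLM response."""
    sub_queries = []
    for line in response_text.splitlines():
        line = line.strip()
        # Skip blank lines and a header such as "Sub-queries:".
        if not line or line.lower().startswith("sub-queries"):
            continue
        # Drop a leading list marker such as "1.", "2)", "-", or "*".
        cleaned = re.sub(r"^(?:\d+[.)]|[-*])\s*", "", line)
        if cleaned:
            sub_queries.append(cleaned)
    return sub_queries


sample = """Sub-queries:
1. How does climate change affect weather patterns and extreme weather events?
2. What are the impacts of climate change on ecosystems and wildlife habitats?
"""
print(parse_sub_queries(sample))
```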


# Query Transformations in RAG

## Query Rewriting/Refinement
- **Query expansion** - adding related terms or synonyms
- **Query decomposition** - breaking complex queries into sub-queries
- **Query simplification** - making queries more concise
- **Query rephrasing** - restating the question in different ways

## Multi-Query Generation
- Generating multiple variations of the same query
- Creating parallel queries from different perspectives
- Producing diverse phrasings for better coverage

## Step-Back Prompting
- Abstracting the query to a higher-level question
- Creating broader context queries before specific ones

## HyDE (Hypothetical Document Embeddings)
- Generating hypothetical answers/documents
- Using these as queries instead of the original question

## Query Routing
- Classifying query type (factual, analytical, conversational)
- Determining which knowledge source to query
- Selecting appropriate retrieval strategy

## Query Compression
- Removing unnecessary words or context
- Focusing on key semantic elements

## Contextual Query Enhancement
- Adding conversation history context
- Incorporating user metadata or preferences
- Appending temporal or domain-specific context

## Query Translation
- Converting natural language to structured queries (e.g., SQL, metadata filters)
- Transforming to domain-specific terminology

---

**Note:** These transformations can be used individually or combined depending on your RAG architecture and use case.

## Let's Explore More Transformations

In [1]:
import os
from openai import OpenAI
from typing import List, Dict
import json
from IPython.display import display, Markdown, HTML

# Set your API key here or use environment variable
# os.environ["OPENAI_API_KEY"] = "your-api-key-here"

# Or load from .env file
from dotenv import load_dotenv
load_dotenv()

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

print("✅ Setup complete!")

✅ Setup complete!


## Multi-Query Generation

**Purpose:** Generate multiple variations of a query to improve retrieval recall.

**Why it matters:** Different phrasings can retrieve different relevant documents.

In [2]:
def multi_query_generation(original_query: str, num_queries: int = 3) -> List[str]:
    prompt = f"""You are an AI assistant helping to improve search queries.
    
Generate {num_queries} different versions of the following question. 
Each version should:
- Ask the same thing but from a different perspective
- Use different wording and phrasing
- Help retrieve relevant information from a vector database

Original question: {original_query}

Return ONLY a JSON array of {num_queries} alternative questions, nothing else.
Example format: ["question 1", "question 2", "question 3"]"""

    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7
    )
    
    queries = json.loads(response.choices[0].message.content)
    return queries

In [3]:
# Try it out!
original_query = "What are the benefits of using RAG in LLM applications?"

print(f"📝 Original Query: {original_query}\n")
variations = multi_query_generation(original_query, num_queries=3)

print("🔄 Generated Variations:")
for i, var in enumerate(variations, 1):
    print(f"  {i}. {var}")

📝 Original Query: What are the benefits of using RAG in LLM applications?

🔄 Generated Variations:
  1. What advantages does using RAG provide in LLM applications?
  2. How does utilizing RAG enhance LLM applications?
  3. What positive impacts does the RAG system have on LLM applications?
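
`multi_query_generation` passes the model's reply straight to `json.loads`, which raises an error if the model wraps the array in a Markdown code fence. The sketch below shows one defensive way to parse the reply. `parse_json_array` and `FENCE` are hypothetical names introduced for illustration, not part of the OpenAI SDK.

```python
import json
from typing import List

FENCE = "`" * 3  # Markdown code-fence marker


def parse_json_array(raw: str) -> List[str]:
    """Parse a JSON array of strings, tolerating a surrounding Markdown code fence."""
    text = raw.strip()
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # drop the opening fence line (e.g. the "json" tag)
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # drop the closing fence
        text = "\n".join(lines)
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("Expected a JSON array of questions")
    # Keep non-empty entries only, normalized to stripped strings.
    return [str(item).strip() for item in data if str(item).strip()]


print(parse_json_array('["What is RAG?", "How does RAG work?"]'))
print(parse_json_array(FENCE + 'json\n["What is RAG?"]\n' + FENCE))
```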


## Query Routing

**Purpose:** Route queries to the most appropriate data source(s).

**Why it matters:** Different data sources excel at different types of questions.

In [16]:
def query_routing(query: str, available_sources: List[Dict[str, str]]) -> Dict:
    sources_text = "\n".join([
        f"{i+1}. {source['name']}: {source['description']}" 
        for i, source in enumerate(available_sources)
    ])
    
    prompt = f"""You are a query router. Given a user question and available data sources, 
select the MOST appropriate source(s) to answer the question.

Available Sources:
{sources_text}

User Question: {query}

Return your answer as JSON with this structure:
{{
    "selected_sources": ["source_name1", "source_name2"],
    "reasoning": "brief explanation of why these sources"
}}"""

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.3,
        response_format={"type": "json_object"}
    )
    
    result = json.loads(response.choices[0].message.content)
    return result

In [17]:
# Define available data sources
sources = [
    {"name": "ml_docs", "description": "Machine learning tutorials and best practices"},
    {"name": "deployment_guide", "description": "Cloud deployment and DevOps documentation"},
    {"name": "api_reference", "description": "API specifications and code examples"},
    {"name": "case_studies", "description": "Real-world implementation case studies"}
]

query = "How do I deploy a machine learning model to production?"

print(f"📝 Query: {query}\n")
print(f"🗂️ Available Sources: {[s['name'] for s in sources]}\n")

routing_result = query_routing(query, sources)

print("🎯 Routing Decision:")
print(f"  Selected: {routing_result['selected_sources']}")
print(f"  Reasoning: {routing_result['reasoning']}")

📝 Query: How do I deploy a machine learning model to production?

🗂️ Available Sources: ['ml_docs', 'deployment_guide', 'api_reference', 'case_studies']

🎯 Routing Decision:
  Selected: ['deployment_guide', 'ml_docs']
  Reasoning: The 'deployment_guide' is the most relevant source for information on deploying applications, including machine learning models, to production environments. It provides cloud deployment and DevOps documentation that is essential for this task. The 'ml_docs' source is also selected because it contains tutorials and best practices for machine learning, which may include specific considerations for deploying models effectively.
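
The router's JSON reply is free text generated by the model, so nothing guarantees that every name in `selected_sources` matches a real source. A minimal sketch of a post-check follows, assuming the same result shape as above. `validate_routing` and `toy_sources` are hypothetical names used only for illustration.

```python
from typing import Dict, List


def validate_routing(result: Dict, available_sources: List[Dict[str, str]]) -> List[str]:
    """Keep only selected sources that actually exist; fall back to all sources."""
    known = [source["name"] for source in available_sources]
    selected = [name for name in result.get("selected_sources", []) if name in known]
    # If the model returned nothing usable, query every source rather than none.
    return selected or known


toy_sources = [
    {"name": "ml_docs", "description": "Machine learning tutorials"},
    {"name": "deployment_guide", "description": "Cloud deployment documentation"},
]
print(validate_routing({"selected_sources": ["deployment_guide", "made_up_source"]}, toy_sources))
print(validate_routing({"selected_sources": []}, toy_sources))
```

Falling back to every source trades extra retrieval cost for recall; raising an error instead is equally reasonable if routing mistakes should be visible.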


## Query Compression

**Purpose:** Remove unnecessary words while preserving search intent.

**Why it matters:** Embedding models often work better with focused, keyword-rich queries than with long, conversational ones.

In [7]:
def query_compression(query: str, conversation_history: List[Dict[str, str]] = None) -> str:
    context = ""
    if conversation_history:
        context = "\n".join([
            f"{msg['role']}: {msg['content']}" 
            for msg in conversation_history[-3:]
        ])
        context = f"\nConversation History:\n{context}\n"
    
    prompt = f"""You are a query optimization expert. Compress the following query into a concise search query.

{context}
Current Query: {query}

Rules:
- Remove filler words and unnecessary context
- Keep only the core semantic concepts
- Maintain the search intent
- Output should be 3-8 words maximum
- Focus on keywords that would match relevant documents

Return ONLY the compressed query, nothing else."""

    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.3
    )
    
    compressed_query = response.choices[0].message.content.strip()
    return compressed_query

In [8]:
verbose_query = "I was wondering if you could help me understand what the main advantages are of implementing a Retrieval Augmented Generation system compared to just using a standard language model?"

print(f"📝 Verbose Query ({len(verbose_query)} chars):\n{verbose_query}\n")

compressed = query_compression(verbose_query)

print(f"✂️ Compressed Query ({len(compressed)} chars):\n{compressed}\n")
print(f"Compression Ratio: {len(compressed)/len(verbose_query)*100:.1f}%")

📝 Verbose Query (182 chars):
I was wondering if you could help me understand what the main advantages are of implementing a Retrieval Augmented Generation system compared to just using a standard language model?

✂️ Compressed Query (73 chars):
"Advantages of Retrieval Augmented Generation vs standard language model"

Compression Ratio: 40.1%


### Compression with Conversation History

In [9]:
# Test with conversation context
conversation = [
    {"role": "user", "content": "Tell me about RAG systems"},
    {"role": "assistant", "content": "RAG systems combine retrieval with generation..."}
]

query_with_context = "What about the implementation details?"

compressed_with_context = query_compression(query_with_context, conversation)
print(f"Original: {query_with_context}")
print(f"Compressed: {compressed_with_context}")

Original: What about the implementation details?
Compressed: "RAG systems implementation details"
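
Both compressed queries above came back wrapped in double quotes, and those quotes count toward the character total and would be embedded along with the query. One way to handle this, sketched with a hypothetical helper `strip_wrapping_quotes` (not part of the notebook), is to remove a single matching pair of quotes before the query is used.

```python
def strip_wrapping_quotes(text: str) -> str:
    """Remove one pair of matching quotes around an LLM-produced query."""
    text = text.strip()
    if len(text) >= 2 and text[0] == text[-1] and text[0] in {'"', "'"}:
        return text[1:-1].strip()
    return text


print(strip_wrapping_quotes('"RAG systems implementation details"'))
print(strip_wrapping_quotes("RAG systems implementation details"))
```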


## Contextual Query Enhancement

**Purpose:** Enrich queries with user context and conversation history.

**Why it matters:** Personalization can lead to more relevant results.

In [10]:
def contextual_query_enhancement(
    query: str, 
    user_context: Dict[str, str],
    conversation_history: List[Dict[str, str]] = None
) -> str:
    context_text = "\n".join([f"- {k}: {v}" for k, v in user_context.items()])
    
    history_text = ""
    if conversation_history:
        history_text = "\n".join([
            f"{msg['role']}: {msg['content']}" 
            for msg in conversation_history[-3:]
        ])
    
    prompt = f"""You are enhancing a search query with contextual information.

User Context:
{context_text}

{f"Recent Conversation:{chr(10)}{history_text}" if history_text else ""}

Original Query: {query}

Task: Enhance this query by:
1. Adding relevant context from user profile
2. Incorporating conversation history if relevant
3. Making implicit information explicit
4. Keeping it natural and search-friendly

Return ONLY the enhanced query, nothing else."""

    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.5
    )
    
    enhanced_query = response.choices[0].message.content.strip()
    return enhanced_query

In [11]:
query = "What are the best practices?"
user_context = {
    "role": "Senior Data Engineer",
    "experience": "5 years",
    "focus_area": "MLOps and model deployment",
    "current_stack": "Python, Docker, Kubernetes"
}
conv_history = [
    {"role": "user", "content": "Tell me about deploying ML models"},
    {"role": "assistant", "content": "There are several approaches to model deployment..."}
]

print(f"📝 Original Query: {query}\n")
print(f"👤 User Context: {user_context}\n")

enhanced = contextual_query_enhancement(query, user_context, conv_history)

print(f"✨ Enhanced Query: {enhanced}")

📝 Original Query: What are the best practices?

👤 User Context: {'role': 'Senior Data Engineer', 'experience': '5 years', 'focus_area': 'MLOps and model deployment', 'current_stack': 'Python, Docker, Kubernetes'}

✨ Enhanced Query: "What are the best practices for a Senior Data Engineer with 5 years of experience focused on MLOps and model deployment using Python, Docker, and Kubernetes?"


### Personalize for Different Users
Try different user contexts and see how the query changes!

In [12]:
# Try with different user profiles
profiles = [
    {"role": "Data Scientist", "experience": "2 years", "focus_area": "Deep Learning"},
    {"role": "ML Engineer", "experience": "7 years", "focus_area": "Production Systems"},
    {"role": "Research Scientist", "experience": "10 years", "focus_area": "NLP Research"}
]

query = "How do I optimize model performance?"

for profile in profiles:
    enhanced = contextual_query_enhancement(query, profile)
    print(f"\n{profile['role']}: {enhanced}")


Data Scientist: "How do I optimize deep learning model performance with 2 years of data science experience?"

ML Engineer: What are best practices for optimizing machine learning model performance in production systems for an engineer with 7 years of experience?

Research Scientist: What are the best practices for optimizing performance of NLP models in research, considering a decade of experience in the field?


## Query Translation (Natural Language to Filters)

**Purpose:** Extract semantic query and metadata filters from natural language.

**Why it matters:** Combines semantic search with precise filtering.

In [18]:
def query_translation_to_filters(query: str, available_filters: Dict[str, List]) -> Dict:
    filters_description = "\n".join([
        f"- {field}: {', '.join(map(str, values))}" 
        for field, values in available_filters.items()
    ])
    
    prompt = f"""You are a query translator. Extract the semantic query and metadata filters from a natural language query.

Available Metadata Filters:
{filters_description}

Natural Language Query: {query}

Parse this into:
1. A semantic search query (the conceptual question)
2. Metadata filters (specific attributes to filter on)

Return JSON in this format:
{{
    "semantic_query": "the core question without filter terms",
    "filters": {{
        "field_name": "value or condition"
    }}
}}

Only include filters that are explicitly mentioned in the query."""

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.2,
        response_format={"type": "json_object"}
    )
    
    result = json.loads(response.choices[0].message.content)
    return result

In [19]:
query = "Show me research papers about transformers published after 2020 with high citations"
available_filters = {
    "year": list(range(2018, 2025)),
    "topic": ["transformers", "CNN", "RNN", "attention", "RAG"],
    "citation_count": ["low", "medium", "high"],
    "author": ["various"]
}

print(f"📝 Natural Language Query:\n{query}\n")
print(f"🔧 Available Filters:\n{json.dumps(available_filters, indent=2)}\n")

translation = query_translation_to_filters(query, available_filters)

print("🎯 Translation Result:")
print(json.dumps(translation, indent=2))

📝 Natural Language Query:
Show me research papers about transformers published after 2020 with high citations

🔧 Available Filters:
{
  "year": [
    2018,
    2019,
    2020,
    2021,
    2022,
    2023,
    2024
  ],
  "topic": [
    "transformers",
    "CNN",
    "RNN",
    "attention",
    "RAG"
  ],
  "citation_count": [
    "low",
    "medium",
    "high"
  ],
  "author": [
    "various"
  ]
}

🎯 Translation Result:
{
  "semantic_query": "research papers about transformers",
  "filters": {
    "year": "after 2020",
    "citation_count": "high"
  }
}


### E-commerce Search
Try translating product search queries!

In [20]:
# E-commerce example
ecommerce_filters = {
    "category": ["electronics", "clothing", "home", "sports"],
    "price_range": ["under_50", "50-100", "100-500", "over_500"],
    "brand": ["Apple", "Samsung", "Nike", "various"],
    "rating": ["1-2", "3", "4", "5"],
    "in_stock": ["yes", "no"]
}

ecommerce_query = "Find me highly rated Apple laptops under $1000 that are in stock"

result = query_translation_to_filters(ecommerce_query, ecommerce_filters)
print(json.dumps(result, indent=2))

{
  "semantic_query": "Find Apple laptops",
  "filters": {
    "rating": "4, 5",
    "price_range": "under_1000",
    "in_stock": "yes"
  }
}
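
Note that the result above contains values such as `"under_1000"` and `"4, 5"` that do not appear in the allowed filter lists, so they could not be applied directly to a metadata index. The sketch below shows one way to separate usable filters from ones that need further handling. `split_valid_filters`, `allowed_values`, and `model_filters` are hypothetical names introduced for illustration.

```python
from typing import Dict, List, Tuple


def split_valid_filters(
    filters: Dict[str, str], allowed: Dict[str, List]
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Separate filters whose value is in the allowed list from those that are not."""
    valid: Dict[str, str] = {}
    rejected: Dict[str, str] = {}
    for field, value in filters.items():
        allowed_for_field = [str(v) for v in allowed.get(field, [])]
        if str(value) in allowed_for_field:
            valid[field] = value
        else:
            rejected[field] = value
    return valid, rejected


allowed_values = {
    "price_range": ["under_50", "50-100", "100-500", "over_500"],
    "rating": ["1-2", "3", "4", "5"],
    "in_stock": ["yes", "no"],
}
model_filters = {"rating": "4, 5", "price_range": "under_1000", "in_stock": "yes"}

valid, rejected = split_valid_filters(model_filters, allowed_values)
print("valid:", valid)
print("rejected:", rejected)
```

Rejected filters could be dropped, mapped to the nearest allowed value, or sent back to the model with a stricter prompt; which is appropriate depends on the application.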


# TASK

# Query Transformations in LangChain and LlamaIndex

## LangChain Query Transformations

### 1. **Multi-Query Retrieval**
- **Class**: `MultiQueryRetriever`
- **Purpose**: Generates multiple variations of the user query from different perspectives
- **Use Case**: Improves retrieval recall when queries are ambiguous or when a single query cannot cover all the relevant information
- **Implementation**: Automatically generates 3-5 alternative questions using LLM prompts

### 2. **Query Decomposition**
- **Module**: Query Analysis techniques
- **Purpose**: Breaks down complex queries into distinct sub-questions
- **Use Case**: When user input contains multiple questions that need separate retrieval
- **Implementation**: Uses LLM function-calling to generate multiple sub-queries

### 3. **Query Rewriting (Rewrite-Retrieve-Read)**
- **Purpose**: Rewrites poorly phrased queries into better retrieval queries
- **Use Case**: When original query is not optimal for embedding/retrieval
- **Implementation**: Uses LLM to rephrase query before passing to retriever

### 4. **Step-Back Prompting**
- **Module**: Custom implementation (typically a prompt-based chain rather than a dedicated retriever class)
- **Purpose**: Generates abstract, higher-level "step back" questions
- **Use Case**: When specifics of a question trip up search quality
- **Implementation**: First generates broader question, then queries using both original and step-back

### 5. **HyDE (Hypothetical Document Embeddings)**
- **Purpose**: Generates hypothetical answer/document first, then uses it for retrieval
- **Use Case**: When query embeddings aren't similar to relevant document embeddings
- **Implementation**: LLM generates hypothetical relevant document, uses that for similarity search
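
The HyDE flow described above can be sketched without any framework. This is an illustration under stated assumptions, not LangChain's implementation: `embed` is a toy bag-of-words stand-in for a real embedding model, and `fake_generate` is a hypothetical stand-in for the LLM call that writes the hypothetical document.

```python
import math
from collections import Counter
from typing import Callable, List


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hyde_search(
    query: str,
    documents: List[str],
    generate: Callable[[str], str],  # hypothetical LLM wrapper
    top_k: int = 1,
) -> List[str]:
    """Embed a generated hypothetical answer instead of the raw query."""
    hypothetical_doc = generate(query)
    query_vec = embed(hypothetical_doc)
    ranked = sorted(documents, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:top_k]


def fake_generate(query: str) -> str:
    """Stand-in for an LLM that drafts a plausible answer to the query."""
    return "rising sea levels flood coastal cities and erode shorelines"


documents = [
    "coastal cities face flooding as sea levels keep rising",
    "transformers use attention to model token interactions",
]
results = hyde_search("Why is my town underwater more often?", documents, fake_generate)
print(results)
```

The raw query shares almost no vocabulary with the relevant document, while the hypothetical answer does, which is the gap HyDE is meant to close.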

### 6. **Query Routing**
- **Purpose**: Routes queries to appropriate data sources/indexes
- **Use Case**: Multiple indexes exist and only subset is relevant for given query
- **Implementation**: LLM classifies query and selects appropriate retriever(s)

### 7. **Query Structuring / Self-Query Retrieval**
- **Class**: `SelfQueryRetriever`
- **Purpose**: Converts natural language to structured queries with metadata filters
- **Use Case**: When documents have searchable/filterable attributes
- **Implementation**: 
  - Separates semantic query from metadata filters
  - Translates to vector store-specific filter syntax
  - Related query-construction techniques cover text-to-SQL and text-to-Cypher conversions

### 8. **Conversational Query Transformation**
- **Purpose**: Transforms conversation history into standalone search query
- **Use Case**: Multi-turn conversations where context from history is needed
- **Implementation**: Combines current query with conversation history

### 9. **Query Expansion**
- **Purpose**: Adds related terms, synonyms, or paraphrases to query
- **Use Case**: When index is sensitive to query phrasing
- **Implementation**: Generates expanded versions to increase retrieval chances
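
As a rough illustration of query expansion without an LLM, the sketch below generates variants from a synonym table. `expand_query` and `toy_synonyms` are hypothetical; in practice the alternatives would more likely come from an LLM prompt or a domain thesaurus.

```python
import re
from typing import Dict, List


def expand_query(query: str, synonyms: Dict[str, List[str]]) -> List[str]:
    """Return the original query plus one variant per matching synonym."""
    variants = [query]
    for term, alternatives in synonyms.items():
        # Match whole words only, ignoring case.
        pattern = re.compile(rf"\b{re.escape(term)}\b", flags=re.IGNORECASE)
        if not pattern.search(query):
            continue
        for alternative in alternatives:
            variant = pattern.sub(alternative, query)
            if variant not in variants:
                variants.append(variant)
    return variants


toy_synonyms = {"deploy": ["release", "ship"], "model": ["classifier"]}
expanded = expand_query("How do I deploy a model?", toy_synonyms)
print(expanded)
```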

### 10. **Reciprocal Rank Fusion (with Multi-Query)**
- **Purpose**: Reorders documents retrieved from multiple query variants
- **Use Case**: Combining results from multiple sub-queries effectively
- **Implementation**: Uses RRF algorithm to merge and rank results
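
Reciprocal Rank Fusion scores each document by summing `1 / (k + rank)` over every ranked list in which it appears, then sorts by that total. A minimal sketch follows, using hypothetical document IDs and the conventional constant `k = 60`. It is framework-independent and not LangChain's own code.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document IDs into one fused ranking."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Documents ranked highly in many lists accumulate the largest scores.
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# One ranked list per query variant (hypothetical IDs).
rankings = [
    ["doc_a", "doc_b", "doc_c"],
    ["doc_b", "doc_a", "doc_d"],
    ["doc_b", "doc_c", "doc_a"],
]
fused = reciprocal_rank_fusion(rankings)
print(fused)
```

Because only ranks are used, the lists can come from retrievers whose raw similarity scores are not comparable.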

---

## LlamaIndex Query Transformations

### 1. **HyDE Query Transform**
- **Class**: `HyDEQueryTransform`
- **Purpose**: Generates hypothetical document/answer, uses it for embedding
- **Use Case**: When natural language queries don't embed well
- **Implementation**: Wraps query engine with `TransformQueryEngine`
- **Parameter**: `include_original` - whether to include original query

### 2. **Multi-Step Query Decomposition**
- **Class**: `StepDecomposeQueryTransform` + `MultiStepQueryEngine`
- **Purpose**: Sequentially decomposes complex query into sub-questions
- **Use Case**: Complex queries requiring iterative refinement against single knowledge source
- **How it works**:
  - Transforms query → executes → retrieves response
  - Uses response + previous context to generate follow-up
  - Continues until query is satisfied
- **Difference from Sub-Question**: Sequential vs. parallel execution
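
The sequential loop described above can be sketched in plain Python. This is an illustration of the control flow only, not LlamaIndex's implementation: `next_question` and `answer` are hypothetical callables standing in for the LLM-based transform and the query engine.

```python
from typing import Callable, List, Optional, Tuple

History = List[Tuple[str, str]]


def multi_step_query(
    query: str,
    next_question: Callable[[str, History], Optional[str]],
    answer: Callable[[str], str],
    max_steps: int = 3,
) -> History:
    """Ask follow-up questions one at a time, feeding earlier answers back in."""
    history: History = []
    for _ in range(max_steps):
        question = next_question(query, history)
        if question is None:  # the generator signals that nothing is left to ask
            break
        history.append((question, answer(question)))
    return history


# Toy stand-ins so the sketch runs without an LLM or an index.
planned = ["Who wrote the report?", "What did that author conclude?"]


def toy_next_question(query: str, history: History) -> Optional[str]:
    return planned[len(history)] if len(history) < len(planned) else None


def toy_answer(question: str) -> str:
    return f"answer to: {question}"


steps = multi_step_query("What did the report's author conclude?", toy_next_question, toy_answer)
for question, reply in steps:
    print(question, "->", reply)
```

The `max_steps` cap guards against a generator that never signals completion.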

### 3. **Sub-Question Query Engine**
- **Class**: `SubQuestionQueryEngine`
- **Purpose**: Breaks complex query into multiple sub-questions executed in parallel
- **Use Case**: "Compare and contrast" queries requiring multiple data sources
- **How it works**:
  - Generates sub-questions with target tools/indexes
  - Executes all in parallel (or sequentially)
  - Synthesizes responses into final answer
- **Key Component**: `QuestionGenerator` to create sub-questions

### 4. **Query Routing**

#### a. **RouterQueryEngine**
- **Purpose**: Routes to appropriate query engine based on query
- **Selectors Available**:
  - `LLMSingleSelector` - selects one query engine
  - `LLMMultiSelector` - selects multiple query engines
  - `PydanticSingleSelector` - uses OpenAI function calling (single)
  - `PydanticMultiSelector` - uses OpenAI function calling (multiple)
- **Use Case**: Different query engines for different query types (summarization vs. semantic search)

#### b. **RouterRetriever**
- **Purpose**: Routes to appropriate retriever based on query
- **Same selector types as RouterQueryEngine**
- **Use Case**: Multiple retrieval strategies (vector, keyword, etc.)

#### c. **RetrieverRouterQueryEngine** (deprecated → use ToolRetrieverRouterQueryEngine)
- **Purpose**: Uses retriever to select query engines dynamically
- **Use Case**: Large set of choices that need indexing

#### d. **ToolRetrieverRouterQueryEngine**
- **Purpose**: Retrieval-augmented routing when there are too many choices to list in a single prompt
- **How it works**: Indexes query engine tools, retrieves relevant ones, executes

#### e. **SQL Router Query Engine**
- **Purpose**: Routes between SQL query engine and vector query engines
- **Use Case**: Hybrid queries needing both structured (SQL) and semantic search

### 5. **Single-Step Query Decomposition**
- **Purpose**: Transform query into single sub-question that's easier to answer
- **Use Case**: Simplifying complex queries
- **Implementation**: Part of broader decomposition framework

### 6. **Query Rewriting/Expansion**
- **Module**: Available in Query Transform Cookbook
- **Purpose**: Rewrites queries for better retrieval
- **Use Case**: Multiple phrasings of same query

### 7. **Custom Query Transforms**
- **Base Class**: `BaseQueryTransform`
- **Purpose**: Create custom transformations
- **Implementation**: Extend base class, implement transform logic

---

## Key Differences Between Frameworks

| Feature | LangChain | LlamaIndex |
|---------|-----------|------------|
| **Multi-Query** | `MultiQueryRetriever` generates parallel queries | Sub-Question Query Engine generates parallel sub-queries |
| **Sequential Decomposition** | Limited built-in support | `MultiStepQueryEngine` with sequential execution |
| **Routing** | Query routing in query analysis | Extensive routing (Router Query Engine, Router Retriever) |
| **HyDE** | Available but less emphasized | `HyDEQueryTransform` as core component |
| **Self-Query** | `SelfQueryRetriever` with metadata filtering | `VectorIndexAutoRetriever` infers metadata filters from the query |
| **Composition** | Chain-based composition | Query engine composition with tools |
| **Step-Back** | Explicitly mentioned in docs | Can be implemented with custom transforms |

---

## Implementation Patterns

### LangChain Pattern
```python
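# Assumes `llm`, `vectorstore`, and `metadata_info` are defined elsewhere.

# Multi-Query
from langchain.retrievers.multi_query import MultiQueryRetriever
# from_llm takes no num_queries argument; the number of generated
# variants is controlled by the prompt (pass `prompt=` to customize it).
retriever = MultiQueryRetriever.from_llm(
    retriever=vectorstore.as_retriever(),
    llm=llm
)

# Self-Query
from langchain.retrievers.self_query.base import SelfQueryRetriever
retriever = SelfQueryRetriever.from_llm(
    llm=llm,
    vectorstore=vectorstore,
    document_contents="description",  # short description of what the documents contain
    metadata_field_info=metadata_info  # list describing the filterable metadata fields
)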
```

### LlamaIndex Pattern
```python
# Assumes `base_engine`, `llm`, and `tools` are defined elsewhere.

# HyDE
from llama_index.core.indices.query.query_transform.base import HyDEQueryTransform
from llama_index.core.query_engine import TransformQueryEngine
hyde = HyDEQueryTransform(include_original=True)
query_engine = TransformQueryEngine(base_engine, query_transform=hyde)

# Multi-Step
from llama_index.core.indices.query.query_transform.base import StepDecomposeQueryTransform
from llama_index.core.query_engine import MultiStepQueryEngine
step_decompose = StepDecomposeQueryTransform(llm=llm, verbose=True)
query_engine = MultiStepQueryEngine(base_engine, query_transform=step_decompose)

# Sub-Question
from llama_index.core.query_engine import SubQuestionQueryEngine
query_engine = SubQuestionQueryEngine.from_defaults(
    query_engine_tools=tools,
    llm=llm
)

# Routing
from llama_index.core.query_engine import RouterQueryEngine
from llama_index.core.selectors import PydanticSingleSelector
query_engine = RouterQueryEngine(
    selector=PydanticSingleSelector.from_defaults(),
    query_engine_tools=tools
)
```

---

## Summary

**LangChain** focuses on:
- Retriever-level transformations
- Multi-query generation and fusion
- Self-querying with metadata filtering
- Integration with various retrievers

**LlamaIndex** focuses on:
- Query engine composition
- Sequential and parallel decomposition
- Extensive routing capabilities
- Transform pipelines with custom components

Both frameworks support the core query transformation patterns but with different implementation approaches and emphasis.
