# Query Rephrasing Techniques Demo

This notebook demonstrates different query rephrasing techniques used in retrieval systems to improve search quality and relevance.  

## Overview

We'll explore four main query processing techniques:

**Query Expansion** – Expand abbreviations, add context, or enrich with domain-specific terms  
**Query Decomposition** – Break compound queries into atomic sub-queries  
**Query Rewriting** – Make context-dependent queries standalone  
**Self-Querying** – Transform complex input into optimal search queries  

In [None]:
# Import required modules
from langchain.prompts import PromptTemplate
from retrieval_playground.src.pre_retrieval.query_rephrasing import (
    expand_query,
    decompose_query,
    rewrite_query,
    self_query,
    QUERY_EXAMPLES
)

## 1. Query Expansion 

**Purpose**: Expand queries by replacing abbreviations, adding context, or enriching with domain-specific terms.

**When to use**: 
- Queries contain abbreviations or acronyms
- Queries are too broad or vague
- Queries lack domain-specific terms
- Queries are incomplete questions or direct phrases


In [None]:
QUERY_EXPANSION_TEMPLATE = PromptTemplate(
    input_variables=["query"],
    template="""
Given the query below, decide whether it needs expansion.  

Expand the query if any of the following apply:  
- It contains abbreviations or acronyms → replace them with their full forms.  
- It is too broad or vague → add minimal context to make it retrieval-ready.  
- It lacks domain-specific terms that are typically associated with the topic → enrich with relevant context.  
- It is just a direct phrase or incomplete question → reframe it into a clear query/question suitable for retrieval.  

If none of these apply, return the query exactly as it is.  

The output should be a natural search query suitable for retrieval.  
Do not include explanations, just return the final query text.  

Query: {query}  
Output:
"""
)


print("=" * 60)
print("QUERY EXPANSION EXAMPLES")
print("=" * 60)

expansion_examples = QUERY_EXAMPLES["expansion"]

for i, example in enumerate(expansion_examples, 1):
    query = example["query"]
    print(f"\n{i}. Original Query:")
    print(f"   '{query}'")
    
    try:
        expanded = expand_query(query)
        print(f"\n   Expanded Query:")
        print(f"   '{expanded}'")
    except Exception as e:
        print(f"   Error: {e}")
    
    print("-" * 40)
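
The implementation of `expand_query` is not shown in this notebook. As a rough sketch of the wiring it implies (fill the prompt, call a model, clean the response), the block below uses a plain-string prompt and a stub model. The names `EXPANSION_PROMPT`, `expand_query_sketch`, and `fake_llm` are hypothetical and only illustrate the flow; they are not the library's API.

```python
from typing import Callable

# Plain-string stand-in for the prompt; the notebook itself uses a PromptTemplate.
EXPANSION_PROMPT = (
    "Expand the query if it is abbreviated, vague, or incomplete; "
    "otherwise return it unchanged.\n"
    "Query: {query}\nOutput:"
)


def expand_query_sketch(query: str, llm: Callable[[str], str]) -> str:
    """Fill the prompt, call the model, and return the cleaned-up text."""
    prompt = EXPANSION_PROMPT.format(query=query)
    response = llm(prompt)
    # Models often add surrounding whitespace or quotes; strip both.
    return response.strip().strip("'\"")


def fake_llm(prompt: str) -> str:
    """Hypothetical model stub that expands one hard-coded acronym."""
    query = prompt.split("Query: ", 1)[1].split("\nOutput:", 1)[0]
    return "  " + query.replace("RAG", "retrieval-augmented generation (RAG)") + "\n"


print(expand_query_sketch("RAG evaluation", fake_llm))
```

Passing the model in as a callable keeps the sketch testable without network access; a real setup would pass a function that calls the actual model.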

## 2. Query Decomposition

**Purpose**: Break down complex queries with multiple intents into smaller, atomic sub-queries.

**When to use**:
- Queries contain multiple questions or intents
- Compound queries that can be better answered separately
- Complex queries that need to be processed independently


In [None]:
QUERY_DECOMPOSITION_TEMPLATE = PromptTemplate(
    input_variables=["query"],
    template="""
Given the query below, check if it contains multiple intents or compound questions.  
- If it does, break it down into smaller, atomic sub-queries.  
- Each sub-query must be an independent, standalone query that can be retrieved without relying on the others.  
- If not, return the query inside a single-item list.  

Return only a valid Python list of sub-queries.  

Query: {query}  
Output:
"""
)

print("=" * 60)
print("QUERY DECOMPOSITION EXAMPLES")
print("=" * 60)

decomposition_examples = QUERY_EXAMPLES["decomposition"]

for i, example in enumerate(decomposition_examples, 1):
    query = example["query"]
    print(f"\n{i}. Original Complex Query:")
    print(f"   '{query}'")
    
    try:
        sub_queries = decompose_query(query)
        print(f"\nDecomposed Sub-queries:")
        for j, sub_query in enumerate(sub_queries, 1):
            # Strip any Markdown code fences the model left in its output
            sub_query = sub_query.replace("```python", "").replace("```", "").strip()
            print(f"   {j}. {sub_query}")
    except Exception as e:
        print(f"   Error: {e}")
    
    print("-" * 40)
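
The prompt asks the model to return "only a valid Python list", but models sometimes wrap that list in a Markdown code fence or reply with plain text. The sketch below shows one way the raw response could be parsed defensively. `parse_query_list` is a hypothetical helper, not part of `retrieval_playground`, and it assumes the response arrives as a single string.

```python
import ast
import re


def parse_query_list(raw: str) -> list[str]:
    """Parse model output that should be a Python list of strings."""
    # Drop Markdown code-fence markers (with or without a "python" tag).
    cleaned = re.sub(r"```(?:python)?", "", raw).strip()
    try:
        # literal_eval only accepts literals, so it never executes model output.
        parsed = ast.literal_eval(cleaned)
    except (ValueError, SyntaxError):
        # Not a valid literal: treat the whole response as one query.
        return [cleaned]
    if isinstance(parsed, (list, tuple)):
        return [str(item).strip() for item in parsed if str(item).strip()]
    return [str(parsed).strip()]


raw_output = '```python\n["What are neural networks?", "How do neural networks work?"]\n```'
print(parse_query_list(raw_output))
```

Falling back to a single-item list mirrors the prompt's own rule for queries that do not need decomposition.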


## 3. Query Rewriting

**Purpose**: Transform context-dependent queries into standalone queries suitable for retrieval.

**When to use**:
- Queries contain pronouns or references to previous context
- Follow-up questions in conversations
- Incomplete queries that depend on prior information


In [None]:
QUERY_REWRITING_TEMPLATE = PromptTemplate(
    input_variables=["query", "previous_conversation_history"],
    template="""
Given the current query and the previous conversation history:  
- If the query depends on prior context (e.g., pronouns, references, incomplete information), rewrite it into a clear, standalone query suitable for retrieval.  
- If it does not depend on prior context, return the query unchanged.  

Return only the final query text, without explanation or formatting.  

Query: {query}  
Previous conversation history: {previous_conversation_history}  
Output:
"""
)

print("=" * 60)
print("QUERY REWRITING EXAMPLES")
print("=" * 60)

rewriting_examples = QUERY_EXAMPLES["rewriting"]

for i, example in enumerate(rewriting_examples, 1):
    query = example["query"]
    context = example["previous_conversation_history"]
    
    print(f"\n{i}. Context-Dependent Query:")
    print(f"   '{query}'")
    
    print(f"\nPrevious Context:")
    print(f"   {context}")
    
    try:
        rewritten = rewrite_query(query, context)
        print(f"\nStandalone Query:")
        print(f"   '{rewritten}'")
    except Exception as e:
        print(f"   Error: {e}")
    
    print("-" * 40)
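
The rewriting prompt takes the conversation history as a single string. If the history is stored as a list of turns, it has to be flattened first. The sketch below assumes a hypothetical `(speaker, text)` tuple format and keeps only the most recent turns to bound prompt length. Both `format_history` and `build_rewrite_prompt` are illustrative names, and the real `QUERY_EXAMPLES` entries may store history differently.

```python
def format_history(turns: list[tuple[str, str]]) -> str:
    """Render (speaker, text) turns as one 'Speaker: text' line each."""
    return "\n".join(f"{speaker}: {text.strip()}" for speaker, text in turns)


def build_rewrite_prompt(query: str, turns: list[tuple[str, str]], max_turns: int = 4) -> str:
    """Fill a rewriting prompt using only the most recent turns."""
    # Guard against max_turns <= 0, where turns[-0:] would return everything.
    recent = turns[-max_turns:] if max_turns > 0 else []
    return (
        "Rewrite the query as a standalone query if it depends on the history.\n"
        f"Query: {query}\n"
        f"Previous conversation history: {format_history(recent)}\n"
        "Output:"
    )


history = [
    ("User", "What are neural networks?"),
    ("Assistant", "They are models built from layers of connected units."),
]
print(build_rewrite_prompt("How does it work?", history))
```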

## 4. Self-Querying 

**Purpose**: Transform complex user input into optimal search queries for retrieval.

**When to use**:
- Complex, multi-faceted user requests
- When you need to generate multiple focused search queries
- To optimize retrieval by creating targeted queries


In [None]:
# Self-querying prompt template
SELF_QUERYING_TEMPLATE = PromptTemplate(
    input_variables=["query"],
    template="""
Transform the input into a set of optimal search queries for retrieval.  
- Queries should be clear, focused, and aligned with the user’s intent.  
- Each query must be independent and standalone.  

Return only a valid Python list of search queries.  

Input: {query}  
Output:
"""
)

print("=" * 60)
print("SELF-QUERYING EXAMPLES")
print("=" * 60)

self_querying_examples = QUERY_EXAMPLES["self_querying"]

for i, example in enumerate(self_querying_examples, 1):
    query = example["query"]
    print(f"\n{i}. Complex User Input:")
    print(f"   '{query}'")
    
    try:
        search_queries = self_query(query)
        print(f"\nOptimal Search Queries:")
        for j, search_query in enumerate(search_queries, 1):
            # Strip any Markdown code fences the model left in its output
            search_query = search_query.replace("```python", "").replace("```", "").strip()
            print(f"   {j}. {search_query}")
    except Exception as e:
        print(f"   Error: {e}")
    
    print("-" * 40)
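
Once self-querying has produced several search queries, each one is typically sent to the retriever and the hits are merged. The block below is a minimal sketch of that merge step under stated assumptions: `retrieve_for_all`, `keyword_retrieve`, and `CORPUS` are hypothetical, and the keyword matcher is a toy stand-in for whatever retriever the pipeline actually uses.

```python
from typing import Callable


def retrieve_for_all(
    queries: list[str],
    retrieve: Callable[[str], list[str]],
    top_k: int = 3,
) -> list[str]:
    """Run every query and merge the hits in first-seen order without duplicates."""
    merged: list[str] = []
    seen: set[str] = set()
    for query in queries:
        for doc in retrieve(query)[:top_k]:
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged


# Hypothetical toy corpus and keyword retriever, for illustration only.
CORPUS = [
    "Supervised learning trains models on labeled data.",
    "Unsupervised learning finds structure in unlabeled data.",
    "Cheesecake needs cream cheese and a biscuit base.",
]


def keyword_retrieve(query: str) -> list[str]:
    """Return documents sharing at least one lowercase word with the query."""
    words = set(query.lower().split())
    return [doc for doc in CORPUS if words & set(doc.lower().rstrip(".").split())]


queries = ["supervised learning basics", "unsupervised learning basics"]
print(retrieve_for_all(queries, keyword_retrieve))
```

Deduplicating by first-seen order is the simplest merge policy; a ranked fusion of per-query scores would be a natural refinement but is outside what this notebook covers.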


## Testing Section

You can try out the functions below with your own queries by modifying the `test_query` and `test_context` variables.  
This will help you experiment with **query expansion**, **query decomposition**, **query rewriting**, and **self-querying** techniques.  


```python

# Test your own query expansion
test_query = "AI models"  

print(f"Original: {test_query}")
print(f"Expanded: {expand_query(test_query)}")


# Test your own query decomposition
test_query = "What are neural networks and how do they work?"  

print(f"Original: {test_query}")
sub_queries = decompose_query(test_query)
print("Sub-queries:")
for i, sq in enumerate(sub_queries, 1):
    print(f"{i}. {sq}")


# Test your own query rewriting
test_query = "How does it work?"  
test_context = "User asked about neural networks"  

print(f"\nContext-dependent: {test_query}")
print(f"Context: {test_context}")
print(f"Standalone: {rewrite_query(test_query, test_context)}")


# Test your own self-querying
test_query = "I want to learn about machine learning"  

print(f"\nComplex input: {test_query}")
search_queries = self_query(test_query)
print("Optimal search queries:")
for i, sq in enumerate(search_queries, 1):
    print(f"{i}. {sq}")
```


## Summary

This notebook demonstrated four key query rephrasing techniques:  

| Method              | Complexity | Recall   | Precision | Best For                                     |
| ------------------- | ---------- | -------- | --------- | -------------------------------------------- |
| Query Expansion     | Low        | High     | Moderate  | Broad topics, exploratory searches           |
| Query Decomposition | Medium     | Moderate | High      | Complex queries, multi-part questions        |
| Query Rewriting     | Medium     | Moderate | High      | Follow-up and context-dependent queries      |
| Self-Querying       | High       | High     | High      | Complex, multi-faceted user requests         |


**Query Expansion**: Enriches queries with additional context and domain-specific terms  
**Query Decomposition**: Breaks complex multi-intent queries into focused sub-queries  
**Query Rewriting**: Makes context-dependent queries standalone for better retrieval  
**Self-Querying**: Transforms complex user inputs into optimized search queries  

These techniques help improve retrieval quality by ensuring queries are:  
- Clear and specific  
- Context-independent  
- Properly scoped  
- Optimized for search  



# Semantic Routing Demonstration

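This notebook demonstrates the semantic routing system on several example queries: greetings, research-paper questions, and an unrelated query that should fall through to the default route.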

In [None]:
# Import required modules
import sys
import os
sys.path.append(os.getcwd())

from retrieval_playground.src.pre_retrieval.routing import (
    semantic_layer, 
    run_routing_examples,
    get_route_info
)

## 📍 Route

In [None]:
get_route_info()

![Routing](../utils/images/routing.svg)
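
The internals of `semantic_layer` are not shown here. As a rough illustration of how semantic routing generally works, the sketch below scores a query against example utterances for each route and falls back to a default route when no score clears a threshold. Everything in it is an assumption made for illustration: the route names, the utterances, the threshold, and especially the bag-of-words `embed` function, which stands in for a real sentence-embedding model.

```python
import math
from collections import Counter

# Hypothetical routes and example utterances, for illustration only.
ROUTES = {
    "greeting": ["hi there how are you", "thank you for your help"],
    "research_papers": [
        "research papers on causal analysis methods",
        "segmentation techniques for remote sensing images",
    ],
}


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real router would use an embedding model."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return Counter(cleaned.split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def route(query: str, threshold: float = 0.3) -> str:
    """Return the best-matching route name, or 'default' below the threshold."""
    q = embed(query)
    best_name, best_score = "default", 0.0
    for name, utterances in ROUTES.items():
        score = max(cosine(q, embed(u)) for u in utterances)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else "default"


for example in ["Hi there! How are you doing today?", "Give cheesecake recipe"]:
    print(f"{example!r} -> {route(example)}")
```

The threshold is what produces the fallback behaviour: a query with no sufficiently similar utterance in any route is sent to the default handler rather than forced into the nearest route.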

## 👋 Example: Greeting Queries 

In [None]:
# Example 1: Casual greeting
query1 = "Hi there! How are you doing today?"
print(f"Query: \"{query1}\"")
result1 = semantic_layer(query1)
print(f"Result: {result1}")
print("\n" + "="*60 + "\n")

# Example 2: Gratitude expression  
query2 = "Thank you for your help with this research!"
print(f"Query: \"{query2}\"")
result2 = semantic_layer(query2)
print(f"Result: {result2}")


## 📚 Example: Research Papers Queries

In [None]:
# Example 3: Analytics/Causal Analysis query
query3 = "What research papers discuss counterfactual generation and causal analysis methods?"
print(f"Query: \"{query3}\"")
result3 = semantic_layer(query3)
print(f"Result: {result3}")
print("\n" + "="*60 + "\n")

# Example 4: Computer Vision/Remote Sensing query
query4 = "Can you explain annotation-free segmentation techniques for remote sensing images?"
print(f"Query: \"{query4}\"")
result4 = semantic_layer(query4)
print(f"Result: {result4}")


## 🔄 Example: Default Fallback Query

In [None]:
# Example 5: Unrelated query (should trigger default fallback)
query5 = "Give cheesecake recipe"
print(f"Query: \"{query5}\"")
result5 = semantic_layer(query5)
print(f"Result: {result5}")

## Complete Demonstration (All 5 Examples)

In [None]:
# Run the complete demonstration with all 5 examples
results = run_routing_examples()