# Query Transformations for Improved Retrieval in RAG Systems

### 1. Query Rewriting

- **Purpose**: To make queries more specific and detailed, improving the likelihood of retrieving relevant information.
- **Implementation**:
  - Takes the original query and reformulates it to be more specific and detailed.

### 2. Step-back Prompting

- **Purpose**: To generate broader, more general queries that can help retrieve relevant background information.
- **Implementation**:
  - Takes the original query and generates a more general "step-back" query.

### 3. Sub-query Decomposition

- **Purpose**: To break down complex queries into simpler sub-queries for more comprehensive information retrieval.
- **Implementation**:
  - Decomposes the original query into 2-4 simpler sub-queries.

## Benefits of these Approaches

1. **Improved Relevance**: Query rewriting helps in retrieving more specific and relevant information.
2. **Better Context**: Step-back prompting allows for retrieval of broader context and background information.
3. **Comprehensive Results**: Sub-query decomposition enables retrieval of information that covers different aspects of a complex query.
4. **Flexibility**: Each technique can be used independently or in combination, depending on the specific use case.
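
Combining the techniques can be sketched as below. This is a minimal illustration rather than the notebook's code: `retrieve_with_transformations`, the three `fake_*` transforms, and the toy keyword retriever are hypothetical stand-ins for LLM chains and a real vector store.

```python
from typing import Callable, Iterable, List


def retrieve_with_transformations(
    query: str,
    transforms: Iterable[Callable[[str], List[str]]],
    retrieve: Callable[[str], List[str]],
) -> List[str]:
    """Run the original query plus every transformed query, then merge the results.

    Results are de-duplicated while preserving first-seen order.
    """
    queries = [query]
    for transform in transforms:
        queries.extend(transform(query))

    merged: List[str] = []
    seen = set()
    for q in queries:
        for doc in retrieve(q):
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged


# Hypothetical stand-ins: in practice each transform would call an LLM chain.
def fake_rewrite(q: str) -> List[str]:
    return [q + " on oceans and agriculture"]


def fake_step_back(q: str) -> List[str]:
    return ["how do greenhouse gases work"]


def fake_decompose(q: str) -> List[str]:
    return ["climate change oceans", "climate change agriculture"]


CORPUS = [
    "oceans absorb heat and acidify",
    "agriculture yields shift with rainfall",
    "greenhouse gases trap heat",
]


def keyword_retrieve(q: str) -> List[str]:
    """Toy retriever: return documents that share at least one word with the query."""
    words = set(q.split())
    return [doc for doc in CORPUS if words & set(doc.split())]


results = retrieve_with_transformations(
    "climate change impacts",
    [fake_rewrite, fake_step_back, fake_decompose],
    keyword_retrieve,
)
```

In this toy setup the original query matches nothing by itself, while the transformed queries together reach all three documents, each returned once.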


https://github.com/NirDiamant/RAG_TECHNIQUES/blob/main/all_rag_techniques/query_transformations.ipynb

## 1 - Query Rewriting: Reformulating queries to improve retrieval.



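```python
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

# 1. Define the LLM (assumption: any LangChain chat model works; the model name is a placeholder)
llm = ChatOpenAI(temperature=0, model="gpt-4o")

# 2. Prompt that asks the model to reformulate the query
prompt_template = """You are an AI assistant tasked with reformulating user queries to improve retrieval in a RAG system.
Given the original query, rewrite it to be more specific, detailed, and likely to retrieve relevant information.

Original query: {original_query}

Rewritten query:"""

# 3. Wrap the template so LangChain can fill in the query
prompt = PromptTemplate(
    input_variables=["original_query"],
    template=prompt_template,
)

# 4. Build the chain and invoke it
chain = prompt | llm
original_query = "What are the impacts of climate change on the environment?"
rewritten_query = chain.invoke({"original_query": original_query}).content
```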

## 2 - Step-back Prompting: Generating broader queries for better context retrieval.


```python
prompt_template = """You are an AI assistant tasked with generating broader, more general queries to improve context retrieval in a RAG system.
Given the original query, generate a step-back query that is more general and can help retrieve relevant background information.

Original query: {original_query}

Step-back query:"""
```

## 3 - Sub-query Decomposition: Breaking complex queries into simpler sub-queries.


```python
prompt_template = """You are an AI assistant tasked with breaking down complex queries into simpler sub-queries for a RAG system.
Given the original query, decompose it into 2-4 simpler sub-queries that, when answered together, would provide a comprehensive response to the original query.

Original query: {original_query}

example: What are the impacts of climate change on the environment?

Sub-queries:
1. What are the impacts of climate change on biodiversity?
2. How does climate change affect the oceans?
3. What are the effects of climate change on agriculture?
4. What are the impacts of climate change on human health?"""

# After getting the response back from chain.invoke(...), split it into one
# sub-query per line, dropping blank lines and the echoed "Sub-queries:" header.
# Assumes `chain` was built from this prompt and an LLM, and that
# `original_query` holds the user query.
response = chain.invoke({"original_query": original_query})
sub_queries = [
    q.strip()
    for q in response.content.split("\n")
    if q.strip() and not q.strip().startswith("Sub-queries:")
]
```
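
The list comprehension in this section keeps any list numbering (`1.`, `2.`, ...) that the model emits, which then ends up inside each search query. A slightly more defensive parser might look like the sketch below; `parse_sub_queries` and the marker pattern are hypothetical helpers, not part of the notebook.

```python
import re
from typing import List

# Leading list markers such as "1.", "2)", "-", or "*".
_LIST_MARKER = re.compile(r"^\s*(?:\d+[.)]|[-*])\s+")


def parse_sub_queries(llm_output: str) -> List[str]:
    """Turn the raw LLM text into a clean list of sub-queries."""
    sub_queries = []
    for line in llm_output.splitlines():
        line = line.strip()
        # Skip blank lines and the "Sub-queries:" header the model may echo back.
        if not line or line.lower().startswith("sub-queries:"):
            continue
        # Drop the list marker so only the question text is sent to the retriever.
        sub_queries.append(_LIST_MARKER.sub("", line))
    return sub_queries


sample_output = """Sub-queries:
1. What are the impacts of climate change on biodiversity?
2. How does climate change affect the oceans?

3. What are the effects of climate change on agriculture?"""

sub_queries = parse_sub_queries(sample_output)
```

Each cleaned sub-query can then be passed to the retriever separately and the results merged.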

# Hypothetical Document Embedding (HyDE) in Document Retrieval

https://github.com/NirDiamant/RAG_TECHNIQUES/blob/main/all_rag_techniques/HyDe_Hypothetical_Document_Embedding.ipynb


## Method Details

### Document Preprocessing and Vector Store Creation

1. The PDF is processed and split into chunks.
2. A vector store is created for efficient similarity search.
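
To make the chunking step concrete, here is a minimal sketch of splitting text into overlapping chunks. It uses a plain fixed-size character window, which is simpler than the splitter a real pipeline would likely use; `chunk_text` and its parameter names are hypothetical.

```python
from typing import List


def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


chunks = chunk_text("abcdefghij", chunk_size=4, chunk_overlap=2)
```

The overlap means text near a chunk boundary appears in two chunks, so a sentence cut by one boundary is still intact in the neighbouring chunk.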

### Hypothetical Document Generation

1. A language model is used to generate a hypothetical document that answers the given query.
2. The generation is guided by a prompt template that ensures the hypothetical document is detailed and matches the chunk size used in the vector store.

### Retrieval Process

The `HyDERetriever` class implements the following steps:

1. Generate a hypothetical document from the query using the language model.
2. Use the hypothetical document as the search query in the vector store.
3. Retrieve the most similar documents to this hypothetical document.
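
The three steps can be sketched without any external services. This is an illustration under stated assumptions, not the notebook's `HyDERetriever`: `HyDERetrieverSketch`, `ToyVectorStore`, the bag-of-words `embed` function, and `fake_llm` are hypothetical stand-ins for the real class, vector store, embedding model, and LLM chain.

```python
import math
import re
from collections import Counter
from typing import Callable, List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """In-memory store that ranks chunks by cosine similarity to the search text."""

    def __init__(self, chunks: List[str]):
        self.entries = [(chunk, embed(chunk)) for chunk in chunks]

    def similarity_search(self, text: str, k: int = 3) -> List[str]:
        query_vec = embed(text)
        ranked = sorted(self.entries, key=lambda e: cosine(query_vec, e[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


class HyDERetrieverSketch:
    def __init__(self, store: ToyVectorStore, generate: Callable[[str], str]):
        self.store = store
        self.generate = generate  # in practice: an LLM chain that writes the hypothetical document

    def retrieve(self, query: str, k: int = 3) -> Tuple[List[str], str]:
        hypothetical_doc = self.generate(query)  # step 1: generate
        similar_docs = self.store.similarity_search(hypothetical_doc, k=k)  # steps 2-3: search
        return similar_docs, hypothetical_doc


chunks = [
    "The board approved a new share repurchase program.",
    "Employee headcount grew in the engineering organization.",
    "Net revenue increased 12 percent year over year driven by subscription growth.",
]


def fake_llm(query: str) -> str:
    """Hypothetical LLM output, hard-coded so the example runs offline."""
    return ("The company reported that net revenue increased year over year, "
            "driven by growth in subscription sales.")


query = "How did sales do?"
retriever = HyDERetrieverSketch(ToyVectorStore(chunks), fake_llm)
docs, hypothetical_doc = retriever.retrieve(query, k=2)
```

In this toy corpus the raw query shares no words with any chunk, while the hypothetical document uses the vocabulary of the revenue chunk, which is the effect HyDE relies on.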

## Key Features

1. Query Expansion: Transforms short queries into detailed hypothetical documents.
2. Flexible Configuration: Allows adjustment of chunk size, overlap, and number of retrieved documents.
3. Integration with OpenAI Models: Uses GPT-4 for hypothetical document generation and OpenAI embeddings for vector representation.

## Benefits of this Approach

1. Improved Relevance: By expanding queries into full documents, HyDE can potentially capture more nuanced and relevant matches.
2. Handling Complex Queries: Particularly useful for complex or multi-faceted queries that might be difficult to match directly.
3. Adaptability: The hypothetical document generation can adapt to different types of queries and document domains.
4. Potential for Better Context Understanding: The expanded query might better capture the context and intent behind the original question.

## 1. Define the HyDE retriever class - creating vector store, generating hypothetical document, and retrieving


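```python
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

# 1. Prompt that asks for a paragraph written in the style of the target documents
prompt_template = """You are analyzing a corporate annual report for {document_content_description}.

Based on the question: '{query}'

Write a brief, factual paragraph that would typically appear in an annual report answering this question. Use corporate financial language and format. Include specific numbers, percentages, or financial terms where relevant. Do not make up specific figures - use placeholder language like "the company reported" or "figures show".

Focus on the type of content and language that would actually appear in financial documents, not creative or hypothetical content.

Paragraph:"""

# 2. Both placeholders must be declared as input variables
hyde_prompt = PromptTemplate(
    input_variables=["document_content_description", "query"],
    template=prompt_template,
)

# 3. Build the chain and generate the hypothetical document.
#    Assumption: any LangChain chat model works; the model name is a placeholder.
#    `query` and `document_content_description` are supplied by the caller.
llm = ChatOpenAI(temperature=0, model="gpt-4o")
hyde_chain = hyde_prompt | llm
input_variables = {"query": query, "document_content_description": document_content_description}
response = hyde_chain.invoke(input_variables)
```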

## 2. Create a HyDE retriever instance



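```python
# 1. The hypothetical document is the text content of the LLM response
hypothetical_doc = response.content

# 2. Search the vector store with the hypothetical document instead of the raw query.
#    `namespace` scopes the search and is only accepted by backends that support
#    namespaces (assumption: a store such as Pinecone).
similar_docs = vectorstore.similarity_search(hypothetical_doc, namespace=namespace)
```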

# Hypothetical Prompt Embeddings (HyPE)

https://github.com/NirDiamant/RAG_TECHNIQUES/blob/main/all_rag_techniques/HyPE_Hypothetical_Prompt_Embeddings.ipynb

## Method Details

### Document Preprocessing

1. The PDF is loaded using `PyPDFLoader`.
2. The text is split into chunks using `RecursiveCharacterTextSplitter` with specified chunk size and overlap.

### Hypothetical Question Generation

Instead of embedding raw text chunks, HyPE **generates multiple hypothetical prompts** for each chunk. These **precomputed questions** simulate user queries, improving alignment with real-world searches. Because the questions are generated at indexing time, HyPE avoids the runtime synthetic answer generation that techniques like HyDE require.

### Vector Store Creation

1. Each hypothetical question is embedded using OpenAI embeddings.
2. A vector store is built, associating **each question embedding with its original chunk**.
3. This approach **stores multiple representations per chunk**, increasing retrieval flexibility.
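
The sketch below shows how such a multi-vector index could be built and queried. It is an illustration under stated assumptions: the bag-of-words `embed` function stands in for OpenAI embeddings, a plain list stands in for the vector index, and `QUESTIONS` holds hard-coded questions in place of LLM output. All function names are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Callable, List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_hype_index(
    chunks: List[str], generate_questions: Callable[[str], List[str]]
) -> List[Tuple[Counter, int]]:
    """Store one (question embedding, chunk id) entry per hypothetical question."""
    index = []
    for chunk_id, chunk in enumerate(chunks):
        for question in generate_questions(chunk):
            index.append((embed(question), chunk_id))
    return index


def hype_retrieve(query: str, index, chunks: List[str], k: int = 2) -> List[str]:
    """Match the query against question embeddings and return the chunks they point to."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda entry: cosine(query_vec, entry[0]), reverse=True)
    seen, results = set(), []
    for _, chunk_id in ranked:
        if chunk_id not in seen:  # several questions can point to the same chunk
            seen.add(chunk_id)
            results.append(chunks[chunk_id])
        if len(results) == k:
            break
    return results


chunks = [
    "Photosynthesis converts light into chemical energy in chloroplasts.",
    "Mitochondria produce ATP through cellular respiration.",
]

# Hypothetical precomputed questions; in practice an LLM generates these per chunk.
QUESTIONS = {
    chunks[0]: ["How do plants make food from sunlight?", "Where does photosynthesis happen?"],
    chunks[1]: ["What is the powerhouse of the cell?", "How do cells produce ATP?"],
}

index = build_hype_index(chunks, QUESTIONS.__getitem__)
query = "Which organelle is the powerhouse of a cell?"
top = hype_retrieve(query, index, chunks, k=1)
```

Here the query shares no words with the mitochondria chunk itself, but it closely matches one of that chunk's stored questions, so the chunk is still retrieved.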

### Retriever Setup

1. The retriever is optimized for **question-question matching** rather than direct document retrieval.
2. The vector index enables **efficient nearest-neighbor** search over the hypothetical prompt embeddings.
3. Retrieved chunks provide a **richer and more precise context** for downstream LLM generation.

## Key Features

1. **Precomputed Hypothetical Prompts** – Improves query alignment without runtime overhead.
2. **Multi-Vector Representation** – Each chunk is indexed multiple times for broader semantic coverage.
3. **Efficient Retrieval** – FAISS ensures fast similarity search over the enhanced embeddings.
4. **Modular Design** – The pipeline is easy to adapt to different datasets and retrieval settings, and it is compatible with most optimizations, such as reranking.

## Evaluation

HyPE's effectiveness is evaluated across multiple datasets, showing:

- Up to 42 percentage points improvement in retrieval precision
- Up to 45 percentage points improvement in claim recall

See the full evaluation results in the [preprint](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5139335).

## Benefits of this Approach

1. **Eliminates Query-Time Overhead** – All hypothetical generation is done offline at indexing.
2. **Enhanced Retrieval Precision** – Better alignment between queries and stored content.
3. **Scalable & Efficient** – No additional per-query computational cost; retrieval is as fast as standard RAG.
4. **Flexible & Extensible** – Can be combined with advanced RAG techniques like reranking.

## 1. Define generation of Hypothetical Prompt Embeddings

This step uses the LLM to generate multiple hypothetical questions for a single chunk. These questions are used as proxies for the chunk during retrieval.

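```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate

template_prompt = PromptTemplate.from_template(
    "Analyze the input text and generate essential questions that, when answered, "
    "capture the main points of the text. Each question should be one line, "
    "without numbering or prefixes.\n\n"
    "Text:\n{chunk_text}\n\nQuestions:\n"
)

# 1. Create the chain and invoke it with the chunk text.
#    Assumes `llm` is an already configured LangChain chat model.
question_chain = template_prompt | llm | StrOutputParser()
raw_questions = question_chain.invoke({"chunk_text": chunk_text})

# 2. One question per line; these questions are ingested into the vector store
#    as proxies for the chunk they were generated from.
questions = [q.strip() for q in raw_questions.split("\n") if q.strip()]
```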
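
Step 2 pairs each generated question with its source chunk before ingestion. A minimal sketch of that pairing follows; `parse_questions`, `build_proxy_records`, and the record layout are hypothetical, and the raw string stands in for real LLM output.

```python
from typing import Dict, List


def parse_questions(raw_output: str) -> List[str]:
    """Split the LLM output into one question per line, dropping blanks and duplicates."""
    questions: List[str] = []
    for line in raw_output.splitlines():
        question = line.strip()
        if question and question not in questions:
            questions.append(question)
    return questions


def build_proxy_records(chunk_text: str, raw_output: str) -> List[Dict[str, str]]:
    """Pair every generated question with the chunk it stands in for."""
    return [{"question": q, "chunk": chunk_text} for q in parse_questions(raw_output)]


chunk = "HyPE stores hypothetical questions alongside each chunk."
raw = "What does HyPE store?\n\nHow are chunks indexed?\n  What does HyPE store?  \n"
records = build_proxy_records(chunk, raw)
```

At ingestion time the `question` field would be embedded, while the `chunk` field is what the retriever returns on a match.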

## 2. Define creation and population of Vector Store + Retriever


```python
# 1. Load the PDF documents
# 2. Split the documents into chunks
# 3. Store each chunk in the vector store together with its hypothetical
#    questions, generated with the function above
# 4. Create the retriever
```