# Query Rewriting

In RAG, we often run into problems with the user's query: it may be inaccurate, ambiguous, or missing information. In general, it is common for the raw query to be poorly suited to the retrieval phase of RAG.

Imagine the following scenario. You're building an application that automatically answers support tickets. Queries in this case will be emails from customers. You can imagine the following issues that can arise:

- Emails often contain noise, such as greetings, signatures, etc. which are not relevant for the question and can confuse the model.
- The user might have multiple questions in the same email, which makes it harder to retrieve the correct documents. 
- The question posed by the user might be ambiguous, or not specific enough.

In many other applications, such as search engines, chatbots, etc. we can encounter similar issues. This notebook will explore different examples and techniques to rewrite queries to make them more suitable for the retrieval phase of RAG.

The following examples will be covered:

- **Rewrite-Retrieve-Read**: Explore a technique proposed in the paper [Query Rewriting for Retrieval-Augmented Large Language Models](https://arxiv.org/pdf/2305.14283.pdf)
- **Hypothetical Document Embeddings (HyDE)**: Generate hypothetical documents so that the query is matched against the corpus in the semantic space of documents, as proposed in the paper [Precise Zero-Shot Dense Retrieval without Relevance Labels](https://arxiv.org/pdf/2212.10496.pdf)
- **Step-Back Prompting**: A prompting technique that has the LLM abstract the question to derive high-level concepts, based on the paper [Take A Step Back: Evoking Reasoning via Abstraction in Large Language Models](https://arxiv.org/pdf/2310.06117)

## Set up libraries and environment

In [None]:
%pip install python-dotenv
%pip install llama-index==0.10.33
%pip install llama-index-llms-openai==0.1.16

In [None]:
import os
from dotenv import load_dotenv
from util.helpers import get_wiki_pages, create_and_save_wiki_md_files

from llama_index.llms.openai import OpenAI
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, PromptTemplate, get_response_synthesizer
from llama_index.core.query_engine import CustomQueryEngine
from llama_index.core.retrievers import BaseRetriever
from llama_index.core.response_synthesizers import BaseSynthesizer

In [None]:
load_dotenv()
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
model = "gpt-4-turbo"
llm = OpenAI(api_key=OPENAI_API_KEY, model=model)

## Rewrite-Retrieve-Read

This technique improves on the baseline RAG setup by adding an intermediate step between the input and the retriever. This step uses an LLM to filter and rephrase the input query before passing it to the retriever. It also allows multiple queries to be generated (one per question in the input), which can then be sent to the retriever separately.

In the following example we have two different knowledge sources that contain information about different subjects. For the sake of this example, we will just use two different Wikipedia articles as the base for each knowledge source.

Let's start by loading the data and the models.

In [None]:
van_gogh_pages = get_wiki_pages(["Vincent van Gogh"])
amsterdam_pages = get_wiki_pages(["Amsterdam"])

In [None]:
create_and_save_wiki_md_files(van_gogh_pages, path="./data/docs/vangogh/")
create_and_save_wiki_md_files(amsterdam_pages, path="./data/docs/amsterdam/")

In [None]:
van_gogh_docs = SimpleDirectoryReader("./data/docs/vangogh").load_data()
amsterdam_docs = SimpleDirectoryReader("./data/docs/amsterdam").load_data()

In [None]:
vg_index = VectorStoreIndex.from_documents(documents=van_gogh_docs)
a_index = VectorStoreIndex.from_documents(documents=amsterdam_docs)

Now we will create a custom pipeline that uses the LLM to rewrite the query before passing it to the retriever. This pipeline has the following steps:
- **Rewriter**: Uses the LLM to split the query into individual questions and to assign each question the knowledge source that should be used.
- **Retriever**: Retrieves the documents for each question from the knowledge source assigned to it.
- **Reader**: Answers each question using the documents retrieved for that question.
- **AnswerMerger**: Validates the answers from the reader and merges them into a single answer.
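
Before the full implementation, the four steps above can be sketched as plain function composition. This is only an illustrative outline, not the engine used in this notebook: `rewrite_retrieve_read` and its callable parameters (`rewrite`, `retrievers`, `read`, `merge`) are hypothetical names standing in for the LLM and retriever calls.

```python
from typing import Callable, Dict, List, Tuple


def rewrite_retrieve_read(
    query: str,
    rewrite: Callable[[str], List[Tuple[str, str]]],
    retrievers: Dict[str, Callable[[str], List[str]]],
    read: Callable[[str, List[str]], str],
    merge: Callable[[str, List[str]], str],
) -> str:
    """Hypothetical outline of the Rewriter -> Retriever -> Reader -> AnswerMerger flow."""
    answers = []
    # Rewriter: split the raw query into (category, question) pairs
    for category, question in rewrite(query):
        # Retriever: look up documents in the knowledge source for this category
        documents = retrievers[category](question)
        # Reader: answer the single question from its own documents
        answers.append(read(question, documents))
    # AnswerMerger: combine the per-question answers into one response
    return merge(query, answers)
```

In the engine below, each of these callables is backed by an LLM prompt or a LlamaIndex retriever; here they are left abstract so the control flow is easy to see.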

In [None]:
from typing import Dict
from llama_index.core.retrievers import BaseRetriever

rewrite_prompt_template = PromptTemplate(
    """You're a helpful AI assistant that helps people learn about different topics. 
Given the following query: 
-----------------------------------
{query_str},
-----------------------------------

Extract each question from the query and categorize it into one of the following categories separated into key and description pairs:
-----------------------------------
{categories_with_descriptions}
-----------------------------------

Your output should be a comma-separated list of questions with their corresponding category prepended in square brackets.
Example: 
-----------------------------------
"What is the capital of France? And who is Prince" -> "[Geography]What is the capital of France?,[People]Who is Prince?"
-----------------------------------
Answer:
"""
)

qa_prompt = PromptTemplate(
    """You are a helpful assistant that answers questions. 

Question: {question}

Context: 
-----------------------------------
{context}
-----------------------------------

Answer:
"""
)

validate_prompt = PromptTemplate(
    """You are a helpful AI assistant that validates, corrects and combines information in answers to a query.
Query:
-----------------------------------
{query_stry}
-----------------------------------

Answers:
-----------------------------------
{answers}
-----------------------------------

Validate, correct and combine the answers to provide a single coherent response.
Answer:    
"""
)


class RewriteRetrieveReadQueryEngine(CustomQueryEngine):
    """Query engine that rewrites the query into categorized questions, retrieves and answers each one, then merges the answers."""

    categories: list[str]
    descriptions: list[str]
    retrievers: Dict[str, BaseRetriever]
    llm: OpenAI = OpenAI(api_key=OPENAI_API_KEY, model="gpt-4-turbo")
    verbose: bool = False

    def custom_query(self, query_str: str):
        categories_with_descriptions = "\n".join(
            [
                f"{category} - {description}"
                for category, description in zip(self.categories, self.descriptions)
            ]
        )
        rewrite_prompt = rewrite_prompt_template.format(
            query_str=query_str,
            categories_with_descriptions=categories_with_descriptions,
        )
        rewrite_res = self.llm.complete(rewrite_prompt)
            
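        # Strip quotes and surrounding whitespace, and drop empty fragments.
        # Note: splitting on "," still breaks questions that contain commas.
        questions = [
            part.strip()
            for part in str(rewrite_res).replace('"', "").split(",")
            if part.strip()
        ]
        if self.verbose:
            print("Questions:", questions)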
            
        answers = []
        for question in questions:
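            # Drop the leading "[" and split only on the first "]" so that
            # whitespace around the item or brackets in the question do not break parsing
            category, q = question.strip()[1:].split("]", 1)
            category, q = category.strip(), q.strip()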
            if self.verbose:
                print("\n\nRetrieving answer for question:", q)
                print("Using category:", category)
            nodes = self.retrievers[category].retrieve(q)
            if self.verbose:
                print("Retrieved nodes:", nodes)
            context = "\n\n".join([n.node.get_content() for n in nodes])
            answer = self.llm.complete(
                qa_prompt.format(question=q, context=context)
            )
            if self.verbose:
                print("Answer:", answer)
            answers.append(answer.text)
        if self.verbose:
            print("\n\nValidating answers")
        response = self.llm.complete(
            validate_prompt.format(query_stry=query_str, answers="\n".join(answers))
        )

        return str(response)
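
The engine above parses the rewriter output by splitting on commas, which is fragile when a question itself contains a comma. As one possible alternative, the sketch below parses the `[CATEGORY]question` format with a regular expression instead. The helper name `parse_categorized_questions` is hypothetical and not part of LlamaIndex; the sketch assumes the LLM follows the bracketed format from the rewrite prompt and that questions do not themselves contain bracketed text.

```python
import re
from typing import List, Tuple

# A tagged question is "[CATEGORY]" followed by text that runs until the
# next "[CATEGORY]" tag (optionally preceded by a comma) or the end of input.
_TAGGED_QUESTION = re.compile(
    r"\[([^\]]+)\]\s*(.*?)(?=,?\s*\[[^\]]+\]|$)", re.DOTALL
)


def parse_categorized_questions(raw: str) -> List[Tuple[str, str]]:
    """Parse '[CAT]question,[CAT]question' into (category, question) pairs."""
    cleaned = raw.replace('"', "").strip()
    pairs = []
    for match in _TAGGED_QUESTION.finditer(cleaned):
        category = match.group(1).strip()
        question = match.group(2).strip()
        if question:  # skip tags that have no question text
            pairs.append((category, question))
    return pairs
```

With this helper, the loop in `custom_query` could iterate over `parse_categorized_questions(str(rewrite_res))` directly. Unknown categories would still need to be handled before indexing into `self.retrievers`.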

In [None]:
categories = ["VAN_GOGH", "AMSTERDAM"]
descriptions = ["Questions about Vincent van Gogh", "Questions about Amsterdam"]
retrievers = {"VAN_GOGH": vg_index.as_retriever(similarity_top_k=2), "AMSTERDAM": a_index.as_retriever(similarity_top_k=2)}

query_engine = RewriteRetrieveReadQueryEngine(
    categories=categories,
    descriptions=descriptions,
    retrievers=retrievers,
    llm=llm,
    verbose=True,
)

In [None]:
query = """
Hello, I am a student who is interested in learning about art and history. I have two questions that I would like to know more about.
Who is Vincent van Gogh and what is Amsterdam famous for?

Kind regards, Billy
"""

In [None]:
response = query_engine.query(query)
print("Response:", response)

## Hypothetical Document Embeddings (HyDE)
TODO
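
As a rough illustration of the idea, the sketch below embeds an LLM-generated hypothetical answer instead of the raw query and ranks documents by cosine similarity to it. It is a simplified sketch under stated assumptions, not the paper's exact procedure: `hyde_retrieve`, `generate`, and `embed` are hypothetical names, a single hypothetical document is used, and documents are embedded on the fly rather than read from a vector index.

```python
import math
from typing import Callable, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def hyde_retrieve(
    query: str,
    generate: Callable[[str], str],           # assumed: prompt -> LLM completion text
    embed: Callable[[str], Sequence[float]],  # assumed: text -> embedding vector
    documents: List[str],
    top_k: int = 2,
) -> List[Tuple[float, str]]:
    """Rank documents against the embedding of a hypothetical answer."""
    # Step 1: ask the LLM to write a passage that answers the query
    hypothetical_doc = generate(
        f"Write a short passage that answers the question.\nQuestion: {query}\nPassage:"
    )
    # Step 2: embed the hypothetical passage instead of the query itself
    query_vector = embed(hypothetical_doc)
    # Step 3: score every document by similarity to that embedding
    scored = [(cosine_similarity(query_vector, embed(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

The hypothetical passage may contain factual errors; the intent is only that it resembles a relevant document closely enough to land near real ones in embedding space.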

## Step-Back Prompting
TODO
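
As a rough illustration of the idea, the sketch below first asks the LLM for a more general "step-back" question, retrieves context for both the original and the step-back question, and then answers the original question from the combined context. It is a minimal sketch under stated assumptions: `step_back_answer`, `complete`, and `retrieve` are hypothetical names, and the prompt wording is illustrative rather than taken from the paper.

```python
from typing import Callable, List

STEP_BACK_PROMPT = (
    "Rewrite the question below as a more general question about the "
    "underlying concept or background.\n"
    "Question: {query}\n"
    "Step-back question:"
)

ANSWER_PROMPT = (
    "Answer the question using the context.\n"
    "Question: {query}\n"
    "Context:\n{context}\n"
    "Answer:"
)


def step_back_answer(
    query: str,
    complete: Callable[[str], str],        # assumed: prompt -> LLM completion text
    retrieve: Callable[[str], List[str]],  # assumed: question -> list of passages
) -> str:
    """Answer a query using context for both it and a more abstract variant."""
    # Step 1: derive the higher-level question
    step_back_question = complete(STEP_BACK_PROMPT.format(query=query)).strip()
    # Step 2: retrieve for both questions and drop duplicate passages, keeping order
    passages = retrieve(query) + retrieve(step_back_question)
    unique_passages = list(dict.fromkeys(passages))
    # Step 3: answer the original question with the combined context
    prompt = ANSWER_PROMPT.format(query=query, context="\n\n".join(unique_passages))
    return complete(prompt)
```

In this notebook's setup, `complete` could wrap `llm.complete(...).text` and `retrieve` could wrap one of the index retrievers, returning the node contents as strings.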