# Query transformations for improved retrieval in RAG systems

This notebook demonstrates three query transformation techniques that improve retrieval in RAG systems: query rewriting, step-back prompting, and sub-query decomposition. RAG systems often struggle with complex or ambiguous queries. Each technique modifies or expands the user query before the retrieval step, so that the information retrieved from the document store is more relevant and more comprehensive.

In [1]:
from dotenv import load_dotenv
import os
from langchain_openai import ChatOpenAI
from langchain.prompts import PromptTemplate

# Load environment variables from a .env file
load_dotenv()

# Re-export the API key (this assignment raises TypeError if OPENAI_API_KEY is not set)
os.environ["OPENAI_API_KEY"] = os.getenv('OPENAI_API_KEY')
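
The assignment above fails with a `TypeError` when the variable is missing, because `os.environ` only accepts string values. A minimal sketch of a clearer check follows; the helper name `require_env` is hypothetical and not part of the original notebook.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a readable error.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or export it in your shell."
        )
    return value


# In the notebook this would be called after load_dotenv(), for example:
# api_key = require_env("OPENAI_API_KEY")
```

Failing early with a named variable makes a missing `.env` entry easier to diagnose than a bare `TypeError`.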

## 1. Query rewriting

The goal of query rewriting is to make queries more specific and detailed, increasing the chances of retrieving relevant information.
We use the LLM to reformulate the user query. The model is given a prompt template that instructs it to take the original query and rewrite it in a more specific and detailed manner.

#### Initialize the LLM

In [2]:
# Initialize the LLM for query rewriting
re_write_llm = ChatOpenAI(temperature=0, model_name="gpt-4o-mini-2024-07-18", max_tokens=4000)

We use `ChatOpenAI`, an interface for interacting with OpenAI models.
- `temperature=0`: Makes the model's responses more deterministic (more consistent across runs).
- `model_name="gpt-4o-mini-2024-07-18"`: Specifies the model to use, here a dated snapshot of GPT-4o mini.
- `max_tokens=4000`: Caps the length of the model's response at 4,000 tokens.

#### Create a prompt template for query rewriting
Next, we create a prompt template that guides the model on how to rewrite the original query. The prompt instructs the model to take the user’s query and transform it into a more specific and detailed form.

In [3]:
# Create a prompt template for query rewriting
query_rewrite_template = """You are an AI assistant tasked with reformulating user queries to improve retrieval in a RAG system. 
Given the original query, rewrite it to be more specific, detailed, and likely to retrieve relevant information.

Original query: {original_query}

Rewritten query:"""

`{original_query}` is a placeholder that will be replaced with the actual user query.

#### Set up the prompt template
Now, we bind the prompt template to a `PromptTemplate` object, which allows us to easily pass it to the model during the query rewriting process.

In [4]:
# Set up the prompt template
query_rewrite_prompt = PromptTemplate(
    input_variables=["original_query"],
    template=query_rewrite_template
)

- `input_variables=["original_query"]`: Declares `original_query` as the only input the template expects.
- `template=query_rewrite_template`: Passes the template string we created earlier to the `PromptTemplate`.
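
For a simple placeholder like this one, filling the template behaves much like Python's built-in `str.format`. The sketch below uses only the standard library and a shortened copy of the template, so it illustrates the substitution step rather than `PromptTemplate` itself.

```python
# Shortened copy of the rewriting template, for illustration only
short_template = """Original query: {original_query}

Rewritten query:"""

# Substitute the placeholder with an actual user query
filled_prompt = short_template.format(
    original_query="What are the impacts of climate change on the environment?"
)

print(filled_prompt)
```

The filled string is what the model eventually receives as its prompt.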

#### Create an LLM chain for query rewriting
Now, we create a chain that links the prompt template and the LLM. This chain takes the original query, fills in the prompt, and passes the result to the model, which returns a rewritten version.

In [5]:
# Create an LLM chain for query rewriting
query_rewriter = query_rewrite_prompt | re_write_llm

The `|` operator links the prompt template (`query_rewrite_prompt`) with the LLM (`re_write_llm`). This creates a chain where the input query is first processed by the prompt, and then the output from the prompt is passed to the LLM for query rewriting.
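
To make the idea of piping concrete, here is a toy sketch of how a `|` operator can compose two steps in plain Python. The `Step` class, `fill_prompt`, and `fake_llm` are hypothetical stand-ins written for this illustration; LangChain's real implementation is more elaborate.

```python
class Step:
    """Toy pipeline step: wraps a function and supports composition with `|`."""

    def __init__(self, fn):
        self.fn = fn

    def invoke(self, value):
        return self.fn(value)

    def __or__(self, other):
        # The composed step runs `self` first and feeds its output to `other`
        return Step(lambda value: other.invoke(self.invoke(value)))


# Stand-in for the prompt: turns a dict of inputs into a prompt string
fill_prompt = Step(lambda inputs: "Rewrite: {original_query}".format(**inputs))

# Stand-in for the model: upper-cases the prompt instead of calling an API
fake_llm = Step(lambda prompt: prompt.upper())

toy_chain = fill_prompt | fake_llm
toy_result = toy_chain.invoke({"original_query": "climate impacts"})
print(toy_result)
```

The same left-to-right reading applies to `query_rewrite_prompt | re_write_llm`: inputs enter on the left and the model's response comes out on the right.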

#### Pass the query to the model for rewriting
We now pass the original query to the model by invoking it through the chain, and get the rewritten query in response.

In [6]:
# Pass the original query to the model for rewriting
original_query = "What are the impacts of climate change on the environment?"

rewritten_query = query_rewriter.invoke({"original_query": original_query}).content

- `original_query` is the user's input query.
- `invoke({"original_query": original_query})`: This sends the original query to the model, which processes it and returns a rewritten version.
- `.content`: This extracts the rewritten query from the response returned by the model.

In [7]:
# Print the original and rewritten queries
print("Original query:", original_query)
print("\nRewritten query:", rewritten_query)

Original query: What are the impacts of climate change on the environment?

Rewritten query: What specific effects does climate change have on various environmental factors such as biodiversity, ocean levels, weather patterns, and ecosystems?


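The rewritten query is more specific than the original: it names concrete aspects (biodiversity, ocean levels, weather patterns, and ecosystems), which gives the retriever more precise terms to match against the document store.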
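
To reuse the rewriting step for many queries, the invoke-and-extract pattern can be wrapped in a small function. This is a sketch under assumptions: `rewrite_query`, `FakeChain`, and `FakeMessage` are hypothetical names, and the fake chain stands in for `query_rewriter` so the example runs without an API key.

```python
from dataclasses import dataclass


@dataclass
class FakeMessage:
    """Hypothetical stand-in for the model's response object."""
    content: str


class FakeChain:
    """Hypothetical stand-in for `query_rewriter`; returns a canned response."""

    def invoke(self, inputs):
        return FakeMessage(content="  Detailed: " + inputs["original_query"] + "\n")


def rewrite_query(chain, original_query: str) -> str:
    """Send a query through a chain and return the cleaned response text."""
    response = chain.invoke({"original_query": original_query})
    return response.content.strip()


demo_rewritten = rewrite_query(FakeChain(), "What is RAG?")
print(demo_rewritten)
```

In the notebook, passing `query_rewriter` in place of `FakeChain()` would call the real model.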

## 2. Step-back prompting
Step-back prompting generates a broader query that helps retrieve additional, relevant context. It allows the system to go beyond the user’s original query and fetch a more general set of documents that could provide background or supplementary information for a more specific query.

#### Initialize the LLM
We initialize the `ChatOpenAI` model that will be responsible for generating the broader "step-back" queries.

In [8]:
step_back_llm = ChatOpenAI(temperature=0, model_name="gpt-4o-mini-2024-07-18", max_tokens=4000)

#### Create a prompt template for step-back query generation
Next, we create a prompt template that guides the model to generate broader, more general queries. The prompt instructs the model to take the original query and generate a step-back query that will help retrieve useful background information.

In [9]:
# Create a prompt template for step-back prompting
step_back_template = """You are an AI assistant tasked with generating broader, more general queries to improve context retrieval in a RAG system.
Given the original query, generate a step-back query that is more general and can help retrieve relevant background information.

Original query: {original_query}

Step-back query:"""

`{original_query}` is a placeholder that will be replaced with the user’s input query.

#### Set up the prompt template
Now, we bind the step-back prompt template to a `PromptTemplate` object, which will be used to pass the original query to the model during the step-back query generation process.

In [10]:
# Set up the prompt template
step_back_prompt = PromptTemplate(
    input_variables=["original_query"],
    template=step_back_template
)

- `input_variables=["original_query"]`: Declares `original_query` as the only input the template expects.
- `template=step_back_template`: Passes the step-back template string that we created earlier.

#### Create an LLM chain for step-back prompting
Now, we create an LLM chain that links the step-back prompt template and the LLM. This chain will process the original query and generate the step-back query by passing the query through both the prompt and the model.

In [11]:
# Create a chain for step-back prompting
step_back_chain = step_back_prompt | step_back_llm

This chain takes the original query, processes it through the prompt, and then passes the output to the model to generate the broader step-back query.

#### Pass the original query to the model for step-back query generation
We now pass the original query to the model using the LLM chain. The model processes the original query and returns the step-back query in response.

In [12]:
# Pass the original query to the model for step-back query generation
original_query = "What are the impacts of climate change on the environment?"

step_back_query = step_back_chain.invoke({"original_query": original_query}).content

- `original_query` is the user's original query.
- `invoke({"original_query": original_query})`: This sends the original query to the model for processing, which generates the step-back query.
- `.content`: This extracts the step-back query from the response returned by the model.

In [13]:
# Print the original and step-back queries
print("Original query:", original_query)
print("\nStep-back query:", step_back_query)

Original query: What are the impacts of climate change on the environment?

Step-back query: What are the effects of environmental changes on ecosystems and biodiversity?


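The step-back query is more general: it asks about environmental changes as a whole rather than climate change specifically, so it can surface background documents that the original query would miss.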
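
One way to use a step-back query is to retrieve with both the original and the broader query, then merge the results without duplicates. The notebook does not include a retriever, so the sketch below assumes a toy keyword-overlap retriever; `retrieve`, `retrieve_with_step_back`, and the sample documents are hypothetical.

```python
def retrieve(query, documents, k=2):
    """Toy retriever: rank documents by the number of words shared with the query."""
    query_words = set(query.lower().split())
    scored = [(len(query_words & set(doc.lower().split())), doc) for doc in documents]
    # Keep only documents with at least one shared word, best score first
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [doc for _, doc in ranked[:k]]


def retrieve_with_step_back(original_query, step_back_query, documents, k=2):
    """Retrieve for both queries and merge the results, preserving order."""
    merged = []
    for query in (original_query, step_back_query):
        for doc in retrieve(query, documents, k):
            if doc not in merged:
                merged.append(doc)
    return merged


sample_docs = [
    "climate change raises sea levels",
    "ecosystems depend on biodiversity",
    "stock markets fell on monday",
]

merged_docs = retrieve_with_step_back(
    "impacts of climate change",
    "effects of environmental changes on ecosystems and biodiversity",
    sample_docs,
    k=1,
)
print(merged_docs)
```

Here the original query finds the sea-level document and the step-back query adds the ecosystems document as background. A real system would replace `retrieve` with a vector store or search index.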

## 3. Sub-query decomposition
Sub-query decomposition breaks down a complex query into smaller, simpler sub-queries. This allows the RAG system to retrieve relevant information on specific aspects of the original query, which can then be aggregated for a comprehensive answer.

#### Initialize the LLM
We initialize the `ChatOpenAI` model that will be responsible for generating the sub-queries. The model will take the original complex query and decompose it into 2-4 simpler sub-queries that are easier for the retrieval system to handle.

In [14]:
sub_query_llm = ChatOpenAI(temperature=0, model_name="gpt-4o-mini-2024-07-18", max_tokens=4000)

#### Create a prompt template for sub-query decomposition
Next, we create a prompt template that instructs the model on how to decompose the original complex query into simpler sub-queries. The prompt asks the model to break down the original query into a set of 2-4 sub-queries that address different aspects of the question.

In [15]:
# Create a prompt template for sub-query decomposition
subquery_decomposition_template = """You are an AI assistant tasked with breaking down complex queries into simpler sub-queries for a RAG system.
Given the original query, decompose it into 2-4 simpler sub-queries that, when answered together, would provide a comprehensive response to the original query.

Original query: {original_query}

example: What are the impacts of climate change on the environment?

Sub-queries:
1. What are the impacts of climate change on biodiversity?
2. How does climate change affect the oceans?
3. What are the effects of climate change on agriculture?
4. What are the impacts of climate change on human health?"""

`{original_query}` is a placeholder that will be replaced by the actual user's query. The example shows how a complex query can be decomposed into sub-queries, providing a clearer structure for the retrieval system.

#### Set up the prompt template
Now, we bind the sub-query decomposition template to a `PromptTemplate` object, which will be used during the decomposition process to pass the original query to the model.

In [16]:
# Set up the prompt template
subquery_decomposition_prompt = PromptTemplate(
    input_variables=["original_query"],
    template=subquery_decomposition_template
)

- `input_variables=["original_query"]`: Declares `original_query` as the only input the template expects.
- `template=subquery_decomposition_template`: Passes the sub-query decomposition template string that we created earlier.

#### Create an LLM chain for sub-query decomposition
We create an LLM chain that links the sub-query decomposition prompt template with the LLM. This chain will process the original query and generate the simpler sub-queries by passing the query through both the prompt and the model.

In [17]:
# Create a chain for sub-query decomposition
subquery_decomposer_chain = subquery_decomposition_prompt | sub_query_llm

This chain takes the original query, processes it through the prompt, and then passes the output to the model, generating the decomposed sub-queries.

#### Pass the original query to the model for sub-query decomposition
We now pass the original query to the model using the LLM chain. The model processes the original query and returns the decomposed sub-queries in response.

In [18]:
# Pass the original query to the model for sub-query decomposition
original_query = "What are the long-term effects of technological advancements on society?"

sub_queries = subquery_decomposer_chain.invoke({"original_query": original_query}).content

- `original_query` is the user's original query.
- `invoke({"original_query": original_query})`: This sends the original query to the model, which processes it and generates the simpler sub-queries.
- `.content`: This extracts the sub-queries from the response returned by the model.

#### Process and extract sub-queries
Once we have the response from the model, we need to process the returned content. We clean the response and extract the sub-queries by splitting them from the output text.

In [19]:
# Process and extract sub-queries from the model's response
sub_queries = [q.strip() for q in sub_queries.split('\n') if q.strip() and not q.strip().startswith('Sub-queries:')]

This step splits the model's output into lines, strips surrounding whitespace, and drops empty lines and any line that starts with `Sub-queries:`. It does not remove other introductory text or the model's own numbering, so both can end up in the list, as the output of the next cell shows.

In [20]:
# Print the decomposed sub-queries
print("\nSub-queries:")
for i, sub_query in enumerate(sub_queries, 1):
    print(f"{i}. {sub_query}")


Sub-queries:
1. Sub-queries for the original query "What are the long-term effects of technological advancements on society?":
2. 1. How do technological advancements influence employment and job markets over time?
3. 2. What are the effects of technological advancements on education and learning methods in society?
4. 3. How do technological advancements impact social interactions and relationships among individuals?
5. 4. What are the implications of technological advancements for privacy and security in society?


The complex query has been broken down into smaller, more manageable pieces.
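
In the output above, the model's introductory line and its own numbering ended up inside the list, which produces entries such as `2. 1. How do ...`. A stricter parser can keep only numbered lines and strip their numbering. The sketch below is one possible approach; `parse_sub_queries` is a hypothetical helper, and the sample text is shaped like the response shown above.

```python
import re

# Matches lines such as "1. text" or "2) text" and captures the text
NUMBERED_LINE = re.compile(r"^\s*\d+[.)]\s+(.*\S)\s*$")


def parse_sub_queries(text: str) -> list:
    """Return the text of each numbered line, without the numbering."""
    parsed = []
    for line in text.splitlines():
        match = NUMBERED_LINE.match(line)
        if match:
            parsed.append(match.group(1))
    return parsed


sample_response = """Sub-queries for the original query "What are the long-term effects of technological advancements on society?":
1. How do technological advancements influence employment and job markets over time?
2. What are the effects of technological advancements on education and learning methods in society?
"""

clean_sub_queries = parse_sub_queries(sample_response)
for i, sub_query in enumerate(clean_sub_queries, 1):
    print(f"{i}. {sub_query}")
```

This relies on the model returning one numbered sub-query per line. If the model uses bullets or another format, the pattern would need adjusting.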