# Query Transformation Techniques - LlamaIndex


A user query can be transformed and decomposed in many ways before being executed as part of a RAG query engine, agent, or any other pipeline.

In this guide we show you different ways to transform and decompose queries, and to find the set of relevant tools. Each technique suits different use cases.

Here are the different query transformations:


1. **Routing**: Keep the query, but use it to select the set of relevant tools.
2. **Query-Rewriting**: Keep the tools, but rewrite the query in a variety of different ways to execute against the same tools.
3. **Sub-Questions**: Decompose queries into multiple sub-questions over different tools (identified by their metadata).
4. **ReAct Agent Tool Picking**: Given the initial query, identify 1) the tool to pick, and 2) the query to execute on the tool.
5. **Multi-Step Query Decomposition**: Use a multi-step query engine to decompose a complex query into sequential sub-questions.



In [1]:
%pip install llama-index-question-gen-openai
%pip install llama-index-llms-openai
%pip install llama-index



In [2]:
from IPython.display import Markdown, display


# define prompt viewing function
def display_prompt_dict(prompts_dict):
    for k, p in prompts_dict.items():
        text_md = f"**Prompt Key**: {k}<br>" f"**Text:** <br>"
        display(Markdown(text_md))
        print(p.get_template())
        display(Markdown("<br><br>"))

## Routing

In this example, we show how a query can be used to select the set of relevant tool choices.

We use our `selector` abstraction to pick the relevant tool(s): a single tool or multiple tools, depending on the selector.

We have four selectors, one for each combination of (LLM or function calling) x (single selection or multi-selection).

In [3]:
from llama_index.core.selectors import LLMSingleSelector, LLMMultiSelector
from llama_index.core.selectors import (
    PydanticMultiSelector,
    PydanticSingleSelector,
)

In [4]:
import os
from getpass import getpass

# platform.openai.com
os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY") or getpass(
    "Enter OpenAI API Key: "
)


Enter OpenAI API Key: ··········


In [5]:

selector = LLMMultiSelector.from_defaults()

In [6]:
print(selector)

<llama_index.core.selectors.llm_selectors.LLMMultiSelector object at 0x7f466ca08be0>


In [7]:
from llama_index.core.tools import ToolMetadata

tool_choices = [
    ToolMetadata(
        name="Apples_chinese",
        description=("This tool contains the Chinese apple variants"),
    ),
    ToolMetadata(
        name="Apples_australian",
        description=("This tool contains the Australian apple variants"),
    ),
    ToolMetadata(
        name="Apples_washington",
        description=("This tool contains the Washington breed of apples"),
    ),
]

In [8]:
display_prompt_dict(selector.get_prompts())

**Prompt Key**: prompt<br>**Text:** <br>

Some choices are given below. It is provided in a numbered list (1 to {num_choices}), where each item in the list corresponds to a summary.
---------------------
{context_list}
---------------------
Using only the choices above and not prior knowledge, return the top choices (no more than {max_outputs}, but only select what is needed) that are most relevant to the question: '{query_str}'


The output should be ONLY JSON formatted as a JSON instance.

Here is an example:
[
    {{
        choice: 1,
        reason: "<insert reason for choice>"
    }},
    ...
]



<br><br>

In [9]:
selector_result = selector.select(
    tool_choices, query="Tell me more about Apples"
)

In [10]:
selector_result.selections

[SingleSelection(index=2, reason='Washington breed of apples is a popular and well-known variety')]
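
Each selection carries an `index` into the list of choices passed to `select`. The following is a minimal sketch of resolving selections back to tool names. It uses a hypothetical `Selection` dataclass as a stand-in for the real result objects, and it assumes the index is zero-based; that matches the reason string in the output above, but it is worth confirming against your installed version.

```python
from dataclasses import dataclass


@dataclass
class Selection:
    """Hypothetical stand-in for one entry of a selector result."""

    index: int
    reason: str


def selected_tool_names(tool_names, selections):
    """Map each selection's index back to the name of the chosen tool."""
    return [tool_names[s.index] for s in selections]


tool_names = ["Apples_chinese", "Apples_australian", "Apples_washington"]
selections = [
    Selection(
        index=2,
        reason="Washington breed of apples is a popular and well-known variety",
    )
]
print(selected_tool_names(tool_names, selections))
```

With a multi-selector, `selections` may contain several entries, so the helper returns a list rather than a single name.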

## Query Rewriting

In this section, we show you how to rewrite queries into multiple queries. You can then execute all these queries against a retriever.

This is a key step in advanced retrieval techniques. By rewriting the query, you can generate multiple queries for ensemble retrieval and fusion, leading to higher-quality retrieved results.

Unlike the sub-question generator, this is just a prompt call, and exists independently of tools.
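
To make the fusion step concrete, here is a small sketch of reciprocal rank fusion, one common way to merge the ranked results of several rewritten queries. It is an illustration only and not necessarily what any LlamaIndex retriever does internally; the document ids and the constant `k=60` are assumptions chosen for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by document id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# One ranked result list per rewritten query (hypothetical ids).
results_per_query = [
    ["doc_a", "doc_b", "doc_c"],
    ["doc_b", "doc_a", "doc_d"],
    ["doc_b", "doc_c", "doc_a"],
]
fused = reciprocal_rank_fusion(results_per_query)
print(fused)
```

Documents that rank well across many rewritten queries rise to the top, even if no single query ranks them first.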

### Query Rewriting (Custom)

Here we show you how to use a prompt to generate multiple queries, using our LLM and prompt abstractions.

In [11]:
from llama_index.core import PromptTemplate
from llama_index.llms.openai import OpenAI

query_gen_str = """\
You are a helpful assistant that generates multiple search queries based on a \
single input query. Generate {num_queries} search queries, one on each line, \
related to the following input query:
Query: {query}
Queries:
"""
query_gen_prompt = PromptTemplate(query_gen_str)

llm = OpenAI(model="gpt-3.5-turbo")


def generate_queries(query: str, llm, num_queries: int = 4):
    response = llm.predict(
        query_gen_prompt, num_queries=num_queries, query=query
    )
    # assume the LLM put each query on its own line; drop blank lines
    queries = [q.strip() for q in response.split("\n") if q.strip()]
    queries_str = "\n".join(queries)
    print(f"Generated queries:\n{queries_str}")
    return queries

In [12]:
queries = generate_queries("What happened at Citrix", llm)

Generated queries:
1. Latest news on Citrix incident
2. Citrix security breach details
3. Impact of Citrix incident on customers
4. Citrix response to recent events
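
The generated lines above still carry list numbering, which you may not want to send to a retriever. A minimal sketch of a cleanup step follows; `clean_queries` is a hypothetical helper, not part of LlamaIndex, and the blank line in the sample input is added only to show that empty lines are dropped.

```python
import re


def clean_queries(raw: str) -> list[str]:
    """Split LLM output into queries, dropping blanks and list markers."""
    queries = []
    for line in raw.splitlines():
        # Remove leading markers such as "1. ", "2) " or "- ".
        text = re.sub(r"^\s*(?:\d+[.)]|[-*])\s+", "", line).strip()
        if text:
            queries.append(text)
    return queries


raw_output = """1. Latest news on Citrix incident
2. Citrix security breach details

3. Impact of Citrix incident on customers
4. Citrix response to recent events
"""
cleaned = clean_queries(raw_output)
print(cleaned)
```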


### Query Rewriting (using HyDE Query Transform)

In this section we show you how to do query transformations using our `HyDEQueryTransform` class.

In [13]:
import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.indices.query.query_transform import HyDEQueryTransform
from llama_index.core.query_engine import TransformQueryEngine
from IPython.display import Markdown, display

In [14]:
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'




In [15]:
# load documents
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()

In [16]:
index = VectorStoreIndex.from_documents(documents)


First, we query without transformation: The same query string is used for embedding lookup and also summarization.

In [17]:
query_str = "what did paul graham do after going to RISD"

In [18]:
query_engine = index.as_query_engine()
response = query_engine.query(query_str)
display(Markdown(f"<b>{response}</b>"))

<b>Paul Graham started painting still lives in his bedroom at night while he was a student at the Accademia di Belli Arti in Florence.</b>

Now, we use `HyDEQueryTransform` to generate a hypothetical document and use it for embedding lookup.

In [19]:
hyde = HyDEQueryTransform(include_original=True)
hyde_query_engine = TransformQueryEngine(query_engine, hyde)
response = hyde_query_engine.query(query_str)
display(Markdown(f"<b>{response}</b>"))

<b>After going to RISD, Paul Graham resumed his old patterns and started looking for an apartment to buy. He also had an idea to build a web app for making web apps, which eventually led him to start a new company called Aspra.</b>

In [20]:
query_bundle = hyde(query_str)
hyde_doc = query_bundle.embedding_strs[0]

In [21]:
hyde_doc

"After attending the Rhode Island School of Design (RISD), Paul Graham went on to co-found Viaweb, one of the first online stores that allowed users to build their own e-commerce websites. Viaweb was later acquired by Yahoo for $49 million, marking Graham's first major success in the tech industry. Following this, Graham went on to co-found Y Combinator, a startup accelerator that has helped launch successful companies such as Dropbox, Airbnb, and Reddit. Graham is also known for his influential essays on startups and entrepreneurship, as well as his work as an angel investor in various tech companies. Overall, Paul Graham's career trajectory after RISD has been marked by innovation, success, and a significant impact on the tech industry."
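
With `include_original=True`, the query bundle is expected to carry both the hypothetical document and the original query as embedding strings. The toy sketch below shows one way several strings can be combined into a single lookup vector. Both the mean aggregation and the `toy_embed` function are assumptions made for illustration; they are not taken from the LlamaIndex source, and a real embedding model would replace the word counts.

```python
def toy_embed(text, vocabulary):
    """Hypothetical embedding: count how often each vocabulary word occurs."""
    words = text.lower().split()
    return [float(words.count(term)) for term in vocabulary]


def mean_embedding(vectors):
    """Average a list of equal-length vectors component-wise."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


vocabulary = ["risd", "viaweb", "yahoo"]
embedding_strs = [
    "after risd he co-founded viaweb which yahoo acquired",  # hypothetical doc
    "what did paul graham do after going to risd",  # original query
]
vectors = [toy_embed(s, vocabulary) for s in embedding_strs]
combined = mean_embedding(vectors)
print(combined)
```

The combined vector picks up terms such as "viaweb" that appear only in the hypothetical document, which is why the lookup can land on passages the bare query would miss.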

## Sub-Questions

Given a set of tools and a user query, decide both 1) the set of sub-questions to generate, and 2) the tool that each sub-question should run over.

We run through an example using the `OpenAIQuestionGenerator`, which depends on function calling, and also the `LLMQuestionGenerator`, which depends on prompting.

In [22]:
from llama_index.core.question_gen import LLMQuestionGenerator
from llama_index.question_gen.openai import OpenAIQuestionGenerator
from llama_index.llms.openai import OpenAI

In [23]:
llm = OpenAI()
question_gen = OpenAIQuestionGenerator.from_defaults(llm=llm)

In [24]:
display_prompt_dict(question_gen.get_prompts())

**Prompt Key**: question_gen_prompt<br>**Text:** <br>

You are a world class state of the art agent.

You have access to multiple tools, each representing a different data source or API.
Each of the tools has a name and a description, formatted as a JSON dictionary.
The keys of the dictionary are the names of the tools and the values are the descriptions.
Your purpose is to help answer a complex user question by generating a list of sub questions that can be answered by the tools.

These are the guidelines you consider when completing your task:
* Be as specific as possible
* The sub questions should be relevant to the user question
* The sub questions should be answerable by the tools provided
* You can generate multiple sub questions for each tool
* Tools must be specified by their name, not their description
* You don't need to use a tool if you don't think it's relevant

Output the list of sub questions by calling the SubQuestionList function.

## Tools
```json
{tools_str}
```

## User Question
{query_str}



<br><br>

In [25]:
from llama_index.core.tools import ToolMetadata

tool_choices = [
    ToolMetadata(
        name="uber_2021_10k",
        description=(
            "Provides information about Uber financials for year 2021"
        ),
    ),
    ToolMetadata(
        name="lyft_2021_10k",
        description=(
            "Provides information about Lyft financials for year 2021"
        ),
    ),
]

In [26]:
from llama_index.core import QueryBundle

query_str = "Compare and contrast Uber and Lyft"
choices = question_gen.generate(tool_choices, QueryBundle(query_str=query_str))

The outputs are `SubQuestion` Pydantic objects.

In [27]:
choices

[SubQuestion(sub_question='What were the total revenues for Uber in 2021?', tool_name='uber_2021_10k'),
 SubQuestion(sub_question='What were the total revenues for Lyft in 2021?', tool_name='lyft_2021_10k'),
 SubQuestion(sub_question='What were the net profits for Uber in 2021?', tool_name='uber_2021_10k'),
 SubQuestion(sub_question='What were the net profits for Lyft in 2021?', tool_name='lyft_2021_10k')]
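
Once the sub-questions are generated, a typical next step is to route each one to its tool. A minimal sketch of grouping them by `tool_name` follows; the `SubQuestion` class here is a stand-in `NamedTuple` with the same two fields as the objects printed above, not the real Pydantic model.

```python
from collections import defaultdict
from typing import NamedTuple


class SubQuestion(NamedTuple):
    """Stand-in with the same two fields as the printed objects."""

    sub_question: str
    tool_name: str


def group_by_tool(sub_questions):
    """Collect sub-question strings under the tool that should answer them."""
    grouped = defaultdict(list)
    for sq in sub_questions:
        grouped[sq.tool_name].append(sq.sub_question)
    return dict(grouped)


choices = [
    SubQuestion("What were the total revenues for Uber in 2021?", "uber_2021_10k"),
    SubQuestion("What were the total revenues for Lyft in 2021?", "lyft_2021_10k"),
    SubQuestion("What were the net profits for Uber in 2021?", "uber_2021_10k"),
    SubQuestion("What were the net profits for Lyft in 2021?", "lyft_2021_10k"),
]
grouped = group_by_tool(choices)
for tool_name, questions in grouped.items():
    print(tool_name, questions)
```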

## Query Transformation with ReAct Prompt

ReAct is a popular framework for agents, and here we show how the core ReAct prompt can be used to transform queries.

We use the `ReActChatFormatter` to get the set of input messages for the LLM.

In [28]:
from llama_index.core.agent import ReActChatFormatter
from llama_index.core.agent.react.output_parser import ReActOutputParser
from llama_index.core.tools import FunctionTool
from llama_index.core.llms import ChatMessage

In [29]:
def execute_sql(sql: str) -> str:
    """Given a SQL input string, execute it."""
    # NOTE: This is a mock function
    return f"Executed {sql}"


def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b


tool1 = FunctionTool.from_defaults(fn=execute_sql)
tool2 = FunctionTool.from_defaults(fn=add)
tools = [tool1, tool2]

Here we get the input prompt messages to pass to the LLM. Take a look!

In [30]:
chat_formatter = ReActChatFormatter()
output_parser = ReActOutputParser()
input_msgs = chat_formatter.format(
    tools,
    [
        ChatMessage(
            content="Can you find the top three rows from the table named `revenue_years`",
            role="user",
        )
    ],
)
input_msgs

[ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='\nYou are designed to help with a variety of tasks, from answering questions     to providing summaries to other types of analyses.\n\n## Tools\nYou have access to a wide variety of tools. You are responsible for using\nthe tools in any sequence you deem appropriate to complete the task at hand.\nThis may require breaking the task into subtasks and using different tools\nto complete each subtask.\n\nYou have access to the following tools:\n> Tool Name: execute_sql\nTool Description: execute_sql(sql: str) -> str\nGiven a SQL input string, execute it.\nTool Args: {"type": "object", "properties": {"sql": {"title": "Sql", "type": "string"}}, "required": ["sql"]}\n\n> Tool Name: add\nTool Description: add(a: int, b: int) -> int\nAdd two numbers.\nTool Args: {"type": "object", "properties": {"a": {"title": "A", "type": "integer"}, "b": {"title": "B", "type": "integer"}}, "required": ["a", "b"]}\n\n\n## Output Format\nTo answer the qu

Next we get the output from the model.

In [31]:
llm = OpenAI(model="gpt-4-1106-preview")

In [32]:
response = llm.chat(input_msgs)

Finally, we use our `ReActOutputParser` to parse the content into a structured output and inspect the action input.

In [33]:
reasoning_step = output_parser.parse(response.message.content)

In [34]:
reasoning_step.action_input

{'sql': 'SELECT * FROM revenue_years ORDER BY revenue DESC LIMIT 3'}
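
The parsed action input is a dictionary of keyword arguments for the chosen tool. A minimal sketch of dispatching it to a plain Python function is shown below. The `TOOLS` registry and `run_action` helper are hypothetical, and the action name `"execute_sql"` is assumed, since only the action input is printed above.

```python
def execute_sql(sql: str) -> str:
    """Mock SQL executor, mirroring the one defined earlier."""
    return f"Executed {sql}"


def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b


# Hypothetical registry keyed by tool name.
TOOLS = {"execute_sql": execute_sql, "add": add}


def run_action(action: str, action_input: dict):
    """Look up the tool by name and call it with the parsed keyword arguments."""
    if action not in TOOLS:
        raise KeyError(f"Unknown tool: {action}")
    return TOOLS[action](**action_input)


# The action name is an assumption; the input dict matches the output above.
result = run_action(
    "execute_sql",
    {"sql": "SELECT * FROM revenue_years ORDER BY revenue DESC LIMIT 3"},
)
print(result)
```

Rejecting unknown tool names explicitly keeps a malformed model response from failing silently.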

## Multi-Step Query Decomposition

We have a multi-step query engine that can decompose a complex query into sequential sub-questions. This guide walks you through how to set it up!
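
The idea of sequential decomposition can be sketched as a simple loop: ask one follow-up question at a time, and feed the earlier question/answer pairs back in when choosing the next one. This is a simplified illustration, not the actual `MultiStepQueryEngine` implementation; `next_question` and `answer` are hypothetical stubs standing in for the LLM-driven transform and the underlying query engine, and the default of three steps is an assumption.

```python
def multi_step_query(query, next_question, answer, num_steps=3):
    """Ask follow-up questions one at a time, feeding earlier answers back in."""
    sub_qa = []
    for _ in range(num_steps):
        sub_question = next_question(query, sub_qa)
        if sub_question is None:  # nothing further to ask
            break
        sub_qa.append((sub_question, answer(sub_question)))
    return sub_qa


# Hypothetical stub: a scripted sequence instead of an LLM call.
def next_question(query, sub_qa):
    scripted = [
        "Who started the accelerator program?",
        "Who was in the first batch of Paul Graham's accelerator program?",
    ]
    return scripted[len(sub_qa)] if len(sub_qa) < len(scripted) else None


# Hypothetical stub: echoes the question instead of querying an index.
def answer(sub_question):
    return f"(answer to: {sub_question})"


trace = multi_step_query(
    "Who was in the first batch of the accelerator program the author started?",
    next_question,
    answer,
)
for question, reply in trace:
    print(question, "->", reply)
```

In a real pipeline the final response would be synthesized from the collected question/answer pairs rather than printed.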

### Download Data

In [35]:
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'




### Load documents, build the VectorStoreIndex

In [36]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.llms.openai import OpenAI
from IPython.display import Markdown, display

In [37]:
# LLM (gpt-3.5)
gpt35 = OpenAI(temperature=0, model="gpt-3.5-turbo")



In [38]:
# load documents
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()

In [39]:
index = VectorStoreIndex.from_documents(documents)

### Query Index

In [40]:
from llama_index.core.indices.query.query_transform.base import (
    StepDecomposeQueryTransform,
)

step_decompose_transform = StepDecomposeQueryTransform(llm=gpt35, verbose=True)

# second transform instance; despite the variable name, it also uses gpt-3.5-turbo
step_decompose_transform_gpt3 = StepDecomposeQueryTransform(
    llm=gpt35, verbose=True
)

In [41]:
index_summary = "Used to answer questions about the author"

In [42]:
# set Logging to DEBUG for more detailed outputs
from llama_index.core.query_engine import MultiStepQueryEngine

query_engine = index.as_query_engine(llm=gpt35)
query_engine = MultiStepQueryEngine(
    query_engine=query_engine,
    query_transform=step_decompose_transform,
    index_summary=index_summary,
)
response_gpt_1 = query_engine.query(
    "Who was in the first batch of the accelerator program the author"
    " started?",
)

> Current query: Who was in the first batch of the accelerator program the author started?
> New query: Who started the accelerator program?
> Current query: Who was in the first batch of the accelerator program the author started?
> New query: Who was in the first batch of Paul Graham's accelerator program?
> Current query: Who was in the first batch of the accelerator program the author started?
> New query: Who were some of the individuals in the first batch of the accelerator program started by the author?

In [43]:
display(Markdown(f"<b>{response_gpt_1}</b>"))

<b>reddit, Justin Kan and Emmett Shear, Aaron Swartz, and Sam Altman were in the first batch of the accelerator program the author started.</b>

In [44]:
sub_qa = response_gpt_1.metadata["sub_qa"]
tuples = [(t[0], t[1].response) for t in sub_qa]
print(tuples)

[('Who started the accelerator program?', 'Paul Graham started the accelerator program.'), ("Who was in the first batch of Paul Graham's accelerator program?", "The first batch of Paul Graham's accelerator program included reddit, Justin Kan and Emmett Shear, Aaron Swartz, and Sam Altman."), ('Who were some of the individuals in the first batch of the accelerator program started by the author?', 'The first batch of the accelerator program started by the author included reddit, Justin Kan and Emmett Shear, Aaron Swartz, and Sam Altman.')]


In [45]:
response_gpt_1 = query_engine.query(
    "In which city did the author found his first company, Viaweb?",
)

> Current query: In which city did the author found his first company, Viaweb?
> New query: Where did the author found his first company, Viaweb?
> Current query: In which city did the author found his first company, Viaweb?
> New query: Where was the author's friend Robert's apartment located when he founded his first company, Viaweb?
> Current query: In which city did the author found his first company, Viaweb?
> New query: Where was the author's friend Robert's apartment located when he founded his first company, Viaweb?

In [46]:
print(response_gpt_1)

The author founded his first company, Viaweb, in Cambridge.


In [47]:
query_engine = index.as_query_engine(llm=gpt35)
query_engine = MultiStepQueryEngine(
    query_engine=query_engine,
    query_transform=step_decompose_transform_gpt3,
    index_summary=index_summary,
)

response_gpt3 = query_engine.query(
    "In which city did the author found his first company, Viaweb?",
)

> Current query: In which city did the author found his first company, Viaweb?
> New query: Where did the author found his first company, Viaweb?
> Current query: In which city did the author found his first company, Viaweb?
> New query: Where was the author working when he founded his first company, Viaweb?
> Current query: In which city did the author found his first company, Viaweb?
> New query: Where was the author working when he founded his first company, Viaweb?

In [48]:
print(response_gpt3)

The author founded his first company, Viaweb, in Cambridge.
