# Adaptive RAG

User queries can range from simple to complex. Not all queries require the intricacies of a complex RAG system for effective handling. [Adaptive RAG](https://arxiv.org/abs/2403.14403) introduces a method that differentiates between handling complex and simple queries.

In this notebook, we will adopt a similar strategy to manage complex and simple queries distinctly. We will use the following documents for our demonstration:

1. Uber's 10K 2021 SEC Filings.
2. Lyft's 10K 2021 SEC Filings.
3. Paul Graham Essay.

   
For complex queries, which require context from multiple documents, we'll utilize the `SubQuestionQueryEngine`. For simple queries, which need context from a single document or can be directly answered by an LLM, we'll use a basic `QueryEngine` and a `CustomQueryEngine` (LLM), respectively.

Here are the steps we will follow:

1. Download the data.
2. Load the data.
3. Create indices on the documents.
4. Set up the query engines.
5. Create tools on the query engines.
6. Create the `SubQuestionQueryEngine`.
7. Create the `CustomQueryEngine` (LLM).
8. Implement tools for `SubQuestionQueryEngine` and `CustomQueryEngine`.
9. Construct the `RouterQueryEngine` to direct queries based on their complexity.
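
The routing idea behind these steps can be sketched without any LLM. The snippet below is only an illustration: the keyword map and the `choose_strategy` helper are hypothetical stand-ins for the LLM-based selector used later in the notebook, and the tool names simply mirror the ones defined further down.

```python
import re
from typing import Dict, List

# Hypothetical keyword map standing in for the three indexed documents.
DOCUMENT_KEYWORDS: Dict[str, List[str]] = {
    "uber_10k": ["uber"],
    "lyft_10k": ["lyft"],
    "paul_graham_engine": ["paul graham", "yc", "y combinator"],
}


def matching_documents(query: str) -> List[str]:
    """Return the documents whose keywords appear as whole words in the query."""
    text = query.lower()
    return [
        name
        for name, keywords in DOCUMENT_KEYWORDS.items()
        if any(re.search(rf"\b{re.escape(k)}\b", text) for k in keywords)
    ]


def choose_strategy(query: str) -> str:
    """Pick a handling strategy from the number of documents a query touches."""
    matches = matching_documents(query)
    if not matches:
        return "llm_only"  # nothing indexed is relevant
    if len(matches) == 1:
        return "single_query_engine"  # simple query on one document
    return "sub_question_query_engine"  # complex query across documents
```

Keyword matching is far cruder than an LLM selector, but it makes the three-way split (LLM only, one document, several documents) concrete.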

### Installation

In [None]:
!pip install llama-index
!pip install llama-index-llms-mistralai
!pip install llama-index-embeddings-mistralai

### Setup API Key

In [1]:
import os
os.environ['MISTRAL_API_KEY'] = 'YOUR MISTRAL API KEY'
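
Because the cell above ships with a placeholder value, it may be worth failing early when the key was never replaced. The helper below is a minimal sketch; `read_api_key` and `PLACEHOLDER` are hypothetical names, not part of any Mistral or LlamaIndex API.

```python
import os
from typing import Mapping

# The placeholder string used in the cell above.
PLACEHOLDER = "YOUR MISTRAL API KEY"


def read_api_key(
    env: Mapping[str, str] = os.environ, name: str = "MISTRAL_API_KEY"
) -> str:
    """Return the API key from the environment, failing early if it is unset."""
    key = env.get(name, "").strip()
    if not key or key == PLACEHOLDER:
        raise RuntimeError(f"Set {name} to a real key before running the notebook.")
    return key
```

Calling `read_api_key()` right after setting the variable surfaces a missing key before any request is sent.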

### Setup LLM and Embedding Model

In [2]:
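# Jupyter already runs an event loop. nest_asyncio lets the async query
# engines start their own loop inside it. Only needed in notebooks.
import nest_asyncio

nest_asyncio.apply()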

In [3]:
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms.mistralai import MistralAI
from llama_index.embeddings.mistralai import MistralAIEmbedding
from llama_index.core import Settings

from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core.query_engine import SubQuestionQueryEngine
from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
from llama_index.core.selectors.llm_selectors import LLMSingleSelector

In [4]:
llm = MistralAI(model='mistral-large')
embed_model = MistralAIEmbedding()

Settings.llm = llm
Settings.embed_model = embed_model

### Logging

In [5]:
import logging
import sys

# Set up the root logger
logger = logging.getLogger()
logger.setLevel(logging.INFO)  # Set logger level to INFO

# Clear out any existing handlers
logger.handlers = []

# Set up the StreamHandler to output to sys.stdout (Colab's output)
handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.INFO)  # Set handler level to INFO

# Add the handler to the logger
logger.addHandler(handler)

from IPython.display import display, HTML

### Download Data

In [6]:
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/uber_2021.pdf' -O './uber_2021.pdf'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/lyft_2021.pdf' -O './lyft_2021.pdf'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O './paul_graham_essay.txt'

--2024-03-31 06:00:13--  https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/uber_2021.pdf
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.109.133, 185.199.108.133, 185.199.111.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1880483 (1.8M) [application/octet-stream]
Saving to: ‘./uber_2021.pdf’


2024-03-31 06:00:14 (41.4 MB/s) - ‘./uber_2021.pdf’ saved [1880483/1880483]

--2024-03-31 06:00:14--  https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/lyft_2021.pdf
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1440303 (1.4M) [application/octet-

### Load Data

In [7]:
# Uber docs
uber_docs = SimpleDirectoryReader(input_files=["./uber_2021.pdf"]).load_data()

# Lyft docs
lyft_docs = SimpleDirectoryReader(input_files=["./lyft_2021.pdf"]).load_data()

# Paul Graham Essay 
paul_graham_docs = SimpleDirectoryReader(input_files=["./paul_graham_essay.txt"]).load_data()
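
If loading fails, a quick check that each downloaded file exists and is non-empty can narrow down the cause, since a failed `wget ... -O` can leave a zero-byte file behind. This is a small sketch; `missing_or_empty` is a hypothetical helper, not a LlamaIndex function.

```python
from pathlib import Path
from typing import Iterable, List


def missing_or_empty(paths: Iterable[str]) -> List[str]:
    """Return the paths that do not exist or contain zero bytes."""
    problems = []
    for path in paths:
        file = Path(path)
        if not file.is_file() or file.stat().st_size == 0:
            problems.append(path)
    return problems


# Example call with the file names used in the download cell:
# missing_or_empty(["./uber_2021.pdf", "./lyft_2021.pdf", "./paul_graham_essay.txt"])
```

An empty list means all three files are in place.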

### Create Indices

In [None]:
# Index on Lyft Document
lyft_index = VectorStoreIndex.from_documents(lyft_docs)

# Index on Uber Document
uber_index = VectorStoreIndex.from_documents(uber_docs)

# Index on Paul Graham Document
paul_graham_index = VectorStoreIndex.from_documents(paul_graham_docs)

### Create Query Engines

In [9]:
# Query Engine on Lyft Index
lyft_query_engine = lyft_index.as_query_engine(similarity_top_k=5)

# Query Engine on Uber Index
uber_query_engine = uber_index.as_query_engine(similarity_top_k=5)

# Query Engine on Paul Graham Index
paul_graham_engine = paul_graham_index.as_query_engine(similarity_top_k=5)
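
To give a feel for what `similarity_top_k=5` asks for, here is a toy top-k retrieval over plain Python lists. It assumes cosine similarity over embedding vectors and is an illustration only, not LlamaIndex's implementation; `cosine_similarity` and `top_k` are hypothetical helpers.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float], chunk_vecs: List[Sequence[float]], k: int = 5
) -> List[Tuple[int, float]]:
    """Return (chunk index, score) pairs for the k most similar chunks."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

A larger `k` passes more context to the LLM at the cost of longer prompts, so 5 is a middle ground rather than a required value.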

### Create Tools

In [10]:
# Tool on Lyft Query Engine
lyft_tool = QueryEngineTool(
    query_engine=lyft_query_engine,
    metadata=ToolMetadata(
        name="lyft_10k",
        description="Provides information about Lyft financials for year 2021",
    ),
)

# Tool on Uber Query Engine
uber_tool = QueryEngineTool(
    query_engine=uber_query_engine,
    metadata=ToolMetadata(
        name="uber_10k",
        description="Provides information about Uber financials for year 2021",
    ),
)

# Tool on Paul Graham Query Engine
paul_graham_tool = QueryEngineTool(
    query_engine=paul_graham_engine,
    metadata=ToolMetadata(
        name="paul_graham_engine",
        description="Provides information about Paul Graham Essay.",
    ),
)

### Create SubQuestionQueryEngine

In [11]:
query_engine_tools = [lyft_tool, uber_tool, paul_graham_tool]

sub_question_query_engine = SubQuestionQueryEngine.from_defaults(
    query_engine_tools=query_engine_tools
)
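
Conceptually, a sub-question engine decomposes the query, sends each part to the matching tool, and then combines the answers. The sketch below shows that flow with stubbed pieces; `SubQuestion`, `answer_with_sub_questions`, `keyword_generate`, and `join_synthesize` are hypothetical names, and a real system would call an LLM where the stubs sit.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class SubQuestion:
    tool_name: str  # which tool should answer this sub-question
    text: str       # the sub-question itself


Tool = Callable[[str], str]


def answer_with_sub_questions(
    query: str,
    tools: Dict[str, Tool],
    generate: Callable[[str, List[str]], List[SubQuestion]],
    synthesize: Callable[[str, List[Tuple[SubQuestion, str]]], str],
) -> str:
    """Decompose the query, answer each part with its tool, then combine."""
    sub_questions = generate(query, sorted(tools))
    qa_pairs = [(sq, tools[sq.tool_name](sq.text)) for sq in sub_questions]
    return synthesize(query, qa_pairs)


# Stub: one sub-question per tool whose name prefix appears in the query.
def keyword_generate(query: str, tool_names: List[str]) -> List[SubQuestion]:
    lowered = query.lower()
    return [
        SubQuestion(name, f"What does the query ask about {name}?")
        for name in tool_names
        if name.split("_")[0] in lowered
    ]


# Stub: concatenate the per-tool answers instead of asking an LLM to merge them.
def join_synthesize(query: str, qa_pairs: List[Tuple[SubQuestion, str]]) -> str:
    return " ".join(f"[{sq.tool_name}] {answer}" for sq, answer in qa_pairs)
```

Passing `generate` and `synthesize` in as callables keeps the control flow visible and lets either step be swapped out independently.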

##### Testing

In [12]:
response = sub_question_query_engine.query(
    "Compare the revenue of uber and lyft."
)
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))

HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
Generated 2 sub questions.
[uber_10k] Q: What is the revenue of Uber
[lyft_10k] Q: What is the revenue of Lyft
HTTP Request: POST https://api.mistral.ai/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.mistral.ai/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
[uber_10k] A: The revenue of Uber for the year ended December 31, 2021, was $17,455 million. This is an increase of 57% compared to the previous year, primarily due to an increase in Gross Bookings. The revenue is disaggregated into Mobility revenue, Delivery revenue, Freight revenue, and All Other revenue. The Mobility revenue was $6,953 million, Delivery revenue was $8,362 million, Freight revenue was $2,132 million, and All Other revenue was $8 million for the year ended December 31, 2021.
HTTP

### Create CustomQueryEngine

A `CustomQueryEngine` lets the LLM answer queries directly, without retrieving from any index.

In [13]:
from llama_index.core.query_engine import CustomQueryEngine

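class LLMQueryEngine(CustomQueryEngine):
    """Query engine that answers with the LLM alone, without retrieval."""

    # Annotate with the LLM class (imported above), not the `llm` instance.
    llm: MistralAI

    def custom_query(self, query_str: str) -> str:
        # Send the query straight to the LLM and return its text completion.
        response = self.llm.complete(query_str)
        return str(response)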

In [14]:
llm_query_engine = LLMQueryEngine(llm=llm)

##### Testing

In [15]:
response = llm_query_engine.query("What is the capital of France?")
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))

HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
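
The display cells in this notebook interpolate the model's answer straight into HTML. If an answer ever contains characters such as `<` or `&`, escaping it first keeps the rendering predictable. This is an optional sketch; `render_answer` is a hypothetical helper.

```python
import html


def render_answer(text: str, font_size_px: int = 20) -> str:
    """Wrap an answer in a styled paragraph, escaping HTML special characters."""
    return f'<p style="font-size:{font_size_px}px">{html.escape(text)}</p>'
```

In the notebook it could be used as `display(HTML(render_answer(response.response)))`.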


### Create Tools for SubQuestionQueryEngine and CustomQueryEngine

In [31]:
# Tool on SubQuestionQueryEngine
sub_question_query_engine_tool = QueryEngineTool(
    query_engine=sub_question_query_engine,
    metadata=ToolMetadata(
        name="uber_lyft_paul_graham_queries",
        description=(
            "Useful to answer complex queries involving uber financials in 2021 or lyft financials in 2021 or paul graham or involving all of them."
        ),
    ),
)

In [32]:
# Tool on CustomQueryEngine for LLM
llm_query_engine_tool = QueryEngineTool(
    query_engine=llm_query_engine,
    metadata=ToolMetadata(
        name="llm_general_queries",
        description="Provides information about general queries",
    ),
)

### Create RouterQueryEngine

In [33]:
from llama_index.core.query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector

query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=[
        llm_query_engine_tool,
        sub_question_query_engine_tool,
        lyft_tool,
        uber_tool,
        paul_graham_tool,
    ],
    verbose=True,
)

### Querying

#### Simple queries:

Query: Why did Paul Graham start YC?

This should be routed only to the tool built on the Paul Graham essay (its index and query engine).

In [34]:
response = query_engine.query("Why did Paul Graham start YC?")
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))

HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
Selecting query engine 4: The question is about Paul Graham and why he started YC (Y Combinator). The only option that mentions Paul Graham is choice 5, which provides information about Paul Graham's essay. It's reasonable to assume that his essays might contain information about why he started YC..
HTTP Request: POST https://api.mistral.ai/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
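
The log reports "query engine 4" while the selector's reasoning mentions "choice 5". One plausible reading, stated here as an assumption rather than a documented guarantee, is that choices are numbered from 1 in the selector prompt while the router indexes its tool list from 0. The helpers below are hypothetical and only illustrate that off-by-one mapping.

```python
from typing import List


def format_choices(descriptions: List[str]) -> str:
    """Number tool descriptions from 1, as a selector prompt might present them."""
    return "\n\n".join(f"({i}) {d}" for i, d in enumerate(descriptions, start=1))


def choice_to_index(choice: int, num_tools: int) -> int:
    """Convert a 1-based choice number into a 0-based position in the tool list."""
    if not 1 <= choice <= num_tools:
        raise ValueError(f"choice must be between 1 and {num_tools}, got {choice}")
    return choice - 1
```

Under this reading, choice 5 of the five tools maps to index 4, which is the position of `paul_graham_tool` in the list passed to the router.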


Query: What is the revenue of uber?

This should be routed only to the tool built on the Uber 10K SEC filing (its index and query engine).

In [35]:
response = query_engine.query("What is the revenue of uber?")
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))

HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
Selecting query engine 3: This choice specifically mentions providing information about Uber financials for the year 2021, which is most likely to include details about Uber's revenue..
HTTP Request: POST https://api.mistral.ai/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"


Query: What is the Capital Of France?

This should use only the `CustomQueryEngine` tool, which answers with the LLM alone, because the indexed documents contain no information about it.

In [36]:
response = query_engine.query("What is the Capital Of France?")
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))

HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
Selecting query engine 0: The question 'What is the Capital Of France?' is a general query, and option 1 states it provides information about general queries..
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"


#### Complex Queries

Query: What is the revenue of Uber and why did Paul Graham start YC?

This should use the `SubQuestionQueryEngine` tool, as it needs context from both the Uber and Paul Graham documents.

In [37]:
response = query_engine.query("What is the revenue of Uber and why did Paul Graham start YC?")
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))

HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
Selecting query engine 1: This choice is most relevant as it mentions both 'Uber financials' and 'Paul Graham'. It is likely to provide information about Uber's revenue and Paul Graham's initiatives, which could include starting YC..
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
Generated 2 sub questions.
[uber_10k] Q: What is the revenue of Uber
[paul_graham_engine] Q: Why did Paul Graham start YC
HTTP Request: POST https://api.mistral.ai/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.mistral.ai/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.

Query: Compare revenue of Uber and Lyft in 2021. Why did Paul Graham start YC?

This should use the `SubQuestionQueryEngine` tool, as it needs context from the Uber, Lyft, and Paul Graham documents.

In [40]:
response = query_engine.query("Compare revenue of Uber and Lyft in 2021. Why did Paul Graham start YC?")
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))

HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
Selecting query engine 1: This choice is most relevant as it mentions handling complex queries involving Uber financials in 2021, Lyft financials in 2021, and Paul Graham. It can potentially provide information to compare the revenues of Uber and Lyft in 2021 and some information about Paul Graham..
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
Generated 3 sub questions.
[uber_10k] Q: What is the revenue of Uber in 2021
[lyft_10k] Q: What is the revenue of Lyft in 2021
[paul_graham_engin