<a href="https://colab.research.google.com/github/mistralai/cookbook/blob/main/third_party/LlamaIndex/Adaptive_RAG.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Adaptive RAG

User queries range from simple to complex, and a simple query does not always need a complex RAG system. [Adaptive RAG](https://arxiv.org/abs/2403.14403) proposes selecting a retrieval strategy based on the complexity of the query.

In this notebook, we will implement an approach similar to Adaptive RAG, which differentiates between handling complex and simple queries. We'll focus on Lyft's 10k SEC filings for the years 2020, 2021, and 2022.

Our approach will involve using `FunctionCalling` capabilities of `MistralAI` by defining a `FunctionCallingAgentWorker` to call different tools or indices based on the query's complexity.

- **Complex Queries:** These will leverage multiple tools that require context from several documents.
- **Simple Queries:** These will utilize a single tool that requires context from a single document or directly use an LLM to provide an answer.
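The routing idea can be illustrated without any LLM. The sketch below is a toy stand-in, not the mechanism used in this notebook: the hypothetical `select_tools` function picks tool names with simple keyword rules, whereas the agent built later lets the model make this choice through function calling. The tool names match those defined later in the notebook.

```python
import re

# Tool names mirror the ones defined later in this notebook.
YEAR_TOOLS = {
    "2020": "lyft_2020_10k_form",
    "2021": "lyft_2021_10k_form",
    "2022": "lyft_2022_10k_form",
}
GENERAL_TOOL = "general_queries"


def select_tools(query: str) -> list[str]:
    """Toy keyword router: a hypothetical stand-in for LLM function calling."""
    years = re.findall(r"\b(2020|2021|2022)\b", query)
    if "lyft" in query.lower() and years:
        # One tool per distinct year mentioned, in order of first appearance.
        return [YEAR_TOOLS[y] for y in dict.fromkeys(years)]
    # Anything else goes straight to the LLM.
    return [GENERAL_TOOL]


def is_complex(query: str) -> bool:
    """A query counts as complex here when it needs more than one tool."""
    return len(select_tools(query)) > 1
```

Keyword rules like these cannot handle a mixed query such as a Lyft question combined with a general one, which is one reason to delegate tool selection to the LLM.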

We will follow these steps:

1. Download Data.
2. Load Data.
3. Create indices for 3 documents.
4. Create query engines with documents and LLM.
5. Create tools.
6. Initialize a `FunctionCallingAgentWorker`.
7. Run queries.

### Installation

In [None]:
!pip install llama-index
!pip install llama-index-llms-mistralai
!pip install llama-index-embeddings-mistralai

### Setup API Key

In [1]:
import os
os.environ['MISTRAL_API_KEY'] = '<YOUR MISTRAL API KEY>'
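Hardcoding a key in a notebook makes it easy to leak when the notebook is shared. As an alternative, the sketch below reads the key from the environment and only prompts when it is missing. The helper name `resolve_api_key` and its `env`, `prompt`, and `name` parameters are hypothetical, introduced here for illustration.

```python
import os
from getpass import getpass
from typing import Callable, Mapping


def resolve_api_key(
    env: Mapping[str, str] = os.environ,
    prompt: Callable[[str], str] = getpass,
    name: str = "MISTRAL_API_KEY",
) -> str:
    """Return the key from `env`, prompting only if it is not set."""
    key = env.get(name, "").strip()
    if not key:
        # Fall back to an interactive prompt that does not echo the input.
        key = prompt(f"Enter {name}: ").strip()
    if not key:
        raise ValueError(f"{name} is empty")
    return key
```

In the notebook this could be used as `os.environ["MISTRAL_API_KEY"] = resolve_api_key()`.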

### Setup LLM and Embedding Model

In [2]:
import nest_asyncio

nest_asyncio.apply()

In [3]:
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms.mistralai import MistralAI
from llama_index.embeddings.mistralai import MistralAIEmbedding
from llama_index.core import Settings

from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core.query_engine import SubQuestionQueryEngine
from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
from llama_index.core.selectors.llm_selectors import LLMSingleSelector

In [4]:
# Note: the agent below needs a model that supports function calling, such as `mistral-large-latest`
llm = MistralAI(model='mistral-large-latest')
embed_model = MistralAIEmbedding()

Settings.llm = llm
Settings.embed_model = embed_model

### Logging

In [5]:
# NOTE: This is ONLY necessary in jupyter notebook.
# Details: Jupyter runs an event-loop behind the scenes.
#          This results in nested event-loops when we start an event-loop to make async queries.
#          This is normally not allowed, we use nest_asyncio to allow it for convenience.
import nest_asyncio

nest_asyncio.apply()

import logging
import sys

# Set up the root logger
logger = logging.getLogger()
logger.setLevel(logging.INFO)  # Set logger level to INFO

# Clear out any existing handlers
logger.handlers = []

# Set up the StreamHandler to output to sys.stdout (Colab's output)
handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.INFO)  # Set handler level to INFO

# Add the handler to the logger
logger.addHandler(handler)

from IPython.display import display, HTML

### Download Data

We will download Lyft's 10k SEC filings for the years 2020, 2021, and 2022.

In [6]:
!wget "https://www.dropbox.com/scl/fi/ywc29qvt66s8i97h1taci/lyft-10k-2020.pdf?rlkey=d7bru2jno7398imeirn09fey5&dl=0" -q -O ./lyft_10k_2020.pdf
!wget "https://www.dropbox.com/scl/fi/lpmmki7a9a14s1l5ef7ep/lyft-10k-2021.pdf?rlkey=ud5cwlfotrii6r5jjag1o3hvm&dl=0" -q -O ./lyft_10k_2021.pdf
!wget "https://www.dropbox.com/scl/fi/iffbbnbw9h7shqnnot5es/lyft-10k-2022.pdf?rlkey=grkdgxcrib60oegtp4jn8hpl8&dl=0" -q -O ./lyft_10k_2022.pdf
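Because `wget -q` runs silently, a failed download or an HTML page saved under a `.pdf` name can go unnoticed until loading fails. The sketch below is an optional sanity check, not part of the original workflow; the helper names `looks_like_pdf` and `check_downloads` are hypothetical. It relies only on the fact that PDF files begin with the bytes `%PDF-`.

```python
from pathlib import Path


def looks_like_pdf(path: str) -> bool:
    """Return True if the file exists and starts with the PDF magic bytes."""
    p = Path(path)
    if not p.is_file():
        return False
    with p.open("rb") as f:
        return f.read(5) == b"%PDF-"


def check_downloads(paths: list[str]) -> list[str]:
    """Return the subset of `paths` that do not look like PDF files."""
    return [p for p in paths if not looks_like_pdf(p)]
```

Calling `check_downloads([f"./lyft_10k_{y}.pdf" for y in (2020, 2021, 2022)])` should return an empty list when all three downloads succeeded.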

### Load Data

In [7]:
# Lyft 2020 docs
lyft_2020_docs = SimpleDirectoryReader(input_files=["./lyft_10k_2020.pdf"]).load_data()

# Lyft 2021 docs
lyft_2021_docs = SimpleDirectoryReader(input_files=["./lyft_10k_2021.pdf"]).load_data()

# Lyft 2022 docs
lyft_2022_docs = SimpleDirectoryReader(input_files=["./lyft_10k_2022.pdf"]).load_data()

### Create Indices

In [8]:
# Index on Lyft 2020 Document
lyft_2020_index = VectorStoreIndex.from_documents(lyft_2020_docs)

# Index on Lyft 2021 Document
lyft_2021_index = VectorStoreIndex.from_documents(lyft_2021_docs)

# Index on Lyft 2022 Document
lyft_2022_index = VectorStoreIndex.from_documents(lyft_2022_docs)

HTTP Request: POST https://api.mistral.ai/v1/embeddings "HTTP/1.1 200 OK"

### Create Query Engines

In [9]:
# Query Engine on Lyft 2020 Docs Index
lyft_2020_query_engine = lyft_2020_index.as_query_engine(similarity_top_k=5)

# Query Engine on Lyft 2021 Docs Index
lyft_2021_query_engine = lyft_2021_index.as_query_engine(similarity_top_k=5)

# Query Engine on Lyft 2022 Docs Index
lyft_2022_query_engine = lyft_2022_index.as_query_engine(similarity_top_k=5)
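`similarity_top_k=5` asks each query engine to retrieve the five chunks whose embeddings are most similar to the query embedding before the LLM writes an answer. The sketch below illustrates that retrieval step on hand-made vectors, assuming cosine similarity as the metric. It is a conceptual illustration, not LlamaIndex's internal implementation, and the function names are hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list[float], chunks: dict[str, list[float]], k: int) -> list[str]:
    """Return the ids of the k chunks most similar to the query, best first."""
    ranked = sorted(chunks, key=lambda cid: cosine(query, chunks[cid]), reverse=True)
    return ranked[:k]


# Hand-made 2-D "embeddings" for three chunks.
chunks = {"rd": [1.0, 0.1], "revenue": [0.1, 1.0], "mixed": [0.7, 0.7]}
print(top_k([1.0, 0.0], chunks, k=2))
```

A larger `k` gives the LLM more context to draw on at the cost of a longer prompt.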

We also create a query engine that skips retrieval and sends the query directly to the LLM.

In [10]:
from llama_index.core.query_engine import CustomQueryEngine

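class LLMQueryEngine(CustomQueryEngine):
    """Query engine that answers with the LLM alone, without retrieval."""

    llm: MistralAI

    def custom_query(self, query_str: str) -> str:
        # Send the raw query to the LLM and return its completion as text.
        response = self.llm.complete(query_str)
        return str(response)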

llm_query_engine = LLMQueryEngine(llm=llm)

### Create Tools

We will create tools using the `QueryEngines` created earlier. The LLM will then select either a single tool or multiple tools to respond to the user's query.

In [11]:
from llama_index.core.tools import QueryEngineTool, ToolMetadata

query_engine_tools = [
    QueryEngineTool(
        query_engine=lyft_2020_query_engine,
        metadata=ToolMetadata(
            name="lyft_2020_10k_form",
            description="Annual report of Lyft's financial activities in 2020",
        ),
    ),
    QueryEngineTool(
        query_engine=lyft_2021_query_engine,
        metadata=ToolMetadata(
            name="lyft_2021_10k_form",
            description="Annual report of Lyft's financial activities in 2021",
        ),
    ),
    QueryEngineTool(
        query_engine=lyft_2022_query_engine,
        metadata=ToolMetadata(
            name="lyft_2022_10k_form",
            description="Annual report of Lyft's financial activities in 2022",
        ),
    ),
    QueryEngineTool(
        query_engine=llm_query_engine,
        metadata=ToolMetadata(
            name="general_queries",
            description=(
                "Provides information about general queries other than lyft."
            ),
        ),
    ),
]
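The three Lyft tools differ only in the year, so their names and descriptions can be generated rather than written out by hand. The sketch below builds plain dictionaries to stay runnable without LlamaIndex. The helper `lyft_tool_spec` is hypothetical; in the notebook, each dictionary's `name` and `description` would be passed to `ToolMetadata` alongside the matching query engine.

```python
def lyft_tool_spec(year: int) -> dict[str, str]:
    """Name and description for the tool backed by one year's 10-K index."""
    return {
        "name": f"lyft_{year}_10k_form",
        "description": f"Annual report of Lyft's financial activities in {year}",
    }


specs = [lyft_tool_spec(year) for year in (2020, 2021, 2022)]
for spec in specs:
    print(spec["name"], "->", spec["description"])
```

The descriptions deserve care because they are what the LLM reads when deciding which tool to call.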

### Initialize a `FunctionCallingAgentWorker`

In [12]:
from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.agent import AgentRunner

agent_worker = FunctionCallingAgentWorker.from_tools(
    query_engine_tools,
    llm=llm,
    verbose=True,
    allow_parallel_tool_calls=True,
)
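Conceptually, a function-calling agent repeats a simple loop: the LLM returns the names and arguments of the tools it wants, the worker runs them, and the results are handed back to the LLM to compose the final answer. The sketch below mimics only the dispatch step with stub tools. The `ToolCall` class, `run_tool_calls`, and the stub functions are hypothetical and do not reflect the internals of `FunctionCallingAgentWorker`.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToolCall:
    """A tool request as an LLM might emit it: a tool name plus an input string."""
    name: str
    input: str


def run_tool_calls(
    calls: list[ToolCall], tools: dict[str, Callable[[str], str]]
) -> dict[str, str]:
    """Run each requested tool and collect its output keyed by tool name."""
    results = {}
    for call in calls:
        if call.name not in tools:
            # Report unknown tools instead of raising, so other calls still run.
            results[call.name] = f"error: unknown tool {call.name!r}"
            continue
        results[call.name] = tools[call.name](call.input)
    return results


# Stub tools standing in for the real query engines.
tools = {
    "lyft_2022_10k_form": lambda q: f"[2022 filing] {q}",
    "general_queries": lambda q: f"[llm] {q}",
}
calls = [
    ToolCall("lyft_2022_10k_form", "R&D activities"),
    ToolCall("general_queries", "capital of France"),
]
print(run_tool_calls(calls, tools))
```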

### Querying

#### Simple Queries

##### Query: What is the capital of France?

The output below shows that the agent calls the `general_queries` tool, which answers with the LLM alone, because this is a general query.

In [13]:
agent = AgentRunner(agent_worker)
response = agent.chat("What is the capital of France?")
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))

Added user message to memory: What is the capital of France?
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: general_queries with args: {"input": "What is the capital of France?"}
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"


##### Query: What did Lyft do in R&D in 2022?

The output below shows that the agent calls the `lyft_2022_10k_form` tool to answer the query.

In [14]:
agent = AgentRunner(agent_worker)
response = agent.chat("What did Lyft do in R&D in 2022?")
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))

Added user message to memory: What did Lyft do in R&D in 2022?
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: lyft_2022_10k_form with args: {"input": "R&D activities in 2022"}
HTTP Request: POST https://api.mistral.ai/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"


##### Query: What did Lyft do in R&D in 2021?

The output below shows that the agent calls the `lyft_2021_10k_form` tool to answer the query.

In [15]:
agent = AgentRunner(agent_worker)
response = agent.chat("What did Lyft do in R&D in 2021?")
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))

Added user message to memory: What did Lyft do in R&D in 2021?
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: lyft_2021_10k_form with args: {"input": "R&D activities in 2021"}
HTTP Request: POST https://api.mistral.ai/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"


##### Query: What did Lyft do in R&D in 2020?

The output below shows that the agent calls the `lyft_2020_10k_form` tool to answer the query.

In [16]:
agent = AgentRunner(agent_worker)
response = agent.chat("What did Lyft do in R&D in 2020?")
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))

Added user message to memory: What did Lyft do in R&D in 2020?
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: lyft_2020_10k_form with args: {"input": "R&D activities in 2020"}
HTTP Request: POST https://api.mistral.ai/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"


#### Complex Queries

Let's test queries that require multiple tools.

##### Query: What did Lyft do in R&D in 2022 vs 2020?

The output below shows that the agent calls both the `lyft_2020_10k_form` and `lyft_2022_10k_form` tools to answer the query.

In [17]:
agent = AgentRunner(agent_worker)
response = agent.chat("What did Lyft do in R&D in 2022 vs 2020?")
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))

Added user message to memory: What did Lyft do in R&D in 2022 vs 2020?
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: lyft_2020_10k_form with args: {"input": "R&D expenses"}
HTTP Request: POST https://api.mistral.ai/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: lyft_2022_10k_form with args: {"input": "R&D expenses"}
HTTP Request: POST https://api.mistral.ai/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"


##### Query: What did Lyft do in R&D in 2022 vs 2021 vs 2020?

The output below shows that the agent calls the `lyft_2020_10k_form`, `lyft_2021_10k_form`, and `lyft_2022_10k_form` tools to answer the query.

In [18]:
agent = AgentRunner(agent_worker)
response = agent.chat("What did Lyft do in R&D in 2022 vs 2021 vs 2020?")
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))

Added user message to memory: What did Lyft do in R&D in 2022 vs 2021 vs 2020?
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: lyft_2020_10k_form with args: {"input": "R&D activities in 2020"}
HTTP Request: POST https://api.mistral.ai/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: lyft_2021_10k_form with args: {"input": "R&D activities in 2021"}
HTTP Request: POST https://api.mistral.ai/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: lyft_2022_10k_form with args: {"input": "R&D activities in 2022"}
HTTP Request: POST https://api.mistral.ai/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.mistral.ai/v1/chat/co

##### Query: What did Lyft do in R&D in 2022 and what is the capital of France?

The output below shows that the agent calls both the `lyft_2022_10k_form` and `general_queries` tools to answer the query.

In [19]:
agent = AgentRunner(agent_worker)
response = agent.chat("What did Lyft do in R&D in 2022 and what is the capital of France?")
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))

Added user message to memory: What did Lyft do in R&D in 2022 and what is the capital of France?
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: lyft_2022_10k_form with args: {"input": "R&D activities"}
HTTP Request: POST https://api.mistral.ai/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: general_queries with args: {"input": "capital of France"}
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.mistral.ai/v1/chat/completions "HTTP/1.1 200 OK"
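With `verbose=True`, the worker prints a `Calling function: ... with args: ...` line for every tool call, as in the outputs above. To check routing programmatically rather than by eye, those lines can be parsed. The sketch below is one way to do this, assuming the log format matches the output shown in this notebook; `called_tools` is a hypothetical helper.

```python
import json
import re

# Matches lines such as:
# Calling function: lyft_2020_10k_form with args: {"input": "R&D expenses"}
CALL_RE = re.compile(r"^Calling function: (\w+) with args: (\{.*\})\s*$")


def called_tools(log_text: str) -> list[tuple[str, dict]]:
    """Extract (tool_name, args) pairs from verbose agent output, in order."""
    calls = []
    for line in log_text.splitlines():
        match = CALL_RE.match(line.strip())
        if match:
            calls.append((match.group(1), json.loads(match.group(2))))
    return calls


log = """Added user message to memory: What did Lyft do in R&D in 2022 vs 2020?
=== Calling Function ===
Calling function: lyft_2020_10k_form with args: {"input": "R&D expenses"}
=== Calling Function ===
Calling function: lyft_2022_10k_form with args: {"input": "R&D expenses"}
"""
for name, args in called_tools(log):
    print(name, args["input"])
```

A query can then be labeled simple or complex after the fact by counting the distinct tool names returned.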
