## LlamaIndex ReAct Agent with Query Engine (RAG) Tools

https://docs.llamaindex.ai/en/stable/examples/agent/react_agent_with_query_engine/

## Build Query Engine Tools

In [1]:
from llama_index.core import (
    SimpleDirectoryReader,
    VectorStoreIndex,
    StorageContext,
    load_index_from_storage,
)

from llama_index.core.tools import QueryEngineTool, ToolMetadata

In [2]:
try:
    storage_context = StorageContext.from_defaults(
        persist_dir="/app/data/storage/lyft"
    )
    lyft_index = load_index_from_storage(storage_context)

    storage_context = StorageContext.from_defaults(
        persist_dir="/app/data/storage/uber"
    )
    uber_index = load_index_from_storage(storage_context)

    index_loaded = True
except Exception:
    # Nothing usable was found on disk; the indexes are built in a later cell.
    index_loaded = False

In [3]:
!mkdir -p '/app/data/10k/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/uber_2021.pdf' -O '/app/data/10k/uber_2021.pdf'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/lyft_2021.pdf' -O '/app/data/10k/lyft_2021.pdf'


In [3]:
if not index_loaded:
    # load data
    lyft_docs = SimpleDirectoryReader(
        input_files=["/app/data/10k/lyft_2021.pdf"]
    ).load_data()
    uber_docs = SimpleDirectoryReader(
        input_files=["/app/data/10k/uber_2021.pdf"]
    ).load_data()

    # build index
    lyft_index = VectorStoreIndex.from_documents(lyft_docs)
    uber_index = VectorStoreIndex.from_documents(uber_docs)

    # persist index
    lyft_index.storage_context.persist(persist_dir="/app/data/storage/lyft")
    uber_index.storage_context.persist(persist_dir="/app/data/storage/uber")
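
The load-or-build caching used in the cells above can be written as one reusable helper. The following is a minimal sketch under stated assumptions, not LlamaIndex code: `load_or_build` and its `load`, `build`, and `persist` hooks are hypothetical names, and a JSON file stands in for a real index. Unlike the notebook, which rebuilds both indexes if either load fails, this sketch handles each directory independently.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def load_or_build(
    persist_dir: Path,
    load: Callable[[Path], T],
    build: Callable[[], T],
    persist: Callable[[T, Path], None],
) -> Tuple[T, bool]:
    """Return (object, loaded_from_disk). All hooks are hypothetical stand-ins."""
    if persist_dir.is_dir() and any(persist_dir.iterdir()):
        try:
            return load(persist_dir), True
        except (OSError, ValueError):
            # Unreadable or corrupt cache: fall through and rebuild.
            pass
    obj = build()
    persist_dir.mkdir(parents=True, exist_ok=True)
    persist(obj, persist_dir)
    return obj, False


# Toy hooks: a dict saved as JSON plays the role of an index.
def _load(directory: Path) -> dict:
    return json.loads((directory / "index.json").read_text())


def _build() -> dict:
    return {"docs": ["lyft_2021"]}


def _persist(obj: dict, directory: Path) -> None:
    (directory / "index.json").write_text(json.dumps(obj))


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "lyft"
    first, first_loaded = load_or_build(cache, _load, _build, _persist)
    second, second_loaded = load_or_build(cache, _load, _build, _persist)
```

The first call builds and persists the object; the second call finds the directory populated and loads it instead.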

In [4]:
lyft_engine = lyft_index.as_query_engine(similarity_top_k=3)
uber_engine = uber_index.as_query_engine(similarity_top_k=3)
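
`similarity_top_k=3` asks each query engine to retrieve the three chunks most similar to the query before the answer is synthesized. The toy sketch below illustrates the idea only. It assumes cosine similarity and uses made-up two-dimensional vectors; the scoring a real retriever applies depends on the configured vector store and embedding model, and `cosine` and `top_k` are hypothetical helper names.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query_vec: Sequence[float],
    chunks: List[Tuple[str, Sequence[float]]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Score every chunk against the query and keep the k best, highest first."""
    scored = [(text, cosine(query_vec, vec)) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up "embeddings" purely for illustration.
chunks = [
    ("revenue section", [1.0, 0.0]),
    ("risk factors", [0.0, 1.0]),
    ("revenue outlook", [0.9, 0.1]),
    ("legal proceedings", [-1.0, 0.0]),
]
hits = top_k([1.0, 0.0], chunks, k=3)
for text, score in hits:
    print(f"{score:+.3f}  {text}")
```

A larger `k` gives the LLM more context per question, at the cost of longer prompts.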

In [5]:
query_engine_tools = [
    QueryEngineTool(
        query_engine=lyft_engine,
        metadata=ToolMetadata(
            name="lyft_10k",
            description=(
                "Provides information about Lyft financials for year 2021. "
                "Use a detailed plain text question as input to the tool."
            ),
        ),
    ),
    QueryEngineTool(
        query_engine=uber_engine,
        metadata=ToolMetadata(
            name="uber_10k",
            description=(
                "Provides information about Uber financials for year 2021. "
                "Use a detailed plain text question as input to the tool."
            ),
        ),
    ),
]
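
Each tool's `name` and `description` are what the agent sees when deciding which tool to call, so they act as a small routing table. The sketch below shows that idea in plain Python under stated assumptions: `Tool`, `describe_tools`, and `call_tool` are hypothetical names, the `run` functions are stubs rather than real query engines, and none of this is the LlamaIndex implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Tool:
    name: str                  # identifier the agent emits after "Action:"
    description: str           # text shown to the LLM so it can choose a tool
    run: Callable[[str], str]  # stub standing in for a query engine


def describe_tools(tools: List[Tool]) -> str:
    """Render the tool list as it might appear in an agent's system prompt."""
    return "\n".join(f"- {tool.name}: {tool.description}" for tool in tools)


def call_tool(registry: Dict[str, Tool], name: str, question: str) -> str:
    """Dispatch by name; an unknown name yields an error string, not an exception."""
    tool = registry.get(name)
    if tool is None:
        return f"Error: no tool named {name!r}"
    return tool.run(question)


tools = [
    Tool(
        "lyft_10k",
        "Provides information about Lyft financials for year 2021.",
        lambda q: f"[lyft stub] {q}",
    ),
    Tool(
        "uber_10k",
        "Provides information about Uber financials for year 2021.",
        lambda q: f"[uber stub] {q}",
    ),
]
registry = {tool.name: tool for tool in tools}
tool_prompt = describe_tools(tools)
print(tool_prompt)
```

Returning an error string for an unknown tool lets an agent loop feed the mistake back to the model as an observation instead of crashing.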

## Set Up ReAct Agent

In [6]:
from llama_index.core.agent import ReActAgent
from llama_index.llms.openai import OpenAI

In [8]:
# [Optional] Add Context
# context = """\
# You are a stock market sorcerer who is an expert on the companies Lyft and Uber.\
#     You will answer questions about Uber and Lyft as in the persona of a sorcerer \
#     and veteran stock market investor.
# """
llm = OpenAI(model="gpt-3.5-turbo")

agent = ReActAgent.from_tools(
    query_engine_tools,
    llm=llm,
    verbose=True,
    # context=context
)
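
With `verbose=True`, the agent prints its reasoning as `Thought` / `Action` / `Action Input` steps, each followed by an `Observation`, until it emits an `Answer`. The sketch below is a simplified illustration of that loop, not the `ReActAgent` implementation: `react_loop`, `scripted_llm`, and the stub tool are hypothetical, and the "LLM" is a scripted function so the example runs offline. The action input is parsed with `ast.literal_eval` on the assumption that it is a Python-style dict literal.

```python
import ast
import re
from typing import Callable, Dict, List

ACTION_RE = re.compile(
    r"Action:\s*(?P<tool>\S+)\s+Action Input:\s*(?P<args>\{.*\})", re.DOTALL
)
ANSWER_RE = re.compile(r"Answer:\s*(?P<answer>.*)", re.DOTALL)


def react_loop(
    question: str,
    llm: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> str:
    """Alternate LLM replies and tool observations until an Answer appears."""
    transcript: List[str] = [f"Question: {question}"]
    for _ in range(max_steps):
        reply = llm("\n".join(transcript))
        transcript.append(reply)
        action = ACTION_RE.search(reply)
        if action:
            args = ast.literal_eval(action.group("args"))
            tool = tools.get(action.group("tool"))
            observation = (
                tool(args["input"])
                if tool
                else f"Error: unknown tool {action.group('tool')}"
            )
            transcript.append(f"Observation: {observation}")
            continue
        answer = ANSWER_RE.search(reply)
        if answer:
            return answer.group("answer").strip()
    raise RuntimeError("no final answer within max_steps")


def scripted_llm(prompt: str) -> str:
    """Stand-in for a real model: call one tool, then repeat its observation."""
    if "Observation:" not in prompt:
        return (
            "Thought: I can use the lyft_10k tool.\n"
            "Action: lyft_10k\n"
            "Action Input: {'input': 'Lyft revenue growth in 2021?'}"
        )
    observation = prompt.rsplit("Observation: ", 1)[1]
    return f"Thought: I can answer without using any more tools.\nAnswer: {observation}"


calls: List[str] = []


def lyft_stub(query: str) -> str:
    calls.append(query)
    return f"[stub result for: {query}]"


final_answer = react_loop(
    "What was Lyft's revenue growth in 2021?", scripted_llm, {"lyft_10k": lyft_stub}
)
print(final_answer)
```

The `max_steps` cap is a design choice in this sketch: it keeps a model that never produces an `Answer` from looping forever.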

In [9]:
response = agent.chat("What was Lyft's revenue growth in 2021?")
print(str(response))

> Running step 2756bc63-9ad6-4b93-a054-c508bc1cf25e. Step input: What was Lyft's revenue growth in 2021?
Thought: The user is asking about Lyft's revenue growth in 2021. I can use the lyft_10k tool to find this information.
Action: lyft_10k
Action Input: {'input': "What was Lyft's revenue growth in 2021?"}
Observation: Lyft's revenue increased by 36% in 2021 compared to the prior year.
> Running step 0206e406-06d8-4a86-a6ec-d2e496f0842f. Step input: None
Thought: I can answer the user's question without using any more tools.
Answer: Lyft's revenue grew by 36% in 2021 compared to the previous year.
Lyft's revenue grew by 36% in 2021 compared to the previous year.


## Running Example Queries

In [10]:
response = agent.chat(
    "Compare and contrast the revenue growth of Uber and Lyft in 2021, then give an analysis"
)
print(str(response))

> Running step b8247dae-9a57-49cd-b005-03da6a21fe7e. Step input: Compare and contrast the revenue growth of Uber and Lyft in 2021, then give an analysis
Thought: I need to compare the revenue growth of Uber and Lyft in 2021 to provide an analysis.
Action: lyft_10k
Action Input: {'input': "What was Lyft's revenue growth in 2021?"}
Observation: Lyft's revenue increased by 36% in 2021 compared to the prior year.
> Running step 31c0874e-1893-42af-92da-c934380161ad. Step input: None
Thought: I need to gather information about Uber's revenue growth in 2021 to compare it with Lyft's growth.
Action: uber_10k
Action Input: {'input': "What was Uber's revenue growth in 2021?"}
Observation: Uber's revenue grew by 57% in 2021.
> Running step 305fddf6-7ba0-45fb-aacb-0d72df4b4420. Step input: None
Thought: I have the necessary information to compare and contrast the revenue growth of Uber and Lyft in 2021.
Answer: In 2021,

In [11]:
response = agent.chat(
    "Can you tell me about the risk factors of the company with the higher revenue?"
)
print(str(response))

> Running step 9b209459-07ee-44af-9e5c-28dc306e9fa3. Step input: Can you tell me about the risk factors of the company with the higher revenue?
Thought: I can answer without using any more tools. I'll use the user's language to answer
Answer: The risk factors for Uber, which had higher revenue growth in 2021, may include regulatory challenges, competition in the ride-sharing industry, and potential impacts of economic downturns on consumer spending.
The risk factors for Uber, which had higher revenue growth in 2021, may include regulatory challenges, competition in the ride-sharing industry, and potential impacts of economic downturns on consumer spending.
