# ReActAgent

Data Agents are LLM-powered knowledge workers in LlamaIndex that can intelligently perform various tasks over your data, in both a “read” and a “write” capacity. They can:

- Perform automated search and retrieval over different types of data: unstructured, semi-structured, and structured.

- Call any external service API in a structured fashion, then process the response and store it for later use.

Agents are therefore a step beyond our query engines: they can not only “read” from a static source of data but also dynamically ingest and modify data through a variety of tools.

ReAct, short for Reasoning and Acting, was first introduced in the paper [ReAct: Synergizing Reasoning and Acting in Language Models](https://arxiv.org/pdf/2210.03629.pdf).  

ReAct Agent, introduced by LlamaIndex, is an agent-based chat mode built on top of a query engine over your data, and it is one of LlamaIndex’s main chat engines. For each chat interaction, the agent enters a reasoning and acting loop:

- First, decide whether to use a query engine tool, which one to use, and what input to give it.
- Query with the query engine tool and observe its output.
- Based on the output, decide whether to repeat the process or give a final response.
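
The loop above can be sketched in plain Python. This is only an illustration under stated assumptions, not LlamaIndex's implementation: `TOOLS`, `fake_llm`, and `react_loop` are hypothetical names, and the "LLM" is a scripted stub so that the example runs offline.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool registry: tool name -> function that takes a query string.
TOOLS: Dict[str, Callable[[str], str]] = {
    "report_tool": lambda q: f"stub answer for: {q}",
}


def fake_llm(question: str, observations: List[str]) -> Tuple[str, str]:
    """Scripted stand-in for an LLM (an assumption made for this sketch).

    Returns ("act", tool_name) until an observation exists,
    then ("answer", final_text).
    """
    if not observations:
        return ("act", "report_tool")
    return ("answer", f"Final answer based on: {observations[-1]}")


def react_loop(question: str, max_steps: int = 5) -> str:
    observations: List[str] = []
    for _ in range(max_steps):
        kind, payload = fake_llm(question, observations)  # reason
        if kind == "answer":
            return payload
        tool = TOOLS.get(payload)
        if tool is None:
            # Feed the error back as an observation so the model can retry.
            observations.append(f"Error: No such tool named `{payload}`.")
            continue
        observations.append(tool(question))  # act, then observe
    return "Stopped: reached max_steps without a final answer."


print(react_loop("What changed?"))
```

The `max_steps` cap is a design choice of this sketch: it keeps a model that never produces a final answer from looping forever.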

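We will use a ReAct agent to analyze U.S. government reports. The commented-out download cell below references the government’s financial reports for fiscal years 2020 through 2023, while the executed cells query the Bureau of Labor Statistics Consumer Price Index (CPI) release for June 2024.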

LlamaIndex notebook: https://docs.llamaindex.ai/en/stable/examples/chat_engine/chat_engine_react.html

## Step 1: Set Up the Query Tools

In [12]:
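import os
from dotenv import load_dotenv, find_dotenv

# Read the local .env file so that OPENAI_API_KEY is available in the environment.
# The unused `from openai import OpenAI` import was dropped; a later cell imports
# OpenAI from llama_index.llms.openai under the same name.
_ = load_dotenv(find_dotenv())
openai_api_key = os.getenv("OPENAI_API_KEY")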

Be sure to save each file under a name that contains underscores rather than dashes; otherwise, the ReAct agent's chat completion won't work.
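
One way to follow this advice automatically is to derive each tool name from the file name and replace every character other than a letter, digit, or underscore. The helper below is a hypothetical sketch (`tool_name_from_path` is not part of LlamaIndex), and it simply applies the naming rule stated above.

```python
import re
from pathlib import Path


def tool_name_from_path(path: str) -> str:
    """Hypothetical helper: derive an underscore-only tool name from a file path."""
    stem = Path(path).stem  # file name without its extension
    # Replace anything that is not a letter, digit, or underscore.
    return re.sub(r"[^0-9A-Za-z_]", "_", stem)


print(tool_name_from_path("reports/executive-summary-2020.pdf"))
# executive_summary_2020
```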

In [2]:
# !mkdir reports
# !wget https://www.fiscal.treasury.gov/files/reports-statements/financial-report/2020/executive-summary-2020.pdf -O ./reports/executive_summary_2020.pdf
# !wget https://www.fiscal.treasury.gov/files/reports-statements/financial-report/2021/executive-summary-2021.pdf -O ./reports/executive_summary_2021.pdf
# !wget https://www.fiscal.treasury.gov/files/reports-statements/financial-report/2022/executive-summary-2022.pdf -O ./reports/executive_summary_2022.pdf
# !wget https://www.fiscal.treasury.gov/files/reports-statements/financial-report/2023/executive-summary-2023.pdf -O ./reports/executive_summary_2023.pdf

# !wget https://www.bls.gov/news.release/archives/cpi_07112024.pdf
# !wget https://www.bls.gov/news.release/archives/cpi_06122024.pdf
# !wget https://www.bls.gov/news.release/archives/cpi_05152024.pdf
# !wget https://www.bls.gov/news.release/archives/cpi_04102024.pdf
# !wget https://www.bls.gov/news.release/archives/cpi_03122024.pdf
# !wget https://www.bls.gov/news.release/archives/cpi_02132024.pdf


In [3]:
from llama_index.core import SimpleDirectoryReader

# load documents
documents = SimpleDirectoryReader(input_files=["202406.pdf"]).load_data()

In [4]:
from llama_index.core.node_parser import SentenceSplitter

splitter = SentenceSplitter(chunk_size=1024)
nodes = splitter.get_nodes_from_documents(documents)
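
To build intuition for what a sentence-aware splitter does, here is a simplified sketch that packs whole sentences into chunks under a size budget. It is an assumption-laden illustration, not `SentenceSplitter` itself: `pack_sentences` is a hypothetical function, it counts words rather than tokens, and it ignores chunk overlap.

```python
import re
from typing import List


def pack_sentences(text: str, max_words: int) -> List[str]:
    """Greedily pack whole sentences into chunks of at most max_words words."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: List[str] = []
    current: List[str] = []
    count = 0
    for sentence in sentences:
        n = len(sentence.split())
        # Start a new chunk if adding this sentence would exceed the budget.
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks


sample = "Prices rose in May. Energy fell. Shelter costs kept climbing in June."
for chunk in pack_sentences(sample, max_words=6):
    print(chunk)
```

Keeping sentences intact means a chunk boundary never falls mid-sentence, which tends to give the retriever more coherent passages to work with.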

In [5]:
print(nodes[0].get_content(metadata_mode="all"))

page_label: 1
file_name: 202406.pdf
file_path: 202406.pdf
file_type: application/pdf
file_size: 525632
creation_date: 2024-07-26
last_modified_date: 2024-07-26

Transmission of material in this release is embargoed until                                         USDL -24-1325 
8:30 a.m. ( ET) Thursday , July 11, 2024      
  
Technical information: (202) 691- 7000  •  cpi_info @bls.gov  •  www.bls.gov/cpi  
Media c ontact:              (202) 691- 5902  •  PressOffice@bls.gov   
 
CONSUMER  PRICE  INDEX  – JUNE  2024  
 
(NOTE: This news release was reissued on July 11, 2024. BLS inadvertently published an index 
value and related 1 -month and 12 -month percent changes for inpatient hospital services for June 
2024 that did not meet publication criteria.  These data have been removed from tables 2, 6, and 7 
of the news release. These data were not published in the database. )  
 
The Consumer Price Index for All Urban Consumers (CPI- U) declined 0.1 percent  on a seasonally 
adjusted bas

In [6]:
from llama_index.core import VectorStoreIndex

vector_index = VectorStoreIndex(nodes)

query_engine = vector_index.as_query_engine(similarity_top_k=10)
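
The `similarity_top_k=10` argument asks the retriever for the ten nodes whose embeddings are most similar to the query embedding. A rough sketch of that ranking step follows, assuming cosine similarity and toy 2-D vectors; real embeddings come from the embedding model, and `cosine` and `top_k` are hypothetical helper names.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query: Sequence[float], vectors: List[Sequence[float]], k: int
) -> List[Tuple[int, float]]:
    """Return (index, score) pairs for the k vectors most similar to the query."""
    scored = [(i, cosine(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


toy_vectors = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
print(top_k((1.0, 0.1), toy_vectors, k=2))
```

A larger `k` gives the LLM more context to synthesize from, at the cost of a longer prompt.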

In [6]:
# What is the U.S. government's net operating cost?
# What are the total revenues reported by the U.S. government?
# What are the total assets reported by the U.S. government?
# What is the total national debt reported by the U.S. government?
# response = query_engine.query(
#     "What is the U.S. government's net operating cost?" 
# )
# print(response)

In [7]:
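def vector_query(query: str) -> str:
    """Answer a question using the vector index built over the CPI report."""
    # Retrieve the 10 most similar nodes, then synthesize an answer from them.
    query_engine = vector_index.as_query_engine(similarity_top_k=10)
    response = query_engine.query(query)
    return response.response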

In [8]:
vector_query("What is the overall CPI for the current period?")

'The overall CPI for the current period is 3.0 percent.'

In [9]:
vector_query("Which categories (e.g., food, energy, housing) contributed most to the change in CPI?")

'Food, housing, and transportation were the categories that contributed significantly to the change in CPI.'

In [10]:
from llama_index.llms.openai import OpenAI
from llama_index.core.tools import FunctionTool

vector_query_tool = FunctionTool.from_defaults(
    name="vector_tool",
    fn=vector_query
)

llm = OpenAI(model="gpt-4o-mini", temperature=0)
response = llm.predict_and_call(
    [vector_query_tool], 
    "Which categories (e.g., food, energy, housing) contributed most to the change in CPI?", 
    verbose=True
)

=== Calling Function ===
Calling function: vector_tool with args: {"query": "categories contributing to change in Consumer Price Index (CPI)"}
=== Function Output ===
Categories contributing to change in Consumer Price Index (CPI) include housing, education and communication, pets and pet products, food and beverages, household energy, medical care, transportation, and other goods and services.
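
The log above shows the model choosing a tool name and a JSON argument string, after which the framework calls the matching Python function. That dispatch step can be sketched as follows. This is a simplified illustration rather than LlamaIndex's internals: `REGISTRY`, `dispatch`, and `vector_query_stub` are hypothetical, and the stub replaces `vector_query` so that the code runs without an index.

```python
import json
from typing import Callable, Dict


def vector_query_stub(query: str) -> str:
    """Stand-in for vector_query so the sketch runs without an index."""
    return f"(stub) results for: {query}"


# Hypothetical registry mapping tool names to Python callables.
REGISTRY: Dict[str, Callable[..., str]] = {"vector_tool": vector_query_stub}


def dispatch(tool_name: str, raw_args: str) -> str:
    """Parse the JSON arguments and call the tool registered under tool_name."""
    kwargs = json.loads(raw_args)
    return REGISTRY[tool_name](**kwargs)


print(dispatch("vector_tool", '{"query": "categories contributing to change in CPI"}'))
```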


## Multi-Document Agent

In [13]:
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.tools import QueryEngineTool, ToolMetadata


query_engine_tools = []
for filename in os.listdir("cpi"):
    if not filename.endswith(".pdf"):
        continue

    file_path = os.path.join("cpi", filename)
    tool_name = filename[:-4]  # file name without the .pdf extension

    # SimpleDirectoryReader opens the file itself, so no open() call is needed.
    documents = SimpleDirectoryReader(input_files=[file_path]).load_data()
    print(f"Loaded {len(documents)} documents from {filename}")
    print(tool_name)

    # Split into chunks and build a single vector index per file.
    # (The original also built an unused index with from_documents.)
    splitter = SentenceSplitter(chunk_size=512)
    nodes = splitter.get_nodes_from_documents(documents)
    index = VectorStoreIndex(nodes)
    query_engine = index.as_query_engine(similarity_top_k=10)

    query_engine_tool = QueryEngineTool(
        query_engine=query_engine,
        metadata=ToolMetadata(
            name=tool_name,
            description=f"Consumer Price Index (CPI) in 2024 {filename}",
        ),
    )
    query_engine_tools.append(query_engine_tool)

Loaded 38 documents from 202406.pdf
202406


In [15]:
query_engine_tools[0].metadata

ToolMetadata(description='Consumer Price Index (CPI) in 2024 202406.pdf', name='202406', fn_schema=<class 'llama_index.core.tools.types.DefaultToolFnSchema'>, return_direct=False)

In [16]:
from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.agent import AgentRunner

agent_worker = FunctionCallingAgentWorker.from_tools(
    query_engine_tools, 
    llm=llm, 
    verbose=True
)
agent = AgentRunner(agent_worker)

In [17]:
response = agent.query("Which categories (e.g., food, energy, housing) contributed most to the change in CPI?")

Added user message to memory: Which categories (e.g., food, energy, housing) contributed most to the change in CPI?
=== Calling Function ===
Calling function: 202406 with args: {"input": "categories contributing to change in CPI"}
=== Function Output ===
Categories contributing to the change in CPI include Household furnishings and operations, Other goods and services, Personal care, New and used motor vehicles, Utilities and public transportation, Other services, Apparel less footwear, Fuels and utilities, Household energy, Medical care, Transportation, and Food and beverages.
=== LLM Response ===
The categories that contributed most to the change in the Consumer Price Index (CPI) include:

1. Household furnishings and operations
2. Other goods and services
3. Personal care
4. New and used motor vehicles
5. Utilities and public transportation
6. Other services
7. Apparel less footwear
8. Fuels and utilities
9. Household energy
10. Medical care
11. Transportation
12. Food and beverages

### ReAct Agent

In [19]:
from llama_index.core.agent import ReActAgent
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-4o-mini")

react_agent = ReActAgent.from_tools(query_engine_tools, llm=llm, verbose=True)

In [20]:
response = react_agent.chat("Which categories (e.g., food, energy, housing) contributed most to the change in CPI?")
print(response)

> Running step 2d0f6bd6-2024-4d0a-b4f7-c7032616211e. Step input: Which categories (e.g., food, energy, housing) contributed most to the change in CPI?
Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
Action: 202406
Action Input: {'input': 'Which categories contributed most to the change in CPI?'}
Observation: Categories that contributed most to the change in CPI include domestically produced farm food, other goods and services, personal care, utilities and public transportation, and medical care.
> Running step 0df8f22d-a8ff-415d-bac1-8e6735d52bc0. Step input: None
Thought: I can answer without using any more tools. I'll use the user's language to answer.
Answer: The categories that contributed most to the change in CPI include domestically produced farm food, other goods and services, personal care, utilities and public transportation, and medical care.
The categories that con

In [21]:
response = react_agent.chat("Compare and contrast the food, energy, housing cost, then give an analysis")
print(str(response))

> Running step 5e6277dd-06b2-476b-bc61-89126615a736. Step input: Compare and contrast the food, energy, housing cost, then give an analysis
Thought: The user is asking for a comparison and analysis of the costs related to food, energy, and housing. I need to gather specific data on these categories from the CPI document to provide a detailed comparison and analysis.
Action: tool
Action Input: {'input': 'Compare and contrast the costs of food, energy, and housing in the CPI data.'}
Observation: Error: No such tool named `tool`.
> Running step 88c8228b-5a65-4e3e-90ac-9f961559f29f. Step input: None
Thought: I need to use the available tool to gather information on the costs of food, energy, and housing from the CPI data.
Action: 202406
Action Input: {'input': 'Compare and contrast the costs of food, energy, and housing in the CPI data.'}
Observation: Food costs have shown a slight increase over the period, with food at home exper

In [23]:
llm_instruct = OpenAI(model="gpt-3.5-turbo-instruct")

react_agent = ReActAgent.from_tools(query_engine_tools, llm=llm_instruct, verbose=True)

In [24]:
response = react_agent.chat("Compare and contrast the food, energy, housing cost, then give an analysis")
print(str(response))

> Running step f9c434a0-d616-4a38-8578-8b8d3df0b799. Step input: Compare and contrast the food, energy, housing cost, then give an analysis
Observation: Error: Could not parse output. Please follow the thought-action-input format. Try again.
> Running step b733e4b7-93fa-4bad-a556-926206e0a168. Step input: None
Observation: Error: Could not parse output. Please follow the thought-action-input format. Try again.
> Running step 9e189b3f-2ddf-4b5e-ad0a-cfd03604b2d5. Step input: None
Observation: Error: Could not parse output. Please follow the thought-action-input format. Try again.
> Running step cae19237-6516-4fea-bc71-6d94bb91d34c. Step input: None
Observation: Error: Could not parse output. Please follow the thought-action-input format. Try again.
> Running step edd6692f-ad54-4d7f-86e8-0f89afa2a789. Step input: None
Thought: I cannot answer the question with the provided tools.
Answer: I am sorry, I am not able to answe
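
The repeated "Could not parse output" observations indicate that the model's text did not follow the thought-action-input format the agent expects. The sketch below shows how such a format check might work. It is not LlamaIndex's parser: `STEP_PATTERN` and `parse_step` are hypothetical, and the sketch assumes the action input is valid JSON.

```python
import json
import re
from typing import Optional, Tuple

# Hypothetical pattern for one "Thought / Action / Action Input" step.
STEP_PATTERN = re.compile(
    r"Thought:\s*(?P<thought>.+?)\s*"
    r"Action:\s*(?P<action>\S+)\s*"
    r"Action Input:\s*(?P<args>\{.*\})",
    re.DOTALL,
)


def parse_step(text: str) -> Optional[Tuple[str, str, dict]]:
    """Return (thought, action, args), or None if the text is malformed."""
    match = STEP_PATTERN.search(text)
    if match is None:
        return None
    try:
        args = json.loads(match.group("args"))
    except json.JSONDecodeError:
        return None
    return match.group("thought"), match.group("action"), args


good = 'Thought: I need a tool.\nAction: 202406\nAction Input: {"input": "food vs energy"}'
bad = "Food and energy costs both rose."
print(parse_step(good))
print(parse_step(bad))
```

When a check like this fails, an agent can feed an error message back to the model as an observation and retry, which matches the pattern visible in the log above.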