# ReActAgent

Data agents are LLM-powered knowledge workers in LlamaIndex that can perform tasks over your data in both a “read” and a “write” capacity. They can:

- Perform automated search and retrieval over different types of data: unstructured, semi-structured, and structured.

- Call any external service API in a structured fashion, process the response, and store it for later use.

In that sense, agents are a step beyond our query engines in that they can not only “read” from a static source of data, but can dynamically ingest and modify data from a variety of different tools.

ReAct, short for Reasoning and Acting, was first introduced in the paper [ReAct: Synergizing Reasoning and Acting in Language Models](https://arxiv.org/pdf/2210.03629.pdf).  

The ReAct agent in LlamaIndex is an agent-based chat mode built on top of query engines over your data, and it is one of LlamaIndex’s main chat engines. For each chat interaction, the agent enters a reasoning and acting loop:

- Decide whether to use a query engine tool, which one to use, and what input to give it.
- Query with the chosen tool and observe its output.
- Based on the output, decide whether to repeat the process or give a final response.
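
The three steps of this loop can be sketched in plain Python. This is a minimal illustration, not LlamaIndex's implementation: `react_loop`, the `decide` policy (standing in for the LLM), and the toy tools `report_a` and `report_b` are all hypothetical names introduced here.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a query engine tool: maps a query string to an answer.
Tool = Callable[[str], str]


def react_loop(
    question: str,
    tools: Dict[str, Tool],
    decide: Callable[[str, List[Tuple[str, str]]], Tuple[str, str]],
    max_steps: int = 5,
) -> str:
    """Run a reason-act-observe loop until `decide` returns a final answer."""
    observations: List[Tuple[str, str]] = []
    for _ in range(max_steps):
        # Reason: pick a tool and its input, or finish with ("final", answer).
        action, action_input = decide(question, observations)
        if action == "final":
            return action_input
        # Act and observe: call the chosen tool and record its output.
        observations.append((action, tools[action](action_input)))
    return "Stopped: step limit reached."


# Toy tools (assumed for illustration only).
tools = {
    "report_a": lambda q: f"report_a answer to '{q}'",
    "report_b": lambda q: f"report_b answer to '{q}'",
}


def decide(question, observations):
    """Toy policy in place of the LLM: query each unused tool once, then answer."""
    used = {name for name, _ in observations}
    for name in tools:
        if name not in used:
            return name, question
    return "final", "; ".join(obs for _, obs in observations)


answer = react_loop("net operating cost", tools, decide)
print(answer)
```

In a real agent the `decide` step is an LLM call that emits the thought, the tool name, and the tool input; the step limit guards against a loop that never reaches a final response.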

We will use a ReAct agent to analyze the U.S. government’s financial reports for fiscal years 2020, 2021, 2022, and 2023.

LlamaIndex notebook: https://docs.llamaindex.ai/en/stable/examples/chat_engine/chat_engine_react.html.

## Step 1: Set Up the Query Tools

In [2]:
import os
from dotenv import load_dotenv, find_dotenv

_ = load_dotenv(find_dotenv())  # read the local .env file
openai_api_key = os.getenv("OPENAI_API_KEY")  # the LlamaIndex OpenAI client reads this from the environment

Save each file under a name that uses underscores rather than dashes; otherwise, the ReAct agent's chat completion won't work.

In [6]:
# !mkdir reports
# !wget https://www.fiscal.treasury.gov/files/reports-statements/financial-report/2020/executive-summary-2020.pdf -O ./reports/executive_summary_2020.pdf
# !wget https://www.fiscal.treasury.gov/files/reports-statements/financial-report/2021/executive-summary-2021.pdf -O ./reports/executive_summary_2021.pdf
# !wget https://www.fiscal.treasury.gov/files/reports-statements/financial-report/2022/executive-summary-2022.pdf -O ./reports/executive_summary_2022.pdf
# !wget https://www.fiscal.treasury.gov/files/reports-statements/financial-report/2023/executive-summary-2023.pdf -O ./reports/executive_summary_2023.pdf

# !wget https://www.bls.gov/news.release/archives/cpi_07112024.pdf
# !wget https://www.bls.gov/news.release/archives/cpi_06122024.pdf
# !wget https://www.bls.gov/news.release/archives/cpi_05152024.pdf
# !wget https://www.bls.gov/news.release/archives/cpi_04102024.pdf
# !wget https://www.bls.gov/news.release/archives/cpi_03122024.pdf
# !wget https://www.bls.gov/news.release/archives/cpi_02132024.pdf
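
If a downloaded file keeps the dashes from its URL, one way to derive an underscore-only name is sketched below. The helper `tool_name_from_file` is hypothetical and not part of LlamaIndex; it takes the file stem and replaces dashes with underscores.

```python
from pathlib import Path


def tool_name_from_file(filename: str) -> str:
    """Return the file name without its extension, with dashes replaced by underscores."""
    return Path(filename).stem.replace("-", "_")


print(tool_name_from_file("executive-summary-2020.pdf"))  # executive_summary_2020
```

The same helper could be used when building tool names from file names, so that a file saved with dashes still yields an underscore-only tool name.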


## Multi-Document Agent

In [3]:
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.tools import QueryEngineTool, ToolMetadata


query_engine_tools = []
for filename in os.listdir("reports"):
    if not filename.endswith(".pdf"):
        continue
    file_path = os.path.join("reports", filename)
    tool_name = os.path.splitext(filename)[0]  # file name without extension

    # SimpleDirectoryReader opens the PDF itself, so no explicit open() is needed.
    documents = SimpleDirectoryReader(input_files=[file_path]).load_data()
    print(f"Loaded {len(documents)} documents from {filename}")
    print(tool_name)

    # Split each report into 512-token chunks and build one vector index over them.
    splitter = SentenceSplitter(chunk_size=512)
    nodes = splitter.get_nodes_from_documents(documents)
    vector_index = VectorStoreIndex(nodes)
    query_engine = vector_index.as_query_engine(similarity_top_k=10)

    # Wrap the query engine as a tool the agent can select by name.
    query_engine_tool = QueryEngineTool(
        query_engine=query_engine,
        metadata=ToolMetadata(
            name=tool_name,
            description=(
                f"Provides information about the U.S. government financial report {tool_name}"
            ),
        ),
    )
    query_engine_tools.append(query_engine_tool)

Loaded 10 documents from executive_summary_2022.pdf
executive_summary_2022
Loaded 10 documents from executive_summary_2023.pdf
executive_summary_2023
Loaded 11 documents from executive_summary_2021.pdf
executive_summary_2021
Loaded 11 documents from executive_summary_2020.pdf
executive_summary_2020
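
Each query engine above is created with `similarity_top_k=10`, which asks the retriever for the ten chunks most similar to the query. The sketch below illustrates the idea of top-k retrieval in plain Python. It assumes cosine similarity as the ranking measure and uses made-up three-dimensional vectors in place of real embeddings; the functions `cosine` and `top_k` are hypothetical and are not LlamaIndex's implementation.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query_vec: Sequence[float],
    chunks: List[Tuple[str, Sequence[float]]],
    k: int,
) -> List[str]:
    """Return the texts of the k chunks most similar to the query vector."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Made-up chunk texts and vectors, for illustration only.
chunks = [
    ("chunk about net operating cost", [0.9, 0.1, 0.0]),
    ("chunk about total assets", [0.1, 0.9, 0.0]),
    ("chunk about audit opinion", [0.0, 0.1, 0.9]),
]
query = [1.0, 0.2, 0.0]
results = top_k(query, chunks, k=2)
print(results)
```

A larger `k` gives the LLM more context per tool call at the cost of a longer prompt.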


### ReAct Agent

In [4]:
from llama_index.core.agent import ReActAgent
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-4")

react_agent = ReActAgent.from_tools(query_engine_tools, llm=llm, verbose=True)

In [16]:
response = react_agent.chat("What is the U.S. government's net operating cost from 2020 to 2023?")
print(response)

Thought: The user is asking for the U.S. government's net operating cost from 2020 to 2023. I need to use the executive_summary tools for the years 2020 to 2023 to find this information.
Action: executive_summary_2020
Action Input: {'input': 'net operating cost'}
Observation: The net operating cost increased by $2.4 trillion during FY 2020 to $3.8 trillion.
Thought: I have the net operating cost for 2020. Now I need to find the net operating cost for 2021.
Action: executive_summary_2021
Action Input: {'input': 'net operating cost'}
Observation: The net operating cost decreased by $746.5 billion (19.4 percent) during FY 2021 to $3.1 trillion.
Thought: I have the net operating cost for 2021. Now I need to find the net operating cost for 2022.
Action: executive_summary_2022
Action Input: {'input': 'net operating cost'}
Observation: The net operating cost is calculated by subtracting earned program 

In [5]:
response = react_agent.chat("Compare and contrast the annual net operating cost from 2020 to 2023, then give an analysis")
print(str(response))

Thought: To answer this question, I need to use the executive_summary tools for each year from 2020 to 2023 to get the annual net operating cost.
Action: executive_summary_2020
Action Input: {'input': 'annual net operating cost'}
Observation: Annual net operating cost increased by $2.4 trillion in FY 2020 to $3.8 trillion.
Thought: Now I have the annual net operating cost for 2020. I need to use the executive_summary_2021 tool to get the annual net operating cost for 2021.
Action: executive_summary_2021
Action Input: {'input': 'annual net operating cost'}
Observation: $3.1 trillion
Thought: Now I have the annual net operating cost for 2021. I need to use the executive_summary_2022 tool to get the annual net operating cost for 2022.
Action: executive_summary_2022
Action Input: {'input': 'annual net operating cost'}
Observation: $4.2 trillion
Thought: Now I have the annual net o