<a href="https://colab.research.google.com/github/Y-YHat/hands-on-llamaindex/blob/main/02_agents_openai.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# OpenAIAgent

`OpenAIAgent` is an agent built on OpenAI function calling. It uses the OpenAI function calling API to decide whether to call a tool, then returns the response to the user. It supports both a flat list of tools and retrieval over a larger set of tools.
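
To make the "decide whether to call a tool" step concrete, here is a toy sketch of the dispatch loop over a flat list of tools. It is not how LlamaIndex or OpenAI implement it: `fake_model`, `run_agent`, and both tools are hypothetical stand-ins, and the model's decision is replaced by a simple string check so the block runs without an API key.

```python
import json
from typing import Callable, Dict, Optional

# Hypothetical flat list of tools: tool name -> plain Python callable.
TOOLS: Dict[str, Callable[[str], str]] = {
    "word_count": lambda text: str(len(text.split())),
    "shout": lambda text: text.upper(),
}


def fake_model(message: str) -> Optional[dict]:
    """Stand-in for the LLM: return a function call, or None to answer directly."""
    if message.startswith("count:"):
        args = {"input": message[len("count:"):].strip()}
        # A function-calling model returns a tool name plus JSON-encoded arguments.
        return {"name": "word_count", "arguments": json.dumps(args)}
    return None


def run_agent(message: str) -> str:
    """One turn: ask the (fake) model, run the chosen tool if any, build a reply."""
    call = fake_model(message)
    if call is None:
        return f"(direct answer) {message}"
    tool = TOOLS[call["name"]]
    args = json.loads(call["arguments"])
    output = tool(args["input"])
    return f"{call['name']} returned {output}"


print(run_agent("count: one two three"))
print(run_agent("hello"))
```

In the real agent the model also sees each tool's description, which is why the descriptions given to the tools later in this notebook matter.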

LlamaIndex notebook: https://docs.llamaindex.ai/en/stable/examples/agent/openai_agent_with_query_engine.html.

## Step 1: Install and Setup

In [2]:
!pip install -q llama_index

In [3]:
import logging, sys, os
import nest_asyncio
from google.colab import userdata

# set OpenAI API key in environment variable
os.environ["OPENAI_API_KEY"] = userdata.get("OPENAI_API_KEY")

# allow nested asyncio event loops (recommended in Colab notebooks)
nest_asyncio.apply()

logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
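
The cell above only works inside Colab, because `google.colab` is not importable elsewhere. A minimal sketch of a fallback, assuming the key may instead be set as an ordinary environment variable; `load_openai_key` is a hypothetical helper, not part of LlamaIndex or Colab.

```python
import os


def load_openai_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the key from Colab secrets if available, else from the environment."""
    try:
        from google.colab import userdata  # only importable inside Colab
    except ImportError:
        userdata = None

    key = None
    if userdata is not None:
        try:
            key = userdata.get(env_var)
        except Exception:
            # Secret missing or not shared with this notebook: fall through.
            key = None
    if not key:
        key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"{env_var} is not set in Colab secrets or the environment")
    return key
```

Usage would be `os.environ["OPENAI_API_KEY"] = load_openai_key()`, which leaves the rest of the notebook unchanged.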

In [None]:
# !mkdir -p reports
# !wget https://www.fiscal.treasury.gov/files/reports-statements/financial-report/2020/executive-summary-2020.pdf -O ./reports/2020-executive-summary.pdf
# !wget https://www.fiscal.treasury.gov/files/reports-statements/financial-report/2021/executive-summary-2021.pdf -O ./reports/2021-executive-summary.pdf
# !wget https://www.fiscal.treasury.gov/files/reports-statements/financial-report/2022/executive-summary-2022.pdf -O ./reports/2022-executive-summary.pdf

## Step 2: Load data, build indices, define OpenAIAgent

In [4]:
import os

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.tools import QueryEngineTool
from llama_index.agent.openai import OpenAIAgent

query_engine_tools = []

# Build one query-engine tool per Markdown file in ./data
for filename in os.listdir("data"):
    if not filename.endswith(".md"):
        continue

    file_path = os.path.join("data", filename)
    # Tool name is the file name without its extension ("agile.md" -> "agile")
    name = os.path.splitext(filename)[0]

    documents = SimpleDirectoryReader(input_files=[file_path]).load_data()
    print(f"Loaded {len(documents)} documents from {filename}")
    print(name)

    # Index the document and expose it to the agent as a tool
    index = VectorStoreIndex.from_documents(documents)
    query_engine = index.as_query_engine(similarity_top_k=5)
    query_engine_tool = QueryEngineTool.from_defaults(
        query_engine=query_engine,
        name=name,
        description=f"Provides information about agile document {name}",
    )
    query_engine_tools.append(query_engine_tool)

agent = OpenAIAgent.from_tools(query_engine_tools, verbose=True)

Loaded 1 documents from agile.md
agile
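
Each tool name is sent to the model as a function name, so it helps to derive names from file names in one place. A minimal sketch follows; `tool_name_from_path` is a hypothetical helper, and the character rule it enforces (letters, digits, underscores, hyphens, at most 64 characters) is an assumption about OpenAI's function-name constraint that should be checked against the current API documentation.

```python
import os
import re

# Assumed constraint on function names: [a-zA-Z0-9_-], up to 64 characters.
_INVALID_CHARS = re.compile(r"[^a-zA-Z0-9_-]")
_MAX_LEN = 64


def tool_name_from_path(path: str) -> str:
    """Derive a tool name from a file path: drop directory and extension, sanitize."""
    stem = os.path.splitext(os.path.basename(path))[0]
    name = _INVALID_CHARS.sub("_", stem)[:_MAX_LEN]
    if not name:
        raise ValueError(f"cannot derive a tool name from {path!r}")
    return name


print(tool_name_from_path("data/agile.md"))
print(tool_name_from_path("data/sprint notes v2.md"))
```

Using `os.path.splitext` instead of a fixed-width slice keeps the name correct for extensions of any length.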


## Step 3: Execute Queries

In [5]:
from IPython.display import Markdown

response = agent.chat("Can you create an executive summary of the sprint tasks completed in the document")
display(Markdown(f"<b>{response}</b>"))

Added user message to memory: Can you create an executive summary of the sprint tasks completed in the document
=== Calling Function ===
Calling function: agile with args: {"input":"executive summary"}
Got output: The team is currently working on various tasks related to different themes such as user authentication, UI/UX design, backend development, testing, documentation, refactoring, planning, optimization, improvements, integration, feature development, security, and performance. Tasks are at different stages like in progress, to do, review, done, and in review. Team members are assigned to specific tasks, and some tasks are waiting for input, approval, or clarification from stakeholders or team members.



<b>Here is the executive summary of the sprint tasks completed in the document:

The team is currently working on various tasks related to different themes such as user authentication, UI/UX design, backend development, testing, documentation, refactoring, planning, optimization, improvements, integration, feature development, security, and performance. Tasks are at different stages like in progress, to do, review, done, and in review. Team members are assigned to specific tasks, and some tasks are waiting for input, approval, or clarification from stakeholders or team members.</b>