<a href="https://colab.research.google.com/github/sudarshan-koirala/youtube-stuffs/blob/main/llamaindex/llamaindex_openai_assistant_agent.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# OpenAI Assistant Agent

This notebook shows how to use the LlamaIndex agent abstractions built on top of the [OpenAI Assistant API](https://platform.openai.com/docs/assistants/overview).


In [1]:
%%capture
!pip install llama-index watermark openai

In [2]:
%load_ext watermark
%watermark -a "Sudarshan Koirala" -vmp llama_index,openai

Author: Sudarshan Koirala

Python implementation: CPython
Python version       : 3.10.12
IPython version      : 7.34.0

llama_index: 0.11.11
openai     : 1.47.0

Compiler    : GCC 11.4.0
OS          : Linux
Release     : 6.1.85+
Machine     : x86_64
Processor   : x86_64
CPU cores   : 2
Architecture: 64bit



In [None]:
# get your openai api key from https://platform.openai.com/account/api-keys 🔑
import openai
import os
from getpass import getpass

OPENAI_API_KEY = getpass()
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY  # use the key entered above, not a placeholder string

··········
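
The key handling above can also be wrapped in a small helper that prefers a key already present in the environment and only prompts when none is set. This is a minimal sketch using the standard library; `resolve_api_key` and its parameters are hypothetical names introduced for illustration and are not part of `openai` or LlamaIndex.

```python
import os
from getpass import getpass


def resolve_api_key(env=os.environ, prompt=getpass):
    """Return an API key from `env`, falling back to an interactive prompt.

    `env` and `prompt` are injectable so the helper can be exercised
    without touching the real environment or blocking on input.
    """
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        # Nothing usable in the environment: ask the user (input is hidden).
        key = prompt("OpenAI API key: ").strip()
    if not key:
        raise ValueError("no API key provided")
    return key


# Example with a stand-in environment; no prompt is shown because a key exists.
example_key = resolve_api_key(env={"OPENAI_API_KEY": "sk-example"})
```

In the notebook, the result would then be assigned to `os.environ["OPENAI_API_KEY"]`.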


## Simple Agent (no external tools)

Here we show a simple example with the built-in code interpreter.

In [6]:
from llama_index.agent.openai import OpenAIAssistantAgent


In [10]:
OpenAIAssistantAgent??

In [11]:
agent = OpenAIAssistantAgent.from_new(
    name="Math Tutor",
    instructions="You are a personal math tutor. Write and run code to answer math questions.",
    openai_tools=[{"type": "code_interpreter"}],
    instructions_prefix="Please address the user as Sudarshan Koirala.",
)

In [12]:
agent.thread_id

'thread_tEIZPKng47drGQvA8Nf7uat8'

In [13]:
response = agent.chat(
    "I need to solve the equation `3x + 11 = 0`. Can you help me?"
)

In [14]:
print(str(response))

Certainly, Sudarshan Koirala. The solution to the equation `3x + 11 = 0` is `x = -11/3`.
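
The assistant's answer can be checked locally with exact rational arithmetic. The sketch below is independent of the agent; `solve_linear` is a hypothetical helper that solves `a*x + b = 0` for non-zero `a`.

```python
from fractions import Fraction


def solve_linear(a, b):
    """Solve a*x + b = 0 exactly, assuming integer a and b with a != 0."""
    if a == 0:
        raise ValueError("a must be non-zero")
    return Fraction(-b, a)


x = solve_linear(3, 11)
print(x)  # -11/3

# Substituting back confirms the root.
residual = 3 * x + 11
```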


## Assistant with Query Engine Tools

Here we showcase the function calling capabilities of the OpenAIAssistantAgent by integrating it with our query engine tools over different documents.

### 1. Setup: Load Data

In [18]:
from llama_index.agent.openai import OpenAIAssistantAgent
from llama_index.core import (
    SimpleDirectoryReader,
    VectorStoreIndex,
    StorageContext,
    load_index_from_storage,
)

from llama_index.core.tools import QueryEngineTool, ToolMetadata

In [19]:
try:
    storage_context = StorageContext.from_defaults(
        persist_dir="./storage/lyft"
    )
    lyft_index = load_index_from_storage(storage_context)

    storage_context = StorageContext.from_defaults(
        persist_dir="./storage/uber"
    )
    uber_index = load_index_from_storage(storage_context)

    index_loaded = True
except Exception:
    # No persisted index was found (or loading failed); build it from the PDFs below.
    index_loaded = False
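
The cell above is a load-or-build pattern: try to read a persisted artifact and fall back to rebuilding it when it is missing. As a rough, library-independent sketch of the same idea, the example below caches a JSON value on disk; `load_or_build` and the file layout are assumptions made for illustration and do not reflect how LlamaIndex persists indexes.

```python
import json
import tempfile
from pathlib import Path


def load_or_build(cache_path, build):
    """Return (value, loaded_from_cache) for a JSON-serializable value."""
    try:
        return json.loads(cache_path.read_text()), True
    except (FileNotFoundError, json.JSONDecodeError):
        # Cache is missing or unreadable: rebuild and persist for next time.
        value = build()
        cache_path.parent.mkdir(parents=True, exist_ok=True)
        cache_path.write_text(json.dumps(value))
        return value, False


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "storage" / "demo.json"
    first = load_or_build(path, lambda: {"docs": 2})   # builds and persists
    second = load_or_build(path, lambda: {"docs": 2})  # loads from disk
```

Catching only the specific exceptions that signal a missing or corrupt cache keeps unrelated errors visible.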

In [20]:
!mkdir -p 'data/10k/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/10k/uber_2021.pdf' -O 'data/10k/uber_2021.pdf'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/10k/lyft_2021.pdf' -O 'data/10k/lyft_2021.pdf'

--2024-09-22 00:45:14--  https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/10k/uber_2021.pdf
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.111.133, 185.199.109.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2024-09-22 00:45:14 ERROR 404: Not Found.

--2024-09-22 00:45:14--  https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/10k/lyft_2021.pdf
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2024-09-22 00:45:14 ERROR 404: Not Found.

Note: both requests returned `404 Not Found`. Because `wget -O` creates the output file before the transfer starts, this leaves empty files in `data/10k/`, which is why the later steps load no documents. The URLs no longer resolve, so update them to the current location of the PDFs in the `run-llama/llama_index` repository before running this section.



In [21]:
%%capture
!pip install pypdf

In [22]:
if not index_loaded:
    # load data
    lyft_docs = SimpleDirectoryReader(
        input_files=["./data/10k/lyft_2021.pdf"]
    ).load_data()

    uber_docs = SimpleDirectoryReader(
        input_files=["./data/10k/uber_2021.pdf"]
    ).load_data()

    # build index
    lyft_index = VectorStoreIndex.from_documents(lyft_docs)
    uber_index = VectorStoreIndex.from_documents(uber_docs)

    # persist index
    lyft_index.storage_context.persist(persist_dir="./storage/lyft")
    uber_index.storage_context.persist(persist_dir="./storage/uber")

Failed to load file data/10k/lyft_2021.pdf with error: RetryError[<Future at 0x7f6ef8c941c0 state=finished raised EmptyFileError>]. Skipping...
Failed to load file data/10k/uber_2021.pdf with error: RetryError[<Future at 0x7f6eefef6440 state=finished raised EmptyFileError>]. Skipping...
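
The `EmptyFileError` messages above mean the PDFs on disk contain zero bytes. A small check before building the index would surface this earlier. The following is a minimal sketch using only the standard library; `find_unusable_files` is a hypothetical helper, not part of LlamaIndex.

```python
import tempfile
from pathlib import Path


def find_unusable_files(paths):
    """Return the paths (as strings) that are missing or have zero bytes."""
    bad = []
    for p in map(Path, paths):
        if not p.is_file() or p.stat().st_size == 0:
            bad.append(str(p))
    return bad


# Demonstration with temporary stand-in files.
with tempfile.TemporaryDirectory() as tmp:
    empty = Path(tmp) / "empty.pdf"
    empty.touch()                      # zero-byte file, like a failed download
    good = Path(tmp) / "good.pdf"
    good.write_bytes(b"%PDF-1.4")      # non-empty placeholder content
    missing = Path(tmp) / "missing.pdf"
    unusable = find_unusable_files([empty, good, missing])
```

In this notebook, the check would be run on `./data/10k/lyft_2021.pdf` and `./data/10k/uber_2021.pdf`, stopping before indexing if the returned list is not empty.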


### 2. Create Engine

In [23]:
lyft_engine = lyft_index.as_query_engine(similarity_top_k=3)
uber_engine = uber_index.as_query_engine(similarity_top_k=3)

In [24]:
lyft_engine

<llama_index.core.query_engine.retriever_query_engine.RetrieverQueryEngine at 0x7f6ef8c97df0>

In [25]:
uber_engine

<llama_index.core.query_engine.retriever_query_engine.RetrieverQueryEngine at 0x7f6ef8c94400>

In [26]:
query_engine_tools = [
    QueryEngineTool(
        query_engine=lyft_engine,
        metadata=ToolMetadata(
            name="lyft_10k",
            description=(
                "Provides information about Lyft financials for year 2021. "
                "Use a detailed plain text question as input to the tool."
            ),
        ),
    ),
    QueryEngineTool(
        query_engine=uber_engine,
        metadata=ToolMetadata(
            name="uber_10k",
            description=(
                "Provides information about Uber financials for year 2021. "
                "Use a detailed plain text question as input to the tool."
            ),
        ),
    ),
]

### 3. Try Out the Agent with the Query Engine Tools

In [27]:
agent = OpenAIAssistantAgent.from_new(
    name="SEC Analyst",
    instructions="You are a QA assistant designed to analyze sec filings.",
    tools=query_engine_tools,
    instructions_prefix="Please address the user as Sudarshan.",
    verbose=True,
    run_retrieve_sleep_time=1.0,
)

In [28]:
response = agent.chat("What was Lyft's revenue growth in 2021?")

=== Calling Function ===
Calling function: lyft_10k with args: {"input":"What was Lyft's revenue growth in 2021?"}
Got output: Empty Response
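
The log above shows the assistant requesting a function by name and passing JSON-encoded arguments. As a rough illustration of that routing step, and not LlamaIndex's actual implementation, the sketch below maps tool names to plain callables; `dispatch_tool_call` and the stand-in tools are hypothetical.

```python
import json


def dispatch_tool_call(tools, name, raw_args):
    """Route one function call to the matching tool and return its output."""
    if name not in tools:
        raise KeyError(f"unknown tool: {name}")
    args = json.loads(raw_args)        # arguments arrive as a JSON string
    return tools[name](args["input"])  # each tool takes a plain-text question


# Stand-in tools that echo the question instead of querying an index.
tools = {
    "lyft_10k": lambda question: f"[lyft] {question}",
    "uber_10k": lambda question: f"[uber] {question}",
}

raw_args = json.dumps({"input": "What was Lyft's revenue growth in 2021?"})
result = dispatch_tool_call(tools, "lyft_10k", raw_args)
```

The tool `name` and `description` in `ToolMetadata` are what the model sees when deciding which function to call, so they should be distinct and specific.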


In [29]:
response.response

"I apologize, but it seems there was an issue retrieving Lyft's revenue growth for 2021. To provide you with accurate information, I can suggest looking into Lyft's annual report, specifically the Form 10-K for the year 2021, which will include detailed financial statements and discussions of the company's performance, including revenue growth. These documents are publicly available on the investor relations section of Lyft's website or through the U.S. Securities and Exchange Commission's EDGAR database.\n\nIf you need further assistance or specific details from the report, please let me know, and I'll do my best to help."

In [30]:
response.source_nodes

[]
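
An empty `source_nodes` list together with an `Empty Response` tool output indicates that the answer was not grounded in any document. A simple guard for this case could look like the sketch below; `looks_grounded` is a hypothetical helper that works on plain strings and lists, and the marker string is taken from the log above.

```python
EMPTY_MARKER = "Empty Response"  # tool output string seen in the log above


def looks_grounded(tool_outputs, source_nodes):
    """Return True only if some tool returned content and sources were attached."""
    has_content = any(
        out.strip() and out.strip() != EMPTY_MARKER for out in tool_outputs
    )
    return has_content and len(source_nodes) > 0


# The situation from this run: one empty tool output and no source nodes.
grounded = looks_grounded(["Empty Response"], [])
```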

## Assistant with Built-In Retrieval

Let's test the assistant by having it use the built-in OpenAI Retrieval tool over a user-uploaded file.

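The intended setup uploads the file and passes it in when the assistant is created. Note that the cell below attaches no file and enables only the code interpreter, so the answer it prints comes from the model's own knowledge rather than from retrieval.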

Alternatively, you can upload a file and pass its file ID in a message on a given thread using `upload_files` and `add_message`. See the [OpenAI documentation](https://platform.openai.com/docs/assistants/tools/uploading-files-for-retrieval).

In [32]:
from llama_index.agent.openai import OpenAIAssistantAgent


In [34]:

agent = OpenAIAssistantAgent.from_new(
    name="Math Tutor",
    instructions="You are a personal math tutor. Write and run code to answer math questions.",
    openai_tools=[{"type": "code_interpreter"}],
    instructions_prefix="Please address the user as Sudarshan Koirala.",
)

In [35]:
response = agent.chat("What was Lyft's revenue growth in 2021?")

In [36]:
from pprint import pprint
pprint(response.response)

('As of my knowledge cutoff in April 2023, Lyft, Inc., the American '
 'ride-sharing company, released its financial results for the year ending '
 'December 31, 2021, which showed significant recovery and revenue growth '
 'compared to 2020. The COVID-19 pandemic heavily impacted the ride-sharing '
 'industry in 2020, leading to reduced revenues as travel restrictions and '
 'lockdowns were in place.\n'
 '\n'
 'In 2021, Lyft reported that its revenue had grown compared to the previous '
 "year. However, to provide you with a specific percentage for Lyft's revenue "
 'growth in 2021, I would need to check the most current and accurate '
 "financial data. Since my internet access is disabled and I can't access "
 "real-time information, I suggest you refer to Lyft's investor relations page "
 'or their annual financial report (Form 10-K) for the detailed financial '
 'figures reported for 2021, which would include their annual revenue growth.\n'
 '\n'
 'If you have access to the specifi