# OpenAI Agent with LlamaIndex

## Install Dependencies

In [1]:
!pip install uv
!uv pip install --system -qU llama-index==0.11.6 llama-index-llms-openai llama-index-readers-file llama-index-embeddings-openai llama-index-llms-openai-like "openinference-instrumentation-llama-index>=2" arize-phoenix python-dotenv



## Setup API Keys
To run the rest of the notebook you will need an OpenAI API key. The next cell reads it from the `OPENAI_API_KEY` environment variable, which `load_dotenv()` can populate from a local `.env` file.

In [2]:
from os import environ
from dotenv import load_dotenv

load_dotenv()

OPENAI_API_KEY = environ["OPENAI_API_KEY"]
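
Indexing `environ` directly raises a bare `KeyError` when the variable is missing. As a minimal sketch, a small wrapper can fail with a more actionable message. The helper name `require_env` is hypothetical; it is not part of the notebook or of `python-dotenv`.

```python
import os
from typing import Mapping


def require_env(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Return env[name], or raise a readable error if it is unset or empty."""
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; export it or add it to a .env file before running."
        )
    return value


# An explicit mapping keeps the example independent of the real environment.
demo_key = require_env("OPENAI_API_KEY", {"OPENAI_API_KEY": "sk-demo"})
```

In the notebook itself the call would simply be `require_env("OPENAI_API_KEY")` after `load_dotenv()`.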

## Import libraries and set up LlamaIndex

In [3]:
from llama_index.core import (
    SimpleDirectoryReader,
    VectorStoreIndex,
    StorageContext,
    load_index_from_storage,
)
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core.agent import ReActAgent
from llama_index.llms.openai import OpenAI


# Create an llm object to use for the QueryEngine and the ReActAgent
llm = OpenAI(model="gpt-4")

## Set up Phoenix

In [4]:
import phoenix as px
session = px.launch_app()

🌍 To view the Phoenix app in your browser, visit http://localhost:6006/
📖 For more information on how to use Phoenix, check out https://docs.arize.com/phoenix


In [5]:
from openinference.instrumentation.llama_index import LlamaIndexInstrumentor
from phoenix.otel import register

tracer_provider = register()
LlamaIndexInstrumentor().instrument(tracer_provider=tracer_provider)

🔭 OpenTelemetry Tracing Details 🔭
|  Phoenix Project: default
|  Span Processor: SimpleSpanProcessor
|  Collector Endpoint: localhost:4317
|  Transport: gRPC
|  Transport Headers: {'user-agent': '****'}
|  
|  Using a default SpanProcessor. `add_span_processor` will overwrite this default.
|  
|  `register` has set this TracerProvider as the global OpenTelemetry default.
|  To disable this behavior, call `register` with `set_global_tracer_provider=False`.



## Load Documents

In [6]:
'''
try:
    storage_context = StorageContext.from_defaults(
        persist_dir="./storage/lyft"
    )
    lyft_index = load_index_from_storage(storage_context)

    storage_context = StorageContext.from_defaults(
        persist_dir="./storage/uber"
    )
    uber_index = load_index_from_storage(storage_context)

    index_loaded = True
except:
    index_loaded = False
'''

In [7]:
try:
    storage_context = StorageContext.from_defaults(
        persist_dir="./storage/p2025"
    )
    p2025_index = load_index_from_storage(storage_context)

    storage_context = StorageContext.from_defaults(
        persist_dir="./storage/eo"
    )
    eo_index = load_index_from_storage(storage_context)

    index_loaded = True
except Exception:  # any load failure (e.g. nothing persisted yet) triggers a rebuild
    index_loaded = False

This is where we create the vector indexes by computing an embedding vector for each chunk. The indexes are persisted under `./storage`, so this step only needs to run once.

In [8]:
'''
if not index_loaded:
    # load data
    lyft_docs = SimpleDirectoryReader(
        input_files=["./10k/lyft_2021.pdf"]
    ).load_data()
    uber_docs = SimpleDirectoryReader(
        input_files=["./10k/uber_2021.pdf"]
    ).load_data()

    # build index
    lyft_index = VectorStoreIndex.from_documents(lyft_docs, show_progress=True)
    uber_index = VectorStoreIndex.from_documents(uber_docs, show_progress=True)

    # persist index
    lyft_index.storage_context.persist(persist_dir="./storage/lyft")
    uber_index.storage_context.persist(persist_dir="./storage/uber")

'''

In [9]:
if not index_loaded:
    # load data
    p2025_docs = SimpleDirectoryReader(
        input_files=["./p2025/2025_MandateForLeadership_FULL.pdf"]
    ).load_data()
    
    '''
    eo_docs = SimpleDirectoryReader(
        input_files=["./eo/enforcing-the-hyde-amendment.pdf"]
    ).load_data()
    '''
    eo_docs = SimpleDirectoryReader(
        input_dir="./eo"
    ).load_data()

    # build index
    p2025_index = VectorStoreIndex.from_documents(p2025_docs, show_progress=True)
    eo_index = VectorStoreIndex.from_documents(eo_docs, show_progress=True)

    # persist index
    p2025_index.storage_context.persist(persist_dir="./storage/p2025")
    eo_index.storage_context.persist(persist_dir="./storage/eo")
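
The load and build cells together follow a load-or-build pattern: reuse a persisted index if one exists, otherwise build it and persist it. Below is a generic sketch of that pattern, using a JSON file as a stand-in for the index. All function names are hypothetical and none of this is LlamaIndex API.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def load_or_build(
    persist_dir: str,
    load: Callable[[str], T],
    build: Callable[[], T],
    persist: Callable[[T, str], None],
) -> T:
    """Return the persisted object if persist_dir exists, else build and persist it."""
    if Path(persist_dir).is_dir():
        return load(persist_dir)
    obj = build()
    persist(obj, persist_dir)
    return obj


# Toy stand-ins for the real load/build/persist steps (JSON on disk).
def toy_load(persist_dir: str) -> dict:
    return json.loads((Path(persist_dir) / "index.json").read_text())


def toy_persist(obj: dict, persist_dir: str) -> None:
    Path(persist_dir).mkdir(parents=True, exist_ok=True)
    (Path(persist_dir) / "index.json").write_text(json.dumps(obj))


build_calls = []


def toy_build() -> dict:
    build_calls.append(1)  # record each (expensive) build
    return {"chunks": ["a", "b"]}


with tempfile.TemporaryDirectory() as tmp:
    target = str(Path(tmp) / "storage" / "demo")
    first = load_or_build(target, toy_load, toy_build, toy_persist)   # builds and persists
    second = load_or_build(target, toy_load, toy_build, toy_persist)  # loads from disk
```

Checking for the directory up front, rather than catching every exception, keeps genuine load errors visible instead of silently triggering a rebuild.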

Now create the query engines.

In [10]:
'''
lyft_engine = lyft_index.as_query_engine(similarity_top_k=3, llm=llm)
uber_engine = uber_index.as_query_engine(similarity_top_k=3, llm=llm)
'''

In [11]:
p2025_engine = p2025_index.as_query_engine(similarity_top_k=3, llm=llm)
eo_engine = eo_index.as_query_engine(similarity_top_k=3, llm=llm)
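
`similarity_top_k=3` asks the retriever for the three chunks whose embeddings are most similar to the query embedding. As a toy illustration of top-k retrieval by cosine similarity (the two-dimensional vectors and chunk ids are invented, and LlamaIndex's own retrieval code is more involved):

```python
import math
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: List[float], chunk_vecs: Dict[str, List[float]], k: int = 3) -> List[str]:
    """Return the ids of the k chunks most similar to the query vector."""
    scored = [(cosine(query_vec, vec), chunk_id) for chunk_id, vec in chunk_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk_id for _, chunk_id in scored[:k]]


# Invented embeddings: c1 and c2 point almost the same way as the query.
chunks = {
    "c1": [1.0, 0.0],
    "c2": [0.9, 0.1],
    "c3": [0.0, 1.0],
    "c4": [0.7, 0.7],
    "c5": [-1.0, 0.0],
}
result = top_k([1.0, 0.0], chunks, k=3)
print(result)  # ['c1', 'c2', 'c4']
```

A larger `k` gives the LLM more context per query at the cost of longer prompts.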

We can now define the query engines as tools that will be used by the agent.

Because there is one query engine per document set, we define one tool for each.

In [12]:
'''

query_engine_tools = [
    QueryEngineTool(
        query_engine=lyft_engine,
        metadata=ToolMetadata(
            name="lyft_10k",
            description=(
                "Provides information about Lyft financials for year 2021. "
                "Use a detailed plain text question as input to the tool."
            ),
        ),
    ),
    QueryEngineTool(
        query_engine=uber_engine,
        metadata=ToolMetadata(
            name="uber_10k",
            description=(
                "Provides information about Uber financials for year 2021. "
                "Use a detailed plain text question as input to the tool."
            ),
        ),
    ),
]

'''

In [13]:
query_engine_tools = [
    QueryEngineTool(
        query_engine=eo_engine,
        metadata=ToolMetadata(
            name="eo",
            description=(
                "A collection of presidential actions. "
                "Use a detailed plain text question as input to the tool."
            ),
        ),
    ),
    QueryEngineTool(
        query_engine=p2025_engine,
        metadata=ToolMetadata(
            name="p2025",
            description=(
                "Provides information about Project 2025. "
                "Use a detailed plain text question as input to the tool."
            ),
        ),
    ),
]
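
The `description` strings are built from adjacent string literals, which Python joins with no separator. A short illustration of why each fragment needs its own trailing space (the variable names are for demonstration only):

```python
# Adjacent literals inside parentheses are concatenated as-is.
description = (
    "Provides information about Project 2025. "  # trailing space separates the sentences
    "Use a detailed plain text question as input to the tool."
)

joined_without_space = (
    "Provides information about Project 2025."
    "Use a detailed plain text question as input to the tool."
)

print(description)           # ...Project 2025. Use a detailed...
print(joined_without_space)  # ...Project 2025.Use a detailed...
```

Since the agent relies on this text to choose a tool, it is worth keeping the description clean.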

## Creating the Agent
Now we have all the elements needed to create a LlamaIndex `ReActAgent`.

In [14]:
agent = ReActAgent.from_tools(
    query_engine_tools,
    llm=llm,
    verbose=True,
    max_iterations=10,  # cap on reasoning/tool-use steps per chat call
)
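
A ReAct-style agent alternates between reasoning, calling a tool, and reading the tool's observation until it can answer. The sketch below illustrates that loop in plain Python; it is not LlamaIndex's implementation. The names `run_react_loop`, `scripted_decide`, and `max_steps` are hypothetical, and a scripted function stands in for the LLM.

```python
from typing import Callable, Dict, List, Tuple

# A step is either ("act", tool_name, tool_input) or ("answer", text, "").
Step = Tuple[str, str, str]


def run_react_loop(
    question: str,
    tools: Dict[str, Callable[[str], str]],
    decide: Callable[[str, List[str]], Step],
    max_steps: int = 10,
) -> str:
    """Call tools chosen by `decide` until it answers or the step limit is hit."""
    observations: List[str] = []
    for _ in range(max_steps):
        kind, name_or_text, tool_input = decide(question, observations)
        if kind == "answer":
            return name_or_text
        # "act": run the named tool and record what it returned.
        observations.append(tools[name_or_text](tool_input))
    return "Stopped: step limit reached."


def scripted_decide(question: str, observations: List[str]) -> Step:
    """Stand-in for the LLM: query one tool, then the other, then answer."""
    if len(observations) == 0:
        return ("act", "eo", "What does the action say?")
    if len(observations) == 1:
        return ("act", "p2025", "Which sections relate to: " + observations[0])
    return ("answer", "Based on: " + " | ".join(observations), "")


# Toy tools that return fixed strings instead of querying an index.
toy_tools = {
    "eo": lambda q: "eo-result",
    "p2025": lambda q: "p2025-result",
}

answer = run_react_loop("demo question", toy_tools, scripted_decide)
capped = run_react_loop("demo question", toy_tools, scripted_decide, max_steps=1)
print(answer)  # Based on: eo-result | p2025-result
print(capped)  # Stopped: step limit reached.
```

The step limit is what keeps a confused agent from looping indefinitely between tools.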

Now we can interact with the agent and ask a question.

In [15]:
'''

response = agent.chat("Who had more profit in 2021, Lyft or Uber?")
print(str(response))

'''

In [16]:

'''
response = agent.chat("Identify any sections of the Project 2025 that are supported by the presidential action to enforce the Hyde Amendment. \
                      Provide your answer in the form of Section Name, Chapter Name, and Page Number.")
print(str(response))

'''

In [17]:
response = agent.chat("Identify any sections of the Project 2025 that are supported by the presidential action to enforce the Hyde Amendment.")
print(str(response))

> Running step d7bbd871-2544-436e-b271-4d6a99cbfc6e. Step input: Identify any sections of the Project 2025 that are supported by the presidential action to enforce the Hyde Amendment.
Thought: The user is asking for sections of Project 2025 that are supported by the presidential action to enforce the Hyde Amendment. I need to first understand what the Hyde Amendment is and how it is enforced by presidential action. Then, I need to find out if any sections of Project 2025 are supported by this action. I'll use the eo tool to get information about the presidential action related to the Hyde Amendment.
Action: eo
Action Input: {'input': 'What is the presidential action to enforce the Hyde Amendment?'}
Observation: The presidential action to enforce the Hyde Amendment is an executive order issued on January 24, 2025. This order aims to end the forced use of Federal taxpayer dollars to fund or promote elective abortion, consistent with the Hyde Amendment. It also

In [18]:
response = agent.chat("Identify any sections of the Project 2025 that are supported by the presidential action to remove barriers to american leadership in artificial intelligence.")
print(str(response))

> Running step 4c765f2a-f5c7-450f-b96d-a5647b695298. Step input: Identify any sections of the Project 2025 that are supported by the presidential action to remove barriers to american leadership in artificial intelligence.
Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
Action: p2025
Action Input: {'input': 'What sections of Project 2025 are supported by the presidential action to remove barriers to American leadership in artificial intelligence?'}
Observation: The sections of Project 2025 that are supported by the presidential action to remove barriers to American leadership in artificial intelligence include the mandate to better exploit publicly available information (PAI) data and foster innovation to improve collection and analysis, with a specific emphasis on ending the practice of multiple DIE organizations paying for the same PAI data and investing more in machine learning and artificial int

In [ ]:
response = agent.chat("For each presidential action, provide a one-sentence description.")
print(str(response))

In [19]:
# response = agent.chat("Which presidential actions align with the goals of Project 2025?")
# response = agent.chat("For each presidential action, describe how it aligns with the goals of Project 2025?")
response = agent.chat("For each presidential action, describe whether or not it aligns with the goals of Project 2025? Provide your answer as a list, with one sentence per presidential action.")



> Running step cc0d514b-5f73-4e45-905c-c12184153f25. Step input: For each presidential action, describe whether or not it aligns with the goals of Project 2025? Provide your answer as a list, with one sentence per presidential action.
Thought: The user is asking for a summary of how each presidential action aligns with the goals of Project 2025. I need to use the 'eo' tool to get a list of all presidential actions, then I will use the 'p2025' tool to understand the goals of Project 2025. I will then compare each presidential action with the goals of Project 2025.
Action: eo
Action Input: {'input': 'List all presidential actions'}
Observation: 1. The Department of Education action on May 02, 1980 by President Jimmy Carter. This action set the date for the provisions of the Department of Education Organization Act to take effect and directed the Director of the Office of Management and Budget to effectuate the transfers provided for in the Act.

2. The Enforci

In [20]:
response = agent.chat("Which presidential action involves the Department of Education?")


> Running step 77fee3c7-3965-4dae-87a0-a3b597b1d5f7. Step input: Which presidential action involves the Department of Education?
Thought: The user is asking about a presidential action involving the Department of Education. I can use the 'eo' tool to find this information.
Action: eo
Action Input: {'input': 'Which presidential action involves the Department of Education?'}
Observation: The presidential action that involves the Department of Education is the executive order issued on May 2, 1980. This order, authorized by the President under the Constitution and the statutes of the United States, includes the Department of Education Organization Act. It stipulates that the provisions of the Act would take effect on Sunday, May 4, 1980. The Director of the Office of Management and Budget was tasked with taking all necessary actions to effectuate the transfers provided for in the Act.
> Running step 75a66a9b-66ff-4ae9-945e-a98369f470f2. Step input: None