<a href="https://colab.research.google.com/github/brikesh987/intro_to_llm_agents/blob/main/LlmaIndex_AgentIntro.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Create a LlamaIndex agent to interact with a complex document**
1. This example queries [Amazon's 2023 annual report](https://s2.q4cdn.com/299287126/files/doc_financials/2024/ar/Amazon-com-Inc-2023-Annual-Report.pdf), a 92-page document.
2. Download the Amazon 2023 report.
3. Save the report in a local directory.
4. Setup: import the LlamaIndex and OpenAI libraries and set the API key.
5. Specify the LLM to use in the LlamaIndex `Settings` object.
6. Create and add tools. This example creates a `VectorStoreIndex` tool, which stores embeddings of the document.
7. Create an agent. A new agent can be built from the `AgentRunner` and `AgentWorker` classes. This example uses the `StructuredPlannerAgent` class, which wraps any agent worker (ReAct, Function Calling, Chain-of-Abstraction, etc.) and decomposes an initial input into several sub-tasks. Each sub-task is represented by an input, an expected outcome, and any dependent sub-tasks that should be completed first.
8. Give the agent a complex question.
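
To make the sub-task idea in step 7 concrete, here is a minimal sketch in plain Python. It is not the LlamaIndex implementation: `SketchSubTask` and `execution_order` are hypothetical names invented for illustration, and the ordering uses the standard-library `graphlib` module (Python 3.9+) rather than anything the planner is known to use internally. The sub-task texts are taken from the plan the agent prints later in this notebook.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class SketchSubTask:
    """Hypothetical stand-in for a planner sub-task (not the LlamaIndex class)."""

    name: str
    input: str
    expected_output: str
    dependencies: list[str] = field(default_factory=list)


def execution_order(sub_tasks: list[SketchSubTask]) -> list[str]:
    """Return sub-task names so that each dependency precedes its dependent."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {task.name: set(task.dependencies) for task in sub_tasks}
    return list(TopologicalSorter(graph).static_order())


# Deliberately listed out of order to show that the dependencies drive the result.
plan = [
    SketchSubTask(
        name="Summarize Risk Factors",
        input="Summarize the extracted risk factors into key points.",
        expected_output="A summarized list of key risk factors.",
        dependencies=["Extract Risk Factors"],
    ),
    SketchSubTask(
        name="Identify Key Sections",
        input="Identify the key sections that discuss risk factors.",
        expected_output="A list of key sections that discuss risk factors.",
    ),
    SketchSubTask(
        name="Extract Risk Factors",
        input="Extract the risk factors from the identified sections.",
        expected_output="A list of risk factors mentioned in those sections.",
        dependencies=["Identify Key Sections"],
    ),
]

print(execution_order(plan))
```

Because each sub-task names the sub-tasks it depends on, a sub-task can only run once its prerequisites have finished, which is why the plan below runs "Identify Key Sections" first.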

In [1]:
%pip install llama-index-agent-openai
%pip install llama-index-llms-openai

Collecting llama-index-agent-openai
  Downloading llama_index_agent_openai-0.2.5-py3-none-any.whl (13 kB)
Collecting llama-index-core<0.11.0,>=0.10.35 (from llama-index-agent-openai)
  Downloading llama_index_core-0.10.39.post1-py3-none-any.whl (15.4 MB)
Collecting llama-index-llms-openai<0.2.0,>=0.1.5 (from llama-index-agent-openai)
  Downloading llama_index_llms_openai-0.1.21-py3-none-any.whl (11 kB)
Collecting openai>=1.14.0 (from llama-index-agent-openai)
  Downloading openai-1.30.3-py3-none-any.whl (320 kB)
Collecting dataclasses-json (from llama-index-core<0.11.0,>=0.10.35->llama-index-agent-openai)
  Downloading dataclasses_json-0.6.6-py3-none-any.whl (28 kB)
Collecting deprecated>=1.2.9.3 (from llama-index-core<0.11.0,>=0.10.35->lla

In [2]:
!pip install llama-index

Collecting llama-index
  Downloading llama_index-0.10.39-py3-none-any.whl (6.8 kB)
Collecting llama-index-cli<0.2.0,>=0.1.2 (from llama-index)
  Downloading llama_index_cli-0.1.12-py3-none-any.whl (26 kB)
Collecting llama-index-embeddings-openai<0.2.0,>=0.1.5 (from llama-index)
  Downloading llama_index_embeddings_openai-0.1.10-py3-none-any.whl (6.2 kB)
Collecting llama-index-indices-managed-llama-cloud<0.2.0,>=0.1.2 (from llama-index)
  Downloading llama_index_indices_managed_llama_cloud-0.1.6-py3-none-any.whl (6.7 kB)
Collecting llama-index-legacy<0.10.0,>=0.9.48 (from llama-index)
  Downloading llama_index_legacy-0.9.48-py3-none-any.whl (2.0 MB)
Collecting llama-index-multi-modal-llms-openai<0.2.0,>=0.1.3 (from llama-index)
  Downloading llama_index_multi_modal_llms_openai-0.1.6-py3-none-any.whl (5.8 kB)
Collecting llama-index-program-openai<0.2.0,>=0.1.3 (from llam

In [3]:
!mkdir -p 'sample_data/'
!wget 'https://s2.q4cdn.com/299287126/files/doc_financials/2024/ar/Amazon-com-Inc-2023-Annual-Report.pdf' -O 'sample_data/amazon_2023.pdf'


--2024-05-25 15:16:00--  https://s2.q4cdn.com/299287126/files/doc_financials/2024/ar/Amazon-com-Inc-2023-Annual-Report.pdf
Resolving s2.q4cdn.com (s2.q4cdn.com)... 68.70.205.2, 68.70.205.3, 68.70.205.1, ...
Connecting to s2.q4cdn.com (s2.q4cdn.com)|68.70.205.2|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1314396 (1.3M) [application/pdf]
Saving to: ‘sample_data/amazon_2023.pdf’


2024-05-25 15:16:00 (6.09 MB/s) - ‘sample_data/amazon_2023.pdf’ saved [1314396/1314396]



In [22]:
from google.colab import userdata

# Read the key from Colab's secret store. Do not print it: cell output is saved with the notebook.
openai_api_key = userdata.get('openai_api_key')


In [23]:
import os

os.environ["OPENAI_API_KEY"] = openai_api_key
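
If you want to confirm that a key was loaded without exposing it in the saved cell output, one option is to print a masked form. The helper below is a small sketch; `mask_secret` is a hypothetical name, not part of Colab or LlamaIndex, and the sample string is a made-up placeholder rather than a real key.

```python
def mask_secret(secret: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters of a secret."""
    if len(secret) <= visible:
        # Too short to reveal anything safely, so mask it entirely.
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]


# Made-up placeholder value, used only to show the output shape.
print(mask_secret("sk-example-1234"))
```

In the notebook you would call `mask_secret(openai_api_key)` instead of printing the key itself.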

**Import LlamaIndex libraries and initialize OpenAI**

In [24]:
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

# Use OpenAI's gpt-4o as the LLM, with a low temperature
Settings.llm = OpenAI(
    model="gpt-4o",
    temperature=0.1,
)
Settings.embed_model = OpenAIEmbedding(model_name="text-embedding-3-small")

**Read the document and create the tool**
1. This example uses the built-in `SimpleDirectoryReader` and `VectorStoreIndex`, but any other data source would work as well.
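
For intuition about what a vector index does at query time, here is a toy sketch of similarity-based retrieval. It is an illustration under loud assumptions, not how `VectorStoreIndex` is implemented: `toy_embed` is a hypothetical word-count stand-in for a real embedding model such as the OpenAI one configured above, and the three chunk strings are invented samples, not quotes from the report.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in embedding: lowercase word counts (a real index calls an embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the query."""
    query_vec = toy_embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, toy_embed(c)), reverse=True)
    return ranked[:k]


# Invented sample chunks, for illustration only.
chunks = [
    "risk factors include intense competition",
    "net sales increased during the year",
    "employees and human capital",
]
best = top_k("what are the risk factors", chunks)
print(best)
```

The query engine built in the next cell follows the same broad pattern: embed the question, retrieve the closest chunks, and pass them to the LLM to compose an answer.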

In [25]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.tools import QueryEngineTool

# Load documents, create tools
amazon_documents = SimpleDirectoryReader(
    input_files=["./sample_data/amazon_2023.pdf"]
).load_data()

amazon_index = VectorStoreIndex.from_documents(amazon_documents)

amazon_tool = QueryEngineTool.from_defaults(
    amazon_index.as_query_engine(),
    name="amazon_2023",
    description="Useful for answering questions about Amazon's 2023 annual report.",
)



**Create an Agent**

In [26]:
from llama_index.core.agent import (
    StructuredPlannerAgent,
    FunctionCallingAgentWorker,
    ReActAgentWorker,
)

# create the function calling worker for reasoning
worker = FunctionCallingAgentWorker.from_tools(
    [amazon_tool], verbose=True
)

# wrap the worker in the top-level planner
agent = StructuredPlannerAgent(
    worker, tools=[amazon_tool], verbose=True
)

In [27]:
import nest_asyncio

nest_asyncio.apply()

**Give a complex question**

In [28]:
response = agent.chat(
    "Summarize the key risk factors for Amazon in their 2023 report."
)

=== Initial plan ===
Identify Key Sections:
Identify the key sections in Amazon's 2023 report that discuss risk factors. -> A list of key sections in the report that discuss risk factors.
deps: []


Extract Risk Factors:
Extract the risk factors from the identified sections in Amazon's 2023 report. -> A list of risk factors mentioned in the identified sections.
deps: ['Identify Key Sections']


Summarize Risk Factors:
Summarize the extracted risk factors into key points. -> A summarized list of key risk factors for Amazon in their 2023 report.
deps: ['Extract Risk Factors']


> Running step 68e17771-8ea0-496c-bb6a-a06895785bb1. Step input: Identify the key sections in Amazon's 2023 report that discuss risk factors.
Added user message to memory: Identify the key sections in Amazon's 2023 report that discuss risk factors.
=== Calling Function ===
Calling function: amazon_2023 with args: {"input": "Identify the key sections in Amazon's 2023 report that discuss risk factors."}
=== Function