<a href="https://colab.research.google.com/github/imusicmash/agents/blob/main/Agents_ReAct_Llamaindex.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Colab for Agentic AI using the ReAct agent framework
by Al Nevarez

In [None]:
!pip install openai
!pip install sentence-transformers
!pip install langchain pypdf langchain-openai #tiktoken chromadb

In [None]:
# need this async setup later for the agent summary to work
!pip install nest-asyncio
import nest_asyncio
nest_asyncio.apply()

# RAG

In [None]:
!pip install llama-index --upgrade

In [None]:
!pip install pypdf



In [None]:
# !wget https://www.goldmansachs.com/intelligence/pages/gs-research/2024-us-equity-outlook-all-you-had-to-do-was-stay/report.pdf
!wget https://www.goldmansachs.com/pdfs/insights/pages/gs-research/2024-us-equity-outlook-all-you-had-to-do-was-stay/report.pdf

In [None]:
from openai import OpenAI
from google.colab import userdata

open_ai_key = userdata.get('openai')
# client = OpenAI(api_key=open_ai_key)

In [None]:
import os
os.environ["OPENAI_API_KEY"] = open_ai_key

# ReAct

In [None]:
# He commented that if we got to just before this point, we're in a good place.
# From here on it goes beyond simple sub-query or routing: how do we make a decision,
# get a response, and then, based on that response, choose the next best action?
# That is the more advanced part.
# These next few lines try to reload indexes previously persisted to disk.
from llama_index.core import StorageContext, load_index_from_storage

try:
    storage_context = StorageContext.from_defaults(
        persist_dir="./storage/lyft"
    )
    lyft_index = load_index_from_storage(storage_context)

    storage_context = StorageContext.from_defaults(
        persist_dir="./storage/uber"
    )
    uber_index = load_index_from_storage(storage_context)

    index_loaded = True
except Exception:
    # nothing persisted yet (or loading failed), so build the indexes below
    index_loaded = False

In [None]:
# download two 10-K filings
!mkdir -p 'data/10k/'
# these PDFs were no longer accessible!
# !wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/10k/uber_2021.pdf' -O 'data/10k/uber_2021.pdf'
# !wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/10k/lyft_2021.pdf' -O 'data/10k/lyft_2021.pdf'

!wget 'https://stocklight.com/stocks/us/nyse-uber/uber-technologies/annual-reports/nyse-uber-2021-10K-21693896.pdf' -O 'data/10k/uber_2021.pdf'
!wget 'https://stocklight.com/stocks/us/nasdaq-lyft/lyft-inc-cls-a/annual-reports/nasdaq-lyft-2021-10K-21697690.pdf' -O 'data/10k/lyft_2021.pdf'

In [None]:
# load the downloaded PDFs and index them
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

if not index_loaded:
    # load data
    lyft_docs = SimpleDirectoryReader(
        input_files=["./data/10k/lyft_2021.pdf"]
    ).load_data()
    uber_docs = SimpleDirectoryReader(
        input_files=["./data/10k/uber_2021.pdf"]
    ).load_data()

    # build index
    lyft_index = VectorStoreIndex.from_documents(lyft_docs)
    uber_index = VectorStoreIndex.from_documents(uber_docs)

    # persist index
    lyft_index.storage_context.persist(persist_dir="./storage/lyft")
    uber_index.storage_context.persist(persist_dir="./storage/uber")

**********
Trace: index_construction
    |_embedding -> 0.506651 seconds
**********
**********
Trace: index_construction
    |_embedding -> 0.451118 seconds
**********


In [None]:
lyft_engine = lyft_index.as_query_engine(similarity_top_k=3)
uber_engine = uber_index.as_query_engine(similarity_top_k=3)

In [None]:
# create one query-engine tool per company
from llama_index.core.tools import QueryEngineTool, ToolMetadata

query_engine_tools = [
    QueryEngineTool(
        query_engine=lyft_engine,
        metadata=ToolMetadata(
            name="lyft_10k",
            description=(
                "Provides information about Lyft financials for year 2021. "
                "Use a detailed plain text question as input to the tool."
            ),
        ),
    ),
    QueryEngineTool(
        query_engine=uber_engine,
        metadata=ToolMetadata(
            name="uber_10k",
            description=(
                "Provides information about Uber financials for year 2021. "
                "Use a detailed plain text question as input to the tool."
            ),
        ),
    ),
]

In [None]:
# This is where it gets different: we're importing a ReAct agent.
# Rather than only reasoning or only acting, it interleaves the two over multiple steps.
# It uses OpenAI models by default.
# This requires complex reasoning, so the better the model, the better the reasoning.
# OpenAI has the best model, so it's your safest choice, but you can use an open-source model.
# This is all very cutting edge.
# See this page for more detail; note how you can alter the LLM there:
# https://docs.llamaindex.ai/en/stable/examples/agent/react_agent_with_query_engine/

from llama_index.core.agent import ReActAgent
agent = ReActAgent.from_tools(
    query_engine_tools,
    verbose=True,
    # context=context
)

In [None]:
# Remember that models are not good at math.
# He'd suggest calling the OpenAI Code Interpreter tool to do the analysis with Python code.
response = agent.chat(
    "Compare the risk of investing in Uber and Lyft and return a table"
)
print(str(response))

Thought: I need to use the financial data from both Uber and Lyft to compare the risk of investing in these companies.
Action: uber_10k
Action Input: {'input': 'Please provide information on the risk factors for investing in Uber in 2021.'}
Observation: Investing in Uber in 2021 carries certain risk factors. These include regulatory challenges related to how drivers are classified, such as the impact of regulations like California's Assembly Bill 5 and Proposition 22. Additionally, Uber faces regulatory scrutiny and operational challenges in various jurisdictions globally, such as license reviews in London, operational requirements in Mexico City, and regulatory changes affecting services in cities like Barcelona and New York City. Moreover, Uber competes in highly fragmented markets against well-established alternatives and new market entrants, which could impact its financial performance.
Thought: I have gathered information on the risk

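Because models are weak at arithmetic, one lightweight alternative to a full code interpreter is to give the agent small deterministic Python functions for the calculations. The following is only a sketch: `pct_change` is a hypothetical helper that does not appear in the original notebook, and the figures in the usage lines are made-up placeholders, not values from the 10-K filings. In LlamaIndex a function like this could presumably be wrapped with `FunctionTool.from_defaults(fn=pct_change)` and appended to `query_engine_tools`, assuming the installed version exposes that helper.

```python
def pct_change(old: float, new: float) -> float:
    """Return the percentage change from `old` to `new`.

    Hypothetical helper: the arithmetic is done in Python so the
    LLM only has to decide *when* to call it, not compute the result.
    """
    if old == 0:
        raise ValueError("old must be non-zero to compute a percentage change")
    return (new - old) / abs(old) * 100.0


# Placeholder numbers for illustration only (not taken from the filings).
growth = pct_change(100.0, 125.0)
decline = pct_change(200.0, 150.0)
print(f"growth: {growth:.1f}%, decline: {decline:.1f}%")
```

Keeping the function pure (no I/O, typed arguments, a docstring) matters because the agent typically sees only the name, signature, and docstring when deciding whether to call it.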
In [None]:
# A question came up in class: what if you just ask it to compare the two docs and nothing else?
# RAG is the primary tool it's using here.
# Quality here depends on how good the base LLM is, and models are not that good at reasoning yet.
response = agent.chat(
    "Conduct an investment analysis on Lyft and Uber"
)
print(str(response))

Thought: I can answer without using any more tools. I'll use the user's language to answer
Answer: When conducting an investment analysis on Lyft and Uber, it's essential to consider various factors such as financial performance, market position, growth potential, competitive landscape, regulatory environment, and overall industry trends. Analyzing key financial metrics, growth projections, strategic initiatives, and risk factors can help investors make informed decisions about investing in Lyft or Uber. Additionally, comparing valuation metrics, profitability, revenue growth, and market share can provide insights into the investment potential of each company.
**********
Trace: chat
    |_agent_step -> 2.022413 seconds
      |_llm -> 2.018774 seconds
**********
When conducting an investment analysis on Lyft and Uber, it's essential to consider various factors such as financial performance, market position, growth potential, competitive landscape, regulatory environme

In [None]:
# In ChatGPT he also typed:
# "return a viz of stock returns"

# Another approach is LLMCompiler, which is a similar idea.
# See the slides for class 5.

# Currently ReAct does no long-term planning; it's sequential.
# Once it can plan long term, then it's like AGI.