<a href="https://colab.research.google.com/github/imusicmash/Stanford-Tech16-LLM-class/blob/main/StanfordLLMClassWeek5_RAG_Agents.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Colab we used for the class on RAG and Agentic AI
by Al Nevarez

In [None]:
!pip install openai
!pip install sentence-transformers
!pip install langchain pypdf langchain-openai #tiktoken chromadb

In [None]:
# nest_asyncio lets async calls run inside the notebook's event loop; needed later for the agent and summary queries to work
!pip install nest-asyncio
import nest_asyncio
nest_asyncio.apply()

# RAG

In [None]:
!pip install llama-index --upgrade

In [None]:
!pip install pypdf



In [None]:
# !wget https://www.goldmansachs.com/intelligence/pages/gs-research/2024-us-equity-outlook-all-you-had-to-do-was-stay/report.pdf
!wget https://www.goldmansachs.com/pdfs/insights/pages/gs-research/2024-us-equity-outlook-all-you-had-to-do-was-stay/report.pdf

In [None]:
from openai import OpenAI
from google.colab import userdata

open_ai_key = userdata.get('openai')
# client = OpenAI(api_key=open_ai_key)

In [None]:
import os
os.environ["OPENAI_API_KEY"] = open_ai_key

# Routing

In [None]:
# Import necessary classes from the llama_index package
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, SummaryIndex
from llama_index.core import Settings

# Load only "report.pdf". The first argument of load_data() is show_progress, not a filename,
# so the file is selected through input_files instead.
documents = SimpleDirectoryReader(input_files=["./report.pdf"]).load_data(show_progress=True)

# Initialize settings (set the chunk size)
Settings.chunk_size = 1024
# Think of nodes as the chunks.
nodes = Settings.node_parser.get_nodes_from_documents(documents)

# Build two indexes over the same nodes: a SummaryIndex, which keeps every node for
# summarization, and a VectorStoreIndex, which embeds each node for semantic search.
summary_index = SummaryIndex(nodes)
vector_index = VectorStoreIndex(nodes)

summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)
vector_query_engine = vector_index.as_query_engine()


Loading files: 100%|██████████| 1/1 [00:01<00:00,  1.17s/file]
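
To make the "nodes are like chunks" idea concrete, here is a minimal sketch of splitting text into fixed-size chunks with a small overlap. This is not LlamaIndex's node parser (which works on tokens and sentence boundaries); the function name `chunk_words` and the word-based sizes are assumptions made up for illustration.

```python
def chunk_words(text: str, chunk_size: int = 8, overlap: int = 2) -> list[str]:
    """Split text into chunks of at most `chunk_size` words; neighbours share `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # stop once the current chunk already reaches the end of the text
        if start + chunk_size >= len(words):
            break
    return chunks


# Toy input: 20 placeholder words (w0 ... w19)
sample = " ".join(f"w{i}" for i in range(20))
chunks = chunk_words(sample, chunk_size=8, overlap=2)
for i, chunk in enumerate(chunks):
    print(i, chunk)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which is the usual reason for using it.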


In [None]:
nodes[1]

TextNode(id_='457931bd-2a1c-4ad2-8507-52f67f1503b7', embedding=None, metadata={'page_label': '2', 'file_name': 'report.pdf', 'file_path': '/content/report.pdf', 'file_type': 'application/pdf', 'file_size': 491250, 'creation_date': '2024-05-05', 'last_modified_date': '2023-11-21'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='4ceebfda-de3f-4a7a-8a7f-7342f9047349', node_type=<ObjectType.DOCUMENT: '4'>, metadata={'page_label': '2', 'file_name': 'report.pdf', 'file_path': '/content/report.pdf', 'file_type': 'application/pdf', 'file_size': 491250, 'creation_date': '2024-05-05', 'last_modified_date': '2023-11-21'}, hash='9cda9528c7e9585c03c47cba0f2d53f8a1e7a7d7fcab82284544e9958588da6f')}, text='PM Summary: “Lo

In [None]:
# Build one central query engine that routes between several query engines
from llama_index.core.query_engine import RouterQueryEngine
from llama_index.core.selectors import PydanticSingleSelector
from llama_index.core.tools import QueryEngineTool

# It is up to the router to decide which of the following tools to use.
# Charlie says that it reasons and makes the decision for itself.
# We give it two tool options.


summary_tool = QueryEngineTool.from_defaults(
    query_engine=summary_query_engine,
    description="Useful for summarization questions related to the data source",
    #description = "",
)
vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_query_engine,
    description="Useful for retrieving specific context related to the data source",
    #description="Useful for generating pictures",
    #description = "",
)

query_engine = RouterQueryEngine(
    selector=PydanticSingleSelector.from_defaults(),
    query_engine_tools=[
        summary_tool,
        vector_tool,
    ],
    verbose=True,  # added later to understand which engine the router selects
)


In [None]:
response = query_engine.query("What is the 2024 outlook for US GDP?")
print(response)

# Note: I tried to trick it by changing the vector tool description to "generating pictures".
# It properly told me it could not answer with the context it had.

Selecting query engine 1: The question 'What is the 2024 outlook for US GDP?' requires retrieving specific context related to the data source..
The 2024 outlook for US GDP is forecasted to be above-consensus with a growth rate of 2.1%.


In [None]:
response = query_engine.query("Summarize the document")
print(response)

Selecting query engine 0: The choice is useful for summarization questions related to the data source..
The document provides an outlook on the US equity market for 2024, forecasting the S&P 500 index to end the year with a 5% price gain and a total return of 6% including dividends. It discusses factors influencing equity market performance such as GDP growth, profit margins, AI impact on earnings, and interest rates. The report emphasizes the outperformance of mega-cap tech stocks and suggests quality stocks as favorable investments. It covers concerns like recession fears, commercial real estate challenges, antitrust rulings, the US presidential election, and geopolitical tensions. The document also addresses investment recommendations, the impact of economic factors, interest rates, and geopolitical events on market outlook, and the dominance of mega-cap technology stocks in the market. Additionally, it provides insights into earnings growth forecasts, margin expa

In [None]:
response = query_engine.query("Summarize the document in three bullet points and each bullet no more than 20 words")
print(response)

Selecting query engine 0: Summarization questions related to the data source.
- Positive market outlook for 2024 with S&P 500 forecasted at 4700, driven by economic growth and stable margins.
- Recommendations include owning quality and growth stocks, with earnings forecast predicting 5% growth in 2024 and 2025.
- Mega-cap tech stocks expected to outperform but with unattractive risk/reward profile due to high expectations.


In [None]:
response = query_engine.query("Is there any particular sector or vertical that it would be best to invest in for most gain?")
print(response)

Selecting query engine 1: The question is seeking specific context related to the data source, which would be best addressed by choice (2).
Investing in quality stocks with strong attributes, growth stocks with high returns on capital, and beaten-down cyclicals could potentially lead to significant gains based on the recommendations provided in the report. These sectors are highlighted as having favorable opportunities due to factors such as quality attributes, stable economic growth, and lower recession risk.


In [None]:
# We discussed exactly how it knows which tool to use.
# Someone mentioned that Pydantic may also know that it is a summary engine underneath, so it may use both the description and the type of query engine.
# Someone said that if you leave off the description it may still work.
# Watch the Pydantic video: https://www.youtube.com/watch?v=yj-wSRJwrrc
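
As far as I understand it, the selector hands the tool descriptions to the LLM and asks it to pick one. The toy sketch below only illustrates "choose a tool from its description": it swaps the LLM call for a naive word-overlap score, so it is not how `PydanticSingleSelector` actually decides. The helper name `pick_tool` is hypothetical.

```python
import re

# The two descriptions used for the router tools in this notebook
TOOL_DESCRIPTIONS = [
    "Useful for summarization questions related to the data source",
    "Useful for retrieving specific context related to the data source",
]


def _words(text: str) -> set[str]:
    """Lower-case the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def pick_tool(question: str, descriptions: list[str]) -> int:
    """Return the index of the description sharing the most words with the question.

    Stand-in for the LLM-based choice; ties go to the first tool.
    """
    q = _words(question)
    scores = [len(q & _words(d)) for d in descriptions]
    return scores.index(max(scores))


print(pick_tool("Give me a summarization of the report", TOOL_DESCRIPTIONS))
print(pick_tool("Retrieve the specific GDP figure", TOOL_DESCRIPTIONS))
```

Even in this toy version, an empty or misleading description leaves the chooser with nothing useful to match on, which is one way to think about the description experiments above.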

# Sub Question Query Engine
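
Roughly, the idea is: break a broad question into narrower sub-questions, answer each one against a tool, then combine the answers. The sketch below shows only that control flow. All three functions (`decompose`, `answer_with_tool`, `synthesize`) are hypothetical stand-ins; in the real engine each of these steps is done by an LLM or a query engine, and the passages here are placeholders, not text from the report.

```python
def decompose(question: str) -> list[str]:
    """Stand-in for the LLM call that proposes sub-questions (hard-coded here)."""
    return [
        f"What recent trends are relevant to: {question}",
        f"What risks are relevant to: {question}",
    ]


def answer_with_tool(sub_question: str, corpus: dict[str, str]) -> str:
    """Stand-in for a query engine: return the first passage whose keyword appears in the sub-question."""
    for keyword, passage in corpus.items():
        if keyword in sub_question.lower():
            return passage
    return "No relevant passage found."


def synthesize(question: str, pairs: list[tuple[str, str]]) -> str:
    """Stand-in for the final LLM call that merges the sub-answers into one response."""
    lines = [f"Q: {question}"]
    lines += [f"- {sub_q} -> {ans}" for sub_q, ans in pairs]
    return "\n".join(lines)


# Placeholder corpus keyed by keyword
corpus = {
    "trends": "placeholder passage about trends",
    "risks": "placeholder passage about risks",
}
question = "the outlook for the US economy"
sub_questions = decompose(question)
pairs = [(sq, answer_with_tool(sq, corpus)) for sq in sub_questions]
final = synthesize(question, pairs)
print(final)
```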

In [None]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core.query_engine import SubQuestionQueryEngine
from llama_index.core.callbacks import CallbackManager, LlamaDebugHandler
from llama_index.core import Settings

# Using the LlamaDebugHandler to print the trace of the sub questions
# captured by the SUB_QUESTION callback event type
llama_debug = LlamaDebugHandler(print_trace_on_end=True)
callback_manager = CallbackManager([llama_debug])

Settings.callback_manager = callback_manager

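# load_data()'s first argument is show_progress, so select the file through input_files
documents = SimpleDirectoryReader(input_files=["./report.pdf"]).load_data(show_progress=True)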

# build index and query engine and store it in memory
vector_query_engine = VectorStoreIndex.from_documents(
    documents,
    use_async=True,
).as_query_engine()


# setup base query engine as tool
query_engine_tools = [
    QueryEngineTool(
        query_engine=vector_query_engine,
        metadata=ToolMetadata(
            name="documents",
            description="Report",
        ),
    ),
]

query_engine = SubQuestionQueryEngine.from_defaults(
    query_engine_tools=query_engine_tools,
    use_async=True,
)

response = query_engine.query(
    "What is the outlook for the US economy?"
)

print(response)

Loading files: 100%|██████████| 1/1 [00:01<00:00,  1.05s/file]


**********
Trace: index_construction
    |_embedding -> 0.632457 seconds
**********
Generated 2 sub questions.
[documents] Q: What are the current economic indicators for the US economy?
[documents] Q: What are the recent trends in the US GDP growth rate?
[documents] A: The recent trends in the US GDP growth rate indicate that economists are forecasting above-consensus full-year GDP growth of 2.1% in 2024. However, this view is already reflected in current equity prices. Despite some economists forecasting a recession, the performance of cyclical stocks versus defensive stocks is consistent with a 2% real GDP growth regime.
[documents] A: The current economic indicators for the US economy include the fear of a recession among investors despite a low likelihood according to Goldman Sachs, the commercial real estate challenges faced by regional banks, upcoming consequential antitrust rulin

In [None]:
documents

[Document(id_='50f61222-af54-462f-a85c-9fd150cf4a09', embedding=None, metadata={'page_label': '1', 'file_name': 'report.pdf', 'file_path': '/content/report.pdf', 'file_type': 'application/pdf', 'file_size': 491913, 'creation_date': '2024-09-01', 'last_modified_date': '2024-08-30'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={}, text='We forecast the S&P 500 index will end 2024 at 4700, representing a 12-month \nprice gain of 5% and a total return of 6% including dividends. Our baseline \nassumption during the next year is the US economy continues to expand at a \nmodest pace and avoids a recession, earnings rise by 5%, and the valuation of the \nequity market equals 18x, close to the current P/E level. Our forecast falls slightly \nbelow the typical 8% return

In [None]:
# prompt: how do i pull out and list the Document and page numbers and 1st 100 characters for that documents object?

for doc in documents:
  print(f"Document: {doc.metadata['file_name']}")
  print(f"Page Number: {doc.metadata['page_label']}")
  print(f"First 100 characters: {doc.text[:100]}")
  print("-" * 20)


Document: report.pdf
Page Number: 1
First 100 characters: We forecast the S&P 500 index will end 2024 at 4700, representing a 12-month 
price gain of 5% and a
--------------------
Document: report.pdf
Page Number: 2
First 100 characters: PM Summary: “Long Story Short” — S&P 500 will end 2024 at 4700 (+5%)  
The launch of singe r-songwri
--------------------
Document: report.pdf
Page Number: 3
First 100 characters: We remain constructive on US equities, but the current starting point will limit the 
potential appr
--------------------
Document: report.pdf
Page Number: 4
First 100 characters: Third, our top-down valuation model suggests that at a forward P/E multiple of  
19x, the aggregate 
--------------------
Document: report.pdf
Page Number: 5
First 100 characters: We were right, but wrong. 
Published exactly one year ago, our 2023 US Equity Outlook was subtitled 
--------------------
Document: report.pdf
Page Number: 6
First 100 characters: Although we forecast a below-average annua

In [None]:
# Just me playing around with other questions.
# Note: in the previous example we didn't ask about GDP growth; it figured out on its own that GDP growth was important to look at.
response = query_engine.query(
    "How will artificial intelligence impact the outlook for the US economy?"
)

print(response)

Generated 2 sub questions.
[documents] Q: What are the current trends in artificial intelligence adoption in the US economy?
[documents] Q: How has artificial intelligence influenced job creation and automation in the US economy?
[documents] A: Artificial intelligence has had an impact on both job creation and automation in the US economy. Some companies have benefited in the short term from the demand for computing power to run AI large language models, which has led to job creation in certain areas. Additionally, AI adoption is expected to enhance labor productivity in the long term, potentially leading to automation of certain tasks and processes. This could result in increased efficiency and changes in the nature of work in the future.
[documents] A: Artificial intelligence adoption in the US economy has shown some recent trends. There has been a surge in enthusiasm regarding AI, wit

# Calling the OpenAI Assistants API from LlamaIndex (Code Interpreter)

In [None]:
# Gives the model permission to run Python code on the server.
# Not using any document for this, just the model and the Assistants API alone.
# Assume it's using the GPT-4 Turbo model.
# As of May 2024 this cell fails with an error on file_ids (see the TypeError in the output).

from llama_index.agent.openai import OpenAIAssistantAgent

# these are tools specific to openai assistant api

agent = OpenAIAssistantAgent.from_new(
    name="Python agent",
    openai_tools=[{"type": "code_interpreter"}],
    instructions="You are an expert at writing python code to solve problems.",
    verbose=True
)

response = agent.chat("Calculate 2+2 and show the python code")

TypeError: Assistants.create() got an unexpected keyword argument 'file_ids'

In [None]:
# Gives the model permission to run Python code on the server.
# Not using any document for this, just the model and the Assistants API alone.
# Assume it's using the GPT-4 Turbo model.
from llama_index.agent.openai import OpenAIAssistantAgent

# these are tools specific to openai assistant api

agent = OpenAIAssistantAgent.from_new(
    name="Python agent",
    openai_tools=[{"type": "code_interpreter"}],
    instructions="You are an expert at writing python code to solve problems.",
    verbose=True
)

response = agent.chat(
    """Generate code to answer the following question:
    How much is the us population likely to grow to by 2030?
    To calculate the year, call python code to figure out what year it is.
    Return an answer and the code used."""
)

TypeError: Assistants.create() got an unexpected keyword argument 'file_ids'

In [None]:
print(str(response))

Based on the hypothetical figures provided, the estimated US population by the year 2030 is approximately 348.6 million people.

Here is the Python code used to perform the estimation:

```python
# Given data
current_population = 332_000_000  # Current US population estimate in 2023
annual_growth_rate = 0.007  # Average annual growth rate (in decimal form)
target_year = 2030
current_year = 2023
years_into_future = target_year - current_year

# Calculate the future population
future_population = current_population * ((1 + annual_growth_rate) ** years_into_future)
```

This calculation is a simplified model that assumes a constant growth rate and does not account for other demographic factors (like immigration, emigration, birth rates, and death rates) that could influence the actual future population. For a more precise prediction, a more detailed demographic model would need to be used, and updated population statistics and growth rates should be obtained from a reliable source such as
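
The snippet in the response stops before printing anything. A minimal completion is below, using the same inputs the model itself called hypothetical (332 million in 2023, 0.7% annual growth). It only checks the compound-growth arithmetic and is not a real population forecast.

```python
# Inputs copied from the model's response above; they are its assumptions, not verified statistics
current_population = 332_000_000
annual_growth_rate = 0.007
years_into_future = 2030 - 2023

# Compound growth: P * (1 + r) ** n
future_population = current_population * (1 + annual_growth_rate) ** years_into_future
print(f"{future_population / 1e6:.1f} million")
```

This reproduces the roughly 348.6 million figure quoted in the answer.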

In [None]:
response = agent.chat("Calculate 2+2 and show the python code")
print(str(response))

The result of calculating 2 + 2 is 4.

Here is the Python code used to perform the calculation:

```python
# Python code to calculate 2+2
result = 2 + 2
```


In [None]:
response = agent.chat(
    """Generate code to answer the following question:
    Use the Titanic data set from Kaggle, and write python code to build a decision tree model that can predict if a passenger survived or not.
    Return an answer and the code used."""
)
print(str(response))

To build a decision tree model to predict passenger survival on the Titanic, we would need the Titanic dataset from Kaggle. Typically, this would involve downloading the data, performing data analysis, cleaning, and preprocessing, then training and testing a machine learning model.

As I don't have access to the internet to download the dataset from Kaggle, I'll assume the dataset is in the commonly used format with features like 'Pclass' (passenger class), 'Sex', 'Age', 'SibSp' (siblings/spouses aboard), 'Parch' (parents/children aboard), 'Fare', 'Embarked' (port of embarkation), and the target variable 'Survived'.

Here is a general outline of the steps you'd take to build the decision tree model using the scikit-learn library in Python:

1. Load the dataset.
2. Perform exploratory data analysis to understand the data.
3. Preprocess the data (handle missing values, convert categorical variables to numeric, etc.).
4. Split the dataset into a training set and a test set.
5. Instantiate

In [None]:
# He did an example of providing code and having the model explain it.
# In the ChatGPT front end: type "comment this code" and paste your code snippet.

# If you're in VS Code, there are extensions for this ("comment this code").
# He also used Colab AI;
# you can open it from the upper right of Google Colab.
# Very cool.
# It doesn't know the cell context, so just paste the code in (e.g. "comment this code").


# ReAct

In [None]:
# He commented that if we got to just before here, we're in a good place.
# From here on it goes beyond simple sub-question querying or routing: how do we make a decision,
# get a response, and then act based on that response?
# More advanced: the agent chooses the next best action.
# The next few lines try to reload previously persisted vector indexes from disk.
from llama_index.core import StorageContext, load_index_from_storage

try:
    storage_context = StorageContext.from_defaults(
        persist_dir="./storage/lyft"
    )
    lyft_index = load_index_from_storage(storage_context)

    storage_context = StorageContext.from_defaults(
        persist_dir="./storage/uber"
    )
    uber_index = load_index_from_storage(storage_context)

    index_loaded = True
except Exception:
    # Nothing persisted yet (e.g. first run), so the indexes get built in a later cell
    index_loaded = False

In [None]:
# Download two 10-K filings
!mkdir -p 'data/10k/'
# These PDFs were no longer accessible!
# !wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/10k/uber_2021.pdf' -O 'data/10k/uber_2021.pdf'
# !wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/10k/lyft_2021.pdf' -O 'data/10k/lyft_2021.pdf'

!wget 'https://stocklight.com/stocks/us/nyse-uber/uber-technologies/annual-reports/nyse-uber-2021-10K-21693896.pdf' -O 'data/10k/uber_2021.pdf'
!wget 'https://stocklight.com/stocks/us/nasdaq-lyft/lyft-inc-cls-a/annual-reports/nasdaq-lyft-2021-10K-21697690.pdf' -O 'data/10k/lyft_2021.pdf'

--2024-05-06 01:06:12--  https://stocklight.com/stocks/us/nyse-uber/uber-technologies/annual-reports/nyse-uber-2021-10K-21693896.pdf
Resolving stocklight.com (stocklight.com)... 3.226.182.14, 54.237.159.171, 52.21.227.162, ...
Connecting to stocklight.com (stocklight.com)|3.226.182.14|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 172153 (168K) [application/pdf]
Saving to: ‘data/10k/uber_2021.pdf’


2024-05-06 01:06:13 (1.39 MB/s) - ‘data/10k/uber_2021.pdf’ saved [172153/172153]

--2024-05-06 01:06:13--  https://stocklight.com/stocks/us/nasdaq-lyft/lyft-inc-cls-a/annual-reports/nasdaq-lyft-2021-10K-21697690.pdf
Resolving stocklight.com (stocklight.com)... 3.226.182.14, 54.237.159.171, 52.21.227.162, ...
Connecting to stocklight.com (stocklight.com)|3.226.182.14|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 171758 (168K) [application/pdf]
Saving to: ‘data/10k/lyft_2021.pdf’


2024-05-06 01:06:13 (1.39 MB/s) - ‘data/10k/lyft_2021.pdf

In [None]:
# Load the data and index it (only if no persisted index was found)
if not index_loaded:
    # load data
    lyft_docs = SimpleDirectoryReader(
        input_files=["./data/10k/lyft_2021.pdf"]
    ).load_data()
    uber_docs = SimpleDirectoryReader(
        input_files=["./data/10k/uber_2021.pdf"]
    ).load_data()

    # build index
    lyft_index = VectorStoreIndex.from_documents(lyft_docs)
    uber_index = VectorStoreIndex.from_documents(uber_docs)

    # persist index
    lyft_index.storage_context.persist(persist_dir="./storage/lyft")
    uber_index.storage_context.persist(persist_dir="./storage/uber")

**********
Trace: index_construction
    |_embedding -> 0.506651 seconds
**********
**********
Trace: index_construction
    |_embedding -> 0.451118 seconds
**********
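
The two cells above follow a "load from disk if present, otherwise build and persist" pattern. Here is a minimal sketch of the same pattern with a plain JSON file instead of a vector index. The helper `load_or_build` and the file name `toy_index.json` are made up for illustration and are not part of LlamaIndex.

```python
import json
import tempfile
from pathlib import Path


def load_or_build(cache_file: Path, build):
    """Return (data, loaded_from_disk): reuse the cached file if it exists, else build and persist."""
    try:
        return json.loads(cache_file.read_text()), True
    except FileNotFoundError:
        data = build()
        cache_file.parent.mkdir(parents=True, exist_ok=True)
        cache_file.write_text(json.dumps(data))
        return data, False


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "storage" / "toy_index.json"
    # First call: nothing on disk yet, so the builder runs and the result is persisted
    first, first_loaded = load_or_build(cache, lambda: {"chunks": ["a", "b"]})
    # Second call: the builder is ignored because the persisted copy is found
    second, second_loaded = load_or_build(cache, lambda: {"chunks": []})

print(first_loaded, second_loaded, first == second)
```

Catching only `FileNotFoundError` keeps other failures visible, so a missing import or a typo is not mistaken for "nothing persisted yet".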


In [None]:
lyft_engine = lyft_index.as_query_engine(similarity_top_k=3)
uber_engine = uber_index.as_query_engine(similarity_top_k=3)
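
`similarity_top_k=3` asks each engine to retrieve the three chunks most similar to the query. A toy sketch of that retrieval step with cosine similarity follows. The 2-D vectors and chunk names are invented for illustration (real embeddings have many more dimensions), and this is not LlamaIndex's retriever.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], chunk_vecs: dict[str, list[float]], k: int = 3) -> list[str]:
    """Return the names of the k chunks most similar to the query, best first."""
    scored = [(cosine(query_vec, vec), name) for name, vec in chunk_vecs.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


# Made-up embeddings for five chunks
chunk_vecs = {
    "chunk_a": [1.0, 0.0],
    "chunk_b": [0.9, 0.1],
    "chunk_c": [0.0, 1.0],
    "chunk_d": [0.7, 0.7],
    "chunk_e": [-1.0, 0.0],
}
result = top_k([1.0, 0.0], chunk_vecs, k=3)
print(result)
```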

In [None]:
# create tools..
query_engine_tools = [
    QueryEngineTool(
        query_engine=lyft_engine,
        metadata=ToolMetadata(
            name="lyft_10k",
            description=(
                "Provides information about Lyft financials for year 2021. "
                "Use a detailed plain text question as input to the tool."
            ),
        ),
    ),
    QueryEngineTool(
        query_engine=uber_engine,
        metadata=ToolMetadata(
            name="uber_10k",
            description=(
                "Provides information about Uber financials for year 2021. "
                "Use a detailed plain text question as input to the tool."
            ),
        ),
    ),
]

In [None]:
# This is where it gets different: we're importing a ReAct agent.
# Beyond reasoning only or acting only, it works in multiple steps.
# It uses OpenAI models by default.
# This requires complex reasoning, so the better the model, the better the reasoning.
# OpenAI has the best model, so it's your safest choice, but you can use an open-source model.
# This is all very cutting edge.
# See this page for more detail; note how you can change the LLM there:
# https://docs.llamaindex.ai/en/stable/examples/agent/react_agent_with_query_engine/

from llama_index.core.agent import ReActAgent
agent = ReActAgent.from_tools(
    query_engine_tools,
    verbose=True,
    # context=context
)
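
To show the Thought / Action / Observation loop without any LLM, here is a toy sketch. The "policy" is a hard-coded function standing in for the model, and the tools return placeholder strings rather than anything from the 10-Ks. The names `scripted_policy` and `react_loop` are hypothetical, and this is not how `ReActAgent` is implemented internally.

```python
def scripted_policy(question: str, observations: dict[str, str]):
    """Stand-in for the LLM: decide the next step from what has been observed so far."""
    if "uber_10k" not in observations:
        return ("act", "uber_10k", "risk factors")
    if "lyft_10k" not in observations:
        return ("act", "lyft_10k", "risk factors")
    return ("answer", None, f"Compared {len(observations)} sources for: {question}")


# Placeholder tools keyed by the same names as the query engine tools above
TOOLS = {
    "uber_10k": lambda q: f"(placeholder Uber passage about {q})",
    "lyft_10k": lambda q: f"(placeholder Lyft passage about {q})",
}


def react_loop(question: str, max_steps: int = 5):
    """Alternate between choosing an action and recording its observation until an answer is produced."""
    observations: dict[str, str] = {}
    trace = []
    for _ in range(max_steps):
        kind, tool, payload = scripted_policy(question, observations)
        if kind == "answer":
            trace.append(("answer", payload))
            return payload, trace
        observations[tool] = TOOLS[tool](payload)  # act, then store the observation
        trace.append(("action", tool))
    return "Stopped: step limit reached.", trace


answer, trace = react_loop("risk of investing in Uber vs Lyft")
for step in trace:
    print(step)
```

The step limit matters: since each decision depends only on what has been observed so far, a loop without a cap could keep calling tools indefinitely.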

In [None]:
# Remember that models are not good at math.
# He'd suggest calling the OpenAI Code Interpreter tool to do the analysis with Python code.
response = agent.chat(
    "Compare the risk of investing in Uber and Lyft and return a table"
)
print(str(response))

Thought: I need to use the financial data from both Uber and Lyft to compare the risk of investing in these companies.
Action: uber_10k
Action Input: {'input': 'Please provide information on the risk factors for investing in Uber in 2021.'}
Observation: Investing in Uber in 2021 carries certain risk factors. These include regulatory challenges related to how drivers are classified, such as the impact of regulations like California's Assembly Bill 5 and Proposition 22. Additionally, Uber faces regulatory scrutiny and operational challenges in various jurisdictions globally, such as license reviews in London, operational requirements in Mexico City, and regulatory changes affecting services in cities like Barcelona and New York City. Moreover, Uber competes in highly fragmented markets against well-established alternatives and new market entrants, which could impact its financial performance.
Thought: I have gathered information on the risk 

In [None]:
# A question came up in class: what if you just ask it to compare the two docs and nothing else?
# RAG is the primary tool it's using here.
# Quality here depends on how good the base LLM is, and models are not that good at reasoning yet.
response = agent.chat(
    "Conduct an investment analysis on Lyft and Uber"
)
print(str(response))

Thought: I can answer without using any more tools. I'll use the user's language to answer
Answer: When conducting an investment analysis on Lyft and Uber, it's essential to consider various factors such as financial performance, market position, growth potential, competitive landscape, regulatory environment, and overall industry trends. Analyzing key financial metrics, growth projections, strategic initiatives, and risk factors can help investors make informed decisions about investing in Lyft or Uber. Additionally, comparing valuation metrics, profitability, revenue growth, and market share can provide insights into the investment potential of each company.
**********
Trace: chat
    |_agent_step -> 2.022413 seconds
      |_llm -> 2.018774 seconds
**********
When conducting an investment analysis on Lyft and Uber, it's essential to consider various factors such as financial performance, market position, growth potential, competitive landscape, regulatory environme

In [None]:
# In ChatGPT he also typed:
# "return a viz of stock returns"

# Another one is LLMCompiler, a similar idea.
# See the slides for class 5.

# Currently ReAct does no long-term planning; it's sequential.
# Once it can plan long term, then it's like AGI.

# MSFT AutoGen