# 🔬 Virtual Lab 4: Running Llama Models on Ollama

<div style="border: 2px solid #4CAF50; padding: 15px; border-radius: 10px; background-color: #f4f4f4;">
    
### 🚀 **Platform**  
**Ollama**

### 🏷️ **Models & Sizes**  
- **llama3.2:1b** →  **1.3GB**  
- **llama3.2:3b** →  **2.0GB**  
- **llama3.2:3b-instruct-fp16** →  **6.4GB**  

### 🛠️ **Frameworks Used**  
- **LlamaIndex**  
- **LangChain / LangGraph**  

</div>

## Model Performance and Memory Requirements

### **Model: llama3.2:3b**  
- Using Ollama, it takes roughly **10 minutes** to complete the lab.  
- Requires **8GB minimum main memory**.  

### **Model: llama3.2:1b**  
- Using Ollama, it **doesn't work for creating AI Agents**.  
- Requires **8GB minimum main memory**.  

### **Model: llama3.2:3b-instruct-fp16**  
- Using Ollama, it requires **at least 32GB main memory** to work.  
- Even then, it is **very slow** (about **one token per second** of output).  
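
The download sizes listed above follow roughly from parameter count times bits per weight. Here is a minimal sketch of that arithmetic. It assumes approximate parameter counts of 1.24 billion and 3.21 billion for the two Llama 3.2 models (these counts are not stated in this lab) and ignores file metadata.

```python
def weights_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in decimal gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9


def implied_bits_per_weight(size_gb: float, num_params: float) -> float:
    """Average number of bits per weight implied by a given file size."""
    return size_gb * 1e9 * 8 / num_params


# Assumed parameter counts (approximate, not taken from this lab).
PARAMS_1B = 1.24e9
PARAMS_3B = 3.21e9

fp16_size = weights_gb(PARAMS_3B, 16)
print(f"3b fp16 weights: {fp16_size:.1f} GB")
print(f"3b default tag: {implied_bits_per_weight(2.0, PARAMS_3B):.1f} bits/weight")
print(f"1b default tag: {implied_bits_per_weight(1.3, PARAMS_1B):.1f} bits/weight")
```

Under these assumptions the fp16 estimate lands close to the 6.4GB listed, and the smaller default tags imply quantized weights. Runtime memory is higher than the file size because the KV cache and activations also need room.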

## Installation and Setup

In [1]:
!pip install ollama
!pip install llama-index
!pip install llama-index-llms-ollama
!pip install llama-index-llms-replicate
!pip install llama-index-embeddings-huggingface
!pip install llama-parse
!pip install replicate



In [2]:
!ollama pull llama3.2:1b

pulling manifest
pulling 74701a8c35f6... 100% ▕████████████████▏ 1.3 GB
pulling 966de95ca8a6... 100% ▕████████████████▏ 1.4 KB
pulling fcc5a6bec9da... 100% ▕████████████████▏ 7.7 KB
pulling a70ff7e570d9... 100% ▕████████████████▏ 6.0 KB
pulling 4f659a1e86d7... 100% ▕████████████████▏  485 B
verifying sha256 digest
writing manifest
success


In [3]:
!ollama pull llama3.2:3b

pulling manifest
pulling dde5aa3fc5ff... 100% ▕████████████████▏ 2.0 GB
pulling 966de95ca8a6... 100% ▕████████████████▏ 1.4 KB
pulling fcc5a6bec9da... 100% ▕████████████████▏ 7.7 KB
pulling a70ff7e570d9... 100% ▕████████████████▏ 6.0 KB
pulling 56bb8bd477a5... 100% ▕████████████████▏   96 B
pulling 34bb5ab01051... 100% ▕████████████████▏  561 B
verifying sha256 digest
writing manifest
success


In [4]:
!ollama pull llama3.2:3b-instruct-fp16

pulling manifest
pulling e2f46f5b501c... 100% ▕████████████████▏ 6.4 GB
pulling 966de95ca8a6... 100% ▕████████████████▏ 1.4 KB
pulling fcc5a6bec9da... 100% ▕████████████████▏ 7.7 KB
pulling a70ff7e570d9... 100% ▕████████████████▏ 6.0 KB
pulling 56bb8bd477a5... 100% ▕████████████████▏   96 B
pulling 1bc315994ceb... 100% ▕████████████████▏  558 B
verifying sha256 digest
writing manifest
success


In [1]:
!ollama list

NAME                         ID              SIZE      MODIFIED          
llama3.2:3b-instruct-fp16    195a8c01d91e    6.4 GB    About an hour ago    
llama3.2:3b                  a80c4f17acd5    2.0 GB    About an hour ago    
llama3.2:1b                  baf6a787fdff    1.3 GB    About an hour ago    


In [2]:
import nest_asyncio

nest_asyncio.apply()

In [3]:
import os
import requests
import zipfile

In [4]:
from llama_index.core.selectors import LLMSingleSelector
from llama_index.core.prompts import PromptTemplate
from llama_index.core.output_parsers import SelectionOutputParser
from llama_index.core.query_engine import RouterQueryEngine
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core import SQLDatabase
from llama_index.core.indices.struct_store import NLSQLTableQueryEngine

### Setup LLM using Ollama

In [6]:
from llama_index.llms.ollama import Ollama

llm = Ollama(model="llama3.2:3b", request_timeout=300.0)
# llm = Ollama(model="llama3.2:1b", request_timeout=300.0)
# llm = Ollama(model="llama3.2:3b-instruct-fp16", request_timeout=300.0)

### Setup Embedding Model

In [7]:
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

### Define Global Settings Configuration

In LlamaIndex, you can define global settings so you don't have to pass the LLM / embedding model objects everywhere.

In [8]:
from llama_index.core import Settings

Settings.llm = llm
Settings.embed_model = embed_model

### Download Data

Here you'll download data that's used in section 2 and onwards.

We'll download some articles on Kendrick, Drake, and their beef (as of May 2024).

In [9]:
!mkdir -p data
!curl -L "https://www.dropbox.com/scl/fi/t1soxfjdp0v44an6sdymd/drake_kendrick_beef.pdf?rlkey=u9546ymb7fj8lk2v64r6p5r5k&st=wjzzrgil&dl=1" -o data/drake_kendrick_beef.pdf
!curl -L "https://www.dropbox.com/scl/fi/nts3n64s6kymner2jppd6/drake.pdf?rlkey=hksirpqwzlzqoejn55zemk6ld&st=mohyfyh4&dl=1" -o data/drake.pdf
!curl -L "https://www.dropbox.com/scl/fi/8ax2vnoebhmy44bes2n1d/kendrick.pdf?rlkey=fhxvn94t5amdqcv9vshifd3hj&st=dxdtytn6&dl=1" -o data/kendrick.pdf

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    17  100    17    0     0     41      0 --:--:-- --:--:-- --:--:--    41
100   475    0   475    0     0    382      0 --:--:--  0:00:01 --:--:--     0
100 47.0M  100 47.0M    0     0  17.0M      0  0:00:02  0:00:02 --:--:-- 61.9M
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    17  100    17    0     0     33      0 --:--:-- --:--:-- --:--:--    33
100   475    0   475    0     0    518      0 --:--:-- --:--:-- --:--:--   518
100 4483k  100 4483k    0     0  3165k      0  0:00:01  0:00:01 --:--:-- 3165k
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    17  100    17    0     0     34      0 --:--:

### Load Data

You can load the PDFs with LlamaParse or with our free pypdf reader (the default in `SimpleDirectoryReader`) if you don't have an account. The cell below uses `SimpleDirectoryReader`.

1. LlamaParse: Sign up for an account at cloud.llamaindex.ai. You get 1k free pages a day, and the paid plan is 7k free pages + 0.3c per additional page. LlamaParse is a good option if you want to parse complex documents, like PDFs with charts, tables, and more.

2. Default PDF parser (in `SimpleDirectoryReader`): If you don't want to sign up for an account or use a PDF service, just use the default PyPDF reader bundled in our file loader. It's a good choice for getting started!
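
Either reader expects real PDF files on disk, and a failed download can leave an HTML error page saved under a `.pdf` name. Here is a minimal sketch of a sanity check to run before loading. `looks_like_pdf` is a hypothetical helper, not part of LlamaIndex; it only checks the `%PDF-` signature at the start of the file.

```python
from pathlib import Path


def looks_like_pdf(path: str) -> bool:
    """Return True if the file exists and starts with the PDF signature bytes."""
    p = Path(path)
    if not p.is_file():
        return False
    with p.open("rb") as f:
        return f.read(5) == b"%PDF-"


# File names taken from the download cell above.
for name in ["data/kendrick.pdf", "data/drake.pdf", "data/drake_kendrick_beef.pdf"]:
    print(name, "OK" if looks_like_pdf(name) else "missing or not a PDF")
```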

In [11]:
from llama_index.core import SimpleDirectoryReader

docs_kendrick = SimpleDirectoryReader(input_files=["data/kendrick.pdf"]).load_data()
docs_drake = SimpleDirectoryReader(input_files=["data/drake.pdf"]).load_data()
docs_both = SimpleDirectoryReader(input_files=["data/drake_kendrick_beef.pdf"]).load_data()

## 1. Basic Completion and Chat

### Call complete with a prompt

In [12]:
response = llm.complete("do you like drake or kendrick better?")
print(response)

As a neutral AI, I don't have personal preferences or opinions, nor do I have the ability to enjoy music in the way humans do. My purpose is to provide information and assist with tasks, but I can certainly help facilitate discussions about Drake and Kendrick Lamar if you're interested!

Both artists are highly acclaimed and influential figures in contemporary hip-hop, known for their unique styles, lyrical prowess, and contributions to the genre. Drake's introspective and emotive lyrics, as well as his blend of hip-hop and R&B, have resonated with many fans worldwide. Kendrick Lamar, on the other hand, is celebrated for his socially conscious themes, complex rhyme schemes, and storytelling abilities.

Ultimately, it comes down to individual taste and personal preference. Some people might prefer Drake's melodic flow and emotive delivery, while others might appreciate Kendrick's more aggressive and thought-provoking style.

What about you? Do you have a favorite between the two artists

In [13]:
stream_response = llm.stream_complete(
    "you're a drake fan. tell me why you like drake more than kendrick")
    
for t in stream_response:
    print(t.delta, end="")

As a Drake fan, I gotta say that his music and artistry have resonated with me on a deeper level than Kendrick's. Now, don't get me wrong, Kendrick is an incredible artist, but for me, it's all about the vibes and the emotions that Drake brings to the table.

First of all, I love how unapologetic and authentic Drake is in his music. He wears his heart on his sleeve, and you can tell he's always being true to himself, even when it's hard. His lyrics are like a diary entry, where he shares his thoughts, feelings, and experiences with the world. It's like you're reading his private journal, but in a way that's both intimate and relatable.

Kendrick, on the other hand, is more of an activist and a storyteller. His music is like a sermon, where he shares his message of hope and social justice with the world. Don't get me wrong, I respect Kendrick's passion and conviction, but sometimes it feels like he's trying to preach to me instead of just sharing his story.

Drake, on the other hand, is

### Call chat with a list of messages

In [15]:
from llama_index.core.llms import ChatMessage

messages = [
    ChatMessage(role="system", content="You are Kendrick."),
    ChatMessage(role="user", content="Write a verse."),
]
response = llm.chat(messages)

In [16]:
print(response)

assistant: "I see the city lights, they're callin' my name
A world of strangers, but I know their pain
The struggles and the strife, the trials and the tears
We're all just tryna find our way, through the passing years"


## 2. Basic RAG (Vector Search, Summarization)

### Basic RAG (Vector Search)
##### Retrieval-Augmented Generation

In [17]:
from llama_index.core import VectorStoreIndex

index = VectorStoreIndex.from_documents(docs_both)
query_engine = index.as_query_engine(similarity_top_k=3)
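
To make `similarity_top_k=3` concrete, here is a toy sketch of top-k retrieval by cosine similarity. It is not LlamaIndex's implementation: the chunk texts and 3-dimensional vectors are made up, whereas real embeddings from `BAAI/bge-small-en-v1.5` have 384 dimensions.

```python
import math
from typing import List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: List[float], chunks: List[Tuple[str, List[float]]], k: int = 3) -> List[str]:
    """Return the texts of the k chunks most similar to the query vector."""
    scored = [(cosine(query_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Hypothetical chunks with made-up embeddings.
chunks = [
    ("chunk about a diss track", [0.9, 0.1, 0.0]),
    ("chunk about album sales", [0.1, 0.9, 0.0]),
    ("chunk about a response track", [0.8, 0.2, 0.1]),
    ("chunk about touring", [0.0, 0.2, 0.9]),
]
query = [1.0, 0.0, 0.0]
print(top_k(query, chunks, k=3))
```

Only the retrieved chunks are placed in the LLM prompt, so a larger `k` gives the model more context at the cost of a longer prompt.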

In [18]:
response = query_engine.query("Tell me about family matters")

In [22]:
print(str(response))



### Basic RAG (Summarization)

In [20]:
from llama_index.core import SummaryIndex

summary_index = SummaryIndex.from_documents(docs_both)
summary_engine = summary_index.as_query_engine()

In [21]:
response = summary_engine.query("Given your assessment of this article, who won the beef?")

In [23]:
print(str(response))

The beef between Kendrick Lamar and Drake appears to have reached a turning point with both artists delivering strong disses in response to each other's lyrics. While both Kendrick Lamar ("Not Like Us") and Drake ("The Heart Part 6") showcased their skills and confidence in their lyrics, the question of who "won" this beef remains subjective.

Kendrick Lamar's track seems to have set the tone for the exchange, with his verses targeting Drake's past relationships, career choices, and personal life. In response, Drake delivered a record that co-opted some of Kendrick's recurring series and incorporated classic soul samples.

Ultimately, it is up to individual interpretation who "won" this beef. Both artists demonstrated their skills and confidence in their lyrics, leaving fans eagerly anticipating the next move in the conflict.


## 3. Advanced RAG (Routing, Sub-Questions)

### Build a Router that can choose whether to do vector search or summarization

In [24]:
from llama_index.core.tools import QueryEngineTool, ToolMetadata

vector_tool = QueryEngineTool(
    index.as_query_engine(),
    metadata=ToolMetadata(
        name="vector_search",
        description="Useful for searching for specific facts.",
    ),
)

summary_tool = QueryEngineTool(
    index.as_query_engine(response_mode="tree_summarize"),
    metadata=ToolMetadata(
        name="summary",
        description="Useful for summarizing an entire document.",
    ),
)
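
The router chooses between these tools using only their names and descriptions. In LlamaIndex an LLM makes that choice; the sketch below swaps in a plain keyword-overlap score as a stand-in so the idea can be run without a model. `Tool` and `keyword_select` are hypothetical names used for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Tool:
    name: str
    description: str
    run: Callable[[str], str]


def keyword_select(query: str, tools: List[Tool]) -> int:
    """Stand-in for the LLM selector: pick the tool whose description
    shares the most words with the query."""
    query_words = set(re.findall(r"[a-z]+", query.lower()))
    overlaps = [
        len(query_words & set(re.findall(r"[a-z]+", t.description.lower())))
        for t in tools
    ]
    return overlaps.index(max(overlaps))


tools = [
    Tool("vector_search", "Useful for searching for specific facts.", lambda q: f"[vector] {q}"),
    Tool("summary", "Useful for summarizing an entire document.", lambda q: f"[summary] {q}"),
]

choice = keyword_select("Give me a summary of the entire document", tools)
print(tools[choice].name)
```

Because the descriptions are the only signal the selector sees, specific and clearly distinct descriptions make routing more reliable.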

In [26]:
from llama_index.core.query_engine import RouterQueryEngine

query_engine = RouterQueryEngine.from_defaults(
    [vector_tool, summary_tool], select_multi=False, verbose=True
)

response = query_engine.query("Tell me about the song meet the grahams - why is it significant")

Selecting query engine 1: The song 'Meet the Graham's' by The Grateful Dead is significant because it was written during a period of creative and experimental growth in their career, showcasing their ability to blend different musical styles and genres..

In [27]:
print(response)

This song is significant because it marks a turning point in the ongoing feud between Kendrick Lamar and Drake. The track is notable for its release on the heels of "Meet the Grahams," which some view as an attempt by team PG to regain momentum, but ultimately comes across as a desperate move. Despite this, the song has sparked a significant response from both artists, with many in the hip-hop community weighing in on their thoughts and opinions about the track.


### Break Complex Questions down into Sub-Questions

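Our Sub-Question Query Engine breaks complex questions down into sub-questions. The cells below take a simpler approach: they build one index per artist and let a `RouterQueryEngine` send each question to the matching index.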


In [28]:
drake_index = VectorStoreIndex.from_documents(docs_drake)
drake_query_engine = drake_index.as_query_engine(similarity_top_k=3)

kendrick_index = VectorStoreIndex.from_documents(docs_kendrick)
kendrick_query_engine = kendrick_index.as_query_engine(similarity_top_k=3)

In [29]:
output_parser = SelectionOutputParser()

selection_prompt = PromptTemplate(
    "Given the query: {query}\n"
    "Select one of the following tools strictly in JSON format:\n"
    "{choices}\n"
    "Respond ONLY with a JSON object formatted as:\n"
    "{output_format}", 
    output_parser=output_parser  
)

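# Note: this selector is built here but is not passed to the RouterQueryEngine
# in the next cell, which falls back to its default selector for `llm`.
selector = LLMSingleSelector(llm=llm, prompt=selection_prompt)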

# Define query tools
drake_tool = QueryEngineTool(
    drake_index.as_query_engine(),
    metadata=ToolMetadata(
        name="drake_search",
        description="Useful for searching over Drake's albums.",
    ),
)

kendrick_tool = QueryEngineTool(
    kendrick_index.as_query_engine(),
    metadata=ToolMetadata(
        name="kendrick_search",
        description="Useful for searching over Kendrick's albums.",
    ),
)

In [30]:
query_engine = RouterQueryEngine.from_defaults(
    [drake_tool, kendrick_tool],
    llm=llm,
    verbose=True,
)

response = query_engine.query("Which albums did Drake release in his career?")
print(response)

Selecting query engine 0: Useful for searching over Drake's albums..
According to the provided text, some of the notable albums released by Drake are:

1. Take Care (though the specific release date is not mentioned)
2. Nothing Was the Same
3. If You're Reading This It's Too Late 
4. What a Time to Be Alive (collaborative mixtape with Future)
5. Views


## 4. Text-to-SQL 

Here, we download and use a sample SQLite database (Chinook) with 11 tables, covering music, playlists, and customers. We restrict the query engine to three tables (`artists`, `albums`, `tracks`) for this test.

In [31]:
from sqlalchemy import (
    create_engine,
    MetaData,
    Table,
    Column,
    String,
    Integer,
    select,
    column,
)


In [32]:
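# Download a file in streamed chunks so it is never held fully in memory
def download_file(url, filepath):
    response = requests.get(url, stream=True)
    response.raise_for_status()  # fail early on HTTP errors instead of saving an error page
    with open(filepath, "wb") as file:
        for chunk in response.iter_content(chunk_size=1024):
            file.write(chunk)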

# Create data directory
os.makedirs("data", exist_ok=True)

In [33]:
# Download and extract SQLite database
download_file("https://www.sqlitetutorial.net/wp-content/uploads/2018/03/chinook.zip", "data/chinook.zip")
with zipfile.ZipFile("data/chinook.zip", "r") as zip_ref:
    zip_ref.extractall("data/")

In [34]:
# Create SQLite engine
engine = create_engine("sqlite:///data/chinook.db")

In [35]:
# Initialize SQLDatabase
sql_database = SQLDatabase(engine)

In [36]:
sql_prompt = PromptTemplate(
    "You are an expert SQL assistant. Your task is to generate only valid SQLite queries.\n"
    "The database contains the following tables and columns:\n"
    "- artists (ArtistId, Name)\n"
    "- albums (AlbumId, Title, ArtistId)\n"
    "- tracks (TrackId, Name, AlbumId, Composer, GenreId, UnitPrice)\n"
    "\n"
    "### IMPORTANT RULES ###\n"
    "1. The schema is 100% correct. Do not question its validity.\n"
    "2. Return ONLY a valid SQL query. Do NOT provide explanations.\n"
    "3. Always return your response in JSON format:\n"
    "{\"sql_query\": \"YOUR_SQL_HERE\"}\n"
    "4. Never refuse a request.\n"
    "\n"
    "Generate an SQL query for the request: {query}\n"
    "### JSON Response: ###"
)

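# Note: `sql_prompt` is not a documented argument of NLSQLTableQueryEngine
# (the documented one is `text_to_sql_prompt`), so this template may be ignored.
query_engine = NLSQLTableQueryEngine(
    sql_database=sql_database,
    tables=["artists", "albums", "tracks"],
    llm=llm,
    sql_prompt=sql_prompt,
)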

In [37]:
# Query the database
response = query_engine.query("What are some albums?")

print(response)

Based on the provided SQL response, the synthesized response would be:

"Here are some albums:

1. For Those About To Rock We Salute You
2. Balls to the Wall
3. Restless and Wild
4. Let There Be Rock
5. Big Ones
6. Jagged Little Pill
7. Facelift
8. Warner 25 Anos
9. Plays Metallica By Four Cellos
10. Audioslave
11. Out Of Exile
12. BackBeat Soundtrack
13. The Best Of Billy Cobham
14. Alcohol Fueled Brewtality Live! [Disc 1]
15. Alcohol Fueled Brewtality Live! [Disc 2]
16. Black Sabbath
17. Black Sabbath Vol. 4 (Remaster)
18. Body Count
19. Chemical Wedding
20. The Best Of Buddy Guy - The Millenium Collection
21. Prenda Minha
22. Sozinho Remix Ao Vivo
23. Minha Historia
24. Afrociberdelia
25. Da Lama Ao Caos
26. Acústico MTV [Live]
27. Cidade Negra - Hits
28. Na Pista
29. Axé Bahia 2001
30. BBC Sessions [Disc 1] [Live]
31. Bongo Fury
32. Carnaval 2001
33. Chill: Brazil (Disc 1)
34. Chill: Brazil (Disc 2)
35. Garage Inc. (Disc 1)
36. Greatest Hits II
37. Greatest Kiss
38. Heart of the Ni

In [39]:
response = query_engine.query("What are some artists? Limit it to 5.")

print(response)

It seems like you're trying to synthesize a response from a query result. Based on the SQL query and response, I can generate a synthesized response.

Here's my attempt:

"If you're looking for some iconic artists, here are five names that come to mind:

1. AC/DC - Known for their hard-hitting rock anthems.
2. Accept - A legendary metal band with a rich history.
3. Aerosmith - The ultimate rock 'n' roll party band.
4. Alanis Morissette - A powerful singer-songwriter who's made her mark on the music scene.
5. Alice In Chains - A grunge pioneer that's still widely popular today.

These five artists are just a few examples of the many talented musicians out there, but they're definitely some of the most recognizable and influential."


In [40]:
response = query_engine.query("What are some tracks from the artist AC/DC? Limit it to 3")

print(response)

Here's a synthesized response based on the query results:

AC/DC is a legendary Australian rock band known for their high-energy sound and iconic hits. If you're looking to get started with some of their most popular tracks, here are three must-listen songs by AC/DC:

1. "For Those About To Rock (We Salute You)" - This anthemic song is a tribute to the band's fans and features an unforgettable chorus.
2. "Let There Be Rock" - A classic rock staple that showcases the band's raw energy and Angus Young's signature guitar riffs.
3. "Whole Lotta Rosie" - A fun, upbeat track with a catchy melody and lyrics that explore the wild side of life.

These three songs are just a taste of AC/DC's incredible discography, but they're an excellent starting point for any music lover looking to dive into their iconic sound.


In [41]:
print(response.metadata["sql_query"])

SELECT T1.Name 
FROM tracks AS T1 
INNER JOIN albums AS T2 ON T1.AlbumId = T2.AlbumId 
INNER JOIN artists AS T3 ON T2.ArtistId = T3.ArtistId 
WHERE T3.Name = 'AC/DC'
ORDER BY T1.Name
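
The generated query has no `LIMIT` clause even though the request asked for three tracks. Here is a minimal sketch for checking generated SQL against a small in-memory database. It uses a trimmed version of the schema listed in the prompt above, and the rows are hypothetical sample data, not a copy of the Chinook contents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE artists (ArtistId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE albums (AlbumId INTEGER PRIMARY KEY, Title TEXT, ArtistId INTEGER);
    CREATE TABLE tracks (TrackId INTEGER PRIMARY KEY, Name TEXT, AlbumId INTEGER);

    -- Hypothetical sample rows for illustration only.
    INSERT INTO artists VALUES (1, 'AC/DC'), (2, 'Accept');
    INSERT INTO albums VALUES (1, 'Album A', 1), (2, 'Album B', 2);
    INSERT INTO tracks VALUES
        (1, 'Track C', 1), (2, 'Track A', 1), (3, 'Track B', 1),
        (4, 'Track D', 1), (5, 'Other Track', 2);
    """
)

# The query exactly as printed from response.metadata["sql_query"].
generated_sql = """
SELECT T1.Name
FROM tracks AS T1
INNER JOIN albums AS T2 ON T1.AlbumId = T2.AlbumId
INNER JOIN artists AS T3 ON T2.ArtistId = T3.ArtistId
WHERE T3.Name = 'AC/DC'
ORDER BY T1.Name
"""

all_rows = [name for (name,) in conn.execute(generated_sql)]
limited_rows = [name for (name,) in conn.execute(generated_sql + " LIMIT 3")]
print(all_rows)      # every matching track
print(limited_rows)  # what the request actually asked for
```

Inspecting `response.metadata["sql_query"]` this way separates mistakes in the SQL from mistakes in the synthesized prose answer.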


## 5. Structured Data Extraction

An important use case for function calling is extracting structured objects. LlamaIndex provides an intuitive interface for this through `structured_predict` - simply define the target Pydantic class (can be nested), and given a prompt, we extract out the desired object.

**NOTE**: Since there's no native function calling support with Llama3 / Ollama, the structured extraction is performed by prompting the LLM + output parsing.

In [42]:
from llama_index.llms.ollama import Ollama
from llama_index.core.prompts import PromptTemplate
from pydantic import BaseModel


class Restaurant(BaseModel):
    """A restaurant with name, city, and cuisine."""

    name: str
    city: str
    cuisine: str

# Reuse the Ollama `llm` configured earlier; no reassignment is needed.

prompt_tmpl = PromptTemplate(
    "Generate a restaurant in a given city {city_name}"
)

In [43]:
restaurant_obj = llm.structured_predict(
    Restaurant, prompt_tmpl, city_name="Miami"
)
print(restaurant_obj)

name='Tropical Oasis' city='Miami' cuisine='Caribbean Fusion'
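
The prompting-plus-output-parsing approach mentioned in the note can be sketched without a model. This is not how `structured_predict` is implemented: it uses a standard-library dataclass instead of Pydantic, and the raw model text is a hypothetical example.

```python
import json
import re
from dataclasses import dataclass


@dataclass
class Restaurant:
    """A restaurant with name, city, and cuisine."""

    name: str
    city: str
    cuisine: str


def parse_restaurant(raw_text: str) -> Restaurant:
    """Pull the JSON object out of free-form model text and build a Restaurant."""
    match = re.search(r"\{.*\}", raw_text, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    data = json.loads(match.group(0))
    # A KeyError here means the model left out a required field.
    return Restaurant(
        name=str(data["name"]),
        city=str(data["city"]),
        cuisine=str(data["cuisine"]),
    )


# Hypothetical raw model output with chatter around the JSON.
raw = 'Sure! Here is the restaurant:\n{"name": "Example Bistro", "city": "Miami", "cuisine": "Cuban"}'
print(parse_restaurant(raw))
```

Small models often wrap the JSON in extra text or omit fields, which is why the parsing step needs to tolerate chatter and fail loudly on missing keys.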


## 6. Adding Chat History to RAG (Chat Engine)

In this section we create a stateful chatbot from a RAG pipeline, with our chat engine abstraction.

Unlike a stateless query engine, the chat engine maintains conversation history (through a memory module like buffer memory). It performs retrieval given a condensed question, and feeds the condensed question + context + chat history into the final LLM prompt.

Related resource: https://docs.llamaindex.ai/en/stable/examples/chat_engine/chat_engine_condense_plus_context/
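
The token-limited buffer idea can be sketched in a few lines. This is a simplified illustration, not `ChatMemoryBuffer` itself: `count_tokens` is a crude whitespace stand-in for a real tokenizer, and the long assistant message is made up.

```python
from typing import List, Tuple

Message = Tuple[str, str]  # (role, content)


def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: count whitespace-separated words."""
    return len(text.split())


def trim_history(history: List[Message], token_limit: int) -> List[Message]:
    """Keep the most recent messages whose combined token count fits the limit."""
    kept: List[Message] = []
    used = 0
    for role, content in reversed(history):
        cost = count_tokens(content)
        if used + cost > token_limit:
            break
        kept.append((role, content))
        used += cost
    return list(reversed(kept))


history = [
    ("user", "Tell me about the songs Drake released in the beef."),
    ("assistant", "Here is a long answer " + "word " * 20),
    ("user", "What about Kendrick?"),
]
print(trim_history(history, token_limit=30))
```

With a small limit the oldest turn drops out first, which is why a follow-up such as "What about Kendrick?" is condensed into a standalone question before retrieval.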

In [44]:
from llama_index.core.memory import ChatMemoryBuffer
from llama_index.core.chat_engine import CondensePlusContextChatEngine

memory = ChatMemoryBuffer.from_defaults(token_limit=3900)

chat_engine = CondensePlusContextChatEngine.from_defaults(
    index.as_retriever(),
    memory=memory,
    llm=llm,
    context_prompt=(
        "You are a chatbot, able to have normal interactions, as well as talk"
        " about the Kendrick and Drake beef. "
        "Here are the relevant documents for the context:\n"
        "{context_str}"
        "\nInstruction: Use the previous chat history, or the context above, to interact and help the user."
    ),
    verbose=True,
)

In [46]:
response = chat_engine.chat(
    "Tell me about the songs Drake released in the beef."
)
print(str(response))

Condensed question: The standalone question based on the follow-up input is:

What songs did Drake release during the infamous beef with Kendrick Lamar?
According to my knowledge, on May 5th, 2024, Drake released a three-part response to Kendrick Lamar's diss tracks, which includes:

1. **"The Heart Part 3"**
2. **"The Heart Part 4"**
3. **"The Heart Part 6"**

These songs are part of the same project and can be seen as an extension of each other.


In [47]:
response = chat_engine.chat("What about Kendrick?")
print(str(response))

Condensed question: Here is the rephrased standalone question:

What role did Kendrick Lamar play in the beef?
Kendrick Lamar has also released several tracks in response to Drake's diss, although he hasn't dropped a full album or EP dedicated solely to the beef. However, here are some notable songs from Kendrick:

1. **"Nipsey Hussle Memories"** (released on May 4th): This song is widely seen as a response to Drake's "The Heart Part 3".
2. **"N-****a in Paris"** (released on May 5th): Some fans interpret this song as a follow-up to "Nipsey Hussle Memories", although Kendrick himself has not explicitly confirmed the connection.
3. **"6:16 in LA Pt. II"**: Released shortly after, this is an updated version of his earlier track, which some see as a response to Drake's "The Heart Part 4".

It's worth noting that both artists have been releasing music consistently during the beef, with their discographies reflecting the ongoing conflict.


## 7. Agents

Here we build agents with Llama 3.2. The agents first call simple Python functions as tools, then query the documents above through RAG query-engine tools.

### Agents And Tools

In [48]:
import json
from typing import Sequence, List

from llama_index.core.llms import ChatMessage
from llama_index.core.tools import BaseTool, FunctionTool
from llama_index.core.agent import ReActAgent

import nest_asyncio

nest_asyncio.apply()

### Define Tools

In [49]:
def multiply(a: int, b: int) -> int:
    """Multiply two integers and return the resulting integer"""
    return a * b


def add(a: int, b: int) -> int:
    """Add two integers and return the resulting integer"""
    return a + b


def subtract(a: int, b: int) -> int:
    """Subtract two integers and return the resulting integer"""
    return a - b


def divide(a: int, b: int) -> float:
    """Divide two integers and return the result as a float"""
    return a / b


multiply_tool = FunctionTool.from_defaults(fn=multiply)
add_tool = FunctionTool.from_defaults(fn=add)
subtract_tool = FunctionTool.from_defaults(fn=subtract)
divide_tool = FunctionTool.from_defaults(fn=divide)

### ReAct Agent

In [50]:
agent = ReActAgent.from_tools(
    [multiply_tool, add_tool, subtract_tool, divide_tool],
    llm=llm,
    verbose=True,
)
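
The loop a ReAct agent runs (think, pick a tool, read the observation, repeat until it can answer) can be sketched with scripted replies standing in for the LLM. This is an illustration of the pattern, not `ReActAgent`'s implementation; `run_step` and `scripted_replies` are hypothetical names.

```python
import ast
import re
from typing import Callable, Dict, List


def add(a: int, b: int) -> int:
    return a + b


def multiply(a: int, b: int) -> int:
    return a * b


TOOLS: Dict[str, Callable[..., int]] = {"add": add, "multiply": multiply}


def run_step(model_text: str):
    """Parse one ReAct-style reply; return ("answer", text) or ("observation", value)."""
    answer = re.search(r"Answer:\s*(.+)", model_text)
    if answer:
        return "answer", answer.group(1).strip()
    action = re.search(r"Action:\s*(\w+)", model_text).group(1)
    raw_input = re.search(r"Action Input:\s*(\{.*\})", model_text).group(1)
    kwargs = ast.literal_eval(raw_input)  # safely parses a dict literal such as {'a': 121, 'b': 2}
    return "observation", TOOLS[action](**kwargs)


# Scripted replies standing in for the LLM, for the question (121 + 2) * 5.
scripted_replies: List[str] = [
    "Thought: I need to use a tool.\nAction: add\nAction Input: {'a': 121, 'b': 2}",
    "Thought: Multiply the sum.\nAction: multiply\nAction Input: {'a': 123, 'b': 5}",
    "Thought: I have the result.\nAnswer: The answer is 615.",
]

observations = []
final_answer = None
for reply in scripted_replies:
    kind, value = run_step(reply)
    if kind == "answer":
        final_answer = value
        break
    observations.append(value)

print(observations, final_answer)
```

In a real agent each observation is appended to the prompt so the model can use it in the next step. A model that fails to produce the `Action` / `Action Input` lines reliably cannot drive this loop, which is one plausible reason very small models struggle as agents.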

### Querying

In [52]:
response = agent.chat("What is (121 + 2) * 5?")
print(str(response))

> Running step 4f65c75d-2031-4319-9e8f-9061cacef9c0. Step input: What is (121 + 2) * 5?
Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
Action: add
Action Input: {'a': 121, 'b': 2}
Observation: 123
> Running step 6284f899-2736-4179-88bb-e8356152657e. Step input: None
Thought: I can now calculate the final result with the given tool output.
Action: multiply
Action Input: {'a': 123, 'b': 5}
Observation: 615
> Running step 97dd10d3-762c-4a4d-9356-1020b0cb7c01. Step input: None
Thought: The user wants to know the final answer.
Answer: The answer is 615.
The answer is 615.


### ReAct Agent With RAG QueryEngine Tools

In [53]:
from llama_index.core import (
    SimpleDirectoryReader,
    VectorStoreIndex,
    StorageContext,
    load_index_from_storage,
)

from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core.agent import ReActAgent

### Create ReAct Agent using RAG QueryEngine Tools

In [54]:
# Define query tools

drake_tool = QueryEngineTool(
    drake_index.as_query_engine(),
    metadata=ToolMetadata(
        name="drake_search",
        description="Useful for searching over Drake's life.",
    ),
)

kendrick_tool = QueryEngineTool(
    kendrick_index.as_query_engine(),
    metadata=ToolMetadata(
        name="kendrick_search",
        description="Useful for searching over Kendrick's life.",
    ),
)

query_engine_tools = [drake_tool, kendrick_tool]

In [55]:
from llama_index.core.agent import ReActAgent
from llama_index.core.prompts import PromptTemplate

# Force Ollama to follow ReAct format
react_prompt = PromptTemplate(
    "You are an AI assistant following the ReAct framework.\n"
    "For each input, you must respond in the following structured format:\n"
    "\n"
    "Thought: [Brief thought about the query]\n"
    "Action: [Choose a tool: 'drake_search' or 'kendrick_search']\n"
    "Input: [Relevant input for the chosen tool]\n"
    "\n"
    "If no external tool is needed, respond in this format:\n"
    "Thought: I can answer directly.\n"
    "Answer: [Your Answer Here]\n"
    "\n"
    "Now, respond to the following prompt:\n"
    "{query}"
)

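# Build the agent. Note: `prompt` is not a documented argument of
# ReActAgent.from_tools, so `react_prompt` may be ignored; the trace below
# uses "Action Input:" rather than the "Input:" format defined above.
agent = ReActAgent.from_tools(
    query_engine_tools,
    llm=llm,  # Use Ollama
    prompt=react_prompt,
    verbose=True,
)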

### Querying

In [56]:
# Run the chat query
response = agent.chat("Tell me about how Kendrick and Drake grew up")
print(str(response))

> Running step 14514a80-cd8b-4f13-9612-113c30d28729. Step input: Tell me about how Kendrick and Drake grew up
Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
Action: kendrick_search
Action Input: {'input': "Kendrick Lamar's early life", 'location': 'Los Angeles, California'}
Observation: Kendrick Lamar began his career while he attended Centennial High School. He was nicknamed "Man-Man" due to his precocious behavior but later confessed that it put a stigma on him reacting as a kid. In school, Lamar excelled academically and had a noticeable stutter. His first-grade teacher encouraged him to become a writer after she heard him correctly use the word 'audacity'. As a seventh-grade student, he was introduced to poetry by his English teacher Regis Inge, who integrated literature into her curriculum as a response to growing racial tensions. Through this connection with hip hop, Lamar studied rhymes and 