# 🔬 Virtual Lab 4: Running Llama Models and DeepSeek on Replicate  

<div style="border: 2px solid #4CAF50; padding: 15px; border-radius: 10px; background-color: #f4f4f4;">

### 🚀 **Platform**  
**Replicate**  

### 🏷️ **Models Used**  
- **meta/meta-llama-3-70b-instruct**  
- **meta/meta-llama-3.1-405b-instruct**  
- **meta/meta-llama-3-8b-instruct**  
- **deepseek-ai/deepseek-r1**  

### 🛠️ **Frameworks Used**  
- **LlamaIndex**  
- **LangChain / LangGraph**  

</div>

## Installation and Setup

In [1]:
!pip install llama-index
!pip install llama-index-llms-ollama
!pip install llama-index-llms-replicate
!pip install llama-index-embeddings-huggingface
!pip install llama-parse
!pip install replicate



In [1]:
import nest_asyncio

nest_asyncio.apply()

### Setup LLM using Replicate

Make sure you have REPLICATE_API_TOKEN specified!

In [3]:
import os
import replicate
from Constants import Constants

# Set the REPLICATE_API_TOKEN environment variable from the local Constants module.
# Alternatively, set it directly: os.environ["REPLICATE_API_TOKEN"] = "<YOUR_API_KEY>"
os.environ["REPLICATE_API_TOKEN"] = Constants.REPLICATE_API_KEY
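A missing token usually surfaces later as an authentication error from the API, so it can help to fail early. The following is a minimal sketch, not part of the original lab; `require_replicate_token` is a hypothetical helper name, and the only thing it assumes is that the token lives in the `REPLICATE_API_TOKEN` environment variable.

```python
import os
from typing import Mapping


def require_replicate_token(env: Mapping[str, str] = os.environ) -> str:
    """Return the Replicate token found in `env`, or raise a clear error if it is unset.

    `require_replicate_token` is a hypothetical helper, not part of any library.
    """
    token = env.get("REPLICATE_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "REPLICATE_API_TOKEN is not set; export it or assign it to "
            "os.environ before creating the LLM."
        )
    return token
```

Calling `require_replicate_token()` right after the cell above raises immediately if the variable is empty, instead of failing on the first model call.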

In [5]:
from llama_index.llms.replicate import Replicate

llm_replicate = Replicate(model="deepseek-ai/deepseek-r1", api_token=os.environ["REPLICATE_API_TOKEN"])
# llm_replicate = Replicate(model="meta/meta-llama-3-70b-instruct", api_token=os.environ["REPLICATE_API_TOKEN"])

### Setup Embedding Model

In [7]:
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

### Define Global Settings Configuration

In LlamaIndex, you can define global settings so you don't have to pass the LLM / embedding model objects everywhere.

In [8]:
from llama_index.core import Settings

Settings.llm = llm_replicate
Settings.embed_model = embed_model

### Download Data

Here you'll download data that's used in section 2 and onwards.

We'll download some articles on Kendrick, Drake, and their beef (as of May 2024).

In [9]:
!mkdir -p data
!curl -L "https://www.dropbox.com/scl/fi/t1soxfjdp0v44an6sdymd/drake_kendrick_beef.pdf?rlkey=u9546ymb7fj8lk2v64r6p5r5k&st=wjzzrgil&dl=1" -o data/drake_kendrick_beef.pdf
!curl -L "https://www.dropbox.com/scl/fi/nts3n64s6kymner2jppd6/drake.pdf?rlkey=hksirpqwzlzqoejn55zemk6ld&st=mohyfyh4&dl=1" -o data/drake.pdf
!curl -L "https://www.dropbox.com/scl/fi/8ax2vnoebhmy44bes2n1d/kendrick.pdf?rlkey=fhxvn94t5amdqcv9vshifd3hj&st=dxdtytn6&dl=1" -o data/kendrick.pdf


### Load Data

You can load the PDFs with either LlamaParse or the free pypdf-based reader that `SimpleDirectoryReader` uses by default. The code below uses `SimpleDirectoryReader`, so no account is needed.

1. LlamaParse: Sign up for an account at cloud.llamaindex.ai. You get 1k free pages a day, and the paid plan is 7k free pages + 0.3c per additional page. LlamaParse is a good option if you want to parse complex documents, like PDFs with charts, tables, and more.

2. Default PDF parser (in `SimpleDirectoryReader`): If you don't want to sign up for an account or use a PDF service, just use the default PyPDF reader bundled in the file loader. It's a good choice for getting started!

In [10]:
from llama_index.core import SimpleDirectoryReader

docs_kendrick = SimpleDirectoryReader(input_files=["data/kendrick.pdf"]).load_data()
docs_drake = SimpleDirectoryReader(input_files=["data/drake.pdf"]).load_data()
docs_both = SimpleDirectoryReader(input_files=["data/drake_kendrick_beef.pdf"]).load_data()

## 1. Basic Completion and Chat

### Call complete with a prompt

In [11]:
response = llm_replicate.complete("do you like drake or kendrick better?")

print(response)

<think>
Okay, so the user is asking whether I like Drake or Kendrick Lamar better. First, I need to remember that as an AI, I don't have personal preferences. The answer should reflect that while also providing some context about the two artists. Let me think about their careers and musical styles.

Drake is a Canadian rapper, singer, and songwriter. He's known for his versatility, blending rap with R&B, and he's had a lot of commercial success. His themes often revolve around relationships, fame, and personal struggles. Albums like "Take Care" and "Scorpion" are really popular. He's also known for catchy hooks and hits that top the charts.

Kendrick Lamar, on the other hand, is from Compton and is highly regarded for his lyrical depth and storytelling. His works often tackle social issues, systemic racism, and personal growth. Albums like "good kid, m.A.A.d city" and "To Pimp a Butterfly" are critically acclaimed and considered masterpieces. Kendrick has a more conscious approach to h

In [12]:
stream_response = llm_replicate.stream_complete(
    "you're a drake fan. tell me why you like drake more than kendrick"
)

for t in stream_response:
    print(t.delta, end="")

<think>
Okay, so I need to explain why someone likes Drake more than Kendrick Lamar. Let me start by recalling what I know about both artists. Drake is a Canadian rapper, singer, and songwriter known for his melodic style and emotional lyrics. He's been really popular for over a decade, with hits in both hip-hop and pop. Kendrick Lamar is also a highly respected rapper, known for his deep lyrics, storytelling, and social commentary. He's often praised for his artistry and impactful albums.

The user wants reasons why someone might prefer Drake over Kendrick. Let me think about the possible differences in their music styles, public personas, themes, and other factors. Let's start with musical style. Drake tends to blend singing and rapping, which makes his music more accessible and versatile. He's known for catchy hooks and melodies, which might appeal to a broader audience. His music often deals with personal emotions, relationships, and fame. 

Kendrick, on the other hand, focuses mor

### Call chat with a list of messages

In [13]:
from llama_index.core.llms import ChatMessage

messages = [
    ChatMessage(role="system", content="You are Kendrick."),
    ChatMessage(role="user", content="Write a verse."),
]
response = llm_replicate.chat(messages)

In [14]:
print(response)

assistant: <think>
Alright, let's get into the mindset of Kendrick Lamar. First, I need to think about his style: introspective, socially conscious, clever wordplay, and complex rhyme schemes. His verses often tackle themes like systemic racism, personal struggles, identity, and his roots in Compton.

Start with a strong opening line that grabs attention. Maybe use a metaphor related to fame or struggle. Kendrick often uses alliteration and internal rhymes, so I should incorporate that. Need to maintain a rhythmic flow, possibly using a AABB or ABAB rhyme scheme. Let's think about current events or personal challenges he might address. Maybe touch on authenticity in the music industry, social justice, or personal growth. Avoid clichés, keep it gritty and real. Use specific imagery related to Compton, like streetlights, corner stores, or cultural references. Also, consider the beat – the verse should have a cadence that fits a typical Kendrick track, with pauses and emphasis on certain 

## 2. Basic RAG (Vector Search, Summarization)

### Basic RAG (Vector Search)

In [15]:
from llama_index.core import VectorStoreIndex

index = VectorStoreIndex.from_documents(docs_both)
query_engine = index.as_query_engine(similarity_top_k=3)
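`similarity_top_k=3` asks the retriever for the three chunks whose embeddings are most similar to the query embedding. The sketch below illustrates that idea in plain Python with cosine similarity. It is only an illustration under stated assumptions: the 2-D vectors are made up, `cosine` and `top_k` are hypothetical helper names, and this is not how LlamaIndex implements retrieval internally.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query_vec: Sequence[float],
    chunk_vecs: Dict[str, Sequence[float]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k (chunk_id, score) pairs with the highest cosine similarity."""
    scored = [(cid, cosine(query_vec, vec)) for cid, vec in chunk_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 2-D "embeddings", made up for illustration only.
chunks = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [-1.0, 0.0]}
hits = top_k([1.0, 0.0], chunks, k=3)
print([cid for cid, _ in hits])
```

A larger `k` gives the LLM more context to work with at the cost of a longer prompt.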

In [16]:
response = query_engine.query("Tell me about family matters")

In [17]:
print(str(response))

<think>
Okay, so I need to figure out what "Family Matters" refers to based on the provided context. Let me go through the information given.

From page 21, it's mentioned that Drake released a diss track called "Family Matters" which is three songs in one, on three different beats. The track includes verses targeting Kendrick Lamar and others like Metro Boomin, Rick Ross, The Weeknd, and A$AP Rocky. However, the non-Kendrick parts are criticized as distracting and having some of Drake's laziest writing.

Page 24 adds that in "Family Matters," Drake accuses Kendrick of infidelity, neglecting his family by preferring bachelor life in New York, and domestic abuse against the mother of his children. Kendrick's response, "Meet the Grahams," counters these allegations and even claims Drake has a hidden daughter. Kendrick also brings up serious accusations like Drake associating with sex offenders and facing potential legal issues similar to Harvey Weinstein.

Page 29 talks about Drake's res

### Basic RAG (Summarization)

In [18]:
from llama_index.core import SummaryIndex

summary_index = SummaryIndex.from_documents(docs_both)
summary_engine = summary_index.as_query_engine()

In [19]:
response = summary_engine.query(
    "Given your assessment of this article, who won the beef?"
)

In [20]:
print(str(response))

<think>
Okay, let's look at the new context provided. The user added multiple pages from a GQ article, but most of them seem to be irrelevant—like sections about Shaboozey, A$AP Rocky, sponsored links, and other unrelated articles. The actual content related to the Kendrick vs. Drake beef is minimal here. Pages 30-34 of the PDF don't provide any new information about the beef; they're mostly other articles, ads, and the end of the original piece.

The original analysis already integrated the key points from the initial article, including the escalation of allegations, Kendrick's strategic releases, and the public reception. Since the new context doesn't add any substantial details about the beef itself, there's no need to refine the answer further. The original refined answer remains accurate and comprehensive based on the information available.
</think>

The provided context (pages 30-34 of the GQ article) contains no substantive new information about the Kendrick Lamar/Drake beef—it 
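The response above shows the pattern seen throughout this lab with DeepSeek-R1: a `<think>...</think>` block of reasoning followed by the actual answer. If you only want the answer, you can separate the two. This is a minimal sketch, assuming the reasoning is wrapped in a single complete `<think>...</think>` pair; `split_reasoning` is a hypothetical helper, not a LlamaIndex or Replicate function.

```python
import re
from typing import Tuple

# Matches the first complete <think>...</think> block, across newlines.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> Tuple[str, str]:
    """Return (reasoning, answer); reasoning is '' when no complete block is found."""
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the matched block is treated as the answer.
    answer = (text[: match.start()] + text[match.end():]).strip()
    return reasoning, answer
```

For example, `reasoning, answer = split_reasoning(str(response))` leaves the final text in `answer`. If the output was cut off before `</think>`, the text is returned unchanged as the answer.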

## 3. Advanced RAG (Routing, Sub-Questions)

### Build a Router that can choose whether to do vector search or summarization

In [21]:
from llama_index.core.tools import QueryEngineTool, ToolMetadata

vector_tool = QueryEngineTool(
    index.as_query_engine(),
    metadata=ToolMetadata(
        name="vector_search",
        description="Useful for searching for specific facts.",
    ),
)

summary_tool = QueryEngineTool(
    index.as_query_engine(response_mode="tree_summarize"),
    metadata=ToolMetadata(
        name="summary",
        description="Useful for summarizing an entire document.",
    ),
)

In [22]:
from llama_index.core.query_engine import RouterQueryEngine

query_engine = RouterQueryEngine.from_defaults(
    [vector_tool, summary_tool], select_multi=False, verbose=True
)

response = query_engine.query(
    "Tell me about the song meet the grahams - why is it significant"
)

Selecting query engine 0: The question seeks specific facts about the song 'Meet the Grahams' and its significance, which aligns with Choice 1's focus on factual search rather than summarizing an entire document.

In [23]:
print(response)

<think>
Okay, I need to answer the question about why "Meet the Grahams" is significant based on the provided context. Let me look through the given pages.

On page 27, it says Drake laughed off "Meet the Grahams" as a sign that team PG is desperate and "in shambles." The song is mentioned in the same paragraph as "Not Like Us," where Kendrick is proving he's more war-ready. But the main details about "Meet the Grahams" are on page 26. 

Page 26 describes "Meet the Grahams" as an uncomfortable listen, contrasting it with Drake's track "Family Matters," which alleges Kendrick "beats his queen." The author compares it to Pusha T's diss track but notes that "Meet the Grahams" is "joyless," suggesting it's more harsh or serious. Kendrick was in danger of losing the audience by taking things too far, but then released "Not Like Us" to lighten the mood. 

Significance points: It's part of the Kendrick/Drake beef, aggressive disses, puts pressure on Drake, but risked alienating listeners for 

### Break Complex Questions down into Sub-Questions

Our Sub-Question Query Engine breaks complex questions down into sub-questions.


In [24]:
drake_index = VectorStoreIndex.from_documents(docs_drake)
drake_query_engine = drake_index.as_query_engine(similarity_top_k=3)

kendrick_index = VectorStoreIndex.from_documents(docs_kendrick)
kendrick_query_engine = kendrick_index.as_query_engine(similarity_top_k=3)

In [25]:
from llama_index.core.tools import QueryEngineTool, ToolMetadata

drake_tool = QueryEngineTool(
    drake_index.as_query_engine(),
    metadata=ToolMetadata(
        name="drake_search",
        description="Useful for searching over Drake's life.",
    ),
)

kendrick_tool = QueryEngineTool(
    kendrick_index.as_query_engine(),
    metadata=ToolMetadata(
        name="kendrick_search",
        description="Useful for searching over Kendrick's life.",
    ),
)

In [26]:
from llama_index.core.query_engine import SubQuestionQueryEngine

query_engine = SubQuestionQueryEngine.from_defaults(
    [drake_tool, kendrick_tool],
    llm=llm_replicate,  
    verbose=True,
)

response = query_engine.query("Which albums did Drake release in his career?")

print(response)

Generated 1 sub questions.
[drake_search] Q: Which albums did Drake release in his career?
[drake_search] A: <think>
Okay, let's tackle this query: "Which albums did Drake release in his career?" Using the provided context from the two pages of the document, I need to extract all the albums mentioned.

First, starting with page 6, there's a section that mentions his third studio album, "Nothing Was the Same," released on September 24, 2013. It debuted at number one on the US Billboard 200. Then it mentions "If You're Reading This It's Too Late" released on February 12, 2015. There's a debate whether it's an album or mixtape, but the context states it's considered his fourth retail project. 

Next, "What a Time to Be Alive" is mentioned as a collaborative mixtape with Future, released on September 20, 2015. It's noted that this mixtape debuted at number one, making him the first hip-hop artist with two number-one projects in 2015. However,

## 4. Text-to-SQL 

Here, we download and use a sample SQLite database with 11 tables, with various info about music, playlists, and customers. We will limit to a select few tables for this test.

In [27]:
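import os
import requests


# Download a file in chunks so large files are never held fully in memory
def download_file(url, filepath):
    response = requests.get(url, stream=True)
    response.raise_for_status()  # fail early on HTTP errors instead of saving an error page
    with open(filepath, "wb") as file:
        for chunk in response.iter_content(chunk_size=1024):
            file.write(chunk)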

# Create data directory
os.makedirs("data", exist_ok=True)

In [28]:
# Download and extract SQLite database
import requests
import zipfile
download_file("https://www.sqlitetutorial.net/wp-content/uploads/2018/03/chinook.zip", "data/chinook.zip")
with zipfile.ZipFile("data/chinook.zip", "r") as zip_ref:
    zip_ref.extractall("data/")

In [29]:
from sqlalchemy import (
    create_engine,
    MetaData,
    Table,
    Column,
    String,
    Integer,
    select,
    column,
)

engine = create_engine("sqlite:///data/chinook.db")

In [30]:
from llama_index.core import SQLDatabase

sql_database = SQLDatabase(engine)

In [31]:
from llama_index.core.indices.struct_store import NLSQLTableQueryEngine

query_engine = NLSQLTableQueryEngine(
    sql_database=sql_database,
    tables=["albums", "tracks", "artists"],
    llm=llm_replicate,
)

In [32]:
response = query_engine.query("What are some albums?")

print(response)

<think>
Alright, the user is asking for some albums. Let me check the SQL query they ran. They selected the Title from the albums table, ordered by AlbumId, and limited to 5 results. The response from the database has five album titles. I need to present these in a clear and natural way.

First, I'll list each album title. Since the SQL response gives them as tuples, I should extract just the title strings. Maybe number them for readability. Let me make sure there are exactly five. The titles are "For Those About To Rock We Salute You", "Balls to the Wall", "Restless and Wild", "Let There Be Rock", and "Big Ones". 

I should check if there's any specific order the user might prefer, but since the SQL used ORDER BY AlbumId, they're probably listed in the order they were added to the database, starting from the earliest. The user might just want a straightforward list. 

No need for additional info since the question is simply asking for some album names. I'll format the response as a nu

In [33]:
response = query_engine.query("What are some artists? Limit it to 5.")

print(response)

<think>
Okay, the user asked for some artists, limited to 5. Let me check the SQL query they provided. It selects the Name from the artists table and limits to 5. The response from the database is a list of tuples like [('AC/DC',), ...]. 

Hmm, I need to turn that into a natural language answer. I'll start by listing the names from each tuple. Each tuple has a single element, so I can extract the first item from each. The artists are AC/DC, Accept, Aerosmith, Alanis Morissette, and Alice In Chains. Let me make sure there are exactly five as requested.

I should present them in a clear format, maybe bullet points or a numbered list. Since they asked for a list, perhaps numbers would help. Also, check if any of these artists might have special characters or formatting, but they look straightforward. 

Wait, the user example response just has the names separated by commas. Maybe a simple sentence stating the five artists. I can phrase it like, "Here are five artists from the database: AC/

In [34]:
response = query_engine.query(
    "What are some tracks from the artist AC/DC? Limit it to 3"
)

print(response)

<think>
Okay, so I need to figure out how to answer the question "What are some tracks from the artist AC/DC? Limit it to 3" based on the provided SQL query and its response. Let's start by understanding what the user is asking. They want three songs by AC/DC, right?

Looking at the SQL query, it's selecting track names from the tracks table, joining with albums and artists to get the correct artist. The WHERE clause filters for the artist name 'AC/DC', and the LIMIT 3 ensures we only get three results. The SQL response gives three track names: 'For Those About To Rock (We Salute You)', 'Put The Finger On You', and 'Let's Get It Up'. 

So, the straightforward answer would be to list these three tracks. But I should check if there's any formatting needed. The user might expect a simple list, possibly with bullet points or numbers. Also, should I mention that these are three examples from their discography, or just list them plainly?

Wait, the SQL response comes as a list of tuples, eac

In [35]:
print(response.metadata["sql_query"])

SELECT t.Name FROM tracks AS t JOIN albums AS a ON t.AlbumId = a.AlbumId JOIN artists AS ar ON a.ArtistId = ar.ArtistId WHERE ar.Name = 'AC/DC' LIMIT 3;
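To see what the generated query does without the full database, you can run it against a tiny in-memory SQLite database. This is a sketch under stated assumptions: the three tables below are a cut-down stand-in for the Chinook schema with only the columns the query touches, and the handful of rows are there purely for illustration.

```python
import sqlite3

# In-memory stand-in for the three Chinook tables used by the query.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE artists (ArtistId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE albums  (AlbumId INTEGER PRIMARY KEY, Title TEXT, ArtistId INTEGER);
    CREATE TABLE tracks  (TrackId INTEGER PRIMARY KEY, Name TEXT, AlbumId INTEGER);

    INSERT INTO artists VALUES (1, 'AC/DC'), (2, 'Accept');
    INSERT INTO albums  VALUES (1, 'For Those About To Rock We Salute You', 1),
                               (2, 'Balls to the Wall', 2);
    INSERT INTO tracks  VALUES (1, 'For Those About To Rock (We Salute You)', 1),
                               (2, 'Balls to the Wall', 2),
                               (3, 'Put The Finger On You', 1),
                               (4, 'Let''s Get It Up', 1);
    """
)

# The SQL that the query engine generated above.
sql = (
    "SELECT t.Name FROM tracks AS t "
    "JOIN albums AS a ON t.AlbumId = a.AlbumId "
    "JOIN artists AS ar ON a.ArtistId = ar.ArtistId "
    "WHERE ar.Name = 'AC/DC' LIMIT 3;"
)
track_names = [name for (name,) in conn.execute(sql).fetchall()]
print(track_names)
```

The two joins walk from tracks to albums to artists, so only tracks on AC/DC albums survive the `WHERE` filter. Because the query has no `ORDER BY`, the order of the three rows is not guaranteed.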


## 5. Structured Data Extraction

An important use case for function calling is extracting structured objects. LlamaIndex provides an intuitive interface for this through `structured_predict` - simply define the target Pydantic class (can be nested), and given a prompt, we extract out the desired object.

**NOTE**: The `Replicate` LLM class used in this lab does not go through native function calling, so structured extraction is performed by prompting the LLM and parsing its output.
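The "prompt, then parse" approach can be pictured with the standard library alone. The sketch below is an assumption-laden illustration, not LlamaIndex's implementation: `RestaurantRecord` and `parse_restaurant` are hypothetical names, a dataclass stands in for the Pydantic model, and the raw string is a made-up example of model output.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class RestaurantRecord:
    """Stand-in for the Pydantic class used in the lab (hypothetical name)."""

    name: str
    city: str
    cuisine: str


def parse_restaurant(raw: str) -> RestaurantRecord:
    """Parse the text between the first '{' and the last '}' as JSON and check its fields."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    data = json.loads(raw[start : end + 1])
    expected = {f.name for f in fields(RestaurantRecord)}
    missing = expected - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return RestaurantRecord(**{key: str(data[key]) for key in expected})


# Hypothetical raw model output, for illustration only.
raw_output = 'Sure! {"name": "Casa Mar", "city": "Miami", "cuisine": "Nuevo Latino"}'
print(parse_restaurant(raw_output))
```

The parsing step is the fragile part: if the model wraps the JSON in extra prose or omits a field, extraction fails, which is why the validation raises instead of guessing.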

In [36]:
from llama_index.core.prompts import PromptTemplate
from pydantic import BaseModel


class Restaurant(BaseModel):
    """A restaurant with name, city, and cuisine."""

    name: str
    city: str
    cuisine: str

llm = llm_replicate


prompt_tmpl = PromptTemplate(
    "Generate a restaurant in a given city {city_name}"
)

In [37]:
restaurant_obj = llm.structured_predict(
    Restaurant, prompt_tmpl, city_name="Miami"
)
print(restaurant_obj)

name='Casa Mar' city='Miami' cuisine='Nuevo Latino'


## 6. Adding Chat History to RAG (Chat Engine)

In this section we create a stateful chatbot from a RAG pipeline, with our chat engine abstraction.

Unlike a stateless query engine, the chat engine maintains conversation history (through a memory module like buffer memory). It performs retrieval given a condensed question, and feeds the condensed question + context + chat history into the final LLM prompt.

Related resource: https://docs.llamaindex.ai/en/stable/examples/chat_engine/chat_engine_condense_plus_context/
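The memory module's job is to keep recent history within a size budget (the `token_limit` passed to the buffer below). Here is a minimal sketch of that idea, assuming whitespace-separated words as a rough stand-in for tokens; `RollingMemory` is a hypothetical class and not how `ChatMemoryBuffer` is actually implemented.

```python
from collections import deque
from typing import Deque, List, Tuple

Message = Tuple[str, str]  # (role, content)


class RollingMemory:
    """Keep the most recent messages whose combined size fits a token budget."""

    def __init__(self, token_limit: int) -> None:
        self.token_limit = token_limit
        self._messages: Deque[Message] = deque()

    @staticmethod
    def _count(content: str) -> int:
        # Assumption: whitespace-separated words approximate tokens.
        return len(content.split())

    def put(self, role: str, content: str) -> None:
        """Append a message to the full history."""
        self._messages.append((role, content))

    def get(self) -> List[Message]:
        """Return the newest messages that fit the budget, in chronological order."""
        kept: List[Message] = []
        used = 0
        # Walk backwards so the newest messages are considered first.
        for role, content in reversed(self._messages):
            cost = self._count(content)
            if used + cost > self.token_limit:
                break
            kept.append((role, content))
            used += cost
        return list(reversed(kept))


demo = RollingMemory(token_limit=5)
demo.put("user", "one two three")
demo.put("assistant", "four five")
demo.put("user", "six seven eight")
print(demo.get())
```

With a budget of 5 "tokens", the oldest message is dropped and only the last two are passed on, which is the trade-off a small `token_limit` makes in a long conversation.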

In [38]:
from llama_index.core.memory import ChatMemoryBuffer
from llama_index.core.chat_engine import CondensePlusContextChatEngine

memory = ChatMemoryBuffer.from_defaults(token_limit=3900)

chat_engine = CondensePlusContextChatEngine.from_defaults(
    index.as_retriever(),
    memory=memory,
    llm=llm_replicate,
    context_prompt=(
        "You are a chatbot, able to have normal interactions, as well as talk"
        " about the Kendrick and Drake beef."
        "Here are the relevant documents for the context:\n"
        "{context_str}"
        "\nInstruction: Use the previous chat history, or the context above, to interact and help the user."
    ),
    verbose=True,
)

In [39]:
response = chat_engine.chat(
    "Tell me about the songs Drake released in the beef."
)
print(str(response))

Condensed question: Tell me about the songs Drake released in the beef.
<think>
Okay, the user is asking about the songs Drake released during his beef with Kendrick Lamar. I need to pull the relevant info from the provided documents. Let's check the context again.

The documents mention two specific instances. On page 28, there's a detailed section about Drake's "The Heart Part 6" released on May 5. It explains that Drake responded to Kendrick's tracks, using a similar title to Kendrick's "The Heart" series. The song samples Aretha Franklin, addresses Kendrick's accusations, and includes rebuttals to "Not Like Us." It also notes that Drake seems a bit over the beef but still hints at escalation.

Looking through the context, those are the main tracks mentioned. The other page (31) talks about Shaboozey’s Cowboy Carter features but that's not related to Drake's songs. So, the key song here is "The Heart Part 6." The user might be expecting more songs, but based on the given documents, 

In [40]:
response = chat_engine.chat("What about Kendrick?")
print(str(response))

Condensed question: <think>
Alright, the user initially asked about Drake's songs in the beef, and the assistant replied focusing on "The Heart Part 6." Now the follow-up is "What about Kendrick?" The user wants to know about Kendrick's side of the diss tracks.

To make this a standalone question, I need to reference the previous topic (Drake's songs in the beef) and then specify Kendrick's role. The original answer mentioned Kendrick's tracks like "Not Like Us," so the standalone question should ask for details on Kendrick's diss tracks in the same feud. The key is to maintain context without relying on the chat history.

So, "What diss tracks did Kendrick Lamar release in his feud with Drake?" would work. It's clear, standalone, and ties directly to the prior discussion about Drake's releases.
</think>

**Standalone question:**  
"What diss tracks did Kendrick Lamar release in his feud with Drake?"
<think>
Okay, the user just asked, "What about Kendrick?" after I explained Drake's so

## 7. Agents

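Here we build agents on top of the LLM configured above (DeepSeek-R1 by default, or a Llama 3 model if you switch to the commented-out line in the setup). The agents use simple functions, as well as query engines over the documents above, as tools.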

### Agents And Tools

In [41]:
import json
from typing import Sequence, List

from llama_index.core.llms import ChatMessage
from llama_index.core.tools import BaseTool, FunctionTool
from llama_index.core.agent import ReActAgent

import nest_asyncio

nest_asyncio.apply()

### Define Tools

In [42]:
def multiply(a: int, b: int) -> int:
    """Multiply two integers and return the result."""
    return a * b


def add(a: int, b: int) -> int:
    """Add two integers and return the result."""
    return a + b


def subtract(a: int, b: int) -> int:
    """Subtract b from a and return the result."""
    return a - b


def divide(a: int, b: int) -> float:
    """Divide a by b and return the result as a float."""
    return a / b


multiply_tool = FunctionTool.from_defaults(fn=multiply)
add_tool = FunctionTool.from_defaults(fn=add)
subtract_tool = FunctionTool.from_defaults(fn=subtract)
divide_tool = FunctionTool.from_defaults(fn=divide)

### ReAct Agent

In [44]:
agent = ReActAgent.from_tools(
    [multiply_tool, add_tool, subtract_tool, divide_tool],
    llm=llm_replicate,
    verbose=True,
)
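With `verbose=True`, the agent logs each step as a `Thought`, an `Action` naming a tool, an `Action Input` with its arguments, and the resulting `Observation` (see the output in the Querying section). The sketch below shows only the dispatch half of that loop: parsing one model-written step and calling the matching tool. It assumes the step format seen in those logs; `TOOLS`, `STEP_RE`, and `run_action` are hypothetical names, and this is not the `ReActAgent` implementation.

```python
import ast
import re
from typing import Callable, Dict

# Hypothetical tool registry mirroring two of the functions defined above.
TOOLS: Dict[str, Callable[..., float]] = {
    "add": lambda a, b: a + b,
    "multiply": lambda a, b: a * b,
}

# Captures the tool name and the dict literal that follows "Action Input:".
STEP_RE = re.compile(r"Action:\s*(\w+)\s*\nAction Input:\s*(\{.*\})", re.DOTALL)


def run_action(step_text: str) -> float:
    """Parse one 'Action / Action Input' step and return the tool's result."""
    match = STEP_RE.search(step_text)
    if match is None:
        raise ValueError("no Action / Action Input pair found")
    tool_name, raw_args = match.group(1), match.group(2)
    kwargs = ast.literal_eval(raw_args)  # parses the dict literal without eval()
    return TOOLS[tool_name](**kwargs)


# Replays the two tool calls needed for (121 + 2) * 5.
first = run_action("Thought: I need a tool.\nAction: add\nAction Input: {'a': 121, 'b': 2}")
second = run_action(f"Action: multiply\nAction Input: {{'a': {first}, 'b': 5}}")
print(first, second)
```

In the real agent, each observation is fed back to the LLM, which then decides whether to call another tool or produce the final answer.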

### Querying

In [45]:
response = agent.chat("What is (121 + 2) * 5?")
print(str(response))

> Running step 52818ee2-0995-4193-8390-130bbb49b328. Step input: What is (121 + 2) * 5?
Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
Action: add
Action Input: {'a': 121, 'b': 2}
Observation: 123
> Running step 76e625d4-45e4-4790-b88d-1a582792cc7f. Step input: None
Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
Action: multiply
Action Input: {'a': 123, 'b': 5}
Observation: 615
> Running step 36211765-2185-4a3c-a4fa-f19744590fa5. Step input: None
Thought: I can answer without using any more tools. I'll use the user's language to answer
Answer: The result of (121 + 2) * 5 is 615.
The result of (121 + 2) * 5 is 615.


### ReAct Agent With RAG QueryEngine Tools

In [46]:
from llama_index.core import (
    SimpleDirectoryReader,
    VectorStoreIndex,
    StorageContext,
    load_index_from_storage,
)

from llama_index.core.tools import QueryEngineTool, ToolMetadata

### Create ReAct Agent using RAG QueryEngine Tools

In [47]:
drake_tool = QueryEngineTool(
    drake_index.as_query_engine(),
    metadata=ToolMetadata(
        name="drake_search",
        description="Useful for searching over Drake's life.",
    ),
)

kendrick_tool = QueryEngineTool(
    kendrick_index.as_query_engine(),
    metadata=ToolMetadata(
        name="kendrick_search",
        description="Useful for searching over Kendrick's life.",
    ),
)

query_engine_tools = [drake_tool, kendrick_tool]

In [48]:
agent = ReActAgent.from_tools(
    query_engine_tools,
    llm=llm_replicate,
    verbose=True,
)

### Querying

In [49]:
response = agent.chat("Tell me about how Kendrick and Drake grew up")
print(str(response))

> Running step c7b40d9f-f06e-484b-aa58-e4c6f030c804. Step input: Tell me about how Kendrick and Drake grew up
Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
Action: kendrick_search
Action Input: {'input': 'early life and upbringing'}
Observation: <think>
Okay, let's tackle this query about Kendrick Lamar's early life and upbringing using the provided context. First, I need to make sure I cover all the relevant points from the text.

The user is asking about his early life, so I should start with his education. The context mentions Centennial High School, where his nickname was "Man-Man" because of his precocious behavior. That's an interesting detail to include. He was a quiet, observant student who did well academically but had a stutter. His first-grade teacher at Robert E. McNair Elementary encouraged him to write after he used the word "audacity" correctly. That's a good point about his early e