# 🔬 Virtual Lab 4: Running OpenAI Models on LlamaIndex & LangChain  

<div style="border: 2px solid #4CAF50; padding: 15px; border-radius: 10px; background-color: #f4f4f4;">

### 🚀 **Platform**  
**OpenAI**  

### 🏷️ **Models Used**  
- **gpt-4o-mini**  
- **gpt-3.5-turbo**  

### 🛠️ **Frameworks Used**  
- **LlamaIndex**  
- **LangChain / LangGraph**  

</div>

In [1]:
!pip install pypdf2

Collecting pypdf2
  Downloading pypdf2-3.0.1-py3-none-any.whl.metadata (6.8 kB)
Downloading pypdf2-3.0.1-py3-none-any.whl (232 kB)
Installing collected packages: pypdf2
Successfully installed pypdf2-3.0.1


In [2]:
!pip show pypdf2

Name: PyPDF2
Version: 3.0.1
Summary: A pure-python PDF library capable of splitting, merging, cropping, and transforming PDF files
Home-page: 
Author: 
Author-email: Mathieu Fenniak <biziqe@mathieu.fenniak.net>
License: 
Location: /Users/denesh/Desktop/IIT-Chicago/Sem-3/SPM/VirtualLabs/Lab_4/venv_lab_4/lib/python3.12/site-packages
Requires: 
Required-by: 


In [3]:
!pip install langchain langchain-core openai

Collecting langchain
  Downloading langchain-0.3.19-py3-none-any.whl.metadata (7.9 kB)
Collecting langchain-core
  Downloading langchain_core-0.3.40-py3-none-any.whl.metadata (5.9 kB)
Collecting langchain-text-splitters<1.0.0,>=0.3.6 (from langchain)
  Using cached langchain_text_splitters-0.3.6-py3-none-any.whl.metadata (1.9 kB)
Collecting langsmith<0.4,>=0.1.17 (from langchain)
  Downloading langsmith-0.3.11-py3-none-any.whl.metadata (14 kB)
Collecting jsonpatch<2.0,>=1.33 (from langchain-core)
  Using cached jsonpatch-1.33-py2.py3-none-any.whl.metadata (3.0 kB)
Collecting jsonpointer>=1.9 (from jsonpatch<2.0,>=1.33->langchain-core)
  Using cached jsonpointer-3.0.0-py2.py3-none-any.whl.metadata (2.3 kB)
Collecting orjson<4.0.0,>=3.9.14 (from langsmith<0.4,>=0.1.17->langchain)
  Using cached orjson-3.10.15-cp312-cp312-macosx_10_15_x86_64.macosx_11_0_arm64.macosx_10_15_universal2.whl.metadata (41 kB)
Collecting requests-toolbelt<2.0.0,>=1.0.0 (from langsmith<0.4,>=0.1.17->langchain)
  

In [4]:
!pip install -U langchain-community

Collecting langchain-community
  Downloading langchain_community-0.3.18-py3-none-any.whl.metadata (2.4 kB)
Collecting pydantic-settings<3.0.0,>=2.4.0 (from langchain-community)
  Downloading pydantic_settings-2.8.1-py3-none-any.whl.metadata (3.5 kB)
Collecting httpx-sse<1.0.0,>=0.4.0 (from langchain-community)
  Using cached httpx_sse-0.4.0-py3-none-any.whl.metadata (9.0 kB)
Downloading langchain_community-0.3.18-py3-none-any.whl (2.5 MB)
Using cached httpx_sse-0.4.0-py3-none-any.whl (7.8 kB)
Downloading pydantic_settings-2.8.1-py3-none-any.whl (30 kB)
Installing collected packages: httpx-sse, pydantic-settings, langchain-community
Successfully installed httpx-sse-0.4.0 langchain-community-0.3.18 pydantic-settings-2.8.1


In [5]:
!pip install -U langchain langchain-core openai langchain-community



In [6]:
!pip install -U langchain-openai langchain langchain-core langchain-community

Collecting langchain-openai
  Downloading langchain_openai-0.3.7-py3-none-any.whl.metadata (2.3 kB)
Downloading langchain_openai-0.3.7-py3-none-any.whl (55 kB)
Installing collected packages: langchain-openai
Successfully installed langchain-openai-0.3.7


In [1]:
import os
import openai
import requests
import zipfile
import sqlite3
import json
from sqlalchemy import create_engine, text
from pydantic import BaseModel
from PyPDF2 import PdfReader
from langchain_openai import OpenAIEmbeddings
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_core.documents import Document

### OpenAI API Setup & Configuration

In this section, we set up the OpenAI API client and define a function (`call_gpt`)  
to interact with **GPT-4o Mini**.

In [2]:
# Set OpenAI API Key
#os.environ["OPENAI_API_KEY"] = "add-your-api-key"
from Constants import Constants
openai.api_key = Constants.OPENAI_API_KEY

In [3]:
# Initialize OpenAI client
client = openai.OpenAI(api_key=Constants.OPENAI_API_KEY)
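
The commented-out line above hints at reading the key from an environment variable instead of a `Constants` module. A minimal sketch of that alternative follows; the helper name `resolve_api_key` is hypothetical and is not part of the lab code.

```python
import os


def resolve_api_key(env=os.environ, var_name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    `resolve_api_key` is a hypothetical helper used only for illustration.
    """
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set")
    return key
```

The result could then be passed as `api_key=` when constructing the client, which keeps the secret out of the notebook itself.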

In [5]:
# Set global model configuration
llm_config = {"model": "gpt-4o-mini"}

In [7]:
# Call complete with a prompt
def call_gpt(prompt):
    response = client.chat.completions.create(
        model=llm_config["model"],
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7
    )
    return response.choices[0].message.content


### Document Download & Text Extraction

This section downloads PDF documents related to **Drake and Kendrick Lamar**,  
extracts their text content, and loads them for further processing.

- **Download PDFs**: Fetches the documents from Dropbox and saves them locally.  
- **Extract Text**: Reads the PDFs using `PyPDF2` and converts them into plain text.  
- **Load Documents**: Stores the extracted text in the `docs` dictionary (keys `drake_kendrick_beef`, `drake`, `kendrick`)  
  for querying and analysis.

In [8]:
# Function to download PDFs
def download_file(url, filepath):
    response = requests.get(url, stream=True, timeout=30)
    response.raise_for_status()  # Fail loudly instead of saving an error page as a PDF
    with open(filepath, "wb") as file:
        for chunk in response.iter_content(chunk_size=1024):
            file.write(chunk)

os.makedirs("data", exist_ok=True)

In [10]:
# Download documents
pdf_urls = {
    "drake_kendrick_beef": "https://www.dropbox.com/scl/fi/t1soxfjdp0v44an6sdymd/drake_kendrick_beef.pdf?rlkey=u9546ymb7fj8lk2v64r6p5r5k&st=wjzzrgil&dl=1",
    "drake": "https://www.dropbox.com/scl/fi/nts3n64s6kymner2jppd6/drake.pdf?rlkey=hksirpqwzlzqoejn55zemk6ld&st=mohyfyh4&dl=1",
    "kendrick": "https://www.dropbox.com/scl/fi/8ax2vnoebhmy44bes2n1d/kendrick.pdf?rlkey=fhxvn94t5amdqcv9vshifd3hj&st=dxdtytn6&dl=1"
}

In [11]:
for name, url in pdf_urls.items():
    download_file(url, f"data/{name}.pdf")

In [12]:
# Function to extract text from PDFs
def extract_text_from_pdf(filepath):
    with open(filepath, "rb") as file:
        reader = PdfReader(file)
        # Extract each page once; pages that yield no text (e.g. scanned images) are skipped below
        page_texts = [page.extract_text() for page in reader.pages]
    return "\n\n".join(t for t in page_texts if t)

In [13]:
# Load documents
docs = {
    "drake_kendrick_beef": extract_text_from_pdf("data/drake_kendrick_beef.pdf"),
    "drake": extract_text_from_pdf("data/drake.pdf"),
    "kendrick": extract_text_from_pdf("data/kendrick.pdf")
}

In [14]:
# Initialize OpenAI Embeddings and Vector Store
embedding_model = OpenAIEmbeddings(model="text-embedding-3-small", api_key= Constants.OPENAI_API_KEY)
vector_store = InMemoryVectorStore(embedding=embedding_model)

In [15]:
# Add documents to the vector store
# (`content` avoids shadowing the `text` function imported from sqlalchemy)
for name, content in docs.items():
    doc = Document(page_content=content, metadata={"source": name})
    vector_store.add_documents([doc])

print(vector_store)

<langchain_core.vectorstores.in_memory.InMemoryVectorStore object at 0x133b92ed0>
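
Each PDF above is embedded as one large `Document`. A common refinement is to split the text into overlapping chunks before embedding, so retrieval can return a focused passage. The sketch below is a minimal character-based splitter; `chunk_text` and its default sizes are assumptions for illustration, not part of the lab.

```python
def chunk_text(text, chunk_size=1000, overlap=200):
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper: sizes are in characters, not model tokens.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    # Each window starts `step` characters after the previous one
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

Each chunk could then be wrapped in its own `Document` with the same `source` metadata.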


### Basic GPT-4o Mini Query & Streaming Response

This section demonstrates how to interact with **GPT-4o Mini** using both standard  
and streaming responses.

- **Basic Completion**: Calls `call_gpt()` to get a simple text-based response.  
- **Streaming Response**: Uses `stream_gpt()` to receive output in real-time,  
  printing the response incrementally as it's generated.  

In [16]:
response = call_gpt("Do you like Drake or Kendrick better?")
print(response)

I don't have personal preferences, but I can tell you that both Drake and Kendrick Lamar have made significant impacts on the music industry. Drake is known for his catchy hooks and versatility across genres, while Kendrick is celebrated for his lyrical depth and storytelling. Fans often have strong opinions about both artists, and their styles cater to different tastes. Who do you prefer?


In [17]:
# Streaming response
def stream_gpt(prompt):
    response = client.chat.completions.create(
        model=llm_config["model"],
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7,
        stream=True
    )
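    for chunk in response:
        # Some chunks (e.g. the final one) carry no text; skip them so "None" is never printed
        if chunk.choices and chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="")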

stream_gpt("You're a Drake fan. Tell me why you like Drake more than Kendrick.")


As a Drake fan, I appreciate his versatility and ability to blend different genres seamlessly. Drake has a unique talent for crafting catchy hooks and relatable lyrics that resonate with a wide audience. His storytelling often reflects personal experiences, making his music feel authentic and accessible.

Additionally, Drake excels in creating memorable collaborations and has a knack for tapping into current trends, which keeps his sound fresh. His ability to switch between rap and singing allows him to explore a range of emotions and themes, from introspection to celebration.

While Kendrick Lamar is undoubtedly a brilliant artist with profound lyrical depth and social commentary, I find that Drake’s emphasis on vibe, melody, and catchy production aligns more with my personal taste. His music often feels like the soundtrack to various moments in life, whether it’s a party, a late-night drive, or a reflective moment. Ultimately, it comes down to a preference for the feel and flow of Dr
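
When the streamed text is needed afterwards (for logging or chat history), the deltas can be collected instead of only printed. The sketch below assumes chunks shaped like the ones iterated in `stream_gpt` (`choices[0].delta.content`); `collect_stream` and `fake_chunk` are hypothetical helpers, and the fake chunks only stand in for real API objects.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Join the text deltas of a streamed response into one string."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue  # some chunks carry no choices at all
        delta = chunk.choices[0].delta.content
        if delta:  # skip None or empty deltas
            parts.append(delta)
    return "".join(parts)


def fake_chunk(content):
    """Build a stand-in object with the same attribute path as a stream chunk."""
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])
```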

### Multi-Turn Chat with GPT-4o Mini

This section demonstrates **structured conversations** with GPT-4o Mini using a list of messages.

- **Role-Based Messages**: The model is assigned a **system role** (e.g., acting as Kendrick).  
- **User Interaction**: The user provides an input query, and GPT-4o Mini generates a response.    

In [18]:
# Call chat with a list of messages
def chat_with_gpt(messages):
    response = client.chat.completions.create(
        model=llm_config["model"],
        messages=messages,
        temperature=0.7
    )
    return response.choices[0].message.content

messages = [
    {"role": "system", "content": "You are Kendrick."},
    {"role": "user", "content": "Write a verse."},
]
response = chat_with_gpt(messages)

In [19]:
print(response)

Sure, here’s a verse for you:

In the city lights, where dreams collide,  
Chasing shadows, never let them slide,  
Heartbeats echo in the midnight air,  
Rising from the struggle, yeah, we’re almost there.  

Every scar a story, every tear a fight,  
Painting life in colors, turning dark to bright,  
With the fire in my soul, I’m breaking through the chains,  
Building empires from the ashes, dancing in the rain.  

Hope you like it!
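
The example above sends a single user turn. To continue the conversation, the assistant's reply and the next user message are appended to the same list before calling the model again. A minimal sketch, with `continue_conversation` as a hypothetical helper name:

```python
def continue_conversation(messages, assistant_reply, next_user_message):
    """Return a new message list extended with the reply and a follow-up turn.

    The input list is left unchanged so earlier turns can be reused.
    """
    return messages + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": next_user_message},
    ]
```

The returned list can be passed straight to `chat_with_gpt`.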


### Basic RAG (Retrieval-Augmented Generation) - Vector Search

This section demonstrates **retrieving and answering questions** from documents  
using **GPT-4o Mini**.

- **Query-Based Search**: Uses `query_rag_with_embedding()` to fetch the most relevant documents from the vector store.  
- **Contextual Responses**: The model is provided with document content to generate informed answers.   
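
Conceptually, vector search ranks stored embeddings by how close they are to the query embedding. The sketch below illustrates that ranking with toy two-dimensional vectors, assuming cosine similarity as the metric; the helper names and the vectors are made up for illustration and do not come from the real embedding model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=3):
    """Return the names of the k documents most similar to the query."""
    ranked = sorted(
        doc_vecs.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]
```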

In [20]:
def query_rag_with_embedding(prompt, top_k=3, max_tokens=3000):
    # Note: max_tokens truncates each document by characters, not model tokens
    relevant_docs = vector_store.similarity_search(prompt, k=top_k)
    context = "\n\n".join([doc.page_content[:max_tokens] for doc in relevant_docs])
    full_prompt = f"Using the following retrieved information, answer the question: {prompt}\n\n{context}"

    return call_gpt(full_prompt)

In [21]:
response = query_rag_with_embedding("Tell me about family matters")
print(response)

The information provided focuses primarily on the ongoing feud between two prominent hip-hop artists, Kendrick Lamar and Drake, rather than family matters in a traditional sense. However, both artists do have notable familial connections that are mentioned in their profiles.

### Kendrick Lamar:
- **Family**: Kendrick Lamar Duckworth, born June 17, 1987, in Compton, California, has familial ties to the music industry through his cousin, Baby Keem, who is also a rapper. Additionally, he is related to former NBA player Nick Young.
- **Personal Life**: Kendrick is engaged to Whitney Alford, with whom he has been in a long-term relationship since their high school years. They have two children together.

### Drake:
- **Family**: Aubrey Drake Graham, born October 24, 1986, in Toronto, Ontario, has notable family relationships, including his uncle Larry Graham, who is a musician, and Teenie Hodges, another uncle with musical connections.
- **Personal Life**: Drake has one child, Adonis, with

### Basic RAG (Retrieval-Augmented Generation) - Summarization

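This section demonstrates **condensing document content** using **GPT-4o Mini**.

- **Context-Based Summarization**: Redefines `query_rag_with_embedding()` so that each retrieved document is truncated to `max_doc_length` characters (1,000 by default) before it is passed to the model.  
- **Efficient Information Extraction**: The shorter context tends to produce a more concise response, at the cost of dropping details beyond the cutoff.  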

In [22]:
def query_rag_with_embedding(prompt, top_k=3, max_doc_length=1000):
    relevant_docs = vector_store.similarity_search(prompt, k=top_k)    
    truncated_docs = [doc.page_content[:max_doc_length] for doc in relevant_docs]
    context = "\n\n".join(truncated_docs)
    
    full_prompt = f"Using the following retrieved information, answer the question: {prompt}\n\n{context}"
    
    response = call_gpt(full_prompt)
    
    return response


In [23]:
response = query_rag_with_embedding("Tell me about family matters")
print(response)

The information provided does not directly address "family matters." Instead, it focuses on the ongoing beef between two prominent hip-hop artists, Kendrick Lamar and Drake, detailing their recent diss exchanges and backgrounds. 

However, if you are interested in family matters related to either artist, here's a brief overview based on the retrieved information:

- **Drake (Aubrey Drake Graham)** has one child and is related to notable musicians, including his uncles Larry Graham and Teenie Hodges.
- **Kendrick Lamar (Kendrick Lamar Duckworth)** has been engaged to Whitney Alford since 2015 and has a focus on social commentary in his music.

If you have a specific aspect of family matters you would like to explore further, please let me know!


### Advanced RAG (Routing & Sub-Questions)

This section implements an **intelligent query router** that determines whether  
to perform **vector search** or **summarization** based on the user's intent.

- **Automatic Routing**: GPT-4o Mini decides if a query requires **search** (fact retrieval)  
  or **summary** (document overview).  
- **Dynamic Query Processing**: The model selects the appropriate approach and generates a response.   
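
The router relies on the model replying with exactly `search` or `summary`. Models sometimes add quotes, capitals, or a trailing period, so a small normalization step can make the branch more predictable. In this sketch, `parse_mode` and its fallback to `search` are assumptions; the router below instead falls through to a generic prompt for unknown replies.

```python
def parse_mode(raw, default="search"):
    """Normalize a routing reply to 'search' or 'summary'.

    Hypothetical helper: strips whitespace, quotes, and periods,
    then falls back to `default` for anything unrecognized.
    """
    cleaned = raw.strip().strip("'\".").lower()
    return cleaned if cleaned in {"search", "summary"} else default
```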

In [24]:
def query_router_with_embedding(prompt, top_k=3, max_tokens=3000):
    routing_prompt = (
        "Determine the best mode (search or summary) to process the given user query based on intent. "
        "Return only 'search' if the query seeks specific facts, or 'summary' if the query requires summarization. "
        "Respond with only 'search' or 'summary'.\n\n"
        f"User Query: {prompt}"
    )

    response = client.chat.completions.create(
        model=llm_config["model"],
        messages=[{"role": "user", "content": routing_prompt}],
        temperature=0
    )

    mode = response.choices[0].message.content.strip().lower()

    relevant_docs = vector_store.similarity_search(prompt, k=top_k)
    context = "\n\n".join([doc.page_content[:max_tokens] for doc in relevant_docs]) 

    if mode == "search":
        full_prompt = f"Using the following retrieved information, find specific facts related to: {prompt}\n\n{context}"
    elif mode == "summary":
        full_prompt = f"Summarize the document with respect to: {prompt}\n\n{context}"
    else:
        full_prompt = f"{prompt}\n\n{context}" 

    return call_gpt(full_prompt)

In [25]:
response_search = query_router_with_embedding("Tell me about the song 'Meet the Grahams' - why is it significant")
print(response_search)

The song "Meet the Grahams" is significant as it emerges from a notable period of rivalry between two prominent figures in hip hop, Kendrick Lamar and Drake. This track is part of a broader narrative in which both artists have engaged in a series of diss tracks that have reshaped the dynamics of rap competition. The song highlights the intensity of their feud, particularly following Kendrick's declaration of "war" in his recent work, signifying a shift from friendly competition to more personal attacks.

In the context of this evolving conflict, "Meet the Grahams" serves as a pivotal moment that amplifies the stakes in their rivalry. Kendrick’s verses are laden with direct references and veiled insults aimed at Drake, which marks a departure from previous interactions that were often more subdued and less confrontational. This escalation is significant for both artists’ legacies, as it not only underscores their individual styles and lyrical prowess but also alters the landscape of hip

In [26]:
response_summary = query_router_with_embedding("Summarize the significance of 'Meet the Grahams'")
print(response_summary)

The document discusses the recent escalation of the feud between rappers Kendrick Lamar and Drake, highlighting its cultural significance within the hip-hop community. The rivalry has intensified, with both artists releasing diss tracks targeting each other in quick succession, marking a shift in how rap conflicts are engaged. Kendrick Lamar's latest verses have been particularly aggressive, signaling a clear declaration of animosity towards Drake. This ongoing conflict is not only reshaping their individual legacies but also altering the landscape of hip-hop, emphasizing the competitive nature of the genre. The document underscores that this beef has significant implications for the future dynamics of rap music, reflecting broader themes of rivalry and artistic expression in contemporary culture.


### Break Complex Questions into Sub-Questions

This section implements a **Sub-Question Query Engine** that determines  
whether a query is related to **Drake or Kendrick Lamar** and retrieves  
relevant information accordingly.

- **Automatic Subject Classification**: GPT-4o Mini classifies the query as  
  related to **Drake** or **Kendrick** before fetching data.  
- **Targeted Query Execution**: Retrieval is meant to focus on the Drake document  
  when the query is about Drake, and on the Kendrick document otherwise.   

In [27]:
def determine_subject(prompt):
    classification_prompt = (
        "Determine whether the following question is about 'Drake' or 'Kendrick Lamar'. "
        "Return only 'drake' or 'kendrick'.\n\n"
        f"User Question: {prompt}"
    )
    
    response = client.chat.completions.create(
        model=llm_config["model"],
        messages=[{"role": "user", "content": classification_prompt}],
        temperature=0
    )
    
    return response.choices[0].message.content.strip().lower()

In [28]:
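def sub_question_query_engine_with_embedding(prompt, top_k=3, max_tokens=3000):
    subject = determine_subject(prompt)

    # Restrict retrieval to the classified artist's PDF plus the shared beef document.
    # Any label other than "drake" is treated as Kendrick, matching the original else branch.
    artist = "drake" if subject == "drake" else "kendrick"
    allowed_sources = {artist, "drake_kendrick_beef"}
    relevant_docs = vector_store.similarity_search(
        prompt,
        k=top_k,
        filter=lambda doc: doc.metadata.get("source") in allowed_sources,
    )

    # Note: max_tokens caps characters per document, not model tokens
    context = "\n\n".join([doc.page_content[:max_tokens] for doc in relevant_docs])

    full_prompt = f"Using the following retrieved information, answer the question: {prompt}\n\n{context}"

    return call_gpt(full_prompt)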


In [29]:
response = sub_question_query_engine_with_embedding("Which albums did Drake release in his career?")
print(response)

Drake has released the following albums in his career:

1. **Thank Me Later** (2010)
2. **Take Care** (2011)
3. **Nothing Was the Same** (2013)
4. **Views** (2016)
5. **Scorpion** (2018)
6. **Certified Lover Boy** (2021)
7. **Honestly, Nevermind** (2022)
8. **Her Loss** (2022, collaborative album with 21 Savage)
9. **For All the Dogs** (2023)

These albums have collectively contributed to Drake's status as a significant figure in contemporary music, achieving commercial success and critical acclaim.


### Text-to-SQL with GPT-4o Mini

This section demonstrates **converting natural language queries into SQL**  
to retrieve data from an SQLite database.

- **Database Setup**:  
  - Downloads and extracts the **Chinook SQLite database**, which contains  
    music-related tables like `albums`, `artists`, and `tracks`.  
  - Initializes a connection to `chinook.db` using SQLAlchemy.  

- **SQL Query Generation**:  
  - Uses GPT-4o Mini to **convert natural language questions into SQL queries**.  
  - Restricts queries to the tables: `albums`, `artists`, and `tracks`.  

- **Query Execution**:  
  - Runs the generated SQL queries on the database and retrieves the results.  

This setup allows **seamless querying of structured data** using natural language. 🚀

In [30]:
# `download_file` is already defined in the document-download section; reuse it here.
# Create data directory
os.makedirs("data", exist_ok=True)

In [31]:
download_file("https://www.sqlitetutorial.net/wp-content/uploads/2018/03/chinook.zip", "data/chinook.zip")
with zipfile.ZipFile("data/chinook.zip", "r") as zip_ref:
    zip_ref.extractall("data/")

In [32]:
engine = create_engine("sqlite:///data/chinook.db")

In [33]:
tables_schema = {
    "albums": "AlbumId, Title, ArtistId",
    "artists": "ArtistId, Name",
    "tracks": "TrackId, Name, AlbumId, Composer, MediaTypeId, GenreId, Milliseconds, Bytes, UnitPrice"
}

In [34]:
schema_docs = [
    Document(page_content=f"Table: {table}\nColumns: {columns}", metadata={"table": table})
    for table, columns in tables_schema.items()
]

In [35]:
vector_store.add_documents(schema_docs)

['90376454-bfe2-46b3-8bc8-37bc41d98157',
 '221452eb-4a41-4857-8cde-59530dd85228',
 'cefa6c06-b47b-44d1-a0ec-07d1bed41bb0']

In [36]:
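def retrieve_relevant_schema(prompt, top_k=2):
    # The store also holds PDF text; keep only schema documents (tagged with "table" metadata)
    is_schema = lambda doc: "table" in doc.metadata
    relevant_schema = vector_store.similarity_search(prompt, k=top_k, filter=is_schema)
    return "\n\n".join([doc.page_content for doc in relevant_schema])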

In [37]:
def generate_sql_query_with_embedding(natural_language_query):
    relevant_schema = retrieve_relevant_schema(natural_language_query)

    prompt = f"""
    Convert the following natural language question into a SQL query for a SQLite database.
    Use only the relevant schema details provided below:

    {relevant_schema}

    **Now generate a SQL query for the following request:**
    
    "{natural_language_query}"
    
    **Return only the SQL query, without any explanation.**
    """

    response = client.chat.completions.create(
        model=llm_config["model"],
        messages=[{"role": "user", "content": prompt}],
        temperature=0
    )

    sql_query = response.choices[0].message.content.strip()

    if "```sql" in sql_query:
        sql_query = sql_query.split("```sql")[1].split("```")[0].strip()

    return sql_query

In [38]:
from sqlalchemy.sql import text  

def query_sql_database_with_embedding(natural_language_query):
    sql_query = generate_sql_query_with_embedding(natural_language_query)
    print(f"Generated SQL Query: {sql_query}")
    with engine.connect() as connection:
        result = connection.execute(text(sql_query))  
        return [row for row in result]
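
The cell above executes whatever SQL the model returns. Since this lab only reads data, a simple guard can reject anything that is not a single `SELECT` statement before it reaches the database. The check below is a heuristic sketch, not a security boundary; `is_read_only_select` is a hypothetical helper, and it is deliberately conservative (for example, it also rejects queries that use the `REPLACE()` function or contain a semicolon inside a string literal).

```python
import re

# Keywords that should never appear in a read-only query (conservative list)
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|create|replace|attach|pragma)\b",
    re.IGNORECASE,
)


def is_read_only_select(sql):
    """Return True only for a single statement that starts with SELECT."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        return False  # more than one statement
    if not re.match(r"select\b", statement, re.IGNORECASE):
        return False
    return FORBIDDEN.search(statement) is None
```

A caller could raise an error when the check fails instead of passing the query to `connection.execute`.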

In [39]:
response_albums = query_sql_database_with_embedding("What are some albums?")
print("Albums:", response_albums)

Generated SQL Query: SELECT * FROM albums;
Albums: [(1, 'For Those About To Rock We Salute You', 1), (2, 'Balls to the Wall', 2), (3, 'Restless and Wild', 2), (4, 'Let There Be Rock', 1), (5, 'Big Ones', 3), (6, 'Jagged Little Pill', 4), (7, 'Facelift', 5), (8, 'Warner 25 Anos', 6), (9, 'Plays Metallica By Four Cellos', 7), (10, 'Audioslave', 8), (11, 'Out Of Exile', 8), (12, 'BackBeat Soundtrack', 9), (13, 'The Best Of Billy Cobham', 10), (14, 'Alcohol Fueled Brewtality Live! [Disc 1]', 11), (15, 'Alcohol Fueled Brewtality Live! [Disc 2]', 11), (16, 'Black Sabbath', 12), (17, 'Black Sabbath Vol. 4 (Remaster)', 12), (18, 'Body Count', 13), (19, 'Chemical Wedding', 14), (20, 'The Best Of Buddy Guy - The Millenium Collection', 15), (21, 'Prenda Minha', 16), (22, 'Sozinho Remix Ao Vivo', 16), (23, 'Minha Historia', 17), (24, 'Afrociberdelia', 18), (25, 'Da Lama Ao Caos', 18), (26, 'Acústico MTV [Live]', 19), (27, 'Cidade Negra - Hits', 19), (28, 'Na Pista', 20), (29, 'Axé Bahia 2001', 21)

In [40]:
response_artists = query_sql_database_with_embedding("What are some artists? Limit it to 5.")
print("Artists:", response_artists)

Generated SQL Query: SELECT * FROM artists LIMIT 5;
Artists: [(1, 'AC/DC'), (2, 'Accept'), (3, 'Aerosmith'), (4, 'Alanis Morissette'), (5, 'Alice In Chains')]


In [41]:
response_tracks = query_sql_database_with_embedding("What are some tracks from the artist AC/DC? Limit it to 3")
print("AC/DC Tracks:", response_tracks)

Generated SQL Query: SELECT Name FROM tracks WHERE Composer = 'AC/DC' LIMIT 3;
AC/DC Tracks: [('Go Down',), ('Dog Eat Dog',), ('Let There Be Rock',)]


### Structured Data Extraction using GPT-4o Mini

This section demonstrates **extracting structured data** from natural language  
using **GPT-4o Mini** and returning it in **JSON format**.

- **Data Extraction Process**:  
  - Looks up a **restaurant** for the given city in the vector store, and asks the model to generate one only if nothing is retrieved.  
  - The output must be a **valid JSON object** containing:  
    - `name`: The restaurant's name.  
    - `city`: The specified city.  
    - `cuisine`: The type of cuisine served.  

In [42]:
from langchain_core.documents import Document
from pydantic import BaseModel

class Restaurant(BaseModel):
    """A restaurant with name, city, and cuisine."""
    name: str
    city: str
    cuisine: str

In [43]:
restaurant_data = [
    {"name": "Joe's Seafood", "city": "Miami", "cuisine": "Seafood"},
    {"name": "Pasta Paradise", "city": "New York", "cuisine": "Italian"},
    {"name": "Sushi Haven", "city": "San Francisco", "cuisine": "Japanese"},
    {"name": "BBQ King", "city": "Austin", "cuisine": "BBQ"},
]

In [44]:
restaurant_docs = [
    Document(page_content=f"Restaurant: {r['name']}, City: {r['city']}, Cuisine: {r['cuisine']}", metadata={"city": r["city"]})
    for r in restaurant_data
]

vector_store.add_documents(restaurant_docs)


['b6cd8d6a-3b59-4609-ad1d-8cdf2a571c1d',
 '49377792-710e-4632-81a7-3f734748c6bb',
 '07d2e460-0d1e-43d4-ab30-80033a763d58',
 'f67f33a3-fd24-4de6-88fe-3a8ac6026598']

In [45]:
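def retrieve_restaurants_by_city(city_name, top_k=3):
    # Keep only restaurant documents whose "city" metadata matches, so PDF or schema text is never returned
    matches_city = lambda doc: doc.metadata.get("city", "").lower() == city_name.lower()
    relevant_docs = vector_store.similarity_search(city_name, k=top_k, filter=matches_city)
    return relevant_docs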

In [46]:
def extract_structured_data_with_embedding(city_name):
    relevant_restaurants = retrieve_restaurants_by_city(city_name)

    if not relevant_restaurants:
        prompt = f"Generate a restaurant in a given city: {city_name}. Return only a valid JSON object with keys: name, city, cuisine, without markdown formatting."
        response = client.chat.completions.create(
            model=llm_config["model"],
            messages=[{"role": "user", "content": prompt}],
            temperature=0
        )

        structured_data = response.choices[0].message.content.strip()

        if "```json" in structured_data:
            structured_data = structured_data.split("```json")[1].split("```")[0].strip()

        try:
            json_data = json.loads(structured_data)
            return Restaurant.model_validate(json_data)
        except json.JSONDecodeError as e:
            print("JSON Decode Error:", e)
            print("Raw Response:", structured_data)
            return None

    restaurant_list = []
    for doc in relevant_restaurants:
        parts = doc.page_content.split(", ")
        restaurant_dict = {
            "name": parts[0].split(": ")[1],
            "city": parts[1].split(": ")[1],
            "cuisine": parts[2].split(": ")[1],
        }
        restaurant_list.append(Restaurant(**restaurant_dict))

    return restaurant_list[0] if restaurant_list else None  # Return the first matching restaurant

In [48]:
restaurant_obj = extract_structured_data_with_embedding("Miami")
print(restaurant_obj)

name="Joe's Seafood" city='Miami' cuisine='Seafood'
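
The fallback path above only strips fences that are explicitly tagged as JSON. Models also return untagged fences or bare JSON, so a slightly more general parser can help. The sketch below uses only the standard library; `parse_json_reply` is a hypothetical helper, and the fence marker is built programmatically purely so the example nests cleanly in this notebook.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way to avoid nesting issues
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_json_reply(reply):
    """Parse a model reply that is either bare JSON or JSON inside a code fence."""
    match = FENCED_BLOCK.search(reply)
    payload = match.group(1) if match else reply
    return json.loads(payload.strip())
```

The resulting dictionary can then be handed to `Restaurant.model_validate` as in the cell above.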


### Adding Chat History to RAG (Chat Engine)

This section implements a **stateful chatbot** that integrates **chat history**  
with **Retrieval-Augmented Generation (RAG)** to provide more context-aware responses.

- **Chat Memory Management**:  
  - Stores past interactions in `ChatMemoryWithEmbeddings` to maintain conversation flow.  
  - Keeps only the 20 most recent messages in its list to stay within token constraints.  

- **Contextual Retrieval**:  
  - Combines **user input, past chat history, and relevant document context**  
    (e.g., about Kendrick & Drake) to generate informed responses.  
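
A fixed message count is one way to bound history; another is to keep the most recent messages that fit within a size budget. The sketch below uses whitespace-separated words as a rough stand-in for tokens; `trim_history` and its default budget are assumptions for illustration, not part of the lab code.

```python
def trim_history(messages, max_words=50):
    """Keep the newest messages whose combined word count fits the budget."""
    kept, used = [], 0
    # Walk backwards so the most recent turns are kept first
    for message in reversed(messages):
        words = len(message["content"].split())
        if used + words > max_words:
            break
        kept.append(message)
        used += words
    return list(reversed(kept))  # restore chronological order
```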

In [49]:
from langchain_core.documents import Document

class ChatMemoryWithEmbeddings:
    def __init__(self, token_limit=10000): 
        self.token_limit = token_limit
        self.messages = []
        self.vector_store = vector_store  

    def add_message(self, role, content):
        self.messages.append({"role": role, "content": content})

        doc = Document(page_content=content[:1000], metadata={"role": role}) 
        self.vector_store.add_documents([doc])

        if len(self.messages) > 20:  
            self.messages.pop(0)

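    def retrieve_relevant_history(self, prompt, top_k=2):
        # Only chat messages carry "role" metadata; filter out PDFs and other documents in the shared store
        is_chat = lambda doc: "role" in doc.metadata
        relevant_docs = self.vector_store.similarity_search(prompt, k=top_k, filter=is_chat)
        return "\n\n".join([doc.page_content[:500] for doc in relevant_docs])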


In [50]:
memory = ChatMemoryWithEmbeddings()

In [51]:
def chat_with_history_using_embeddings(prompt):
    relevant_history = memory.retrieve_relevant_history(prompt, top_k=2)

    relevant_docs = vector_store.similarity_search(prompt, k=2)
    document_context = "\n\n".join([doc.page_content[:1000] for doc in relevant_docs])  # Limit document size

    context_prompt = (
        "You are a chatbot, able to have normal interactions, as well as talk "
        "about the Kendrick and Drake beef. Use the retrieved chat history and document context:\n\n"
        f"Chat History (trimmed):\n{relevant_history}\n\n"
        f"Relevant Documents (trimmed):\n{document_context}\n\n"
        "Instruction: Use the previous chat history, or the context above, to interact and help the user."
    )

    messages = [
        {"role": "system", "content": context_prompt},
        {"role": "user", "content": prompt}
    ]

    # Rough size check: counts whitespace-separated words as a stand-in for tokens
    total_tokens = sum(len(msg["content"].split()) for msg in messages)
    if total_tokens > 10000:
        print("Warning: Trimming messages to fit within token limit.")
        messages = messages[-5:]  

    response = client.chat.completions.create(
        model=llm_config["model"],
        messages=messages,
        temperature=0.7
    )

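    response_content = response.choices[0].message.content.strip()
    # Record both sides of the turn so follow-ups such as "What about Kendrick?" can draw on them
    memory.add_message("user", prompt)
    memory.add_message("assistant", response_content) 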

    return response_content



In [52]:
response = chat_with_history_using_embeddings("Tell me about the songs Drake released in the beef.")
print(response)

In the recent escalation of the beef between Kendrick Lamar and Drake, Drake released a three-part response that included a music video. While the specific titles of the songs weren't mentioned in the chat history, this kind of release typically involves sharp lyrics aimed at Kendrick, showcasing Drake's lyrical prowess and addressing the ongoing conflict.

Drake's strategy often includes clever wordplay and personal jabs, which can resonate well with his audience. Kendrick, known for his aggressive and introspective style, usually responds with equal intensity. If you’re looking for more detailed information on the specific titles and content of the songs, I can help summarize their themes or lyrical content based on their previous discographies!


In [53]:
response = chat_with_history_using_embeddings("What about Kendrick?")
print(response)

Kendrick Lamar is widely regarded as one of the greatest rappers of all time. He was born on June 17, 1987, in Compton, California, and began his music career under the stage name K.Dot while still in high school. He signed with Top Dawg Entertainment in 2005 and quickly gained recognition for his unique style, which combines personal storytelling with social and political commentary.

Kendrick is known for his intricate lyricism and thought-provoking themes, often addressing issues like race, identity, and systemic inequality. His album "DAMN." won the Pulitzer Prize for Music in 2018, making him the first non-classical or jazz musician to receive this honor.

In the context of his ongoing beef with Drake, Kendrick's approach tends to be more aggressive and introspective, often using his lyrics to make sharp social critiques. If you want to know more about his discography or specific songs, feel free to ask!


## 7. Agents

Here we build agents with gpt-4o-mini. We perform RAG over simple arithmetic functions as well as the documents above.

In [54]:
import nest_asyncio
import json
from langchain_core.documents import Document

nest_asyncio.apply()

In [55]:
# Define mathematical functions
def multiply(a: int, b: int) -> int:
    """Multiply two integers and return the result."""
    return a * b

def add(a: int, b: int) -> int:
    """Add two integers and return the result."""
    return a + b

def subtract(a: int, b: int) -> int:
    """Subtract two integers and return the result."""
    return a - b

def divide(a: int, b: int) -> float | str:
    """Divide two integers; return the quotient, or an error message if b is 0."""
    return a / b if b != 0 else "Cannot divide by zero"

# Function map for the agent
tools = {
    "multiply": multiply,
    "add": add,
    "subtract": subtract,
    "divide": divide
}

In [56]:
function_examples = [
    {"function": "multiply", "query": "What is 5 times 3?", "args": [5, 3]},
    {"function": "add", "query": "What is 10 plus 4?", "args": [10, 4]},
    {"function": "subtract", "query": "What is 20 minus 7?", "args": [20, 7]},
    {"function": "divide", "query": "What is 15 divided by 5?", "args": [15, 5]},
]

In [57]:
function_docs = [
    Document(page_content=f"Query: {ex['query']}, Function: {ex['function']}, Args: {ex['args']}")
    for ex in function_examples
]

vector_store.add_documents(function_docs)

['4b78c853-6601-4f64-a7c4-a44d0d609e48',
 'aee11b5a-11fc-43a2-94b6-522cb819e09c',
 '210ce946-259b-4e7a-add0-c69d6dcdeec9',
 'a8f07c6c-fcde-4212-ba8c-2f7b11914ceb']

In [58]:
def retrieve_relevant_function_examples(prompt, top_k=2):
    relevant_docs = vector_store.similarity_search(prompt, k=top_k)
    return "\n\n".join([doc.page_content for doc in relevant_docs])
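
To make the retrieval step concrete, here is a toy sketch of what a top-k similarity search does. It is an illustration only: `embed`, `cosine`, and `top_k` are hypothetical helpers, and the word-count vectors stand in for the real embedding model used by the vector store.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


docs = [
    "Query: What is 5 times 3?, Function: multiply, Args: [5, 3]",
    "Query: What is 10 plus 4?, Function: add, Args: [10, 4]",
    "Query: What is 20 minus 7?, Function: subtract, Args: [20, 7]",
]
best = top_k("What is 8 plus 1?", docs, k=1)
print(best)
```

The query shares the word "plus" with the `add` example, so that example ranks first. A real embedding model captures similarity in meaning rather than exact word overlap.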


In [59]:
def execute_tool(tool_name, *args):
    if tool_name in tools:
        return tools[tool_name](*args)
    return "Invalid tool request"
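
Because the tool name and arguments come from model output, it can help to validate them before calling the function. The following is a hedged sketch, not part of the original lab: `safe_execute` is a hypothetical variant of `execute_tool` that takes the tool map explicitly and reports a wrong argument count as a message instead of raising.

```python
import inspect
from typing import Any, Callable


def safe_execute(tools: dict[str, Callable[..., Any]], name: str, args: list[Any]) -> Any:
    """Dispatch to a registered tool; report bad names or argument counts as strings."""
    func = tools.get(name)
    if func is None:
        return f"Invalid tool request: {name!r}"
    try:
        # bind() checks that the arguments match the function's signature.
        inspect.signature(func).bind(*args)
    except TypeError as exc:
        return f"Invalid arguments for {name}: {exc}"
    return func(*args)


demo_tools = {"add": lambda a, b: a + b}
print(safe_execute(demo_tools, "add", [2, 3]))
print(safe_execute(demo_tools, "pow", [2, 3]))
print(safe_execute(demo_tools, "add", [2]))
```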


In [60]:
def agent_chat_with_embeddings(prompt):
    relevant_examples = retrieve_relevant_function_examples(prompt)

    system_prompt = (
        "You are a smart assistant capable of performing arithmetic operations. "
        "You can use the following functions: multiply, add, subtract, divide. "
        "Below are relevant past function calls for reference:\n\n"
        f"{relevant_examples}\n\n"
        "When given a math question, return the correct function and inputs in JSON format. "
        "Ensure the JSON output follows this format:\n\n"
        "{\n  \"function\": \"function_name\",\n  \"arguments\": [arg1, arg2]\n}\n\n"
        "Return only valid JSON with no extra text or formatting."
    )

    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": prompt}
    ]

    response = client.chat.completions.create(
        model=llm_config["model"],
        messages=messages,
        temperature=0
    )

    response_content = response.choices[0].message.content.strip()

    try:
        if "```json" in response_content:
            response_content = response_content.split("```json")[1].split("```")[0].strip()

        tool_request = json.loads(response_content)
        tool_name = tool_request.get("function")
        arguments = tool_request.get("arguments", [])

        result = execute_tool(tool_name, *arguments)
        return result
    except json.JSONDecodeError:
        print("Failed to parse response:", response_content)
        return "Error: Could not parse the response as JSON."
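
The parsing step above handles replies wrapped in a `json` code fence. A slightly more defensive version might look like the sketch below. `parse_tool_request` is a hypothetical helper: it accepts fenced or bare JSON and also checks that the fields have the expected types, raising `ValueError` otherwise.

```python
import json
import re

TICKS = "`" * 3  # a Markdown code fence marker, built here to keep this block readable
FENCE = re.compile(TICKS + r"(?:json)?\s*(.*?)" + TICKS, re.DOTALL)


def parse_tool_request(reply: str) -> tuple[str, list]:
    """Return (function, arguments) from a model reply; raise ValueError if malformed."""
    match = FENCE.search(reply)
    payload = match.group(1) if match else reply
    try:
        data = json.loads(payload.strip())
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict) or not isinstance(data.get("function"), str):
        raise ValueError("missing 'function' field")
    args = data.get("arguments", [])
    if not isinstance(args, list):
        raise ValueError("'arguments' must be a list")
    return data["function"], args


fenced = TICKS + 'json\n{"function": "add", "arguments": [1, 2]}\n' + TICKS
print(parse_tool_request(fenced))
print(parse_tool_request('{"function": "multiply", "arguments": [3, 4]}'))
```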


In [62]:
response = agent_chat_with_embeddings("What is (121 + 2) * 5?")
print(response)

615
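
The JSON format requested in the prompt describes a single function call, so for a compound expression such as `(121 + 2) * 5` the model presumably has to fold part of the arithmetic itself before naming one function. One possible extension, shown here only as a sketch, is to let an argument be another call and evaluate the request recursively. The nested format and the `evaluate` helper are assumptions, not something the prompt above asks the model to produce.

```python
from typing import Any, Callable

TOOLS: dict[str, Callable[[float, float], float]] = {
    "add": lambda a, b: a + b,
    "subtract": lambda a, b: a - b,
    "multiply": lambda a, b: a * b,
    "divide": lambda a, b: a / b,
}


def evaluate(request: Any) -> float:
    """Evaluate a plain number or a nested {"function", "arguments"} request."""
    if isinstance(request, (int, float)):
        return request
    func = TOOLS[request["function"]]
    # Evaluate inner calls first, then apply the outer function.
    args = [evaluate(arg) for arg in request["arguments"]]
    return func(*args)


nested = {
    "function": "multiply",
    "arguments": [{"function": "add", "arguments": [121, 2]}, 5],
}
print(evaluate(nested))  # 615
```

With this shape every arithmetic step goes through a tool, so none of the calculation is left to the model.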


### ReAct Agent with RAG QueryEngine Tools

This section implements a **ReAct-style agent** that dynamically selects between  
**Drake-related** and **Kendrick-related** document retrieval using **GPT-4o Mini**.

- **ReAct Framework**:  
  - Uses **reasoning + action** to **select the right tool** for querying.  
  - Determines whether to call **`query_drake`** or **`query_kendrick`** based on the prompt.  

- **Tool-Based Query Execution**:  
  - **GPT-4o Mini** analyzes the query and responds in **JSON format** specifying the correct tool.  
  - The tool is then **executed dynamically** to fetch relevant information.  

Routing each query to the matching artist's documents is meant to keep unrelated passages out of the context passed to the model. 🚀
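
The tool-selection step described above can be sketched as follows. The two query functions are hypothetical stubs standing in for retrievers over each artist's documents, and the `{"tool": ...}` reply shape is an assumption about what the model would be asked to return.

```python
import json
from typing import Callable


def query_drake(question: str) -> str:
    """Stub: a real version would search only the Drake documents."""
    return f"[drake docs] {question}"


def query_kendrick(question: str) -> str:
    """Stub: a real version would search only the Kendrick Lamar documents."""
    return f"[kendrick docs] {question}"


QUERY_TOOLS: dict[str, Callable[[str], str]] = {
    "query_drake": query_drake,
    "query_kendrick": query_kendrick,
}


def run_tool_choice(model_reply: str, question: str) -> str:
    """Parse the model's JSON tool choice and run the selected query tool."""
    choice = json.loads(model_reply)
    name = choice.get("tool")
    if name not in QUERY_TOOLS:
        raise ValueError(f"unknown tool: {name!r}")
    return QUERY_TOOLS[name](question)


reply = '{"tool": "query_kendrick"}'  # what the model might return
print(run_tool_choice(reply, "Where did Kendrick grow up?"))
```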

In [63]:
import json
from langchain_core.documents import Document

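def query_rag_with_embedding(prompt, top_k=3, max_tokens=3000):
    # Retrieve the top_k most similar chunks from the shared vector store.
    relevant_docs = vector_store.similarity_search(prompt, k=top_k)

    # Note: max_tokens caps each chunk by characters (string slice), not by model tokens.
    context = "\n\n".join([doc.page_content[:max_tokens] for doc in relevant_docs])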

    full_prompt = f"Using the following retrieved information, answer the question: {prompt}\n\n{context}"

    return call_gpt(full_prompt)

In [64]:
def determine_subject_using_embeddings(prompt):
    classification_prompt = (
        "Determine whether the following question is about 'Drake' or 'Kendrick Lamar'. "
        "Return only 'drake' or 'kendrick'.\n\n"
        f"User Query: {prompt}"
    )
    
    response = client.chat.completions.create(
        model=llm_config["model"],
        messages=[{"role": "user", "content": classification_prompt}],
        temperature=0
    )
    
    return response.choices[0].message.content.strip().lower()
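
The classifier returns free text, so the label may come back as `Kendrick.` or `kendrick lamar`, and a question can concern both artists even though the prompt asks for one label. A small normalization step, sketched here with a hypothetical `normalize_subject` helper, maps the raw reply onto a fixed set of labels.

```python
def normalize_subject(raw: str) -> str:
    """Map a free-text classifier reply to 'drake', 'kendrick', 'both', or 'unknown'."""
    text = raw.strip().lower()
    has_drake = "drake" in text
    has_kendrick = "kendrick" in text
    if has_drake and has_kendrick:
        return "both"
    if has_drake:
        return "drake"
    if has_kendrick:
        return "kendrick"
    return "unknown"


for reply in ["kendrick", "Kendrick Lamar.", " DRAKE ", "Drake and Kendrick", "neither"]:
    print(repr(reply), "->", normalize_subject(reply))
```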

In [65]:
def react_agent_with_embeddings(prompt):
    # Note: the classified subject is not used further below; retrieval runs over the whole store.
    subject = determine_subject_using_embeddings(prompt)

    relevant_docs = vector_store.similarity_search(prompt, k=3)
    document_context = "\n\n".join([doc.page_content[:1000] for doc in relevant_docs])  # Limit document size

    system_prompt = (
        f"You are an AI assistant capable of retrieving and summarizing information about Drake and Kendrick Lamar. "
        f"Use the retrieved document context below to generate your response:\n\n"
        f"Relevant Documents (trimmed):\n{document_context}\n\n"
        f"Instruction: Answer the user's query using the provided context."
    )

    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": prompt}
    ]

    response = client.chat.completions.create(
        model=llm_config["model"],
        messages=messages,
        temperature=0
    )

    return response.choices[0].message.content.strip()


In [66]:
response = react_agent_with_embeddings("Tell me about how Kendrick and Drake grew up")
print(response)

Kendrick Lamar was born on June 17, 1987, in Compton, California. He began his music career under the stage name K.Dot while still in high school. Growing up in Compton, Kendrick was influenced by the environment around him, which shaped his music and lyrical themes. His work often reflects personal storytelling intertwined with social and political commentary, addressing issues like race, identity, and systemic inequality.

Drake, on the other hand, was born on October 24, 1986, in Toronto, Canada. He grew up in a mixed-race household, with his father being African American and his mother being Jewish Canadian. Drake's early life was marked by his experiences in the entertainment industry, as he started acting in the Canadian teen drama series "Degrassi: The Next Generation." His background in acting and music eventually led him to pursue a career in hip-hop, where he became known for his emotional and introspective style.

While both artists have different backgrounds—Kendrick's root