# Exercise 2: Embedding models

In this exercise, we'll explore embedding models.

Most embedding models take text as input and output a list of floating-point numbers.

That list is called an "embedding" or "vector". The model's **dimensions** is the length of the list: the number of floating-point values it returns for each input.

As with chat completions, the OpenAI SDK has become the de facto standard client for embedding models.

For this exercise, I have pre-computed the embeddings for a "database" of 166 items of clothing for a shop (see data/clothes.json). This will save you time. 

In [37]:
# Your first embedding with a model
from openai import OpenAI
import utils

# If you change the environment variables, you need to restart the kernel
base_url = utils.get_base_url()
api_key = utils.get_api_key()

if utils.MODE == "github":
    model = "text-embedding-3-small"  # A fast, small model
    base_url = "https://models.inference.ai.azure.com"
elif utils.MODE == "ollama":
    model = "nomic-embed-text"  # A comparable open-source model

# The OpenAI client is a class. Older versions of the SDK used module-level globals, so you may still see snippets written in that style.

client = OpenAI(
    base_url=base_url,
    api_key=api_key,
)

def get_embedding(text, dimensions=1024):
    response = client.embeddings.create(
        input=text,
        model=model,
        dimensions=dimensions,  # Default is 1536 for text-embedding-3-small. 1024 is not arbitrary: it is one of the accepted values (256, 512, 1024)
    )
    return response.data[0].embedding

First, let's try getting an embedding for a piece of text.

In [38]:
beans_embedding = get_embedding("delicious beans")
print(len(beans_embedding), beans_embedding[:10])

1024 [-0.012074312195181847, -0.06039705127477646, -0.011071659624576569, -0.06430569291114807, -0.02080078423023224, -0.04166954755783081, -0.001650552498176694, 0.0812997967004776, 0.008726472966372967, -0.05580864101648331]
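
The search cell that follows imports `cosine_similarity` from `utils/embeddings.py`. That file is not shown here, so the block below is only a minimal sketch of the standard cosine similarity formula, not the notebook's actual helper. The name `cosine_similarity_sketch` and the three toy vectors are made up for illustration.

```python
import math


def cosine_similarity_sketch(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 3-dimensional "embeddings" (invented values, not real model output)
hat = [0.9, 0.1, 0.0]
beanie = [0.8, 0.2, 0.1]
shorts = [0.0, 0.2, 0.9]

print(cosine_similarity_sketch(hat, beanie))  # close to 1: the vectors point the same way
print(cosine_similarity_sketch(hat, shorts))  # close to 0: the vectors are nearly orthogonal
```

Because the score depends only on direction, not length, two texts with similar meaning should score close to 1 regardless of how large the individual numbers are.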


In [39]:
import pandas as pd
from utils.embeddings import cosine_similarity  # See utils/embeddings.py for the cosine similarity function (it's not complicated)

data = pd.read_json("data/clothes.json")


def search_df(df, product_description, n=3):
    embedding = get_embedding(product_description, dimensions=1024)
    df['similarities'] = df.embedding.apply(lambda x: cosine_similarity(x, embedding))
    res = df.sort_values('similarities', ascending=False).head(n)
    return res

data

Unnamed: 0,id,name,description,image,price,embedding,image_embedding
0,1,Azure Dream T-Shirt,"A soft, azure-colored t-shirt made from 100% o...",1.jpeg,19.99,"[0.05275312066078101, 0.009046390652656, -0.03...","[-0.6689453, -0.12243652000000001, -1.8681641,..."
1,2,Crimson Night Hoodie,"A warm, crimson hoodie with a kangaroo pocket ...",2.jpeg,39.99,"[-0.014993565157055001, 0.025782626122236002, ...","[1.4003906, -1.3886718999999998, -2.4453125, -..."
2,3,Golden Glow Dress,"A stunning, golden dress with a flowing silhou...",3.jpeg,59.99,"[0.055652309209108006, 0.021441116929054, -0.0...","[0.98291016, -0.10308838000000001, -0.08532715..."
3,4,Emerald Wave Shorts,"Comfortable, emerald green shorts with an elas...",4.jpeg,24.99,"[0.052464514970779, 0.036760117858648, 0.01885...","[1.9863281, 1.3964843999999998, -0.8095703, -2..."
4,5,Sapphire Breeze Jacket,"A lightweight, sapphire blue jacket with a wat...",5.jpeg,49.99,"[0.013225099071860001, 0.025714728981256003, -...","[0.5888671999999999, 0.1887207, -2.6171875, 0...."
...,...,...,...,...,...,...,...
161,162,Amber Glow Cardigan,"A cozy, amber-colored cardigan with chunky kni...",162.jpeg,44.99,"[0.007482758723199001, 0.017736375331878003, -...","[1.59375, 1.4335938, -1.2958984, -1.0791016, -..."
162,163,Midnight Eclipse Dress,"A sleek, midnight black dress with a flatterin...",163.jpeg,89.99,"[0.03593635186553, 0.005488856229931001, 0.014...","[1.6962891, 0.7714844, -0.55322266, -1.3916016..."
163,164,Sunset Haze Scarf,"A lightweight, sunset orange scarf with a soft...",164.jpeg,19.99,"[0.040957491844892, -0.009238531813025001, -0....","[2.1152344, 2.0410156, -1.7548827999999999, 0...."
164,165,Ivory Dream Blouse,"An elegant, ivory blouse with delicate lace de...",165.jpeg,34.99,"[0.06905066221952401, -0.000338043930241, -0.0...","[1.9111327999999999, 0.34960938, 0.08654785, -..."


In [40]:
res = search_df(data, 'wooly hat', n=3)

res

Unnamed: 0,id,name,description,image,price,embedding,image_embedding,similarities
112,113,Slate Gray Beanie,"A cozy, slate gray beanie to keep you warm dur...",113.jpeg,19.99,"[0.020402135327458, -0.021744921803474003, -0....","[-0.75439453, 0.38989258, -2.1875, -0.5605469,...",0.552354
104,105,Blizzard White Beanie,"A snug, blizzard white beanie with a fluffy po...",105.jpeg,12.99,"[0.022073635831475, -0.017169991508126002, -0....","[-1.1884766, 0.40014648, -1.7167968999999998, ...",0.54716
20,21,Golden Harvest Beanie,"A warm, golden-yellow beanie made from soft wo...",21.jpeg,15.99,"[0.028872488066554004, 0.011147356592118001, -...","[-0.5180663999999999, 0.31347656, -2.2246094, ...",0.504835


In [41]:
# This works for many languages, not just English
search_df(data, 'gorro de lana', n=3)  # Spanish

Unnamed: 0,id,name,description,image,price,embedding,image_embedding,similarities
112,113,Slate Gray Beanie,"A cozy, slate gray beanie to keep you warm dur...",113.jpeg,19.99,"[0.020402135327458, -0.021744921803474003, -0....","[-0.75439453, 0.38989258, -2.1875, -0.5605469,...",0.386156
120,121,Amber Glow Scarf,"A warm amber-colored scarf made from soft, hig...",121.jpeg,29.99,"[0.032102942466735, 0.025921884924173, -0.0571...","[0.6245117, 1.0107422, -2.6308594, 0.611816399...",0.358473
155,156,Silver Lining Scarf,"A luxurious, silver-grey scarf made from soft,...",156.jpeg,34.99,"[0.016723128035664, -0.017999913543462, -0.078...","[-0.31738279999999996, 1.3271484, -2.2070312, ...",0.351074


In [42]:
search_df(data, 'หมวกบีนนี่', n=3)  # Thai

Unnamed: 0,id,name,description,image,price,embedding,image_embedding,similarities
69,70,Crimson Tide Beanie,"A snug, crimson red beanie made from 100% wool...",70.jpeg,19.99,"[-0.019580882042646002, -0.010400461032986, 0....","[-0.69970703, -0.32641602000000003, -1.671875,...",0.206713
89,90,Cloud Nine Pajama Set,A heavenly soft pajama set in a calming sky bl...,90.jpeg,34.99,"[0.004278134554624001, -0.040751971304416004, ...","[1.2128906, -0.59228516, -1.0634766, 0.6416015...",0.187624
104,105,Blizzard White Beanie,"A snug, blizzard white beanie with a fluffy po...",105.jpeg,12.99,"[0.022073635831475, -0.017169991508126002, -0....","[-1.1884766, 0.40014648, -1.7167968999999998, ...",0.182232


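Even though we searched for roughly the same thing in three languages, the top similarity score (the `similarities` column on the right) differed considerably: about 0.55 for English, 0.39 for Spanish, and 0.21 for Thai.
The embedding model is multilingual, but a query written in the same language as the product descriptions scores higher.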

Computing the similarity against every item gets expensive as the database grows, so larger systems use indexes that cluster nearby vectors together to speed up the search.
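
To make the idea of a clustering index concrete, here is a pure-Python sketch. It assigns each vector to a bucket around its most similar "centroid", and a search then scores only the bucket closest to the query. The names `build_index` and `search_index`, the random toy vectors, and the choice of sampled vectors as centroids are all assumptions for illustration; real vector indexes are more sophisticated.

```python
import math
import random


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def build_index(vectors, n_clusters, seed=0):
    """Group item ids into buckets keyed by their most similar centroid.

    Centroids here are just a random sample of the vectors; real indexes
    usually refine them (for example with k-means).
    """
    rng = random.Random(seed)
    centroids = rng.sample(vectors, n_clusters)
    buckets = [[] for _ in centroids]
    for item_id, vec in enumerate(vectors):
        nearest = max(range(n_clusters), key=lambda c: cosine(vec, centroids[c]))
        buckets[nearest].append(item_id)
    return centroids, buckets


def search_index(query, vectors, centroids, buckets, n=3, n_probe=1):
    """Score only the items in the n_probe buckets closest to the query."""
    ranked = sorted(range(len(centroids)), key=lambda c: cosine(query, centroids[c]), reverse=True)
    candidates = [i for c in ranked[:n_probe] for i in buckets[c]]
    scored = sorted(((cosine(query, vectors[i]), i) for i in candidates), reverse=True)
    return [(i, score) for score, i in scored[:n]]


# Random toy vectors standing in for real embeddings
rng = random.Random(42)
vectors = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(200)]
centroids, buckets = build_index(vectors, n_clusters=10)

top = search_index(vectors[17], vectors, centroids, buckets, n=3)
print(top[0])  # the best hit is item 17 itself
```

The trade-off is that the search becomes approximate: a good match sitting in a bucket that was not probed will be missed. Raising `n_probe` checks more buckets at the cost of more comparisons.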

# Combining Text Completions and Embeddings to make a RAG bot

We can combine chat completions (the LLM) with the embedding search: find the relevant products, then include them in the chat. This pattern is known as retrieval-augmented generation (RAG).

The retrieved information could equally come from a knowledge base, a wiki, or an unstructured data store.

The stages are:

1. Get the request from the user
1. Search the embedding index for similar matches
1. Give those matches to the LLM along with the original question or query
1. Ask it to generate a response
1. Give the response back to the user

In [48]:
if utils.MODE == "github":
    chat_model = "gpt-4.1-nano"  # A fast, small model
elif utils.MODE == "ollama":
    chat_model = "llama3.1"  # llama and ollama are not related. It's a coincidence


def rag_chat(query, n=3):
    # Step 1: Embed the query and retrieve the n most similar products
    matches = search_df(data, query, n=n)

    # Merge the matches into the prompt, one block per product, so any value of n works
    products = "\n\n".join(
        f"Name: {row['name']}\n"
        f"Description: {row['description']}\n"
        f"URL: https://www.superpythonshop.com/products/{row['id']}"
        for _, row in matches.iterrows()
    )

    system_prompt = (
        "The user has asked about a product. You are a helpful assistant "
        "that can give suggestions about the products we have.\n\n"
        f"The matching products are:\n\n{products}\n"
    )

    # Step 2: Call the model with the prompt
    response = client.chat.completions.create(
        model=chat_model,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": query},
        ],
        temperature=0.5,
        n=1,
    )

    # Step 3: Return the response
    return response.choices[0].message.content

from IPython.display import display, Markdown

display(Markdown(rag_chat("I need a warm hat for winter")))


We have several warm hats perfect for winter! 

- The Slate Gray Beanie is cozy and made from soft, knitted wool, ideal for keeping warm.
- The Arctic White Beanie offers a snug fit with a soft, knitted texture to protect against the cold.
- The Golden Harvest Beanie is also warm and soft, perfect for chilly days.

Would you like more details on any of these options?

# Task

Your next job is to iterate on this prompt to refine it and improve the suggestions. Try different queries and searches to see what it does.

Try:
- Looking for something silly
- Looking for something that doesn't exist
- Starting an argument with it
- Asking a question with errors
- Asking a question in a different language


Can you get a good prompt?
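
One way to work through the list above is to run a batch of test queries after each prompt change and compare the replies. The helper below is a small sketch of that loop. `try_queries`, `fake_chat`, and the example queries are hypothetical names and values; in the notebook you would pass `rag_chat` instead of the stub.

```python
def try_queries(chat_fn, queries):
    """Run each query through chat_fn and collect (query, reply) pairs."""
    results = []
    for query in queries:
        try:
            reply = chat_fn(query)
        except Exception as exc:  # keep going if one query fails
            reply = f"ERROR: {exc}"
        results.append((query, reply))
    return results


test_queries = [
    "A hat for my pet dragon",   # something silly
    "Do you sell lawnmowers?",   # something that doesn't exist
    "wram hat fro wintr",        # a question with errors
    "gorro de lana",             # a different language
]


# Stub so the sketch runs offline; swap in rag_chat inside the notebook.
def fake_chat(query):
    return f"(model reply to: {query})"


for query, reply in try_queries(fake_chat, test_queries):
    print(f"Q: {query}\nA: {reply}\n")
```

Keeping the query list fixed between prompt revisions makes it easier to see whether a change actually improved the answers.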