# **Welcome to the notebook**

### Task 1 - Set up project environment

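Installing the needed modules. The first cell pins `openai` to 1.16.2; the second cell then upgrades it to the latest release (2.17.0 in the output below), which is the version used in the rest of the notebook.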

In [1]:
!pip install openai==1.16.2 python-dotenv

Collecting openai==1.16.2
  Downloading openai-1.16.2-py3-none-any.whl.metadata (21 kB)
Downloading openai-1.16.2-py3-none-any.whl (267 kB)
Installing collected packages: openai
  Attempting uninstall: openai
    Found existing installation: openai 2.16.0
    Uninstalling openai-2.16.0:
      Successfully uninstalled openai-2.16.0
Successfully installed openai-1.16.2


In [2]:
!pip install --upgrade openai httpx python-dotenv

Collecting openai
  Downloading openai-2.17.0-py3-none-any.whl.metadata (29 kB)
Downloading openai-2.17.0-py3-none-any.whl (1.1 MB)
Installing collected packages: openai
  Attempting uninstall: openai
    Found existing installation: openai 1.16.2
    Uninstalling openai-1.16.2:
      Successfully uninstalled openai-1.16.2
Successfully installed openai-2.17.0


Importing the needed modules and setting up the OpenAI client

In [3]:
import pandas as pd
import numpy as np
import os
from openai import OpenAI
from dotenv import load_dotenv
from matplotlib import pyplot as plt
import plotly.express as px

from sklearn.decomposition import PCA
from sklearn.metrics.pairwise import cosine_similarity

# Load the API key from a dotenv file
load_dotenv(dotenv_path='/content/apikey.env.txt')

# Retrieve the API key from the environment variables
APIKEY = os.getenv("APIKEY")

# Create an instance of the OpenAI client with the provided API key
client = OpenAI(
    api_key=APIKEY
)

client

<openai.OpenAI at 0x798d44c4a0c0>
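
`os.getenv` returns `None` when a variable is missing, for example when the dotenv path is wrong. The sketch below is one way to fail early with a clear message; `require_env` is a hypothetical helper, not part of the original notebook.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error.

    Hypothetical helper for illustration; the notebook itself calls
    os.getenv("APIKEY") directly.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name!r} is not set; check the dotenv path."
        )
    return value


# Possible usage in the notebook (left commented out so the sketch runs anywhere):
# APIKEY = require_env("APIKEY")
```

Calling this right after `load_dotenv(...)` surfaces a missing key before any API request is made.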

Import our dataset

In [4]:
data = pd.read_csv("products_dataset.csv")
data

Unnamed: 0,product_id,title,description
0,P0,Men's 3X Large Carbon Heather Cotton/Polyester...,"This heavyweight, water-repellent hooded sweat..."
1,P1,Turmode 30 ft. RP TNC Female to RP TNC Male Ad...,If you need more length between your existing ...
2,P2,Large Tapestry Bolster Bed,Polyester cover resembling rich Italian tapest...
3,P3,16-Gauge-Sinks Vessel Sink in White with Faucet,It features a rectangle shape. This vessel set...
4,P4,Men's Crazy Horse 9'' Logger Boot - Steel Toe ...,This 9 in. black full grain leather logger boo...
...,...,...,...
1995,P1995,Dotty Black and White Black and White Wallpape...,"With a stylish monochrome look, this dotty wal..."
1996,P1996,Abrielle Brown/Light Gray 8 ft. x 10 ft. Orien...,The Abrielle collection features a stunning as...
1997,P1997,20 in. x 2-1/2 in. x 2-1/2 in. Polyurethane As...,"With Fypon balustrade systems, you can transfo..."
1998,P1998,1 gal. #P120-6 Diva Glam Flat Exterior Paint &...,BEHR PREMIUM PLUS Exterior Paint & Primer is a...


The list below holds the IDs of the last 8 products viewed by the user.

In [5]:
searched_products_id = [
    'P1938',
    'P1970',
    'P1044',
    'P1838',
    'P1048',
    'P1017',
    'P1310',
    'P1444',
]

### Task 2 - Prepare the dataset

Let's label the data points that were recently viewed.

In [6]:
data['product_status'] = 'not_viewed'
data.loc[data['product_id'].isin(searched_products_id), 'product_status'] = 'recently_viewed'
data[data.product_status == 'recently_viewed']

Unnamed: 0,product_id,title,description,product_status
1017,P1017,1 qt. #660D-7 Blackberry Farm Satin Enamel Int...,Love your space like never before with the hig...,recently_viewed
1044,P1044,1 qt. #M360-4 Marjoram One-Coat Hide Eggshell ...,Introducing the best of BEHR Paint. Featuring ...,recently_viewed
1048,P1048,5 gal. #640C-1 Hosta Flower Extra Durable Sati...,BEHR ULTRA SCUFF DEFENSE Stain-Blocking Paint ...,recently_viewed
1310,P1310,5 gal. #180A-2 Romantic Morn Extra Durable Sem...,BEHR ULTRA SCUFF DEFENSE Stain-Blocking Paint ...,recently_viewed
1444,P1444,5 gal. #PPU12-17 Cameroon Green Extra Durable ...,BEHR ULTRA SCUFF DEFENSE Stain-Blocking Paint ...,recently_viewed
1838,P1838,5 gal. #N340-2 Dune Grass Extra Durable Satin ...,BEHR ULTRA SCUFF DEFENSE Stain-Blocking Paint ...,recently_viewed
1938,P1938,1 gal. #HDC-SP16-10 Japanese Rose Garden Semi-...,Introducing the best of BEHR Paint. Featuring ...,recently_viewed
1970,P1970,8 oz. #510C-3 Rivers Edge Semi-Gloss Enamel St...,Introducing the best of BEHR Paint. Featuring ...,recently_viewed


Now let's combine the product `title` and `description` and store the result in a column called `combined`.

In [7]:
data['combined'] = data.title + data.description
data

Unnamed: 0,product_id,title,description,product_status,combined
0,P0,Men's 3X Large Carbon Heather Cotton/Polyester...,"This heavyweight, water-repellent hooded sweat...",not_viewed,Men's 3X Large Carbon Heather Cotton/Polyester...
1,P1,Turmode 30 ft. RP TNC Female to RP TNC Male Ad...,If you need more length between your existing ...,not_viewed,Turmode 30 ft. RP TNC Female to RP TNC Male Ad...
2,P2,Large Tapestry Bolster Bed,Polyester cover resembling rich Italian tapest...,not_viewed,Large Tapestry Bolster BedPolyester cover rese...
3,P3,16-Gauge-Sinks Vessel Sink in White with Faucet,It features a rectangle shape. This vessel set...,not_viewed,16-Gauge-Sinks Vessel Sink in White with Fauce...
4,P4,Men's Crazy Horse 9'' Logger Boot - Steel Toe ...,This 9 in. black full grain leather logger boo...,not_viewed,Men's Crazy Horse 9'' Logger Boot - Steel Toe ...
...,...,...,...,...,...
1995,P1995,Dotty Black and White Black and White Wallpape...,"With a stylish monochrome look, this dotty wal...",not_viewed,Dotty Black and White Black and White Wallpape...
1996,P1996,Abrielle Brown/Light Gray 8 ft. x 10 ft. Orien...,The Abrielle collection features a stunning as...,not_viewed,Abrielle Brown/Light Gray 8 ft. x 10 ft. Orien...
1997,P1997,20 in. x 2-1/2 in. x 2-1/2 in. Polyurethane As...,"With Fypon balustrade systems, you can transfo...",not_viewed,20 in. x 2-1/2 in. x 2-1/2 in. Polyurethane As...
1998,P1998,1 gal. #P120-6 Diva Glam Flat Exterior Paint &...,BEHR PREMIUM PLUS Exterior Paint & Primer is a...,not_viewed,1 gal. #P120-6 Diva Glam Flat Exterior Paint &...
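
Plain `+` concatenation joins the two strings with nothing between them, which is why row 2 above reads `Bolster BedPolyester`. A minimal sketch of joining with a separator follows; the toy rows and the `". "` separator are assumptions for illustration, not the notebook's data or method.

```python
import pandas as pd

# Toy rows standing in for the product dataset (hypothetical values).
demo = pd.DataFrame(
    {
        "title": ["Bolster Bed", "Vessel Sink "],
        "description": ["Polyester cover.", None],
    }
)

# Replace missing values with empty strings, trim whitespace,
# and join title and description with an explicit separator.
title = demo["title"].fillna("").str.strip()
description = demo["description"].fillna("").str.strip()
demo["combined"] = title + ". " + description

print(demo["combined"].tolist())
```

Whether a separator changes the embeddings enough to matter is not something the notebook measures; the sketch only shows how to keep the word boundary.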


### Task 3 - Text embedding and visualization


Creating the text embedding vector

In [12]:
response = client.embeddings.create(
    input=data.combined.tolist(),
    model="text-embedding-3-small",
    dimensions=512,
)
vectors = [d.embedding for d in response.data]
data['text_embeddings'] = vectors
data

Unnamed: 0,product_id,title,description,product_status,combined,text_embeddings
0,P0,Men's 3X Large Carbon Heather Cotton/Polyester...,"This heavyweight, water-repellent hooded sweat...",not_viewed,Men's 3X Large Carbon Heather Cotton/Polyester...,"[0.03744583949446678, 0.03042474389076233, -0...."
1,P1,Turmode 30 ft. RP TNC Female to RP TNC Male Ad...,If you need more length between your existing ...,not_viewed,Turmode 30 ft. RP TNC Female to RP TNC Male Ad...,"[0.03523961082100868, 0.013278326019644737, 0...."
2,P2,Large Tapestry Bolster Bed,Polyester cover resembling rich Italian tapest...,not_viewed,Large Tapestry Bolster BedPolyester cover rese...,"[0.035860564559698105, -0.05905349925160408, 0..."
3,P3,16-Gauge-Sinks Vessel Sink in White with Faucet,It features a rectangle shape. This vessel set...,not_viewed,16-Gauge-Sinks Vessel Sink in White with Fauce...,"[-0.05834035575389862, -0.007969953119754791, ..."
4,P4,Men's Crazy Horse 9'' Logger Boot - Steel Toe ...,This 9 in. black full grain leather logger boo...,not_viewed,Men's Crazy Horse 9'' Logger Boot - Steel Toe ...,"[0.01998496614396572, 0.05075598508119583, -0...."
...,...,...,...,...,...,...
1995,P1995,Dotty Black and White Black and White Wallpape...,"With a stylish monochrome look, this dotty wal...",not_viewed,Dotty Black and White Black and White Wallpape...,"[0.08823681622743607, -0.05279356613755226, -0..."
1996,P1996,Abrielle Brown/Light Gray 8 ft. x 10 ft. Orien...,The Abrielle collection features a stunning as...,not_viewed,Abrielle Brown/Light Gray 8 ft. x 10 ft. Orien...,"[0.01092537585645914, -0.040394917130470276, 0..."
1997,P1997,20 in. x 2-1/2 in. x 2-1/2 in. Polyurethane As...,"With Fypon balustrade systems, you can transfo...",not_viewed,20 in. x 2-1/2 in. x 2-1/2 in. Polyurethane As...,"[-0.034084752202034, -0.009548565372824669, 0...."
1998,P1998,1 gal. #P120-6 Diva Glam Flat Exterior Paint &...,BEHR PREMIUM PLUS Exterior Paint & Primer is a...,not_viewed,1 gal. #P120-6 Diva Glam Flat Exterior Paint &...,"[-0.010861445218324661, -0.014231621287763119,..."
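
The cell above sends every row in a single request. If a request of that size is rejected (per-request limits are an assumption to check against the current API documentation), the texts can be sent in chunks. The sketch below shows the chunking logic only; `embed_in_batches`, `fake_embed`, and the batch size are hypothetical, and `fake_embed` stands in for the real API call so the example runs offline.

```python
from typing import Callable, List, Sequence


def embed_in_batches(
    texts: Sequence[str],
    embed_batch: Callable[[List[str]], List[List[float]]],
    batch_size: int = 256,
) -> List[List[float]]:
    """Embed texts chunk by chunk, preserving the input order."""
    vectors: List[List[float]] = []
    for start in range(0, len(texts), batch_size):
        chunk = list(texts[start:start + batch_size])
        vectors.extend(embed_batch(chunk))
    return vectors


def fake_embed(chunk: List[str]) -> List[List[float]]:
    """Offline stand-in: one-dimensional 'embedding' equal to the text length."""
    return [[float(len(text))] for text in chunk]


demo_vectors = embed_in_batches(["a", "bb", "ccc"], fake_embed, batch_size=2)
print(demo_vectors)
```

In the notebook, `embed_batch` could wrap the same call used above, for example a function that returns `[d.embedding for d in client.embeddings.create(input=chunk, model="text-embedding-3-small", dimensions=512).data]`.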


> Each vector has 512 dimensions. To visualize the vectors in a scatter plot, we use Principal Component Analysis (PCA) to reduce the dimensionality from 512 to 2.

In [13]:
pca = PCA(n_components=2)
vector_2d = pca.fit_transform(data.text_embeddings.tolist())
data['pc1'] = vector_2d[:,0]
data['pc2'] = vector_2d[:,1]
data

Unnamed: 0,product_id,title,description,product_status,combined,text_embeddings,pc1,pc2
0,P0,Men's 3X Large Carbon Heather Cotton/Polyester...,"This heavyweight, water-repellent hooded sweat...",not_viewed,Men's 3X Large Carbon Heather Cotton/Polyester...,"[0.03744583949446678, 0.03042474389076233, -0....",-0.013345,0.071633
1,P1,Turmode 30 ft. RP TNC Female to RP TNC Male Ad...,If you need more length between your existing ...,not_viewed,Turmode 30 ft. RP TNC Female to RP TNC Male Ad...,"[0.03523961082100868, 0.013278326019644737, 0....",-0.357207,0.236119
2,P2,Large Tapestry Bolster Bed,Polyester cover resembling rich Italian tapest...,not_viewed,Large Tapestry Bolster BedPolyester cover rese...,"[0.035860564559698105, -0.05905349925160408, 0...",-0.201097,-0.206634
3,P3,16-Gauge-Sinks Vessel Sink in White with Faucet,It features a rectangle shape. This vessel set...,not_viewed,16-Gauge-Sinks Vessel Sink in White with Fauce...,"[-0.05834035575389862, -0.007969953119754791, ...",-0.181801,0.043378
4,P4,Men's Crazy Horse 9'' Logger Boot - Steel Toe ...,This 9 in. black full grain leather logger boo...,not_viewed,Men's Crazy Horse 9'' Logger Boot - Steel Toe ...,"[0.01998496614396572, 0.05075598508119583, -0....",-0.214502,0.141177
...,...,...,...,...,...,...,...,...
1995,P1995,Dotty Black and White Black and White Wallpape...,"With a stylish monochrome look, this dotty wal...",not_viewed,Dotty Black and White Black and White Wallpape...,"[0.08823681622743607, -0.05279356613755226, -0...",-0.042134,-0.196931
1996,P1996,Abrielle Brown/Light Gray 8 ft. x 10 ft. Orien...,The Abrielle collection features a stunning as...,not_viewed,Abrielle Brown/Light Gray 8 ft. x 10 ft. Orien...,"[0.01092537585645914, -0.040394917130470276, 0...",-0.246671,-0.481560
1997,P1997,20 in. x 2-1/2 in. x 2-1/2 in. Polyurethane As...,"With Fypon balustrade systems, you can transfo...",not_viewed,20 in. x 2-1/2 in. x 2-1/2 in. Polyurethane As...,"[-0.034084752202034, -0.009548565372824669, 0....",-0.082898,0.103884
1998,P1998,1 gal. #P120-6 Diva Glam Flat Exterior Paint &...,BEHR PREMIUM PLUS Exterior Paint & Primer is a...,not_viewed,1 gal. #P120-6 Diva Glam Flat Exterior Paint &...,"[-0.010861445218324661, -0.014231621287763119,...",0.506961,0.005647
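
Projecting 512 dimensions onto 2 discards information, so it can be useful to check how much variance the two components retain. The sketch below uses random toy vectors as a stand-in for the embeddings (an assumption, so its numbers say nothing about the real data); in the notebook the same attribute can be read from the fitted `pca` object.

```python
import numpy as np
from sklearn.decomposition import PCA

# Random toy vectors standing in for the 512-dimensional embeddings.
rng = np.random.default_rng(0)
toy_vectors = rng.normal(size=(200, 512))

toy_pca = PCA(n_components=2)
toy_pca.fit(toy_vectors)

# Fraction of the total variance captured by each principal component.
retained = toy_pca.explained_variance_ratio_
print(f"PC1: {retained[0]:.1%}, PC2: {retained[1]:.1%}, total: {retained.sum():.1%}")
```

A low total means that points which look close in the 2D plot may still be far apart in the original space, which is why the similarity search later in the notebook works on the full vectors rather than on `pc1` and `pc2`.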


Now that we have the text embedding vectors in two dimensions, we can use them to create a 2D plot.

In [15]:
px.scatter(data, x = 'pc1', y = 'pc2', color = 'product_status')

### Task 4 - Find similar products

In [16]:
data.head()

Unnamed: 0,product_id,title,description,product_status,combined,text_embeddings,pc1,pc2
0,P0,Men's 3X Large Carbon Heather Cotton/Polyester...,"This heavyweight, water-repellent hooded sweat...",not_viewed,Men's 3X Large Carbon Heather Cotton/Polyester...,"[0.03744583949446678, 0.03042474389076233, -0....",-0.013345,0.071633
1,P1,Turmode 30 ft. RP TNC Female to RP TNC Male Ad...,If you need more length between your existing ...,not_viewed,Turmode 30 ft. RP TNC Female to RP TNC Male Ad...,"[0.03523961082100868, 0.013278326019644737, 0....",-0.357207,0.236119
2,P2,Large Tapestry Bolster Bed,Polyester cover resembling rich Italian tapest...,not_viewed,Large Tapestry Bolster BedPolyester cover rese...,"[0.035860564559698105, -0.05905349925160408, 0...",-0.201097,-0.206634
3,P3,16-Gauge-Sinks Vessel Sink in White with Faucet,It features a rectangle shape. This vessel set...,not_viewed,16-Gauge-Sinks Vessel Sink in White with Fauce...,"[-0.05834035575389862, -0.007969953119754791, ...",-0.181801,0.043378
4,P4,Men's Crazy Horse 9'' Logger Boot - Steel Toe ...,This 9 in. black full grain leather logger boo...,not_viewed,Men's Crazy Horse 9'' Logger Boot - Steel Toe ...,"[0.01998496614396572, 0.05075598508119583, -0....",-0.214502,0.141177


Split the data into `recently_viewed` and `not_viewed` products

In [18]:
df_recently_viewed = data[data.product_status == 'recently_viewed']
df_not_viewed = data[data.product_status == 'not_viewed']
df_not_viewed

Unnamed: 0,product_id,title,description,product_status,combined,text_embeddings,pc1,pc2
0,P0,Men's 3X Large Carbon Heather Cotton/Polyester...,"This heavyweight, water-repellent hooded sweat...",not_viewed,Men's 3X Large Carbon Heather Cotton/Polyester...,"[0.03744583949446678, 0.03042474389076233, -0....",-0.013345,0.071633
1,P1,Turmode 30 ft. RP TNC Female to RP TNC Male Ad...,If you need more length between your existing ...,not_viewed,Turmode 30 ft. RP TNC Female to RP TNC Male Ad...,"[0.03523961082100868, 0.013278326019644737, 0....",-0.357207,0.236119
2,P2,Large Tapestry Bolster Bed,Polyester cover resembling rich Italian tapest...,not_viewed,Large Tapestry Bolster BedPolyester cover rese...,"[0.035860564559698105, -0.05905349925160408, 0...",-0.201097,-0.206634
3,P3,16-Gauge-Sinks Vessel Sink in White with Faucet,It features a rectangle shape. This vessel set...,not_viewed,16-Gauge-Sinks Vessel Sink in White with Fauce...,"[-0.05834035575389862, -0.007969953119754791, ...",-0.181801,0.043378
4,P4,Men's Crazy Horse 9'' Logger Boot - Steel Toe ...,This 9 in. black full grain leather logger boo...,not_viewed,Men's Crazy Horse 9'' Logger Boot - Steel Toe ...,"[0.01998496614396572, 0.05075598508119583, -0....",-0.214502,0.141177
...,...,...,...,...,...,...,...,...
1995,P1995,Dotty Black and White Black and White Wallpape...,"With a stylish monochrome look, this dotty wal...",not_viewed,Dotty Black and White Black and White Wallpape...,"[0.08823681622743607, -0.05279356613755226, -0...",-0.042134,-0.196931
1996,P1996,Abrielle Brown/Light Gray 8 ft. x 10 ft. Orien...,The Abrielle collection features a stunning as...,not_viewed,Abrielle Brown/Light Gray 8 ft. x 10 ft. Orien...,"[0.01092537585645914, -0.040394917130470276, 0...",-0.246671,-0.481560
1997,P1997,20 in. x 2-1/2 in. x 2-1/2 in. Polyurethane As...,"With Fypon balustrade systems, you can transfo...",not_viewed,20 in. x 2-1/2 in. x 2-1/2 in. Polyurethane As...,"[-0.034084752202034, -0.009548565372824669, 0....",-0.082898,0.103884
1998,P1998,1 gal. #P120-6 Diva Glam Flat Exterior Paint &...,BEHR PREMIUM PLUS Exterior Paint & Primer is a...,not_viewed,1 gal. #P120-6 Diva Glam Flat Exterior Paint &...,"[-0.010861445218324661, -0.014231621287763119,...",0.506961,0.005647


Convert the embedding vectors to NumPy arrays

In [20]:
vectors_recently_viewed = np.array(df_recently_viewed.text_embeddings.tolist())
vectors_not_viewed = np.array(df_not_viewed.text_embeddings.tolist())
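
Cosine similarity between two vectors is their dot product divided by the product of their lengths. The sketch below computes it by hand on toy 2D vectors (assumed values, not the notebook's embeddings) to show the shape of the result: one row per viewed product, one column per candidate. `cosine_sim_matrix` is a hypothetical helper written for illustration.

```python
import numpy as np


def cosine_sim_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T


# Toy stand-ins: 2 "viewed" vectors and 2 "candidate" vectors.
viewed = np.array([[1.0, 0.0], [1.0, 1.0]])
candidates = np.array([[2.0, 0.0], [0.0, 3.0]])

sims = cosine_sim_matrix(viewed, candidates)
print(sims.shape)  # (number of viewed rows, number of candidate rows)
print(sims)
```

Because the columns follow the row order of the candidate array, a column index found with `argmax` is a position in that array, which is why positional indexing (`iloc`) is the matching lookup.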

Find the similarity between each viewed product and all the unviewed products.

In [25]:
similarity_matrix = cosine_similarity(vectors_recently_viewed, vectors_not_viewed)
# Position (within df_not_viewed) of the most similar product for each viewed product
top_ids = np.argmax(similarity_matrix, axis=1)
most_similiar_product_ids = list(df_not_viewed.iloc[top_ids].product_id)
most_similiar_product_ids

['P854', 'P1061', 'P1705', 'P733', 'P1327', 'P1705', 'P1059', 'P314']
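
Taking only the single best match per viewed product can return the same candidate more than once. One possible extension is to take the top `k` candidates per row and drop repeats. This is a sketch on a toy similarity matrix, and `top_k_unique` is a hypothetical helper rather than part of the original notebook.

```python
from typing import List

import numpy as np


def top_k_unique(similarity: np.ndarray, k: int) -> List[int]:
    """Column positions of the k best candidates per row, without repeats.

    Positions are returned in order of first appearance, row by row.
    """
    # argsort is ascending, so reverse each row and keep the first k columns.
    order = np.argsort(similarity, axis=1)[:, ::-1][:, :k]
    seen: List[int] = []
    for row in order:
        for idx in row:
            if int(idx) not in seen:
                seen.append(int(idx))
    return seen


# Toy matrix: 2 viewed products (rows) x 3 candidates (columns).
toy_similarity = np.array([[0.1, 0.9, 0.5], [0.95, 0.8, 0.1]])
picked = top_k_unique(toy_similarity, k=2)
print(picked)
```

In the notebook, the returned positions could be passed to `df_not_viewed.iloc[...]` in the same way as `top_ids`.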

### Task 5 - Recommend products based on the searched products

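Let's update the status of the most similar products to `recommended`. `P1705` is the closest match for two of the viewed products, so only seven distinct products are marked.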

In [27]:
data.loc[data.product_id.isin(most_similiar_product_ids), 'product_status'] = 'recommended'
data[data.product_status == "recommended"]

Unnamed: 0,product_id,title,description,product_status,combined,text_embeddings,pc1,pc2
314,P314,8 oz. #230F-7 Florence Brown Semi-Gloss Enamel...,Introducing the best of BEHR Paint. Featuring ...,recommended,8 oz. #230F-7 Florence Brown Semi-Gloss Enamel...,"[-0.003966080024838448, -0.057984184473752975,...",0.486927,-0.060005
733,P733,5 gal. #N440-1 Streetwise Extra Durable Semi-G...,BEHR ULTRA SCUFF DEFENSE Stain-Blocking Paint ...,recommended,5 gal. #N440-1 Streetwise Extra Durable Semi-G...,"[0.010829787701368332, -0.01894751563668251, 0...",0.468495,-0.009424
854,P854,1 qt. #N460-1 Evening White Satin Enamel Inter...,Love your space like never before with the hig...,recommended,1 qt. #N460-1 Evening White Satin Enamel Inter...,"[0.03936273232102394, -0.017813587561249733, 0...",0.493101,-0.052187
1059,P1059,1 gal. Home Decorators Collection #HDC-SP14-6 ...,Introducing the best of BEHR Paint. Featuring ...,recommended,1 gal. Home Decorators Collection #HDC-SP14-6 ...,"[-0.0021748889703303576, -0.05987035483121872,...",0.44811,-0.054049
1061,P1061,1 gal. #MQ1-28 Orange Flambe One-Coat Hide Egg...,Introducing the best of BEHR Paint. Featuring ...,recommended,1 gal. #MQ1-28 Orange Flambe One-Coat Hide Egg...,"[0.014321798458695412, -0.024152863770723343, ...",0.491525,-0.066632
1327,P1327,5 gal. #MQ4-44 Green Dynasty Extra Durable Egg...,BEHR ULTRA SCUFF DEFENSE Stain-Blocking Paint ...,recommended,5 gal. #MQ4-44 Green Dynasty Extra Durable Egg...,"[0.04926867410540581, -0.019617563113570213, 0...",0.473394,-0.065312
1705,P1705,5 gal. #310D-4 Gold Buff Extra Durable Satin E...,BEHR ULTRA SCUFF DEFENSE Stain-Blocking Paint ...,recommended,5 gal. #310D-4 Gold Buff Extra Durable Satin E...,"[-0.002173374406993389, -0.013826336711645126,...",0.467003,-0.037503


Let's visualize the recommended products.

In [28]:
px.scatter(data, x='pc1', y='pc2', color='product_status', hover_data=['title'])