
In [2]:
# Download the kaggle.json API token from the Kaggle site and move it into /root/.kaggle
!mkdir -p /root/.kaggle
!mv kaggle.json /root/.kaggle/

In [1]:
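# The cells below use the pre-1.0 openai interface (openai.Embedding, openai.ChatCompletion,
# openai.embeddings_utils), so pin the version that was installed when this notebook was run.
!pip install openai==0.28.0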

Collecting openai
  Downloading openai-0.28.0-py3-none-any.whl (76 kB)
Installing collected packages: openai
Successfully installed openai-0.28.0


In [4]:
import openai
import pandas as pd
from openai.embeddings_utils import get_embedding, cosine_similarity

In [None]:
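# Get the API key from your account page on the OpenAI platform.
# Assumption: the key has been exported as the OPENAI_API_KEY environment variable.
import os
api_key = os.environ.get("OPENAI_API_KEY")
openai.api_key = api_key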

## References:
1) Norah Sakal's [How to use chatGPT API to build a chatbot for product recommendations with embeddings](https://norahsakal.com/blog/chatgpt-product-recommendation-embeddings) <br>
2) Rabie Rh's [How to build a chatbot for product recommendations using ChatGPT API and embeddings](https://medium.com/@monkeytyper/how-to-build-a-chatbot-for-product-recommendations-using-chatgpt-api-and-embeddings-52e531fc7562) <br>



## Create product data (Item features)

In [None]:
product_data = [{
    "prod_id": 1,
    "prod": "moisturizer",
    "brand":"Aveeno",
    "description": "for dry skin"
},
{
    "prod_id": 2,
    "prod": "foundation",
    "brand":"Maybelline",
    "description": "medium coverage"
},
{
    "prod_id": 3,
    "prod": "moisturizer",
    "brand":"CeraVe",
    "description": "for dry skin"
},
{
    "prod_id": 4,
    "prod": "nail polish",
    "brand":"OPI",
    "description": "raspberry red"
},
{
    "prod_id": 5,
    "prod": "concealer",
    "brand":"chanel",
    "description": "medium coverage"
},
{
    "prod_id": 6,
    "prod": "moisturizer",
    "brand":"Ole Henkrisen",
    "description": "for oily skin"
},
{
    "prod_id": 7,
    "prod": "moisturizer",
    "brand":"CeraVe",
    "description": "for normal to dry skin"
},
{
    "prod_id": 8,
    "prod": "moisturizer",
    "brand":"First Aid Beauty",
    "description": "for dry skin"
},{
    "prod_id": 9,
    "prod": "makeup sponge",
    "brand":"Sephora",
    "description": "super-soft, exclusive, latex-free foam"
}]

In [None]:
# Convert the data into a pandas DataFrame for easier handling
product_data_df = pd.DataFrame(product_data)
product_data_df

Unnamed: 0,prod_id,prod,brand,description
0,1,moisturizer,Aveeno,for dry skin
1,2,foundation,Maybelline,medium coverage
2,3,moisturizer,CeraVe,for dry skin
3,4,nail polish,OPI,raspberry red
4,5,concealer,chanel,medium coverage
5,6,moisturizer,Ole Henkrisen,for oily skin
6,7,moisturizer,CeraVe,for normal to dry skin
7,8,moisturizer,First Aid Beauty,for dry skin
8,9,makeup sponge,Sephora,"super-soft, exclusive, latex-free foam"


In [None]:
# Combine brand, product, and description into a single text column to prepare for embedding
product_data_df['combined'] = product_data_df.apply(lambda row: f"{row['brand']}, {row['prod']}, {row['description']}", axis=1)
product_data_df

Unnamed: 0,prod_id,prod,brand,description,combined
0,1,moisturizer,Aveeno,for dry skin,"Aveeno, moisturizer, for dry skin"
1,2,foundation,Maybelline,medium coverage,"Maybelline, foundation, medium coverage"
2,3,moisturizer,CeraVe,for dry skin,"CeraVe, moisturizer, for dry skin"
3,4,nail polish,OPI,raspberry red,"OPI, nail polish, raspberry red"
4,5,concealer,chanel,medium coverage,"chanel, concealer, medium coverage"
5,6,moisturizer,Ole Henkrisen,for oily skin,"Ole Henkrisen, moisturizer, for oily skin"
6,7,moisturizer,CeraVe,for normal to dry skin,"CeraVe, moisturizer, for normal to dry skin"
7,8,moisturizer,First Aid Beauty,for dry skin,"First Aid Beauty, moisturizer, for dry skin"
8,9,makeup sponge,Sephora,"super-soft, exclusive, latex-free foam","Sephora, makeup sponge, super-soft, exclusive,..."


## Text embedding using the OpenAI API (`get_embedding`)

In [None]:
product_data_df['text_embedding'] = product_data_df.combined.apply(lambda x: get_embedding(x, engine='text-embedding-ada-002'))
product_data_df

Unnamed: 0,prod_id,prod,brand,description,combined,text_embedding
0,1,moisturizer,Aveeno,for dry skin,"Aveeno, moisturizer, for dry skin","[-0.005490448791533709, -0.009179000742733479,..."
1,2,foundation,Maybelline,medium coverage,"Maybelline, foundation, medium coverage","[-0.01599975861608982, 0.002224505180492997, -..."
2,3,moisturizer,CeraVe,for dry skin,"CeraVe, moisturizer, for dry skin","[0.007382103707641363, -0.017064472660422325, ..."
3,4,nail polish,OPI,raspberry red,"OPI, nail polish, raspberry red","[-0.0006032940000295639, -0.01373579166829586,..."
4,5,concealer,chanel,medium coverage,"chanel, concealer, medium coverage","[0.004690815694630146, 0.00457560271024704, 0...."
5,6,moisturizer,Ole Henkrisen,for oily skin,"Ole Henkrisen, moisturizer, for oily skin","[-0.004918951541185379, -0.02238703891634941, ..."
6,7,moisturizer,CeraVe,for normal to dry skin,"CeraVe, moisturizer, for normal to dry skin","[0.015850096940994263, -0.01311473734676838, 0..."
7,8,moisturizer,First Aid Beauty,for dry skin,"First Aid Beauty, moisturizer, for dry skin","[-0.01125184353441, -0.007704718038439751, -0...."
8,9,makeup sponge,Sephora,"super-soft, exclusive, latex-free foam","Sephora, makeup sponge, super-soft, exclusive,...","[0.006289263255894184, 0.0048912749625742435, ..."


## Create customer profile data
>* The customer order records share exactly the same keys as the product data.
>* In a different application, they may carry additional keys that describe the customer's requirements.
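
The shared-key requirement can be checked before any embedding call is made. The following is a minimal sketch, assuming the records are plain dicts like the ones in this notebook; `missing_keys` and `REQUIRED_KEYS` are hypothetical names chosen for illustration and are not part of any library.

```python
# Keys that every product or order record is assumed to carry.
REQUIRED_KEYS = {"prod_id", "prod", "brand", "description"}

def missing_keys(records, required=REQUIRED_KEYS):
    """Return {record index: sorted missing keys} for records lacking a required key."""
    problems = {}
    for i, record in enumerate(records):
        missing = required - set(record)
        if missing:
            problems[i] = sorted(missing)
    return problems

sample_orders = [
    {"prod_id": 1, "prod": "moisturizer", "brand": "Aveeno", "description": "for dry skin"},
    {"prod_id": 4, "prod": "nail polish", "brand": "OPI"},  # description deliberately omitted
]
print(missing_keys(sample_orders))  # {1: ['description']}
```

Extra keys are allowed by this check, which matches the second point above: an order record may describe more than the catalog does, as long as the shared keys are present.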

In [None]:
customer_order_data = [
{
    "prod_id": 1,
    "prod": "moisturizer",
    "brand":"Aveeno",
    "description": "for dry skin"
},{
    "prod_id": 2,
    "prod": "foundation",
    "brand":"Maybelline",
    "description": "medium coverage"
},{
    "prod_id": 4,
    "prod": "nail polish",
    "brand":"OPI",
    "description": "raspberry red"
},{
    "prod_id": 5,
    "prod": "concealer",
    "brand":"chanel",
    "description": "medium coverage"
},{
    "prod_id": 9,
    "prod": "makeup sponge",
    "brand":"Sephora",
    "description": "super-soft, exclusive, latex-free foam"
}]

In [None]:
customer_order_df = pd.DataFrame(customer_order_data)
customer_order_df

Unnamed: 0,prod_id,prod,brand,description
0,1,moisturizer,Aveeno,for dry skin
1,2,foundation,Maybelline,medium coverage
2,4,nail polish,OPI,raspberry red
3,5,concealer,chanel,medium coverage
4,9,makeup sponge,Sephora,"super-soft, exclusive, latex-free foam"


In [None]:
customer_order_df['combined'] = customer_order_df.apply(lambda row: f"{row['brand']}, {row['prod']}, {row['description']}", axis=1)
customer_order_df

Unnamed: 0,prod_id,prod,brand,description,combined
0,1,moisturizer,Aveeno,for dry skin,"Aveeno, moisturizer, for dry skin"
1,2,foundation,Maybelline,medium coverage,"Maybelline, foundation, medium coverage"
2,4,nail polish,OPI,raspberry red,"OPI, nail polish, raspberry red"
3,5,concealer,chanel,medium coverage,"chanel, concealer, medium coverage"
4,9,makeup sponge,Sephora,"super-soft, exclusive, latex-free foam","Sephora, makeup sponge, super-soft, exclusive,..."


In [None]:
customer_order_df['text_embedding'] = customer_order_df.combined.apply(lambda x: get_embedding(x, engine='text-embedding-ada-002'))
customer_order_df

Unnamed: 0,prod_id,prod,brand,description,combined,text_embedding
0,1,moisturizer,Aveeno,for dry skin,"Aveeno, moisturizer, for dry skin","[-0.0054562389850616455, -0.009110474959015846..."
1,2,foundation,Maybelline,medium coverage,"Maybelline, foundation, medium coverage","[-0.01599975861608982, 0.002224505180492997, -..."
2,4,nail polish,OPI,raspberry red,"OPI, nail polish, raspberry red","[-0.0005378166679292917, -0.013731765560805798..."
3,5,concealer,chanel,medium coverage,"chanel, concealer, medium coverage","[0.004690815694630146, 0.00457560271024704, 0...."
4,9,makeup sponge,Sephora,"super-soft, exclusive, latex-free foam","Sephora, makeup sponge, super-soft, exclusive,...","[0.006272361148148775, 0.0048912288621068, 0.0..."
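
Every ordered item also appears in the product catalog, so its embedding could in principle be looked up by `prod_id` instead of being requested a second time. The sketch below illustrates that idea with toy 3-dimensional vectors standing in for the real embeddings; `lookup_embeddings` is a hypothetical helper, and the ids and vectors are made up for the example.

```python
# Toy vectors keyed by prod_id; real embeddings would come from the catalog DataFrame.
catalog_embeddings = {
    1: [0.1, 0.2, 0.3],
    2: [0.4, 0.5, 0.6],
    3: [0.7, 0.8, 0.9],
}
order_ids = [1, 2, 4]  # id 4 is intentionally absent from the toy catalog

def lookup_embeddings(prod_ids, catalog):
    """Split prod_ids into those with a stored embedding and those still to embed."""
    found = {pid: catalog[pid] for pid in prod_ids if pid in catalog}
    missing = [pid for pid in prod_ids if pid not in catalog]
    return found, missing

found, missing = lookup_embeddings(order_ids, catalog_embeddings)
print(sorted(found), missing)  # [1, 2] [4]
```

Only the ids returned in `missing` would then need an API call.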


## Testing with input questions
>* The input question is embedded with the same model as the products (`text-embedding-ada-002`).

In [None]:
customer_input = "Hi! Can you recommend a good moisturizer for me?"

In [None]:
# Embed the customer's question with an OpenAI embedding model
response = openai.Embedding.create(
    input=customer_input,
    model="text-embedding-ada-002"
)
embeddings_customer_question = response['data'][0]['embedding']

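## Calculate similarities between the input and purchase history
>* The purchase history comes from the customer's past orders (`customer_order_df`).
>* Each past order serves as a customer feature.
>* These features are already embedded, so they can be compared directly with the embedded question.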
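
Cosine similarity is the dot product of two vectors divided by the product of their lengths, so it measures direction rather than magnitude. The `cosine_similarity` helper imported earlier is expected to compute this quantity; the pure-Python version below is only a sketch of the formula, and `cosine_sim` is a name chosen for illustration.

```python
import math

def cosine_sim(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

same_direction = cosine_sim([1.0, 0.0], [2.0, 0.0])  # parallel vectors
orthogonal = cosine_sim([1.0, 0.0], [0.0, 1.0])      # perpendicular vectors
print(same_direction, orthogonal)
```

Parallel vectors score 1 and perpendicular vectors score 0, which is why a higher value is read as "more similar" when ranking rows.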

In [None]:
customer_order_df['search_purchase_history'] = customer_order_df.text_embedding.apply(lambda x: cosine_similarity(x, embeddings_customer_question))
customer_order_df = customer_order_df.sort_values('search_purchase_history', ascending=False)
customer_order_df

Unnamed: 0,prod_id,prod,brand,description,combined,text_embedding,search_purchase_history
0,1,moisturizer,Aveeno,for dry skin,"Aveeno, moisturizer, for dry skin","[-0.0054562389850616455, -0.009110474959015846...",0.861086
3,5,concealer,chanel,medium coverage,"chanel, concealer, medium coverage","[0.004690815694630146, 0.00457560271024704, 0....",0.783757
1,2,foundation,Maybelline,medium coverage,"Maybelline, foundation, medium coverage","[-0.01599975861608982, 0.002224505180492997, -...",0.782503
4,9,makeup sponge,Sephora,"super-soft, exclusive, latex-free foam","Sephora, makeup sponge, super-soft, exclusive,...","[0.006272361148148775, 0.0048912288621068, 0.0...",0.762074
2,4,nail polish,OPI,raspberry red,"OPI, nail polish, raspberry red","[-0.0005378166679292917, -0.013731765560805798...",0.748523


In [None]:
top_3_purchases_df = customer_order_df.head(3)
top_3_purchases_df

Unnamed: 0,prod_id,prod,brand,description,combined,text_embedding,search_purchase_history
0,1,moisturizer,Aveeno,for dry skin,"Aveeno, moisturizer, for dry skin","[-0.0054562389850616455, -0.009110474959015846...",0.861086
3,5,concealer,chanel,medium coverage,"chanel, concealer, medium coverage","[0.004690815694630146, 0.00457560271024704, 0....",0.783757
1,2,foundation,Maybelline,medium coverage,"Maybelline, foundation, medium coverage","[-0.01599975861608982, 0.002224505180492997, -...",0.782503


In [None]:
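# Note: product_data_df has not been ranked by similarity yet, so this is simply the first three catalog rows
top_3_products_df = product_data_df.head(3)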
top_3_products_df

Unnamed: 0,prod_id,prod,brand,description,combined,text_embedding
0,1,moisturizer,Aveeno,for dry skin,"Aveeno, moisturizer, for dry skin","[-0.005490448791533709, -0.009179000742733479,..."
1,2,foundation,Maybelline,medium coverage,"Maybelline, foundation, medium coverage","[-0.01599975861608982, 0.002224505180492997, -..."
2,3,moisturizer,CeraVe,for dry skin,"CeraVe, moisturizer, for dry skin","[0.007382103707641363, -0.017064472660422325, ..."


## Calculate similarities between the input and products

In [None]:
product_data_df['search_products'] = product_data_df.text_embedding.apply(lambda x: cosine_similarity(x, embeddings_customer_question))
product_data_df = product_data_df.sort_values('search_products', ascending=False)
product_data_df

Unnamed: 0,prod_id,prod,brand,description,combined,text_embedding,search_products
2,3,moisturizer,CeraVe,for dry skin,"CeraVe, moisturizer, for dry skin","[0.007382103707641363, -0.017064472660422325, ...",0.861119
0,1,moisturizer,Aveeno,for dry skin,"Aveeno, moisturizer, for dry skin","[-0.005490448791533709, -0.009179000742733479,...",0.861041
7,8,moisturizer,First Aid Beauty,for dry skin,"First Aid Beauty, moisturizer, for dry skin","[-0.01125184353441, -0.007704718038439751, -0....",0.855802
6,7,moisturizer,CeraVe,for normal to dry skin,"CeraVe, moisturizer, for normal to dry skin","[0.015850096940994263, -0.01311473734676838, 0...",0.851248
5,6,moisturizer,Ole Henkrisen,for oily skin,"Ole Henkrisen, moisturizer, for oily skin","[-0.004918951541185379, -0.02238703891634941, ...",0.837511
4,5,concealer,chanel,medium coverage,"chanel, concealer, medium coverage","[0.004690815694630146, 0.00457560271024704, 0....",0.783757
1,2,foundation,Maybelline,medium coverage,"Maybelline, foundation, medium coverage","[-0.01599975861608982, 0.002224505180492997, -...",0.782503
8,9,makeup sponge,Sephora,"super-soft, exclusive, latex-free foam","Sephora, makeup sponge, super-soft, exclusive,...","[0.006289263255894184, 0.0048912749625742435, ...",0.762078
3,4,nail polish,OPI,raspberry red,"OPI, nail polish, raspberry red","[-0.0006032940000295639, -0.01373579166829586,...",0.748478


In [None]:
top_3_products_df = product_data_df.head(3)
top_3_products_df

Unnamed: 0,prod_id,prod,brand,description,combined,text_embedding,search_products
2,3,moisturizer,CeraVe,for dry skin,"CeraVe, moisturizer, for dry skin","[0.007382103707641363, -0.017064472660422325, ...",0.861119
0,1,moisturizer,Aveeno,for dry skin,"Aveeno, moisturizer, for dry skin","[-0.005490448791533709, -0.009179000742733479,...",0.861041
7,8,moisturizer,First Aid Beauty,for dry skin,"First Aid Beauty, moisturizer, for dry skin","[-0.01125184353441, -0.007704718038439751, -0....",0.855802
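
The ranking step amounts to scoring every row against the question, sorting in descending order, and only then taking the first rows. The sketch below shows the same idea on toy 2-dimensional vectors in plain Python; `top_k`, the item names, and the vectors are all illustrative assumptions, not values from the notebook.

```python
import math

def cosine_sim(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_k(items, query_vec, k=3):
    """Score each item against the query, sort by score descending, keep k names."""
    scored = [(cosine_sim(item["embedding"], query_vec), item["name"]) for item in items]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]

items = [
    {"name": "nail polish", "embedding": [0.0, 1.0]},
    {"name": "moisturizer A", "embedding": [1.0, 0.1]},
    {"name": "moisturizer B", "embedding": [0.9, 0.3]},
    {"name": "foundation", "embedding": [0.5, 0.8]},
]
query = [1.0, 0.0]

ranked = top_k(items, query, k=2)
unranked = [item["name"] for item in items[:2]]  # slicing before sorting
print(ranked)    # ['moisturizer A', 'moisturizer B']
print(unranked)  # ['nail polish', 'moisturizer A']
```

The second list shows what happens when the first rows are taken before sorting: the result reflects catalog order, not relevance to the question.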


## Create ChatGPT prompts

In [None]:
message_objects = []
message_objects.append({"role":"system", "content":"You're a chatbot helping customers with beauty-related questions and help."})

In [None]:
# Append the customer message
message_objects.append({"role":"user", "content": customer_input})

In [None]:
# Build a single string listing the most relevant previous purchases
prev_purchases = ". ".join([f"{row['combined']}" for index, row in top_3_purchases_df.iterrows()])
prev_purchases

'Aveeno, moisturizer, for dry skin. chanel, concealer, medium coverage. Maybelline, foundation, medium coverage'

In [None]:
# Append the relevant previous purchases, followed by instructions on detail and tone
message_objects.append({"role":"user", "content": f"Here're my latest product orders: {prev_purchases}"})
message_objects.append({"role":"user", "content": "Please give me a detailed explanation of your recommendations"})
message_objects.append({"role":"user", "content": "Please be friendly and talk to me like a person, don't just give me a list. Got it?"})

In [None]:
# Create list of 3 products to recommend
products_list = []

for index, row in top_3_products_df.iterrows():
    brand_dict = {'role': "assistant", "content": f"{row['combined']}"}
    products_list.append(brand_dict)
products_list

[{'role': 'assistant', 'content': 'Aveeno, moisturizer, for dry skin'},
 {'role': 'assistant', 'content': 'Maybelline, foundation, medium coverage'},
 {'role': 'assistant', 'content': 'CeraVe, moisturizer, for dry skin'}]

In [None]:
# Append found products
message_objects.append({"role": "assistant", "content": f"I found these 3 products I would recommend"})
message_objects.extend(products_list)
message_objects.append({"role": "assistant", "content": f"Here's my summarized recommendation of products, and why it would suit you:"})
message_objects

[{'role': 'system',
  'content': "You're a chatbot helping customers with beauty-related questions and help."},
 {'role': 'user',
  'content': 'Hi! Can you recommend a good moisturizer for me?'},
 {'role': 'user',
  'content': "Here're my latest product orders: Aveeno, moisturizer, for dry skin. chanel, concealer, medium coverage. Maybelline, foundation, medium coverage"},
 {'role': 'user',
  'content': 'Please give me a detailed explanation of your recommendations'},
 {'role': 'user',
  'content': "Please be friendly and talk to me like a person, don't just give me a list. Got it?"},
 {'role': 'assistant',
  'content': 'I found these 3 products I would recommend'},
 {'role': 'assistant', 'content': 'Aveeno, moisturizer, for dry skin'},
 {'role': 'assistant', 'content': 'Maybelline, foundation, medium coverage'},
 {'role': 'assistant', 'content': 'CeraVe, moisturizer, for dry skin'},
 {'role': 'assistant',
  'content': "Here's my summarized recommendation of products, and why it would suit you:"}]
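
The message list is assembled across several cells, which makes it easy to run them out of order or append the same message twice. One way to avoid that is to build the list in a single function. The sketch below reuses the prompt strings from this notebook; `build_messages` and `SYSTEM_PROMPT` are hypothetical names, and the function makes no API call.

```python
SYSTEM_PROMPT = "You're a chatbot helping customers with beauty-related questions and help."

def build_messages(customer_input, prev_purchases, recommended_products):
    """Assemble the chat messages in the same order as the notebook cells."""
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": customer_input},
        {"role": "user", "content": "Here're my latest product orders: " + ". ".join(prev_purchases)},
        {"role": "user", "content": "Please give me a detailed explanation of your recommendations"},
        {"role": "user", "content": "Please be friendly and talk to me like a person, don't just give me a list. Got it?"},
        {"role": "assistant", "content": f"I found these {len(recommended_products)} products I would recommend"},
    ]
    # One assistant message per recommended product, as in the notebook.
    messages.extend({"role": "assistant", "content": p} for p in recommended_products)
    messages.append({"role": "assistant", "content": "Here's my summarized recommendation of products, and why it would suit you:"})
    return messages

msgs = build_messages(
    "Hi! Can you recommend a good moisturizer for me?",
    ["Aveeno, moisturizer, for dry skin"],
    ["CeraVe, moisturizer, for dry skin", "First Aid Beauty, moisturizer, for dry skin"],
)
print(len(msgs))  # 9
```

Because the function returns a fresh list on every call, re-running it with a different question or product ranking cannot leave stale messages behind.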

In [None]:
completion = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=message_objects
)

print(completion.choices[0].message['content'])

Based on your order history, I recommend the following products:

1. Aveeno Moisturizer for Dry Skin: This moisturizer is specifically formulated for dry skin, making it a great choice for you. Aveeno is known for its gentle and nourishing formulas, and this moisturizer is no exception. It helps to replenish and hydrate the skin, leaving it soft and smooth.

2. Maybelline Foundation with Medium Coverage: This foundation is a reliable choice for achieving medium coverage. It helps to even out your skin tone and hide any imperfections, while still maintaining a natural look. Maybelline is a well-known brand with a range of shades, so you should be able to find a good match for your skin tone.

3. Chanel Concealer with Medium Coverage: Chanel is a luxury brand known for its high-quality products. This concealer offers medium coverage, which is great for concealing dark circles, blemishes, and other areas of concern. It has a creamy texture that blends seamlessly into the skin for a flawle