# Amazon product recommendation from a user's natural language query (POC)
In this notebook we use the vectors and Faiss indexes that we created in the previous notebook (title_vectorization.ipynb) to retrieve and recommend the items in our database that are closest to the user's natural language query.

In [19]:
import pandas as pd
import numpy as np
import faiss
from transformers import MPNetModel, MPNetTokenizer
#from sentence_transformers import SentenceTransformer
import torch
from tqdm.auto import tqdm
pd.set_option('display.max_rows', 50)
pd.set_option('display.max_colwidth', None)


Encoding and index search functions

In [20]:
# Load the tokenizer and model once so that repeated queries do not reload them
MODEL_NAME = 'sentence-transformers/all-mpnet-base-v2'
tokenizer = MPNetTokenizer.from_pretrained(MODEL_NAME)
model = MPNetModel.from_pretrained(MODEL_NAME).cuda()  # Move model to GPU
model.eval()  # Evaluation mode

# Batch encoding function with GPU acceleration
def encode_texts_in_batches_gpu(texts, batch_size=32):
    all_embeddings = []

    for i in tqdm(range(0, len(texts), batch_size), desc="Encoding Texts"):
        batch_texts = texts[i:i + batch_size]
        encoded_input = tokenizer(batch_texts, padding=True, truncation=True, max_length=128, return_tensors='pt').to('cuda')

        with torch.no_grad():
            model_output = model(**encoded_input)
        # Plain mean over every token position (padding included); this must match the pooling used to build the index
        embeddings = model_output.last_hidden_state.mean(dim=1).cpu().numpy()  # Move embeddings back to CPU
        all_embeddings.extend(embeddings)

    return np.array(all_embeddings)

# Function to search the Faiss index
def search_query(query, df, index, k=3):
    query_vec = encode_texts_in_batches_gpu([query])  # Vectorize query, shape (1, dim)
    distances, indices = index.search(query_vec.astype('float32'), k)  # Faiss expects float32 input
    closest_texts = df.iloc[indices[0]]['title'].values
    closest_ids = df.iloc[indices[0]]['parent_asin'].values
    return closest_texts, closest_ids, distances
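
The encoding function averages `last_hidden_state` over every token position, so in a padded batch the padding positions also contribute to the mean. A common alternative with sentence-transformers models is mean pooling weighted by the attention mask. The sketch below illustrates the difference in NumPy on made-up toy arrays; `masked_mean_pool` is a hypothetical helper, not part of the notebook. Switching the pooling would only make sense if the index vectors were re-encoded the same way.

```python
import numpy as np

def masked_mean_pool(token_embeddings, attention_mask):
    """Average token embeddings, counting only positions where the mask is 1.

    token_embeddings: array of shape (batch, seq_len, dim)
    attention_mask:   array of shape (batch, seq_len) with 1 for real tokens, 0 for padding
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=1)                   # sum over real tokens only
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                   # avoid division by zero
    return summed / counts

# Toy example (made-up numbers): two real tokens followed by one padding position
toy_embeddings = np.array([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
toy_mask = np.array([[1, 1, 0]])

pooled_masked = masked_mean_pool(toy_embeddings, toy_mask)  # ignores the padding row
pooled_plain = toy_embeddings.mean(axis=1)                  # padding row shifts the mean
```

For a single query encoded on its own there is no padding, so both poolings give the same vector; the difference only appears when texts of different lengths share a batch.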



Loading the index and vector files, encoding the user's query, then searching for and retrieving the titles closest to that query.

In [21]:


# Loading vectorized texts and Faiss index for demonstration
df_loaded = pd.read_pickle("vectors_indeces/vectorized_texts_v2.pkl")
index_loaded = faiss.read_index("vectors_indeces/faiss_index_v2.bin")

# Perform a search query
query = "I have a dry skin and need a moisturizer for my face. What do you recommend?"
closest_titles, closest_parent_asin, distances = search_query(query, df_loaded, index_loaded)

print("Closest text(s):", closest_titles)
print("Corresponding ID(s):", closest_parent_asin)
print("Distances:", distances)


Encoding Texts: 100%|██████████| 1/1 [00:00<00:00, 76.76it/s]

Closest text(s): ['Best Face Moisturizer For Oily Skin – Rapid Absorbing Facial Moisturizer For Face, Facial Cream With Silky Feel - For Oily Skin Face Cream For Women – Organic Ocean Minerals, Aloe Vera,'
 'IOPE Super Vital Cream Bio Excellent 50ml With Gift Set / best moisturizer for dry skin'
 'My Face Hyaluronic Acid Face Wash Cream With Collagen Konjac Sponge - Restore And Radiate Hydrating Facial Cleanser Kit - Oil Free, Hypoallergenic - Gentle Exfoliating Cleansing For Dry Skin']
Corresponding ID(s): ['B07FK6V8MG' 'B01MEE6DT7' 'B0854NXJX8']
Distances: [[5.630227 5.643558 5.902549]]
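
To help read the distances printed above: for a flat L2 Faiss index (`IndexFlatL2`), `search` returns squared Euclidean distances, smallest first. Whether `faiss_index_v2.bin` is such an index is not stated here, so treat that as an assumption. Under it, the search is equivalent to the brute-force NumPy sketch below, shown on made-up toy vectors with the hypothetical helper `brute_force_l2_search`.

```python
import numpy as np

def brute_force_l2_search(query_vec, vectors, k=3):
    """Return the k smallest squared L2 distances and the matching row indices."""
    diffs = vectors - query_vec                         # broadcast the query over all rows
    squared_dists = np.einsum('ij,ij->i', diffs, diffs)  # row-wise squared norm
    order = np.argsort(squared_dists)[:k]               # nearest rows first
    return squared_dists[order], order

# Toy data (made-up): four 2-d "item vectors" and one query
toy_vectors = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]], dtype='float32')
toy_query = np.array([0.9, 0.1], dtype='float32')

toy_distances, toy_indices = brute_force_l2_search(toy_query, toy_vectors, k=3)
print(toy_indices, toy_distances)
```

Smaller values mean closer matches. The absolute numbers depend on the embedding scale, so they are mainly useful for ranking results within one query.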





Retrieve and organize the reviews for each retrieved parent item.

In [22]:
df_retrieve = pd.DataFrame({
    'parent_asin': closest_parent_asin,
    'distances': distances[0]
})
df_reviews = pd.read_csv('data/all_beauty_review_amazon.csv',usecols=['parent_asin','title','text','rating'])


In [23]:
filtered_df = df_reviews[df_reviews['parent_asin'].isin(closest_parent_asin)]
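# Note: str(x) stringifies the whole Series, so each aggregated value also carries the row index and the "Name: text, dtype: object" footer
aggregated_reviews = filtered_df.groupby('parent_asin')['text'].agg(lambda x: ''.join(str(x))).reset_index()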
avg_rating = filtered_df.groupby('parent_asin')['rating'].mean().reset_index()
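
Because the aggregation above calls `str(x)` on the whole group, the aggregated text contains the pandas representation of the Series (row index, `Name: text, dtype: object`) rather than just the reviews. A minimal sketch of an alternative, assuming the goal is one clean string of reviews per product, joins the review strings themselves and computes the mean rating in the same pass. `aggregate_reviews` and the toy rows are hypothetical.

```python
import pandas as pd

def aggregate_reviews(reviews: pd.DataFrame) -> pd.DataFrame:
    """Join review texts per product and compute the mean rating in one groupby."""
    return (
        reviews.groupby('parent_asin')
        .agg(
            # Join the individual review strings, one per line, skipping missing texts
            all_reviews=('text', lambda s: '\n'.join(s.dropna().astype(str))),
            avg_rating=('rating', 'mean'),
        )
        .reset_index()
    )

# Toy data (made-up rows) with the same columns as the review file
toy_reviews = pd.DataFrame({
    'parent_asin': ['A1', 'A1', 'B2'],
    'text': ['Great cream.', 'Too oily.', 'Works fine.'],
    'rating': [5, 3, 4],
})
toy_aggregated = aggregate_reviews(toy_reviews)
```

Doing both aggregations in one `agg` call also removes the need to line up two separate groupby results by position.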

In [24]:
new_df = pd.DataFrame({
    'parent_asin': aggregated_reviews['parent_asin'],
    'all_reviews': aggregated_reviews['text'],
    'avg_rating': avg_rating['rating']
})

df_final = df_retrieve.merge(new_df, on='parent_asin', how='left')
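
With `how='left'`, a retrieved product that has no rows in the review file keeps its row but gets `NaN` in `all_reviews` and `avg_rating`. The sketch below, on made-up toy frames, uses the `indicator` option of `merge` to flag such products and replaces the missing text with an empty string, so a later step can skip them. The column name `has_reviews` is an assumption for illustration.

```python
import pandas as pd

# Toy data (made-up): three retrieved products, reviews available for only two of them
toy_retrieved = pd.DataFrame({
    'parent_asin': ['A1', 'B2', 'C3'],
    'distances': [0.1, 0.2, 0.3],
})
toy_review_stats = pd.DataFrame({
    'parent_asin': ['A1', 'B2'],
    'all_reviews': ['Great cream.', 'Works fine.'],
    'avg_rating': [5.0, 4.0],
})

# indicator=True adds a '_merge' column saying where each row came from
toy_merged = toy_retrieved.merge(toy_review_stats, on='parent_asin', how='left', indicator=True)
toy_merged['has_reviews'] = toy_merged['_merge'] == 'both'
toy_merged = toy_merged.drop(columns='_merge')

# Replace missing review text so downstream string handling does not receive NaN
toy_merged['all_reviews'] = toy_merged['all_reviews'].fillna('')
```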

In [25]:
df_final

Unnamed: 0,parent_asin,distances,all_reviews,avg_rating
0,B07FK6V8MG,5.630227,"118909 Just too heavy & felt oily on my already oily skin. I wanted it to be the ""one"", but...I use on my neck & chest. So it's great, for me, for that.\nName: text, dtype: object",3.0
1,B01MEE6DT7,5.643558,"257674 This kit is awesome!!! Although I bought it mostly for the face cream, I also appreciate the other products, especially the softener. I apply the softener after I apply my essences and it feels so soft and amazing. I also enjoy the emulsion for the daytime as a lighter moisturizer option. The serum was ok, I didn't feel much of a difference from applying it and not applying it, and same about the eyecream. I have dry/combination skin, on the sensitive side. Overall, it's a great kit, but the face cream 100% steals the show. I will most likely purchase it again.\n315015 I really like it.\nName: text, dtype: object",5.0
2,B0854NXJX8,5.902549,"5673 I like the sponge that comes with it. It comes wrapped in plastic and feels very hard, like a pumice stone. Once you soak it in water though, it becomes very soft and malleable. The sponge alone is a 5/5.<br /><br />The cleanser on the other hand isn't a favorite of mine. It does not have an inner seal and the smell is just meh. I don't know how another reviewer got a citrus smell since the ingredients don't include fragrance. I think the cleanser is a little too harsh and drying for my skin, it might be better for people with very oily skin.\n442674 This is a great cleanser your face feels nice and smooth afterwards it got all the gunk off my face when I had when I have my moisturizer and stuff on and gets it all off great buy for the money\nName: text, dtype: object",4.0


Processing all comments with an LLM to write a report covering the reviewers' sentiment toward the product as well as its pros and cons, based on the user reviews.

In [26]:
# Using the Neural Chat 7B model served locally to summarize the reviews.
# The local server follows the OpenAI API, so to run this against OpenAI itself, replace base_url and api_key with the OpenAI endpoint and a real API key.
from openai import OpenAI

# Point to the local server; one client is enough for all requests
client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

system_prompt = (
    "Here is a list of reviews for a particular product. Please process them and create "
    "a summary of all the important points mentioned in the reviews in one paragraph, "
    "including the overall sentiment of reviewers toward this product as well as the "
    "positive and negative aspects of the product. Arrange all of that under one element "
    "called final verdict. Don't include any of the review texts in the final output. "
    'The final result should look like this example: "Final Verdict: The product is great. '
    'It is very effective and works well. However, it is a bit expensive. Overall, it is '
    'a good product."'
)
summary_reviews = []

for reviews in df_final['all_reviews']:
    completion = client.chat.completions.create(
        model="local-model",  # this field is currently unused
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": reviews},
        ],
        temperature=0.7,
    )
    summary_reviews.append(completion.choices[0].message.content)
df_final['summary_reviews'] = summary_reviews

Final dataframe containing the recommended products, the average rating given by the users who left reviews, and the final verdict based on the aggregated user comments.

In [27]:
df_final

Unnamed: 0,parent_asin,distances,all_reviews,avg_rating,summary_reviews
0,B07FK6V8MG,5.630227,"118909 Just too heavy & felt oily on my already oily skin. I wanted it to be the ""one"", but...I use on my neck & chest. So it's great, for me, for that.\nName: text, dtype: object",3.0,"179158 Excellent product - really works well and long-lasting. It is a bit expensive though.\nFinal Verdict: The product is great and effective but might be costly. It has positive feedback on its performance and can cater to specific body parts like neck and chest for some users. However, the high price remains an issue."
1,B01MEE6DT7,5.643558,"257674 This kit is awesome!!! Although I bought it mostly for the face cream, I also appreciate the other products, especially the softener. I apply the softener after I apply my essences and it feels so soft and amazing. I also enjoy the emulsion for the daytime as a lighter moisturizer option. The serum was ok, I didn't feel much of a difference from applying it and not applying it, and same about the eyecream. I have dry/combination skin, on the sensitive side. Overall, it's a great kit, but the face cream 100% steals the show. I will most likely purchase it again.\n315015 I really like it.\nName: text, dtype: object",5.0,Final Verdict: The product is well-liked with great face cream and other useful items. It suits dry/combination skin on the sensitive side. Some users find other products less effective while others are satisfied with overall performance. It can be expensive but worth it for its standout features.
2,B0854NXJX8,5.902549,"5673 I like the sponge that comes with it. It comes wrapped in plastic and feels very hard, like a pumice stone. Once you soak it in water though, it becomes very soft and malleable. The sponge alone is a 5/5.<br /><br />The cleanser on the other hand isn't a favorite of mine. It does not have an inner seal and the smell is just meh. I don't know how another reviewer got a citrus smell since the ingredients don't include fragrance. I think the cleanser is a little too harsh and drying for my skin, it might be better for people with very oily skin.\n442674 This is a great cleanser your face feels nice and smooth afterwards it got all the gunk off my face when I had when I have my moisturizer and stuff on and gets it all off great buy for the money\nName: text, dtype: object",4.0,"Final Verdict: The product has mixed reviews. It is appreciated for its effective sponge and cleanser's ability to remove makeup. However, there are concerns about the cleanser's harshness, smell, and suitability for particular skin types."
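
In row 0 of the output above, `summary_reviews` contains extra text before the `Final Verdict:` marker, even though the prompt asks for the verdict only. One possible post-processing step, sketched here with the hypothetical helper `extract_final_verdict`, keeps only the part of the response that starts at the marker and falls back to the full text when the marker is missing.

```python
def extract_final_verdict(summary: str, marker: str = "Final Verdict:") -> str:
    """Return the response from the marker onward, or the whole response if no marker is found."""
    position = summary.find(marker)
    if position == -1:
        return summary.strip()  # no marker: keep the response unchanged apart from whitespace
    return summary[position:].strip()

# Toy responses (made-up strings) showing both cases
toy_with_preamble = "Some leading text the model added.\nFinal Verdict: The product is great."
toy_without_marker = "  The product is great.  "

cleaned_with_preamble = extract_final_verdict(toy_with_preamble)
cleaned_without_marker = extract_final_verdict(toy_without_marker)
```

Applied to the notebook's dataframe, this could look like `df_final['summary_reviews'].apply(extract_final_verdict)`. It trims the output but does not check whether the verdict is faithful to the reviews.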
