# Cosine Similarity-Based Product Recommendation
This workflow demonstrates how to build a feature matrix and use cosine similarity to recommend similar products.

In the early stages of this project, the goal was to build a streamlined, metadata-driven recommender. However, as we moved through preprocessing and feature engineering, the data resisted that simplification. The very techniques intended to simplify the model, Correlation Analysis and Lasso Regression, instead acted as a stress test that simple models could not pass.

To overcome the limitations of single-dimensional filtering, we adopted a multi-model hybrid architecture. This approach uses several recommendation 'experts', each relying on cosine similarity to measure product-to-product relationships, whose outputs are then synthesized into a unified ensemble.

We tried 10 individual models and then combined them in a weighted hybrid model.

To ensure each model contributes effectively, we evaluated their performance against our established Key Performance Indicators (KPIs). The final integration was achieved by assigning strategic weights to each model. 

To determine the optimal weight distribution, we employed a Randomized Search optimization process. This allowed the system to iteratively test thousands of weight combinations, ultimately identifying the configuration that maximized the NDCG (Normalized Discounted Cumulative Gain) score, ensuring our recommendations are both relevant and accurately ranked.
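
The randomized search described above could look roughly like the following sketch. It is an illustration under stated assumptions, not the project's actual optimizer: the function names (`randomized_weight_search`, `blended_ndcg`, and so on), the toy model scores, and the relevance labels are all hypothetical, and NDCG is written out by hand with linear gain so the snippet has no dependencies.

```python
import math
import random

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k relevance grades (linear gain)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(ranked_relevances, k):
    """DCG of the given ordering divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    return dcg_at_k(ranked_relevances, k) / ideal if ideal > 0 else 0.0

def random_weights(n_models, rng):
    """Sample non-negative weights that sum to 1 (normalized exponentials)."""
    raw = [rng.expovariate(1.0) for _ in range(n_models)]
    total = sum(raw)
    return [w / total for w in raw]

def blended_ndcg(weights, model_scores, relevance, k):
    """Rank candidates by the weighted sum of model scores, then score with NDCG@k."""
    n_items = len(relevance)
    blended = [sum(w * scores[i] for w, scores in zip(weights, model_scores))
               for i in range(n_items)]
    order = sorted(range(n_items), key=lambda i: blended[i], reverse=True)
    return ndcg_at_k([relevance[i] for i in order], k)

def randomized_weight_search(model_scores, relevance, k=5, n_trials=2000, seed=0):
    """Try n_trials random weight vectors and keep the one with the highest NDCG@k."""
    rng = random.Random(seed)
    best_weights, best_score = None, -1.0
    for _ in range(n_trials):
        weights = random_weights(len(model_scores), rng)
        score = blended_ndcg(weights, model_scores, relevance, k)
        if score > best_score:
            best_weights, best_score = weights, score
    return best_weights, best_score

# Toy data (hypothetical): 3 models scoring 6 candidates, with graded relevance labels.
toy_model_scores = [
    [0.9, 0.1, 0.4, 0.3, 0.2, 0.8],   # model A
    [0.2, 0.9, 0.1, 0.5, 0.3, 0.4],   # model B
    [0.8, 0.2, 0.7, 0.1, 0.1, 0.6],   # model C
]
toy_relevance = [3, 0, 2, 0, 1, 3]

best_weights, best_score = randomized_weight_search(toy_model_scores, toy_relevance)
print([round(w, 3) for w in best_weights], round(best_score, 3))
```

Drawing the weights as normalized exponentials samples uniformly from all non-negative weight vectors that sum to 1, and the fixed seed keeps the search reproducible.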

Here are our KPIs and their corresponding targets:

- NDCG@5: Is the entire Top-5 list sorted perfectly from best to worst? Target: > 0.80
- MRR (Mean Reciprocal Rank): How quickly does the user find the first relevant item? Target: 0.70
- Latency: How fast is the recommendation delivered? Target: < 200 ms
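
For reference, MRR and latency can each be measured with a few lines of plain Python. The sketch below is illustrative only: `mean_reciprocal_rank`, `timed_call_ms`, and the toy product IDs are hypothetical names and are not part of `model_training`. NDCG@5 itself can be computed with `sklearn.metrics.ndcg_score` by passing `k=5`.

```python
import time

def mean_reciprocal_rank(ranked_lists, relevant_sets):
    """Average of 1/rank of the first relevant item per query (0 if none is found)."""
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for rank, item in enumerate(ranked, start=1):
            if item in relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_lists) if ranked_lists else 0.0

def timed_call_ms(func, *args, **kwargs):
    """Run func once and return (result, elapsed wall-clock milliseconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, (time.perf_counter() - start) * 1000.0

# Hypothetical example: two queries with their Top-5 lists and ground-truth relevant items.
ranked_lists = [["p1", "p2", "p3", "p4", "p5"], ["p9", "p8", "p7", "p6", "p5"]]
relevant_sets = [{"p1"}, {"p8", "p5"}]
mrr = mean_reciprocal_rank(ranked_lists, relevant_sets)   # (1/1 + 1/2) / 2 = 0.75

# Time an arbitrary callable; a recommender call would be timed the same way.
_, latency_ms = timed_call_ms(sorted, range(1000))
print(f"MRR = {mrr:.2f}, latency = {latency_ms:.3f} ms")
```

A single timed call is noisy, so a latency KPI is usually checked against a percentile over many calls rather than one measurement.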

In [23]:
# Import libraries
import numpy as np
import sys
import os
import pandas as pd
from sklearn.metrics import ndcg_score
sys.path.append(os.path.abspath('../src'))
from model_training import (
    get_recommendations, get_recommendations_with_pca,
    get_content_recommendations, get_content_recommendations_pca,
    get_review_recommendations, get_sentiment_recommendations,
    get_topic_recommendations, get_reviewer_overlap_recommendations,
    get_weighted_hybrid_recommendations, load_all_models,
    get_knn_recommendations, prepare_features_for_knn,
    svd_product_recommendations, svd_product_recommendations_optimized,
)
df = pd.read_parquet('../app/dataset/product_features.parquet')
#print(df.head())
load_all_models()
product_id = "B07JW9H4J1"

All models and data loaded successfully.


# Model Training and Comparison


## Collaborative Filtering: Matrix Factorization (SVD)
This cell demonstrates how to use Singular Value Decomposition (SVD) for collaborative filtering product recommendations. SVD is a popular matrix factorization technique for recommender systems.
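
As a rough illustration of the idea, the sketch below factorizes a tiny user x item rating matrix with NumPy and compares items in the latent space. The internals of `svd_product_recommendations` are not shown in this notebook, so everything here (the helper names `svd_item_similarity` and `top_n_similar`, the toy matrix, the choice of two factors) is an assumption made for demonstration.

```python
import numpy as np

def svd_item_similarity(ratings, n_factors=2):
    """Factorize a user x item matrix and return item-item cosine similarity
    computed in the latent-factor space."""
    # Thin SVD: ratings = U @ diag(s) @ Vt
    _, s, vt = np.linalg.svd(ratings, full_matrices=False)
    # Item factors: one row per item, scaled by the singular values.
    item_factors = (np.diag(s[:n_factors]) @ vt[:n_factors]).T
    norms = np.linalg.norm(item_factors, axis=1, keepdims=True)
    unit = item_factors / np.where(norms == 0, 1.0, norms)
    return unit @ unit.T

def top_n_similar(similarity, item_idx, n=5):
    """Indices of the n most similar items, excluding the query item itself."""
    order = np.argsort(-similarity[item_idx])
    return [int(i) for i in order if i != item_idx][:n]

# Toy user x item rating matrix (hypothetical; 0 means "not rated").
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [0, 1, 5, 4],
    [1, 0, 4, 5],
], dtype=float)

sim = svd_item_similarity(ratings, n_factors=2)
print(top_n_similar(sim, item_idx=0, n=2))
```

The helper drops the query item from its own neighbour list, so a product is never recommended to itself.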

In [3]:
import os
import sys
sys.path.append(os.path.abspath('../src'))
from model_training import svd_product_recommendations
try:
    similar_products = svd_product_recommendations(df, product_id, n=5)
    print(similar_products)
except Exception as e:
    print(f"Error running svd_product_recommendations: {e}")

     product_id                                       product_name  rating  \
6    B07JW1Y6XV  Wayona Nylon Braided 3A Lightning to USB A Syn...     4.2   
23   B07LGT55SJ  Wayona Usb Nylon Braided Data Sync And Chargin...     4.2   
30   B07JH1C41D  Wayona Nylon Braided (2 Pack) Lightning Fast U...     4.2   
46   B07JGDB5M1  Wayona Nylon Braided 2M / 6Ft Fast Charge Usb ...     4.2   
160  B07JH1CBGW  Wayona Nylon Braided Usb Syncing And Charging ...     4.2   

     discounted_price  
6               399.0  
23              399.0  
30              649.0  
46              449.0  
160             649.0  


In [4]:
from model_training import svd_product_recommendations

# Example: print the SVD-based recommendations for a specific product_id
try:
    similar_products_svd = svd_product_recommendations(df, product_id, n=5)
    print(f"Top similar products for product_id {product_id} (SVD):")
    print(similar_products_svd)
except Exception as e:
    print(f"Error: {e}")

Top similar products for product_id B07JW9H4J1 (SVD):
     product_id                                       product_name  rating  \
6    B07JW1Y6XV  Wayona Nylon Braided 3A Lightning to USB A Syn...     4.2   
23   B07LGT55SJ  Wayona Usb Nylon Braided Data Sync And Chargin...     4.2   
30   B07JH1C41D  Wayona Nylon Braided (2 Pack) Lightning Fast U...     4.2   
160  B07JH1CBGW  Wayona Nylon Braided Usb Syncing And Charging ...     4.2   
500  B07JW9H4J1  Wayona Nylon Braided USB to Lightning Fast Cha...     4.2   

     discounted_price  
6               399.0  
23              399.0  
30              649.0  
160             649.0  
500             399.0  


## Content-Based Filtering: KNN on Product Features
This cell demonstrates how to use K-Nearest Neighbors (KNN) on product features for content-based recommendations.
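
A minimal brute-force version of the technique is sketched below in plain Python. It assumes Euclidean distance on standardized numeric features; the actual metric and feature set used by `prepare_features_for_knn` and `get_knn_recommendations` are not shown here, and the helper names and toy rows are hypothetical.

```python
import math

def standardize(rows):
    """Scale every column to zero mean and unit variance so no feature dominates."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def knn_indices(features, query_idx, k=5):
    """Brute-force k nearest neighbours by Euclidean distance, excluding the query row."""
    query = features[query_idx]
    distances = [(math.dist(query, row), idx)
                 for idx, row in enumerate(features) if idx != query_idx]
    return [idx for _, idx in sorted(distances)[:k]]

# Hypothetical numeric features per product: [rating, discounted_price, discount_percentage]
raw_features = [
    [4.2, 399.0, 64],
    [4.2, 399.0, 64],
    [4.1, 399.0, 64],
    [4.1, 365.0, 63],
    [3.5, 1999.0, 20],
]
neighbours = knn_indices(standardize(raw_features), query_idx=0, k=3)
print(neighbours)
```

Standardizing first matters because raw prices span hundreds of units while ratings span less than one, so unscaled distances would be driven almost entirely by price.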

In [5]:
import os
import sys
sys.path.append(os.path.abspath('../src'))
from model_training import prepare_features_for_knn, get_knn_recommendations

# Prepare features
df, X = prepare_features_for_knn(df)


# Find the row index for this product_id
if product_id in df['product_id'].values:
    product_idx = int(df.index[df['product_id'] == product_id][0])
    try:
        similar_products = get_knn_recommendations(df, X, product_idx, 5)
        display(similar_products)
    except Exception as e:
        print(f"KNN error: {e}")
else:
    print(f"Product ID {product_id} not found in the DataFrame.")

Unnamed: 0,product_id,product_name,rating,discounted_price
6,B07JW1Y6XV,Wayona Nylon Braided 3A Lightning to USB A Syn...,4.2,399.0
23,B07LGT55SJ,Wayona Usb Nylon Braided Data Sync And Chargin...,4.2,399.0
117,B08PSVBB2X,Zoul USB C to USB C Fast Charging Cable 65W Ty...,4.1,399.0
80,B08PSQRW2T,Zoul Type C to Type C Fast Charging Cable 65W ...,4.1,399.0
470,B071Z8M4KX,boAt BassHeads 100 in-Ear Wired Headphones wit...,4.1,365.0


## Cosine Similarity (Classic)
This method uses cosine similarity on product features to recommend similar products.
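
To make the computation concrete, here is a small dependency-free sketch of cosine similarity over a feature matrix. The feature layout (two one-hot category columns plus a scaled price), the product IDs, and the helper `recommend_by_cosine` are hypothetical and are not taken from `get_recommendations`.

```python
import math

def cosine_similarity(a, b):
    """cos(a, b) = (a . b) / (|a| * |b|); returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend_by_cosine(feature_matrix, product_ids, query_id, n=5):
    """Return the n (score, product_id) pairs most similar to query_id, best first."""
    query_vec = feature_matrix[product_ids.index(query_id)]
    scored = [(cosine_similarity(query_vec, vec), pid)
              for pid, vec in zip(product_ids, feature_matrix) if pid != query_id]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:n]

# Hypothetical feature matrix: [is_cable, is_headphone, scaled_price]
product_ids = ["A", "B", "C", "D"]
feature_matrix = [
    [1, 0, 0.4],
    [1, 0, 0.5],
    [0, 1, 0.4],
    [0, 1, 0.9],
]
top = recommend_by_cosine(feature_matrix, product_ids, "A", n=2)
print(top)
```

Cosine similarity compares direction rather than magnitude, so two products with proportionally scaled feature vectors count as identical.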

In [6]:
import os
import sys
sys.path.append(os.path.abspath('../src'))
from model_training import get_recommendations

try:
    cosine_recs = get_recommendations(df, product_id, N=5)
    if isinstance(cosine_recs, pd.DataFrame) and not cosine_recs.empty:
        display(cosine_recs)
    else:
        print("No recommendations found or Product ID not in database.")
except Exception as e:
    print(f"Error generating recommendations: {e}")

Unnamed: 0,product_id,product_name,category,discounted_price,actual_price,discount_percentage,rating,rating_count,about_product,user_id,user_name,review_id,review_title,review_content,img_link,product_link
0,B07LGT55SJ,Wayona Usb Nylon Braided Data Sync And Chargin...,Computers&Accessories|Accessories&Peripherals|...,399.0,1099.0,64,4.2,0.0,[High Compatibility] : Compatible For iPhone X...,"AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/41xmv3WPs7...,https://www.amazon.in/Wayona-Braided-Syncing-C...
1,B07JW1Y6XV,Wayona Nylon Braided 3A Lightning to USB A Syn...,Computers&Accessories|Accessories&Peripherals|...,399.0,1099.0,64,4.2,0.0,"[High Compatibility] : iPhone X/XsMax/Xr ,iPho...","AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/41rB0DnVFm...,https://www.amazon.in/Wayona-Braided-WN3LB1-Sy...
2,B08PSQRW2T,Zoul Type C to Type C Fast Charging Cable 65W ...,Computers&Accessories|Accessories&Peripherals|...,399.0,1099.0,64,4.1,0.0,【NOTE before purchase】:This is a USB C to USB ...,"AFAQLRAKYASFXOQP7MS6SZK4STIQ,AGGQ72HVXMSQN3ZPG...","Abhay Goyal,Apurva,Shilpa,Vishnu Narayanan,Kus...","R1PCC1YKW3I4G8,RCUHBFP4RIAI5,RXEJH230ZKTRM,RNK...","Changing speed,Make it better,Superb Build Qua...","The product was nice its charging awesome,Cant...",https://m.media-amazon.com/images/I/41wI9GGhTH...,https://www.amazon.in/Charging-Braided-Compati...
3,B08PSVBB2X,Zoul USB C to USB C Fast Charging Cable 65W Ty...,Computers&Accessories|Accessories&Peripherals|...,399.0,1099.0,64,4.1,0.0,【High Charging Speed 65W】: Output power up to ...,"AFAQLRAKYASFXOQP7MS6SZK4STIQ,AGGQ72HVXMSQN3ZPG...","Abhay Goyal,Apurva,Shilpa,Vishnu Narayanan,Kus...","R1PCC1YKW3I4G8,RCUHBFP4RIAI5,RXEJH230ZKTRM,RNK...","Changing speed,Make it better,Superb Build Qua...","The product was nice its charging awesome,Cant...",https://m.media-amazon.com/images/I/41EhlNJ-v8...,https://www.amazon.in/Charging-Braided-Compati...
4,B071Z8M4KX,boAt BassHeads 100 in-Ear Wired Headphones wit...,"Electronics|Headphones,Earbuds&Accessories|Hea...",365.0,999.0,63,4.1,0.0,The perfect way to add some style and stand ou...,"AF4MVO4JNFDEPWFKZO62OAJKRIWA,AHVPAXEWPATRASBKH...","tarun kumar,mahesh radheshyam tawari,Blackspad...","R2DD2M5YARW7R2,R2M9ZYNGGV1ZLN,RNWNTRNLSJWSB,R3...","Best value for money,HEAD PHONE POUCH NOT RECE...",The sound quality of this earphone are really ...,https://m.media-amazon.com/images/I/31IdiM9ZM8...,https://www.amazon.in/boAt-BassHeads-100-Headp...


## Cosine Similarity with PCA
This method applies PCA to reduce feature dimensionality before computing cosine similarity.
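
The two steps, reduce then compare, can be sketched with NumPy as follows. This is a generic illustration that assumes the features are already standardized; the random matrix stands in for real product features, and `pca_reduce` and `cosine_matrix` are hypothetical helpers rather than the code behind `get_recommendations_with_pca`.

```python
import numpy as np

def pca_reduce(features, n_components=2):
    """Project mean-centred features onto their top principal components (via SVD)."""
    centred = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:n_components].T

def cosine_matrix(vectors):
    """Pairwise cosine similarity between the rows of `vectors`."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.where(norms == 0, 1.0, norms)
    return unit @ unit.T

# Hypothetical stand-in for a standardized product feature matrix (6 products x 5 features).
rng = np.random.default_rng(0)
features = rng.normal(size=(6, 5))

reduced = pca_reduce(features, n_components=2)
similarity = cosine_matrix(reduced)

# Rank every other product by similarity to the query product.
query_idx = 0
ranking = [int(i) for i in np.argsort(-similarity[query_idx]) if i != query_idx]
print(reduced.shape, ranking[:3])
```

Reducing dimensionality first discards low-variance directions, which can make the similarity less sensitive to noisy or redundant features at the cost of some detail.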

In [7]:
import os
import sys
sys.path.append(os.path.abspath('../src'))
from model_training import get_recommendations_with_pca

product_id = "B07JW9H4J1"

try:
    pca_recs = get_recommendations_with_pca(df, product_id, N=5)
    if isinstance(pca_recs, pd.DataFrame) and not pca_recs.empty:
        display(pca_recs)
    else:
        print("No recommendations found or Product ID not in database.")
except Exception as e:
    print(f"Error generating recommendations: {e}")

Unnamed: 0,product_id,product_name,category,discounted_price,actual_price,discount_percentage,rating,rating_count,about_product,user_id,user_name,review_id,review_title,review_content,img_link,product_link
0,B07JW1Y6XV,Wayona Nylon Braided 3A Lightning to USB A Syn...,Computers&Accessories|Accessories&Peripherals|...,399.0,1099.0,64,4.2,0.0,"[High Compatibility] : iPhone X/XsMax/Xr ,iPho...","AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/41rB0DnVFm...,https://www.amazon.in/Wayona-Braided-WN3LB1-Sy...
1,B07LGT55SJ,Wayona Usb Nylon Braided Data Sync And Chargin...,Computers&Accessories|Accessories&Peripherals|...,399.0,1099.0,64,4.2,0.0,[High Compatibility] : Compatible For iPhone X...,"AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/41xmv3WPs7...,https://www.amazon.in/Wayona-Braided-Syncing-C...
2,B08PSVBB2X,Zoul USB C to USB C Fast Charging Cable 65W Ty...,Computers&Accessories|Accessories&Peripherals|...,399.0,1099.0,64,4.1,0.0,【High Charging Speed 65W】: Output power up to ...,"AFAQLRAKYASFXOQP7MS6SZK4STIQ,AGGQ72HVXMSQN3ZPG...","Abhay Goyal,Apurva,Shilpa,Vishnu Narayanan,Kus...","R1PCC1YKW3I4G8,RCUHBFP4RIAI5,RXEJH230ZKTRM,RNK...","Changing speed,Make it better,Superb Build Qua...","The product was nice its charging awesome,Cant...",https://m.media-amazon.com/images/I/41EhlNJ-v8...,https://www.amazon.in/Charging-Braided-Compati...
3,B08PSQRW2T,Zoul Type C to Type C Fast Charging Cable 65W ...,Computers&Accessories|Accessories&Peripherals|...,399.0,1099.0,64,4.1,0.0,【NOTE before purchase】:This is a USB C to USB ...,"AFAQLRAKYASFXOQP7MS6SZK4STIQ,AGGQ72HVXMSQN3ZPG...","Abhay Goyal,Apurva,Shilpa,Vishnu Narayanan,Kus...","R1PCC1YKW3I4G8,RCUHBFP4RIAI5,RXEJH230ZKTRM,RNK...","Changing speed,Make it better,Superb Build Qua...","The product was nice its charging awesome,Cant...",https://m.media-amazon.com/images/I/41wI9GGhTH...,https://www.amazon.in/Charging-Braided-Compati...
4,B081FJWN52,Wayona Usb Type C To Usb Nylon Braided Quick C...,Computers&Accessories|Accessories&Peripherals|...,339.0,999.0,66,4.3,0.0,✅【Fast Charge & Data Sync】: Fast charge& data ...,"AH3ZH5IE4MTFB3T33O3QSGLU4BBA,AEQHHPCXUH4O5BS4V...","SMG,Rohit,roy,Sukumar Ballavolu,Jeeva,Sirajdur...","R3CGMQSB9H564N,RG5V69YDA5TLP,R18ESJU4TI0EGY,R1...","Good pick for Galaxy Note 9,Durable and qualit...",I purchased the cable for my Galaxy Note 9 in ...,https://m.media-amazon.com/images/I/41etMsrKqT...,https://www.amazon.in/Wayona-Braided-Charger-C...


## Review-Based Recommendations (TF-IDF)
This method uses TF-IDF vectorization of product reviews to recommend similar products.
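
A hand-rolled sketch of the idea is shown below. It uses whitespace tokenization and the smoothed IDF form `ln((1 + N) / (1 + df)) + 1`, which is also scikit-learn's default for `TfidfVectorizer`; whether `get_review_recommendations` uses those settings is an assumption, and the review snippets and helper names are made up for the example.

```python
import math
from collections import Counter

def tfidf_vectors(documents):
    """Build sparse TF-IDF vectors (dicts) with smoothed IDF: ln((1 + N) / (1 + df)) + 1."""
    tokenized = [doc.lower().split() for doc in documents]
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(documents)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
                        for term, count in counts.items()})
    return vectors

def sparse_cosine(a, b):
    """Cosine similarity between two sparse vectors stored as term -> weight dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical review texts for three products.
reviews = [
    "charging is really fast and cable looks durable",
    "fast charging cable durable build",
    "sound quality of the earphone is good",
]
vectors = tfidf_vectors(reviews)
sims = [sparse_cosine(vectors[0], vec) for vec in vectors]
print([round(s, 3) for s in sims])
```

Terms that appear in many reviews receive a lower IDF weight, so shared product-specific words such as "charging" or "cable" drive the similarity more than filler words.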

In [8]:
from model_training import get_review_recommendations

try:
    review_recs = get_review_recommendations(df, product_id)
    if isinstance(review_recs, pd.DataFrame) and not review_recs.empty:
        display(review_recs)
    else:
        print("No review-based recommendations found.")
except Exception as e:
    print(f"Error generating review recommendations: {e}")

Unnamed: 0,product_id,product_name,category,discounted_price,actual_price,discount_percentage,rating,rating_count,about_product,user_id,user_name,review_id,review_title,review_content,img_link,product_link
0,B07JW9H4J1,Wayona Nylon Braided USB to Lightning Fast Cha...,Computers&Accessories|Accessories&Peripherals|...,399.0,1099.0,64,4.2,0.0,High Compatibility : Compatible With iPhone 12...,"AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/51UsScvHQN...,https://www.amazon.in/Wayona-Braided-WN3LG1-Sy...
1,B07JW1Y6XV,Wayona Nylon Braided 3A Lightning to USB A Syn...,Computers&Accessories|Accessories&Peripherals|...,399.0,1099.0,64,4.2,0.0,"[High Compatibility] : iPhone X/XsMax/Xr ,iPho...","AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/41rB0DnVFm...,https://www.amazon.in/Wayona-Braided-WN3LB1-Sy...
2,B07JGDB5M1,Wayona Nylon Braided 2M / 6Ft Fast Charge Usb ...,Computers&Accessories|Accessories&Peripherals|...,449.0,1299.0,65,4.2,0.0,"[High Compatibility] : Phone X/XsMax/Xr ,Phone...","AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/51UsScvHQN...,https://www.amazon.in/Wayona-Braided-WN6LG1-Sy...
3,B07JH1C41D,Wayona Nylon Braided (2 Pack) Lightning Fast U...,Computers&Accessories|Accessories&Peripherals|...,649.0,1999.0,68,4.2,0.0,"[High Compatibility] : iPhone X/XsMax/Xr ,iPho...","AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/412fvb7k2F...,https://www.amazon.in/Wayona-Braided-WN3LG2-Sy...
4,B07JH1CBGW,Wayona Nylon Braided Usb Syncing And Charging ...,Computers&Accessories|Accessories&Peripherals|...,649.0,1999.0,68,4.2,0.0,"[High Compatibility] : iPhone X/XsMax/Xr ,iPho...","AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/41eHLj-wfG...,https://www.amazon.in/Wayona-Braided-WN3LB2-Sy...


## Content-Based Recommendations (All Product Text Fields)
This method uses all available product text fields for content-based recommendations.

In [9]:
from model_training import get_content_recommendations

try:
    content_recs = get_content_recommendations(df, product_id)
    if isinstance(content_recs, pd.DataFrame) and not content_recs.empty:
        display(content_recs)
    else:
        print("No content-based recommendations found.")
except Exception as e:
    print(f"Error generating content recommendations: {e}")

Unnamed: 0,product_id,product_name,category,discounted_price,actual_price,discount_percentage,rating,rating_count,about_product,user_id,user_name,review_id,review_title,review_content,img_link,product_link
0,B07JH1CBGW,Wayona Nylon Braided Usb Syncing And Charging ...,Computers&Accessories|Accessories&Peripherals|...,649.0,1999.0,68,4.2,0.0,"[High Compatibility] : iPhone X/XsMax/Xr ,iPho...","AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/41eHLj-wfG...,https://www.amazon.in/Wayona-Braided-WN3LB2-Sy...
1,B07JW1Y6XV,Wayona Nylon Braided 3A Lightning to USB A Syn...,Computers&Accessories|Accessories&Peripherals|...,399.0,1099.0,64,4.2,0.0,"[High Compatibility] : iPhone X/XsMax/Xr ,iPho...","AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/41rB0DnVFm...,https://www.amazon.in/Wayona-Braided-WN3LB1-Sy...
2,B07JH1C41D,Wayona Nylon Braided (2 Pack) Lightning Fast U...,Computers&Accessories|Accessories&Peripherals|...,649.0,1999.0,68,4.2,0.0,"[High Compatibility] : iPhone X/XsMax/Xr ,iPho...","AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/412fvb7k2F...,https://www.amazon.in/Wayona-Braided-WN3LG2-Sy...
3,B07LGT55SJ,Wayona Usb Nylon Braided Data Sync And Chargin...,Computers&Accessories|Accessories&Peripherals|...,399.0,1099.0,64,4.2,0.0,[High Compatibility] : Compatible For iPhone X...,"AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/41xmv3WPs7...,https://www.amazon.in/Wayona-Braided-Syncing-C...
4,B07JGDB5M1,Wayona Nylon Braided 2M / 6Ft Fast Charge Usb ...,Computers&Accessories|Accessories&Peripherals|...,449.0,1299.0,65,4.2,0.0,"[High Compatibility] : Phone X/XsMax/Xr ,Phone...","AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/51UsScvHQN...,https://www.amazon.in/Wayona-Braided-WN6LG1-Sy...


## Content-Based with PCA (TF-IDF + PCA)
This method applies PCA to TF-IDF features from product text fields before recommending similar products.

In [10]:
from model_training import get_content_recommendations_pca

try:
    content_pca_recs = get_content_recommendations_pca(df, product_id)
    if isinstance(content_pca_recs, pd.DataFrame) and not content_pca_recs.empty:
        display(content_pca_recs)
    else:
        print("No content-PCA recommendations found.")
except Exception as e:
    print(f"Error generating content PCA recommendations: {e}")

Unnamed: 0,product_id,product_name,category,discounted_price,actual_price,discount_percentage,rating,rating_count,about_product,user_id,user_name,review_id,review_title,review_content,img_link,product_link
0,B07JW1Y6XV,Wayona Nylon Braided 3A Lightning to USB A Syn...,Computers&Accessories|Accessories&Peripherals|...,399.0,1099.0,64,4.2,0.0,"[High Compatibility] : iPhone X/XsMax/Xr ,iPho...","AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/41rB0DnVFm...,https://www.amazon.in/Wayona-Braided-WN3LB1-Sy...
1,B07JH1C41D,Wayona Nylon Braided (2 Pack) Lightning Fast U...,Computers&Accessories|Accessories&Peripherals|...,649.0,1999.0,68,4.2,0.0,"[High Compatibility] : iPhone X/XsMax/Xr ,iPho...","AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/412fvb7k2F...,https://www.amazon.in/Wayona-Braided-WN3LG2-Sy...
2,B07JH1CBGW,Wayona Nylon Braided Usb Syncing And Charging ...,Computers&Accessories|Accessories&Peripherals|...,649.0,1999.0,68,4.2,0.0,"[High Compatibility] : iPhone X/XsMax/Xr ,iPho...","AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/41eHLj-wfG...,https://www.amazon.in/Wayona-Braided-WN3LB2-Sy...
3,B07LGT55SJ,Wayona Usb Nylon Braided Data Sync And Chargin...,Computers&Accessories|Accessories&Peripherals|...,399.0,1099.0,64,4.2,0.0,[High Compatibility] : Compatible For iPhone X...,"AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/41xmv3WPs7...,https://www.amazon.in/Wayona-Braided-Syncing-C...
4,B07JGDB5M1,Wayona Nylon Braided 2M / 6Ft Fast Charge Usb ...,Computers&Accessories|Accessories&Peripherals|...,449.0,1299.0,65,4.2,0.0,"[High Compatibility] : Phone X/XsMax/Xr ,Phone...","AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/51UsScvHQN...,https://www.amazon.in/Wayona-Braided-WN6LG1-Sy...


## Sentiment-Based Recommendations
This method uses sentiment analysis scores from product reviews to recommend similar products.

In [11]:
from model_training import get_sentiment_recommendations

try:
    sentiment_recs = get_sentiment_recommendations(df, product_id)
    if isinstance(sentiment_recs, pd.DataFrame) and not sentiment_recs.empty:
        display(sentiment_recs)
    else:
        print("No sentiment-based recommendations found.")
except Exception as e:
    print(f"Error generating sentiment recommendations: {e}")

Unnamed: 0,product_id,product_name,category,discounted_price,actual_price,discount_percentage,rating,rating_count,about_product,user_id,user_name,review_id,review_title,review_content,img_link,product_link
0,B07LGT55SJ,Wayona Usb Nylon Braided Data Sync And Chargin...,Computers&Accessories|Accessories&Peripherals|...,399.0,1099.0,64,4.2,0.0,[High Compatibility] : Compatible For iPhone X...,"AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/41xmv3WPs7...,https://www.amazon.in/Wayona-Braided-Syncing-C...
1,B07JH1C41D,Wayona Nylon Braided (2 Pack) Lightning Fast U...,Computers&Accessories|Accessories&Peripherals|...,649.0,1999.0,68,4.2,0.0,"[High Compatibility] : iPhone X/XsMax/Xr ,iPho...","AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/412fvb7k2F...,https://www.amazon.in/Wayona-Braided-WN3LG2-Sy...
2,B07JGDB5M1,Wayona Nylon Braided 2M / 6Ft Fast Charge Usb ...,Computers&Accessories|Accessories&Peripherals|...,449.0,1299.0,65,4.2,0.0,"[High Compatibility] : Phone X/XsMax/Xr ,Phone...","AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/51UsScvHQN...,https://www.amazon.in/Wayona-Braided-WN6LG1-Sy...
3,B07JH1CBGW,Wayona Nylon Braided Usb Syncing And Charging ...,Computers&Accessories|Accessories&Peripherals|...,649.0,1999.0,68,4.2,0.0,"[High Compatibility] : iPhone X/XsMax/Xr ,iPho...","AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/41eHLj-wfG...,https://www.amazon.in/Wayona-Braided-WN3LB2-Sy...
4,B07JW9H4J1,Wayona Nylon Braided USB to Lightning Fast Cha...,Computers&Accessories|Accessories&Peripherals|...,399.0,1099.0,64,4.2,0.0,High Compatibility : Compatible With iPhone 12...,"AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/51UsScvHQN...,https://www.amazon.in/Wayona-Braided-WN3LG1-Sy...


## Topic Modeling (LDA)
This method uses topic modeling (LDA) on product reviews to recommend products with similar topics.
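
One plausible shape for this pipeline, sketched with scikit-learn, is: count words, fit LDA, then compare products by their topic mixtures. The number of topics, the stop-word setting, and the toy reviews are assumptions for illustration; `get_topic_recommendations` may be configured differently.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical review texts, one string per product.
reviews = [
    "fast charging cable durable braided cable",
    "charging cable fast and durable",
    "earphone sound quality bass is good",
    "good bass and clear sound from this earphone",
]

# Bag-of-words counts are the usual input for LDA.
counts = CountVectorizer(stop_words="english").fit_transform(reviews)

# Each row of topic_mix is one product's distribution over topics.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
topic_mix = lda.fit_transform(counts)

# Products with similar topic mixtures are treated as similar.
similarity = cosine_similarity(topic_mix)
query_idx = 0
ranked = [int(i) for i in similarity[query_idx].argsort()[::-1] if i != query_idx]
print(topic_mix.round(2))
print(ranked)
```

Fixing `random_state` makes the topic assignment repeatable; with so few documents the topics themselves are not meaningful, so the example only shows the mechanics.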

In [12]:
from model_training import get_topic_recommendations

try:
    topic_recs = get_topic_recommendations(df, product_id)
    if isinstance(topic_recs, pd.DataFrame) and not topic_recs.empty:
        display(topic_recs)
    else:
        print("No topic-modeling recommendations found.")
except Exception as e:
    print(f"Error generating topic recommendations: {e}")

Unnamed: 0,product_id,product_name,category,discounted_price,actual_price,discount_percentage,rating,rating_count,about_product,user_id,user_name,review_id,review_title,review_content,img_link,product_link
0,B07JW9H4J1,Wayona Nylon Braided USB to Lightning Fast Cha...,Computers&Accessories|Accessories&Peripherals|...,399.0,1099.0,64,4.2,0.0,High Compatibility : Compatible With iPhone 12...,"AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/51UsScvHQN...,https://www.amazon.in/Wayona-Braided-WN3LG1-Sy...
1,B07JW1Y6XV,Wayona Nylon Braided 3A Lightning to USB A Syn...,Computers&Accessories|Accessories&Peripherals|...,399.0,1099.0,64,4.2,0.0,"[High Compatibility] : iPhone X/XsMax/Xr ,iPho...","AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/41rB0DnVFm...,https://www.amazon.in/Wayona-Braided-WN3LB1-Sy...
2,B07JGDB5M1,Wayona Nylon Braided 2M / 6Ft Fast Charge Usb ...,Computers&Accessories|Accessories&Peripherals|...,449.0,1299.0,65,4.2,0.0,"[High Compatibility] : Phone X/XsMax/Xr ,Phone...","AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/51UsScvHQN...,https://www.amazon.in/Wayona-Braided-WN6LG1-Sy...
3,B07JH1C41D,Wayona Nylon Braided (2 Pack) Lightning Fast U...,Computers&Accessories|Accessories&Peripherals|...,649.0,1999.0,68,4.2,0.0,"[High Compatibility] : iPhone X/XsMax/Xr ,iPho...","AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/412fvb7k2F...,https://www.amazon.in/Wayona-Braided-WN3LG2-Sy...
4,B07JH1CBGW,Wayona Nylon Braided Usb Syncing And Charging ...,Computers&Accessories|Accessories&Peripherals|...,649.0,1999.0,68,4.2,0.0,"[High Compatibility] : iPhone X/XsMax/Xr ,iPho...","AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/41eHLj-wfG...,https://www.amazon.in/Wayona-Braided-WN3LB2-Sy...


## Reviewer Overlap Recommendations
This method recommends products based on overlap in reviewers (users who reviewed both products).
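
One simple way to quantify reviewer overlap is the Jaccard index between the reviewer sets of two products. The sketch below assumes that measure and assumes `user_id` is stored as a comma-separated string, as it appears in the outputs of this notebook; the scoring actually used by `get_reviewer_overlap_recommendations` is not shown, and the product and user IDs are hypothetical.

```python
def parse_reviewers(user_id_field):
    """Split a comma-separated user_id string into a set of reviewer IDs."""
    return {uid.strip() for uid in user_id_field.split(",") if uid.strip()}

def jaccard(a, b):
    """Size of the intersection divided by size of the union; 0.0 if both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def reviewer_overlap_ranking(reviewers_by_product, query_id, n=5):
    """Rank other products by reviewer overlap with query_id, dropping zero-overlap items."""
    query_reviewers = reviewers_by_product[query_id]
    scored = [(jaccard(query_reviewers, reviewers), pid)
              for pid, reviewers in reviewers_by_product.items() if pid != query_id]
    scored = [pair for pair in scored if pair[0] > 0]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:n]

# Hypothetical raw user_id fields keyed by product ID.
raw_user_ids = {
    "P1": "u1,u2,u3",
    "P2": "u1, u2, u3",
    "P3": "u3,u4",
    "P4": "u9",
}
reviewers_by_product = {pid: parse_reviewers(field) for pid, field in raw_user_ids.items()}
ranking = reviewer_overlap_ranking(reviewers_by_product, "P1")
print(ranking)
```

Listings that share an identical reviewer set score 1.0 against each other, so this signal tends to surface variants of the same product first.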

In [13]:
from model_training import get_reviewer_overlap_recommendations

try:
    reviewer_overlap_recs = get_reviewer_overlap_recommendations(df, product_id)
    if isinstance(reviewer_overlap_recs, pd.DataFrame) and not reviewer_overlap_recs.empty:
        display(reviewer_overlap_recs)
    else:
        print("No reviewer-overlap recommendations found.")
except Exception as e:
    print(f"Error generating reviewer-overlap recommendations: {e}")

Unnamed: 0,product_id,product_name,category,rating,discounted_price,img_link
0,B07JW1Y6XV,Wayona Nylon Braided 3A Lightning to USB A Syn...,Computers&Accessories|Accessories&Peripherals|...,4.2,399.0,https://m.media-amazon.com/images/I/41rB0DnVFm...
1,B07LGT55SJ,Wayona Usb Nylon Braided Data Sync And Chargin...,Computers&Accessories|Accessories&Peripherals|...,4.2,399.0,https://m.media-amazon.com/images/I/41xmv3WPs7...
2,B07JH1C41D,Wayona Nylon Braided (2 Pack) Lightning Fast U...,Computers&Accessories|Accessories&Peripherals|...,4.2,649.0,https://m.media-amazon.com/images/I/412fvb7k2F...
3,B07JGDB5M1,Wayona Nylon Braided 2M / 6Ft Fast Charge Usb ...,Computers&Accessories|Accessories&Peripherals|...,4.2,449.0,https://m.media-amazon.com/images/I/51UsScvHQN...
4,B07JH1CBGW,Wayona Nylon Braided Usb Syncing And Charging ...,Computers&Accessories|Accessories&Peripherals|...,4.2,649.0,https://m.media-amazon.com/images/I/41eHLj-wfG...


## Weighted Hybrid Recommendations
This method combines multiple recommendation sources using weighted scores for a hybrid approach.
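
The combination step can be pictured as a weighted rank fusion. In the sketch below each model adds `weight * (list_length - rank)` points to every product it returns; this scoring rule, the weights, and the product IDs are assumptions for illustration, and the notebook does not show how `get_weighted_hybrid_recommendations` actually derives `hybrid_score`.

```python
from collections import defaultdict

def weighted_hybrid(ranked_lists, weights, n=5):
    """Fuse per-model ranked lists: each model adds weight * (list_length - rank) points."""
    scores = defaultdict(float)
    for model, ranked in ranked_lists.items():
        weight = weights.get(model, 0.0)   # models without a weight contribute nothing
        for rank, pid in enumerate(ranked):
            scores[pid] += weight * (len(ranked) - rank)
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)[:n]

# Hypothetical Top-3 lists from three of the models, best match first.
ranked_lists = {
    "basic_cosine": ["A", "B", "C"],
    "review_text": ["B", "A", "D"],
    "reviewer_overlap": ["A", "C", "B"],
}
# Hypothetical weights; in practice these would come from the weight search.
weights = {"basic_cosine": 0.5, "review_text": 0.3, "reviewer_overlap": 0.2}

fused = weighted_hybrid(ranked_lists, weights)
print(fused)
```

Products recommended by several models accumulate points from each of them, so agreement between models is rewarded even when no single model ranks the product first.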

In [14]:
from model_training import get_weighted_hybrid_recommendations

model_keys = [
        'basic_cosine', 'pca_features', 'content_tfidf', 'content_pca', 
        'review_text', 'sentiment', 'topic_lda', 'reviewer_overlap', 
        'knn_numeric', 'svd_collaborative_optimized'
    ]

# we will use all the 10 models for this test
try:
    weighted_hybrid_recs = get_weighted_hybrid_recommendations(df, product_id, 5, timeout=2.0, model_keys=model_keys)
    if isinstance(weighted_hybrid_recs, pd.DataFrame) and not weighted_hybrid_recs.empty:
        display(weighted_hybrid_recs)
    else:
        print("No weighted-hybrid recommendations found.")
except Exception as e:
    print(f"Error generating weighted hybrid recommendations: {e}")

⚡ First-time SVD build: Generating latent factors...


Unnamed: 0,product_id,product_name,category,discounted_price,actual_price,discount_percentage,rating,rating_count,about_product,user_id,user_name,review_id,review_title,review_content,img_link,product_link,hybrid_score
0,B07JW1Y6XV,Wayona Nylon Braided 3A Lightning to USB A Syn...,Computers&Accessories|Accessories&Peripherals|...,399.0,1099.0,64,4.2,0.0,"[High Compatibility] : iPhone X/XsMax/Xr ,iPho...","AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/41rB0DnVFm...,https://www.amazon.in/Wayona-Braided-WN3LB1-Sy...,8.0
1,B07LGT55SJ,Wayona Usb Nylon Braided Data Sync And Chargin...,Computers&Accessories|Accessories&Peripherals|...,399.0,1099.0,64,4.2,0.0,[High Compatibility] : Compatible For iPhone X...,"AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/41xmv3WPs7...,https://www.amazon.in/Wayona-Braided-Syncing-C...,5.6
2,B07JH1C41D,Wayona Nylon Braided (2 Pack) Lightning Fast U...,Computers&Accessories|Accessories&Peripherals|...,649.0,1999.0,68,4.2,0.0,"[High Compatibility] : iPhone X/XsMax/Xr ,iPho...","AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/412fvb7k2F...,https://www.amazon.in/Wayona-Braided-WN3LG2-Sy...,5.1
3,B07JGDB5M1,Wayona Nylon Braided 2M / 6Ft Fast Charge Usb ...,Computers&Accessories|Accessories&Peripherals|...,449.0,1299.0,65,4.2,0.0,"[High Compatibility] : Phone X/XsMax/Xr ,Phone...","AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/51UsScvHQN...,https://www.amazon.in/Wayona-Braided-WN6LG1-Sy...,4.5
4,B07JH1CBGW,Wayona Nylon Braided Usb Syncing And Charging ...,Computers&Accessories|Accessories&Peripherals|...,649.0,1999.0,68,4.2,0.0,"[High Compatibility] : iPhone X/XsMax/Xr ,iPho...","AG3D6O4STAQKAY2UVGEUV46KN35Q,AHMY5CWJMMK5BJRBB...","Manav,Adarsh gupta,Sundeep,S.Sayeed Ahmed,jasp...","R3HXWT0LRP0NMF,R2AJM3LFTLZHFO,R6AQJGUP6P86,R1K...","Satisfied,Charging is really fast,Value for mo...",Looks durable Charging is fine tooNo complains...,https://m.media-amazon.com/images/I/41eHLj-wfG...,https://www.amazon.in/Wayona-Braided-WN3LB2-Sy...,4.4


START TRAINING - At this point we have 10 different ways to recommend a product to the user.

We will test every recommendation model against each KPI we set.
In the code below, generate_category_ground_truth builds the 'Answer Key': the Ground Truth, meaning the "relevant" items that the recommender should find. Here, a product's relevant items are the other products in exactly the same category.
evaluate_recommender is the core loop that "tests" a model: for each query product it measures Precision@K, MRR, NDCG@K, and latency.
The cell then runs every recommendation model through this loop and collects the metrics.
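
To make the 'answer key' concrete, the sketch below builds the same kind of mapping on a toy catalog and then draws a seeded random sample of queries. The toy product IDs and the helpers `category_ground_truth` and `sample_queries` are hypothetical. The seeded sample is shown as an alternative to taking the first few entries, which can all come from the same category.

```python
import random
from collections import defaultdict


def category_ground_truth(rows):
    """Map each product_id to the other product_ids in the same category."""
    by_category = defaultdict(list)
    for product_id, category in rows:
        by_category[category].append(product_id)

    truth = {}
    for products in by_category.values():
        for pid in products:
            relevant = [p for p in products if p != pid]
            if relevant:  # skip products that are alone in their category
                truth[pid] = relevant
    return truth


def sample_queries(truth, k, seed=42):
    """Draw k query products at random (seeded, so the sample is repeatable)."""
    rng = random.Random(seed)
    keys = rng.sample(sorted(truth), min(k, len(truth)))
    return {pid: truth[pid] for pid in keys}


rows = [("A1", "Cables"), ("A2", "Cables"), ("A3", "Cables"),
        ("B1", "Chargers"), ("B2", "Chargers"), ("C1", "Mice")]
truth = category_ground_truth(rows)
print(truth["A1"])    # ['A2', 'A3']
print("C1" in truth)  # False
queries = sample_queries(truth, k=3)
print(len(queries))   # 3
```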

In [15]:
import time
def generate_category_ground_truth(df):
    """
    Creates a mapping of product_id -> list of other product_ids 
    in the same exact category.
    """
    ground_truth = {}
    # Group by the category string
    category_groups = df.groupby('category')['product_id'].apply(list)
    
    for category, products in category_groups.items():
        for i, target_id in enumerate(products):
            # All other products in this category are 'relevant'
            relevant = [p for j, p in enumerate(products) if i != j]
            if relevant:
                ground_truth[target_id] = relevant
                
    return ground_truth


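def evaluate_recommender(test_data, get_rec_func, K=5):
    """
    Evaluate one recommender on a {product_id: relevant_ids} mapping.

    get_rec_func is called as get_rec_func(product_id, N=K) and must return
    a DataFrame with a 'product_id' column or a list of dicts.
    Returns mean Precision@K, MRR, NDCG@K and latency in seconds.
    """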
    precisions, reciprocal_ranks, ndcgs, latencies = [], [], [], []

    for target_id, relevant_ids in test_data.items():
        # Measure latency
        start = time.time()
        # FIX: Only pass product_id and N
        recs = get_rec_func(target_id, N=K) 
        latency = time.time() - start
        latencies.append(latency)

        if hasattr(recs, 'product_id'):
            recommended_ids = recs['product_id'].tolist()
        elif isinstance(recs, list):
            recommended_ids = [r['product_id'] for r in recs if 'product_id' in r]
        else:
            recommended_ids = []

        # Precision@K
        hits = len(set(recommended_ids) & set(relevant_ids))
        precisions.append(hits / K)

        # MRR (Mean Reciprocal Rank)
        rank = 0
        for i, rid in enumerate(recommended_ids):
            if rid in relevant_ids:
                rank = i + 1
                break
        reciprocal_ranks.append(1 / rank if rank > 0 else 0)

        # NDCG calculation
        y_true = np.array([[1 if rid in relevant_ids else 0 for rid in recommended_ids]])
        y_score = np.array([[K - i for i in range(len(recommended_ids))]])
        # ndcg_score can raise ValueError (for example, on a single-item list)
        try:
            if np.sum(y_true) > 0:
                ndcgs.append(ndcg_score(y_true, y_score, k=K))
            else:
                ndcgs.append(0.0)
        except ValueError:
            ndcgs.append(0.0)

    return {
        f"Precision@{K}": np.mean(precisions),
        "MRR": np.mean(reciprocal_ranks),
        f"NDCG@{K}": np.mean(ndcgs),
        "Latency (s)": np.mean(latencies)
    }


# Ensure KNN features are ready
df, X_knn = prepare_features_for_knn(df)

test_set = generate_category_ground_truth(df)
# Updated test_methods to match the (pid, N) signature exactly
test_methods = {
    "Basic Cosine": lambda pid, N: get_recommendations(df, pid, N=N),
    "PCA Features": lambda pid, N: get_recommendations_with_pca(df, pid, N=N),
    "Content (TF-IDF)": lambda pid, N: get_content_recommendations(df, pid, N=N),
    "Content + PCA": lambda pid, N: get_content_recommendations_pca(df, pid, N=N),
    "Review Text": lambda pid, N: get_review_recommendations(df, pid, N=N),
    "Sentiment": lambda pid, N: get_sentiment_recommendations(df, pid, N=N),
    "Topic Modeling": lambda pid, N: get_topic_recommendations(df, pid, N=N),
    "Reviewer Overlap": lambda pid, N: get_reviewer_overlap_recommendations(df, pid, N=N),
    "SVD (Collab)": lambda pid, N: svd_product_recommendations(df, pid, n=N),
    "KNN (Numeric)": lambda pid, N: get_knn_recommendations(df, X_knn, df[df['product_id']==pid].index[0], n=N)
   # "OPTIMIZED HYBRID": lambda pid, N: get_weighted_hybrid_recommendations(pid, N=N, weights=optimal_weights)
}

# Run the evaluation
results = []
test_set_filtered = {k: v for k, v in test_set.items() if len(v) > 1}
eval_sample = dict(list(test_set_filtered.items())[:10])
print(f"Sampled {len(eval_sample)} queries for evaluation.")

#eval_sample = dict(list(hybrid_truth.items())[:30]) 

for name, func in test_methods.items():
    print(f"Evaluating {name}...", end=" ")
    try:
        metrics = evaluate_recommender(eval_sample, func, K=5)
        metrics['Model'] = name
        results.append(metrics)
        print("Done.")
    except Exception as e:
        print(f"Failed: {e}")

# Display results
kpi_df = pd.DataFrame(results)
kpi_df = kpi_df[['Model', 'NDCG@5', 'MRR', 'Precision@5', 'Latency (s)']]
display(kpi_df.sort_values(by='NDCG@5', ascending=False))

Sampled 10 queries for evaluation.
Evaluating Basic Cosine... Done.
Evaluating PCA Features... Done.
Evaluating Content (TF-IDF)... Done.
Evaluating Content + PCA... Done.
Evaluating Review Text... Done.
Evaluating Sentiment... Done.
Evaluating Topic Modeling... Done.
Evaluating Reviewer Overlap... Done.
Evaluating SVD (Collab)... Done.
Evaluating KNN (Numeric)... Done.


Unnamed: 0,Model,NDCG@5,MRR,Precision@5,Latency (s)
2,Content (TF-IDF),1.0,1.0,1.0,0.001562
3,Content + PCA,1.0,1.0,1.0,0.001507
0,Basic Cosine,0.984468,1.0,0.6,0.006053
1,PCA Features,0.976613,1.0,0.7,0.003752
4,Review Text,0.967973,0.95,0.86,0.001524
6,Topic Modeling,0.851854,0.85,0.7,0.001454
9,KNN (Numeric),0.837005,0.77,0.44,0.001453
5,Sentiment,0.7426,0.7,0.54,0.001525
8,SVD (Collab),0.626692,0.52,0.58,18.621347
7,Reviewer Overlap,0.6,0.8,0.6,0.001879


Interpretation
NDCG@5 (Normalized Discounted Cumulative Gain): Measures the quality of the ranking. It rewards the model for putting the most relevant items at the very top of the list. 1.000000 is a perfect score.

MRR (Mean Reciprocal Rank): Focuses on the first relevant item. A score of 1.00 means the model’s very first recommendation was always correct.

Precision@5: Measures density. It tells you what percentage of the top 5 recommendations were relevant.

Latency (s): The speed of the model in seconds. For a live website, you generally want this under 0.1 seconds.
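
For reference, the three ranking metrics can be computed by hand for a single query. This sketch uses binary relevance and the standard log2 position discount, and the function names and toy lists are illustrative. Note that `ndcg_at_k` here normalizes against an ideal list built from all relevant items (capped at K), which is one common convention. An NDCG that is normalized only over the items a model returned can come out higher.

```python
import math


def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations that are relevant."""
    return sum(1 for pid in recommended[:k] if pid in relevant) / k


def reciprocal_rank(recommended, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is relevant."""
    for rank, pid in enumerate(recommended, start=1):
        if pid in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(recommended, relevant, k):
    """Binary-relevance NDCG@k with a log2 position discount."""
    dcg = sum(1.0 / math.log2(rank + 1)
              for rank, pid in enumerate(recommended[:k], start=1)
              if pid in relevant)
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


recommended = ["A", "X", "B", "Y", "Z"]   # hypothetical top-5 list
relevant = {"A", "B", "C"}                # hypothetical ground truth

print(precision_at_k(recommended, relevant, 5))        # 0.4
print(reciprocal_rank(recommended, relevant))          # 1.0
print(round(ndcg_at_k(recommended, relevant, 5), 3))   # 0.704
```

Here two of five items are relevant (Precision@5 = 0.4), the first item is a hit (reciprocal rank = 1.0), and the hits at ranks 1 and 3 give a DCG of 1.5 against an ideal DCG of about 2.131.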

Our KPI targets are NDCG@5 > 0.80, MRR > 0.70, and latency < 200 ms.

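The results show that:
Content + PCA has the highest accuracy (tied with Content (TF-IDF) at 1.0 on NDCG@5, MRR, and Precision@5).
Review Text provides the best semantic, user-generated signal (NDCG@5 of 0.968).
PCA Features ranks well (NDCG@5 of 0.977, MRR of 1.0).
SVD (Collab) is currently the bottleneck for system performance: it averages about 18.6 seconds per query.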




Our new goal is to combine these models to achieve the best possible NDCG score. This will be a hybrid recommendation engine, and we will use a randomized search over the model weights to find the 'sweet spot'.
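
A weight search of this kind relies on sampling weight vectors that are non-negative and sum to 1. A flat Dirichlet distribution does this. The sketch below shows the sampling step on its own, using a seeded NumPy generator so the draw is repeatable. The seed and the shortened key list are illustrative choices, not part of the original run.

```python
import numpy as np

model_keys = ["pca_features", "content_pca", "review_text"]  # shortened for illustration

rng = np.random.default_rng(seed=0)            # seeded for repeatability
raw = rng.dirichlet(np.ones(len(model_keys)))  # one draw from a flat Dirichlet
weights = dict(zip(model_keys, raw))

for key, w in weights.items():
    print(f"{key:<15}: {w:.4f}")
print(f"sum = {sum(weights.values()):.4f}")
```

Setting every concentration parameter to 1 makes all valid weight vectors equally likely, so no model is favored before the evaluation.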

In [None]:
import numpy as np
import pandas as pd
import random
from model_training import get_weighted_hybrid_recommendations

def randomized_weight_search(df, test_data, n_iter=50, K=5):
    """
    Randomly samples weight combinations to find the best configuration.
    """
    model_keys = [
        'basic_cosine', 'pca_features', 'content_tfidf', 'content_pca', 
        'review_text', 'sentiment', 'topic_lda', 'reviewer_overlap', 
        'knn_numeric', 'svd_collaborative'  # unoptimized SVD; matches the key in the printed weights
    ]
    
    results = []
    best_score = -1
    best_weights = None

    print(f"Starting randomized search for {n_iter} iterations...")

    for i in range(n_iter):
        # 1. Generate random weights that sum to 1.0
        raw_weights = np.random.dirichlet(np.ones(len(model_keys)), size=1)[0]
        current_weights = dict(zip(model_keys, raw_weights))

        # 2. Updated wrapper signature to match evaluate_recommender (pid, N)
        def hybrid_func(pid, N):
            # Pass the global df to the hybrid function as required by model_training.py
            return get_weighted_hybrid_recommendations(df, pid, N=N, weights=current_weights, model_keys=model_keys)

        # 3. Evaluate using the category-based truth
        scores = evaluate_recommender(test_data, hybrid_func, K=K)
        score = scores[f'NDCG@{K}']
        
        results.append({'weights': current_weights, 'ndcg': score})

        if score > best_score:
            best_score = score
            best_weights = current_weights
            print(f"Iteration {i}: New Best NDCG@{K} = {best_score:.4f}")

    return best_weights, best_score

# --- Run the Optimization ---
category_truth = generate_category_ground_truth(df)
sample_truth = dict(list(category_truth.items())[:20]) 

optimal_weights, top_score = randomized_weight_search(df, sample_truth, n_iter=30)

print("\n--- OPTIMAL HYBRID WEIGHTS ---")
for k, v in optimal_weights.items():
    print(f"{k:20}: {v:.4f}")

Starting randomized search for 30 iterations...
Iteration 0: New Best NDCG@5 = 0.6668
Iteration 2: New Best NDCG@5 = 0.8351
Iteration 5: New Best NDCG@5 = 0.8720
Iteration 6: New Best NDCG@5 = 0.9446
Iteration 7: New Best NDCG@5 = 0.9446
Iteration 12: New Best NDCG@5 = 0.9631
Iteration 19: New Best NDCG@5 = 0.9815
Iteration 27: New Best NDCG@5 = 0.9984

--- OPTIMAL HYBRID WEIGHTS ---
basic_cosine        : 0.0624
pca_features        : 0.0967
content_tfidf       : 0.1437
content_pca         : 0.1563
review_text         : 0.0156
sentiment           : 0.0320
topic_lda           : 0.1680
reviewer_overlap    : 0.1714
knn_numeric         : 0.1167
svd_collaborative   : 0.0372


We will use the weights above in our main model, get_weighted_hybrid_recommendations.

In [17]:
import time
import pandas as pd
import numpy as np
from sklearn.metrics import ndcg_score

# 1. Setup Ground Truth (Filtered for rich data)
test_set = generate_category_ground_truth(df)
test_set_filtered = {k: v for k, v in test_set.items() if len(v) > 1}
# Expanding sample size to 30 for more stable KPIs
eval_sample = dict(list(test_set_filtered.items())[:30]) 

print(f"Evaluating Optimized Hybrid on {len(eval_sample)} queries...")

# 2. Define the Optimized Hybrid Wrapper
# Uses your best weights from the randomized search
best_hybrid_weights = optimal_weights  # From previous optimization step

# The lambda ensures the evaluator only needs to pass (pid, N)
hybrid_func = lambda pid, N: get_weighted_hybrid_recommendations(df, pid, N=N, weights=best_hybrid_weights)

# 3. Run Single Evaluation
try:
    start_wall = time.time()
    metrics = evaluate_recommender(eval_sample, hybrid_func, K=5)
    total_time = time.time() - start_wall
    
    metrics['Model'] = "OPTIMIZED HYBRID"
    
    # 4. Display results in a clean table
    kpi_df = pd.DataFrame([metrics])
    cols = ['Model', 'NDCG@5', 'MRR', 'Precision@5', 'Latency (s)']
    display(kpi_df[cols])
    
    print(f"\nTotal Wall Time for 30 samples: {total_time:.2f} seconds")

except Exception as e:
    print(f"Evaluation failed: {e}")

Evaluating Optimized Hybrid on 30 queries...


Unnamed: 0,Model,NDCG@5,MRR,Precision@5,Latency (s)
0,OPTIMIZED HYBRID,0.986858,1.0,0.9,0.025685



Total Wall Time for 30 samples: 0.78 seconds


Overall, the hybrid met our target KPIs in this evaluation. In practice, however, even if the 10 models run in parallel, the SVD recommendation still adds about 18 seconds. (See the section of the paper on deploying the web application: the recommendations took about 13 seconds to load, which is too slow, because a real user will lose interest before the recommended products appear in the browser.)

So we need to optimize it. We will apply an optimized SVD.
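
One plausible way to get this kind of speed-up is to factorize the item-user matrix once, cache the item factors, and answer each query with a cosine lookup. The sketch below illustrates that pattern with NumPy on a toy matrix. The class name `CachedSVDRecommender`, the toy data, and the use of plain `np.linalg.svd` are assumptions, and `svd_product_recommendations_optimized` may be implemented differently.

```python
import numpy as np


class CachedSVDRecommender:
    """Factorize once, then serve item-to-item lookups from cached factors."""

    def __init__(self, item_user_matrix, n_factors=2):
        self.matrix = np.asarray(item_user_matrix, dtype=float)
        self.n_factors = n_factors
        self._item_factors = None   # built lazily on the first query
        self.build_count = 0

    def _build(self):
        # The expensive step: run it only when no cached factors exist.
        u, s, _ = np.linalg.svd(self.matrix, full_matrices=False)
        factors = u[:, :self.n_factors] * s[:self.n_factors]
        norms = np.linalg.norm(factors, axis=1, keepdims=True)
        norms[norms == 0] = 1.0     # avoid division by zero for empty rows
        self._item_factors = factors / norms
        self.build_count += 1

    def recommend(self, item_index, n=2):
        if self._item_factors is None:
            self._build()
        # Rows are unit-length, so the dot product is the cosine similarity.
        sims = self._item_factors @ self._item_factors[item_index]
        sims[item_index] = -np.inf  # never recommend the query item itself
        return np.argsort(-sims)[:n].tolist()


# Toy item-by-user matrix: rows are products, columns are reviewers.
ratings = [
    [5, 4, 0, 0],
    [4, 5, 0, 0],
    [0, 0, 5, 4],
    [0, 0, 4, 5],
]
rec = CachedSVDRecommender(ratings, n_factors=2)
print(rec.recommend(0, n=1))  # nearest neighbor of product 0
print(rec.recommend(2, n=1))  # nearest neighbor of product 2
print(rec.build_count)        # the factorization ran only once
```

After the first call, every query costs one matrix-vector product and a sort instead of a full factorization.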


In [18]:
import time

# Ensure KNN features are ready
df, X_knn = prepare_features_for_knn(df)

test_set = generate_category_ground_truth(df)
# Updated test_methods to match the (pid, N) signature exactly
test_methods = {
    "Basic Cosine": lambda pid, N: get_recommendations(df, pid, N=N),
    "PCA Features": lambda pid, N: get_recommendations_with_pca(df, pid, N=N),
    "Content (TF-IDF)": lambda pid, N: get_content_recommendations(df, pid, N=N),
    "Content + PCA": lambda pid, N: get_content_recommendations_pca(df, pid, N=N),
    "Review Text": lambda pid, N: get_review_recommendations(df, pid, N=N),
    "Sentiment": lambda pid, N: get_sentiment_recommendations(df, pid, N=N),
    "Topic Modeling": lambda pid, N: get_topic_recommendations(df, pid, N=N),
    "Reviewer Overlap": lambda pid, N: get_reviewer_overlap_recommendations(df, pid, N=N),
    "SVD (Collab)": lambda pid, N: svd_product_recommendations(df, pid, n=N),
    "SVD Optimized": lambda pid, N: svd_product_recommendations_optimized(df, pid, n=N),
    "KNN (Numeric)": lambda pid, N: get_knn_recommendations(df, X_knn, df[df['product_id']==pid].index[0], n=N)
   # "OPTIMIZED HYBRID": lambda pid, N: get_weighted_hybrid_recommendations(pid, N=N, weights=optimal_weights)
}

# Run the evaluation
results = []
test_set_filtered = {k: v for k, v in test_set.items() if len(v) > 1}
eval_sample = dict(list(test_set_filtered.items())[:10])
print(f"Sampled {len(eval_sample)} queries for evaluation.")

#eval_sample = dict(list(hybrid_truth.items())[:30]) 

for name, func in test_methods.items():
    print(f"Evaluating {name}...", end=" ")
    try:
        metrics = evaluate_recommender(eval_sample, func, K=5)
        metrics['Model'] = name
        results.append(metrics)
        print("Done.")
    except Exception as e:
        print(f"Failed: {e}")

# Display results
kpi_df = pd.DataFrame(results)
kpi_df = kpi_df[['Model', 'NDCG@5', 'MRR', 'Precision@5', 'Latency (s)']]
display(kpi_df.sort_values(by='NDCG@5', ascending=False))

Sampled 10 queries for evaluation.
Evaluating Basic Cosine... Done.
Evaluating PCA Features... Done.
Evaluating Content (TF-IDF)... Done.
Evaluating Content + PCA... Done.
Evaluating Review Text... Done.
Evaluating Sentiment... Done.
Evaluating Topic Modeling... Done.
Evaluating Reviewer Overlap... Done.
Evaluating SVD (Collab)... Done.
Evaluating SVD Optimized... Done.
Evaluating KNN (Numeric)... Done.


Unnamed: 0,Model,NDCG@5,MRR,Precision@5,Latency (s)
2,Content (TF-IDF),1.0,1.0,1.0,0.001515
3,Content + PCA,1.0,1.0,1.0,0.001478
0,Basic Cosine,0.984468,1.0,0.6,0.00203
1,PCA Features,0.976613,1.0,0.7,0.001564
4,Review Text,0.967973,0.95,0.86,0.001531
6,Topic Modeling,0.851854,0.85,0.7,0.001483
10,KNN (Numeric),0.837005,0.77,0.44,0.001956
9,SVD Optimized,0.764837,0.725,0.64,0.012426
5,Sentiment,0.7426,0.7,0.54,0.001713
8,SVD (Collab),0.6386,0.583333,0.52,17.847005


Compared with SVD (Collab), SVD Optimized improved latency dramatically, from about 18 seconds to about 0.012 seconds per query, and it also achieved better NDCG@5 (0.765 vs. 0.639) and MRR (0.725 vs. 0.583).

We will now check how get_weighted_hybrid_recommendations scores when it uses SVD Optimized.

In [19]:
#get weights this time using svd optimized
import numpy as np
import pandas as pd
import random
from model_training import get_weighted_hybrid_recommendations

def randomized_weight_search(df, test_data, n_iter=50, K=5):
    """
    Randomly samples weight combinations to find the best configuration.
    """
    model_keys = [
        'basic_cosine', 'pca_features', 'content_tfidf', 'content_pca', 
        'review_text', 'sentiment', 'topic_lda', 'reviewer_overlap', 
        'knn_numeric', 'svd_collaborative_optimized'
    ]
    
    results = []
    best_score = -1
    best_weights = None

    print(f"Starting randomized search for {n_iter} iterations...")

    for i in range(n_iter):
        # 1. Generate random weights that sum to 1.0
        raw_weights = np.random.dirichlet(np.ones(len(model_keys)), size=1)[0]
        current_weights = dict(zip(model_keys, raw_weights))

        # 2. Updated wrapper signature to match evaluate_recommender (pid, N)
        def hybrid_func(pid, N):
            # Pass the global df to the hybrid function as required by model_training.py
            return get_weighted_hybrid_recommendations(df, pid, N=N, weights=current_weights, model_keys=model_keys)

        # 3. Evaluate using the category-based truth
        scores = evaluate_recommender(test_data, hybrid_func, K=K)
        score = scores[f'NDCG@{K}']
        
        results.append({'weights': current_weights, 'ndcg': score})

        if score > best_score:
            best_score = score
            best_weights = current_weights
            print(f"Iteration {i}: New Best NDCG@{K} = {best_score:.4f}")

    return best_weights, best_score

# --- Run the Optimization ---
category_truth = generate_category_ground_truth(df)
sample_truth = dict(list(category_truth.items())[:20]) 

optimal_weights_svd, top_score = randomized_weight_search(df, sample_truth, n_iter=30)

print("\n--- OPTIMAL HYBRID WEIGHTS ---")
for k, v in optimal_weights_svd.items():
    print(f"{k:20}: {v:.4f}")

Starting randomized search for 30 iterations...
Iteration 0: New Best NDCG@5 = 0.6662
Iteration 2: New Best NDCG@5 = 0.8391
Iteration 3: New Best NDCG@5 = 0.9530
Iteration 12: New Best NDCG@5 = 0.9565
Iteration 14: New Best NDCG@5 = 0.9631
Iteration 21: New Best NDCG@5 = 0.9750
Iteration 24: New Best NDCG@5 = 0.9815

--- OPTIMAL HYBRID WEIGHTS ---
basic_cosine        : 0.1083
pca_features        : 0.1068
content_tfidf       : 0.1962
content_pca         : 0.3365
review_text         : 0.0156
sentiment           : 0.0682
topic_lda           : 0.0693
reviewer_overlap    : 0.0159
knn_numeric         : 0.0709
svd_collaborative_optimized: 0.0123


In [20]:
#weighted  hybrid evaluation
import time
import pandas as pd
import numpy as np
from sklearn.metrics import ndcg_score

# 1. Setup Ground Truth (Filtered for rich data)
test_set = generate_category_ground_truth(df)
test_set_filtered = {k: v for k, v in test_set.items() if len(v) > 1}
# Expanding sample size to 30 for more stable KPIs
eval_sample = dict(list(test_set_filtered.items())[:30]) 

print(f"Evaluating Optimized Hybrid on {len(eval_sample)} queries...")

# 2. Define the Optimized Hybrid Wrapper
# Uses your best weights from the randomized search
best_hybrid_weights = optimal_weights_svd  # From previous optimization step

# The lambda ensures the evaluator only needs to pass (pid, N)
hybrid_func = lambda pid, N: get_weighted_hybrid_recommendations(df, pid, N=N, weights=best_hybrid_weights)

# 3. Run Single Evaluation
try:
    start_wall = time.time()
    metrics = evaluate_recommender(eval_sample, hybrid_func, K=5)
    total_time = time.time() - start_wall
    
    metrics['Model'] = "OPTIMIZED HYBRID"
    
    # 4. Display results in a clean table
    kpi_df = pd.DataFrame([metrics])
    cols = ['Model', 'NDCG@5', 'MRR', 'Precision@5', 'Latency (s)']
    display(kpi_df[cols])
    
    print(f"\nTotal Wall Time for 30 samples: {total_time:.2f} seconds")

except Exception as e:
    print(f"Evaluation failed: {e}")

Evaluating Optimized Hybrid on 30 queries...


Unnamed: 0,Model,NDCG@5,MRR,Precision@5,Latency (s)
0,OPTIMIZED HYBRID,0.99943,1.0,0.98,0.024907



Total Wall Time for 30 samples: 0.76 seconds


With SVD Optimized, the hybrid's ranking quality is essentially unchanged: NDCG@5 is 0.999 (previously 0.987) and Precision@5 is 0.98 (previously 0.90), and the per-query latency measured in the notebook is still around 0.025 seconds.

However, when I tested this in the front-end, the product suggestions took around 6 seconds to load, which is still not within our latency target.

So at this point, we need to reduce the number of models from 10 to the top 4.

To optimize our recommendation engine, we transitioned from a complex 10-model ensemble to a streamlined "Core 4" architecture. This decision was driven by the need to eliminate redundancy, reduce latency, and prioritize the models that delivered the highest NDCG and Catalog Coverage results. (Catalog coverage is discussed in the next notebook.)

basic_cosine and pca_features are the same in principle; pca_features adds PCA and had lower latency in our runs, so we keep pca_features.

content_tfidf and content_pca are likewise the same concept; content_pca adds PCA and had slightly lower latency, so we keep content_pca.

We retained the following four "experts" because they represent the four distinct mathematical pillars of a modern recommendation system:

Content + PCA: Rationale: This achieved a perfect 1.0 NDCG. It uses TF-IDF and Principal Component Analysis to find products that are mathematically closest in description and category.
Purpose: Essential for "cold-start" items that have no sales history but clear technical specifications.

Review Text (The Semantic Pillar): Rationale: With an NDCG of 0.968 and a Precision@5 of 0.86 (the highest outside the two content models), this model captures how users actually describe and use the products.
Purpose: It finds deeper connections that standard metadata might miss, such as "great for travel" or "very durable".

PCA Features (The Attribute Pillar): Rationale: This model maintained a perfect 1.0 MRR, meaning its top-ranked choice was consistently accurate. It focuses on hard data points like price, dimensions, and rating counts.
Purpose: Ensures that recommendations stay within the same "tier" (e.g., suggesting a premium battery for a premium device).

SVD Optimized (The Behavioral Pillar): Rationale: While it had lower accuracy than content models, it is the only model capable of Collaborative Filtering.
Purpose: It identifies "hidden" links based on user behavior (e.g., users who buy batteries also buy SD cards), which is critical for Catalog Coverage and preventing a "filter bubble".

So we will apply only these 4 recommendation models in get_weighted_hybrid_recommendations, which becomes the final model for our recommendation engine.


In [21]:
#from 10 models to 4
import numpy as np
import pandas as pd
import random
from model_training import get_weighted_hybrid_recommendations

def randomized_weight_search(df, test_data, n_iter=50, K=5):
    """
    Randomly samples weight combinations to find the best configuration.
    """
    model_keys = [
        'pca_features',  'content_pca', 
        'review_text', 'svd_collaborative_optimized'
    ]
        
    
    results = []
    best_score = -1
    best_weights = None

    print(f"Starting randomized search for {n_iter} iterations...")

    for i in range(n_iter):
        # 1. Generate random weights that sum to 1.0
        raw_weights = np.random.dirichlet(np.ones(len(model_keys)), size=1)[0]
        current_weights = dict(zip(model_keys, raw_weights))

        # 2. Updated wrapper signature to match evaluate_recommender (pid, N)
        def hybrid_func(pid, N):
            # Pass the global df to the hybrid function as required by model_training.py
            return get_weighted_hybrid_recommendations(df, pid, N=N, weights=current_weights, model_keys=model_keys)

        # 3. Evaluate using the category-based truth
        scores = evaluate_recommender(test_data, hybrid_func, K=K)
        score = scores[f'NDCG@{K}']
        
        results.append({'weights': current_weights, 'ndcg': score})

        if score > best_score:
            best_score = score
            best_weights = current_weights
            print(f"Iteration {i}: New Best NDCG@{K} = {best_score:.4f}")

    return best_weights, best_score

# --- Run the Optimization ---
category_truth = generate_category_ground_truth(df)
sample_truth = dict(list(category_truth.items())[:20]) 

optimal_weights_top_4, top_score = randomized_weight_search(df, sample_truth, n_iter=30)

print("\n--- OPTIMAL HYBRID WEIGHTS ---")
for k, v in optimal_weights_top_4.items():
    print(f"{k:20}: {v:.4f}")

Starting randomized search for 30 iterations...
Iteration 0: New Best NDCG@5 = 0.4688
Iteration 1: New Best NDCG@5 = 0.6312
Iteration 2: New Best NDCG@5 = 0.6746
Iteration 4: New Best NDCG@5 = 0.9077
Iteration 5: New Best NDCG@5 = 0.9262
Iteration 13: New Best NDCG@5 = 0.9446

--- OPTIMAL HYBRID WEIGHTS ---
pca_features        : 0.3446
content_pca         : 0.4789
review_text         : 0.0585
svd_collaborative_optimized: 0.1180


After running the randomized search, these are the hybrid weights with the best NDCG.

Content PCA (47.89%): This is the primary driver of our recommendation engine. With nearly half the voting power, the system first looks for products that are mathematically similar in their descriptions and categories. This ensures that if a user looks at a battery, the recommendations are almost guaranteed to be other batteries or related electronics.

SVD Collaborative Optimized (11.80%): This is our diversity engine. Its moderate weight helps ensure the system doesn't just recommend "more of the same." It uses past user behavior to find items that are similar in "vibe", even if they don't share the same text descriptions.

Review Text (5.85%): This provides semantic depth, with the smallest weight of the four. It helps the recommendations align with how users actually describe the products in reviews (e.g., "long-lasting" or "heavy-duty"), adding a layer of human sentiment to the mathematical content matching.

PCA Features (34.46%): This acts as a fine-tuner and carries the second-largest weight. It helps break ties between very similar items by looking at structured specs like price points or physical dimensions to ensure the recommendation "fit" is precise.

In [22]:
#weighted  hybrid evaluation of 4 models
import time
import pandas as pd
import numpy as np
from sklearn.metrics import ndcg_score

# 1. Setup Ground Truth (Filtered for rich data)
test_set = generate_category_ground_truth(df)
test_set_filtered = {k: v for k, v in test_set.items() if len(v) > 1}
# Expanding sample size to 30 for more stable KPIs
eval_sample = dict(list(test_set_filtered.items())[:30]) 

print(f"Evaluating Optimized Hybrid on {len(eval_sample)} queries...")
model_keys = [
            'pca_features',  
            'content_pca', 
            'review_text',  
            'svd_collaborative_optimized'
        ]

# 2. Define the Optimized Hybrid Wrapper
# Uses your best weights from the randomized search
best_hybrid_weights = optimal_weights_top_4  # From previous optimization step

# The lambda ensures the evaluator only needs to pass (pid, N)
hybrid_func = lambda pid, N: get_weighted_hybrid_recommendations(df, pid, N=N, weights=best_hybrid_weights, model_keys=model_keys)

# 3. Run Single Evaluation
try:
    start_wall = time.time()
    metrics = evaluate_recommender(eval_sample, hybrid_func, K=5)
    total_time = time.time() - start_wall
    
    metrics['Model'] = "OPTIMIZED HYBRID"
    
    # 4. Display results in a clean table
    kpi_df = pd.DataFrame([metrics])
    cols = ['Model', 'NDCG@5', 'MRR', 'Precision@5', 'Latency (s)']
    display(kpi_df[cols])
    
    print(f"\nTotal Wall Time for 30 samples: {total_time:.2f} seconds")

except Exception as e:
    print(f"Evaluation failed: {e}")

Evaluating Optimized Hybrid on 30 queries...


Unnamed: 0,Model,NDCG@5,MRR,Precision@5,Latency (s)
0,OPTIMIZED HYBRID,0.997719,1.0,0.953333,0.024661



Total Wall Time for 30 samples: 0.75 seconds


The four-model weighted hybrid has met our NDCG and MRR targets. In the front-end web application, product recommendations load in about 349 ms. This does not meet the latency KPI (target < 200 ms) but is close, and it is far faster than the 19 seconds we saw with the 10-model ensemble (SVD not optimized).
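
Because the latency KPI is a threshold (< 200 ms), it can help to check a high percentile as well as the mean, since a few slow queries can hide behind a good average. The sketch below times a stand-in function with `time.perf_counter`. The names `fake_recommender` and `measure_latency_ms` and the choice of the 95th percentile are illustrative assumptions, not part of the original evaluation.

```python
import statistics
import time


def measure_latency_ms(func, queries):
    """Time func(query) for each query and return latencies in milliseconds."""
    latencies = []
    for q in queries:
        start = time.perf_counter()
        func(q)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def fake_recommender(product_id):
    """Stand-in for a real recommender call (hypothetical)."""
    return sorted(range(1000), key=lambda x: -x)[:5]


latencies = measure_latency_ms(fake_recommender, [f"P{i}" for i in range(50)])
mean_ms = statistics.mean(latencies)
p95_ms = statistics.quantiles(latencies, n=20)[-1]  # 95th percentile

TARGET_MS = 200.0
print(f"mean = {mean_ms:.3f} ms, p95 = {p95_ms:.3f} ms")
print("meets target" if p95_ms < TARGET_MS else "misses target")
```

Timing inside the notebook excludes network and rendering time, so a front-end measurement such as the 349 ms above will usually be higher than the per-query figure.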
