**Getting product recommedations for customer**

Here I use collaborative filtering using the surprise library for getting recommendations

In [11]:
#Step 1: Importing necessary libraries
import pandas as pd
from surprise import Dataset, Reader, SVD 
from surprise.model_selection import cross_validate 

In [12]:
#Step 2: Load the dataset into jupyter notebook
data = pd.read_csv('OnlineRetail.csv')

In [15]:
#Step 3: Viewing 5 rows of data
data.head(5)

Unnamed: 0,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
0,536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,12/1/2010 8:26,2.55,17850.0,United Kingdom
1,536365,71053,WHITE METAL LANTERN,6,12/1/2010 8:26,3.39,17850.0,United Kingdom
2,536365,84406B,CREAM CUPID HEARTS COAT HANGER,8,12/1/2010 8:26,2.75,17850.0,United Kingdom
3,536365,84029G,KNITTED UNION FLAG HOT WATER BOTTLE,6,12/1/2010 8:26,3.39,17850.0,United Kingdom
4,536365,84029E,RED WOOLLY HOTTIE WHITE HEART.,6,12/1/2010 8:26,3.39,17850.0,United Kingdom


In [19]:
#Step 4: Creating a Reader object and specifying the rating scale
reader = Reader(rating_scale=(0, data['Quantity'].max()))

In [23]:
#Step 5: Creating the dataset from the pandas dataframe
data_for_surprise = Dataset.load_from_df(data[['CustomerID', 'StockCode', 'Quantity']], reader)

In [25]:
#Step 6: Using the Singular value decomposition (SVD) algorithm for collaborative filtering
algo = SVD()

In [27]:
#Step 7: Evaluating the algorithm with cross-validation
cross_validate(algo, data_for_surprise, measures=['RMSE', 'MAE'], cv=5, verbose=True)

Evaluating RMSE, MAE of algorithm SVD on 5 split(s).

                  Fold 1  Fold 2  Fold 3  Fold 4  Fold 5  Mean    Std     
RMSE (testset)    80747.188480757.405180757.478980756.594180742.277480752.18886.2903  
MAE (testset)     80509.576580528.749380530.345080528.356680498.772480519.160012.7211 
Fit time          5.18    5.39    5.11    4.99    5.03    5.14    0.14    
Test time         1.10    0.71    0.92    1.01    0.70    0.89    0.16    


{'test_rmse': array([80747.18841081, 80757.4050995 , 80757.47889524, 80756.59411981,
        80742.27740388]),
 'test_mae': array([80509.57652417, 80528.74934292, 80530.34502347, 80528.3566206 ,
        80498.77239529]),
 'fit_time': (5.1846535205841064,
  5.390363454818726,
  5.112430572509766,
  4.992536306381226,
  5.032274484634399),
 'test_time': (1.101330041885376,
  0.707421064376831,
  0.9241266250610352,
  1.0134682655334473,
  0.7025132179260254)}

In [28]:
#Step 8: Training the model on the entire dataset
trainset = data_for_surprise.build_full_trainset()
algo.fit(trainset)

<surprise.prediction_algorithms.matrix_factorization.SVD at 0x1280a8eb830>

In [31]:
#Step 9: Function to get top n recommendations for a given customer

def top_recommendations(customer_id, n=15):

    customer_id = float(customer_id)
    
    #list of all products
    all_products = data['Description'].unique()
    
    #list of products the customer has already bought
    purchased_products = data[data['CustomerID'] == customer_id]['Description'].unique()
    
    #list of products the customer has not bought yet 
    products_to_predict = [product_description for product_description in all_products if product_description not in purchased_products] 
    
    # Predict the ratings for all products the customer has not bought yet
    predictions = [algo.predict(customer_id, product_description) for product_description in products_to_predict]
    
    # Sort the predictions by estimated rating
    predictions.sort(key=lambda x: x.est)
        
    # top N recommendations
    top_recommendations = [pred.iid for pred in predictions[:n]]
    
    return top_recommendations

In [33]:
#Step 10: Getting the recommendated product list

customer_id = float(input('Enter the Customer ID'))
if customer_id not in data['CustomerID'].unique():
            print(f"Customer ID {customer_id} not found in the data.")
else:
    top_product_recommendations = top_recommendations(customer_id, n=5)
    print(f'\nTop 10 recommendated products for {customer_id}:\n')

    for product in top_product_recommendations:
        print(product)

Enter the Customer ID 123456


Customer ID 123456.0 not found in the data.


In [35]:
customer_id = float(input('Enter the Customer ID'))
if customer_id not in data['CustomerID'].unique():
            print(f"Customer ID {customer_id} not found in the data.")
else:
    top_product_recommendations = top_recommendations(customer_id, n=5)
    print(f'\nTop 10 recommendated products for {customer_id}:\n')

    for product in top_product_recommendations:
        print(product)

Enter the Customer ID 18283



Top 10 recommendated products for 18283.0:

WHITE METAL LANTERN
CREAM CUPID HEARTS COAT HANGER
KNITTED UNION FLAG HOT WATER BOTTLE
RED WOOLLY HOTTIE WHITE HEART.
SET 7 BABUSHKA NESTING BOXES
