## Recommender System - Amazon Fine Food Reviews
____


**by Michael Xiao**

### Overview:

1. Importing libraries
2. Preparing data
3. Surprise! Package
4. Matrix Factorisation - SVD
5. Recommender System

## 1.0 Importing libraries

In [3]:
import pandas as pd
import numpy as np

import pickle

from surprise import Reader
from surprise import Dataset

from surprise import NormalPredictor
from surprise import KNNBasic
from surprise import KNNWithMeans
from surprise import KNNWithZScore
from surprise import KNNBaseline
from surprise import SVD
from surprise import BaselineOnly
from surprise import SVDpp
from surprise import NMF
from surprise import SlopeOne
from surprise import CoClustering
from surprise import accuracy
from surprise.model_selection import cross_validate, train_test_split, GridSearchCV
from surprise.accuracy import rmse
from surprise.similarities import cosine

from scipy.sparse.linalg import svds

from sklearn.metrics.pairwise import cosine_similarity

pd.set_option('display.width', 5000)
pd.set_option('display.max_rows', 500)     #ease of viewing
pd.set_option('display.max_columns', 120)
pd.set_option('display.max_colwidth', 500)

## 2.0 Preparing data

In [2]:
#Please refer to 'Preparing for RecSys.ipynb' to find out how svd_df was derived
svd_df = pd.read_csv('../data/svd_df.csv')
svd_df.drop(labels=['Unnamed: 0'],axis=1,inplace=True)

In [3]:
svd_df.shape

(5612, 3)

In [4]:
svd_df.sort_values(by='UserId').head()

| | UserId | ProductId | hybrid_score |
|---|---|---|---|
| 3131 | A1007PT85CIPMD | B0009ETA76 | 0.010526 |
| 5051 | A1007PT85CIPMD | B0009F3POY | 0.62322 |
| 3369 | A100UZGZNZ9ZYN | B002ANA9QA | 0.35241 |
| 2840 | A100WO06OQR8BQ | B002LANN56 | 0.43845 |
| 2614 | A100WO06OQR8BQ | B005CUU25G | 0.04076 |


In [5]:
svd_df.hybrid_score.describe()

count    5612.000000
mean        0.157657
std         0.201277
min        -0.948285
25%         0.038012
50%         0.148079
75%         0.276674
max         0.985700
Name: hybrid_score, dtype: float64

In [6]:
reader = Reader(rating_scale=(-1, 1)) #values may range from -1 to 1. 'higher value = more positive' and vice versa.
data = Dataset.load_from_df(svd_df, reader)
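`Dataset.load_from_df` reads the frame positionally: user ids, item ids, then ratings. The sketch below is a small sanity check that could be run before loading; the helper `check_ratings_frame` and the toy frame are hypothetical and not part of the original notebook.

```python
import pandas as pd

def check_ratings_frame(df, low=-1.0, high=1.0):
    """Return True if df has exactly three columns and the third one
    (the rating column) lies inside the closed interval [low, high]."""
    if df.shape[1] != 3:
        return False
    ratings = df.iloc[:, 2]
    return bool(ratings.between(low, high).all())

# Toy data in the same column order as svd_df (values are made up)
toy = pd.DataFrame({
    "UserId": ["u1", "u1", "u2"],
    "ProductId": ["p1", "p2", "p1"],
    "hybrid_score": [0.62, -0.95, 0.15],
})

ok = check_ratings_frame(toy)
print(ok)
```

A frame that fails this check either has extra columns (such as a stray `Unnamed: 0`) or scores outside the `rating_scale` passed to `Reader`.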

## 2.1 Surprise! Package

#### 2.1.1 Benchmark

We use Surprise! to compare candidate algorithms by cross-validated RMSE (lower is better) and then to find which hyperparameters work best for our data.
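For reference, RMSE is the square root of the mean squared difference between actual and estimated scores. The sketch below computes it by hand on made-up numbers; `rmse_by_hand` is a hypothetical helper, not the Surprise implementation.

```python
import math

def rmse_by_hand(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Made-up scores on the same [-1, 1] scale as hybrid_score
toy_actual = [0.5, -0.2, 0.1]
toy_predicted = [0.4, 0.0, 0.1]
toy_rmse = rmse_by_hand(toy_actual, toy_predicted)
print(round(toy_rmse, 4))
```

Because the errors are squared before averaging, a few large misses weigh more heavily than many small ones.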

In [7]:
benchmark = []
# Iterate over all algorithms
for algorithm in [SVD(), SVDpp(), SlopeOne(), NMF(), NormalPredictor(), KNNBaseline(), 
                  KNNBasic(), KNNWithMeans(), KNNWithZScore(), BaselineOnly(), CoClustering()]:
    
    # Perform cross validation
    results = cross_validate(algorithm, data, measures=['RMSE'], cv=3, verbose=True)
    
    # Get results & append algorithm name
    tmp = pd.DataFrame.from_dict(results).mean(axis=0)
    # Series.append was removed in pandas 2.0, so concatenate instead
    tmp = pd.concat([tmp, pd.Series([type(algorithm).__name__], index=['Algorithm'])])
    benchmark.append(tmp)

Evaluating RMSE of algorithm SVD on 3 split(s).

                  Fold 1  Fold 2  Fold 3  Mean    Std     
RMSE (testset)    0.2146  0.2131  0.2167  0.2148  0.0015  
Fit time          0.28    0.25    0.26    0.26    0.01    
Test time         0.02    0.01    0.01    0.01    0.00    
Evaluating RMSE of algorithm SVDpp on 3 split(s).

                  Fold 1  Fold 2  Fold 3  Mean    Std     
RMSE (testset)    0.2049  0.2141  0.2062  0.2084  0.0040  
Fit time          0.61    0.59    0.58    0.60    0.01    
Test time         0.03    0.05    0.03    0.03    0.01    
Evaluating RMSE of algorithm SlopeOne on 3 split(s).

                  Fold 1  Fold 2  Fold 3  Mean    Std     
RMSE (testset)    0.2517  0.2561  0.2607  0.2562  0.0037  
Fit time          0.14    0.17    0.17    0.16    0.02    
Test time         0.02    0.02    0.02    0.02    0.00    
Evaluating RMSE of algorithm NMF on 3 split(s).

                  Fold 1  Fold 2  Fold 3  Mean    Std     
RMSE (testset)    0.3474  0.32

In [8]:
surprise_results = pd.DataFrame(benchmark).set_index('Algorithm').sort_values('test_rmse')
surprise_results

| Algorithm | test_rmse | fit_time | test_time |
|---|---|---|---|
| BaselineOnly | 0.200888 | 0.032725 | 0.0185 |
| KNNBaseline | 0.204233 | 0.231611 | 0.031377 |
| KNNBasic | 0.205716 | 0.094067 | 0.024034 |
| SVDpp | 0.20839 | 0.595241 | 0.034908 |
| SVD | 0.214781 | 0.263585 | 0.014443 |
| KNNWithMeans | 0.237786 | 0.151016 | 0.032645 |
| KNNWithZScore | 0.240268 | 0.31741 | 0.043443 |
| SlopeOne | 0.256172 | 0.159425 | 0.018298 |
| NormalPredictor | 0.288187 | 0.011059 | 0.040445 |
| NMF | 0.33935 | 0.579683 | 0.01773 |


#### 2.1.2 Optimising SVD Algorithm Hyperparameters 

In [9]:
trainset, testset = train_test_split(data, test_size=.20, random_state=42)

In [10]:
%%time

param_grid = {'n_factors': [120, 140, 150], 
              'n_epochs': [10, 20, 30, 50], 
              'lr_all': [0.002, 0.003, 0.004],
              'reg_all': [0.02, 0.05, 0.08, 0.1]
             }

gs = GridSearchCV(SVD, param_grid, measures=['rmse'], cv=3, n_jobs=-1, joblib_verbose=1)

gs.fit(data)
# best RMSE score
print(gs.best_score['rmse'])
# combination of parameters that gave the best RMSE score     
print(gs.best_params['rmse'])

algo = gs.best_estimator['rmse']

[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done  42 tasks      | elapsed:    9.9s
[Parallel(n_jobs=-1)]: Done 192 tasks      | elapsed:  1.1min


0.2134894463286684
{'n_factors': 120, 'n_epochs': 50, 'lr_all': 0.004, 'reg_all': 0.1}
CPU times: user 1min 29s, sys: 1.02 s, total: 1min 30s
Wall time: 2min 38s


[Parallel(n_jobs=-1)]: Done 432 out of 432 | elapsed:  2.6min finished
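The `432 out of 432` in the log above matches the size of the search: every combination in `param_grid` is fitted once per cross-validation fold. The sketch below only counts the combinations and trains nothing.

```python
from itertools import product

# Same grid as in the cell above
param_grid = {
    "n_factors": [120, 140, 150],
    "n_epochs": [10, 20, 30, 50],
    "lr_all": [0.002, 0.003, 0.004],
    "reg_all": [0.02, 0.05, 0.08, 0.1],
}
cv_folds = 3

# Cartesian product of all candidate values
combinations = list(product(*param_grid.values()))
n_combinations = len(combinations)
n_fits = n_combinations * cv_folds

print(n_combinations, n_fits)
```

Adding one more value to any list multiplies the work, so the grid grows quickly as candidates are added.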


In [11]:
#table of best parameters and best rmse score
results_df = pd.DataFrame.from_dict(gs.cv_results) 
results_df.sort_values(by='rank_test_rmse', ascending=True).head(1)

| | split0_test_rmse | split1_test_rmse | split2_test_rmse | mean_test_rmse | std_test_rmse | rank_test_rmse | mean_fit_time | std_fit_time | mean_test_time | std_test_time | params | param_n_factors | param_n_epochs | param_lr_all | param_reg_all |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 47 | 0.209998 | 0.212437 | 0.218034 | 0.213489 | 0.003364 | 1 | 1.999346 | 0.221137 | 0.033285 | 0.005868 | {'n_factors': 120, 'n_epochs': 50, 'lr_all': 0.004, 'reg_all': 0.1} | 120 | 50 | 0.004 | 0.1 |


In [12]:
#saving gs.best_estimator['rmse'] into a pickle file

with open("svd_algo.pickle", "wb") as svd_algo:
    pickle.dump(algo, svd_algo)

In [13]:
#loading gs.best_estimator['rmse'] pickle file

#best_est = open("svd_algo.pickle",'rb') 
#svd = pickle.load(best_est)
#best_est.close()
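The save and load steps can be wrapped in context managers so that the file is closed even if pickling fails. This is a minimal round-trip sketch on a plain dictionary; `save_pickle`, `load_pickle` and the temporary path are hypothetical names, and a fitted Surprise estimator would be handled the same way.

```python
import os
import pickle
import tempfile

def save_pickle(obj, path):
    """Write obj to path; the file is closed automatically."""
    with open(path, "wb") as handle:
        pickle.dump(obj, handle)

def load_pickle(path):
    """Read back whatever object was stored at path."""
    with open(path, "rb") as handle:
        return pickle.load(handle)

# Round trip in a temporary directory so nothing is left behind
with tempfile.TemporaryDirectory() as tmp_dir:
    toy_path = os.path.join(tmp_dir, "toy.pickle")
    original = {"n_factors": 120, "n_epochs": 50}
    save_pickle(original, toy_path)
    restored = load_pickle(toy_path)

print(restored == original)
```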

In [14]:
# Algorithm on trainset and testset
algo.fit(trainset)
test_pred = algo.test(testset)
print("SVD : Test Set")
accuracy.rmse(test_pred, verbose=True)

SVD : Test Set
RMSE: 0.2107


0.21073400961417244

#### 2.1.3 Surprise! Predictions on data

In [15]:
pred_df = pd.DataFrame(test_pred)                    #r_ui = original hybrid_score
pred_df.drop(labels='details',axis=1,inplace=True)   #est = estimated score with SVD algo

In [16]:
pred_df['error'] = (pred_df['est'] - pred_df['r_ui'])
pred_df1 = pred_df[pred_df['error'].between(-0.01, 0.01)]   #both bounds are inclusive by default

In [17]:
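#both frames are sorted by signed error (est - r_ui), not by absolute error
best_predictions = pred_df1.sort_values(by='error')[:10]     #10 most negative errors within +/-0.01
worst_predictions = pred_df.sort_values(by='error')[-10:]    #10 largest overestimates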

In [18]:
best_predictions

| | uid | iid | r_ui | est | error |
|---|---|---|---|---|---|
| 718 | A1F14BB4PV053A | B000FDKQCO | 0.145208 | 0.135289 | -0.009919 |
| 1093 | A1F1A0QQP2XVH5 | B002IEZJMA | 0.144144 | 0.134384 | -0.009761 |
| 822 | A2MNB77YGJ3CN0 | B003WEFSAI | 0.200813 | 0.191653 | -0.00916 |
| 117 | A2SH7OWE8QJYNC | B0014WYXYW | 0.236974 | 0.228858 | -0.008116 |
| 144 | A1HQV09XG1Z0YW | B000E1694G | 0.148387 | 0.140395 | -0.007993 |
| 481 | A18R9WW8XB2IJ6 | B0085RVY0A | 0.218172 | 0.210302 | -0.00787 |
| 214 | A2UK25KI4T6GCI | B003ZT61E2 | 0.127667 | 0.120267 | -0.0074 |
| 305 | A3OXHLG6DIBRW8 | B001EQ55RW | 0.185568 | 0.178541 | -0.007027 |
| 161 | A3POAWC2JPQQQP | B0046GRD0O | 0.328012 | 0.322289 | -0.005723 |
| 386 | A3F1G6UH4Y39X2 | B000XH8T8U | 0.189738 | 0.184312 | -0.005426 |


In [19]:
worst_predictions

| | uid | iid | r_ui | est | error |
|---|---|---|---|---|---|
| 290 | A2SB0OKNB1ODN5 | B000FK63IS | -0.406725 | 0.165924 | 0.572649 |
| 795 | A3GX6U4H2CRY71 | B001AHJ2D8 | -0.369285 | 0.210493 | 0.579778 |
| 496 | A2KBFB6A2D7PNO | B000KAJ51U | -0.44425 | 0.135757 | 0.580007 |
| 629 | A3NZVCL9N8CLHB | B0013NUGDE | -0.490374 | 0.108638 | 0.599012 |
| 38 | A2DPYMNI2HCIOI | B003VIN0QE | -0.450399 | 0.178768 | 0.629167 |
| 263 | A2PNOU7NXB1JE4 | B001EO616S | -0.507173 | 0.171899 | 0.679072 |
| 70 | A20OQMLRFNZADL | B003ANFMY8 | -0.345277 | 0.387272 | 0.732549 |
| 1047 | AYNH2BHO8SO52 | B003QNJYXM | -0.70884 | 0.031191 | 0.740031 |
| 10 | A4VMQ6ZTSXSSL | B0089SPENI | -0.507889 | 0.389837 | 0.897726 |
| 430 | A2S78HC3GA9W8M | B001TNW23U | -0.70047 | 0.198341 | 0.898811 |
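Both tables above rank predictions by the signed error `est - r_ui`, so large underestimates cannot appear among the "worst" rows. An alternative, sketched here on made-up data, is to rank by absolute error; the `abs_error` column and the toy frame are assumptions added for illustration and are not in the original notebook.

```python
import pandas as pd

# Made-up predictions with the same column names as pred_df
toy_pred = pd.DataFrame({
    "uid": ["u1", "u2", "u3", "u4"],
    "iid": ["p1", "p2", "p3", "p4"],
    "r_ui": [0.20, -0.40, 0.10, 0.90],
    "est": [0.19, 0.17, 0.105, 0.30],
})

toy_pred["error"] = toy_pred["est"] - toy_pred["r_ui"]
toy_pred["abs_error"] = toy_pred["error"].abs()

# Smallest and largest misses regardless of direction
closest = toy_pred.nsmallest(2, "abs_error")
furthest = toy_pred.nlargest(2, "abs_error")

print(closest[["uid", "error"]])
print(furthest[["uid", "error"]])
```

Here `u4` is an underestimate of about 0.6; it ranks as the largest miss by absolute error but would sit at the opposite end of a sort on signed error.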


In [20]:
#saving pred_df into pickle file

svdprediction = open("svd_predictions.pickle", "wb")
pickle.dump(pred_df, svdprediction)
svdprediction.close()

In [21]:
#loading pred_df

#svdprediction = open("svd_predictions.pickle",'rb')
#pred_df = pickle.load(svdprediction)
#svdprediction.close()

## 2.2 Matrix Factorisation - SVD

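Most users have scored only a handful of products, so the user-item matrix is very sparse. Matrix factorisation produces a low-rank approximation of that matrix, which assigns an estimated score to every user-product pair, including pairs with no observed score.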

In [22]:
svd_df_pivot = svd_df.pivot_table(index='UserId',columns='ProductId',values='hybrid_score').fillna(0)
svd_df_pivot_matrix = svd_df_pivot.to_numpy()   #as_matrix() was removed in pandas 1.0

  


In [23]:
svd_df_pivot_matrix

array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]])

In [24]:
print(svd_df_pivot.shape)
svd_df.nunique()

(2172, 2422)


UserId          2172
ProductId       2422
hybrid_score    5585
dtype: int64
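The shapes above show how sparse the pivoted matrix is. The arithmetic below uses the counts printed in this notebook (5,612 rows, 2,172 users, 2,422 products); since `pivot_table` averages any repeated user-product pairs, the density is an upper bound.

```python
# Counts taken from the outputs above
n_ratings = 5612
n_users = 2172
n_products = 2422

n_cells = n_users * n_products   # size of the user-item matrix
density = n_ratings / n_cells    # share of cells with an observed score (upper bound)
sparsity = 1 - density

print(n_cells)
print(f"density  = {density:.4%}")
print(f"sparsity = {sparsity:.4%}")
```

Roughly one cell in a thousand holds an observed score; every other cell was set to 0 by `fillna(0)`.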

In [25]:
#Truncated SVD of the original user-item matrix
#k=120 mirrors the best n_factors from the grid search; maxiter caps the solver's iterations and is not the same thing as Surprise's n_epochs
U, sigma, Vt = svds(svd_df_pivot_matrix, k = 120, maxiter = 20)

In [26]:
print(U.shape)
print(Vt.shape)
sigma = np.diag(sigma)
print(sigma.shape)

(2172, 120)
(120, 2422)
(120, 120)


In [27]:
predicted_ratings = np.dot(np.dot(U, sigma), Vt) 
cf_df = pd.DataFrame(predicted_ratings, index= svd_df_pivot.index, columns = svd_df_pivot.columns)
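The reconstruction above multiplies the three factors back together, keeping only the top `k` singular values. The sketch below shows the same idea on a small random matrix. As an assumption made to keep the example dependency-light, it uses NumPy's dense `np.linalg.svd` in place of `scipy.sparse.linalg.svds`; the matrix and the helper `rank_k_approximation` are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
toy_matrix = rng.normal(size=(6, 5))   # stand-in for the user-item matrix

# Thin SVD: toy_matrix == U_full @ diag(s_full) @ Vt_full
U_full, s_full, Vt_full = np.linalg.svd(toy_matrix, full_matrices=False)

def rank_k_approximation(k):
    """Rebuild the matrix from its k largest singular values only."""
    return U_full[:, :k] @ np.diag(s_full[:k]) @ Vt_full[:k, :]

approx_2 = rank_k_approximation(2)
approx_full = rank_k_approximation(len(s_full))

# Frobenius norm of the reconstruction error
error_2 = np.linalg.norm(toy_matrix - approx_2)
error_full = np.linalg.norm(toy_matrix - approx_full)

print(approx_2.shape, round(float(error_2), 4), round(float(error_full), 12))
```

Keeping fewer singular values gives a smoother, lower-rank matrix at the cost of a larger reconstruction error; keeping all of them reproduces the input.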

In [28]:
cf_df.head()

ProductId,B00004RAMX,B000084E6V,B000084ETV,B000084F1I,B000084F1Z,B00008DF91,B00008DFNV,B00008WUA9,B0000A0BS8,B0000AH3QW,B0000C69FB,B0000CDEPD,B0000CG4I0,B0000CNU1X,B0000CNU2Q,B0000D8DI0,B0000D9N59,B0000DCWWI,B0000DG86X,B0000DGF9V,B0000DJDJZ,B0000DJT3C,B0000E65WB,B0000EIEDS,B0000GH6U6,B0000GH6UQ,B0000GIVA0,B0000GIVDC,B0000TBK64,B0000TL6CC,B0000TLEEW,B0000VLTZY,B0000VLU0I,B0001217BS,B00012182G,B000121BY6,B00012OHZ6,B00013C2MA,B00013C2TS,B00013EWNM,B00014IVPQ,B0001590LO,B00015HOTE,B00015HOUS,B00016AU3K,B00016JGY4,B00016LA9I,B00016UX0K,B000173IHE,B00017LEXO,B00017LEY8,B00018CWLG,B00018CX06,B00018CX60,B000197ZQM,B0001BGU0C,B0001BGU3Y,B0001BVD04,B0001BVO9Y,B0001CXRLQ,...,B007FK3HHG,B007FRDXMI,B007H13SYA,B007HOWZJQ,B007I7Z3Z0,B007JBLLK6,B007JFMH96,B007K449CE,B007N04BY6,B007OSBEV0,B007OSBGOK,B007OXJJE4,B007OXJK3Y,B007OXJLM4,B007PA33MA,B007PA34DS,B007PE7ANY,B007POA2L6,B007POT6RM,B007R1PGVS,B007RJELUM,B007RTR89S,B007RTR8AC,B007RTR8TS,B007RTR9E2,B007TGDXMK,B007TGDXMU,B007TGDXNO,B007TGO1U8,B007TJGY46,B007TJGY4Q,B007TJGZ54,B007TJGZ5E,B007XXLWHW,B0080YLBTM,B00817GYZO,B0081XIA1E,B0085G4A7U,B0085G4ACA,B0085RVY0A,B0085V3YFO,B00866AM2G,B0087GH4US,B0089Q2AAA,B0089SPDUW,B0089SPENI,B008BLFCK8,B008C2JCUW,B008EG59KS,B008FHUDW0,B008O2EHNC,B008OV8RE8,B008QLRJH2,B008RWUKXK,B008Z4VAPM,B008ZRKZSM,B0090X8IPM,B0090X8JUG,B0092X7OGY,B0096EZHM2
A1007PT85CIPMD,1.900386e-08,1.094901e-06,0.005649932,0.000124912,5.731913e-06,-3.807087e-07,4.700181e-18,-0.0001445724,-0.000403327,-6.158736e-18,-0.001353531,1.261607e-05,-0.0002294644,-0.0001201503,1.411536e-05,-1.766223e-06,-0.0011128,-6.571366999999999e-19,3.3125609999999996e-19,6.932449e-05,0.0006993328,0.001798528,-1.464521e-05,-0.0003833315,-2.009398e-06,-0.001750652,-3.227368e-18,-0.002569856,-2.235948e-05,-6.896679e-17,-1.722288e-05,-5.221104e-08,-4.354803e-05,0.0001190601,-4.378709e-05,-0.0008163246,7.030277e-05,-6.176146e-06,7.239649e-06,2.396684e-05,-0.0002065322,-0.0005499148,-0.000801729,0.006816554,-0.001624297,-7.483201e-06,4.958336e-05,-6.293768e-06,0.0001030006,1.0441410000000001e-18,2.531055e-05,4.295069e-18,-5.687375e-07,9.239204000000001e-17,8.291795e-05,-8.750753e-05,-1.503251e-05,0.002892753,5.283136e-07,4.8348309999999996e-20,...,0.0001166905,6.44209e-05,2.113732e-06,6.05594e-18,-0.0043406,8.813724e-07,0.00074301,-0.0007248008,2.92914e-05,0.0003520016,1.889667e-05,0.00203294,0.005462327,-0.0005332642,-0.0003142666,-1.741159e-09,0.001283199,0.0001231279,-0.0004226071,-8.804645e-05,1.518069e-18,0.004745986,0.002270525,0.0008821839,-0.006204698,-0.005638213,0.0002758668,-0.000443512,-0.0001335547,-0.0001681903,0.00158576,-8.215199e-18,0.0003798408,0.0007016165,8.178852e-19,3.13337e-19,-9.263362e-08,-1.200638e-05,-1.400315e-18,0.0004010611,3.8062020000000004e-18,-2.461885e-08,0.01056206,-1.529069e-05,-0.004813079,0.000562378,0.0002714701,-1.524409e-06,-1.241066e-05,-0.002105232,-9.187853e-07,-1.000665e-05,-0.0001343785,-0.0003691994,1.751553e-06,-5.087143e-06,0.005275178,0.0004803073,-5.681667e-05,7.141231e-05
A100UZGZNZ9ZYN,-3.357044e-25,-6.113129e-25,1.081556e-20,-3.932414e-21,4.0427260000000003e-25,1.4175219999999999e-24,6.0198209999999995e-37,5.488328e-22,-1.421008e-21,1.486893e-36,-1.509011e-21,-1.134507e-23,2.0028370000000001e-22,-5.852211e-21,-9.876124e-22,-1.070346e-21,-2.360985e-21,4.269883e-37,7.821806e-38,-6.221856000000001e-23,1.1482179999999999e-20,1.1537469999999999e-20,4.895478e-22,-4.273568e-22,-7.335309000000001e-22,3.752862e-22,-5.493943e-37,2.303898e-20,-7.585463e-23,-5.045629e-36,4.762124000000001e-22,-1.0515419999999999e-26,-4.8550380000000005e-23,-1.924205e-22,8.035519000000001e-23,2.005331e-20,5.681548000000001e-23,1.582111e-21,1.767582e-21,-1.176483e-22,-2.007092e-22,-6.130833e-22,8.012878e-22,-2.749949e-21,7.833644e-21,-3.40448e-25,2.010269e-22,-5.965196e-24,-2.05887e-23,2.260617e-37,-2.911862e-21,-3.046235e-37,-2.4034090000000003e-23,-2.089873e-35,-3.236299e-22,5.695076000000001e-23,9.608874e-24,1.3050979999999999e-20,1.9335870000000002e-23,1.859369e-38,...,-5.70551e-23,5.457092e-24,-2.148668e-26,3.383157e-37,7.791638e-21,5.3471589999999996e-24,-2.231607e-20,-4.266081e-22,-1.11867e-21,5.516056e-22,-9.389277000000001e-23,2.359725e-21,4.478438e-22,5.006088000000001e-22,3.52971e-23,-1.671397e-25,1.308007e-21,-6.858385000000001e-22,-1.052872e-21,7.027439000000001e-23,1.3604399999999999e-36,-1.659893e-20,-2.1899919999999998e-20,-2.658939e-23,-6.971717e-21,3.644752e-21,-3.967007e-22,-1.34026e-21,2.806071e-21,-4.25987e-22,-2.066623e-21,-2.727553e-37,-2.9764479999999996e-20,-2.136697e-21,-3.304759e-37,-1.266074e-37,-9.828629e-28,3.363634e-22,6.257692e-38,1.1864429999999999e-20,4.604603e-37,-4.112831e-27,3.919191e-20,-1.359899e-23,6.0989829999999996e-21,1.968643e-20,-2.371781e-22,-5.612805e-23,5.399077000000001e-23,8.20831e-21,1.528429e-23,3.0474180000000004e-23,-7.515906000000001e-23,-3.4472029999999996e-20,-2.3567690000000002e-23,-1.1295220000000001e-23,2.9743389999999996e-20,-1.558337e-21,-2.767392e-21,-8.605625e-22
A100WO06OQR8BQ,-3.879052e-09,-2.069836e-07,-1.744426e-05,0.00039227,6.101085e-08,-1.250998e-06,1.752242e-18,-0.001003582,0.0001887492,9.818345e-19,-0.0001504203,-2.508176e-05,0.000112265,2.996395e-05,0.0002041479,-2.034555e-07,0.001267347,2.961401e-20,-1.375133e-19,-3.52601e-06,-0.001868561,0.0004599118,0.0001646492,-4.259982e-05,7.161586e-07,0.0007105212,-2.5197279999999997e-19,-8.705258e-05,1.901194e-05,1.910593e-17,0.0001647641,-1.37532e-09,-4.839571e-06,7.171403e-06,-2.379636e-05,7.251983e-05,-1.002888e-05,1.773874e-05,-1.60537e-05,1.331072e-05,0.0001157038,-6.111301e-05,-0.00205271,0.0002555298,0.0007650028,-1.300826e-08,3.993655e-06,7.055115e-07,0.0001321084,5.575401999999999e-19,0.01390597,-1.447641e-19,3.008892e-06,4.730042e-18,5.765201e-05,-9.616784e-07,-1.63399e-07,0.0002336746,-2.327247e-06,-1.597765e-20,...,4.871273e-05,-1.384903e-06,1.217142e-06,-8.866630999999999e-19,-0.0002336923,-9.483015e-07,0.01481911,0.001094844,-1.589088e-06,-4.854647e-06,5.507519e-05,-0.0009664632,0.0005732883,0.0001112115,7.683572e-06,6.220232e-12,-0.0003887287,3.882918e-06,-0.0001853939,6.657913e-06,8.952611e-19,-0.023992,0.0003833154,-0.002977269,-0.007541868,-0.001069771,2.713757e-05,0.0006859616,-0.002457889,0.0001305169,0.0004451963,-1.511003e-18,0.015565,-0.0001174175,-2.038956e-19,-7.811370999999999e-20,-3.244186e-08,-4.24234e-05,-7.280689e-20,0.0004563756,-2.9999579999999996e-19,-5.021922e-08,0.0001338797,1.794383e-06,0.0004973378,-0.002714104,-0.0003390724,1.612838e-07,-3.434686e-07,0.0006599756,-2.003439e-06,1.674641e-06,0.0002513733,-0.0008656775,9.151132e-06,-4.128744e-06,-0.007725908,-0.001096181,1.416935e-05,4.002942e-06
A103FPM7ABVMAW,-3.765871e-09,-2.842282e-07,0.0001668039,2.501437e-05,1.430053e-06,1.460817e-08,3.0348839999999997e-19,-7.632616e-05,0.0001417284,-6.426873e-18,-4.759758e-05,8.981306e-07,9.247538e-05,-2.151162e-05,1.245684e-05,1.233342e-07,-2.868289e-05,-5.780630999999999e-19,-1.641036e-23,8.138457e-06,-0.0009332565,-5.906532e-05,0.0006639844,-1.347994e-05,-1.07244e-07,8.111423e-05,6.458554999999999e-20,-0.0025783,5.424793e-06,-3.006097e-17,0.0006589956,-5.671022e-10,-1.531388e-06,2.582237e-06,1.025057e-05,0.0003088294,1.1798e-05,5.562539e-06,4.513254e-05,-1.569703e-06,3.799822e-05,-1.933802e-05,2.956603e-05,-0.0008054711,-0.000274039,-7.923951e-07,-5.537251e-06,-8.074502e-07,7.302295e-06,-2.829511e-19,-2.198382e-05,2.99995e-19,4.79103e-07,2.2742150000000002e-17,-4.993406e-05,1.738649e-06,2.940817e-07,0.001643006,-3.890211e-07,6.507292e-20,...,-1.267427e-05,1.75943e-05,-1.323011e-07,-6.056911e-19,0.0002278389,-1.546757e-06,0.0006120445,0.0003006875,-2.154979e-05,-6.214427e-05,1.035787e-05,-2.293203e-05,0.000153566,-0.0001370625,2.868275e-07,2.234092e-10,-0.0002019948,2.014055e-06,1.604079e-05,1.354002e-05,-4.582192e-18,0.001322462,4.62023e-05,-0.0008086664,0.0001886254,-0.0002587024,4.960306e-06,0.0001350227,-2.901113e-05,5.450197e-05,0.0009934282,5.596493e-18,-0.0001915311,4.404709e-06,-4.1453719999999994e-19,-1.5881179999999998e-19,4.279307e-09,8.4123e-05,1.2389389999999998e-19,0.000154049,-5.588442999999999e-19,-6.671814e-09,-0.002157781,-1.782126e-06,-0.001067245,-0.0001435628,2.816251e-05,-2.770719e-08,6.884509e-06,-0.0005171552,-2.897368e-06,-6.282121e-08,-7.995616e-05,3.10231e-05,1.540751e-06,7.864505e-07,-0.001578278,0.0001583576,-1.017241e-05,2.022832e-06
A106ZCP7RSXMRU,5.746364e-09,-2.835399e-08,-0.0002234826,1.667051e-05,4.242912e-08,2.780852e-08,1.5108829999999998e-19,-1.683586e-05,4.654259e-06,-5.289226e-20,-4.348669e-05,-5.163451e-07,4.274013e-06,-7.364817e-06,7.679168e-06,-6.160291e-08,1.865855e-05,4.7786149999999996e-20,5.930204e-22,-4.334844e-07,0.0001027055,-0.0001425695,1.898595e-05,-1.231582e-05,-9.05721e-09,4.743328e-05,1.679101e-20,5.275103e-05,-2.694056e-07,-5.469049999999999e-19,1.885454e-05,-8.253775e-10,-1.399125e-06,-2.272685e-06,-8.152115e-07,8.371815e-05,4.133859e-07,3.472655e-06,1.894889e-05,1.288396e-06,-0.0001013383,-1.766784e-05,-4.965791e-05,-0.0001834592,-0.000152538,-7.889572e-08,1.216848e-06,-7.167248e-07,1.71115e-06,-3.598458e-21,4.002333e-05,-8.069334e-22,2.109287e-07,5.747466e-19,4.293349e-06,-1.892611e-06,-3.228131e-07,-2.68054e-05,-1.535786e-07,4.314942e-22,...,-7.04012e-06,-6.858396e-07,-2.046718e-08,-2.970208e-21,7.221248e-05,9.980872e-08,0.0005045985,4.744336e-05,-1.526384e-05,3.503224e-06,-7.862019e-06,-1.554902e-05,6.826457e-05,9.922196e-06,6.197624e-07,4.264983e-12,-1.374312e-05,-5.475506e-07,0.0002139413,1.348458e-06,2.114302e-20,0.0004209674,-0.0001632513,-0.001329125,0.001762024,-7.410769e-05,-4.904789e-06,-0.0001054022,0.0001707499,7.093744e-07,-2.180179e-06,-3.302708e-20,-0.0008580654,-1.907296e-05,6.930622e-22,2.655165e-22,-2.394221e-11,2.161098e-05,-1.412367e-22,1.03008e-05,2.9125439999999997e-20,2.402313e-09,0.0001449354,-1.706156e-06,6.287407e-05,0.000203953,-1.31034e-05,2.828949e-07,2.144055e-06,-0.0001514139,-7.421204e-08,2.72024e-07,7.58479e-06,6.470809e-05,-1.692158e-06,3.463092e-07,-1.170343e-05,4.45152e-05,-3.482674e-06,3.955889e-06


In [29]:
cf_df.to_csv('../data/cf_df.csv')

## 2.3 Recommender Systems

#### 2.3.1 Item Recommender

- **Input**: Product id
- **Output**: Products with similar categories and a similar hybrid score

In [30]:
recsys_df = pd.read_csv('../data/recsys_df.csv')
recsys_df.drop(labels='Unnamed: 0',axis=1,inplace=True)

In [31]:
recsys_df.columns = ['UserId', 'ProductId', 'hybrid_score', 'product_name', 'CAT1', 'CAT2', 'CAT3', 'CAT4', 'CAT5', 'CAT6']

In [32]:
for i in recsys_df[['CAT1', 'CAT2', 'CAT3', 'CAT4', 'CAT5', 'CAT6']]:
    recsys_df[i] = recsys_df[i].str.replace("'", '')

In [33]:
recsys_df.head()

| | UserId | ProductId | hybrid_score | product_name | CAT1 | CAT2 | CAT3 | CAT4 | CAT5 | CAT6 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AOVROBZ8BNTP7 | B001EO5QW8 | 0.494453 | McCANN'S Instant Irish Oatmeal, Variety Pack of Regular, Apples & Cinnamon, and Maple & Brown Sugar, 10-Count Boxes (Pack of 6) | Grocery & Gourmet Food | Breakfast Foods | Cereals | Oatmeal | - | - |
| 1 | A27TKQHFW0FB5N | B001GVISJC | 0.069048 | Albanese Concord Grape Gummi Bears, 5 Pound Bags (Pack of 2) | Grocery & Gourmet Food | Candy & Chocolate | Jelly Beans & Gummy Candy | Gummy Candy | - | - |
| 2 | A37AO20OXS51QA | B001UJEN6C | 0.223402 | Steaz Berry Energy Shot, 2.5 Ounce Bottles (Pack of 12) | Grocery & Gourmet Food | Beverages | Bottled Beverages | Water & Drink Mixes | Energy Drinks | - |
| 3 | A2LFHPZFG1OHBZ | B001UJEN6C | 0.092079 | Steaz Berry Energy Shot, 2.5 Ounce Bottles (Pack of 12) | Grocery & Gourmet Food | Beverages | Bottled Beverages | Water & Drink Mixes | Energy Drinks | - |
| 4 | ALSAOZ1V546VT | B001ELL6O8 | 0.403563 | Arrowhead Mills Pancake & Waffle Mix, Buttermilk, 2 Pound Bags (Pack of 4) | Grocery & Gourmet Food | Pantry Staples | Cooking & Baking | Baking Mixes | Pancakes & Waffles | - |


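##### Product similarity

We take the cosine similarity between the product columns of the reconstructed score matrix `cf_df`, which gives one similarity score for every pair of products.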

In [34]:
product_similarity = cosine_similarity(cf_df.T)
product_similarity_df = pd.DataFrame(product_similarity, index = svd_df_pivot.columns, columns = svd_df_pivot.columns)
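Cosine similarity between two product columns is their dot product divided by the product of their lengths. The sketch below computes it by hand with NumPy on a made-up 3x3 matrix; `cosine_similarity_columns` is a hypothetical helper, shown only to make the calculation explicit.

```python
import numpy as np

def cosine_similarity_columns(matrix):
    """Pairwise cosine similarity between the columns of a 2-D array."""
    norms = np.linalg.norm(matrix, axis=0)
    norms[norms == 0] = 1.0        # leave all-zero columns as zeros
    unit_columns = matrix / norms  # scale every column to unit length
    return unit_columns.T @ unit_columns

# Rows are users, columns are products (values are made up)
toy_scores = np.array([
    [1.0, 2.0, 0.0],
    [0.0, 0.0, 3.0],
    [2.0, 4.0, 0.0],
])
toy_similarity = cosine_similarity_columns(toy_scores)
print(toy_similarity)
```

The first two products point in the same direction, so their similarity is 1 even though one column is twice as large; the third shares no users with them and scores 0.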

In [35]:
product_similarity_df.head(2)

ProductId,B00004RAMX,B000084E6V,B000084ETV,B000084F1I,B000084F1Z,B00008DF91,B00008DFNV,B00008WUA9,B0000A0BS8,B0000AH3QW,B0000C69FB,B0000CDEPD,B0000CG4I0,B0000CNU1X,B0000CNU2Q,B0000D8DI0,B0000D9N59,B0000DCWWI,B0000DG86X,B0000DGF9V,B0000DJDJZ,B0000DJT3C,B0000E65WB,B0000EIEDS,B0000GH6U6,B0000GH6UQ,B0000GIVA0,B0000GIVDC,B0000TBK64,B0000TL6CC,B0000TLEEW,B0000VLTZY,B0000VLU0I,B0001217BS,B00012182G,B000121BY6,B00012OHZ6,B00013C2MA,B00013C2TS,B00013EWNM,B00014IVPQ,B0001590LO,B00015HOTE,B00015HOUS,B00016AU3K,B00016JGY4,B00016LA9I,B00016UX0K,B000173IHE,B00017LEXO,B00017LEY8,B00018CWLG,B00018CX06,B00018CX60,B000197ZQM,B0001BGU0C,B0001BGU3Y,B0001BVD04,B0001BVO9Y,B0001CXRLQ,...,B007FK3HHG,B007FRDXMI,B007H13SYA,B007HOWZJQ,B007I7Z3Z0,B007JBLLK6,B007JFMH96,B007K449CE,B007N04BY6,B007OSBEV0,B007OSBGOK,B007OXJJE4,B007OXJK3Y,B007OXJLM4,B007PA33MA,B007PA34DS,B007PE7ANY,B007POA2L6,B007POT6RM,B007R1PGVS,B007RJELUM,B007RTR89S,B007RTR8AC,B007RTR8TS,B007RTR9E2,B007TGDXMK,B007TGDXMU,B007TGDXNO,B007TGO1U8,B007TJGY46,B007TJGY4Q,B007TJGZ54,B007TJGZ5E,B007XXLWHW,B0080YLBTM,B00817GYZO,B0081XIA1E,B0085G4A7U,B0085G4ACA,B0085RVY0A,B0085V3YFO,B00866AM2G,B0087GH4US,B0089Q2AAA,B0089SPDUW,B0089SPENI,B008BLFCK8,B008C2JCUW,B008EG59KS,B008FHUDW0,B008O2EHNC,B008OV8RE8,B008QLRJH2,B008RWUKXK,B008Z4VAPM,B008ZRKZSM,B0090X8IPM,B0090X8JUG,B0092X7OGY,B0096EZHM2
B00004RAMX,1.0,-0.031951,-2e-05,-0.000368,0.005641,0.000494,-0.219149,0.007424,-0.004446,0.013934,-0.001174,-0.009274,-0.006115,-0.000561,-0.000279,4.465584e-06,0.002859,-0.151254,0.054056,-0.003932,-0.005157,0.001137,0.0367,-0.001174,-5.1e-05,0.000387,0.064956,0.000639,0.00161,0.027193,0.037388,0.005227,-0.001174,-0.001277,-0.020008,0.000926,0.000369,0.59077,0.033533,-0.001046,-0.059003,-0.001174,0.004293,-0.016799,-0.000656,-0.001525,-0.009071,0.017697,0.000133,0.007663,0.000454,0.118007,0.004135,-0.048784,-0.000653,0.004262,0.004364,-0.003239,-0.004047,-0.021901,...,0.000309,-0.01011,-0.001714,-0.13847,0.000373,-0.003129,-0.01836,-0.003938,-0.040967,-0.017706,0.01267,0.00212,-0.002118,0.007648,-0.008462,2.4e-05,-0.017995,-0.002017,0.015313,0.007057,-0.038491,-0.016797,-0.006674,-0.013131,-0.004539,0.010171,-0.00231,0.010336,0.005415,0.127276,-0.006682,0.031386,-0.001891,-0.000447,-0.069654,-0.069654,0.02842,0.039831,-0.040179,0.002814,-0.061282,0.020466,-0.009271,0.017397,0.004393,0.002259,-0.001124,0.004241,0.000759,-9.7e-05,-0.064916,0.000447,0.012149,0.0009,-0.032263,-0.000744,-0.008421,-0.015459,-0.000561,-0.000464
B000084E6V,-0.031951,1.0,0.254623,0.023676,-0.294086,-0.161908,-0.256178,0.029892,0.131885,-0.020257,-0.018807,0.019466,0.191439,0.006755,0.017496,3.490694e-07,-0.22322,0.127922,-0.044983,0.086065,0.002074,-0.014151,-0.159112,-0.018807,0.000182,-0.018887,0.182008,-0.270882,0.21368,0.162097,-0.163526,0.057003,-0.018807,0.092891,0.019807,-0.250794,0.093431,0.001916,0.020161,-0.017179,0.062623,-0.018807,0.102156,0.020644,0.001845,-0.06908,0.011412,0.034791,-0.015039,-0.024117,-0.024534,0.111505,-0.015858,0.145858,0.12035,0.057351,0.058127,-0.134756,0.015773,-0.046581,...,-0.004567,0.122259,0.135985,0.252142,-0.000252,-0.109036,0.056684,-0.202764,-0.024288,-0.141977,-0.0668,-0.031268,-0.049098,0.07341,-0.089636,0.34402,-0.061237,-0.004906,0.004356,-0.679859,0.138701,0.103087,-0.004022,0.033572,-0.024568,0.063376,0.18032,-0.042594,-0.116751,0.035601,-0.022104,0.095124,-0.005305,0.316758,0.237358,0.237358,-0.029167,-0.009002,0.031228,-0.004846,-0.088573,0.015994,-0.072378,0.021197,0.056046,0.010948,0.067794,0.001304,-0.073529,-0.022917,-0.072526,-0.316758,0.150681,-0.006445,-0.014095,0.008088,-0.146154,0.129428,0.006755,-0.028697


In [41]:
product_similarity.shape

(2422, 2422)
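Because `product_similarity_df` carries product IDs on both axes, the nearest neighbours of a product can also be looked up by label instead of by position. This is a minimal sketch on a made-up 4x4 similarity frame; `top_n_similar` is a hypothetical helper, not the notebook's recommender.

```python
import pandas as pd

def top_n_similar(similarity_df, product_id, n=3):
    """Return the n products most similar to product_id, excluding itself."""
    scores = similarity_df.loc[product_id].drop(labels=product_id)
    return scores.sort_values(ascending=False).head(n)

toy_ids = ["p1", "p2", "p3", "p4"]
toy_similarity_df = pd.DataFrame(
    [
        [1.0, 0.8, -0.1, 0.3],
        [0.8, 1.0, 0.2, 0.5],
        [-0.1, 0.2, 1.0, 0.0],
        [0.3, 0.5, 0.0, 1.0],
    ],
    index=toy_ids,
    columns=toy_ids,
)

neighbours = top_n_similar(toy_similarity_df, "p1", n=2)
print(neighbours)
```

Dropping the query product by label avoids relying on it sorting into the first position, which may not hold when several products tie at a similarity of 1.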

In [36]:
#product_similarity_df.to_csv('../data/product_similarity_df.csv')

In [37]:
#product_similarity_df = pd.read_csv('../data/product_similarity_df.csv')


##### Recommender DataFrame

In [39]:
df = recsys_df.drop(labels='UserId',axis=1) 
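df1 = df.groupby(by='ProductId')[['hybrid_score']].mean()   #mean hybrid_score per product; selecting the column avoids averaging text columns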
df2 = df.drop(labels=['hybrid_score'],axis=1)

In [40]:
recommender_df = df1.merge(df2,on='ProductId',how='inner')
recommender_df.drop_duplicates('ProductId',keep='first',inplace=True)
recommender_df.reset_index(inplace=True)
recommender_df.drop(labels='index',axis=1,inplace=True)
recommender_df.set_index('ProductId', inplace=True)

In [66]:
#hybrid_score is averaged per product in recommender_df
recommender_df.head()

ProductId,hybrid_score,product_name,CAT1,CAT2,CAT3,CAT4,CAT5,CAT6
B00004RAMX,0.053044,Victor Easy Set Gopher Trap 0610 - Weather-resistant,Patio,Lawn & Garden,Pest Control,Traps,-,-
B000084E6V,0.124407,Nylabone Dental Dinosaur Chew,Pet Supplies,Dogs,Toys,Chew Toys,-,-
B000084ETV,0.240298,"Canidae Dry Dog Food For All Life Stages, Chicken, Turkey, Lamb, And Fish, 30-Pound",Pet Supplies,Dogs,Food,Dry,-,-
B000084F1I,0.142501,Diamond Naturals Adult Real Meat Recipe Premium Dry Dog Food With Grain and Real Pasture Raised Beef Protein 40Lb,Pet Supplies,Dogs,Food,Dry,-,-
B000084F1Z,-0.030103,Natural Balance Limited Ingredient Dog Treats,Pet Supplies,Dogs,Treats,Cookies,Biscuits & Snacks,-


In [43]:
# create a Series of product ids so each one is associated with an ordered numerical index
indices = pd.Series(recommender_df.index)

# function returns a dataframe of similar products and their details,
# chosen from 15 candidates with the highest cosine similarity scores
def recommender(pid, product_similarity = product_similarity):
    
    # initializing an empty list 
    recommended_products = []
    
    # getting the index of product that matches the product id
    idx = indices[indices == pid].index[0]

    # create a Series of similarity scores sorted from highest to lowest
    sim_score = pd.Series(product_similarity[idx]).sort_values(ascending=False)

    # getting the indexes of 15 of the most similar products
    # (positions 0 and 1 of the ranking are skipped; position 0 is normally the product itself)
    top_15_indexes = list(sim_score.iloc[2:17].index)
    
    # populating the list with the product ids of those 15 products
    for i in top_15_indexes:
        recommended_products.append(list(recommender_df.index)[i])
    
    #create dataframe to merge productid details
    a = pd.DataFrame(recommended_products)
    a.columns = ['ProductId']
    b = a.merge(recommender_df, on='ProductId',how='inner')

    # recommend candidates in the same CAT3 category as the query product,
    # highest hybrid_score first (the result is empty if no candidate matches)
    same_cat3 = b.CAT3.isin(recommender_df[recommender_df.index == pid].CAT3)
    return b[same_cat3].sort_values(by=['CAT3','hybrid_score'],ascending=False)
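
If the finer category levels (CAT4 to CAT6) should also influence the result, one option is a tiered fallback that tries the most specific match first. Note that a pandas selection such as `b[mask]` is always a DataFrame and never `None`, so "no match" has to be detected with `.empty`. The sketch below is only an illustration: the helper name `filter_by_category` and the toy data are assumptions, not part of the notebook.

```python
import pandas as pd

def filter_by_category(candidates, target, levels=('CAT3', 'CAT4', 'CAT5', 'CAT6')):
    """Return candidates matching `target` on as many leading levels as possible.

    Hypothetical helper: tries all levels first, then drops the most specific
    level until some rows match. Returns an empty frame if nothing matches.
    """
    for depth in range(len(levels), 0, -1):
        mask = pd.Series(True, index=candidates.index)
        for level in levels[:depth]:
            mask &= candidates[level] == target[level]
        matched = candidates[mask]
        if not matched.empty:          # a DataFrame is never None; test .empty
            return matched.sort_values('hybrid_score', ascending=False)
    return candidates.iloc[0:0]        # no match at any level

# toy data (made up for illustration)
candidates = pd.DataFrame({
    'ProductId': ['P1', 'P2', 'P3'],
    'hybrid_score': [0.2, 0.7, 0.5],
    'CAT3': ['Coffee', 'Coffee', 'Cookies'],
    'CAT4': ['Tea & Cocoa', 'Tea & Cocoa', 'Chocolate'],
    'CAT5': ['Tea', 'Coffee', '-'],
    'CAT6': ['Black', 'Ground Coffee', '-'],
})
target = {'CAT3': 'Coffee', 'CAT4': 'Tea & Cocoa',
          'CAT5': 'Coffee', 'CAT6': 'Roasted Coffee Beans'}

result = filter_by_category(candidates, target)
```

Here no candidate matches all four levels, so the helper falls back to CAT3 to CAT5 and keeps only the candidate that matches on those three.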

In [44]:
recommender_df[recommender_df.index == 'B000I6PXLC']

ProductId,hybrid_score,product_name,CAT1,CAT2,CAT3,CAT4,CAT5,CAT6
B000I6PXLC,0.404071,"Coffee Masters Flavored Coffee, Dark Chocolate Decadence, Whole Bean, 12-Ounce Bags (Pack of 4)",Grocery & Gourmet Food,Beverages,Coffee,Tea & Cocoa,Coffee,Roasted Coffee Beans


In [45]:
recommender('B000I6PXLC')
#similar product(s) to 'B000I6PXLC'

Unnamed: 0,ProductId,hybrid_score,product_name,CAT1,CAT2,CAT3,CAT4,CAT5,CAT6
2,B00021VGPU,0.73965,Kusmi Tea Russian Morning Black Tea - Blend of Ceylon and China Black Tea with Round Chocolate Taste Perfect for Breakfast Feel Refreshed and Rejuvenated (4.4oz Tin 50 Servings),Grocery & Gourmet Food,Beverages,Coffee,Tea & Cocoa,Tea,Black


In [46]:
recommender_df.sort_values(by='hybrid_score',ascending=False)

ProductId,hybrid_score,product_name,CAT1,CAT2,CAT3,CAT4,CAT5,CAT6
B003VAM4G4,0.985700,"Navitas Organics Wheatgrass Powder, 1-Ounce Pouches",Health & Household,Vitamins & Dietary Supplements,Supplements,-,-,-
B000H6JO1S,0.960500,Chef Jays - Tri-O-Plex Cookies - 12 per Box (85g each) - Chocolate Chip,Grocery & Gourmet Food,Breads & Bakery,Cookies,Chocolate Chip,-,-
B001GM969M,0.946909,"Honey Granules, 8 oz.",Grocery & Gourmet Food,Pantry Staples,Cooking & Baking,Syrups,Sugars & Sweeteners,Honey
B000GZSCW2,0.943200,"Fantastic World Foods Instant Refried Beans, 7-Ounce Boxes (Pack of 12)",Grocery & Gourmet Food,Pantry Staples,Dried Beans,Lentils & Peas,Beans,Pinto
B000JKJJHG,0.875200,Daves Insanity Sauce Gift Set - A Total Lapse of Reason. The Hottest Set Known to Man! This Set is the Four Hottest Sauces On the Planet. This is in All Likelyhood More Than a Lifetime Supply of Blazing Face Torture.,Grocery & Gourmet Food,Food & Beverage Gifts,Sauce,Gravy & Marinade Gifts,-,-
B002SVAYGY,0.845910,C & H Baker`s Sugar Ultra-Fine Pure Cane Sugar 4 lbs. (Pack of 3),Grocery & Gourmet Food,Pantry Staples,Cooking & Baking,Syrups,Sugars & Sweeteners,Sugars
B000E1FZCI,0.793280,"Good Seasons Garlic & Herb Salad Dressing & Recipe Mix (0.75 oz Packets, Pack of 24)",Grocery & Gourmet Food,Pantry Staples,Condiments & Salad Dressings,Salad Dressings,Italian,-
B0045TEG2K,0.784635,"Pillsbury Moist Supreme Sugar Free Devil's Food Cake Mix, 16 Ounces (Pack of 6)",Grocery & Gourmet Food,Pantry Staples,Cooking & Baking,Baking Mixes,Cakes,-
B000EMOD4I,0.761377,"Betty Crocker Baking Mix, Muffin & Quick Bread Mix, Cinnamon Streusel, 13.9 Oz Box (Pack of 12)",Grocery & Gourmet Food,Pantry Staples,Cooking & Baking,Baking Mixes,Muffins,-
B00021VGPU,0.739650,Kusmi Tea Russian Morning Black Tea - Blend of Ceylon and China Black Tea with Round Chocolate Taste Perfect for Breakfast Feel Refreshed and Rejuvenated (4.4oz Tin 50 Servings),Grocery & Gourmet Food,Beverages,Coffee,Tea & Cocoa,Tea,Black


Our item-based recommender shortlists products by cosine similarity to the query product, keeps those in the same category, and ranks them by hybrid score. For the example above it returns a chocolate-flavoured black tea for a chocolate-flavoured coffee, which looks like a reasonable recommendation.
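
As a reminder of what the similarity ranking does, here is a minimal sketch with NumPy on a made-up item matrix: each row is a product vector, the cosine similarity matrix is the dot product of the normalised rows, and the neighbours of one product are ranked after dropping the product itself. The vectors and the helper `most_similar` are assumptions for illustration; the notebook's `product_similarity` was built from the real data.

```python
import numpy as np

# toy item vectors (rows = products); values are made up
item_vectors = np.array([
    [1.0, 0.0, 1.0],    # product 0
    [0.9, 0.1, 1.1],    # product 1, close to product 0
    [0.0, 1.0, 0.0],    # product 2, orthogonal to product 0
    [-1.0, 0.0, -1.0],  # product 3, opposite of product 0
])

# cosine similarity = dot product of L2-normalised rows
norms = np.linalg.norm(item_vectors, axis=1, keepdims=True)
unit = item_vectors / norms
similarity = unit @ unit.T

def most_similar(idx, similarity, k=2):
    """Return the k row indices most similar to `idx`, excluding `idx` itself."""
    order = np.argsort(-similarity[idx])   # descending similarity
    order = order[order != idx]            # drop the query product
    return order[:k].tolist()

neighbours = most_similar(0, similarity)
```

Dropping the query product by its index, rather than by slicing off a fixed number of leading positions, keeps every genuine neighbour in the ranking.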

#### 2.3.2 User-based Collaborative Filtering

- **Input:** User Id
- **Output:** Recommended products for this user, based on previous purchases and likely hybrid scores
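
A rough sketch of how a matrix of predicted scores can be obtained from a truncated SVD: factorise the user-by-product matrix, keep the `k` largest singular values, and multiply the factors back together. The toy data and the choice `k = 2` are assumptions for illustration, and `numpy.linalg.svd` is used here for brevity; the notebook's `cf_df` may have been built differently.

```python
import numpy as np

# toy user x product score matrix (0 = no interaction); values are made up
scores = np.array([
    [0.9, 0.8, 0.0, 0.0],
    [0.8, 0.0, 0.1, 0.0],
    [0.0, 0.1, 0.9, 0.7],
    [0.0, 0.0, 0.8, 0.9],
])

k = 2  # number of latent factors to keep (assumed)

# full SVD, then keep only the k largest singular values
U, s, Vt = np.linalg.svd(scores, full_matrices=False)
predicted = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# transpose so that rows are products and columns are users, like cf_df1
predicted_by_product = predicted.T
```

The low-rank reconstruction fills the zero cells with small non-zero values, which is why a matrix like `cf_df1` contains many scores close to zero for products a user never reviewed.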

In [47]:
#cf_df = pd.read_csv('../data/cf_df.csv')
#cf_df.drop_index(inplace=True)

In [48]:
recsys_df.head(1)

Unnamed: 0,UserId,ProductId,hybrid_score,product_name,CAT1,CAT2,CAT3,CAT4,CAT5,CAT6
0,AOVROBZ8BNTP7,B001EO5QW8,0.494453,"McCANN'S Instant Irish Oatmeal, Variety Pack of Regular, Apples & Cinnamon, and Maple & Brown Sugar, 10-Count Boxes (Pack of 6)",Grocery & Gourmet Food,Breakfast Foods,Cereals,Oatmeal,-,-


In [49]:
#User-based collaborative filtering
cf_df1 = cf_df.T

In [50]:
cf_df1.head()

UserId,A1007PT85CIPMD,A100UZGZNZ9ZYN,A100WO06OQR8BQ,A103FPM7ABVMAW,A106ZCP7RSXMRU,A1076UA29SK59D,A1080SE9X3ECK0,A109L3WXD1SJFU,A10H24TDLK2VDP,A10IKHRUSMKP46,A10LIGIT9EGCM9,A10PEXB6XAQ5XF,A10R9LB4QJNG5X,A10TYGME2FQHO7,A10U8DJAPJJI8I,A10XLFE3T83WQM,A10ZXUNZNUJY0Z,A1194J1H29WSV,A11A9AVEM5EVU4,A11E8ZT3WEMH2Y,A11ED8O95W2103,A11EIDY6DD40CS,A11KZ906QD08C5,A11OTLEDSW8ZXD,A11T807LX2EF00,A11XAIFA10G7TS,A1205T8NP2BQ5E,A121PLHXGZXXUJ,A124PSAV4UV3BX,A124URARVE9S89,A12DQZKRKTNF5E,A12ENBT314RFXR,A12IRGQLFE4EBA,A12MQA7IMXZ7JT,A12NM11F1CCN2O,A12O5IJUK0EHIU,A12R3YGEHW7D8G,A12Y0N1S2C3YAB,A12YPC3CGHLDO5,A130VGG4P4PW5J,A131S7JQCEPFOM,A1347KUESVCYZ,A135XHGMBR0OWF,A137F1PRW4SB2Z,A13853O9CBLTEY,A13HRSMJ5TOWEZ,A13J10QRUKSLSL,A13K3ZLWAWN1EI,A13NTM92VE1U2Y,A13S959ZBAOU53,A13T0V3LHOTHDL,A144LF2QWLG1ZL,A14738H3YYX7ZC,A147FUNITGB21I,A149XXYGR6WKS9,A14BAM6KBGBWJ2,A14DV28G9OCFL0,A14EF1PPKMSEPU,A14ENWEKTHCBXR,A14HZ5EMD2WCG,...,AW7BIYHXUIZ62,AWAB7PKBO3BBT,AWBGHDHH7E51F,AWBMGLP57SAGK,AWGXF4XREHKBR,AWHZ4K1IXPFRZ,AWKZAUC0D8DYL,AWLK6NSSV0YNA,AWMZ9VHF1Q9PI,AWNSQQJ44NPBT,AWNV3TK4FNF45,AWPODHOB4GFWL,AWZR0O65DL2Q,AX0XNE6IX7N3M,AX1SE25U7P6I8,AX5JZLRL9KN9B,AX7QMRXX81L9K,AX9QZGAJOZ96O,AXC8TDCIET6LC,AXHTH0EL75SOJ,AXJGCAD36N915,AXJYL607ABWIB,AXO4PQU0XG3TG,AXQIHSF9KK7CO,AXQNEMI9N0Z2D,AXRJWP1UXPEBB,AXU3VKZE848IY,AXV5CT7AG4SYO,AXVNVV5VH5XZY,AXXWXM6K66YMZ,AY0WPNYO66YAA,AY12DBB0U420B,AY1EF0GOH80EK,AY1L1H0MUMAMC,AY1YNN6PAYNW9,AY3XPKRAMKKY7,AY54QSGO3KWEM,AY889QQ9SMKMB,AYB4ELCS5AM8P,AYDS27E60FH0A,AYGEP8I4BQ3CK,AYGIIQGSHKZNI,AYGJ96W5KQMUJ,AYHHNMEJ271NL,AYNAH993VDECT,AYNH2BHO8SO52,AYOMAHLWRQHUG,AYQWJUNE09ZWE,AYWHCM0TJ4737,AYWPUWMMWS40Y,AYWUHB7N8XGZQ,AZ5X928CQPRJN,AZBZ6AMM3Z492,AZM22KBPUN0BH,AZMTHQIU02OGB,AZNSBRQ0DS8LK,AZV26LP92E6WU,AZWRZZAMX90VT,AZXON596A1VXC,AZZFJQFHITBZ5
ProductId
B00004RAMX,1.900386e-08,-3.357044e-25,-3.879052e-09,-3.765871e-09,5.746364e-09,1.333628e-22,-1.168493e-21,6.969149e-08,6.124042e-09,-2.369803e-06,-1.280908e-09,2.87182e-09,5.108821e-09,-5.318381e-09,-4.924217e-10,6.710325e-09,1.949246e-09,1.191089e-09,2.233513e-09,4.256301e-09,3.820716e-11,3.116236e-08,9.296952e-10,-6.237484e-08,5.241071e-08,5.644212e-12,6.568608e-12,7.901925e-10,-1.353747e-22,6.430573e-10,-1.082842e-08,2.390505e-10,7.519613e-22,-3.244749e-08,-3.748202e-10,9.807915e-24,2.353979e-09,-6.177004000000001e-23,7.840129e-10,9.468128e-09,-2.892166e-08,6.294298e-11,-6.762569e-22,3.84936e-09,9.941522e-10,-4.677769e-09,-2.400593e-08,-8.697814e-08,-1.24833e-10,-1.114502e-09,-9.524199e-11,-1.800865e-09,-4.302219e-23,-1.604774e-09,-5.319398e-23,-7.807737e-12,2.493503e-08,2.933883e-09,-3.513878e-08,3.351483e-09,...,-2.538674e-10,-2.530769e-08,-1.084294e-07,-6.649112e-13,3.953216e-09,5.19128e-08,6.191523e-08,-1.269447e-07,1.020859e-09,8.669646000000001e-23,2.43923e-10,-1.118303e-09,-5.736061e-09,-1.98897e-09,-5.194785e-11,2.247262e-10,-1.148767e-08,-1.825564e-08,5.78241e-09,9.491384e-10,2.505745e-07,8.841272e-09,-3.100044e-09,-2.313851e-09,-3.380222e-08,9.181096e-10,4.997434e-09,3.371393e-09,7.441169e-10,-5.017164e-09,-1.069677e-08,-3.325451e-09,2.689552e-07,-1.011153e-08,2.232332e-12,6.944244e-08,-1.413774e-22,7.46811e-10,3.602243e-08,2.842646e-22,-2.48555e-09,-5.902306e-09,2.055649e-08,-4.547416e-11,-6.252765e-09,1.383359e-08,2.934115e-08,-5.674633e-10,-1.075415e-08,5.922914e-12,-1.705813e-08,-4.084294e-09,1.979491e-10,7.179716e-11,-1.022857e-09,-4.754305e-10,8.051291e-09,2.223868e-09,5.969819e-10,-3.234615e-12
B000084E6V,1.094901e-06,-6.113129e-25,-2.069836e-07,-2.842282e-07,-2.835399e-08,2.726274e-22,1.251049e-21,3.377316e-08,-4.959213e-08,4.798248e-08,1.119578e-07,-8.948092e-08,4.865227e-07,4.727911e-08,1.303541e-07,6.010642e-10,-3.533608e-08,3.428349e-09,-1.187187e-10,8.920249e-08,-1.510774e-09,-4.704187e-09,-3.733667e-09,-2.337604e-07,-2.802024e-08,2.921179e-11,7.036553e-08,-2.457867e-08,1.9066840000000002e-23,-3.505148e-09,1.137412e-07,-1.417678e-10,3.622462e-21,4.417509e-08,1.08997e-09,1.20184e-22,-4.151807e-09,-2.5184890000000003e-23,6.665318e-10,-8.951652e-09,3.467417e-08,-1.321906e-08,-5.933576000000001e-22,-1.630502e-08,2.554954e-10,-2.91515e-09,1.125626e-07,4.096693e-07,-2.758249e-09,-6.028615e-08,-1.975035e-08,-2.277011e-08,2.6875210000000004e-23,6.076917e-07,-1.450158e-21,4.258224e-09,-1.896817e-07,1.318653e-09,-3.674558e-07,-2.548229e-08,...,4.083254e-08,-2.197111e-07,1.914466e-07,5.667567e-11,1.989531e-10,-9.296533e-09,-5.20982e-08,2.342749e-07,-1.08181e-07,1.242504e-21,-6.516212e-09,3.426901e-07,-9.19051e-08,-4.685522e-08,7.507043e-10,-3.020641e-09,-1.679935e-07,3.630798e-07,3.055198e-08,8.462912e-09,8.97051e-08,2.737667e-08,1.287693e-08,-1.211724e-07,-2.441792e-06,-9.728374e-08,-4.387583e-07,1.231837e-09,4.681476e-08,-3.559793e-08,1.255694e-08,-3.586836e-08,-3.60317e-06,2.02739e-07,-3.811352e-12,5.557715e-08,5.439934000000001e-22,-1.342629e-09,3.07007e-08,2.818888e-21,1.405183e-07,6.852882e-08,-4.036623e-07,-3.567051e-09,-1.80411e-07,2.463758e-07,1.965465e-07,-4.019906e-08,-1.537391e-07,-1.282063e-09,-1.778195e-07,5.816709e-08,-1.307578e-08,-3.721619e-10,8.871844e-08,2.259008e-07,-3.594716e-07,2.078559e-09,-1.319701e-09,5.565972e-10
B000084ETV,0.005649932,1.081556e-20,-1.744426e-05,0.0001668039,-0.0002234826,2.110498e-18,4.0397000000000005e-17,0.001247845,0.001204822,0.0002398774,5.008614e-05,-0.001146053,0.000589921,0.001282203,-0.0008090621,-1.233604e-05,0.01648042,-4.375412e-06,-3.269553e-09,0.002765477,-1.905738e-06,0.0006065249,0.0001050813,-0.000241034,0.0006827352,1.578381e-07,-7.704971e-05,5.705323e-05,1.2193590000000001e-17,-2.046822e-05,-0.0004592226,2.271314e-05,-1.8066740000000002e-17,9.93417e-05,-2.566464e-08,3.272872e-18,-0.0001964733,-4.667929999999999e-19,9.596393e-06,-4.264895e-05,8.356046e-05,0.0001594383,-1.8061770000000003e-17,-0.004919184,-0.0001329905,-0.0004775159,0.0003635173,0.001284978,-1.729445e-05,0.0001460593,1.511645e-05,-0.0001324274,1.0473239999999999e-19,-3.349419e-05,1.9882020000000002e-17,-1.300404e-06,0.0004220512,9.995727e-05,0.0001346115,3.377898e-05,...,-0.0002648303,-8.387653e-06,0.0004293332,2.604541e-07,1.773978e-05,0.0006857541,-0.003086578,0.0005126218,-0.0001206202,3.3838770000000004e-17,-0.000140369,-0.0008895756,7.01656e-05,-0.00103152,-9.239704e-07,2.222633e-05,0.0002165782,-8.166546e-05,-8.04109e-05,-3.119791e-05,0.0005938482,-0.0005947543,6.873158e-05,-0.0001769941,-0.09199833,-0.000112829,0.0001426609,-0.0001923113,-1.655023e-05,0.0001979956,0.001638688,1.480949e-05,-0.002903382,-0.001016854,-2.212116e-07,0.0001302851,-1.1304870000000001e-17,-3.272539e-05,0.0003781771,9.524006000000001e-17,-5.140419e-05,6.676171e-05,-0.0008724492,7.262143e-06,-0.0003075275,-0.0002709376,0.0006860317,-0.0001171456,-0.0006815689,7.098368e-05,-0.001318105,-0.003325775,-1.636639e-06,-2.855899e-06,0.001335191,0.003841934,-0.01498812,1.540944e-05,9.736517e-06,-1.721398e-05
B000084F1I,0.000124912,-3.932414e-21,0.00039227,2.501437e-05,1.667051e-05,-1.483997e-18,8.748913e-19,0.0001197179,-5.465037e-05,-4.372559e-05,5.639633e-05,0.002184057,-0.0002867724,4.51541e-05,6.269914e-05,5.33343e-05,5.449596e-06,9.201102e-06,-9.533513e-09,-0.0001176046,1.078323e-06,-8.986002e-06,-7.024701e-06,-0.0002310763,0.0001117836,1.261385e-08,-2.067781e-05,0.0001320054,1.40104e-18,1.505092e-06,4.24111e-05,-1.360368e-06,-1.6523590000000002e-17,-1.041394e-05,-3.535395e-06,-1.431032e-19,9.527924e-05,-4.2408589999999994e-19,5.100594e-07,-2.582578e-06,-9.466051e-06,9.848576e-05,1.390748e-17,0.0001362184,1.175975e-05,6.734064e-05,-8.493196e-05,-0.0003139949,2.262192e-06,-3.614676e-05,-3.110454e-06,-2.837784e-05,5.131975e-20,8.728795e-05,4.143537e-18,2.452198e-08,-0.0001068441,-6.936931e-06,-0.003256421,-2.049322e-05,...,0.0001770366,0.0002955055,-4.31547e-05,-1.07998e-08,-4.46031e-07,-8.463898e-05,4.297008e-05,-5.179333e-05,-3.42871e-05,-2.092552e-19,6.93834e-06,0.000263541,3.780391e-05,1.776222e-05,3.128683e-07,-1.227063e-06,0.0001522155,0.0002640519,-4.90159e-05,2.664022e-06,-5.535324e-06,5.35128e-05,-6.048011e-06,-0.0003743436,0.0002392073,-2.951044e-05,-2.520588e-05,3.814706e-05,1.424848e-06,-0.001189004,-0.0001967333,2.346634e-05,-0.002052313,-1.247603e-05,7.296077e-08,-2.612635e-05,3.957729e-18,1.115968e-05,2.522042e-05,8.088726e-18,2.12423e-05,2.651297e-05,0.0001795452,1.481447e-06,-0.0003600031,-9.491092e-05,-0.00301286,1.204018e-05,0.0001859838,3.339488e-06,-7.732331e-05,5.326932e-06,2.702056e-05,2.700954e-07,-1.277062e-05,-0.0004357495,8.753539e-05,-4.58113e-07,-4.917939e-07,4.229796e-07
B000084F1Z,5.731913e-06,4.0427260000000003e-25,6.101085e-08,1.430053e-06,4.242912e-08,4.703031e-22,-1.607154e-21,-1.783032e-07,4.74867e-07,4.140817e-08,1.426454e-07,9.324779e-08,1.968972e-07,2.011227e-06,6.301886e-06,-8.319581e-09,-3.077862e-08,-1.491202e-08,2.181727e-10,-5.055856e-07,-1.880908e-08,-4.934551e-08,-1.347467e-07,-1.676549e-06,1.353317e-07,-8.398879e-09,-4.943118e-06,3.814916e-07,6.160209e-21,-1.980378e-08,1.35972e-06,1.806568e-08,-1.1317619999999999e-20,6.749264e-08,-2.065072e-08,1.149862e-21,-4.18637e-08,-3.6085780000000004e-22,-1.171848e-08,1.098527e-08,8.071066e-08,4.54897e-07,1.07492e-20,-1.920205e-06,-8.408133e-08,-4.822067e-08,-1.18521e-07,-4.31628e-07,8.396168e-09,7.017116e-09,2.241118e-08,-6.594237e-07,-4.710072e-22,-9.680165e-07,1.815039e-20,1.053203e-08,7.082056e-07,-3.327637e-08,1.664015e-06,-1.493403e-07,...,-5.655911e-06,-2.422798e-08,1.645467e-07,-8.660326e-10,5.374904e-08,1.713746e-06,-1.750104e-07,1.908141e-07,-4.587706e-08,1.848009e-20,-1.363635e-07,4.691763e-07,2.029187e-07,2.302394e-07,2.331951e-09,-1.124942e-09,-7.525828e-07,-1.959844e-07,2.037307e-07,-5.176309e-08,1.570877e-06,-1.863353e-07,-4.592971e-09,-1.631675e-07,2.831382e-05,-4.210397e-08,-5.927226e-07,-8.080095e-09,-7.975828e-09,3.553775e-08,7.025265e-07,7.74415e-08,1.48962e-06,-1.136511e-05,-4.850635e-11,2.417424e-07,1.88387e-21,-9.621732e-09,-4.772504e-07,-1.856645e-20,-1.8244e-08,2.085334e-07,8.529981e-07,-4.49888e-09,1.219606e-08,-4.682847e-07,6.414997e-07,4.472841e-07,-2.496403e-07,2.010252e-08,1.312161e-06,2.796989e-07,-5.411801e-08,-2.645223e-09,-3.140372e-06,3.325368e-06,6.528682e-06,3.967872e-08,2.670325e-10,-4.883925e-09


In [51]:
cfrecommender_df = recommender_df.reset_index()

In [52]:
cfrecommender_df.drop(labels='hybrid_score',axis=1,inplace=True)

In [54]:
recsys_df1 = recsys_df[['ProductId', 'product_name', 'CAT1', 'CAT2', 'CAT3', 'CAT4', 'CAT5', 'CAT6']]

In [55]:
class CFRecommender:
    
    MODEL_NAME = 'User-Based Collaborative Filtering'
        
    def __init__(self, cf_df1, recsys_df1=None):
        self.cf_df1 = cf_df1          # predicted scores: products (rows) x users (columns)
        self.recsys_df1 = recsys_df1  # product details (name and categories)
        
    def get_model_name(self):
        return self.MODEL_NAME
        
    def recommend(self, user_id, topn=10, verbose=False):
        # sort this user's predicted scores from highest to lowest
        sorted_user_predictions = self.cf_df1[user_id].sort_values(ascending=False)\
                                   .reset_index().rename(columns={user_id: 'pred_score'})

        recommendations_df = sorted_user_predictions.merge(cfrecommender_df,on='ProductId',how='right')
        
        # recommend products from the CAT2 categories the user has already bought from,
        # ordered by category and then by predicted score
        user_cat2 = recsys_df[recsys_df.UserId == user_id].CAT2
        same_cat2 = recommendations_df.CAT2.isin(user_cat2)
        return recommendations_df[same_cat2].sort_values(by=['CAT2','pred_score'],ascending=False).head(topn)
    
cf_recommender_model_user = CFRecommender(cf_df1, recsys_df1)

In [72]:
recsys_df[recsys_df.UserId == 'A35V32HZEGZH04']
#user A35V32HZEGZH04's purchase history

Unnamed: 0,UserId,ProductId,hybrid_score,product_name,CAT1,CAT2,CAT3,CAT4,CAT5,CAT6
1391,A35V32HZEGZH04,B000E65OJM,-0.072008,"Celestial Seasonings Herbal Tea, Black Cherry Berry, 20 Count (Pack of 6)",Grocery & Gourmet Food,Beverages,Coffee,Tea & Cocoa,Tea,Herbal
3491,A35V32HZEGZH04,B000FA158G,0.9769,"Biscos Sugar Wafers with Creme Filling, 8.5 Ounce (Pack of 6)",Sports & Outdoors,Sports & Fitness,Exercise & Fitness,Accessories,Gloves,-
4005,A35V32HZEGZH04,B0090X8IPM,0.191939,"Starbucks Natural Fusions Vanilla Ground Coffee, 11 Ounce (Pack of 6)",Grocery & Gourmet Food,Beverages,Coffee,Tea & Cocoa,Coffee,Ground Coffee
4997,A35V32HZEGZH04,B004U49QU2,0.353106,"Chips Ahoy! Chewy Gooey Megafudge, 10-Ounce (Pack of 4)",Grocery & Gourmet Food,Breads & Bakery,Cookies,Chocolate,-,-


In [73]:
#recommended products are based on the user's purchase history
cf_recommender_model_user.recommend('A35V32HZEGZH04')

Unnamed: 0,ProductId,pred_score,product_name,CAT1,CAT2,CAT3,CAT4,CAT5,CAT6
0,B000FA158G,0.908448,"Biscos Sugar Wafers with Creme Filling, 8.5 Ounce (Pack of 6)",Sports & Outdoors,Sports & Fitness,Exercise & Fitness,Accessories,Gloves,-
467,B00451W2ZG,6.9e-05,"Nestle Coffee Mate Coffee Creamer, Hazelnut, Liquid Creamer Singles, 180 Count (Pack of 1)",Sports & Outdoors,Sports & Fitness,Exercise & Fitness,Accessories,Gloves,-
662,B00451U9Q0,1.5e-05,"Nestle Coffee Mate Coffee Creamer, French Vanilla, Liquid Creamer Singles, Pack of 180",Sports & Outdoors,Sports & Fitness,Exercise & Fitness,Accessories,-,-
2008,B000E8WIAS,-0.0002,"SweetLeaf Sweet Drops Liquid Stevia Sweetener, Vanilla Creme, 2 Ounce",Sports & Outdoors,Sports & Fitness,Boating & Sailing,Boating,Boats,Inflatable Rafts
1,B004U49QU2,0.428944,"Chips Ahoy! Chewy Gooey Megafudge, 10-Ounce (Pack of 4)",Grocery & Gourmet Food,Breads & Bakery,Cookies,Chocolate,-,-
3,B002NBIE34,0.043968,"Nabisco, 100 Calorie Packs, Oreo Dipped Delight Bars, 5.46oz Box - 1 Package",Grocery & Gourmet Food,Breads & Bakery,Cookies,Sandwich,-,-
5,B004U43ZO0,0.024094,"Chewy Chips Ahoy! Fudge Filled Soft Cookies , 10-Ounce (Pack of 4)",Grocery & Gourmet Food,Breads & Bakery,Cookies,Chocolate Chip,-,-
56,B0052OI128,0.003,"WhoNu? Chocolate Sandwich Creme Cookies, 12.9oz Box (Pack of 3)",Grocery & Gourmet Food,Breads & Bakery,Cookies,Sandwich,-,-
65,B004FELBH8,0.002604,"Newtons Fruit Thins Fig and Honey, 10.5-Ounce (Pack of4)",Grocery & Gourmet Food,Breads & Bakery,Cookies,Fruit,-,-
83,B004ZIF3SW,0.001881,"Mrs Crimble's Traditional Coconut Macaroons,6.7-Ounce (Pack of 6)",Grocery & Gourmet Food,Breads & Bakery,Cookies,-,-,-


Our collaborative filtering recommender combines a user's purchase history with the predicted scores from our SVD model: it keeps products in categories the user has already bought from and orders them by category and predicted score.
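
The output above still contains products the user has already bought (for example `B000FA158G` and `B004U49QU2`). If that is not wanted, one possible approach is to drop previously purchased products before taking the top n. The sketch below uses made-up data and a hypothetical helper `top_n_unseen`; it is not the notebook's implementation.

```python
import pandas as pd

def top_n_unseen(pred_scores, purchased, topn=3):
    """Rank products by predicted score, skipping ones already purchased.

    pred_scores: Series indexed by ProductId (one user's column of scores).
    purchased:   iterable of ProductIds the user has already bought.
    """
    unseen = pred_scores[~pred_scores.index.isin(list(purchased))]
    return unseen.sort_values(ascending=False).head(topn)

# toy predicted scores for one user (made up)
pred_scores = pd.Series(
    {'P1': 0.91, 'P2': 0.43, 'P3': 0.04, 'P4': 0.02, 'P5': -0.01},
    name='pred_score',
)
purchased = {'P1', 'P2'}

recommendations = top_n_unseen(pred_scores, purchased)
```

With real data, `pred_scores` would correspond to one user's column of `cf_df1` and `purchased` to that user's `ProductId` values in `recsys_df`.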

In [None]:
!jupyter nbconvert RecSys_Slides.ipynb --to slides --post serve