
# Collaborative Filtering



Installing required packages:

In [0]:
# Surprise is published on PyPI as scikit-surprise
!pip install scikit-surprise



Importing all the libraries:

In [0]:
from surprise.model_selection import train_test_split
from surprise import KNNWithMeans
from surprise import Dataset
from surprise import accuracy
from surprise import Reader
import os
import pandas as pd
import numpy as np

Loading Data:

In [0]:
data = pd.read_csv('reviews.csv')
data.head()

| Unnamed: 0 | reviewerID | asin | reviewerName | helpful | reviewText | overall | summary | unixReviewTime | reviewTime |
|---|---|---|---|---|---|---|---|---|---|
| 0 | A1YJEY40YUW4SE | 7806397051 | Andrea | [3, 4] | Very oily and creamy. Not at all what I expect... | 1.0 | Don't waste your money | 1391040000 | 01 30, 2014 |
| 1 | A60XNB876KYML | 7806397051 | Jessica H. | [1, 1] | This palette was a decent price and I was look... | 3.0 | OK Palette! | 1397779200 | 04 18, 2014 |
| 2 | A3G6XNM240RMWA | 7806397051 | Karen | [0, 1] | The texture of this concealer pallet is fantas... | 4.0 | great quality | 1378425600 | 09 6, 2013 |
| 3 | A1PQFP6SAJ6D80 | 7806397051 | Norah | [2, 2] | I really can't tell what exactly this thing is... | 2.0 | Do not work on my face | 1386460800 | 12 8, 2013 |
| 4 | A38FVHZTNQ271F | 7806397051 | Nova Amor | [0, 0] | It was a little smaller than I expected, but t... | 3.0 | It's okay. | 1382140800 | 10 19, 2013 |


Loading the (user, item, rating) columns into a Surprise `Dataset`, using a `Reader` with a rating scale of 1 to 5:

In [0]:
reader = Reader(rating_scale=(1, 5))
data = Dataset.load_from_df(data[['reviewerID', 'asin', 'overall']], reader)

Splitting the dataset into a training set (70%) and a test set (30%):

In [0]:
trainset, testset = train_test_split(data, test_size=0.3, random_state=10)
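
As a rough illustration of what a 70/30 shuffled split does, here is a pure-Python sketch. `split_ratings` and the toy IDs are hypothetical; this is not Surprise's implementation, and it will not reproduce the exact split that `random_state=10` gives in Surprise.

```python
import random

def split_ratings(ratings, test_size=0.3, seed=10):
    """Shuffle (user, item, rating) triples and split them into train/test lists."""
    shuffled = list(ratings)                 # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)    # seeded shuffle for reproducibility
    n_test = int(round(len(shuffled) * test_size))
    return shuffled[n_test:], shuffled[:n_test]

# Toy ratings with made-up IDs, for illustration only
toy = [("u%d" % (i % 4), "i%d" % (i % 5), float(1 + i % 5)) for i in range(10)]
train, test = split_ratings(toy)
print(len(train), len(test))
```

Fixing the seed means every similarity measure is evaluated on the same held-out ratings, which keeps the RMSE values comparable.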

## User based Collaborative Filtering

##### Using Cosine Similarity:


In [0]:
algo = KNNWithMeans(k=5, sim_options={'name': 'cosine', 'user_based': True})
algo.fit(trainset)

Computing the cosine similarity matrix...
Done computing similarity matrix.


<surprise.prediction_algorithms.knns.KNNWithMeans at 0x7fd2a9d7f9e8>
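
To make the similarity measure concrete, the following sketch computes cosine similarity between two users over the items both have rated. `cosine_sim` and the toy rating dictionaries are hypothetical names used for illustration; Surprise computes the full similarity matrix internally and may differ in details such as a minimum number of common items.

```python
import math

def cosine_sim(ratings_u, ratings_v):
    """Cosine similarity between two users over their co-rated items.

    ratings_u and ratings_v map item IDs to ratings.
    Returns 0.0 when the users share no items.
    """
    common = set(ratings_u) & set(ratings_v)
    if not common:
        return 0.0
    dot = sum(ratings_u[i] * ratings_v[i] for i in common)
    norm_u = math.sqrt(sum(ratings_u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(ratings_v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)

# Toy users (made-up data): they share items p1 and p2
alice = {"p1": 5.0, "p2": 3.0, "p3": 4.0}
bob = {"p1": 4.0, "p2": 2.0, "p4": 1.0}
sim = cosine_sim(alice, bob)
print(round(sim, 4))
```

Because ratings on a 1 to 5 scale are all positive, cosine similarity between users tends to be close to 1, so it separates neighbours less sharply than mean-centered measures.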


Run the trained model against the testset



In [0]:
test_pred = algo.test(testset)
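
For intuition about what is estimated for each (user, item) pair in the test set, the sketch below follows the mean-centered k-NN formula that the Surprise documentation gives for `KNNWithMeans`: the user's mean rating plus a similarity-weighted average of the neighbours' deviations from their own means. `predict_with_means` and its tuple layout are hypothetical, and clipping the estimate to the rating scale is omitted.

```python
import heapq

def predict_with_means(user_mean, neighbors, k=5):
    """Mean-centered k-NN estimate for one (user, item) pair.

    user_mean: the target user's average rating.
    neighbors: (similarity, neighbor_rating_for_item, neighbor_mean) tuples
               for users who rated the item.
    """
    # Keep the k most similar neighbours
    top_k = heapq.nlargest(k, neighbors, key=lambda n: n[0])
    weighted_sum = 0.0
    sim_total = 0.0
    for sim, rating, mean in top_k:
        if sim > 0:                      # ignore non-positive similarities
            weighted_sum += sim * (rating - mean)
            sim_total += sim
    if sim_total == 0:
        return user_mean                 # no usable neighbours: fall back to the mean
    return user_mean + weighted_sum / sim_total

# Two neighbours who both rated the item one point above their own average
estimate = predict_with_means(3.5, [(0.5, 5.0, 4.0), (0.25, 4.0, 3.0)])
print(estimate)  # 4.5
```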

Get RMSE of User-based Model using Cosine Similarity

In [0]:
print("RMSE of User-based Model using Cosine Similarity: Test Set")
user_cosine_rmse = accuracy.rmse(test_pred, verbose=True)

User-based Model : Test Set
RMSE: 1.2051


1.2051337701872882
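
RMSE is the square root of the mean squared difference between true and estimated ratings. A minimal sketch, assuming predictions are given as plain `(true_rating, estimated_rating)` pairs (Surprise's `Prediction` objects carry these as the `r_ui` and `est` fields); the `rmse` helper and toy values here are hypothetical:

```python
import math

def rmse(predictions):
    """Root mean squared error over (true_rating, estimated_rating) pairs."""
    if not predictions:
        raise ValueError("predictions must not be empty")
    squared_errors = [(true - est) ** 2 for true, est in predictions]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Toy predictions (made-up values): errors of 1, 0 and 2 rating points
toy_predictions = [(5.0, 4.0), (3.0, 3.0), (1.0, 3.0)]
toy_rmse = rmse(toy_predictions)
print(round(toy_rmse, 4))
```

Since errors are squared before averaging, a few large misses raise RMSE more than many small ones.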

##### Using Mean Squared Difference Similarity:

In [0]:
algo = KNNWithMeans(k=5, sim_options={'name': 'msd', 'user_based': True})
algo.fit(trainset)

Computing the msd similarity matrix...
Done computing similarity matrix.


<surprise.prediction_algorithms.knns.KNNWithMeans at 0x7fe2493c7ac8>
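
The mean squared difference (MSD) similarity can be sketched as follows: average the squared rating differences over co-rated items, then map that distance to a similarity with `1 / (msd + 1)`, as described in the Surprise documentation. `msd_sim` and the toy users are hypothetical names for illustration only.

```python
def msd_sim(ratings_u, ratings_v):
    """MSD similarity between two users over their co-rated items.

    ratings_u and ratings_v map item IDs to ratings.
    Returns 0.0 when the users share no items.
    """
    common = set(ratings_u) & set(ratings_v)
    if not common:
        return 0.0
    msd = sum((ratings_u[i] - ratings_v[i]) ** 2 for i in common) / len(common)
    return 1.0 / (msd + 1.0)   # +1 avoids division by zero for identical ratings

# Toy users (made-up data): ratings differ by one point on both shared items
alice = {"p1": 5.0, "p2": 3.0, "p3": 4.0}
bob = {"p1": 4.0, "p2": 2.0, "p4": 1.0}
sim = msd_sim(alice, bob)
print(sim)  # 0.5
```

Unlike cosine similarity, MSD penalises users whose ratings are consistently offset from each other, even when their preferences point in the same direction.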

Run the trained model against the testset

In [0]:
test_pred = algo.test(testset)

Get RMSE of User-based Model using Mean Squared Difference Similarity

In [0]:
print("RMSE of User-based Model using Mean Squared Difference: Test Set")
user_msd_rmse = accuracy.rmse(test_pred)  # use the MSD model's predictions from the previous cell
user_msd_rmse

##### Using Pearson Baseline Correlation:


In [0]:
algo = KNNWithMeans(k=5, sim_options={'name': 'pearson_baseline', 'user_based': True})
algo.fit(trainset)

Estimating biases using als...
Computing the pearson_baseline similarity matrix...
Done computing similarity matrix.


<surprise.prediction_algorithms.knns.KNNWithMeans at 0x7fe25510a3c8>
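
For reference, here is a sketch of the plain Pearson correlation between two users, centering each user's ratings on their mean over the co-rated items. Note that this is a simplification: the `pearson_baseline` option used above centers on baseline estimates instead of means and applies shrinkage, neither of which is reproduced here. `pearson_sim` and the toy users are hypothetical.

```python
import math

def pearson_sim(ratings_u, ratings_v):
    """Pearson correlation between two users over their co-rated items.

    Returns 0.0 when fewer than two items are shared or when either
    user's co-rated ratings have zero variance.
    """
    common = set(ratings_u) & set(ratings_v)
    if len(common) < 2:
        return 0.0
    mean_u = sum(ratings_u[i] for i in common) / len(common)
    mean_v = sum(ratings_v[i] for i in common) / len(common)
    cov = sum((ratings_u[i] - mean_u) * (ratings_v[i] - mean_v) for i in common)
    var_u = sum((ratings_u[i] - mean_u) ** 2 for i in common)
    var_v = sum((ratings_v[i] - mean_v) ** 2 for i in common)
    if var_u == 0 or var_v == 0:
        return 0.0
    return cov / math.sqrt(var_u * var_v)

# Toy users (made-up data) with exactly opposite preferences
alice = {"p1": 5.0, "p2": 3.0, "p3": 4.0}
carol = {"p1": 2.0, "p2": 4.0, "p3": 3.0}
sim = pearson_sim(alice, carol)
print(sim)  # -1.0
```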

Run the trained model against the testset

In [0]:
user_test_pred = algo.test(testset)

Get RMSE of User-based Model using Pearson Baseline

In [0]:
print("RMSE of User-based Model using Pearson Baseline : Test Set")
user_pearson_baseline_rmse = accuracy.rmse(user_test_pred)
user_pearson_baseline_rmse

User-based Model : Test Set
RMSE: 1.1782


1.1782235257370004