<a href="https://colab.research.google.com/github/kapilmulchandani/E-commerce-recommedation-system/blob/master/Collaborative-Filtering/Collaborative_Filtering.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Collaborative Filtering



Installing required packages:

In [1]:
!pip install surprise



Importing all the libraries:

In [0]:
from surprise.model_selection import train_test_split
from surprise import KNNWithMeans
from surprise import Dataset
from surprise import accuracy
from surprise import Reader
import pandas as pd
import numpy as np
import os

Loading Data:

In [6]:
data = pd.read_csv('reviews.csv')
data.head()

Unnamed: 0,reviewerID,asin,reviewerName,helpful,reviewText,overall,summary,unixReviewTime,reviewTime
0,A1YJEY40YUW4SE,7806397051,Andrea,"[3, 4]",Very oily and creamy. Not at all what I expect...,1.0,Don't waste your money,1391040000,"01 30, 2014"
1,A60XNB876KYML,7806397051,Jessica H.,"[1, 1]",This palette was a decent price and I was look...,3.0,OK Palette!,1397779200,"04 18, 2014"
2,A3G6XNM240RMWA,7806397051,Karen,"[0, 1]",The texture of this concealer pallet is fantas...,4.0,great quality,1378425600,"09 6, 2013"
3,A1PQFP6SAJ6D80,7806397051,Norah,"[2, 2]",I really can't tell what exactly this thing is...,2.0,Do not work on my face,1386460800,"12 8, 2013"
4,A38FVHZTNQ271F,7806397051,Nova Amor,"[0, 0]","It was a little smaller than I expected, but t...",3.0,It's okay.,1382140800,"10 19, 2013"


Loading the user, item, and rating columns into a Surprise Dataset using a Reader (ratings range from 1 to 5):

In [0]:
reader = Reader(rating_scale=(1, 5))
data = Dataset.load_from_df(data[['reviewerID', 'asin', 'overall']], reader)

Splitting the dataset into training and test sets (70% / 30%):

In [0]:
trainset, testset = train_test_split(data, test_size=0.3, random_state=10)
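A rough idea of what a seeded random split does, written in plain Python. This is only a sketch: `split_ratings` and the toy `ratings` list are hypothetical, and Surprise's own shuffling will not produce the same partition as this code.

```python
import random

def split_ratings(ratings, test_size=0.3, seed=10):
    """Shuffle a copy of the ratings and cut it into (train, test)."""
    shuffled = list(ratings)
    # A dedicated Random instance keeps the split reproducible for a given seed.
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_size))
    return shuffled[n_test:], shuffled[:n_test]

# Ten toy (user, item, rating) triples, purely illustrative.
ratings = [("u%d" % (i % 4), "i%d" % (i % 5), float(1 + i % 5)) for i in range(10)]
train, test = split_ratings(ratings)
print(len(train), len(test))
```

Fixing the seed matters here because every model below is scored on the same test set, which is what makes their RMSE values comparable.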

## User based Collaborative Filtering
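
Every model in this notebook is a KNNWithMeans with k=5. As documented for Surprise, the prediction is the target user's mean rating plus a similarity-weighted average of how far each neighbor's rating sits from that neighbor's own mean. The sketch below illustrates that formula only; `predict_with_means` and its input layout are hypothetical, and the fallback to the user's mean when no positive-similarity neighbor exists is a simplification.

```python
def predict_with_means(user_mean, neighbors, k=5):
    """Mean-centered k-NN estimate for one (user, item) pair.

    neighbors: list of (similarity, neighbor_rating, neighbor_mean) tuples,
    one per other user who rated the target item.
    """
    # Keep the k most similar neighbors, ignoring non-positive similarities.
    top_k = sorted(neighbors, key=lambda n: n[0], reverse=True)[:k]
    top_k = [n for n in top_k if n[0] > 0]
    sim_total = sum(sim for sim, _, _ in top_k)
    if sim_total == 0:
        return user_mean  # nothing to go on: fall back to the user's mean
    offset = sum(sim * (rating - mean) for sim, rating, mean in top_k) / sim_total
    return user_mean + offset

# One neighbor rated the item above their mean, the other below it.
estimate = predict_with_means(3.5, [(0.9, 5.0, 4.0), (0.5, 2.0, 3.0)])
print(estimate)
```

Centering on each user's mean compensates for users who rate systematically high or low, which is why the similarity measure is the only thing varied across the experiments.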

##### Using Cosine Similarity:


In [12]:
algo = KNNWithMeans(k=5, sim_options={'name': 'cosine', 'user_based': True})
algo.fit(trainset)

<surprise.trainset.Trainset object at 0x7f5c6747f588>
Computing the cosine similarity matrix...
Done computing similarity matrix.
<surprise.trainset.Trainset object at 0x7f5c6747f588>



Run the trained model against the test set:



In [0]:
user_cosine_test_pred = algo.test(testset)
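
For reference, the cosine option compares two users using only the items both have rated. A minimal sketch of that computation, assuming each user's ratings are held in a plain dict (the `cosine_sim` helper and the toy users are hypothetical, not Surprise internals):

```python
import math

def cosine_sim(ratings_u, ratings_v):
    """Cosine similarity between two users over their co-rated items.

    Each argument maps item id -> rating. Returns 0.0 if nothing overlaps.
    """
    common = set(ratings_u) & set(ratings_v)
    if not common:
        return 0.0
    dot = sum(ratings_u[i] * ratings_v[i] for i in common)
    norm_u = math.sqrt(sum(ratings_u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(ratings_v[i] ** 2 for i in common))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

alice = {"p1": 5.0, "p2": 3.0, "p3": 1.0}
bob = {"p1": 4.0, "p2": 2.0, "p4": 5.0}
print(round(cosine_sim(alice, bob), 4))
```

Because ratings on a 1 to 5 scale are all positive, raw cosine similarities are always positive and tend to sit close to 1, so they separate neighbors less sharply than a centered measure would.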

Get RMSE of User-based Model using Cosine Similarity

In [14]:
print("RMSE of User-based Model using Cosine Similarity: Test Set")
user_cosine_rmse = accuracy.rmse(user_cosine_test_pred)

RMSE of User-based Model using Cosine Similarity: Test Set
RMSE: 1.2051
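
The RMSE reported here is the square root of the mean squared gap between true and estimated ratings. A small standalone sketch (the `rmse` function and the sample pairs are illustrative, not the Surprise implementation):

```python
import math

def rmse(pairs):
    """Root mean squared error over (true_rating, estimated_rating) pairs."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one prediction")
    squared_error = sum((true - est) ** 2 for true, est in pairs)
    return math.sqrt(squared_error / len(pairs))

sample = [(5.0, 4.0), (3.0, 3.0), (1.0, 3.0)]
print(rmse(sample))
```

An RMSE of about 1.2 on a 1 to 5 scale means a typical estimate is off by a bit more than one star.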


##### Using Mean Squared Difference Similarity:

In [16]:
algo = KNNWithMeans(k=5, sim_options={'name': 'msd', 'user_based': True})
algo.fit(trainset)

Computing the msd similarity matrix...
Done computing similarity matrix.


<surprise.prediction_algorithms.knns.KNNWithMeans at 0x7f5c61df5860>

Run the trained model against the test set:

In [0]:
user_msd_test_pred = algo.test(testset)
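
The msd option turns the mean squared difference over co-rated items into a similarity by taking 1 / (msd + 1), so identical ratings give 1 and larger gaps push the value toward 0. A hedged sketch with hypothetical names and toy data:

```python
def msd_sim(ratings_u, ratings_v):
    """Mean-squared-difference similarity over co-rated items.

    Each argument maps item id -> rating. Returns 0.0 if nothing overlaps.
    """
    common = set(ratings_u) & set(ratings_v)
    if not common:
        return 0.0
    msd = sum((ratings_u[i] - ratings_v[i]) ** 2 for i in common) / len(common)
    # The +1 keeps the denominator non-zero when the ratings agree exactly.
    return 1.0 / (msd + 1.0)

alice = {"p1": 5.0, "p2": 3.0, "p3": 1.0}
bob = {"p1": 4.0, "p2": 2.0, "p4": 5.0}
print(msd_sim(alice, bob))
```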

Get RMSE of User-based Model using Mean Squared Difference Similarity

In [18]:
print("RMSE of User-based Model using Mean Squared Difference: Test Set")
user_msd_rmse = accuracy.rmse(user_msd_test_pred)
user_msd_rmse

RMSE of User-based Model using Mean Squared Difference: Test Set
RMSE: 1.2080


1.2080426835721234

##### Using Pearson Baseline Correlation:


In [19]:
algo = KNNWithMeans(k=5, sim_options={'name': 'pearson_baseline', 'user_based': True})
algo.fit(trainset)

Estimating biases using als...
Computing the pearson_baseline similarity matrix...
Done computing similarity matrix.


<surprise.prediction_algorithms.knns.KNNWithMeans at 0x7f5c5b49ef98>

Run the trained model against the test set:

In [0]:
user_pb_test_pred = algo.test(testset)
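
The pearson_baseline option correlates each user's deviations from a baseline estimate (rather than from their mean) and then shrinks the result toward 0 when the users share few items. The sketch below assumes the baselines have already been estimated and are passed in as dicts; `pearson_baseline_sim` and its arguments are hypothetical, and the default shrinkage of 100 follows the Surprise documentation.

```python
import math

def pearson_baseline_sim(ratings_u, ratings_v, baseline_u, baseline_v, shrinkage=100):
    """Shrunk Pearson correlation of baseline-centered ratings.

    ratings_*  : item id -> observed rating for each user
    baseline_* : item id -> baseline estimate for that user and item
    """
    common = sorted(set(ratings_u) & set(ratings_v))
    dev_u = [ratings_u[i] - baseline_u[i] for i in common]
    dev_v = [ratings_v[i] - baseline_v[i] for i in common]
    denom = math.sqrt(sum(d * d for d in dev_u)) * math.sqrt(sum(d * d for d in dev_v))
    if denom == 0:
        return 0.0
    corr = sum(du * dv for du, dv in zip(dev_u, dev_v)) / denom
    n = len(common)
    shrink_denom = n - 1 + shrinkage
    if shrink_denom == 0:
        return 0.0
    # Few co-rated items -> small factor -> similarity pulled toward 0.
    return (n - 1) / shrink_denom * corr

r_u, b_u = {"p1": 5.0, "p2": 3.0}, {"p1": 4.0, "p2": 4.0}
r_v, b_v = {"p1": 4.0, "p2": 2.0}, {"p1": 3.0, "p2": 3.0}
print(pearson_baseline_sim(r_u, r_v, b_u, b_v))
```

In the toy example the two users deviate from their baselines in exactly the same way, yet with only two shared items the shrunk similarity stays near 0.01. On sparse review data this damping of unreliable correlations is one plausible reason the measure scores best here.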

Get RMSE of User-based Model using Pearson Baseline

In [21]:
print("RMSE of User-based Model using Pearson Baseline: Test Set")
user_pb_rmse = accuracy.rmse(user_pb_test_pred)
user_pb_rmse

RMSE of User-based Model using Pearson Baseline: Test Set
RMSE: 1.1782


1.1782235257370004

## Item based Collaborative Filtering

##### Using Cosine Similarity:


In [22]:
algo = KNNWithMeans(k=5, sim_options={'name': 'cosine', 'user_based': False})
algo.fit(trainset)

Computing the cosine similarity matrix...
Done computing similarity matrix.


<surprise.prediction_algorithms.knns.KNNWithMeans at 0x7f5c5b49eef0>


Run the trained model against the test set:



In [0]:
item_cosine_test_pred = algo.test(testset)
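
Setting user_based to False changes what gets compared: similarities are computed between items, each described by the ratings it received from users. A minimal sketch of that regrouping, with a hypothetical `group_ratings` helper and toy triples (this is not how Surprise stores ratings internally):

```python
from collections import defaultdict

def group_ratings(ratings, user_based=True):
    """Group (user, item, rating) triples into the vectors that get compared.

    user_based=True  -> one vector per user, keyed by item
    user_based=False -> one vector per item, keyed by user
    """
    vectors = defaultdict(dict)
    for user, item, rating in ratings:
        if user_based:
            vectors[user][item] = rating
        else:
            vectors[item][user] = rating
    return dict(vectors)

triples = [("alice", "p1", 5.0), ("alice", "p2", 3.0), ("bob", "p1", 4.0)]
by_user = group_ratings(triples, user_based=True)
by_item = group_ratings(triples, user_based=False)
print(by_item["p1"])
```

The same similarity functions then apply unchanged to the item vectors.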

Get RMSE of Item-based Model using Cosine Similarity

In [24]:
print("RMSE of Item-based Model using Cosine Similarity: Test Set")
item_cosine_rmse = accuracy.rmse(item_cosine_test_pred)

RMSE of Item-based Model using Cosine Similarity: Test Set
RMSE: 1.2101


##### Using Mean Squared Difference Similarity:

In [25]:
algo = KNNWithMeans(k=5, sim_options={'name': 'msd', 'user_based': False})
algo.fit(trainset)

Computing the msd similarity matrix...
Done computing similarity matrix.


<surprise.prediction_algorithms.knns.KNNWithMeans at 0x7f5c5b49e278>

Run the trained model against the test set:

In [0]:
item_msd_test_pred = algo.test(testset)

Get RMSE of Item-based Model using Mean Squared Difference Similarity

In [27]:
print("RMSE of Item-based Model using Mean Squared Difference: Test Set")
item_msd_rmse = accuracy.rmse(item_msd_test_pred)
item_msd_rmse

RMSE of Item-based Model using Mean Squared Difference: Test Set
RMSE: 1.2139


1.2139296967207853

##### Using Pearson Baseline Correlation:


In [29]:
algo = KNNWithMeans(k=5, sim_options={'name': 'pearson_baseline', 'user_based': False})
algo.fit(trainset)

Estimating biases using als...
Computing the pearson_baseline similarity matrix...
Done computing similarity matrix.


<surprise.prediction_algorithms.knns.KNNWithMeans at 0x7f5c5b49e198>

Run the trained model against the test set:

In [0]:
item_pb_test_pred = algo.test(testset)

Get RMSE of Item-based Model using Pearson Baseline

In [31]:
print("RMSE of Item-based Model using Pearson Baseline: Test Set")
item_pb_rmse = accuracy.rmse(item_pb_test_pred)
item_pb_rmse

RMSE of Item-based Model using Pearson Baseline: Test Set
RMSE: 1.1883


1.1883096647877711
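
To compare the six runs side by side, the RMSE values printed above can be gathered into one small table. The numbers below are copied from the recorded outputs; the `results` dict and the ranking code are only an illustrative sketch.

```python
# RMSE values copied from the outputs recorded above (as printed, 4 decimals).
results = {
    ("user", "cosine"): 1.2051,
    ("user", "msd"): 1.2080,
    ("user", "pearson_baseline"): 1.1782,
    ("item", "cosine"): 1.2101,
    ("item", "msd"): 1.2139,
    ("item", "pearson_baseline"): 1.1883,
}

# Lower RMSE is better, so rank ascending.
ranked = sorted(results.items(), key=lambda kv: kv[1])
best = ranked[0][0]

for (mode, sim), score in ranked:
    print(f"{mode}-based  {sim:<17} RMSE {score:.4f}")
```

On this split the user-based model with Pearson baseline similarity has the lowest RMSE, and each user-based variant edges out its item-based counterpart. The spread across all six runs is under 0.04 and comes from a single train/test split, so the ranking should be read as indicative rather than conclusive.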