<a href="https://www.kaggle.com/code/gpreda/collaborative-filtering-svd-tuning-evaluate?scriptVersionId=128777383" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# Introduction

We tune the Singular Value Decomposition (SVD) recommender with a grid search, then compare the tuned model against the untuned one.


# Analysis preparation

In [1]:
import numpy as np
import pandas as pd
import re
import os
from surprise.model_selection import cross_validate
from surprise.model_selection import GridSearchCV
from surprise import NormalPredictor, SVD

In [2]:
from recommender_metrics import RecommenderMetrics
from movie_lens_data import MovieLensData
from evaluator import Evaluator

# Read the data

In [3]:
path = "/kaggle/input/movielens-100k-dataset/ml-100k"
movie_lens_data = MovieLensData(
    users_path=os.path.join(path, "u.user"),
    ratings_path=os.path.join(path, "u.data"),
    movies_path=os.path.join(path, "u.item"),
    genre_path=os.path.join(path, "u.genre"),
)

evaluation_data = movie_lens_data.read_ratings_data()
movie_data = movie_lens_data.read_movies_data()
popularity_rankings = movie_lens_data.get_popularity_ranks()
ratings = movie_lens_data.get_ratings()

# Prepare evaluator

In [4]:
evaluator = Evaluator(evaluation_data, popularity_rankings)

Number of full trainset users: 943
Number of full trainset items: 1682
Number of trainset users: 943
Number of trainset items: 1641
Size of testset: 25000
Estimating biases using als...
Computing the cosine similarity matrix...
Done computing similarity matrix.


# Add random recommender to evaluator

In [5]:
algo_np = NormalPredictor()
evaluator.add_algorithm(algo_np, "Random")
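
As background, Surprise's `NormalPredictor` draws each prediction from a normal distribution fitted to the training ratings, and Surprise clips predictions to the rating scale by default. The sketch below imitates that behaviour with NumPy only; it is not the library's code, and the toy ratings and the 1 to 5 scale are assumptions made for illustration. A noticeable share of draws ends up at the upper bound after clipping, which is consistent with the Random recommender's top-10 entries all showing a score of 5 later in this notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy training ratings on an assumed 1..5 scale.
train_ratings = np.array([5, 3, 4, 1, 2, 4, 5, 3], dtype=float)

# Fit a normal distribution to the observed ratings.
mu = train_ratings.mean()
sigma = train_ratings.std()

# Draw random "predictions" and clip them to the rating scale.
raw = rng.normal(mu, sigma, size=1000)
clipped = np.clip(raw, 1.0, 5.0)

share_at_max = float((clipped == 5.0).mean())
print(f"mean={mu:.3f}, std={sigma:.3f}, share clipped to 5: {share_at_max:.2%}")
```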

# Add SVD

In [6]:
algo_svd = SVD()
evaluator.add_algorithm(algo_svd, "SVD")
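
For orientation, the model behind Surprise's `SVD` predicts a rating as the global mean plus a user bias, an item bias, and the dot product of user and item factor vectors, and it is trained with stochastic gradient descent. The NumPy sketch below shows one such update for a single rating. It is a simplified illustration rather than the library's implementation: the global mean and the observed rating are hypothetical, and the hyperparameter values are what I understand to be Surprise's documented defaults (`n_factors=100`, `lr_all=0.005`, `reg_all=0.02`).

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed defaults of surprise.SVD; treat these as illustrative values.
n_factors, lr_all, reg_all = 100, 0.005, 0.02

mu = 3.5             # hypothetical global mean rating
b_u, b_i = 0.0, 0.0  # user and item biases start at zero
p_u = rng.normal(0.0, 0.1, n_factors)  # user factors
q_i = rng.normal(0.0, 0.1, n_factors)  # item factors


def predict(mu, b_u, b_i, p_u, q_i):
    """Biased matrix-factorization estimate: mu + b_u + b_i + q_i . p_u."""
    return mu + b_u + b_i + float(q_i @ p_u)


r_ui = 5.0  # hypothetical observed rating
err_before = r_ui - predict(mu, b_u, b_i, p_u, q_i)

# One SGD step on the regularized squared error for this single rating.
b_u += lr_all * (err_before - reg_all * b_u)
b_i += lr_all * (err_before - reg_all * b_i)
p_u_old = p_u.copy()  # both factor updates use the pre-update values
p_u = p_u + lr_all * (err_before * q_i - reg_all * p_u)
q_i = q_i + lr_all * (err_before * p_u_old - reg_all * q_i)

err_after = r_ui - predict(mu, b_u, b_i, p_u, q_i)
print(f"error before: {err_before:.4f}, after one step: {err_after:.4f}")
```

The parameters tuned in the next section (`n_epochs`, `lr_all`, `n_factors`) control how many passes of such updates are made, how large each step is, and how long the factor vectors are.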

# Tune and add SVD tuned

In [7]:
param_grid = {'n_epochs': [20, 25, 30], 'lr_all': [0.005, 0.0075, 0.010],
              'n_factors': [50, 75, 100]}
gs = GridSearchCV(SVD, param_grid, measures=['rmse', 'mae'], cv=3)
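
To get a feel for the cost of this search, the sketch below enumerates the grid with the standard library only. It assumes that an exhaustive grid search trains one model per parameter combination per cross-validation fold; `combinations` and `n_fits` are names introduced here for illustration.

```python
from itertools import product

# Same grid and fold count as in the cell above.
param_grid = {
    "n_epochs": [20, 25, 30],
    "lr_all": [0.005, 0.0075, 0.010],
    "n_factors": [50, 75, 100],
}
cv_folds = 3

# Every combination of one value per parameter.
combinations = [
    dict(zip(param_grid, values)) for values in product(*param_grid.values())
]
n_fits = len(combinations) * cv_folds

print(f"{len(combinations)} combinations, {n_fits} model fits")
# 27 combinations, 81 model fits
```

With 81 fits on the 100k ratings, the `fit` call in the next cell is the slowest step in the notebook.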

In [8]:
gs.fit(evaluation_data)

In [9]:
print(gs.best_params['rmse'])

{'n_epochs': 25, 'lr_all': 0.005, 'n_factors': 50}


In [10]:
params = gs.best_params['rmse']
algo_SVD_tuned = SVD(
    n_epochs=params['n_epochs'],
    lr_all=params['lr_all'],
    n_factors=params['n_factors'],
)
evaluator.add_algorithm(algo_SVD_tuned, "SVD-Tuned")

# Evaluate algorithms

In [11]:
evaluator.evaluate(do_top_n=False)

Evaluating  Random ...
Evaluating accuracy...
Analysis complete.
Evaluating  SVD ...
Evaluating accuracy...
Analysis complete.
Evaluating  SVD-Tuned ...
Evaluating accuracy...
Analysis complete.


Algorithm  RMSE       MAE        FCP       
Random     1.5205     1.2238     0.4964    
SVD        0.9437     0.7431     0.6960    
SVD-Tuned  0.9372     0.7369     0.7031    

Legend:

RMSE:      Root Mean Squared Error. Lower values mean better accuracy.
MAE:       Mean Absolute Error. Lower values mean better accuracy.
FCP:       Fraction of Concordant Pairs. Higher values mean better accuracy.
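
The three metrics in the legend can be sketched on a handful of made-up predictions. This is a simplified illustration, not the code used by `Evaluator` or Surprise: the toy data is hypothetical, and the FCP here pools concordant and discordant pairs over all users, whereas Surprise averages per-user counts, so values can differ slightly on real data.

```python
from collections import defaultdict
from itertools import combinations

import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def mae(y_true, y_pred):
    """Mean absolute error."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(diff)))


def fcp(user_ids, y_true, y_pred):
    """Pooled fraction of concordant pairs, computed within each user."""
    by_user = defaultdict(list)
    for u, t, p in zip(user_ids, y_true, y_pred):
        by_user[u].append((t, p))
    concordant = discordant = 0
    for pairs in by_user.values():
        for (t1, p1), (t2, p2) in combinations(pairs, 2):
            if t1 == t2:
                continue  # pairs with equal true ratings carry no ordering
            if (t1 - t2) * (p1 - p2) > 0:
                concordant += 1
            else:
                discordant += 1  # prediction ties count as discordant here
    total = concordant + discordant
    return concordant / total if total else 0.0


# Hypothetical toy data: two users, five ratings.
users = [1, 1, 1, 2, 2]
true_ratings = [5, 3, 1, 4, 2]
predicted = [4.5, 3.5, 2.0, 3.0, 3.5]

rmse_val = rmse(true_ratings, predicted)
mae_val = mae(true_ratings, predicted)
fcp_val = fcp(users, true_ratings, predicted)
print(f"RMSE={rmse_val:.4f}  MAE={mae_val:.4f}  FCP={fcp_val:.4f}")
```

RMSE and MAE measure how far predicted ratings are from the true ones, while FCP only checks whether pairs of items are ranked in the right order for each user.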


# Evaluate top-N recommendations

In [12]:
evaluator.sample_top_n_recs(movie_lens_data, test_subject=85, k=10)


Using recommender  Random

Building recommendation model...
Computing recommendations...

We recommend:
Heavyweights 5
Romy and Michele's High School Reunion 5
Batman Forever 5
Crumb 5
French Twist 5
Dangerous Minds 5
Miller's Crossing 5
Amistad 5
Mimic 5
My Family 5

Using recommender  SVD

Building recommendation model...
Computing recommendations...

We recommend:
12 Angry Men 4.3180115646420205
Glory 4.249907669569958
Wrong Trousers, The 4.2434423761891225
Wallace & Gromit: The Best of Aardman Animation 4.204108514558328
Close Shave, A 4.1851979248975795
L.A. Confidential 4.184611136400802
Rear Window 4.151459829873411
Hoop Dreams 4.14258805558699
M 4.142421504742167
Shall We Dance? 4.125151483020396

Using recommender  SVD-Tuned

Building recommendation model...
Computing recommendations...

We recommend:
Close Shave, A 4.509246291043993
Wrong Trousers, The 4.3976631387361
12 Angry Men 4.377773485623745
Wallace & Gromit: The Best of Aardman Animation 4.335151730649278
Usual Suspe

In [13]:
evaluator.sample_top_n_recs(movie_lens_data, test_subject=314, k=10)


Using recommender  Random

Building recommendation model...
Computing recommendations...

We recommend:
Jungle Book, The 5
Sabrina 5
Get Shorty 5
Conan the Barbarian 5
Angels and Insects 5
Brothers McMullen, The 5
To Kill a Mockingbird 5
Taxi Driver 5
Raiders of the Lost Ark 5
Annie Hall 5

Using recommender  SVD

Building recommendation model...
Computing recommendations...

We recommend:
Jurassic Park 5
Game, The 4.975242863052553
Air Force One 4.937496600449121
Close Shave, A 4.9241345769940565
Raiders of the Lost Ark 4.897317898910013
Terminator 2: Judgment Day 4.833521405462666
Star Wars 4.777479988615321
Apartment, The 4.760170512473256
Singin' in the Rain 4.708449411519849
Good, The Bad and The Ugly, The 4.684318141382523

Using recommender  SVD-Tuned

Building recommendation model...
Computing recommendations...

We recommend:
Return of the Jedi 5
Gone with the Wind 5
Game, The 5
Titanic 5
True Lies 4.988729825921464
Indiana Jones and the Last Crusade 4.969762428649412
Good Wi