## Recommender systems with decomposition

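Instead of using cosine similarity, we can also use matrix decomposition (factorisation). We approximate the utility matrix $M\,(r\times m)$ by a product $M' = UV^T \approx M$ with $U\,(r\times l)$ and $V^T\,(l\times m)$, where $r$ is the number of reviewers, $m$ the number of movies, and $l$ the number of latent factors.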
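To make the shapes concrete, here is a minimal sketch with made-up sizes. The names `U_toy`, `V_T_toy` and `M_approx`, and the random values, are illustrative assumptions and not part of the notebook's data. It shows that the product has the shape of the utility matrix and that each predicted rating is the dot product of one reviewer row and one movie column.

```python
import numpy as np

# hypothetical sizes: 4 reviewers, 6 movies, 2 latent factors
r, m, l = 4, 6, 2

rng = np.random.default_rng(0)
U_toy = rng.random((r, l))    # one row of latent factors per reviewer
V_T_toy = rng.random((l, m))  # one column of latent factors per movie

# the product has the same shape as the utility matrix
M_approx = U_toy @ V_T_toy
print(M_approx.shape)

# a single entry is the dot product of a reviewer row and a movie column
i, j = 1, 3
entry = U_toy[i] @ V_T_toy[:, j]
print(np.isclose(entry, M_approx[i, j]))

# the two factors together store fewer numbers than the full matrix
print(U_toy.size + V_T_toy.size, "values instead of", r * m)
```

With a small $l$, the two factors are a compressed description of $M$, which is why $M'$ is only an approximation.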

In [3]:
import pandas as pd
import numpy as np
from sklearn.decomposition import NMF

Use the same data as before:

In [5]:
# load data
ratings = pd.read_csv('ratings.csv')

# work on a sample of the dataset:
# be careful, processing the full data is once again a very heavy operation
ratings = ratings[:1000]

print(ratings.head())

# print some information
noMovies = ratings['movieId'].nunique()
noUsers = ratings['userId'].nunique()
print(f"{noMovies} from {noUsers} users")

   userId  movieId  rating   timestamp
0       1        2     3.5  1112486027
1       1       29     3.5  1112484676
2       1       32     3.5  1112484819
3       1       47     3.5  1112484727
4       1       50     3.5  1112484580
698 from 11 users


Do the same pre-processing as before.

In [6]:
# create empty utility matrix
utility = np.zeros(shape=(noUsers,noMovies))

# map each movieId to a column index in the utility matrix
movieIds = {value: midi for midi, value in enumerate(ratings['movieId'].unique())}

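# populate utility matrix (entries left at 0 mean "not rated")
for index, line in ratings.iterrows():
    # row index: assumes userIds are consecutive integers starting at 1
    uid = int(line['userId'])-1
    # column index: looked up from the movieId mapping
    mid = movieIds[line['movieId']]
    rating = line['rating']
    utility[uid,mid]=rating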
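The loop above relies on user ids being consecutive and starting at 1. As an alternative sketch, `pd.factorize` can map both ids to row and column indices without that assumption and fill the matrix in one step. The small `toy` DataFrame and the `toy_utility`, `user_idx` and `movie_idx` names are made up for illustration.

```python
import numpy as np
import pandas as pd

# hypothetical ratings; note the gap in user ids (no user 3)
toy = pd.DataFrame({
    "userId": [1, 1, 2, 4],
    "movieId": [2, 29, 2, 50],
    "rating": [3.5, 4.0, 5.0, 2.0],
})

# factorize returns integer codes (0, 1, 2, ...) plus the unique ids
user_idx, user_ids = pd.factorize(toy["userId"])
movie_idx, movie_ids = pd.factorize(toy["movieId"])

# fill the utility matrix with one vectorised assignment
toy_utility = np.zeros((len(user_ids), len(movie_ids)))
toy_utility[user_idx, movie_idx] = toy["rating"].to_numpy()

print(toy_utility)
```

Here `user_ids[k]` and `movie_ids[k]` recover the original id behind row or column `k`, which is useful when turning matrix positions back into recommendations.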

Perform the matrix factorisation with scikit-learn:

In [8]:
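# factorise the utility matrix into 50 non-negative latent factors
decomposition = NMF(n_components=50, init='random', random_state=0)
# fit_transform returns U (one row per reviewer)
U = decomposition.fit_transform(utility)
# components_ holds V^T (one column per movie)
V_T = decomposition.components_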

Shape of $U$ and $V^T$:

In [9]:
print('Shape of U (#reviewers x #latent factors): ', np.shape(U))
print('Shape of V_T (#latent factors x #movies): ', np.shape(V_T))

Shape of U (#reviewers x #latent factors):  (11, 50)
Shape of V_T (#latent factors x #movies):  (50, 698)
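One way to read these two factors: a row of $U$ times $V^T$ gives predicted scores for every movie for that reviewer. The sketch below uses small hand-written matrices (`U_small`, `V_T_small` and `rated_small` are hypothetical, not the fitted factors) and treats a 0 in the utility matrix as "not rated", as in the pre-processing above. It ranks only the movies the reviewer has not rated yet.

```python
import numpy as np

# hypothetical factors: 2 reviewers, 2 latent factors, 4 movies
U_small = np.array([[1.0, 0.0],
                    [0.0, 2.0]])
V_T_small = np.array([[4.0, 1.0, 3.0, 0.5],
                      [0.5, 2.0, 1.0, 2.5]])

# hypothetical utility matrix; 0 means "not rated"
rated_small = np.array([[4.0, 0.0, 0.0, 0.0],
                        [0.0, 4.0, 0.0, 0.0]])

user = 0
# predicted scores for all movies for this reviewer
scores = U_small[user] @ V_T_small

# exclude movies the reviewer has already rated
masked = np.where(rated_small[user] > 0, -np.inf, scores)

# column indices of the two highest predicted scores
top = np.argsort(-masked)[:2]
print(top)
```

The indices in `top` are columns of the utility matrix, so they still need to be mapped back to movie ids.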


Now we can calculate $M'$:

In [10]:
M_ = np.dot(U, V_T)
print(np.shape(M_))

(11, 698)
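To judge how close $M'$ is to $M$, a single error number helps. A signed sum of the differences can be close to zero even when individual entries are far off, because positive and negative errors cancel. The sketch below uses the Frobenius norm instead and compares it with the `reconstruction_err_` attribute that scikit-learn's `NMF` sets after fitting. The random matrix `X_toy` is a made-up stand-in for the utility matrix, and the component count is arbitrary.

```python
import numpy as np
from sklearn.decomposition import NMF

# hypothetical non-negative matrix standing in for the utility matrix
rng = np.random.default_rng(0)
X_toy = rng.random((6, 8))

model = NMF(n_components=3, init='random', random_state=0, max_iter=1000)
W_toy = model.fit_transform(X_toy)
H_toy = model.components_

diff_toy = X_toy - W_toy @ H_toy

# signed sum: positive and negative errors can cancel out
signed_sum = np.sum(diff_toy)
# Frobenius norm: square root of the sum of squared errors
frobenius = np.linalg.norm(diff_toy)

print("signed sum of differences:", signed_sum)
print("Frobenius norm of differences:", frobenius)
print("reconstruction_err_ from the model:", model.reconstruction_err_)
```

With the default Frobenius loss, the norm computed by hand and `reconstruction_err_` should agree up to floating-point error.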


In principle, more latent factors allow a closer approximation of the original matrix. We can check this by varying the number of components:

In [13]:
# this loop will run for a while, up to a minute or two
for n_comp in range(20,201,10):
    decomposition = NMF(n_components=n_comp, init='random', random_state=0)
    U = decomposition.fit_transform(utility)
    V_T = decomposition.components_
    M_ = np.dot(U, V_T)

    # calculate the difference between both matrices
    # note: np.sum adds signed entries, so positive and negative errors can cancel;
    # np.linalg.norm(diff) would measure the overall size of the error instead
    diff = utility-M_
    print(f"for {n_comp} components,\tdifference was {np.sum(diff)}")

print("Done running!")

# more reading here: https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.NMF.html

# What do you see? Is the difference between the matrices getting larger or smaller? Is that good?

for 20 components,	difference was -0.9904296275331765
for 30 components,	difference was -1.6447788497376665
for 40 components,	difference was -1.4804721037678807
for 50 components,	difference was -0.003048584337641386
for 60 components,	difference was -0.35718655808037253
for 70 components,	difference was -0.24226269910080145
for 80 components,	difference was -1.9405478729626118
for 90 components,	difference was -0.9677320887082694
for 100 components,	difference was -0.12184139938983987
for 110 components,	difference was -1.3483386266728918
for 120 components,	difference was -0.9912272300992634
for 130 components,	difference was -0.5863343456619985
for 140 components,	difference was -0.011559465340278248
for 150 components,	difference was -0.04268803166269824
for 160 components,	difference was -0.6816772322296608
for 170 components,	difference was -0.006335860241039059
for 180 components,	difference was -0.725167764703508
for 190 components,	difference was -0.0035302648949677652
for 20