# Recommender Systems 2021/22

### Practice - PureSVD

PureSVD relies on the singular value decomposition (SVD) of the URM, a well-known matrix decomposition technique available in most numerical libraries.

In our case, the SVD of the URM $R$, of shape $m \times n$, is as follows

$$ R = U \Sigma V^* $$

Where $U$ is an orthogonal $m \times m$ matrix, $\Sigma$ is a rectangular diagonal $m \times n$ matrix, and $V^*$ is the conjugate transpose of an orthogonal $n \times n$ matrix $V$. Since the URM is real, $V^* = V^T$.

The full SVD reconstructs the original matrix *exactly*, and this is not what we want! A model that reproduces the URM exactly also reproduces all of its zeros, so it cannot rank the items a user has not interacted with.
We use instead the *truncated* SVD, which limits the decomposition to the desired number of latent dimensions and therefore only approximates the original matrix.


$$ \widehat{R} = U_{t} \Sigma_{t} V^T_{t} $$

Where $U_{t}$ is an $m \times t$ matrix, $\Sigma_{t}$ is a $t \times t$ diagonal matrix, and $V^T_{t}$ is a $t \times n$ matrix. For this approximation, only the $t$ largest singular values are kept.
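To make the shapes concrete, here is a minimal sketch of the truncation on a tiny dense matrix. The matrix values and the choice $t = 2$ are made up for illustration, and `np.linalg.svd` is used only because the example is small; it is not the solver used later in this notebook.

```python
import numpy as np

# Toy matrix standing in for a URM (4 users x 5 items); the values are made up.
R = np.array([[5., 4., 0., 1., 0.],
              [4., 5., 0., 0., 1.],
              [0., 1., 5., 4., 0.],
              [1., 0., 4., 5., 0.]])

# Thin SVD: singular values in s are returned in descending order.
U, s, Vt = np.linalg.svd(R, full_matrices=False)

# Using all singular values reconstructs R up to floating-point error.
R_full = U @ np.diag(s) @ Vt

# Truncation: keep only the t largest singular values and the matching vectors.
t = 2
U_t, s_t, Vt_t = U[:, :t], s[:t], Vt[:t, :]
R_hat = U_t @ np.diag(s_t) @ Vt_t

# The Frobenius error of the rank-t approximation equals the norm
# of the discarded singular values.
err = np.linalg.norm(R - R_hat)
discarded = np.sqrt(np.sum(s[t:] ** 2))

print(U_t.shape, s_t.shape, Vt_t.shape)
print(err, discarded)
```

The last line prints the same number twice: the information lost by truncating is exactly what the dropped singular values carried.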


In [1]:
import time
import numpy as np

In [2]:
from Data_manager.Movielens.Movielens10MReader import Movielens10MReader
from Data_manager.split_functions.split_train_validation_random_holdout import split_train_in_two_percentage_global_sample


data_reader = Movielens10MReader()
data_loaded = data_reader.load_data()

URM_all = data_loaded.get_URM_all()

URM_train, URM_test = split_train_in_two_percentage_global_sample(URM_all, train_percentage = 0.80)

Movielens10M: Verifying data consistency...
Movielens10M: Verifying data consistency... Passed!
DataReader: current dataset is: <class 'Data_manager.Dataset.Dataset'>
	Number of items: 10681
	Number of users: 69878
	Number of interactions in URM_all: 10000054
	Value range in URM_all: 0.50-5.00
	Interaction density: 1.34E-02
	Interactions per user:
		 Min: 2.00E+01
		 Avg: 1.43E+02
		 Max: 7.36E+03
	Interactions per item:
		 Min: 0.00E+00
		 Avg: 9.36E+02
		 Max: 3.49E+04
	Gini Index: 0.57

	ICM name: ICM_all, Value range: 1.00 / 69.00, Num features: 10126, feature occurrences: 128384, density 1.19E-03
	ICM name: ICM_genres, Value range: 1.00 / 1.00, Num features: 20, feature occurrences: 21564, density 1.01E-01
	ICM name: ICM_tags, Value range: 1.00 / 69.00, Num features: 10106, feature occurrences: 106820, density 9.90E-04
	ICM name: ICM_year, Value range: 6.00E+00 / 2.01E+03, Num features: 1, feature occurrences: 10681, density 1.00E+00




In [3]:
URM_train

<69878x10681 sparse matrix of type '<class 'numpy.float64'>'
	with 8000043 stored elements in Compressed Sparse Row format>

### What do we need for PureSVD?

* A numerical library like sklearn
* ... nothing else really


In [4]:
n_users, n_items = URM_train.shape

## Step one and only: Compute the decomposition

In this case I use scikit-learn's `randomized_svd`, but other approximate truncated decompositions are also available, which may rely on different algorithms to find the result.
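As one example of an alternative, SciPy provides `scipy.sparse.linalg.svds`, which also works directly on sparse matrices. The sketch below runs it on a small random sparse matrix, since the real URM is not needed to show the call; the sizes, density and seed are arbitrary assumptions. Note that `svds` does not promise the singular values in descending order, so they are sorted explicitly.

```python
import numpy as np
import scipy.sparse as sps
from scipy.sparse.linalg import svds

# Random sparse matrix standing in for URM_train (sizes and density are arbitrary).
URM_toy = sps.random(200, 100, density=0.05, format="csr", random_state=42)

num_factors = 10

# k must be strictly smaller than min(URM_toy.shape).
U, Sigma, VT = svds(URM_toy, k=num_factors)

# Reorder so that the largest singular value comes first,
# matching the convention of the output shown in this notebook.
order = np.argsort(Sigma)[::-1]
U, Sigma, VT = U[:, order], Sigma[order], VT[order, :]

print(U.shape, Sigma.shape, VT.shape)
```

Whether this is faster or more accurate than `randomized_svd` depends on the matrix and on the number of factors, so it is worth timing both on your own data.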

In [5]:
from sklearn.utils.extmath import randomized_svd

num_factors = 10

U, Sigma, VT = randomized_svd(URM_train,
                              n_components=num_factors)

In [6]:
U.shape

(69878, 10)

In [7]:
U

array([[ 8.29745777e-04, -3.27470639e-03, -8.15057301e-04, ...,
        -1.15551192e-03, -1.41987009e-03,  1.11381179e-04],
       [ 6.74183091e-04, -1.02728990e-03, -1.74801108e-04, ...,
         4.72396428e-03,  2.22452459e-03,  7.66579892e-04],
       [ 6.82133856e-04,  9.01357290e-05, -9.59301908e-04, ...,
         9.20645031e-05,  2.71983328e-03,  5.21528159e-04],
       ...,
       [ 3.03711593e-03,  2.38879330e-03,  6.24619035e-03, ...,
         1.85254816e-04,  1.61042181e-03,  6.50981356e-04],
       [ 1.48702188e-03, -5.76873078e-03,  9.02103631e-04, ...,
         9.43520411e-04, -2.40377260e-03,  2.21736478e-03],
       [ 1.65682542e-03, -8.06054399e-04, -8.67286133e-04, ...,
         2.89531411e-03, -3.67103461e-04,  5.52247035e-03]])

In [8]:
Sigma.shape

(10,)

In [9]:
Sigma

array([4274.34712234, 1783.94079629, 1532.69955439, 1226.23777693,
       1181.77418321, 1012.75372053,  960.90948964,  908.11669303,
        843.63067778,  745.68885263])

In [10]:
VT.shape

(10, 10681)

In [11]:
VT

array([[ 0.00659513,  0.03287326,  0.04179272, ...,  0.        ,
         0.        ,  0.        ],
       [-0.01460375, -0.09528498, -0.07152948, ..., -0.        ,
        -0.        , -0.        ],
       [ 0.00269548, -0.01149541, -0.04279691, ..., -0.        ,
        -0.        , -0.        ],
       ...,
       [ 0.00125656, -0.02294984, -0.03211044, ...,  0.        ,
         0.        ,  0.        ],
       [-0.00659959, -0.00630734, -0.0306876 , ...,  0.        ,
         0.        ,  0.        ],
       [-0.01302404,  0.01671774, -0.08365515, ...,  0.        ,
         0.        ,  0.        ]])

### Now we can compute the predictions

In order to compute the prediction we simply have to "reconstruct" the URM starting from the decomposition we have obtained, hence:

$$ \widehat{R} = U_{t} \Sigma_{t} V^T_{t} $$

In [12]:
user_id = 17025
item_id = 468

user_factors = np.dot(U, np.diag(Sigma))
item_factors = VT

predicted_rating_mf = np.dot(user_factors[user_id,:], item_factors[:,item_id])
predicted_rating_mf

2.2289523994900016
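A single predicted rating is rarely the end goal: for a top-N recommendation we need the scores of all items for a user, with the items already seen removed. The following is a minimal sketch of that step on toy data. The names `URM_toy` and `cutoff`, the random factors and the tiny shapes are assumptions made for the example, not part of the course framework.

```python
import numpy as np
import scipy.sparse as sps

rng = np.random.default_rng(0)

# Toy interactions: 3 users x 6 items (values are made up).
URM_toy = sps.csr_matrix(np.array([[5., 0., 3., 0., 0., 0.],
                                   [0., 4., 0., 0., 2., 0.],
                                   [1., 0., 0., 5., 0., 0.]]))
n_users, n_items = URM_toy.shape
num_factors = 2

# Random stand-ins with the same roles as in the cell above:
# user_factors ~ U_t Sigma_t, item_factors ~ V_t^T.
user_factors = rng.normal(size=(n_users, num_factors))
item_factors = rng.normal(size=(num_factors, n_items))

user_id = 1
cutoff = 3

# One score per item: a (num_factors,) vector times a (num_factors, n_items) matrix.
scores = user_factors[user_id, :] @ item_factors

# Items the user already interacted with must not be recommended again.
seen_items = URM_toy[user_id].indices
scores[seen_items] = -np.inf

# Indices of the highest-scoring unseen items, best first.
recommended_items = np.argsort(-scores)[:cutoff]
print(recommended_items)
```

On the real data the same pattern applies with the `user_factors`, `item_factors` and `URM_train` computed above.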

## Item-based version of PureSVD

It can be shown that, via folding-in, you can construct a mathematically equivalent version of PureSVD that is item-based.
See for example: Paolo Cremonesi, Yehuda Koren, and Roberto Turrin. 2010. Performance of recommender algorithms on top-n recommendation tasks. https://doi.org/10.1145/1864708.1864721

Why would you want to do that?
* It lets you compute recommendations for users that did not exist when you trained the model (you still need some interactions in their user profile to be able to compute recommendations)
* It lets you create hybrid item-item similarities

You can represent the user embeddings as $U_t \Sigma_t$ and the item embeddings as $V_t$.

The equivalence tells you that you can write $$ \widehat{R} = U_t \Sigma_t V^T_t = R V_t V^T_t $$
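The identity follows from $R V_t = U \Sigma V^T V_t = U_t \Sigma_t$, because the columns of $V$ are orthonormal. It can be checked numerically with a small sketch like the one below, which uses an exact SVD from NumPy on a random toy matrix (the sizes and seed are arbitrary assumptions).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dense "URM": 8 users x 6 items with integer values in [0, 5].
R = rng.integers(0, 6, size=(8, 6)).astype(float)

U, s, Vt = np.linalg.svd(R, full_matrices=False)
t = 3
U_t, s_t, Vt_t = U[:, :t], s[:t], Vt[:t, :]

# Matrix-factorization form: U_t Sigma_t V_t^T
# (multiplying by s_t scales each column of U_t by its singular value).
R_hat_mf = (U_t * s_t) @ Vt_t

# Item-based form: R (V_t V_t^T), where V_t V_t^T is an item-item matrix.
item_item = Vt_t.T @ Vt_t
R_hat_item = R @ item_item

print(item_item.shape)
print(np.allclose(R_hat_mf, R_hat_item))
```

With an exact SVD the two forms agree to floating-point precision. With an approximate solver the agreement is only as good as the approximation.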

In [13]:
item_item_similarity = np.dot(VT.T,VT)
item_item_similarity.shape

(10681, 10681)

In [14]:
predicted_rating_similarity = URM_train[user_id,:].dot(item_item_similarity[:,item_id])
predicted_rating_similarity

array([2.22820517])

The predictions are almost identical; some small numerical differences can occur because the decomposition is approximate.
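The item-based form is also what makes recommendations possible for a user who was not in the training data: their profile is projected onto the item factors without retraining. Below is a minimal sketch on toy data. The profile `new_user_profile` is hypothetical, and an exact NumPy SVD replaces `randomized_svd` to keep the example small.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy training matrix: 10 users x 6 items (values are made up).
R_train = rng.integers(0, 6, size=(10, 6)).astype(float)

# Only the item factors are needed for folding-in.
_, _, Vt = np.linalg.svd(R_train, full_matrices=False)
t = 2
Vt_t = Vt[:t, :]

# Hypothetical profile of a user who was not in R_train.
new_user_profile = np.array([5., 0., 0., 4., 0., 0.])

# Fold-in: project the profile into the latent space, then back onto the items.
new_user_embedding = new_user_profile @ Vt_t.T   # shape (t,)
new_user_scores = new_user_embedding @ Vt_t      # shape (n_items,)

print(new_user_embedding.shape, new_user_scores.shape)
```

This is the same computation as multiplying the profile by the item-item similarity $V_t V^T_t$, just without materializing the $n \times n$ matrix.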

### Non-Negative MF

Another strategy for matrix decomposition, which guarantees that all latent factors of both users and items are non-negative
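To give an intuition of how non-negativity can be preserved, here is a textbook sketch of multiplicative updates for the Frobenius loss, written in plain NumPy on a toy matrix. This is an illustration of the general idea under stated assumptions (random initialization, a fixed number of iterations, a small `eps` to avoid division by zero); it is not scikit-learn's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy non-negative matrix: 8 users x 6 items.
R = rng.integers(0, 6, size=(8, 6)).astype(float)
num_factors = 3
eps = 1e-10  # avoids division by zero

# Non-negative random initialization of the two factor matrices.
W = rng.random((R.shape[0], num_factors))   # user factors
H = rng.random((num_factors, R.shape[1]))   # item factors

errors = []
for _ in range(200):
    # Each update multiplies by a ratio of non-negative quantities,
    # so entries that start non-negative can never become negative.
    H *= (W.T @ R) / (W.T @ W @ H + eps)
    W *= (R @ H.T) / (W @ H @ H.T + eps)
    errors.append(np.linalg.norm(R - W @ H))

print(W.min() >= 0, H.min() >= 0)
print(errors[0], errors[-1])
```

Because the factors are only ever rescaled, never shifted, the constraint holds by construction rather than being enforced by clipping.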

In [17]:
from sklearn.decomposition import NMF

nmf_solver = NMF(n_components  = num_factors,
                 init = "random",
                 solver = "mu", #"multiplicative_update",
                 beta_loss = "frobenius",
                 l1_ratio = 0.01,
                 shuffle = True,
                 verbose = True,
                 max_iter = 500)

In [18]:
nmf_solver.fit(URM_train)

ITEM_factors = nmf_solver.components_.copy().T
USER_factors = nmf_solver.transform(URM_train)

Epoch 10 reached after 4.480 seconds, error: 9277.490591
Epoch 20 reached after 8.806 seconds, error: 8994.546701
Epoch 30 reached after 13.027 seconds, error: 8923.966636
Epoch 40 reached after 17.363 seconds, error: 8897.652511
Epoch 50 reached after 21.683 seconds, error: 8887.053704
Epoch 60 reached after 25.825 seconds, error: 8881.777195
Epoch 70 reached after 29.973 seconds, error: 8878.741659
Epoch 80 reached after 34.116 seconds, error: 8876.816537
Epoch 90 reached after 38.227 seconds, error: 8875.477266
Epoch 100 reached after 42.346 seconds, error: 8874.472954
Epoch 10 reached after 0.667 seconds, error: 8878.913049
Epoch 20 reached after 0.998 seconds, error: 8874.952676
Epoch 30 reached after 1.298 seconds, error: 8874.328325


In [19]:
ITEM_factors

array([[8.82562067e-22, 3.55522618e-01, 1.18789896e-01, ...,
        9.52500331e-18, 8.46799368e-13, 3.25385175e-21],
       [3.91054831e-03, 3.06543921e+00, 1.91465805e-02, ...,
        7.36999049e-09, 2.53566618e-01, 3.49509879e-01],
       [1.11367844e+00, 2.63463522e+00, 6.11578516e-03, ...,
        4.01144167e-05, 1.73533819e-01, 2.92685888e-05],
       ...,
       [0.00000000e+00, 0.00000000e+00, 0.00000000e+00, ...,
        0.00000000e+00, 0.00000000e+00, 0.00000000e+00],
       [0.00000000e+00, 0.00000000e+00, 0.00000000e+00, ...,
        0.00000000e+00, 0.00000000e+00, 0.00000000e+00],
       [0.00000000e+00, 0.00000000e+00, 0.00000000e+00, ...,
        0.00000000e+00, 0.00000000e+00, 0.00000000e+00]])

In [20]:
USER_factors

array([[2.53384552e-06, 2.86892153e-01, 7.50472301e-10, ...,
        2.27148242e-09, 6.20762936e-03, 5.34420629e-02],
       [1.38230732e-10, 1.13653026e-02, 2.83458134e-01, ...,
        5.31755615e-02, 6.84947315e-08, 7.30046392e-02],
       [4.40034653e-03, 2.40529237e-02, 2.85509615e-18, ...,
        5.04689455e-02, 1.85018548e-01, 1.12577829e-13],
       ...,
       [2.00520375e-07, 8.76694619e-08, 3.55827828e-05, ...,
        4.12833182e-01, 6.20688931e-05, 1.11279117e-01],
       [1.74809493e-05, 3.62548301e-01, 9.75225487e-02, ...,
        9.81888247e-07, 5.78517505e-07, 3.05222623e-05],
       [2.35326269e-06, 2.57301618e-02, 9.80433117e-02, ...,
        7.58979904e-02, 1.91714844e-07, 5.00901430e-01]])