# Recommender Systems 2020/21

### Practice - Implicit Alternating Least Squares

See:
Y. Hu, Y. Koren and C. Volinsky, Collaborative filtering for implicit feedback datasets, ICDM 2008.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.167.5120&rep=rep1&type=pdf

R. Pan et al., One-class collaborative filtering, ICDM 2008.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.306.4684&rep=rep1&type=pdf

Factorization model for binary feedback.
First, the feedback matrix R is split element-wise into a Preference matrix P and a Confidence matrix C.
Then P is approximated by the dot product of two latent-factor matrices X and Y, with each error weighted by the corresponding confidence in C.
X represents the user latent factors, Y the item latent factors.
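
As a minimal sketch of this split, assuming the linear confidence of Hu et al. (1 + alpha * r) and a toy 3x4 matrix that is not part of the dataset, P and C can be built as follows. The names `R`, `P`, `C` and the value of `alpha` are illustrative only.

```python
import numpy as np
import scipy.sparse as sps

# Toy feedback matrix R: 3 users x 4 items, 0 means "no interaction" (illustrative data)
R = sps.csr_matrix(np.array([[5.0, 0.0, 3.0, 0.0],
                             [0.0, 1.0, 0.0, 0.0],
                             [4.0, 0.0, 0.0, 2.0]]))

alpha = 0.5  # illustrative value

# Preference: 1 where an interaction was observed, 0 elsewhere
P = R.copy()
P.data = np.ones_like(P.data)

# Confidence: 1 for unobserved cells, 1 + alpha * r for observed ones
# (dense here only because the example is tiny)
C = np.ones(R.shape) + alpha * R.toarray()

print(P.toarray())
print(C)
```

On real data C is never materialized as a dense matrix: only the confidence of the observed cells is stored, and the implicit value 1 of all other cells is handled algebraically in the update rule.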

The model is learned by minimizing the following regularized least-squares objective function with Alternating Least Squares: the item factors are held fixed while each user's factors are solved in closed form, then the roles are swapped.

$$\frac{1}{2}\sum_{i,j}{c_{ij}\left(p_{ij}-x_i^T y_j\right)^2} + \lambda\left(\sum_{i}{||x_i||^2} + \sum_{j}{||y_j||^2}\right)$$
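
To make the objective concrete, here is a small sketch that evaluates it on dense toy matrices, with the error term squared as in Hu et al. The helper name `weighted_loss` and all the data are hypothetical; a dense evaluation like this is only feasible for very small matrices.

```python
import numpy as np

def weighted_loss(P, C, X, Y, reg):
    """Confidence-weighted squared error plus L2 regularization (dense, toy-sized only)."""
    error = P - X.dot(Y.T)
    return 0.5 * np.sum(C * error ** 2) + reg * (np.sum(X ** 2) + np.sum(Y ** 2))

# Toy data: 2 users, 3 items, 2 latent factors (illustrative values)
rng = np.random.default_rng(0)
P = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
C = 1.0 + 0.5 * np.array([[5.0, 0.0, 3.0],
                          [0.0, 1.0, 0.0]])
X = rng.random((2, 2))
Y = rng.random((3, 2))

loss = weighted_loss(P, C, X, Y, reg=1e-4)
print(loss)
```

Tracking this value across epochs on a small sample is one way to check that the alternating updates are actually decreasing the objective.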


In [1]:
import time
import numpy as np

In [2]:
from Notebooks_utils.data_splitter import train_test_holdout
from Data_manager.Movielens.Movielens10MReader import Movielens10MReader

data_reader = Movielens10MReader()
data_loaded = data_reader.load_data()

URM_all = data_loaded.get_URM_all()

URM_train, URM_test = train_test_holdout(URM_all, train_perc = 0.8)

Movielens10M: Verifying data consistency...
Movielens10M: Verifying data consistency... Passed!
DataReader: current dataset is: <class 'Data_manager.Dataset.Dataset'>
	Number of items: 10681
	Number of users: 69878
	Number of interactions in URM_all: 10000054
	Value range in URM_all: 0.50-5.00
	Interaction density: 1.34E-02
	Interactions per user:
		 Min: 2.00E+01
		 Avg: 1.43E+02
		 Max: 7.36E+03
	Interactions per item:
		 Min: 0.00E+00
		 Avg: 9.36E+02
		 Max: 3.49E+04
	Gini Index: 0.57

	ICM name: ICM_genres, Value range: 1.00 / 1.00, Num features: 20, feature occurrences: 21564, density 1.01E-01
	ICM name: ICM_tags, Value range: 1.00 / 69.00, Num features: 10217, feature occurrences: 108563, density 9.95E-04
	ICM name: ICM_all, Value range: 1.00 / 69.00, Num features: 10237, feature occurrences: 130127, density 1.19E-03




In [3]:
URM_train

<69878x10681 sparse matrix of type '<class 'numpy.float64'>'
	with 8001405 stored elements in Compressed Sparse Row format>

### What do we need for IALS?

* User factor and Item factor matrices
* Confidence function
* Update rule for items
* Update rule for users
* Training loop and some patience


In [4]:
n_users, n_items = URM_train.shape

## Step 1: We create the dense latent factor matrices
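In an MF model you have two matrices of latent factors. Here both are stored with one row per entity: user_factors has one row per user and item_factors has one row per item, so the item matrix is transposed when the product is computed. The other dimension, the columns of both matrices, is the number of latent factors.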
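
A quick sketch of the shapes involved, using toy sizes rather than the dataset ones (all values here are illustrative):

```python
import numpy as np

n_users_toy, n_items_toy, num_factors_toy = 4, 6, 3
rng = np.random.default_rng(0)

user_factors_toy = rng.random((n_users_toy, num_factors_toy))   # one row per user
item_factors_toy = rng.random((n_items_toy, num_factors_toy))   # one row per item

# Full score matrix: |n_users| x |n_items|
scores = user_factors_toy.dot(item_factors_toy.T)

# A single score is the dot product of one user row and one item row
single = user_factors_toy[1].dot(item_factors_toy[2])

print(scores.shape)
```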

In [5]:
num_factors = 10

user_factors = np.random.random((n_users, num_factors))
item_factors = np.random.random((n_items, num_factors))

In [6]:
user_factors

array([[0.56445343, 0.01627608, 0.17512001, ..., 0.87661018, 0.3899033 ,
        0.32641433],
       [0.1088799 , 0.19870041, 0.10541365, ..., 0.20993058, 0.02079788,
        0.08099792],
       [0.13317211, 0.50273745, 0.43626129, ..., 0.42791938, 0.82443448,
        0.38202853],
       ...,
       [0.10204191, 0.85591125, 0.20562087, ..., 0.03072743, 0.47650508,
        0.31997839],
       [0.63161141, 0.30725951, 0.98158763, ..., 0.42191041, 0.93405159,
        0.74128234],
       [0.36176751, 0.02773363, 0.91659522, ..., 0.45599821, 0.95870616,
        0.85224132]])

In [7]:
item_factors

array([[0.96778511, 0.03326854, 0.42771504, ..., 0.24565161, 0.29795505,
        0.82082031],
       [0.82495307, 0.34789801, 0.93389078, ..., 0.10382001, 0.15947663,
        0.80584883],
       [0.16349275, 0.07321222, 0.33470242, ..., 0.35797385, 0.77474128,
        0.01197913],
       ...,
       [0.04299959, 0.73249113, 0.17246724, ..., 0.51621872, 0.67975724,
        0.53853707],
       [0.3429618 , 0.98190495, 0.6485606 , ..., 0.81386613, 0.669452  ,
        0.56823051],
       [0.47554082, 0.39055113, 0.37510228, ..., 0.999072  , 0.22196863,
        0.4787177 ]])

## Step 2: We define a function to transform the interaction data into a "confidence" value
* If you have explicit data, the higher the rating, the higher the confidence (the mapping can be linear or logarithmic, for example)
* Other options include scaling the confidence down if the item or user has very few interactions (lower support)
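
As an example of the logarithmic option, Hu et al. propose c = 1 + alpha * log(1 + r / epsilon). The sketch below applies it to the stored values of a sparse matrix. The function name `log_confidence_function`, the toy matrix and the parameter values are assumptions made for illustration.

```python
import numpy as np
import scipy.sparse as sps

def log_confidence_function(URM, alpha, epsilon):
    """Return a copy of URM whose stored values are 1 + alpha * log(1 + r / epsilon)."""
    C = URM.copy().astype(np.float64)
    C.data = 1.0 + alpha * np.log(1.0 + C.data / epsilon)
    return C

# Toy matrix with ratings in the same 0.5-5.0 range as the dataset (illustrative data)
URM_toy = sps.csr_matrix(np.array([[5.0, 0.0, 0.5],
                                   [0.0, 3.0, 0.0]]))

C_toy = log_confidence_function(URM_toy, alpha=0.5, epsilon=1.0)
print(C_toy.data)
```

Compared with the linear mapping, the logarithm compresses the gap between high and low ratings, so very large interaction values dominate the loss less.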

In [8]:
def linear_confidence_function(URM_train, alpha):

    # Work on a copy so that the original URM_train is not modified in place
    C_URM = URM_train.copy()
    C_URM.data = 1.0 + alpha*C_URM.data

    return C_URM

In [9]:
alpha = 0.5
C_URM_train = linear_confidence_function(URM_train, alpha)

C_URM_train.data[:10]

array([3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5])

## Step 3: Define the update rules for the user factors


The same function updates the latent factors of a single user or of a single item.

Only the latent factors of the items (or users) for which an interaction exists in the interaction profile are needed:

Y_interactions = Y[interaction_profile, :]

* Y_interactions = |n_interactions|x|n_factors|
* YtY = |n_factors|x|n_factors|

Following the notation of the original paper, we report the update rule for the user factors (the rule for the item factors is identical, with the roles of users and items swapped):
* __Y__ are the item factors |n_items|x|n_factors|
* __Cu__ is a diagonal matrix |n_items|x|n_items| with the confidence of user u for each item; once restricted to the observed items it becomes |n_interactions|x|n_interactions|
* __p(u)__ is a boolean vector that is 1 only for the observed items. In the code it disappears because we already extract only the latent factors of the observed items, so Cu*p(u) reduces to the vector of |n_interactions| confidence values

$$x_u = (Y^T C^u Y + \lambda I)^{-1} Y^T C^u p(u)$$

which can be decomposed as

$$x_u = (Y^T Y + Y^T (C^u - I) Y + \lambda I)^{-1} Y^T C^u p(u)$$

* __A__ = Yt*(Cu-I)*Y = (|n_factors|x|n_interactions|) dot (|n_interactions|x|n_interactions|) dot (|n_interactions|x|n_factors|) = |n_factors|x|n_factors|

Instead of building the diagonal matrix, we use the equivalent formulation (v * k.T).T, which scales each row of k by the corresponding entry of v and is much faster:
* __A__ = Y_interactions.T.dot(((interaction_confidence - 1) * Y_interactions.T).T)
* __B__ = YtY + A + regularization_diagonal
* __new factors__ = np.dot(np.linalg.inv(B), Y_interactions.T.dot(interaction_confidence))
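
The equivalence between the full formula and the fast formulation can be checked numerically on a toy example. This is only a sanity-check sketch: the data are random, the profile and confidence values are made up, and `np.linalg.solve` is used in place of the explicit inverse (it solves the same linear system).

```python
import numpy as np

rng = np.random.default_rng(42)
n_items_toy, n_factors_toy = 6, 3
reg = 1e-4

Y = rng.random((n_items_toy, n_factors_toy))
profile = np.array([0, 2, 5])            # items the user interacted with (illustrative)
confidence = np.array([3.5, 2.0, 1.5])   # 1 + alpha * r for those items (illustrative)

# Naive version: dense |n_items| x |n_items| diagonal confidence matrix, 1 for unobserved items
Cu = np.eye(n_items_toy)
Cu[profile, profile] = confidence
p = np.zeros(n_items_toy)
p[profile] = 1.0
x_naive = np.linalg.solve(Y.T @ Cu @ Y + reg * np.eye(n_factors_toy), Y.T @ Cu @ p)

# Fast version: only the observed items are touched, YtY is computed once
Y_interactions = Y[profile, :]
A = Y_interactions.T.dot(((confidence - 1) * Y_interactions.T).T)
B = Y.T.dot(Y) + A + reg * np.eye(n_factors_toy)
x_fast = np.linalg.solve(B, Y_interactions.T.dot(confidence))

print(np.allclose(x_naive, x_fast))
```

The saving comes from YtY being shared by all users within a sweep, so the per-user cost depends on the length of the profile rather than on the total number of items.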


In [10]:
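def _update_row(interaction_profile, interaction_confidence, Y, YtY, regularization_diagonal):

    # Latent factors of the entities that appear in the interaction profile
    Y_interactions = Y[interaction_profile, :]
    
    # Yt*(Cu - I)*Y restricted to the observed interactions
    A = Y_interactions.T.dot(((interaction_confidence - 1) * Y_interactions.T).T)

    B = YtY + A + regularization_diagonal

    # Closed-form least-squares solution for this row
    return np.dot(np.linalg.inv(B), Y_interactions.T.dot(interaction_confidence))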


In [11]:
regularization_coefficient = 1e-4

regularization_diagonal = np.diag(regularization_coefficient * np.ones(num_factors))

In [12]:
# VV = n_factors x n_factors
VV = item_factors.T.dot(item_factors)
VV.shape

(10, 10)

In [13]:
user_id = 154

In [14]:
start_pos = C_URM_train.indptr[user_id]
end_pos = C_URM_train.indptr[user_id + 1]

user_profile = C_URM_train.indices[start_pos:end_pos]
user_confidence = C_URM_train.data[start_pos:end_pos]

user_factors[user_id, :] = _update_row(user_profile, user_confidence, item_factors, VV, regularization_diagonal)

## Step 4: Apply the same update to the item factors

In [15]:
# UU = n_factors x n_factors
UU = user_factors.T.dot(user_factors)
UU.shape

(10, 10)

In [16]:
item_id = 154

In [17]:
C_URM_train_csc = C_URM_train.tocsc()

start_pos = C_URM_train_csc.indptr[item_id]
end_pos = C_URM_train_csc.indptr[item_id + 1]

item_profile = C_URM_train_csc.indices[start_pos:end_pos]
item_confidence = C_URM_train_csc.data[start_pos:end_pos]

item_factors[item_id, :] = _update_row(item_profile, item_confidence, user_factors, UU, regularization_diagonal)
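
The slicing used for users and items relies on how SciPy stores sparse matrices: in CSR format `indptr` delimits rows, in CSC format it delimits columns. A toy sketch (the matrix `M` is illustrative):

```python
import numpy as np
import scipy.sparse as sps

M = sps.csr_matrix(np.array([[1.0, 0.0, 2.0],
                             [0.0, 0.0, 3.0],
                             [4.0, 5.0, 0.0]]))

# CSR: indptr[row]:indptr[row + 1] selects the stored entries of one row
row = 2
row_items = M.indices[M.indptr[row]:M.indptr[row + 1]]
row_values = M.data[M.indptr[row]:M.indptr[row + 1]]

# CSC: the same slicing selects the stored entries of one column
M_csc = M.tocsc()
col = 2
col_users = M_csc.indices[M_csc.indptr[col]:M_csc.indptr[col + 1]]
col_values = M_csc.data[M_csc.indptr[col]:M_csc.indptr[col + 1]]

print(row_items, row_values)
print(col_users, col_values)
```

This is why the item update needs the CSC copy: extracting a column from a CSR matrix would require scanning every row.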

### Let's put it all together in a training loop.

In [18]:
C_URM_train_csc = C_URM_train.tocsc()

num_factors = 10

user_factors = np.random.random((n_users, num_factors))
item_factors = np.random.random((n_items, num_factors))


for n_epoch in range(10):
    
    start_time = time.time()

    # Recompute YtY for the items: item_factors changed since the last user sweep
    VV = item_factors.T.dot(item_factors)

    for user_id in range(C_URM_train.shape[0]):

        start_pos = C_URM_train.indptr[user_id]
        end_pos = C_URM_train.indptr[user_id + 1]

        user_profile = C_URM_train.indices[start_pos:end_pos]
        user_confidence = C_URM_train.data[start_pos:end_pos]

        user_factors[user_id, :] = _update_row(user_profile, user_confidence, item_factors, VV, regularization_diagonal)   

        # Print some stats
        if (user_id +1)% 100000 == 0 or user_id == C_URM_train.shape[0]-1:
            elapsed_time = time.time() - start_time
            samples_per_second = user_id/elapsed_time
            print("Iteration {} in {:.2f} seconds. Users per second {:.2f}".format(user_id+1, elapsed_time, samples_per_second))

    # Recompute XtX for the users: user_factors have just been updated
    UU = user_factors.T.dot(user_factors)

    for item_id in range(C_URM_train.shape[1]):

        start_pos = C_URM_train_csc.indptr[item_id]
        end_pos = C_URM_train_csc.indptr[item_id + 1]

        item_profile = C_URM_train_csc.indices[start_pos:end_pos]
        item_confidence = C_URM_train_csc.data[start_pos:end_pos]

        item_factors[item_id, :] = _update_row(item_profile, item_confidence, user_factors, UU, regularization_diagonal)    

        # Print some stats
        # Note: elapsed_time is measured from the start of the epoch, so it includes the user sweep
        if (item_id +1)% 100000 == 0 or item_id == C_URM_train.shape[1]-1:
            elapsed_time = time.time() - start_time
            samples_per_second = item_id/elapsed_time
            print("Iteration {} in {:.2f} seconds. Items per second {:.2f}".format(item_id+1, elapsed_time, samples_per_second))

    total_epoch_time = time.time() - start_time  
    print("Epoch {} complete in in {:.2f} seconds".format(n_epoch+1, total_epoch_time))


Iteration 69878 in 9.08 seconds. Users per second 7695.57
Iteration 10681 in 11.38 seconds. Items per second 938.49
Epoch 1 complete in in 11.39 seconds
Iteration 69878 in 8.96 seconds. Users per second 7798.79
Iteration 10681 in 11.30 seconds. Items per second 945.14
Epoch 2 complete in in 11.30 seconds
Iteration 69878 in 9.18 seconds. Users per second 7611.73
Iteration 10681 in 11.41 seconds. Items per second 936.01
Epoch 3 complete in in 11.41 seconds
Iteration 69878 in 9.48 seconds. Users per second 7371.05
Iteration 10681 in 11.87 seconds. Items per second 899.77
Epoch 4 complete in in 11.87 seconds
Iteration 69878 in 9.14 seconds. Users per second 7644.97
Iteration 10681 in 12.09 seconds. Items per second 883.35
Epoch 5 complete in in 12.10 seconds
Iteration 69878 in 9.49 seconds. Users per second 7363.08
Iteration 10681 in 11.79 seconds. Items per second 905.84
Epoch 6 complete in in 11.80 seconds
Iteration 69878 in 9.38 seconds. Users per second 7449.51
Iteration 10681 in 11.83

### How long do we train such a model?

* An epoch: a complete loop over all the train data
* Usually you train for multiple epochs: depending on the algorithm and the data, tens or hundreds of epochs.

In [19]:
estimated_seconds = total_epoch_time*10
print("Estimated time with the previous training speed is {:.2f} seconds, or {:.2f} minutes".format(estimated_seconds, estimated_seconds/60))

Estimated time with the previous training speed is 115.00 seconds, or 1.92 minutes


## Lastly: Computing a prediction for any given user and item

In [20]:
user_id = 17025
item_id = 468

In [21]:
predicted_rating = np.dot(user_factors[user_id,:], item_factors[item_id,:])
predicted_rating

0.6921570260621303
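
The same dot product, applied to all items at once, gives a ranking for a user. The sketch below is a hypothetical helper (`recommend_top_k` is not part of the notebook or of any library) that scores every item, masks the ones already seen and returns the k best. It uses toy factors so that it runs on its own.

```python
import numpy as np

def recommend_top_k(user_id, user_factors, item_factors, seen_items, k=5):
    """Return the ids of the k highest-scoring unseen items (requires k <= number of items)."""
    scores = item_factors.dot(user_factors[user_id])
    scores[seen_items] = -np.inf                      # never recommend seen items
    top_k = np.argpartition(-scores, k - 1)[:k]       # k best items, unordered
    return top_k[np.argsort(-scores[top_k])]          # sort those k by descending score

# Toy factors: 1 user, 4 items, 2 latent factors (illustrative values)
toy_user_factors = np.array([[1.0, 0.0]])
toy_item_factors = np.array([[0.9, 0.0],
                             [0.1, 0.0],
                             [0.5, 0.0],
                             [0.7, 0.0]])

recommended = recommend_top_k(0, toy_user_factors, toy_item_factors, seen_items=np.array([0]), k=2)
print(recommended)
```

With the matrices of this notebook, the seen items of a user would be the slice of URM_train.indices delimited by URM_train.indptr[user_id] and URM_train.indptr[user_id + 1].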