# Recommender Systems with Explicit Ratings

This notebook demonstrates several ways to build and analyze recommender systems that work with explicit rating data. We will explore baseline algorithms, neighborhood methods, and matrix factorization methods (SVD, SVD++, NMF).

The model implementations come from [Surprise](https://github.com/NicolasHug/Surprise), whose name stands for `Simple Python RecommendatIon System Engine`.

## Loading Data

MovieLens data sets were collected by the GroupLens Research Project
at the University of Minnesota.

This data set consists of:
* 100,000 ratings (1-5) from 943 users on 1682 movies.
* Each user has rated at least 20 movies.

In [2]:
from surprise import Dataset
from surprise import accuracy
from surprise.model_selection import train_test_split

from collections import Counter, defaultdict


# Load the movielens-100k dataset (download it if needed),
data = Dataset.load_builtin('ml-100k')

ucounter = defaultdict(int); icounter = defaultdict(int)
for user,item,rating,ts in data.raw_ratings:
    ucounter[user] += 1
    icounter[item] += 1
print ("{} ratings (1-5) from {} users and {} movies ".format(len(data.raw_ratings), len(ucounter), len(icounter)) )

maxuser = max(ucounter, key=lambda u: ucounter[u])
minuser = min(ucounter, key=lambda u: ucounter[u])
print ("user {} has rated the least movies with {} times".format(minuser, ucounter[minuser]))
print ("user {} has rated the most movies with {} times".format(maxuser, ucounter[maxuser]))

maxmovie = max(icounter, key=lambda i: icounter[i])
minmovie = min(icounter, key=lambda i: icounter[i])

print ("movie {} was rated the least with {} times".format(minmovie, icounter[minmovie]))
print ("movie {} was rated the most with {} times".format(maxmovie, icounter[maxmovie]))

100000 ratings (1-5) from 943 users and 1682 movies 
user 166 has rated the least movies with 20 times
user 405 has rated the most movies with 737 times
movie 1348 was rated the least with 1 times
movie 50 was rated the most with 583 times


Sample random trainset and testset

Test set is made of 25% of the ratings.

In [3]:

trainset, testset = train_test_split(data, test_size=.25)
print ("there are {} items, {} users, and {} ratings in the training set".format(trainset.n_items, trainset.n_users, trainset.n_ratings))
print ("there are {} records in the test set".format(len(testset))) # the test set is a list

there are 1636 items, 943 users, and 75000 ratings in the training set
there are 25000 records in the test set
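
The training set above reports 1636 items although the full data set has 1682 movies. With a random split, a movie whose few ratings all land in the test set is simply never seen during training. The sketch below imitates such a split with the standard library only; `split_ratings` and the toy triples are hypothetical names and data for illustration, not how Surprise implements `train_test_split`.

```python
import random

def split_ratings(raw_ratings, test_size=0.25, seed=0):
    """Shuffle the rating triples and hold out a test_size share as the test set."""
    shuffled = list(raw_ratings)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_size))
    # The first n_test shuffled triples form the test set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical toy triples (user, item, rating); item 'rare' is rated only once.
raw = [('u{}'.format(k), 'popular', 4.0) for k in range(7)] + [('u0', 'rare', 2.0)]
train, test = split_ratings(raw)
train_items = {item for _, item, _ in train}
print(len(train), len(test), sorted(train_items))
```

If the single rating of `rare` ends up in `test`, the item is missing from `train_items`, which is the same effect that shrinks the item count in the real split.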


In [6]:
testset[:10]

[('214', '137', 4.0),
 ('440', '312', 5.0),
 ('671', '222', 1.0),
 ('173', '294', 5.0),
 ('7', '227', 3.0),
 ('21', '100', 5.0),
 ('288', '318', 4.0),
 ('490', '987', 3.0),
 ('449', '462', 5.0),
 ('293', '223', 4.0)]

### Utility functions
Define the helper functions that we will use to spot-check the model predictions.

In [31]:
## Utility function to print out the prediction results from Surprise
def printPrediction(pred):
    s = 'user: {uid:<10} '.format(uid=pred.uid)
    s += 'item: {iid:<10} '.format(iid=pred.iid)
    if pred.r_ui is not None:
        s += 'r_ui = {r_ui:1.2f}   '.format(r_ui=pred.r_ui)
    else:
        s += 'r_ui = None   '
    s += 'est = {est:1.2f}   '.format(est=pred.est)
    print (s)

In [32]:
def testPrediction(model):
    for uid, iid, rating in testset[16:25]:
        printPrediction(model.predict(uid, iid, rating, verbose=False))
    uid = str(196)  # raw user id (as in the ratings file). They are **strings**!
    iid = str(302)  # raw item id (as in the ratings file). They are **strings**!
    # get a prediction for specific users and items.
    pred = model.predict(uid, iid, r_ui=4, verbose=False)
    printPrediction(pred)
    #return pred

## Random Prediction

Algorithm predicting a random rating based on the distribution of the training set, which is assumed to be normal.

The prediction $\hat{r}_{ui}$ is generated from a normal distribution $N(\mu, \sigma^2)$ where $\mu$ and $\sigma$ are estimated from the training data using Maximum Likelihood Estimation:

$$ \hat{\mu} = \frac{1}{|R_{train}|} \sum_{r_{ui} \in R_{train}}r_{ui} $$

$$\hat{\sigma} = \sqrt{\sum_{r_{ui} \in R_{train}}
        \frac{(r_{ui} - \hat{\mu})^2}{|R_{train}|}}$$
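
To make the two estimators concrete, here is a minimal sketch in plain Python. The function names and the toy ratings are made up for illustration. Clipping the sample to the 1-5 scale is an assumption of this sketch, suggested by the `1.00` and `5.00` estimates in the output further down.

```python
import math
import random

def fit_normal(ratings):
    """Return the maximum likelihood mean and standard deviation of the ratings."""
    n = len(ratings)
    mu = sum(ratings) / n
    # The MLE of the variance divides by n, not n - 1.
    sigma = math.sqrt(sum((r - mu) ** 2 for r in ratings) / n)
    return mu, sigma

def predict_random(mu, sigma, rng, low=1.0, high=5.0):
    """Draw one rating from N(mu, sigma^2) and clip it to the rating scale."""
    return min(high, max(low, rng.gauss(mu, sigma)))

train_ratings = [4, 5, 3, 2, 4, 1]   # toy data, not MovieLens
mu, sigma = fit_normal(train_ratings)
rng = random.Random(0)
samples = [predict_random(mu, sigma, rng) for _ in range(5)]
print(mu, sigma, samples)
```

Because every estimate ignores both the user and the item, this predictor mostly serves as a lower bound to compare the other algorithms against.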

In [33]:
from surprise import NormalPredictor

algo = NormalPredictor()

# Train the algorithm on the trainset, and predict ratings for the testset
algo.fit(trainset)
predictions = algo.test(testset)

# Then compute RMSE
accuracy.rmse(predictions)

# Then show some prediction examples
testPrediction(algo)

RMSE: 1.5295
user: 312        item: 152        r_ui = 2.00   est = 1.94   
user: 940        item: 95         r_ui = 5.00   est = 2.61   
user: 881        item: 161        r_ui = 3.00   est = 4.68   
user: 436        item: 234        r_ui = 3.00   est = 2.04   
user: 398        item: 514        r_ui = 4.00   est = 2.65   
user: 790        item: 763        r_ui = 3.00   est = 3.97   
user: 104        item: 290        r_ui = 4.00   est = 1.53   
user: 32         item: 271        r_ui = 3.00   est = 5.00   
user: 189        item: 1099       r_ui = 5.00   est = 1.00   
user: 196        item: 302        r_ui = 4.00   est = 3.18   
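
Every model in this notebook is scored with RMSE, the square root of the mean squared difference between the true rating `r_ui` and the estimate `est`. As a rough illustration (a hand-rolled helper, not the code behind `accuracy.rmse`), it can be computed like this:

```python
import math

def rmse(pairs):
    """Root mean squared error over (true_rating, estimate) pairs."""
    return math.sqrt(sum((r - est) ** 2 for r, est in pairs) / len(pairs))

# The first three rows of the NormalPredictor output above.
pairs = [(2.0, 1.94), (5.0, 2.61), (3.0, 4.68)]
print(round(rmse(pairs), 4))
```

Large individual misses dominate the score because the errors are squared before averaging.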


## KNN-inspired algorithms

### KNN Basic
A basic k-nearest neighbor based collaborative filtering algorithm.
The prediction $\hat{r}_{ui}$ is set as:
$$\hat{r}_{ui} = \frac{
\sum\limits_{v \in N^k_i(u)} \text{sim}(u, v) \cdot r_{vi}}
{\sum\limits_{v \in N^k_i(u)} \text{sim}(u, v)}$$
or
$$\hat{r}_{ui} = \frac{
\sum\limits_{j \in N^k_u(i)} \text{sim}(i, j) \cdot r_{uj}}
{\sum\limits_{j \in N^k_u(i)} \text{sim}(i, j)}$$
        
depending on the `user_based` field of the `sim_options` parameter.
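
The user-based formula can be sketched on a toy rating dictionary. Everything below (the users, the ratings, the helper names) is hypothetical, and the similarity is a cosine computed over the items two users have in common, which is an assumption of this sketch.

```python
import math

# Toy ratings: user -> {item: rating}. Hypothetical data, not MovieLens.
ratings = {
    'alice': {'m1': 5, 'm2': 3, 'm3': 4},
    'bob':   {'m1': 4, 'm2': 2, 'm3': 5, 'm4': 4},
    'carol': {'m1': 1, 'm2': 5, 'm4': 2},
}

def cosine_sim(u, v):
    """Cosine similarity computed over the items both users rated."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    dot = sum(ratings[u][i] * ratings[v][i] for i in common)
    norm_u = math.sqrt(sum(ratings[u][i] ** 2 for i in common))
    norm_v = math.sqrt(sum(ratings[v][i] ** 2 for i in common))
    return dot / (norm_u * norm_v)

def predict_knn_basic(u, i, k=2):
    """Similarity-weighted average of the k most similar users' ratings of i."""
    # Candidate neighbors: the other users who rated item i.
    candidates = [(cosine_sim(u, v), ratings[v][i])
                  for v in ratings if v != u and i in ratings[v]]
    neighbors = sorted(candidates, reverse=True)[:k]
    sim_total = sum(sim for sim, _ in neighbors)
    if sim_total == 0:
        raise ValueError('no usable neighbors')
    return sum(sim * r for sim, r in neighbors) / sim_total

pred = predict_knn_basic('alice', 'm4')
print(pred)
```

Since the result is a weighted average of the neighbors' ratings (4 and 2 here), it always stays between the smallest and largest neighbor rating.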

In [37]:
from surprise import KNNBasic

print('User-based KNNBasic method with cosine similarity, with stochastic gradient descent')
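# Note: bsl_options only configures baseline estimation. With cosine similarity,
# KNNBasic computes no baselines (the output below has no "Estimating biases"
# line), so these settings have no effect on this model.
bsl_options = {'method': 'sgd',
               'n_epochs': 20,
               }
sim_options = {'name': 'cosine', 'user_based': True}
algo = KNNBasic(bsl_options=bsl_options, sim_options=sim_options)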

# Train the algorithm on the trainset, and predict ratings for the testset
algo.fit(trainset)
predictions = algo.test(testset)

# Then compute RMSE
accuracy.rmse(predictions)

# Then show some prediction examples
testPrediction(algo)

User-based KNNBasic method with cosine similarity, with stochastic gradient descent
Computing the cosine similarity matrix...
Done computing similarity matrix.
RMSE: 1.0219
user: 312        item: 152        r_ui = 2.00   est = 3.65   
user: 940        item: 95         r_ui = 5.00   est = 4.05   
user: 881        item: 161        r_ui = 3.00   est = 3.80   
user: 436        item: 234        r_ui = 3.00   est = 3.85   
user: 398        item: 514        r_ui = 4.00   est = 4.28   
user: 790        item: 763        r_ui = 3.00   est = 3.58   
user: 104        item: 290        r_ui = 4.00   est = 3.33   
user: 32         item: 271        r_ui = 3.00   est = 3.25   
user: 189        item: 1099       r_ui = 5.00   est = 3.36   
user: 196        item: 302        r_ui = 4.00   est = 4.08   


### KNN with Means
A basic collaborative filtering algorithm that takes into account the mean
ratings of each user.
The prediction $\hat{r}_{ui}$ is set as:

$$\hat{r}_{ui} = \mu_u + \frac{ \sum\limits_{v \in N^k_i(u)}
\text{sim}(u, v) \cdot (r_{vi} - \mu_v)} {\sum\limits_{v \in
N^k_i(u)} \text{sim}(u, v)}$$

or

$$\hat{r}_{ui} = \mu_i + \frac{ \sum\limits_{j \in N^k_u(i)}
\text{sim}(i, j) \cdot (r_{uj} - \mu_j)} {\sum\limits_{j \in
N^k_u(i)} \text{sim}(i, j)}$$

depending on the `user_based` field of the `sim_options` parameter.
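
The effect of mean-centering is easiest to see on a small example. The sketch below implements the user-based variant with hand-picked similarities; the data, the `sims` values, and the function name are all hypothetical.

```python
def predict_knn_means(u, i, ratings, sims, k=2):
    """Mean-centered kNN: mu_u plus the weighted average of neighbor deviations."""
    mean = {v: sum(r.values()) / len(r) for v, r in ratings.items()}
    # Each neighbor contributes how far its rating of i is from its own mean.
    candidates = [(sims[u][v], ratings[v][i] - mean[v])
                  for v in sims[u] if i in ratings[v]]
    neighbors = sorted(candidates, reverse=True)[:k]
    sim_total = sum(s for s, _ in neighbors)
    if sim_total == 0:
        return mean[u]   # no usable neighbors: fall back to the user's own mean
    return mean[u] + sum(s * d for s, d in neighbors) / sim_total

ratings = {
    'alice': {'m1': 5, 'm2': 3},            # generous rater, mean 4.0
    'bob':   {'m1': 3, 'm2': 1, 'm3': 2},   # harsh rater, mean 2.0
    'carol': {'m1': 3, 'm2': 2, 'm3': 4},   # mean 3.0
}
sims = {'alice': {'bob': 0.9, 'carol': 0.6}}   # hypothetical similarities

pred = predict_knn_means('alice', 'm3', ratings, sims)
print(pred)
```

On this toy data the mean-centered estimate is about 4.4, whereas a plain weighted average of the neighbors' raw ratings (2 and 4) would give 2.8. The difference comes from Alice rating higher on average than her neighbors.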

In [50]:
from surprise import KNNWithMeans

print('User-based method with cosine similarity, with ALS')
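# Note: as with KNNBasic, bsl_options is not used with cosine similarity
# (the output below has no "Estimating biases" line).
bsl_options = {'method': 'als',
               'n_epochs': 20,
               }
sim_options = {'name': 'cosine', 'user_based': True}
algo = KNNWithMeans(bsl_options=bsl_options, sim_options=sim_options)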

# Train the algorithm on the trainset, and predict ratings for the testset
algo.fit(trainset)
predictions = algo.test(testset)

# Then compute RMSE
accuracy.rmse(predictions)

# Then show some prediction examples
testPrediction(algo)

User-based method with cosine similarity, with ALS
Computing the cosine similarity matrix...
Done computing similarity matrix.
RMSE: 0.9582
user: 312        item: 152        r_ui = 2.00   est = 4.37   
user: 940        item: 95         r_ui = 5.00   est = 3.65   
user: 881        item: 161        r_ui = 3.00   est = 3.29   
user: 436        item: 234        r_ui = 3.00   est = 3.70   
user: 398        item: 514        r_ui = 4.00   est = 4.05   
user: 790        item: 763        r_ui = 3.00   est = 2.88   
user: 104        item: 290        r_ui = 4.00   est = 2.31   
user: 32         item: 271        r_ui = 3.00   est = 3.08   
user: 189        item: 1099       r_ui = 5.00   est = 3.97   
user: 196        item: 302        r_ui = 4.00   est = 4.20   


### KNN with ZScore
A basic collaborative filtering algorithm that takes into account
the z-score normalization of each user.

The prediction $\hat{r}_{ui}$ is set as:

$$\hat{r}_{ui} = \mu_u + \sigma_u \frac{ \sum\limits_{v \in N^k_i(u)}
\text{sim}(u, v) \cdot (r_{vi} - \mu_v) / \sigma_v} {\sum\limits_{v
\in N^k_i(u)} \text{sim}(u, v)}$$

or
$$\hat{r}_{ui} = \mu_i + \sigma_i \frac{ \sum\limits_{j \in N^k_u(i)}
\text{sim}(i, j) \cdot (r_{uj} - \mu_j) / \sigma_j} {\sum\limits_{j
\in N^k_u(i)} \text{sim}(i, j)}$$

depending on the `user_based` field of the `sim_options` parameter.
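
Since the cell below uses the item-based variant, here is a small sketch of that second formula. The item ratings, the similarity values, and the function name are invented for illustration. Skipping neighbors whose standard deviation is zero is a choice made in this sketch to avoid dividing by zero.

```python
import statistics

def predict_knn_zscore(u, i, item_ratings, sims, k=2):
    """Item-based z-score kNN prediction for user u on item i."""
    mu = {j: statistics.fmean(list(r.values())) for j, r in item_ratings.items()}
    sigma = {j: statistics.pstdev(list(r.values())) for j, r in item_ratings.items()}
    # Neighbors are items rated by u; each contributes its z-scored rating.
    candidates = [(sims[i][j], (item_ratings[j][u] - mu[j]) / sigma[j])
                  for j in sims[i]
                  if u in item_ratings[j] and sigma[j] > 0]
    neighbors = sorted(candidates, reverse=True)[:k]
    sim_total = sum(s for s, _ in neighbors)
    if sim_total == 0:
        return mu[i]
    return mu[i] + sigma[i] * sum(s * z for s, z in neighbors) / sim_total

# Toy data: item -> {user: rating}. Hypothetical, not MovieLens.
item_ratings = {
    'i':  {'a': 4, 'b': 2},                   # mean 3, std 1
    'j1': {'a': 1, 'b': 5, 'u': 5, 'c': 1},   # mean 3, std 2
    'j2': {'a': 2, 'b': 4, 'u': 2, 'c': 4},   # mean 3, std 1
}
sims = {'i': {'j1': 0.8, 'j2': 0.2}}          # hypothetical similarities

pred = predict_knn_zscore('u', 'i', item_ratings, sims)
print(pred)
```

Dividing by each neighbor's standard deviation puts widely spread and narrowly spread items on the same scale before they are averaged.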

In [41]:
from surprise import KNNWithZScore

print('Item-based method with pearson baseline similarity, with ALS')
bsl_options = {'method': 'als',
               'n_epochs': 20,
               }
sim_options = {'name': 'pearson_baseline', 'user_based': False}
algo = KNNWithZScore(bsl_options=bsl_options, sim_options=sim_options)

# Train the algorithm on the trainset, and predict ratings for the testset
algo.fit(trainset)
predictions = algo.test(testset)

# Then compute RMSE
accuracy.rmse(predictions)

# Then show some prediction examples
testPrediction(algo)

Item-based method with pearson baseline similarity, with ALS
Estimating biases using als...
Computing the pearson_baseline similarity matrix...
Done computing similarity matrix.
RMSE: 0.9283
user: 312        item: 152        r_ui = 2.00   est = 3.97   
user: 940        item: 95         r_ui = 5.00   est = 3.68   
user: 881        item: 161        r_ui = 3.00   est = 3.59   
user: 436        item: 234        r_ui = 3.00   est = 3.92   
user: 398        item: 514        r_ui = 4.00   est = 3.80   
user: 790        item: 763        r_ui = 3.00   est = 3.17   
user: 104        item: 290        r_ui = 4.00   est = 2.39   
user: 32         item: 271        r_ui = 3.00   est = 3.11   
user: 189        item: 1099       r_ui = 5.00   est = 3.53   
user: 196        item: 302        r_ui = 4.00   est = 3.92   


### KNN Baseline and Neighborhood analysis
A basic collaborative filtering algorithm that takes into account a
*baseline* rating.

The prediction $\hat{r}_{ui}$ is set as:

$$\hat{r}_{ui} = b_{ui} + \frac{ \sum\limits_{v \in N^k_i(u)}
\text{sim}(u, v) \cdot (r_{vi} - b_{vi})} {\sum\limits_{v \in
N^k_i(u)} \text{sim}(u, v)}$$

or

$$\hat{r}_{ui} = b_{ui} + \frac{ \sum\limits_{j \in N^k_u(i)}
\text{sim}(i, j) \cdot (r_{uj} - b_{uj})} {\sum\limits_{j \in
N^k_u(i)} \text{sim}(i, j)}$$

depending on the `user_based` field of the `sim_options` parameter.

In [42]:
import io

from surprise import KNNBaseline
from surprise import Dataset
from surprise import get_dataset_dir

def read_item_names():
    """Read the u.item file from MovieLens 100-k dataset and return two
    mappings to convert raw ids into movie names and movie names into raw ids.
    """

    file_name = get_dataset_dir() + '/ml-100k/ml-100k/u.item'
    rid_to_name = {}
    name_to_rid = {}
    with io.open(file_name, 'r', encoding='ISO-8859-1') as f:
        for line in f:
            line = line.split('|')
            rid_to_name[line[0]] = line[1]
            name_to_rid[line[1]] = line[0]

    return rid_to_name, name_to_rid

# First, train the algorithm to compute the similarities between items
#data = Dataset.load_builtin('ml-100k')
#trainset = data.build_full_trainset()
sim_options = {'name': 'pearson_baseline', 'user_based': False}
algo = KNNBaseline(sim_options=sim_options)
algo.fit(trainset)
predictions = algo.test(testset)

# Then compute RMSE
accuracy.rmse(predictions)

# Then show some prediction examples
testPrediction(algo)

Estimating biases using als...
Computing the pearson_baseline similarity matrix...
Done computing similarity matrix.
RMSE: 0.9184
user: 312        item: 152        r_ui = 2.00   est = 3.98   
user: 940        item: 95         r_ui = 5.00   est = 3.69   
user: 881        item: 161        r_ui = 3.00   est = 3.58   
user: 436        item: 234        r_ui = 3.00   est = 3.90   
user: 398        item: 514        r_ui = 4.00   est = 3.80   
user: 790        item: 763        r_ui = 3.00   est = 3.13   
user: 104        item: 290        r_ui = 4.00   est = 2.36   
user: 32         item: 271        r_ui = 3.00   est = 3.21   
user: 189        item: 1099       r_ui = 5.00   est = 3.66   
user: 196        item: 302        r_ui = 4.00   est = 3.95   


#### Neighborhood analysis
**Nearest neighbors**: Next, let's look at the 10 nearest neighbors of the movie `Toy Story`.

In [43]:
# Read the mappings raw id <-> movie name
rid_to_name, name_to_rid = read_item_names()

# Retrieve inner id of the movie Toy Story
toy_story_raw_id = name_to_rid['Toy Story (1995)']
toy_story_inner_id = algo.trainset.to_inner_iid(toy_story_raw_id)

# Retrieve inner ids of the nearest neighbors of Toy Story.
toy_story_neighbors = algo.get_neighbors(toy_story_inner_id, k=10)

# Convert inner ids of the neighbors into names.
toy_story_neighbors = (algo.trainset.to_raw_iid(inner_id)
                       for inner_id in toy_story_neighbors)
toy_story_neighbors = (rid_to_name[rid]
                       for rid in toy_story_neighbors)

print()
print('The 10 nearest neighbors of Toy Story are:')
for movie in toy_story_neighbors:
    print(movie)



The 10 nearest neighbors of Toy Story are:
Liar Liar (1997)
Jurassic Park (1993)
Craft, The (1996)
Raiders of the Lost Ark (1981)
Aladdin (1992)
That Thing You Do! (1996)
Mission: Impossible (1996)
Indiana Jones and the Last Crusade (1989)
Lion King, The (1994)
Beauty and the Beast (1991)


### SlopeOne

A simple yet accurate collaborative filtering algorithm.

This is a straightforward implementation of the SlopeOne algorithm.

The prediction $\hat{r}_{ui}$ is set as:


$$\hat{r}_{ui} = \mu_u + \frac{1}{
|R_i(u)|}
\sum\limits_{j \in R_i(u)} \text{dev}(i, j),$$

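where $R_i(u)$ is the set of relevant items, i.e. the set of items
$j$ rated by $u$ that also have at least one common user with
$i$. $\text{dev}(i, j)$ is defined as the average difference
between the ratings of $i$ and those of $j$:

$$\text{dev}(i, j) = \frac{1}{|U_{ij}|} \sum\limits_{u \in U_{ij}} (r_{ui} - r_{uj}),$$

where $U_{ij}$ is the set of users who have rated both $i$ and $j$.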
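
As a rough sketch of the idea, the snippet below computes the average rating difference between every pair of items over the users who rated both, then applies the prediction formula. The toy ratings and helper names are hypothetical, and no attempt is made at efficiency.

```python
from collections import defaultdict

# Toy ratings: user -> {item: rating}. Hypothetical data.
ratings = {
    'u1': {'a': 5, 'b': 3, 'c': 2},
    'u2': {'a': 3, 'b': 4},
    'u3': {'b': 2, 'c': 3},
}

def deviations(ratings):
    """dev[i][j] = average of (r_ui - r_uj) over users who rated both i and j."""
    total = defaultdict(lambda: defaultdict(float))
    count = defaultdict(lambda: defaultdict(int))
    for user_ratings in ratings.values():
        for i, r_i in user_ratings.items():
            for j, r_j in user_ratings.items():
                if i != j:
                    total[i][j] += r_i - r_j
                    count[i][j] += 1
    return {i: {j: total[i][j] / count[i][j] for j in total[i]} for i in total}

def predict_slope_one(u, i, ratings, dev):
    """mu_u plus the mean deviation of i from the relevant items rated by u."""
    user_ratings = ratings[u]
    mu_u = sum(user_ratings.values()) / len(user_ratings)
    # Relevant items: rated by u and sharing at least one user with i.
    relevant = [j for j in user_ratings if j in dev.get(i, {})]
    if not relevant:
        return mu_u
    return mu_u + sum(dev[i][j] for j in relevant) / len(relevant)

dev = deviations(ratings)
pred = predict_slope_one('u3', 'a', ratings, dev)
print(dev['a'], pred)
```

Here item `a` is rated 0.5 higher than `b` and 3 higher than `c` on average, so the estimate for `u3` is that user's mean of 2.5 plus the average of those two deviations.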


In [54]:
from surprise import SlopeOne
algo = SlopeOne()

# Train the algorithm on the trainset, and predict ratings for the testset
algo.fit(trainset)
predictions = algo.test(testset)

# Then compute RMSE
accuracy.rmse(predictions)

# Then show some prediction examples
testPrediction(algo)

RMSE: 0.9476
user: 312        item: 152        r_ui = 2.00   est = 4.24   
user: 940        item: 95         r_ui = 5.00   est = 3.29   
user: 881        item: 161        r_ui = 3.00   est = 3.31   
user: 436        item: 234        r_ui = 3.00   est = 4.13   
user: 398        item: 514        r_ui = 4.00   est = 3.72   
user: 790        item: 763        r_ui = 3.00   est = 3.02   
user: 104        item: 290        r_ui = 4.00   est = 2.20   
user: 32         item: 271        r_ui = 3.00   est = 3.09   
user: 189        item: 1099       r_ui = 5.00   est = 3.70   
user: 196        item: 302        r_ui = 4.00   est = 4.26   


## Matrix Factorization

### Baseline Algorithm

Typical CF data exhibit large user and item effects: systematic tendencies for
some users to give higher ratings than others, and for some items to receive
higher ratings than others. It is customary to adjust the data by accounting for
these effects, which we encapsulate within the baseline estimates.
Denote by $\mu$ the overall average rating.

A baseline estimate for an unknown rating $r_{ui}$ is
denoted by $b_{ui}$ and accounts for the user and item effects:

$$b_{ui} = \mu + b_u + b_i$$

The parameters $b_u$ and $b_i$ indicate the observed deviations of user $u$ and item $i$,
respectively, from the average. For example, suppose that we want a baseline
estimate for the rating of the movie Titanic by user Joe. Now, say that the
average rating over all movies, $\mu$, is 3.7 stars. Furthermore, Titanic is better
than an average movie, so it tends to be rated 0.5 stars above the average. On
the other hand, Joe is a critical user, who tends to rate 0.3 stars lower than the
average. Thus, the baseline estimate for Titanic's rating by Joe would be 3.9
stars, calculated as $3.7 - 0.3 + 0.5$. To estimate $b_u$ and $b_i$, one can solve
a least squares problem.

More details in the theory section
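
The snippet below first reproduces the Joe and Titanic arithmetic, then sketches one way the biases could be estimated by alternating between items and users. It is a simplified illustration of an ALS-style scheme under assumed update rules, not a copy of the library's solver. The parameter names `reg_u`, `reg_i` and `n_epochs` mirror the options used in the next cell, and the four toy ratings are made up.

```python
from collections import defaultdict

def als_baselines(triples, n_epochs=5, reg_u=12, reg_i=5):
    """Alternately re-estimate item and user biases, shrunk by reg_i and reg_u."""
    mu = sum(r for _, _, r in triples) / len(triples)
    by_user, by_item = defaultdict(list), defaultdict(list)
    for u, i, r in triples:
        by_user[u].append((i, r))
        by_item[i].append((u, r))
    b_u, b_i = defaultdict(float), defaultdict(float)
    for _ in range(n_epochs):
        # Item step: average residual after removing mu and the user biases.
        for i, rows in by_item.items():
            b_i[i] = sum(r - mu - b_u[u] for u, r in rows) / (reg_i + len(rows))
        # User step: average residual after removing mu and the item biases.
        for u, rows in by_user.items():
            b_u[u] = sum(r - mu - b_i[i] for i, r in rows) / (reg_u + len(rows))
    return mu, dict(b_u), dict(b_i)

# Worked example from the text: mu = 3.7, Joe's bias -0.3, Titanic's bias +0.5.
baseline = 3.7 + (-0.3) + 0.5
print(round(baseline, 2))

# Hypothetical toy ratings (user, item, rating).
triples = [('joe', 'titanic', 4), ('joe', 'alien', 2),
           ('ann', 'titanic', 5), ('ann', 'alien', 4)]
mu, b_u, b_i = als_baselines(triples)
print(mu, b_u, b_i)
```

The regularization terms in the denominators pull the biases of rarely seen users and items toward zero, so a single extreme rating does not produce an extreme bias.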


In [51]:
from surprise import BaselineOnly

# Example using Baseline ALS
print('Using ALS')
bsl_options = {'method': 'als',
               'n_epochs': 5,
               'reg_u': 12,
               'reg_i': 5
               }
algo = BaselineOnly(bsl_options=bsl_options)

# Train the algorithm on the trainset, and predict ratings for the testset
algo.fit(trainset)
predictions = algo.test(testset)

# Then compute RMSE
accuracy.rmse(predictions)

# Then show some prediction examples
testPrediction(algo)

Using ALS
Estimating biases using als...
RMSE: 0.9417
user: 312        item: 152        r_ui = 2.00   est = 4.19   
user: 940        item: 95         r_ui = 5.00   est = 3.46   
user: 881        item: 161        r_ui = 3.00   est = 3.28   
user: 436        item: 234        r_ui = 3.00   est = 4.00   
user: 398        item: 514        r_ui = 4.00   est = 3.83   
user: 790        item: 763        r_ui = 3.00   est = 3.03   
user: 104        item: 290        r_ui = 4.00   est = 2.36   
user: 32         item: 271        r_ui = 3.00   est = 3.17   
user: 189        item: 1099       r_ui = 5.00   est = 3.74   
user: 196        item: 302        r_ui = 4.00   est = 4.13   


### SVD

The famous *SVD* algorithm, as popularized by Simon Funk during the Netflix
Prize.
The prediction $\hat{r}_{ui}$ is set as:

$$\hat{r}_{ui} = \mu + b_u + b_i + q_i^Tp_u$$

If user $u$ is unknown, then the bias $b_u$ and the factors
$p_u$ are assumed to be zero. The same applies for item $i$
with $b_i$ and $q_i$.

To estimate all the unknowns, we minimize the following regularized squared
error:

$$\sum_{r_{ui} \in R_{train}} \left(r_{ui} - \hat{r}_{ui} \right)^2 +
\lambda\left(b_i^2 + b_u^2 + ||q_i||^2 + ||p_u||^2\right)$$

The minimization is performed by a very straightforward stochastic gradient
descent:

$$
b_u \leftarrow b_u + \gamma (e_{ui} - \lambda b_u)\\
b_i \leftarrow b_i + \gamma (e_{ui} - \lambda b_i)\\
p_u \leftarrow p_u + \gamma (e_{ui} \cdot q_i - \lambda p_u)\\
q_i \leftarrow q_i + \gamma (e_{ui} \cdot p_u - \lambda q_i)$$

where $e_{ui} = r_{ui} - \hat{r}_{ui}$. These steps are performed
over all the ratings of the trainset and repeated `n_epochs` times.
Baselines are initialized to $0$. User and item factors are randomly
initialized according to a normal distribution, which can be tuned using
the `init_mean` and `init_std_dev` parameters.

You also have control over the learning rate $\gamma$ and the
regularization term $\lambda$. Both can be different for each
kind of parameter (see below). By default, learning rates are set to
$0.005$ and regularization terms are set to $0.02$.
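
The four update rules translate almost line by line into code. The following is a minimal pure-Python sketch on made-up ratings. It uses a single learning rate and a single regularization term for all parameters, and initializes the factors from a normal distribution with mean 0 and standard deviation 0.1, which are assumptions of this sketch.

```python
import random

def predict(u, i, mu, b_u, b_i, p, q):
    """mu + b_u + b_i + q_i . p_u"""
    return mu + b_u[u] + b_i[i] + sum(pf * qf for pf, qf in zip(p[u], q[i]))

def sgd_epoch(triples, mu, b_u, b_i, p, q, lr=0.005, reg=0.02):
    """One pass of the SGD updates over all (user, item, rating) triples."""
    for u, i, r in triples:
        err = r - predict(u, i, mu, b_u, b_i, p, q)
        b_u[u] += lr * (err - reg * b_u[u])
        b_i[i] += lr * (err - reg * b_i[i])
        for f in range(len(p[u])):
            puf, qif = p[u][f], q[i][f]   # keep the old values for both updates
            p[u][f] += lr * (err * qif - reg * puf)
            q[i][f] += lr * (err * puf - reg * qif)

def squared_error(triples, mu, b_u, b_i, p, q):
    return sum((r - predict(u, i, mu, b_u, b_i, p, q)) ** 2 for u, i, r in triples)

rng = random.Random(0)
# Hypothetical toy ratings (user, item, rating).
triples = [('u1', 'a', 5), ('u1', 'b', 3), ('u2', 'a', 4), ('u2', 'c', 1)]
users = sorted({u for u, _, _ in triples})
items = sorted({i for _, i, _ in triples})
n_factors = 2
mu = sum(r for _, _, r in triples) / len(triples)
b_u = {u: 0.0 for u in users}
b_i = {i: 0.0 for i in items}
p = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
q = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)] for i in items}

before = squared_error(triples, mu, b_u, b_i, p, q)
for _ in range(20):   # n_epochs
    sgd_epoch(triples, mu, b_u, b_i, p, q)
after = squared_error(triples, mu, b_u, b_i, p, q)
print(before, after)
```

With a small learning rate each epoch only nudges the parameters, so the training error drops gradually rather than in one step.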


In [52]:
from surprise import SVD

# We'll use the famous SVD algorithm.
algo = SVD(n_factors=100, lr_all=0.005, reg_all=0.02, verbose=True)

# Train the algorithm on the trainset, and predict ratings for the testset
algo.fit(trainset)
predictions = algo.test(testset)

# Then compute RMSE
accuracy.rmse(predictions)

# Then show some prediction examples
testPrediction(algo)

Processing epoch 0
Processing epoch 1
Processing epoch 2
Processing epoch 3
Processing epoch 4
Processing epoch 5
Processing epoch 6
Processing epoch 7
Processing epoch 8
Processing epoch 9
Processing epoch 10
Processing epoch 11
Processing epoch 12
Processing epoch 13
Processing epoch 14
Processing epoch 15
Processing epoch 16
Processing epoch 17
Processing epoch 18
Processing epoch 19
RMSE: 0.9345
user: 312        item: 152        r_ui = 2.00   est = 3.91   
user: 940        item: 95         r_ui = 5.00   est = 3.48   
user: 881        item: 161        r_ui = 3.00   est = 3.41   
user: 436        item: 234        r_ui = 3.00   est = 3.87   
user: 398        item: 514        r_ui = 4.00   est = 3.68   
user: 790        item: 763        r_ui = 3.00   est = 3.23   
user: 104        item: 290        r_ui = 4.00   est = 2.33   
user: 32         item: 271        r_ui = 3.00   est = 3.15   
user: 189        item: 1099       r_ui = 5.00   est = 3.52   
user: 196        item: 302        r_ui 

### SVD++

The *SVD++* algorithm, an extension of SVD that takes into account
implicit ratings.

The prediction $\hat{r}_{ui}$ is set as:


$$\hat{r}_{ui} = \mu + b_u + b_i + q_i^T\left(p_u +
|I_u|^{-\frac{1}{2}} \sum_{j \in I_u}y_j\right)$$

where the $y_j$ terms are a new set of item factors that capture
implicit ratings. Here, an implicit rating describes the fact that a user
$u$ rated an item $j$, regardless of the rating value.
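
To see how the implicit term enters the prediction, here is a small sketch that evaluates the formula for given parameters. The function name and all numbers are hypothetical; `y_rated` holds one factor vector $y_j$ for each item in $I_u$.

```python
import math

def predict_svdpp(mu, b_u, b_i, p_u, q_i, y_rated):
    """mu + b_u + b_i + q_i . (p_u + |I_u|^(-1/2) * sum of y_j over I_u)."""
    n = len(y_rated)
    norm = 1 / math.sqrt(n) if n else 0.0
    # Implicit-feedback term: sum the y_j vectors and scale by |I_u|^(-1/2).
    implicit = [norm * sum(y[f] for y in y_rated) for f in range(len(p_u))]
    user_vec = [pf + imf for pf, imf in zip(p_u, implicit)]
    return mu + b_u + b_i + sum(qf * uf for qf, uf in zip(q_i, user_vec))

pred = predict_svdpp(mu=3.5, b_u=0.1, b_i=-0.2,
                     p_u=[0.5, -0.5], q_i=[1.0, 2.0],
                     y_rated=[[0.2, 0.0], [0.2, 0.4], [0.0, 0.2], [0.0, 0.2]])
print(pred)
```

With an empty `y_rated` the implicit term vanishes and the expression reduces to the SVD prediction rule.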

If user $u$ is unknown, then the bias $b_u$ and the factors
$p_u$ are assumed to be zero. The same applies for item $i$
with $b_i$, $q_i$ and $y_i$.

Just as for SVD, the parameters are learned using SGD on the
regularized squared error objective.

Baselines are initialized to $0$. User and item factors are randomly
initialized according to a normal distribution, which can be tuned using
the `init_mean` and `init_std_dev` parameters.

You have control over the learning rate $\gamma$ and the
regularization term $\lambda$. Both can be different for each
kind of parameter (see below). By default, learning rates are set to
$0.005$ and regularization terms are set to $0.02$.


In [53]:
from surprise import SVDpp
algo = SVDpp(n_factors=20, lr_all=0.007, reg_all=0.02, verbose=True)

# Train the algorithm on the trainset, and predict ratings for the testset
algo.fit(trainset)
predictions = algo.test(testset)

# Then compute RMSE
accuracy.rmse(predictions)

# Then show some prediction examples
testPrediction(algo)

 processing epoch 0
 processing epoch 1
 processing epoch 2
 processing epoch 3
 processing epoch 4
 processing epoch 5
 processing epoch 6
 processing epoch 7
 processing epoch 8
 processing epoch 9
 processing epoch 10
 processing epoch 11
 processing epoch 12
 processing epoch 13
 processing epoch 14
 processing epoch 15
 processing epoch 16
 processing epoch 17
 processing epoch 18
 processing epoch 19
RMSE: 0.9236
user: 312        item: 152        r_ui = 2.00   est = 4.35   
user: 940        item: 95         r_ui = 5.00   est = 3.62   
user: 881        item: 161        r_ui = 3.00   est = 3.51   
user: 436        item: 234        r_ui = 3.00   est = 3.63   
user: 398        item: 514        r_ui = 4.00   est = 3.68   
user: 790        item: 763        r_ui = 3.00   est = 3.40   
user: 104        item: 290        r_ui = 4.00   est = 2.06   
user: 32         item: 271        r_ui = 3.00   est = 3.11   
user: 189        item: 1099       r_ui = 5.00   est = 3.52   
user: 196        it

<i>Modified by Xiaohan Zhang, based on the original version of </i>

<i>Copyright 2015, Nicolas Hug</i>