# Collaborative Filtering in Turi (formerly Dato, formerly GraphLab)

This tutorial explains collaborative filtering methods for recommender systems using the GraphLab Create package (from the company Dato). Many of the examples are adapted versions of the following basic tutorials:
- https://dato.com/learn/gallery/notebooks/basic_recommender_functionalities.html 
- https://dato.com/learn/gallery/notebooks/five_line_recommender.html

Furthermore, Dato has plenty of IPython notebook examples that go beyond recommendation systems, including classification, clustering, and graph analytics. 
- https://dato.com/learn/gallery/index.html

## The five line recommendation system (user-item)
This example builds a recommendation system for movie ratings from the following dataset of users and movie ratings. It is explained in detail at https://dato.com/learn/gallery/notebooks/five_line_recommender.html. The example hides much of the available functionality and fine-tuning, but it is a good starting point.

The dataset in this example comes from ~330 users who have rated ~7700 movies (a total of ~82,000 ratings).

In [1]:
# This is a well known graphlab example that builds a recommendation system in 5 lines of code

import graphlab as gl

data = gl.SFrame.read_csv("http://s3.amazonaws.com/dato-datasets/movie_ratings/training_data.csv", 
                          column_type_hints={"rating":int})
model = gl.recommender.create(data, user_id="user", item_id="movie", target="rating")
results = model.recommend(users=None, k=5)
model.save("my_model")

results.head() # the recommendation output


2016-08-03 13:59:53,837 [INFO] graphlab.cython.cy_server, 176: GraphLab Create v1.8.5 started. Logging: /tmp/graphlab_server_1470250791.log




user,movie,score,rank
Jacob Smith,The Magnificent Seven,4.78280519045,1
Jacob Smith,Citizen Kane,4.72507308519,2
Jacob Smith,Elf,4.69207881487,3
Jacob Smith,West Side Story,4.64565228021,4
Jacob Smith,Moonstruck,4.50000594652,5
Mason Smith,Roger & Me,5.67432807482,1
Mason Smith,Doctor Zhivago,4.99782393968,2
Mason Smith,Gandhi,4.75535653627,3
Mason Smith,Cool Hand Luke,4.6927850107,4
Mason Smith,The Pianist,4.66689322985,5


The model above finds the five highest-ranked items for each user. Two users are shown with their corresponding highest-ranked items (restricted to items they have not rated).
___

In [2]:
data.head()

user,movie,rating
Jacob Smith,Flirting with Disaster,4
Jacob Smith,Indecent Proposal,3
Jacob Smith,Runaway Bride,2
Jacob Smith,Swiss Family Robinson,1
Jacob Smith,The Mexican,2
Jacob Smith,Maid in Manhattan,4
Jacob Smith,A Charlie Brown Thanksgiving / The ...,3
Jacob Smith,Brazil,1
Jacob Smith,Forrest Gump,3
Jacob Smith,It Happened One Night,4


That's great! But we do not yet know how good these results are, so let's keep moving; we will come back to this question using cross-validation. 


## The item-item recommendation system
Now let's look at creating the item-item similarity matrix. That is, for each item, which items are closest to it based on user ratings?

In [3]:
# from graphlab.recommender import item_similarity_recommender

item_item = gl.recommender.item_similarity_recommender.create(data, 
                                  user_id="user", 
                                  item_id="movie", 
                                  target="rating",
                                  only_top_k=3,
                                  similarity_type="cosine")

results = item_item.get_similar_items(k=3)
results.head()

movie,similar,score,rank
Flirting with Disaster,Martin Lawrence: You So Crazy ...,0.561863587262,1
Flirting with Disaster,Shadow Magic,0.535303379031,2
Flirting with Disaster,Seinfeld: Season 4,0.507150516208,3
Indecent Proposal,Cocktail,0.568772522656,1
Indecent Proposal,Beverly Hills Cop,0.516246885143,2
Indecent Proposal,Flatliners,0.513955034568,3
Runaway Bride,Notting Hill,0.61341356583,1
Runaway Bride,Sleepless in Seattle,0.609021736748,2
Runaway Bride,Maid in Manhattan,0.608688789629,3
Swiss Family Robinson,Armed and Dangerous,0.483493778415,1


The item-item matrix is typically a good baseline. However, we can do better with a more personalized system, one that takes into account the preferences of specific users rather than how all users rated specific items. 
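
To make the item-item idea concrete, here is a minimal sketch of cosine similarity between items in plain Python. The toy ratings and the helper names (`item_vectors`, `cosine`, `most_similar`) are assumptions for illustration. Treating unrated items as zeros is one common convention and is not necessarily what GraphLab Create does internally.

```python
import math

# Toy ratings (hypothetical data): user -> {movie: rating}
ratings = {
    "u1": {"A": 5.0, "B": 4.0, "C": 1.0},
    "u2": {"A": 4.0, "B": 5.0},
    "u3": {"B": 2.0, "C": 5.0},
}

def item_vectors(ratings):
    """Invert user -> {item: rating} into item -> {user: rating}."""
    vectors = {}
    for user, items in ratings.items():
        for item, r in items.items():
            vectors.setdefault(item, {})[user] = r
    return vectors

def cosine(v, w):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v[u] * w[u] for u in v if u in w)
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    norm_w = math.sqrt(sum(x * x for x in w.values()))
    if norm_v == 0 or norm_w == 0:
        return 0.0
    return dot / (norm_v * norm_w)

def most_similar(item, vectors, k=3):
    """Return the k items most similar to `item`, best first."""
    scores = [(other, cosine(vectors[item], vectors[other]))
              for other in vectors if other != item]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]

vectors = item_vectors(ratings)
top = most_similar("A", vectors, k=2)
```

In this toy data, movie "B" ends up closest to "A" because the two users who rated "A" also rated "B" highly.
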
___
Moreover, we need to cross-validate on the dataset to see which model and model parameters actually generalize well to our data. That also means we need a set of evaluation criteria. The first and very common measure is the root mean squared error (RMSE). It takes into account the difference between the predicted rating and the actual rating of items. However, we can calculate it with a number of different aggregations (i.e., different splits of the data). For instance, we could compute a single RMSE over every entry in the dataset. Or, we could compute a separate RMSE for each user, or a separate RMSE for each item. It really depends on what we are most interested in (i.e., our business case). RMSE can be calculated in the following ways:

$$RMSE=\sqrt{\frac{1}{N}\sum_{i=1}^N (\hat{y}_i-y_i)^2}$$

Or we can calculate the RMSE for each user, U, in our data:

$$\underbrace{RMSE(U)}_{\text{user=U}}=\sqrt{\frac{1}{|U|}\sum_{u\in U} (\hat{y}_u-y_u)^2}$$

Or we can calculate the RMSE for each item, J, in our data:

$$\underbrace{RMSE(J)}_{\text{item=J}}=\sqrt{\frac{1}{|J|}\sum_{j\in J} (\hat{y}_j-y_j)^2}$$
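
The three aggregations above can be sketched in a few lines of plain Python. The rows and the helper names (`rmse`, `grouped_rmse`) are hypothetical and only illustrate the formulas; they are not GraphLab Create's implementation.

```python
import math
from collections import defaultdict

# Hypothetical (user, item, actual, predicted) rows for illustration only
rows = [
    ("u1", "A", 4.0, 3.0),
    ("u1", "B", 2.0, 2.0),
    ("u2", "A", 5.0, 3.0),
    ("u2", "C", 1.0, 2.0),
]

def rmse(errors):
    """Root mean squared error of a list of (predicted - actual) values."""
    return math.sqrt(sum(e * e for e in errors) / float(len(errors)))

def grouped_rmse(rows, key_index):
    """RMSE per group; key_index 0 groups by user, 1 groups by item."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row[3] - row[2])
    return {key: rmse(errs) for key, errs in groups.items()}

rmse_overall = rmse([pred - actual for _, _, actual, pred in rows])
rmse_by_user = grouped_rmse(rows, 0)   # one value per user
rmse_by_item = grouped_rmse(rows, 1)   # one value per item
```

Note that the overall RMSE is not the average of the per-user values: users with many ratings weigh more heavily in the overall number.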

It's important to understand that RMSE(U) and RMSE(J) are arrays of values, with one entry per unique user or per unique item, respectively. Therefore an approach that visualizes the distribution of values is a nice evaluation technique. It also means that statistical tests on the distributions can be used to evaluate the differences between models. That is, "Model A has statistically smaller (with 95% confidence) per-user RMSE than model B, therefore we conclude that model A has superior performance."
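
One way to make such a comparison, sketched here under assumptions, is a paired percentile bootstrap on the per-user differences. This is a stand-in for whichever statistical test you prefer; the per-user RMSE values and the function name `paired_bootstrap_ci` are hypothetical.

```python
import random

# Hypothetical per-user RMSE values for the same five users under two models
rmse_a = [0.80, 0.95, 1.10, 0.70, 1.00]
rmse_b = [1.00, 1.20, 1.15, 0.90, 1.30]

def paired_bootstrap_ci(a, b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean per-user difference (a - b)."""
    rng = random.Random(seed)
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    means = []
    for _ in range(n_resamples):
        # Resample users with replacement, keeping each user's pair together
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / float(n))
    means.sort()
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

lower, upper = paired_bootstrap_ci(rmse_a, rmse_b)
# If the whole interval lies below zero, model A has lower per-user RMSE
a_is_better = upper < 0
```

The pairing matters: both models are scored on the same users, so the test looks at each user's difference rather than at two independent samples.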


So let's now create a holdout set and see if we can judge the RMSE on a per-user and per-item basis:

In [4]:
train, test = gl.recommender.util.random_split_by_user(data,
                                                    user_id="user", item_id="movie",
                                                    max_num_users=100, item_test_proportion=0.2)

In [5]:
from IPython.display import display
from IPython.display import Image

gl.canvas.set_target('ipynb')


item_item = gl.recommender.item_similarity_recommender.create(train, 
                                  user_id="user", 
                                  item_id="movie", 
                                  target="rating",
                                  only_top_k=5,
                                  similarity_type="cosine")

rmse_results = item_item.evaluate(test)



Precision and recall summary statistics by cutoff
+--------+-----------------+------------------+
| cutoff |  mean_precision |   mean_recall    |
+--------+-----------------+------------------+
|   1    |       0.07      | 0.00190169183771 |
|   2    |      0.045      | 0.00210361491463 |
|   3    |       0.04      | 0.00288868294107 |
|   4    |      0.035      | 0.00319547451016 |
|   5    |       0.03      | 0.00363025711886 |
|   6    | 0.0266666666667 | 0.00369561659598 |
|   7    | 0.0228571428571 | 0.00369561659598 |
|   8    |      0.0225     | 0.00396347373884 |
|   9    | 0.0211111111111 | 0.00410633088169 |
|   10   |       0.02      | 0.00435023332072 |
+--------+-----------------+------------------+
[10 rows x 3 columns]

('\nOverall RMSE: ', 1.2578872524956006)

Per User RMSE (best)
+----------------+-------+----------------+
|      user      | count |      rmse      |
+----------------+-------+----------------+
| Jeremiah Smith |   37  | 0.574666049423 |
+--------------

In [6]:
print rmse_results.viewkeys()
print rmse_results['rmse_by_item']

dict_keys(['rmse_by_user', 'precision_recall_overall', 'rmse_by_item', 'precision_recall_by_user', 'rmse_overall'])
+------------------------------+-------+---------------+
|            movie             | count |      rmse     |
+------------------------------+-------+---------------+
|       Steel Magnolias        |   4   | 1.66249218731 |
| Monty Python's Life of Brian |   1   |      2.0      |
|   Crimes and Misdemeanors    |   2   | 1.87781068339 |
|    The Mothman Prophecies    |   1   |      0.8      |
|         ER: Season 1         |   1   |      0.0      |
|        Donnie Brasco         |   1   | 1.65979858121 |
|           Eurotrip           |   1   | 1.83333333333 |
|     Cast a Giant Shadow      |   1   |      2.0      |
|          Scooby-Doo          |   1   |      1.68     |
|          Idle Hands          |   1   |      2.0      |
+------------------------------+-------+---------------+
[2361 rows x 3 columns]
Note: Only the head of the SFrame is printed.
You can use prin

In [7]:
rmse_results['rmse_by_user']

user,count,rmse
Tucker Smith,12,0.769163901589
Donovan Smith,37,0.938261979014
Alan Smith,44,1.22698029942
Oliver Smith,18,1.49780057016
Simon Smith,14,1.24891354412
Paxton Smith,12,1.2810697566
Nicholas Smith,54,1.03665832534
Martin Smith,10,0.883952410217
Hunter Smith,50,1.34829799717
Preston Smith,94,1.13238543728


___
Another evaluation criterion is the per-user recall or the per-user precision. These values are typically small because they require users with a large number of ratings. The idea behind them is: given a number of highly rated items for a user, how many of them did the model also recommend? This is inherently difficult to calculate because the user has not rated every item in the dataset. We may have found 10 items that the user would have chosen and rated highly, but if the user never rated them, we can't be sure how well we are recommending them. 

Even so, it's a good measure of how well you are ranking the items that are most important to the user (assuming the user rated items they had strong opinions about). It's not perfect, but it's the best we have to work with.

We define the per-user measures as follows: Let $p_k$ be a vector of the $k$ highest-ranked recommendations for a particular user and let $a$ be the set of all positively rated items for that user in the test set. 

The per-user-recall for k-items is given by:

$$R(k)=\frac{|a \cap p_k|}{|a|} $$

Which means, intuitively, "of all the items rated positively by the user, how many did your recommender find?"

The per-user-precision for k-items is given by:

$$P(k)=\frac{|a \cap p_k|}{k} $$

Which means, intuitively, "of the k items found by your recommender, how many were rated positively by the user?"

These, like per-user RMSE, are arrays with one entry per unique user in the dataset (for each cutoff $k$). Therefore statistical comparisons can be used to find the better-performing model. 
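
The two definitions above translate directly into code. The following is a minimal sketch for a single user; the function name `precision_recall_at_k` and the example items are assumptions for illustration.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Per-user precision and recall for the top-k recommendations.

    recommended: ranked list of item ids, best first (the vector p_k)
    relevant:    set of items the user rated positively in the test set (a)
    """
    top_k = recommended[:k]
    hits = len(set(top_k) & set(relevant))
    precision = hits / float(k)
    recall = hits / float(len(relevant)) if relevant else 0.0
    return precision, recall

# Hypothetical example: five recommendations, four positively rated test items
recommended = ["A", "B", "C", "D", "E"]
relevant = {"B", "E", "X", "Y"}

p3, r3 = precision_recall_at_k(recommended, relevant, 3)
p5, r5 = precision_recall_at_k(recommended, relevant, 5)
```

Raising $k$ can only keep or increase recall, while precision may go either way, which is why the summary tables report both per cutoff.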

In [8]:
rmse_results['precision_recall_by_user']

user,cutoff,precision,recall,count
Alan Smith,1,0.0,0.0,44
Alan Smith,2,0.0,0.0,44
Alan Smith,3,0.0,0.0,44
Alan Smith,4,0.0,0.0,44
Alan Smith,5,0.0,0.0,44
Alan Smith,6,0.0,0.0,44
Alan Smith,7,0.0,0.0,44
Alan Smith,8,0.0,0.0,44
Alan Smith,9,0.0,0.0,44
Alan Smith,10,0.0,0.0,44


In [9]:
import graphlab.aggregate as agg

# we will be using these aggregations
agg_list = [agg.AVG('precision'),agg.STD('precision'),agg.AVG('recall'),agg.STD('recall')]

# apply these functions to each group (we group the results by 'cutoff', i.e. k)
# the cutoff is the number of top items to consider; see the following URL for the actual equation
# https://dato.com/products/create/docs/generated/graphlab.recommender.util.precision_recall_by_user.html#graphlab.recommender.util.precision_recall_by_user
rmse_results['precision_recall_by_user'].groupby('cutoff',agg_list)

# the groups are not sorted

cutoff,Avg of precision,Stdv of precision,Avg of recall,Stdv of recall
36,0.0108333333333,0.0199903526115,0.00936920726232,0.0194467870848
2,0.045,0.143090880213,0.00210361491463,0.00818731347399
46,0.0102173913043,0.0161089743566,0.0120366137415,0.0218091107228
31,0.0122580645161,0.023162997511,0.00923587392898,0.0194648510818
26,0.0130769230769,0.0267687886789,0.00746193048113,0.0170685839156
8,0.0225,0.0541410195693,0.00396347373884,0.0107982096549
5,0.03,0.0714142842854,0.00363025711886,0.0103872303323
16,0.015,0.033213325639,0.00474916494036,0.0118828709485
41,0.010487804878,0.0180075336208,0.0103778713532,0.0205000966979
4,0.035,0.0867467578645,0.00319547451016,0.00958950841791


These results are not great. Let's try something a little different and see whether the results get better. Let's start with collaborative filtering to model the user-item matrix. 

___
## Cross Validated Collaborative Filtering

In [10]:
rec1 = gl.recommender.ranking_factorization_recommender.create(train, 
                                  user_id="user", 
                                  item_id="movie", 
                                  target="rating")

rmse_results = rec1.evaluate(test)


Precision and recall summary statistics by cutoff
+--------+----------------+------------------+
| cutoff | mean_precision |   mean_recall    |
+--------+----------------+------------------+
|   1    |      0.17      | 0.00299702822056 |
|   2    |     0.165      | 0.00655627349961 |
|   3    |      0.16      | 0.0111244515241  |
|   4    |      0.16      | 0.0150458772687  |
|   5    |     0.148      | 0.0171983520484  |
|   6    | 0.141666666667 | 0.0206636101303  |
|   7    | 0.131428571429 |  0.021766238041  |
|   8    |    0.13375     |  0.025625666838  |
|   9    | 0.134444444444 | 0.0283592316798  |
|   10   |     0.131      | 0.0311491054711  |
+--------+----------------+------------------+
[10 rows x 3 columns]

('\nOverall RMSE: ', 1.7271467332884194)

Per User RMSE (best)
+---------------+-------+----------------+
|      user     | count |      rmse      |
+---------------+-------+----------------+
| Griffin Smith |   27  | 0.882389367546 |
+---------------+-------+--------

In [11]:
rmse_results['precision_recall_by_user'].groupby('cutoff',[agg.AVG('precision'),agg.STD('precision'),agg.AVG('recall'),agg.STD('recall')])

cutoff,Avg of precision,Stdv of precision,Avg of recall,Stdv of recall
36,0.118055555556,0.0958534600443,0.0974187676501,0.0703924178075
2,0.165,0.300457983751,0.00655627349961,0.0143059035364
46,0.110869565217,0.0881708686551,0.115468625561,0.07235073341
31,0.121935483871,0.102373497481,0.0858717163917,0.0647644593016
26,0.125384615385,0.104996477822,0.0754942245077,0.0614768412593
8,0.13375,0.157375943206,0.025625666838,0.0334362811243
5,0.148,0.20123617965,0.0171983520484,0.0253049517641
16,0.12375,0.121185549881,0.0465722780621,0.0478052410924
41,0.113902439024,0.0913937946416,0.106767967381,0.0692225988265
4,0.16,0.230542837668,0.0150458772687,0.0234546883319


___
Okay, precision and recall are getting better, but the overall RMSE got worse (about 1.73 versus about 1.26 for the item-item model), so we might need to tweak the model by regularizing.
Remember that we need to come up with a good estimate of the latent factors, and we need the matrix reconstructed from those factors to be a good estimate of the given ratings. We can control some of the parameters using regularization constants and by increasing or decreasing the number of latent factors.
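
To illustrate what the number of latent factors and the regularization constant control, here is a minimal sketch of plain regularized matrix factorization trained with stochastic gradient descent. It is not the ranking factorization model used in this notebook (there is no ranking term and no linear terms), and the data, the function names, and the `learning_rate` and `epochs` settings are assumptions for illustration.

```python
import math
import random

# Hypothetical (user, item, rating) triples for illustration only
ratings = [
    (0, 0, 5.0), (0, 1, 3.0), (0, 2, 1.0),
    (1, 0, 4.0), (1, 2, 1.0),
    (2, 1, 2.0), (2, 2, 5.0),
]
num_users, num_items = 3, 3

def rmse(ratings, mu, P, Q):
    """Training RMSE of predictions mu + P[u] . Q[i]."""
    total = 0.0
    for u, i, r in ratings:
        pred = mu + sum(pu * qi for pu, qi in zip(P[u], Q[i]))
        total += (pred - r) ** 2
    return math.sqrt(total / len(ratings))

def factorize(ratings, num_factors=2, regularization=0.01,
              learning_rate=0.05, epochs=200, seed=0):
    """Fit user factors P and item factors Q by SGD on squared error
    plus an L2 penalty (regularization) on the factors."""
    rng = random.Random(seed)
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global mean rating
    P = [[rng.gauss(0, 0.1) for _ in range(num_factors)]
         for _ in range(num_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(num_factors)]
         for _ in range(num_items)]
    initial = rmse(ratings, mu, P, Q)
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - (mu + sum(pu * qi for pu, qi in zip(P[u], Q[i])))
            for f in range(num_factors):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step: fit the rating, shrink the factors
                P[u][f] += learning_rate * (err * qi - regularization * pu)
                Q[i][f] += learning_rate * (err * pu - regularization * qi)
    return mu, P, Q, initial, rmse(ratings, mu, P, Q)

mu, P, Q, rmse_before, rmse_after = factorize(ratings)
```

A larger `regularization` pulls the factors toward zero, which trades training error for better generalization; more factors increase capacity and the risk of overfitting.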

In [12]:
rec1 = gl.recommender.ranking_factorization_recommender.create(train, 
                                  user_id="user", 
                                  item_id="movie", 
                                  target="rating",
                                  num_factors=16,                 # override the default value
                                  regularization=1e-02,           # override the default value
                                  linear_regularization = 1e-3)   # override the default value

rmse_results = rec1.evaluate(test)


Precision and recall summary statistics by cutoff
+--------+-----------------+------------------+
| cutoff |  mean_precision |   mean_recall    |
+--------+-----------------+------------------+
|   1    |       0.14      | 0.00291074408546 |
|   2    |       0.1       | 0.00366679549581 |
|   3    | 0.0966666666667 | 0.00521308285449 |
|   4    |       0.11      | 0.00880874383823 |
|   5    |      0.106      | 0.0110975277486  |
|   6    |  0.111666666667 | 0.0148253837686  |
|   7    |       0.12      | 0.0177996836063  |
|   8    |     0.12125     | 0.0205013730449  |
|   9    |  0.124444444444 | 0.0240697975049  |
|   10   |      0.123      | 0.0273631452471  |
+--------+-----------------+------------------+
[10 rows x 3 columns]

('\nOverall RMSE: ', 1.035231464503951)

Per User RMSE (best)
+-------------+-------+----------------+
|     user    | count |      rmse      |
+-------------+-------+----------------+
| Devin Smith |   18  | 0.508582618684 |
+-------------+-------+-----

# Is this better than the item-item matrix?

In [13]:
comparison = gl.recommender.util.compare_models(test, [item_item, rec1])

PROGRESS: Evaluate model M0

Precision and recall summary statistics by cutoff


+--------+-----------------+------------------+
| cutoff |  mean_precision |   mean_recall    |
+--------+-----------------+------------------+
|   1    |       0.06      | 0.00162279520112 |
|   2    |       0.05      | 0.00239478050205 |
|   3    | 0.0366666666667 | 0.00263868294107 |
|   4    |      0.035      | 0.00323118879587 |
|   5    |      0.034      | 0.00387418802455 |
|   6    | 0.0283333333333 | 0.00387418802455 |
|   7    | 0.0242857142857 | 0.00387418802455 |
|   8    |     0.02125     | 0.00387418802455 |
|   9    | 0.0211111111111 | 0.00415990231027 |
|   10   |       0.02      | 0.00440380474929 |
+--------+-----------------+------------------+
[10 rows x 3 columns]

('\nOverall RMSE: ', 1.2578872524956006)

Per User RMSE (best)
+----------------+-------+----------------+
|      user      | count |      rmse      |
+----------------+-------+----------------+
| Jeremiah Smith |   37  | 0.574666049423 |
+----------------+-------+----------------+
[1 rows x 3 columns]


In [14]:
comparisonstruct = gl.compare(test,[item_item, rec1])

PROGRESS: Evaluate model M0

Precision and recall summary statistics by cutoff
+--------+-----------------+------------------+
| cutoff |  mean_precision |   mean_recall    |
+--------+-----------------+------------------+
|   1    |       0.06      | 0.00162279520112 |
|   2    |       0.05      | 0.00239478050205 |
|   3    | 0.0366666666667 | 0.00263868294107 |
|   4    |      0.035      | 0.00323118879587 |
|   5    |      0.034      | 0.00387418802455 |
|   6    | 0.0283333333333 | 0.00387418802455 |
|   7    | 0.0242857142857 | 0.00387418802455 |
|   8    |     0.02125     | 0.00387418802455 |
|   9    | 0.0211111111111 | 0.00415990231027 |
|   10   |       0.02      | 0.00440380474929 |
+--------+-----------------+------------------+
[10 rows x 3 columns]

PROGRESS: Evaluate model M1

Precision and recall summary statistics by cutoff
+--------+-----------------+------------------+
| cutoff |  mean_precision |   mean_recall    |
+--------+-----------------+------------------+
|  

In [15]:
gl.show_comparison(comparisonstruct,[item_item, rec1])

## Parameters, Parameters
There are so many parameters to search through here. It would be great if there were something we could do to vary the parameters automatically and search for the best ones...
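
As a generic illustration of what a parameter search has to enumerate, the sketch below expands a dictionary of candidate values into one parameter set per combination. The search space and the helper `expand_grid` are hypothetical, and this is not how GraphLab Create implements its search.

```python
import itertools

# Hypothetical search space: list values are searched, scalars are held fixed
search_space = {
    "num_factors": [8, 16, 32],
    "regularization": [1e-3, 1e-2],
    "target": "rating",
}

def expand_grid(space):
    """Yield one parameter dict per combination of the list-valued entries."""
    searched = sorted(k for k, v in space.items() if isinstance(v, list))
    fixed = {k: v for k, v in space.items() if not isinstance(v, list)}
    for values in itertools.product(*(space[k] for k in searched)):
        combo = dict(fixed)
        combo.update(zip(searched, values))
        yield combo

grid = list(expand_grid(search_space))
```

The number of combinations grows multiplicatively with each searched parameter, which is why capping the number of models trained is useful for larger spaces.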

In [16]:
params = {'user_id': 'user', 
          'item_id': 'movie', 
          'target': 'rating',
          'num_factors': [8, 12, 16, 24, 32], 
          'regularization':[0.001] ,
          'linear_regularization': [0.001]}

job = gl.model_parameter_search.create( (train,test),
        gl.recommender.ranking_factorization_recommender.create,
        params,
        max_models=5,
        environment=None)

# also note that this evaluator supports sklearn
# https://dato.com/products/create/docs/generated/graphlab.toolkits.model_parameter_search.create.html?highlight=model_parameter_search

2016-08-03 14:00:17,928 [INFO] graphlab.deploy.job, 22: Validating job.
2016-08-03 14:00:17,951 [INFO] graphlab.deploy.job, 36: Creating a LocalAsync environment called 'async'.
2016-08-03 14:00:17,966 [INFO] graphlab.deploy.map_job, 186: Validation complete. Job: 'Model-Parameter-Search-Aug-03-2016-14-00-1700000' ready for execution
2016-08-03 14:00:18,032 [INFO] graphlab.deploy.map_job, 192: Job: 'Model-Parameter-Search-Aug-03-2016-14-00-1700000' scheduled.
2016-08-03 14:00:30,126 [INFO] graphlab.deploy.job, 22: Validating job.
2016-08-03 14:00:30,130 [INFO] graphlab.deploy.map_job, 220: A job with name 'Model-Parameter-Search-Aug-03-2016-14-00-1700000' already exists. Renaming the job to 'Model-Parameter-Search-Aug-03-2016-14-00-1700000-e3c9a'.
2016-08-03 14:00:30,135 [INFO] graphlab.deploy.map_job, 186: Validation complete. Job: 'Model-Parameter-Search-Aug-03-2016-14-00-1700000-e3c9a' ready for execution
2016-08-03 14:00:30,191 [INFO] graphlab.deploy.map_job, 192: Job: 'Model-Param

In [17]:
job.get_status()

{'Canceled': 0, 'Completed': 0, 'Failed': 0, 'Pending': 5, 'Running': 0}

In [18]:
job_result = job.get_results()

job_result.head()

model_id,item_id,linear_regularization,max_iterations,num_factors,num_sampled_negative_examples,ranking_regularization
1,movie,0.001,25,16,8,0.1
0,movie,0.001,25,24,4,0.5
3,movie,0.001,25,8,8,0.1
2,movie,0.001,50,16,8,0.5
4,movie,0.001,50,8,4,0.1

regularization,target,user_id,training_precision@5,training_recall@5,training_rmse,validation_precision@5
0.001,rating,user,0.343113772455,0.00869091370757,0.964098834932,0.106
0.001,rating,user,0.343113772455,0.00869091370757,1.13933384162,0.114
0.001,rating,user,0.343113772455,0.00869091370757,0.964081584985,0.108
0.001,rating,user,0.359281437126,0.00899094052941,1.16103049893,0.122
0.001,rating,user,0.343113772455,0.00869091370757,0.959084075047,0.104

validation_recall@5,validation_rmse
0.0106330173105,0.977398134229
0.0119296089524,1.14548730295
0.0108655754501,0.977468225611
0.0130034775575,1.16375162021
0.0106855373368,0.973267103979


In [19]:
bst_prms = job.get_best_params()
bst_prms

{'item_id': 'movie',
 'linear_regularization': 0.001,
 'max_iterations': 50,
 'num_factors': 8,
 'num_sampled_negative_examples': 4,
 'ranking_regularization': 0.1,
 'regularization': 0.001,
 'target': 'rating',
 'user_id': 'user'}
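
Which model counts as "best" depends on the metric. The sketch below copies the validation columns from the `job_result` table above into plain Python and picks a winner by two different criteria. The variable names are assumptions. The lowest-RMSE row (model 4) has the same settings as the dictionary returned by `get_best_params()`, which suggests, though the output here does not confirm it, that the default selection used validation RMSE.

```python
# Validation metrics copied from the job_result table above
results = [
    {"model_id": 1, "validation_precision@5": 0.106,
     "validation_rmse": 0.977398134229},
    {"model_id": 0, "validation_precision@5": 0.114,
     "validation_rmse": 1.14548730295},
    {"model_id": 3, "validation_precision@5": 0.108,
     "validation_rmse": 0.977468225611},
    {"model_id": 2, "validation_precision@5": 0.122,
     "validation_rmse": 1.16375162021},
    {"model_id": 4, "validation_precision@5": 0.104,
     "validation_rmse": 0.973267103979},
]

# Lower is better for RMSE, higher is better for precision
best_by_rmse = min(results, key=lambda row: row["validation_rmse"])
best_by_precision = max(results,
                        key=lambda row: row["validation_precision@5"])
```

Here the two criteria disagree: model 4 has the lowest validation RMSE, while model 2 has the highest validation precision at 5, so the choice should follow the business case.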

In [20]:
models = job.get_models()
models

[Class                           : RankingFactorizationRecommender
 
 Schema
 ------
 User ID                         : user
 Item ID                         : movie
 Target                          : rating
 Additional observation features : 0
 Number of user side features    : 0
 Number of item side features    : 0
 
 Statistics
 ----------
 Number of observations          : 77009
 Number of users                 : 334
 Number of items                 : 7483
 
 Training summary
 ----------------
 Training time                   : 4.4224
 
 Model Parameters
 ----------------
 Model class                     : RankingFactorizationRecommender
 num_factors                     : 24
 binary_target                   : 0
 side_data_factorization         : 1
 solver                          : auto
 nmf                             : 0
 max_iterations                  : 25
 
 Regularization Settings
 -----------------------
 regularization                  : 0.001
 regularization_type          

In [21]:
comparisonstruct = gl.compare(test,models)
gl.show_comparison(comparisonstruct,models)

PROGRESS: Evaluate model M0

Precision and recall summary statistics by cutoff
+--------+----------------+------------------+
| cutoff | mean_precision |   mean_recall    |
+--------+----------------+------------------+
|   1    |      0.15      | 0.00305360122832 |
|   2    |      0.1       | 0.00366679549581 |
|   3    | 0.106666666667 | 0.00594223554843 |
|   4    |      0.11      | 0.00909281906628 |
|   5    |     0.114      | 0.0119296089524  |
|   6    | 0.118333333333 | 0.0144172264529  |
|   7    | 0.122857142857 | 0.0187525608118  |
|   8    |     0.1225     | 0.0209284730618  |
|   9    | 0.128888888889 | 0.0247432209283  |
|   10   |     0.125      | 0.0271196247477  |
+--------+----------------+------------------+
[10 rows x 3 columns]

PROGRESS: Evaluate model M1

Precision and recall summary statistics by cutoff
+--------+----------------+------------------+
| cutoff | mean_precision |   mean_recall    |
+--------+----------------+------------------+
|   1    |      0.13

In [22]:
models[2]

Class                           : RankingFactorizationRecommender

Schema
------
User ID                         : user
Item ID                         : movie
Target                          : rating
Additional observation features : 0
Number of user side features    : 0
Number of item side features    : 0

Statistics
----------
Number of observations          : 77009
Number of users                 : 334
Number of items                 : 7483

Training summary
----------------
Training time                   : 5.8533

Model Parameters
----------------
Model class                     : RankingFactorizationRecommender
num_factors                     : 16
binary_target                   : 0
side_data_factorization         : 1
solver                          : auto
nmf                             : 0
max_iterations                  : 50

Regularization Settings
-----------------------
regularization                  : 0.001
regularization_type             : normal
linear_regularization  