# **TuriCreate Recommender Model**
This notebook walks through an example of importing data, training a recommender model, and evaluating it on a validation set.

In [1]:
import turicreate as tc
import pandas as pd
import time
from skafossdk import Skafos
import common.save_models as sm



In [2]:
ska = Skafos()

2018-11-26 15:58:13,153 - skafossdk.data_engine - INFO - Connecting to DataEngine
2018-11-26 15:58:13,184 - skafossdk.data_engine - INFO - DataEngine Connection Opened


## **The Data**
A recommender model takes two main data inputs:
- **items**: the things we want to recommend to a given user, e.g. apples
- **actions**: the actions users have taken on items, e.g. John bought apples

In this example, the items are movies and the actions are users' ratings of those movies.
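
To make the two inputs concrete, here is a minimal pure-Python sketch of how they relate. The rows below are made-up toy values, not taken from the MovieLens files, and the `title` column is an assumption about the items table; only `userId`, `movieId`, and `rating` appear in the output shown later in this notebook.

```python
# Hypothetical miniature versions of the two inputs (toy values for illustration only).
items = [
    {"movieId": 1, "title": "Movie A"},
    {"movieId": 2, "title": "Movie B"},
]
actions = [
    {"userId": 10, "movieId": 1, "rating": 4.0},
    {"userId": 10, "movieId": 2, "rating": 2.5},
    {"userId": 11, "movieId": 1, "rating": 5.0},
]

# Look up each action's item to see which titles every user has rated.
titles = {item["movieId"]: item["title"] for item in items}
rated_by_user = {}
for action in actions:
    rated_by_user.setdefault(action["userId"], []).append(titles[action["movieId"]])

print(rated_by_user)
```

The shared `movieId` key is what ties an action back to the item it refers to.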

In [3]:
%%capture
# load the ratings (actions) and the movies (items) into SFrames
actions = tc.SFrame.read_csv('https://s3.amazonaws.com/skafos.example.data/MovieLensDataset/ml-20m/ratings.csv')
items = tc.SFrame.read_csv('https://s3.amazonaws.com/skafos.example.data/MovieLensDataset/ml-20m/movies.csv')

In [4]:
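# keep only the ratings from a random sample of 10,000 users
# (the sample is unseeded; pass random_state=<int> to .sample() for a reproducible subset)
pd_actions = actions.to_dataframe()
sample_user_ids = pd.Series(pd_actions['userId'].unique()).sample(10000)
actions = tc.SFrame(pd_actions[pd_actions['userId'].isin(sample_user_ids)])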
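
The cell above samples users rather than individual ratings, so every retained user keeps their full rating history. The following is a minimal pure-Python sketch of that idea with a fixed seed for reproducibility. The function name `sample_user_ids_seeded` and the toy `user_column` are hypothetical and not part of the notebook.

```python
import random

def sample_user_ids_seeded(user_ids, n, seed=0):
    """Pick up to n distinct user IDs, reproducibly for a given seed."""
    unique_ids = sorted(set(user_ids))  # sort so the result does not depend on input order
    rng = random.Random(seed)
    return set(rng.sample(unique_ids, min(n, len(unique_ids))))

# Toy stand-in for the userId column: one entry per rating.
user_column = [1, 1, 2, 3, 3, 3, 4, 5]
kept = sample_user_ids_seeded(user_column, n=3, seed=42)

# Keep every rating that belongs to a sampled user.
filtered = [user for user in user_column if user in kept]
print(sorted(kept), filtered)
```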

In [5]:
%%capture
# split the data into training and validation sets, holding out items per user
training_data, validation_data = tc.recommender.util.random_split_by_user(actions, 'userId', 'movieId')
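
As a rough illustration of what a per-user split does, the sketch below picks a subset of users and moves a fraction of each chosen user's rows into the held-out set, leaving everything else for training. It is a pure-Python approximation and not Turi Create's implementation. The parameter names mirror the `max_num_users` and `item_test_proportion` arguments described in the Turi Create documentation (which, to the best of my knowledge, default to 1000 users and 0.2), so check them against your installed version.

```python
import random

def split_by_user(rows, max_num_users=2, item_test_proportion=0.5, seed=0):
    """Hold out a random share of rows for a random subset of users."""
    rng = random.Random(seed)
    users = sorted({row["userId"] for row in rows})
    test_users = set(rng.sample(users, min(max_num_users, len(users))))
    train, test = [], []
    for row in rows:
        # Only rows of the chosen users can land in the held-out set.
        if row["userId"] in test_users and rng.random() < item_test_proportion:
            test.append(row)
        else:
            train.append(row)
    return train, test

# Toy ratings: 4 users with 5 rated items each.
rows = [{"userId": u, "movieId": m} for u in range(1, 5) for m in range(100, 105)]
train, test = split_by_user(rows)
print(len(train), len(test))
```

Because held-out users still have some rows in the training set, the model can produce recommendations for them and be scored against the rows it never saw.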

In [6]:
%%capture 
# build the recommender
model = tc.recommender.create(training_data, 'userId', 'movieId')

In [7]:
%%capture
# generate recommendations for the users seen in training
results = model.recommend()

In [8]:
# print the first 10 rows of the validation data
validation_data.print_rows(num_rows=10)

+--------+---------+--------+------------+
| userId | movieId | rating | timestamp  |
+--------+---------+--------+------------+
|  369   |    44   |  4.0   | 1130282278 |
|  369   |   1801  |  2.5   | 1130282941 |
|  369   |   1831  |  4.0   | 1130282735 |
|  369   |   2161  |  4.0   | 1130282360 |
|  369   |   2724  |  2.5   | 1130282878 |
|  369   |   3273  |  0.5   | 1130282851 |
|  369   |   3977  |  4.0   | 1130282236 |
|  369   |   4447  |  4.5   | 1130282964 |
|  369   |   5313  |  3.5   | 1130282887 |
|  369   |   5418  |  4.5   | 1130282343 |
+--------+---------+--------+------------+
[30434 rows x 4 columns]



In [9]:
# evaluate the model
model.evaluate(validation_data)





Precision and recall summary statistics by cutoff
+--------+---------------------+----------------------+
| cutoff |    mean_precision   |     mean_recall      |
+--------+---------------------+----------------------+
|   1    | 0.10800000000000004 | 0.002939350731763183 |
|   2    | 0.10649999999999996 | 0.006892096810675018 |
|   3    | 0.10966666666666673 | 0.012802113352985577 |
|   4    | 0.11649999999999994 | 0.019536039326366526 |
|   5    |  0.1194000000000001 |  0.0259580000104425  |
|   6    | 0.12349999999999992 | 0.03276071423429209  |
|   7    | 0.12757142857142842 | 0.04100380681337541  |
|   8    | 0.13400000000000006 | 0.05158009499066049  |
|   9    | 0.13899999999999987 | 0.06118391108310208  |
|   10   | 0.14099999999999996 | 0.06971178108248231  |
+--------+---------------------+----------------------+
[10 rows x 3 columns]



{'precision_recall_by_user': Columns:
 	userId	int
 	cutoff	int
 	precision	float
 	recall	float
 	count	int
 
 Rows: 18000
 
 Data:
 +--------+--------+-----------+--------+-------+
 | userId | cutoff | precision | recall | count |
 +--------+--------+-----------+--------+-------+
 |  369   |   1    |    0.0    |  0.0   |   17  |
 |  369   |   2    |    0.0    |  0.0   |   17  |
 |  369   |   3    |    0.0    |  0.0   |   17  |
 |  369   |   4    |    0.0    |  0.0   |   17  |
 |  369   |   5    |    0.0    |  0.0   |   17  |
 |  369   |   6    |    0.0    |  0.0   |   17  |
 |  369   |   7    |    0.0    |  0.0   |   17  |
 |  369   |   8    |    0.0    |  0.0   |   17  |
 |  369   |   9    |    0.0    |  0.0   |   17  |
 |  369   |   10   |    0.0    |  0.0   |   17  |
 +--------+--------+-----------+--------+-------+
 [18000 rows x 5 columns]
 Note: Only the head of the SFrame is printed.
 You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.,
 'precisio
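
To help read the tables above: for one user, precision at cutoff k is the share of the top-k recommendations that appear in that user's held-out items, and recall at cutoff k is the share of the held-out items that appear in the top k. The summary table averages these per-user values. Below is a minimal sketch of the per-user calculation, using made-up recommendation and held-out lists rather than output from the model.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Compute precision@k and recall@k for a single user."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical ranked recommendations and held-out items for one user.
recommended = [50, 44, 7, 1801, 99]
relevant = {44, 1801, 1831, 2161}

p, r = precision_recall_at_k(recommended, relevant, k=5)
print(p, r)  # 2 hits out of 5 recommendations; 2 of the 4 held-out items found
```

A user whose top-k list contains none of their held-out items scores 0.0 on both metrics at that cutoff, as in the rows shown for user 369.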

In [10]:
# export to Core ML
coreml_model_name = "recommender.mlmodel"
res = model.export_coreml(coreml_model_name)

# compress the model
compressed_model_name, compressed_model = sm.compress_model(coreml_model_name)

This model is exported as a custom Core ML model. In order to use it in your
application, you must also include "libRecommender.dylib". For additional
details see:
https://apple.github.io/turicreate/docs/userguide/recommender/coreml-deployment.html
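
The `sm.compress_model` helper comes from this project's own `common.save_models` module, whose source is not shown in this notebook. Judging only by the saved name `recommender.mlmodel.gzip`, it presumably gzip-compresses the exported file. The sketch below shows that kind of step with the standard library; the function `gzip_file` and the placeholder file contents are hypothetical, and the real helper may differ.

```python
import gzip
import os
import shutil
import tempfile

def gzip_file(src_path):
    """Write a gzip-compressed copy of src_path next to it and return its path."""
    dst_path = src_path + ".gzip"
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return dst_path

# Stand-in for an exported model file (placeholder bytes, not a real Core ML model).
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "recommender.mlmodel")
payload = b"placeholder bytes " * 100
with open(src_path, "wb") as f:
    f.write(payload)

dst_path = gzip_file(src_path)

# Round-trip check: decompressing yields the original bytes.
with gzip.open(dst_path, "rb") as f:
    restored = f.read()
print(os.path.basename(dst_path), restored == payload)
```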


In [11]:
# save to Skafos
sm.skafos_save_model(skafos = ska, model_name = compressed_model_name,
                     compressed_model = compressed_model,
                     permissions = 'public')

{'data': {'name': 'recommender.mlmodel.gzip', 'version': '1543248085661', 'tags': ['latest'], 'deployment_id': '59d7f942-986c-40d6-a453-81ec35f9b8fb', 'job_id': '', 'project_token': '101d675b749e5cfa9c08f65f', 'inserted_at': '2018-11-26T16:01:26Z'}, 'success': True, 'final': True}
