<i>Copyright (c) Microsoft Corporation. All rights reserved.</i>

<i>Licensed under the MIT License.</i>

# Neural Collaborative Filtering on the MovieLens dataset

Neural Collaborative Filtering (NCF) is a well-known recommendation algorithm that generalizes matrix factorization by modeling user-item interactions with a multi-layer perceptron.
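To make the idea concrete, here is a minimal NumPy sketch of a NeuMF-style forward pass, which combines an element-wise product of embeddings (the matrix factorization part) with a small multi-layer perceptron. It is an illustration only, not the `reco_utils` implementation: the weights are random, biases are omitted, and the embedding sizes and the helper name `neumf_score` are assumptions made for this example.

```python
import numpy as np

rng = np.random.RandomState(0)

n_users, n_items, n_factors = 5, 7, 4   # toy sizes, chosen for illustration
layer_sizes = [16, 8, 4]                # widths of the MLP tower

# Separate embedding tables for the GMF and MLP branches (random, untrained).
gmf_user = rng.normal(scale=0.1, size=(n_users, n_factors))
gmf_item = rng.normal(scale=0.1, size=(n_items, n_factors))
# Assumption: user and item embeddings each fill half of the first MLP layer.
mlp_user = rng.normal(scale=0.1, size=(n_users, layer_sizes[0] // 2))
mlp_item = rng.normal(scale=0.1, size=(n_items, layer_sizes[0] // 2))

# Dense layers of the MLP tower: 16 -> 8 -> 4 (biases omitted for brevity).
mlp_weights = [
    rng.normal(scale=0.1, size=(n_in, n_out))
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
# The output layer maps the concatenated [GMF, MLP] vector to a single logit.
out_weights = rng.normal(scale=0.1, size=(n_factors + layer_sizes[-1],))


def neumf_score(user, item):
    """Hypothetical helper: score one (user, item) pair in the range (0, 1)."""
    # GMF branch: element-wise product of embeddings, as in matrix factorization.
    gmf = gmf_user[user] * gmf_item[item]
    # MLP branch: concatenate embeddings and pass them through ReLU layers.
    hidden = np.concatenate([mlp_user[user], mlp_item[item]])
    for weights in mlp_weights:
        hidden = np.maximum(hidden.dot(weights), 0.0)
    # Fuse both branches and squash the logit with a sigmoid.
    logit = np.concatenate([gmf, hidden]).dot(out_weights)
    return 1.0 / (1.0 + np.exp(-logit))


score = neumf_score(user=2, item=3)
```

If the output weights were fixed to one on the GMF part and zero on the MLP part, the logit would reduce to the dot product of the two embeddings, which is plain matrix factorization.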

This notebook provides an example of how to use and evaluate the NCF implementation in `reco_utils`. We use a smaller dataset in this example to run NCF efficiently with GPU acceleration on a [Data Science Virtual Machine](https://azure.microsoft.com/en-gb/services/virtual-machines/data-science-virtual-machines/).

In [11]:
import sys
sys.path.append("../../")
import time
import os
import pandas as pd
import numpy as np
import tensorflow as tf

from reco_utils.recommender.ncf.ncf_singlenode import NCF
from reco_utils.recommender.ncf.dataset import Dataset as NCFDataset
from reco_utils.dataset import movielens
from reco_utils.common.notebook_utils import is_jupyter
from reco_utils.dataset.python_splitters import python_chrono_split
from reco_utils.evaluation.python_evaluation import (rmse, mae, rsquared, exp_var, map_at_k, ndcg_at_k, precision_at_k, 
                                                     recall_at_k, get_top_k_items)

print("System version: {}".format(sys.version))
print("Pandas version: {}".format(pd.__version__))
print("Tensorflow version: {}".format(tf.__version__))

System version: 3.6.0 | packaged by conda-forge | (default, Feb  9 2017, 14:36:55) 
[GCC 4.8.2 20140120 (Red Hat 4.8.2-15)]
Pandas version: 0.23.4
Tensorflow version: 1.5.0


Set the default parameters.

In [12]:
# top k items to recommend
TOP_K = 10

# Select MovieLens data size: 100k, 1m, 10m, or 20m
MOVIELENS_DATA_SIZE = '100k'

### 1. Download the MovieLens dataset

In [13]:
df = movielens.load_pandas_df(
    size=MOVIELENS_DATA_SIZE,
    header=["userID", "itemID", "rating", "timestamp"]
)

### 2. Split the data using the Python chronological splitter provided in utilities

In [14]:
train, test = python_chrono_split(df, 0.75)
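
As a rough illustration of what a per-user chronological split does, the sketch below orders each user's interactions by timestamp and sends the earliest 75% to the training set. The helper `chrono_split_sketch` and the toy data are hypothetical, and the rounding rule is an assumption; the library's `python_chrono_split` may differ in its details.

```python
import pandas as pd


def chrono_split_sketch(df, ratio=0.75, col_user="userID", col_time="timestamp"):
    """Hypothetical helper: split each user's history by time."""
    ordered = df.sort_values([col_user, col_time])
    # 0-based position of each row within its user's time-ordered history.
    rank = ordered.groupby(col_user).cumcount()
    # Number of interactions per user, repeated on every row of that user.
    size = ordered.groupby(col_user)[col_time].transform("size")
    # Assumption: the training share is rounded to the nearest whole row.
    is_train = rank < (size * ratio).round()
    return ordered[is_train], ordered[~is_train]


toy = pd.DataFrame({
    "userID":    [1, 1, 1, 1, 2, 2, 2, 2],
    "itemID":    [10, 11, 12, 13, 10, 12, 14, 15],
    "rating":    [4, 5, 3, 2, 5, 4, 1, 3],
    "timestamp": [4, 1, 3, 2, 10, 40, 20, 30],
})
toy_train, toy_test = chrono_split_sketch(toy, ratio=0.75)
```

With four interactions per user and a ratio of 0.75, each user's three earliest interactions land in `toy_train` and the most recent one lands in `toy_test`.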

Generate an NCF dataset object from the data subsets.

In [15]:
data = NCFDataset(train=train, test=test, seed=123)

### 3. Train the NCF model on the training data, and get the top-k recommendations for our testing data

NCF is designed for implicit-feedback recommendation. For each user-item pair it produces a propensity score between 0 and 1 that indicates how strongly the item should be recommended to the user. A recommended item list can then be generated from these scores.
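Turning scores into a recommendation list amounts to sorting each user's candidate items by score and keeping the first k. The sketch below shows this with pandas on made-up scores; the data and the variable names are illustrative assumptions, not output from the model.

```python
import pandas as pd

# Made-up scores in [0, 1] for two users and three candidate items.
scores = pd.DataFrame({
    "userID": [1, 1, 1, 2, 2, 2],
    "itemID": [10, 11, 12, 10, 11, 12],
    "prediction": [0.91, 0.15, 0.47, 0.08, 0.66, 0.73],
})

k = 2
# Sort by descending score within each user, then keep the first k rows per user.
top_k = (
    scores.sort_values(["userID", "prediction"], ascending=[True, False])
    .groupby("userID")
    .head(k)
    .reset_index(drop=True)
)
```

Here user 1 is recommended items 10 and 12, and user 2 is recommended items 12 and 11.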

In [17]:
model = NCF(
    n_users=data.n_users, 
    n_items=data.n_items,
    model_type="NeuMF",
    n_factors=4,
    layer_sizes=[16,8,4],
    n_epochs=200,
    batch_size=256,
    learning_rate=1e-3,
    verbose=10,
)

start_time = time.time()

model.fit(data)

train_time = time.time() - start_time

print("Took {} seconds for training.".format(train_time))

Training model: neumf
Epoch 10 [6.60s]: train_loss = 0.259253 
Epoch 20 [6.56s]: train_loss = 0.247687 
Epoch 30 [6.34s]: train_loss = 0.242632 
Epoch 40 [6.43s]: train_loss = 0.237161 
Epoch 50 [6.71s]: train_loss = 0.234029 
Epoch 60 [6.73s]: train_loss = 0.231767 
Epoch 70 [6.74s]: train_loss = 0.228516 
Epoch 80 [6.70s]: train_loss = 0.227151 
Epoch 90 [6.71s]: train_loss = 0.226980 
Epoch 100 [6.63s]: train_loss = 0.224559 
Epoch 110 [6.77s]: train_loss = 0.224069 
Epoch 120 [6.70s]: train_loss = 0.222753 
Epoch 130 [6.72s]: train_loss = 0.221978 
Epoch 140 [6.65s]: train_loss = 0.220951 
Epoch 150 [6.64s]: train_loss = 0.220079 
Epoch 160 [6.69s]: train_loss = 0.220264 
Epoch 170 [6.66s]: train_loss = 0.219022 
Epoch 180 [6.79s]: train_loss = 0.218591 
Epoch 190 [6.72s]: train_loss = 0.217091 
Epoch 200 [6.63s]: train_loss = 0.217476 
Took 1266.5468125343323 seconds for training.


In the movie recommendation scenario, movies a user has already seen should not be recommended again, so the user-item pairs that appear in the training set are removed from the predictions.

In [18]:
start_time = time.time()

# Score every (user, item) pair built from the users and items seen in training
users, items, preds = [], [], []
item = list(train.itemID.unique())
for user in train.userID.unique():
    user = [user] * len(item)  # repeat the user ID once per candidate item
    users.extend(user)
    items.extend(item)
    preds.extend(list(model.predict(user, item, is_list=True)))

all_predictions = pd.DataFrame(data={"userID": users, "itemID": items, "prediction": preds})

# Keep only pairs without a training rating, i.e. items the user has not seen
merged = pd.merge(train, all_predictions, on=["userID", "itemID"], how="outer")
all_predictions = merged[merged.rating.isnull()].drop('rating', axis=1)

test_time = time.time() - start_time
print("Took {} seconds for prediction.".format(test_time))

Took 13.207754135131836 seconds for prediction.
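
The seen-item filter used above can also be written as an anti-join. The sketch below shows that variant on toy data using the `indicator` option of pandas `merge`; the frames `seen` and `scored` are made up for illustration and stand in for `train` and the full prediction table.

```python
import pandas as pd

# Pairs the users have already interacted with (stand-in for the training set).
seen = pd.DataFrame({"userID": [1, 1, 2], "itemID": [10, 11, 12]})

# Scores for every candidate pair (stand-in for the model predictions).
scored = pd.DataFrame({
    "userID": [1, 1, 1, 2, 2, 2],
    "itemID": [10, 11, 12, 10, 11, 12],
    "prediction": [0.9, 0.8, 0.3, 0.2, 0.6, 0.7],
})

# A left merge with indicator=True marks each scored pair as "both" (seen)
# or "left_only" (unseen); keeping "left_only" drops the seen items.
flagged = scored.merge(seen, on=["userID", "itemID"], how="left", indicator=True)
unseen = flagged[flagged["_merge"] == "left_only"].drop(columns="_merge")
```

Only the pairs (1, 12), (2, 10) and (2, 11) remain in `unseen`, and no extra columns from the training frame are carried along.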


### 4. Evaluate how well NCF performs

Ranking metrics (MAP, NDCG, precision@k and recall@k) are used for evaluation, with k set to `TOP_K`.

In [19]:
eval_map = map_at_k(test, all_predictions, col_prediction='prediction', k=TOP_K)
eval_ndcg = ndcg_at_k(test, all_predictions, col_prediction='prediction', k=TOP_K)
eval_precision = precision_at_k(test, all_predictions, col_prediction='prediction', k=TOP_K)
eval_recall = recall_at_k(test, all_predictions, col_prediction='prediction', k=TOP_K)

print("MAP:\t%f" % eval_map,
      "NDCG:\t%f" % eval_ndcg,
      "Precision@K:\t%f" % eval_precision,
      "Recall@K:\t%f" % eval_recall, sep='\n')

MAP:	0.048396
NDCG:	0.193969
Precision@K:	0.172428
Recall@K:	0.097780
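
For intuition about the last two numbers, the following sketch computes precision@k and recall@k by hand for a single user. The helper `precision_recall_at_k` is hypothetical and uses the common textbook definitions, which may differ in detail from the `reco_utils` evaluation functions (for example in how users are averaged).

```python
def precision_recall_at_k(recommended, relevant, k):
    """Hypothetical helper for one user.

    recommended: item IDs ordered by descending score.
    relevant: set of held-out items the user actually interacted with.
    """
    # Count how many of the top-k recommendations are relevant.
    hits = len(set(recommended[:k]) & set(relevant))
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Top-3 recommendations are 12, 10, 15; only item 10 is among the 4 relevant items.
p, r = precision_recall_at_k(
    recommended=[12, 10, 15, 11], relevant={10, 11, 13, 14}, k=3
)
```

In this toy case one of the three recommended items is relevant, so precision@3 is 1/3, and one of the four relevant items is retrieved, so recall@3 is 1/4.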


In [None]:
if is_jupyter():
    # Record results with papermill for tests
    import papermill as pm
    pm.record("map", eval_map)
    pm.record("ndcg", eval_ndcg)
    pm.record("precision", eval_precision)
    pm.record("recall", eval_recall)
    pm.record("train_time", train_time)
    pm.record("test_time", test_time)