## Custom Recommender System

*Use Neural Collaborative Filtering to create movie recommendations*

#### Make sure you are using the following SageMaker configurations:
 - Kernel: Python 3.8, TensorFlow 2.6 CPU
 - Instance Type: ml.m5.large

#### Run all

 - If you are in a SageMaker Notebook instance, you can go to the **Cell** tab and choose **Run All**
 - If you are in SageMaker Studio, you can go to the **Run** tab and choose **Run All Cells**

#### Contents
1. Background
2. Data preparation
3. NCF network in TensorFlow 2.0
4. Perform model training using script mode
5. Deploy the trained model using Amazon SageMaker hosting services as an endpoint
6. Run inference using the model endpoint

#### Setup

This solution relies on a configuration file to run the provisioned AWS resources. Run the following cells to generate the config file.

In [3]:
import boto3
import os
import json

In [4]:
client = boto3.client('servicecatalog')
cwd = os.getcwd().split('/')
i = cwd.index('S3Downloads')
pp_name = cwd[i + 1]
pp = client.describe_provisioned_product(Name=pp_name)
record_id = pp['ProvisionedProductDetail']['LastSuccessfulProvisioningRecordId']
record = client.describe_record(Id=record_id)

# keep only the outputs that have both a key and a value
keys = [x['OutputKey'] for x in record['RecordOutputs'] if 'OutputKey' in x and 'OutputValue' in x]
values = [x['OutputValue'] for x in record['RecordOutputs'] if 'OutputKey' in x and 'OutputValue' in x]
stack_output = dict(zip(keys, values))

with open(f'/root/S3Downloads/{pp_name}/stack_outputs.json', 'w') as f:
    json.dump(stack_output, f)

In [5]:
import warnings
import json
import sagemaker

warnings.filterwarnings('ignore')
session = sagemaker.Session()

# load the stack outputs and close the file handle afterwards
with open("stack_outputs.json") as f:
    sagemaker_config = json.load(f)
role = sagemaker.get_execution_role()
solution_bucket = sagemaker_config["SolutionS3Bucket"]
region = sagemaker_config["AWSRegion"]
library_version = sagemaker_config["LibraryVersion"]
solution_name = sagemaker_config["SolutionName"]
bucket = sagemaker_config["S3Bucket"]
endpoint_name = sagemaker_config["SolutionPrefix"] + "-ncf-endpoint"

### Background
*This notebook is based on the [Building a customized recommender system in Amazon SageMaker](https://aws.amazon.com/blogs/machine-learning/building-a-customized-recommender-system-in-amazon-sagemaker/) blog post.*


[Recommender systems](https://en.wikipedia.org/wiki/Recommender_system) help you tailor customer experiences on online platforms. [Amazon Personalize](https://aws.amazon.com/personalize/) is an artificial intelligence and machine learning service that specializes in developing recommender system solutions. It automatically examines data, performs feature and algorithm selection, optimizes models based on data, and deploys and hosts models for real-time recommendation inference. However, if you need to access the weights for a trained model, you may need to build your recommender system from scratch. Use this solution to train and deploy a customized recommender system in TensorFlow 2.0, using a [Neural Collaborative Filtering](https://arxiv.org/abs/1708.05031) (NCF) (He et al., 2017) model on [Amazon SageMaker](https://aws.amazon.com/sagemaker/).

#### Understanding Neural Collaborative Filtering

A recommender system is a set of tools that helps provide users with a personalized experience by predicting user preference amongst a large number of options. Matrix factorization (MF) is a well-known approach to solving such a problem. Conventional MF solutions exploit explicit feedback in a linear fashion; explicit feedback consists of direct user preferences, such as ratings for movies on a five-star scale or binary preference on a product (like or not like). However, explicit feedback isn’t always present in datasets. NCF solves the absence of explicit feedback by only using implicit feedback, which is derived from user activity, such as clicks and views. In addition, NCF utilizes multi-layer perceptrons to introduce non-linearity into the solution.
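
To make the distinction between explicit and implicit feedback concrete, the following sketch turns a small table of explicit ratings into implicit feedback by treating every recorded interaction as a positive signal. The column names and the toy values are assumptions made for illustration only; they are not taken from the dataset used in this solution.

```python
import pandas as pd

# Hypothetical explicit-feedback table: one row per (user, item) rating.
explicit = pd.DataFrame({
    "user_id": [1, 1, 2, 3],
    "item_id": [10, 20, 10, 30],
    "rating": [5.0, 2.0, 4.0, 1.0],
})

# Implicit view: discard the rating value and keep only the fact that
# an interaction happened, recorded as a positive label of 1.
implicit = explicit[["user_id", "item_id"]].assign(label=1)

print(implicit)
```

Because every observed pair becomes a positive label, the negative examples have to come from somewhere else, which is why a negative sampling step is needed before training.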

#### Architecture overview

An NCF model contains two intrinsic sets of network layers: embedding and NCF layers. You use these layers to build a neural matrix factorization solution with two separate network architectures, generalized matrix factorization (GMF) and multi-layer perceptron (MLP), whose outputs are then concatenated as input for the final output layer. The following diagram from the original paper illustrates this architecture.

<img src="docs/ncf-architecture.jpeg" align="center"/>
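
The two-branch structure described above can be sketched as a forward pass in plain NumPy. This is only an illustration with randomly initialized weights: the layer sizes, the single hidden layer, and every variable name are assumptions, and the training script `ncf.py` may be organized differently.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
n_user, n_item, dim = 4, 5, 8

# Separate embedding tables for the GMF branch and the MLP branch.
gmf_user = rng.normal(scale=0.1, size=(n_user, dim))
gmf_item = rng.normal(scale=0.1, size=(n_item, dim))
mlp_user = rng.normal(scale=0.1, size=(n_user, dim))
mlp_item = rng.normal(scale=0.1, size=(n_item, dim))

# One hidden layer for the MLP branch, plus the final output layer.
w_hidden = rng.normal(scale=0.1, size=(2 * dim, dim))
b_hidden = np.zeros(dim)
w_out = rng.normal(scale=0.1, size=(2 * dim, 1))
b_out = np.zeros(1)


def ncf_forward(user_idx: np.ndarray, item_idx: np.ndarray) -> np.ndarray:
    """Return interaction probabilities for zero-based user and item indices."""
    # GMF branch: element-wise product of the user and item embeddings.
    gmf = gmf_user[user_idx] * gmf_item[item_idx]
    # MLP branch: concatenate the embeddings, then apply a ReLU layer.
    mlp_in = np.concatenate([mlp_user[user_idx], mlp_item[item_idx]], axis=1)
    mlp = np.maximum(mlp_in @ w_hidden + b_hidden, 0.0)
    # Concatenate both branches and apply a sigmoid output layer.
    logits = np.concatenate([gmf, mlp], axis=1) @ w_out + b_out
    probs = 1.0 / (1.0 + np.exp(-logits))
    return probs.ravel()


probs = ncf_forward(np.array([0, 3]), np.array([1, 4]))
print(probs.shape)
```

The GMF branch keeps the linear, matrix-factorization-style interaction, while the MLP branch supplies the non-linearity; the output layer sees both.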

### Data preparation

This solution uses the MovieLens dataset. [MovieLens](https://grouplens.org/datasets/movielens/) is a movie rating dataset provided by GroupLens, a research lab at the University of Minnesota. Run the following cells to download the dataset.

In [6]:
from sagemaker.s3 import S3Downloader

original_bucket = f"s3://{solution_bucket}-{region}/{library_version}/{solution_name}"
original_data_prefix = "artifacts/dataset"
original_data = f"{original_bucket}/{original_data_prefix}"
print("original data: ")
S3Downloader.list(original_data)

original data: 


['s3://sagemaker-solutions-prod-eu-central-1/0.2.0/Customized-recommender-system/1.0.0/artifacts/dataset/README.txt',
 's3://sagemaker-solutions-prod-eu-central-1/0.2.0/Customized-recommender-system/1.0.0/artifacts/dataset/inference.npy',
 's3://sagemaker-solutions-prod-eu-central-1/0.2.0/Customized-recommender-system/1.0.0/artifacts/dataset/ratings.csv']

In [7]:
!aws s3 cp --recursive $original_data data

download: s3://sagemaker-solutions-prod-eu-central-1/0.2.0/Customized-recommender-system/1.0.0/artifacts/dataset/README.txt to data/README.txt
download: s3://sagemaker-solutions-prod-eu-central-1/0.2.0/Customized-recommender-system/1.0.0/artifacts/dataset/inference.npy to data/inference.npy
download: s3://sagemaker-solutions-prod-eu-central-1/0.2.0/Customized-recommender-system/1.0.0/artifacts/dataset/ratings.csv to data/ratings.csv


#### Explore the data

In [8]:
!cat data/README.txt

Summary

This dataset (ml-latest-small) describes 5-star rating and free-text tagging activity from [MovieLens](http://movielens.org), a movie recommendation service. It contains 100836 ratings and 3683 tag applications across 9742 movies. These data were created by 610 users between March 29, 1996 and September 24, 2018. This dataset was generated on September 26, 2018.

Users were selected at random for inclusion. All selected users had rated at least 20 movies. No demographic information is included. Each user is represented by an id, and no other information is provided.

The data are contained in the files `links.csv`, `movies.csv`, `ratings.csv` and `tags.csv`. More details about the contents and use of all these files follows.

This is a *development* dataset. As such, it may change over time and is not an appropriate dataset for shared research results. See available *benchmark* datasets if that is your intent.

This and other GroupLens data sets are publicly available for down

The model for this solution mainly uses `ratings.csv`, which contains four columns:
- `userId`
- `movieId`
- `rating`
- `timestamp`

#### Import dependencies

In [92]:
import sagemaker
import numpy as np
import pandas as pd

from typing import Tuple

#### Read data and perform train and test split

This solution uses `ratings.csv`, which contains explicit feedback data, as a proxy dataset to demonstrate the NCF solution. To fit this solution to your data, you need to define a metric to quantify a user rating and associate it with a data point.

In [161]:
# read the rating data
data_path = './data/actions.csv'
df = pd.read_csv(data_path)

students_unique, df_students = np.unique(df['student_id'], return_inverse=True)
df['student_id'] = df_students + 1

coaches_unique, df_coaches = np.unique(df['coach_id'], return_inverse=True)
df['coach_id'] = df_coaches + 1
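
The cell above relies on `np.unique(..., return_inverse=True)` to remap arbitrary identifiers to contiguous integers. The following sketch shows the mechanics on a made-up array; the raw values are hypothetical.

```python
import numpy as np

# Hypothetical raw identifiers with gaps and repeated values.
raw_ids = np.array([1007, 42, 1007, 999, 42])

# unique_ids holds the sorted distinct values; inverse gives, for each
# element of raw_ids, its position in unique_ids.
unique_ids, inverse = np.unique(raw_ids, return_inverse=True)

# Shift by one to obtain one-based identifiers, as in the cell above.
contiguous_ids = inverse + 1

print(unique_ids.tolist())      # [42, 999, 1007]
print(contiguous_ids.tolist())  # [3, 1, 3, 2, 1]

# The original identifiers can be recovered from the lookup array.
recovered = unique_ids[contiguous_ids - 1]
```

Keeping the lookup array (`students_unique` and `coaches_unique` in the notebook) makes it possible to translate predictions back to the original identifiers later.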


In [162]:
# view part of the dataset
df

Unnamed: 0,student_id,coach_id,relevance
0,1,1,1
1,2,1,0
2,3,1,5
3,4,1,5
4,1,2,0
5,2,2,5
6,3,2,0
7,4,2,0
8,1,3,5
9,2,3,1


In [163]:
# figure out how to best divide the training and testing data
max_holdout = df.groupby('student_id').coach_id.nunique().min()
print(f'Maximum number of hold out portion: {max_holdout}')

Maximum number of hold out portion: 5


To perform a training and testing split, hold out a fixed number of items per user (`holdout_num`) as the testing set and keep the rest as the training set:

In [164]:
def train_test_split(df: pd.DataFrame, holdout_num: int) -> Tuple[pd.DataFrame, pd.DataFrame]:
    """ perform training/testing split
    
    @param df: dataframe
    @param holdout_num: number of items to be held out
    
    @return df_train: training data
    @return df_test: testing data
    
    """
    # first sort the data by student id
    df = df.sort_values(
        by=['student_id'], 
        ascending=[True]
    )
    
    # perform deep copy on the dataframe to avoid modification on the original dataframe
    df_train = df.copy(deep=True)
    df_test = df.copy(deep=True)
    
    # get test set
    df_test = df_test.groupby(['student_id']) \
        .head(holdout_num) \
        .reset_index()
    
    # get train set
    df_train = df_train.merge(
        df_test[['student_id', 'coach_id']].assign(remove=1),
        how='left'
    ).query('remove != 1') \
        .drop(columns='remove') \
        .reset_index(drop=True)
    
    # sanity check to make sure we're not duplicating/losing data
    assert len(df) == len(df_train) + len(df_test)
    
    return df_train, df_test
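
As a compact alternative to the merge-based function above, the same per-user hold-out can be sketched with `groupby(...).cumcount()`. This is not the notebook's implementation, only an illustration on a made-up table that reuses the column names from the cells above.

```python
import pandas as pd

# Hypothetical interaction table; column names follow the cells above.
toy = pd.DataFrame({
    "student_id": [1, 1, 1, 2, 2, 3],
    "coach_id":   [1, 2, 3, 1, 2, 1],
    "relevance":  [5, 0, 1, 4, 0, 2],
})

holdout_num = 1

# Number each row within its student group (0, 1, 2, ...).
position = toy.groupby("student_id").cumcount()

# The first holdout_num rows of every student form the test set.
is_test = position < holdout_num
toy_test = toy[is_test].reset_index(drop=True)
toy_train = toy[~is_test].reset_index(drop=True)

# Every row lands in exactly one of the two sets.
assert len(toy) == len(toy_train) + len(toy_test)
print(toy_test)
```

In this toy table, student 3 has a single interaction, so holding it out leaves that student with no training rows. That is the situation the `max_holdout` check earlier in the notebook is meant to detect.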

In [165]:
# df_train, df_test = train_test_split(df, holdout_num=1)
# print(df_test)

df_train = df.copy(deep=True)
df_test = df.copy(deep=True)

#### Perform negative sampling

Because a user rating of an item must be a positive label, there are no negative samples in the dataset. Negative samples are needed for model training. Therefore, we randomly sample `n` items from the unseen movie list for every user and record a rating of `0` to provide negative samples.

In [166]:
def negative_sampling(student_ids: list, coach_ids: list, items: list, n_neg: int) -> pd.DataFrame:
    """This function creates n_neg negative labels for every positive label
    
    @param student_ids: list of student ids
    @param coach_ids: list of coach ids
    @param items: unique list of coach ids
    @param n_neg: number of negative labels to sample
    
    @return df_neg: negative sample dataframe
    
    """
    
    neg = []
    ui_pairs = zip(student_ids, coach_ids)
    records = set(ui_pairs)
    
    # for every positive label case
    for (user_id, movie_id) in records:
        # generate n_neg negative labels
        for _ in range(n_neg):
            # randomly sample an item
            random_item = np.random.choice(items)
            # if the sampled item already exists for that user, resample
            while (user_id, random_item) in records:
                random_item = np.random.choice(items)
            neg.append([user_id, random_item, 0])
            
    # convert to pandas dataframe for concatenation later
    df_neg = pd.DataFrame(neg, columns=['student_id', 'coach_id', 'relevance'])
    
    return df_neg
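
The resampling loop above never terminates for a user who has interacted with every item, because no unseen item exists to draw. The following sketch is a hypothetical variant, not the notebook's function: it samples from each user's set of unseen items directly and draws at most `n_neg` negatives per user (rather than per positive pair), so it always finishes.

```python
import numpy as np
import pandas as pd


def sample_negatives(pairs, items, n_neg, seed=0):
    """Sample up to n_neg unseen items per user (hypothetical helper)."""
    rng = np.random.default_rng(seed)

    # Collect the items each user has already interacted with.
    seen = {}
    for user, item in pairs:
        seen.setdefault(user, set()).add(item)

    rows = []
    for user, user_items in seen.items():
        # Candidates are the items with no recorded interaction for this user.
        candidates = [i for i in items if i not in user_items]
        k = min(n_neg, len(candidates))
        if k == 0:
            continue  # this user has interacted with every item
        for item in rng.choice(candidates, size=k, replace=False):
            rows.append([user, int(item), 0])

    return pd.DataFrame(rows, columns=["student_id", "coach_id", "relevance"])


pairs = [(1, 1), (1, 2), (2, 1), (2, 2), (2, 3)]
neg = sample_negatives(pairs, items=[1, 2, 3], n_neg=2)
print(neg)
```

In this toy input, user 1 has one unseen item and receives a single negative sample, while user 2 has seen everything and is skipped.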

In [167]:
# create negative samples for training set
# neg_train = negative_sampling(
#     student_ids=df_train.student_id.values, 
#     coach_ids=df_train.coach_id.values,
#     items=df.coach_id.unique(),
#     n_neg=5
# )
neg_train = df_train #TODO: remove
print(neg_train)

    student_id  coach_id  relevance
0            1         1          1
1            2         1          0
2            3         1          5
3            4         1          5
4            1         2          0
5            2         2          5
6            3         2          0
7            4         2          0
8            1         3          5
9            2         3          1
10           3         3          0
11           4         3          0
12           1         4          5
13           2         4          4
14           3         4          0
15           4         4          0
16           1         5          0
17           2         5          2
18           3         5          0
19           4         5          0


In [168]:
print(f'created {neg_train.shape[0]:,} negative samples')

created 20 negative samples


In [169]:
# df_train = df_train[['student_id', 'coach_id']].assign(relevance=1)
# df_test = df_test[['student_id', 'coach_id']].assign(relevance=1)

# df_train = pd.concat([df_train, neg_train], ignore_index=True)

#### Calculate statistics

Explore your data by calculating the number of unique users and movies.

In [170]:
def get_unique_count(df: pd.DataFrame) -> Tuple[int, int]:
    """calculate unique user and movie counts"""
    return df.student_id.nunique(), df.coach_id.nunique()

In [171]:
# get unique number of users and movies in the whole dataset
get_unique_count(df)

(4, 5)

In [172]:
print(f'training set shape: {get_unique_count(df_train)}')
print(f'testing set shape: {get_unique_count(df_test)}')

training set shape: (4, 5)
testing set shape: (4, 5)


Next, calculate the number of unique users and the number of movies in your training data for later use.

In [173]:
# number of unique user and number of unique item/movie
n_user, n_item = get_unique_count(df_train)

print('number of unique users: ', n_user)
print('number of unique items: ', n_item)

number of unique users:  4
number of unique items:  5


#### Preprocess data and upload to S3

Run the cells in this section to perform the training and testing split and negative sampling, and to store the processed data in [Amazon Simple Storage Service](https://aws.amazon.com/s3/) (Amazon S3):

In [174]:
# get current session region
session = boto3.session.Session()
region = session.region_name
print(f'currently in {region}')

currently in eu-central-1


Use the Amazon SageMaker session’s default bucket to store processed data. The format of the default bucket name is `sagemaker-{region}-{aws-account-id}`.

In [175]:
sagemaker_session = sagemaker.Session()
bucket_name = sagemaker_session.default_bucket()
print(f'bucket name: {bucket_name}') 

bucket name: sagemaker-eu-central-1-342301825291


Upload your data to your Amazon S3 bucket.

In [176]:
# save data locally first
dest = 'data/s3'
train_path = os.path.join(dest, 'train.npy')
test_path = os.path.join(dest, 'test.npy')

# create the local destination directory if it does not already exist
os.makedirs(dest, exist_ok=True)
np.save(train_path, df_train.values)
print(df_train.values)
np.save(test_path, df_test.values)

# upload to S3 bucket (see the bucket name above)
sagemaker_session.upload_data(train_path, key_prefix='data')
sagemaker_session.upload_data(test_path, key_prefix='data')

[[1 1 1]
 [2 1 0]
 [3 1 5]
 [4 1 5]
 [1 2 0]
 [2 2 5]
 [3 2 0]
 [4 2 0]
 [1 3 5]
 [2 3 1]
 [3 3 0]
 [4 3 0]
 [1 4 5]
 [2 4 4]
 [3 4 0]
 [4 4 0]
 [1 5 0]
 [2 5 2]
 [3 5 0]
 [4 5 0]]


's3://sagemaker-eu-central-1-342301825291/data/test.npy'

### Code NCF network in TensorFlow 2.0

In [177]:
# import requirements
import tensorflow as tf

from sagemaker import get_execution_role
from sagemaker.tensorflow import TensorFlow

# get current SageMaker session's execution role and default bucket name
sagemaker_session = sagemaker.Session()

role = get_execution_role()
print("execution role ARN:", role)

bucket_name = sagemaker_session.default_bucket()
print("default bucket name:", bucket_name)

execution role ARN: arn:aws:iam::342301825291:role/service-role/AmazonSageMaker-ExecutionRole-20230521T122971
default bucket name: sagemaker-eu-central-1-342301825291


In [178]:
# specify the location of the training data
training_data_uri = os.path.join(f's3://{bucket_name}', 'data')

In [179]:
# inspect the training script using pygmentize
!pygmentize 'ncf.py'

"""

 Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
 SPDX-License-Identifier: MIT-0

 Permission is hereby granted, free of charge, to any person obtaining a copy of this
 software and associated documentation files (the "Software"), to deal in the Software
 without restriction, including without limitation the rights to use, copy, modify,
 merge, publish, distribute, sublicense, and/or sell copies of the Software, and to
 permit persons to whom the Software is furnished to do so.

 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 HOLDERS BE LI

The output architecture should look like the following diagram.

<img src="docs/keras-diagram.jpeg" align="center"/>

### Perform model training using script mode

For more information on deploying a model you trained using an instance on Amazon SageMaker, see [Deploy trained Keras or TensorFlow models using Amazon SageMaker](https://aws.amazon.com/blogs/machine-learning/deploy-trained-keras-or-tensorflow-models-using-amazon-sagemaker/). This solution trains the model using script mode.

You first need to create a Python script that contains the model training code. The model architecture code is already available to you in the file `ncf.py`. This file also contains functions for you to load training and testing data. Run the following cell to define your training environment specifications and hyperparameters.

In [180]:
# specify training instance type and model hyperparameters - note that for demo purposes, the number of epochs is set to 1
instance_count = 1                 # number of instances to use for training
instance_type = 'ml.m5.2xlarge'    # type of instance to use for training

training_script = 'ncf.py'

training_parameters = {
    'epochs': 1,
    'batch_size': 256, 
    'n_user': n_user, 
    'n_item': n_item
}

# training framework specs
tensorflow_version = '2.5'
python_version = 'py37'
distributed_training_spec = {'parameter_server': {'enabled': True}}
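
In script mode, SageMaker passes the hyperparameters defined above to the entry point as command-line arguments and exposes data and model locations through environment variables such as `SM_CHANNEL_TRAINING` and `SM_MODEL_DIR`. The full contents of `ncf.py` are not shown here, so the following is only a sketch of how such a script might read them; the function name, argument names for the paths, and local fallback values are assumptions.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse hyperparameters the way a script-mode entry point might."""
    parser = argparse.ArgumentParser()

    # Hyperparameters arrive as "--name value" command-line arguments.
    parser.add_argument("--epochs", type=int, default=1)
    parser.add_argument("--batch_size", type=int, default=256)
    parser.add_argument("--n_user", type=int, required=True)
    parser.add_argument("--n_item", type=int, required=True)

    # Input and output locations come from environment variables set in
    # the training container; the fallbacks are for local experiments only.
    parser.add_argument("--train", type=str,
                        default=os.environ.get("SM_CHANNEL_TRAINING", "data"))
    parser.add_argument("--sm-model-dir", type=str,
                        default=os.environ.get("SM_MODEL_DIR", "model"))

    # Ignore any extra arguments that the platform may append.
    args, _ = parser.parse_known_args(argv)
    return args


args = parse_args(["--epochs", "1", "--batch_size", "256",
                   "--n_user", "4", "--n_item", "5"])
print(args.epochs, args.batch_size, args.n_user, args.n_item)
```

Using `parse_known_args` rather than `parse_args` keeps the script from failing when it receives arguments it does not declare.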

Initiate the training job using a TensorFlow estimator.

In [181]:
s3_prefix = "custom_recommender_system"
output_path = f"s3://{bucket_name}/{s3_prefix}/output"

ncf_estimator = TensorFlow(
    entry_point=training_script,
    role=role,
    instance_count=instance_count,
    instance_type=instance_type,
    framework_version=tensorflow_version,
    py_version=python_version,
    distribution=distributed_training_spec,
    hyperparameters=training_parameters,
    output_path=output_path
)

Kick off the training job.

In [182]:
ncf_estimator.fit(training_data_uri)

2023-05-21 19:01:53 Starting - Starting the training job...
2023-05-21 19:02:18 Starting - Preparing the instances for training
ProfilerReport-1684695713: InProgress
......
2023-05-21 19:03:18 Downloading - Downloading input data
2023-05-21 19:03:18 Training - Downloading the training image...
2023-05-21 19:03:38 Training - Training image download completed. Training in progress...
2023-05-21 19:03:58.319400: W tensorflow/core/profiler/internal/smprofiler_timeline.cc:460] Initializing the SageMaker Profiler.
2023-05-21 19:03:58.319550: W tensorflow/core/profiler/internal/smprofiler_timeline.cc:105] SageMaker Profiler is not enabled. The timeline writer thread will not be started, future recorded events will be dropped.
2023-05-21 19:03:58.345765: W tensorflow/core/profiler/internal/smprofiler_timeline.cc:460] Initializing the SageMaker Profiler.
2023-05-21 19:03:59,502 sagemaker-training-toolkit INFO     Imported framework sagemaker_tensorflow_container.tr

### Deploy the trained model using Amazon SageMaker hosting services as an endpoint

After the model is trained, you can deploy it using Amazon SageMaker hosting services. Here you deploy the model to one `ml.t2.medium` instance as a `tensorflow-serving` endpoint. This allows you to later invoke the endpoint as you would invoke TensorFlow Serving. For more information, see [Train and serve a TensorFlow model with TensorFlow Serving](https://www.tensorflow.org/tfx/tutorials/serving/rest_simple).

In [184]:
endpoint_name = sagemaker_config["SolutionPrefix"] + "neural-collab-filtering-endpoint"
model_name = sagemaker_config["SolutionPrefix"] + "neural-collab-filtering-model"

predictor = ncf_estimator.deploy(
    initial_instance_count=1, 
    instance_type="ml.t2.medium", 
    endpoint_name=endpoint_name,
    model_name=model_name,
)

update_endpoint is a no-op in sagemaker>=2.
See: https://sagemaker.readthedocs.io/en/stable/v2.html for details.
Using already existing model: sagemaker-soln-crs-js-xd5uponeural-collab-filtering-model


-------!

### Run inference using the model endpoint

To run inference using the endpoint on the testing set, invoke the model using [TensorFlow Serving](https://www.tensorflow.org/tfx/guide/serving):

In [142]:
# to use the endpoint in another notebook, initiate a predictor object as follows
from sagemaker.tensorflow import TensorFlowPredictor

predictor = TensorFlowPredictor(endpoint_name)

# define a function to read testing data
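def _load_testing_data(base_dir: str) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
    """ load testing data """
    df_test = np.load(os.path.join(base_dir, 'test.npy'))
    # remove column 0 along axis 1, leaving the user, item, and label columns
    df_test = np.delete(df_test, 0, 1)
    print(df_test)
    # transpose and flatten so that each column becomes one contiguous third of the array
    user_test, item_test, y_test = np.split(np.transpose(df_test).flatten(), 3)
    return user_test, item_test, y_test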

# read testing data from local
user_test, item_test, _ = _load_testing_data('./data/s3/')

# one-hot encode the testing data for model input
with tf.compat.v1.Session() as tf_sess:
    test_user_data = tf_sess.run(tf.one_hot(user_test, depth=n_user)).tolist()
    test_item_data = tf_sess.run(tf.one_hot(item_test, depth=n_item)).tolist()

[[1 1 1]
 [2 5 2]
 [3 2 0]
 [4 3 0]]
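
One-hot encoding with a depth of `n` only has slots for the indices `0` to `n - 1`; an index outside that range is encoded as an all-zero row. The following NumPy stand-in (a hypothetical helper, not the TensorFlow implementation) reproduces that behavior and shows why a one-based ID equal to the depth loses its signal.

```python
import numpy as np


def one_hot(indices: np.ndarray, depth: int) -> np.ndarray:
    """One-hot encode indices; out-of-range indices give all-zero rows."""
    indices = np.asarray(indices)
    encoded = np.zeros((indices.size, depth), dtype=float)
    # Only indices in [0, depth) have a slot to switch on.
    valid = (indices >= 0) & (indices < depth)
    encoded[np.nonzero(valid)[0], indices[valid]] = 1.0
    return encoded


ids = np.array([1, 2, 3, 4])          # one-based IDs, as in the test set above

one_based = one_hot(ids, depth=4)     # the row for ID 4 is all zeros
zero_based = one_hot(ids - 1, depth=4)  # every row contains exactly one 1

print(one_based)
print(zero_based)
```

Whether to shift the IDs to zero-based values or to increase the depth by one depends on how the model was trained, so treat either fix as an assumption to verify against the training script.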


In [149]:
# make batch prediction
batch_size = 100
y_pred = []
print(test_user_data)
print(test_item_data)
for idx in range(0, len(test_user_data), batch_size):
    # reformat test samples into a format acceptable to tensorflow serving
    input_vals = {
     "instances": [
         {'input_1': u, 'input_2': i} 
         for (u, i) in zip(test_user_data[idx:idx+batch_size], test_item_data[idx:idx+batch_size])
    ]}
 
    print(input_vals)

    # invoke model endpoint to run inference
    pred = predictor.predict(input_vals)
    
    print(pred)
    
    # store predictions
    y_pred.extend([i[0] for i in pred['predictions']])

[[0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0], [0.0, 0.0, 0.0, 0.0]]
[[0.0, 1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0, 0.0]]
{'instances': [{'input_1': [0.0, 1.0, 0.0, 0.0], 'input_2': [0.0, 1.0, 0.0, 0.0, 0.0]}, {'input_1': [0.0, 0.0, 1.0, 0.0], 'input_2': [0.0, 0.0, 0.0, 0.0, 0.0]}, {'input_1': [0.0, 0.0, 0.0, 1.0], 'input_2': [0.0, 0.0, 1.0, 0.0, 0.0]}, {'input_1': [0.0, 0.0, 0.0, 0.0], 'input_2': [0.0, 0.0, 0.0, 1.0, 0.0]}]}
{'predictions': [[0.509837568], [0.499842048], [0.516042471], [0.502878308]]}


The model output is a set of probabilities, ranging from 0 to 1, for each user-item pair that is specified for inference. To make final binary predictions, such as "like" or "dislike", you must apply a threshold. For demonstration purposes, this solution uses 0.5 as a threshold. If the predicted probability is equal to or greater than 0.5, then it is assumed that the user likes the movie. If the probability is less than 0.5, then it is assumed that the user dislikes the movie.

In [None]:
# let's see some prediction examples, assuming a threshold of 0.5
# --- prediction probability view ---
print('This is what the prediction output looks like')
print(y_pred[:5], end='\n\n\n')

# --- user item pair prediction view, with threshold of 0.5 applied ---
pred_df = pd.DataFrame([
    user_test,
    item_test,
    (np.array(y_pred) >= 0.5).astype(int)],
).T

pred_df.columns = ['userId', 'movieId', 'prediction']

print('We can convert the output to user-item pair as shown below')
print(pred_df.head(), end='\n\n\n')

# --- aggregated prediction view, by user ---
print('Lastly, we can roll up the prediction list by user and view it that way')
print(pred_df.query('prediction == 1').groupby('userId').movieId.apply(list).head().to_frame(), end='\n\n\n')
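
As a worked example of the thresholding step, the following sketch applies the 0.5 threshold to the four probabilities returned by the endpoint in the batch prediction output shown earlier. The variable names are illustrative.

```python
import numpy as np

# Probabilities copied from the endpoint response shown earlier.
y_prob = np.array([0.509837568, 0.499842048, 0.516042471, 0.502878308])

threshold = 0.5

# 1 means "like" (probability at or above the threshold), 0 means "dislike".
y_label = (y_prob >= threshold).astype(int)

print(y_label.tolist())  # [1, 0, 1, 1]
```

All four probabilities sit very close to 0.5, which is what you would expect from a model trained for a single epoch on a tiny dataset, so the resulting labels carry little information.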

## Conclusion

Designing a recommender system can be a challenging task that sometimes requires model customization. In this solution, you implemented, deployed, and invoked an NCF model from scratch in Amazon SageMaker. This work can serve as a foundation for you to start building more customized solutions with your own datasets.

For more information about using built-in Amazon SageMaker algorithms and [Amazon Personalize](https://aws.amazon.com/personalize/) to build recommender system solutions, see the following blog posts:

 - [Omnichannel personalization with Amazon Personalize](https://aws.amazon.com/blogs/machine-learning/omnichannel-personalization-with-amazon-personalize/)
 - [Creating a recommendation engine using Amazon Personalize](https://aws.amazon.com/blogs/machine-learning/creating-a-recommendation-engine-using-amazon-personalize/)
 - [Extending Amazon SageMaker factorization machines algorithms to predict top x recommendations](https://aws.amazon.com/blogs/machine-learning/extending-amazon-sagemaker-factorization-machines-algorithm-to-predict-top-x-recommendations/)
 - [Build a movie recommender with factorization machines on Amazon SageMaker](https://aws.amazon.com/blogs/machine-learning/build-a-movie-recommender-with-factorization-machines-on-amazon-sagemaker/)
 
You can further customize the Neural Collaborative Filtering network using Deep Matrix Factorization (Xue et al., 2017).



### Delete endpoint (Optional)

After you complete this solution, delete the endpoint when you no longer need it to avoid incurring further hosting charges.

In [None]:
predictor.delete_endpoint(delete_endpoint_config=True)