In [1]:
# Copyright 2021 NVIDIA Corporation. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

<img src="http://developer.download.nvidia.com/compute/machine-learning/frameworks/nvidia_logo.png" style="width: 90px; float: right;">

# Getting Started with Merlin Models: Develop a Model for MovieLens

## Overview

[Merlin Models](https://github.com/NVIDIA-Merlin/models/) is a library for training recommender models. It lets industry users train standard models on their own datasets and get high-performance, GPU-accelerated models with best practices built into the library. It also lets researchers build custom models from standard components of deep learning recommender models and then benchmark them on example offline datasets. Merlin Models is part of the [Merlin open source framework](https://developer.nvidia.com/nvidia-merlin).

Core features are:
- Unified API enables users to create models in TensorFlow or PyTorch
- Deep integration with NVTabular for ETL and model serving
- Flexible APIs targeted to both production and research
- Many different recommender system architectures (tabular, two-tower, sequential) or tasks (binary, multi-class classification, multi-task)

### Learning objectives

- Training [Facebook's DLRM model](https://arxiv.org/pdf/1906.00091.pdf) with only three commands
- Understanding the Merlin Models high-level API

## Downloading and preparing the dataset

In [3]:
import merlin.models.tf as mm

from merlin.models.data.movielens import get_movielens

2022-03-17 16:01:59.285737: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 16255 MB memory:  -> device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:0b:00.0, compute capability: 7.0
2022-03-17 16:01:59.286915: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 29922 MB memory:  -> device: 1, name: Tesla V100-SXM2-32GB, pci bus id: 0000:85:00.0, compute capability: 7.0
2022-03-17 16:01:59.287899: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 29924 MB memory:  -> device: 2, name: Tesla V100-SXM2-32GB, pci bus id: 0000:86:00.0, compute capability: 7.0
2022-03-17 16:01:59.288930: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 29924 MB memory:  -> device: 3, name: Tesla V100-SXM2-32GB, pci bus id

The `get_movielens` convenience function downloads the dataset, performs simple preprocessing, and splits the data into training and validation datasets.
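To make the "split" step concrete, here is a minimal sketch of a random train/validation split in plain Python. It does not show how `get_movielens` is implemented: the `split_rows` helper, the 80/20 ratio, the fixed seed, and the toy rating tuples are all assumptions made for illustration.

```python
import random


def split_rows(rows, valid_fraction=0.2, seed=42):
    """Shuffle rows and split them into (train, valid) lists.

    Hypothetical helper for illustration; not part of Merlin Models.
    """
    if not 0.0 < valid_fraction < 1.0:
        raise ValueError("valid_fraction must be strictly between 0 and 1")
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    n_valid = round(len(shuffled) * valid_fraction)
    return shuffled[n_valid:], shuffled[:n_valid]


# Toy (user_id, movie_id, rating) rows; made-up values, not MovieLens data.
toy_ratings = [(u, m, (u + m) % 5 + 1) for u in range(10) for m in range(10)]
toy_train, toy_valid = split_rows(toy_ratings)
print(len(toy_train), len(toy_valid))
```

Every row ends up in exactly one of the two lists, so the validation set measures performance on interactions the model did not see during training.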

In [4]:
train, valid = get_movielens(variant="ml-1m")

downloading ml-1m.zip: 5.93MB [00:00, 10.4MB/s]                                 
unzipping files: 100%|█████████████████████████| 5/5 [00:00<00:00, 36.15files/s]
INFO:merlin.models.data.movielens:starting ETL..
INFO:merlin.models.data.movielens:saving the workflow..


## Training the DLRM Model with Merlin Models

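We initialize the DLRM model from the training schema, with 64-dimensional embeddings, a bottom MLP, a top MLP, and a binary classification task.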

In [5]:
model = mm.DLRMModel(
    train.schema,
    embedding_dim=64,
    bottom_block=mm.MLPBlock([128, 64]),
    top_block=mm.MLPBlock([128, 64, 32]),
    prediction_tasks=mm.BinaryClassificationTask(train.schema),
)

model.compile(optimizer="adam")
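For intuition about what these arguments configure: in the DLRM paper, continuous features pass through a bottom MLP, each categorical feature is looked up in an embedding table, and the resulting vectors are combined by pairwise dot products before the top MLP. This is consistent with the last layer of `bottom_block` (64) matching `embedding_dim` (64) in the cell above. The sketch below illustrates only the interaction step in plain Python. The `pairwise_dot_interaction` helper and the toy vectors are hypothetical and do not reflect the Merlin Models implementation.

```python
from itertools import combinations


def pairwise_dot_interaction(vectors):
    """Return the dot product of every unordered pair of equal-length vectors."""
    dims = {len(v) for v in vectors}
    if len(dims) > 1:
        raise ValueError("all vectors must share the same dimension")
    return [
        sum(a * b for a, b in zip(left, right))
        for left, right in combinations(vectors, 2)
    ]


# Toy 4-dimensional vectors standing in for the 64-dimensional ones.
dense_vector = [1.0, 0.0, 2.0, 0.0]    # stand-in for the bottom MLP output
user_embedding = [0.5, 1.0, 0.5, 1.0]  # stand-in for a user embedding
item_embedding = [2.0, 2.0, 0.0, 0.0]  # stand-in for an item embedding

interactions = pairwise_dot_interaction(
    [dense_vector, user_embedding, item_embedding]
)
print(interactions)  # prints [1.5, 2.0, 3.0]
```

Three input vectors give three pairs; in general, n vectors yield n(n-1)/2 interaction terms. In the DLRM paper these terms are concatenated with the dense vector and passed to the top MLP.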

Next, we train the model.

In [6]:
model.fit(train, batch_size=1024)

2022-03-17 16:02:18.246713: W tensorflow/python/util/util.cc:368] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.




<keras.callbacks.History at 0x7f9394461d30>

We evaluate the model.

In [7]:
history = model.evaluate(valid, batch_size=1024)

2022-03-17 16:02:36.403253: W tensorflow/core/grappler/optimizers/loop_optimizer.cc:907] Skipping loop optimization for Merge node with control input: cond/then/_0/cond/cond/branch_executed/_170




We view the collected metrics. `model.evaluate` returns a list of metric values, stored in the order in which they were generated.

In [8]:
history

[0.7200919985771179,
 0.8575936555862427,
 0.726421058177948,
 0.7930966019630432,
 0.5442478060722351,
 0.0,
 0.5442478060722351]
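A bare list of numbers is hard to read, so it can help to pair each value with a name. The sketch below does this in plain Python with a hypothetical `label_metrics` helper. The names it uses are placeholders, because the output above does not say which metric each value belongs to. Keras models expose a `metrics_names` attribute, so in a live session `model.metrics_names` should supply the real names, assuming Merlin Models populates it the way standard Keras models do.

```python
def label_metrics(names, values):
    """Pair metric names with values, checking that the lengths agree.

    Hypothetical helper for illustration; not part of Merlin Models.
    """
    if len(names) != len(values):
        raise ValueError(f"expected {len(names)} values, got {len(values)}")
    return dict(zip(names, values))


# Values copied from the output above.
metric_values = [
    0.7200919985771179,
    0.8575936555862427,
    0.726421058177948,
    0.7930966019630432,
    0.5442478060722351,
    0.0,
    0.5442478060722351,
]

# Placeholder names (an assumption): replace with model.metrics_names
# when running against a real model.
placeholder_names = [f"metric_{i}" for i in range(len(metric_values))]

labeled = label_metrics(placeholder_names, metric_values)
for name, value in labeled.items():
    print(f"{name}: {value:.4f}")
```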

## Conclusion

Merlin Models enables users to define and train a deep learning recommender model with only three commands.

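```python
# Tags is needed for select_by_tag below; it is not imported earlier in this notebook.
from merlin.schema import Tags

model = mm.DLRMModel(
    train.schema,
    embedding_dim=64,
    bottom_block=mm.MLPBlock([128, 64]),
    top_block=mm.MLPBlock([128, 64, 32]),
    # Name the target column explicitly instead of passing the whole schema.
    prediction_tasks=mm.BinaryClassificationTask(
        train.schema.select_by_tag(Tags.TARGET).column_names[0]
    ),
)
model.compile(optimizer="adam")
model.fit(train, batch_size=1024)
```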

## Next steps

In the next example notebooks, we will show how to use your own dataset and how to explore different recommender models.