In [1]:
# Copyright 2021 NVIDIA Corporation. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ================================

## Exporting Ranking Models

In this example notebook, we use the Ali-CCP: Alibaba Click and Conversion Prediction dataset to build our recommender system models. To download the training and test datasets, visit Ali-CCP: Alibaba Click and Conversion Prediction at [tianchi.aliyun.com](https://tianchi.aliyun.com/dataset/dataDetail?dataId=408#1). We curated the raw dataset with this [script]() and generated the parquet files used in this example.

### Learning objectives
- Preparing the data with NVTabular
- Training a DLRM model with Merlin Models
- Exporting the NVTabular workflow and the ranking model for deployment with the Merlin Systems library

## Importing Libraries

Let's start by importing the libraries that we'll use in this notebook.

In [2]:
import os
os.environ["TF_GPU_ALLOCATOR"]="cuda_malloc_async"
import cudf
import glob
import gc

import nvtabular as nvt
from nvtabular.ops import *

from merlin.models.utils.example_utils import workflow_fit_transform

from merlin.schema.tags import Tags
from merlin.schema import Schema

import merlin.models.tf as mm
import merlin.models.tf.dataset as tf_dataloader

from merlin.io.dataset import Dataset
from merlin.schema.io.tensorflow_metadata import TensorflowMetadata
from merlin.models.tf.blocks.core.aggregation import CosineSimilarity

import tensorflow as tf

2022-03-25 19:28:58.918065: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-03-25 19:29:00.044340: I tensorflow/core/common_runtime/gpu/gpu_process_state.cc:214] Using CUDA malloc Async allocator for GPU: 0
2022-03-25 19:29:00.044471: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 16254 MB memory:  -> device: 0, name: Quadro GV100, pci bus id: 0000:15:00.0, compute capability: 7.0


## Feature Engineering with NVTabular

When we work on a new recommender system, we first explore the dataset. In doing so, we define our input and output paths.

In [3]:
DATA_FOLDER = os.environ.get("DATA_FOLDER", "/workspace/data/")
train_path = os.path.join(DATA_FOLDER, 'train/', '*.parquet')
valid_path = os.path.join(DATA_FOLDER, 'test/', '*.parquet')
output_path = '/workspace/data/processed'

We use a utility function, `workflow_fit_transform`, to perform the fit and transform steps on the raw dataset, applying the operators defined in the NVTabular workflow pipeline below, and to save our workflow model. After fit and transform, the processed parquet files are saved to `output_path` and our NVTabular workflow model is saved in the current working directory.

In [4]:
%%time
user_id = ["user_id"] >> Categorify() >> TagAsUserID()
item_id = ["item_id"] >> Categorify() >> TagAsItemID()
targets = ["click"] >> AddMetadata(tags=[str(Tags.BINARY_CLASSIFICATION), "target"])

item_features = ["item_category", "item_shop", "item_brand"] >> Categorify() >> TagAsItemFeatures()

user_features = ['user_shops', 'user_profile', 'user_group', 
       'user_gender', 'user_age', 'user_consumption_2', 'user_is_occupied',
       'user_geography', 'user_intentions', 'user_brands', 'user_categories'] \
        >> Categorify() >> TagAsUserFeatures()

outputs = user_id + item_id + item_features + user_features + targets

workflow_fit_transform(outputs, train_path, valid_path, output_path)



CPU times: user 16.7 s, sys: 20.2 s, total: 36.9 s
Wall time: 39.6 s
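
As a rough illustration of what `Categorify` does in the workflow above, the sketch below maps raw category values to contiguous integer ids using a vocabulary fitted on the training data only. It is a simplified, pure-Python stand-in, not NVTabular's implementation: the function names `fit_vocabulary` and `encode` and the sample values are hypothetical, ids are assigned in order of first appearance, and we assume id 0 is reserved for values not seen during fitting.

```python
def fit_vocabulary(values):
    """Assign a contiguous integer id (starting at 1) to each distinct value."""
    vocab = {}
    for value in values:
        if value not in vocab:
            vocab[value] = len(vocab) + 1  # assumption: 0 is reserved for unseen values
    return vocab


def encode(values, vocab):
    """Map raw values to ids, sending anything outside the vocabulary to 0."""
    return [vocab.get(value, 0) for value in values]


# Hypothetical raw values for a single categorical column.
train_brands = ["acme", "zenith", "acme", "orbit"]
valid_brands = ["orbit", "nova", "acme"]

# The vocabulary is fitted on the training split only ...
brand_vocab = fit_vocabulary(train_brands)

# ... and then reused unchanged for both splits.
train_ids = encode(train_brands, brand_vocab)
valid_ids = encode(valid_brands, brand_vocab)

print(train_ids)  # [1, 2, 1, 3]
print(valid_ids)  # [3, 0, 1]  ("nova" was never seen during fitting)
```

Fitting on the training split and reusing the mapping for validation keeps the ids consistent across both datasets, which is the property the embedding tables of the model rely on.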


## Build and Train a DLRM model

The Deep Learning Recommendation Model [(DLRM)](https://arxiv.org/abs/1906.00091) architecture is a popular neural network model originally proposed by Facebook in 2019. To learn more about the DLRM model, see the [03-Exploring-different-models.ipynb](https://github.com/NVIDIA-Merlin/models/blob/main/examples/03-Exploring-different-models.ipynb) notebook.
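
A defining step of the DLRM architecture is its feature-interaction layer, which takes pairwise dot products between the feature embedding vectors (together with the output of the bottom MLP) and passes the result on to the top MLP. The pure-Python sketch below shows only that interaction step on hand-written vectors. The function name `pairwise_dot_interactions` and the example values are illustrative assumptions; this is not how Merlin Models implements the layer.

```python
from itertools import combinations


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def pairwise_dot_interactions(vectors):
    """Return the dot product of every unordered pair of vectors (i < j)."""
    return [dot(u, v) for u, v in combinations(vectors, 2)]


# Hypothetical 2-dimensional vectors for one example: in DLRM these would be
# the bottom-MLP output and the embeddings of the categorical features, all
# projected to the same dimension.
vectors = [
    [1.0, 0.0],  # e.g. bottom-MLP output for continuous features
    [0.0, 1.0],  # e.g. embedding of user_id
    [1.0, 1.0],  # e.g. embedding of item_id
]

interactions = pairwise_dot_interactions(vectors)
print(interactions)  # [0.0, 1.0, 1.0]
```

With `n` vectors the layer produces `n * (n - 1) / 2` interaction values, which is why all embeddings must share one dimension (the `embedding_dim` argument of the model).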

The NVTabular workflow exports the schema file of our processed dataset. To learn more about the schema object and schema file, explore the [02-Merlin-Models-and-NVTabular-applying-to-your-own-dataset.ipynb](https://github.com/NVIDIA-Merlin/models/blob/main/examples/02-Merlin-Models-and-NVTabular-applying-to-your-own-dataset.ipynb) notebook.

In [5]:
# define train and valid dataset objects
train = Dataset(os.path.join(output_path, 'train', '*.parquet'), part_size="500MB")
valid = Dataset(os.path.join(output_path, 'valid', '*.parquet'), part_size="500MB")

# define schema object
schema = train.schema

In [6]:
schema.column_names

['user_id',
 'item_id',
 'item_category',
 'item_shop',
 'item_brand',
 'user_shops',
 'user_profile',
 'user_group',
 'user_gender',
 'user_age',
 'user_consumption_2',
 'user_is_occupied',
 'user_geography',
 'user_intentions',
 'user_brands',
 'user_categories',
 'click']

In [10]:
target_column = schema.select_by_tag(Tags.TARGET).column_names[0]
target_column

'click'
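
To make the tag-based lookup above more concrete, here is a toy sketch of the idea behind selecting columns by tag. It assumes a schema can be pictured as a mapping from column names to sets of tags. The dictionary `toy_schema`, its tag strings, and the helper `select_by_tag` are hypothetical stand-ins, not the Merlin `Schema` API.

```python
# Hypothetical toy schema: column name -> set of tag strings.
toy_schema = {
    "user_id": {"categorical", "user_id"},
    "item_id": {"categorical", "item_id"},
    "item_brand": {"categorical", "item"},
    "click": {"binary_classification", "target"},
}


def select_by_tag(schema, tag):
    """Return the column names whose tag set contains `tag`, in schema order."""
    return [name for name, tags in schema.items() if tag in tags]


target_columns = select_by_tag(toy_schema, "target")
categorical_columns = select_by_tag(toy_schema, "categorical")

print(target_columns[0])    # click
print(categorical_columns)  # ['user_id', 'item_id', 'item_brand']
```

Because the target is found through its tag rather than a hard-coded name, the same model-building code can be reused on a dataset whose target column is called something else.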

The schema lists all the features that are included in the `schema.pbtxt` file, and `target_column` holds the name of the target. We now use both to define the DLRM model.

In [11]:
model = mm.DLRMModel(
    schema,
    embedding_dim=64,
    bottom_block=mm.MLPBlock([128, 64]),
    top_block=mm.MLPBlock([128, 64, 32]),
    prediction_tasks=mm.BinaryClassificationTask(target_column, metrics=[tf.keras.metrics.AUC()])
)
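
The model above is evaluated with AUC, the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one. The sketch below computes that quantity exactly by comparing every positive/negative pair, counting ties as one half. It is only meant to show what the number means: `tf.keras.metrics.AUC` approximates the value over a discretized set of thresholds, and the function name `pairwise_auc` and the sample data are our own assumptions.

```python
def pairwise_auc(labels, scores):
    """Exact AUC: share of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")

    correct = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                correct += 1.0
            elif pos_score == neg_score:
                correct += 0.5  # a tie counts as half a correct ranking
    return correct / (len(positives) * len(negatives))


# Hypothetical click labels and predicted click probabilities.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

auc = pairwise_auc(labels, scores)
print(auc)  # 0.75
```

The pairwise loop is quadratic in the number of examples, so it is suitable for small illustrations only.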

In [12]:
%%time

model.compile('adam', run_eagerly=False)
model.fit(train, validation_data=valid, batch_size=16*1024)

2022-03-25 19:30:23.157631: W tensorflow/python/util/util.cc:368] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.




2022-03-25 19:34:39.193897: W tensorflow/core/grappler/optimizers/loop_optimizer.cc:907] Skipping loop optimization for Merge node with control input: cond/then/_0/cond/cond/branch_executed/_161


CPU times: user 8min, sys: 1min 31s, total: 9min 31s
Wall time: 5min 9s


<keras.callbacks.History at 0x7ff01819e970>

### Save model

The last step of a machine learning (ML)/deep learning (DL) pipeline is to deploy the ETL workflow and the saved model to production. In production, the input data must be transformed exactly as it was during training (ETL): we need to apply the same mean/std to continuous features and use the same categorical mapping to convert categories to contiguous integers before the DL model makes a prediction. Therefore, we deploy the NVTabular workflow together with the TensorFlow model as an ensemble model to Triton Inference Server using the [Merlin Systems](https://github.com/NVIDIA-Merlin/systems) library. The ensemble model guarantees that the same transformation is applied to the raw inputs.
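
The requirement to reuse training-time statistics at inference can be sketched in a few lines of plain Python. The example fits a mean and standard deviation on training values, serializes them, and reloads them to normalize a new raw input. The helper names `fit_stats` and `normalize`, the JSON format, and the sample values are assumptions made for illustration; the NVTabular workflow persists its own fitted state in its own format.

```python
import json
import statistics


def fit_stats(values):
    """Compute the statistics that must be frozen after training."""
    return {"mean": statistics.mean(values), "std": statistics.pstdev(values)}


def normalize(value, stats):
    """Standardize a raw value with previously fitted statistics."""
    return (value - stats["mean"]) / stats["std"]


# Training time: fit on the training data and persist the result.
train_values = [1.0, 3.0]  # hypothetical continuous feature
saved_stats = json.dumps(fit_stats(train_values))

# Serving time: reload the saved statistics instead of recomputing them
# from whatever data happens to arrive in a request.
loaded_stats = json.loads(saved_stats)
normalized = normalize(5.0, loaded_stats)
print(normalized)  # 3.0
```

Recomputing the statistics from serving traffic would silently shift the inputs the model sees, which is the mismatch that deploying the workflow and the model together is meant to prevent.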

We save our DLRM model.

In [13]:
model.save('dlrm')

2022-03-25 19:35:38.064456: W tensorflow/core/framework/cpu_allocator_impl.cc:82] Allocation of 788046592 exceeds 10% of free system memory.
2022-03-25 19:35:38.651699: W tensorflow/core/framework/cpu_allocator_impl.cc:82] Allocation of 788046592 exceeds 10% of free system memory.
2022-03-25 19:35:39.240066: W tensorflow/core/framework/cpu_allocator_impl.cc:82] Allocation of 788046592 exceeds 10% of free system memory.


INFO:tensorflow:Assets written to: dlrm/assets


### Next Steps

We trained and exported our ranking model and NVTabular workflow. In the next step, we will learn how to deploy our trained DLRM model to [Triton Inference Server](https://github.com/triton-inference-server/server) with the [Merlin Systems](https://github.com/NVIDIA-Merlin/systems) library. NVIDIA Triton Inference Server (TIS) simplifies the deployment of AI models at scale in production. TIS provides a cloud and edge inferencing solution optimized for both CPUs and GPUs. It supports a number of different machine learning frameworks such as TensorFlow and PyTorch.

Visit the `examples/Getting_Started` folder and continue with the next step by running the `Getting-started-with-Merlin-Systems` notebook.