In [1]:
# Copyright 2021 NVIDIA Corporation. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ================================

# Building Intelligent Recommender Systems with Merlin

Recommender systems (RecSys) are the engine of the modern internet and the catalyst for human decisions. Building a recommender system is challenging because it requires multiple stages (data preprocessing, offline training, item retrieval, filtering, ranking, ordering, etc.) to work together seamlessly and efficiently. The biggest challenges for new practitioners are a lack of understanding of what recommender systems look like in the real world, and the gap between examples of simple models and a production-ready, end-to-end recommender system.

The figure below shows a four-stage recommender system. This is much more complex than training and deploying a single model.



<img src="../images/fourstages.png"  width="70%">



In [None]:
### Learning objectives
- Understanding the four stages of recommender systems
- Training retrieval and ranking recommender system models with Merlin Models
- Deploying trained models to Triton Inference Server with Merlin Systems
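
To make the four stages concrete before working with the Merlin libraries, here is a minimal, dependency-free sketch of retrieval, filtering, scoring, and ordering. Everything in it (the random vectors, the `already_seen` set, and the function names) is a made-up illustration, and a dot product stands in for both the retrieval and ranking models.

```python
import random

random.seed(0)
DIM = 8
# Hypothetical catalog: item id -> embedding vector.
item_vectors = {i: [random.gauss(0, 1) for _ in range(DIM)] for i in range(500)}
user_vector = [random.gauss(0, 1) for _ in range(DIM)]
already_seen = {3, 17, 42}  # items we do not want to recommend again


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def retrieve(user_vec, items, k=100):
    """Stage 1: a cheap similarity search narrows the catalog to k candidates."""
    ranked = sorted(items, key=lambda i: dot(user_vec, items[i]), reverse=True)
    return ranked[:k]


def filter_candidates(candidates, seen):
    """Stage 2: drop candidates excluded by business rules (here: already seen)."""
    return [i for i in candidates if i not in seen]


def score(user_vec, candidates, items):
    """Stage 3: a heavier ranking model would go here; a dot product stands in."""
    return {i: dot(user_vec, items[i]) for i in candidates}


def order(scores, topk=10):
    """Stage 4: produce the final ordered list shown to the user."""
    return sorted(scores, key=scores.get, reverse=True)[:topk]


candidates = filter_candidates(retrieve(user_vector, item_vectors), already_seen)
recommendations = order(score(user_vector, candidates, item_vectors))
print(recommendations)
```

In a production system each stage is usually a separate component, which is why the stages are kept as separate functions here.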

## Feature Engineering with NVTabular

In this example notebook, we use the [Ali-CCP: Alibaba Click and Conversion Prediction](https://tianchi.aliyun.com/dataset/dataDetail?dataId=408#1) dataset to build our recommender system models. Below, we will process input features with [NVTabular](https://github.com/NVIDIA-Merlin/NVTabular).

In [2]:
import os
os.environ["TF_GPU_ALLOCATOR"]="cuda_malloc_async"
import cudf
import glob
import gc

import nvtabular as nvt
from nvtabular.ops import *
from example_utils import workflow_fit_transform

from merlin.schema.tags import Tags
from merlin.schema import Schema

import merlin.models.tf as mm
import merlin.models.tf.dataset as tf_dataloader

from merlin.io.dataset import Dataset
from merlin.schema.io.tensorflow_metadata import TensorflowMetadata
from merlin.models.tf.blocks.core.aggregation import CosineSimilarity

import tensorflow as tf

2022-03-24 18:34:27.228183: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-03-24 18:34:28.339122: I tensorflow/core/common_runtime/gpu/gpu_process_state.cc:214] Using CUDA malloc Async allocator for GPU: 0
2022-03-24 18:34:28.339256: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 16254 MB memory:  -> device: 0, name: Quadro GV100, pci bus id: 0000:15:00.0, compute capability: 7.0


First, we define our input and output paths.

In [3]:
train_path = '/workspace/data/train/*.parquet'
test_path = '/workspace/data/test/*.parquet'
output_path = '/workspace/processed/ranking'

<a id="etl"></a>
ETL Workflow:

In [4]:
%%time

user_id = ["user_id"] >> AddMetadata(tags=[Tags.USER_ID, Tags.USER]) >> Categorify()
item_id = ["item_id"] >> AddMetadata(tags=[Tags.ITEM_ID, Tags.ITEM]) >> Categorify()

item_features = ["item_category", "item_shop", "item_brand"] >> AddMetadata(tags=[Tags.ITEM]) >> nvt.ops.Categorify()

user_features = ['user_shops', 'user_profile', 'user_group', 
       'user_gender', 'user_age', 'user_consumption_2', 'user_is_occupied',
       'user_geography', 'user_intentions', 'user_brands', 'user_categories'] \
    >> AddMetadata(tags=[Tags.USER]) >> nvt.ops.Categorify()

targets = ["click"] >> AddMetadata(tags=[str(Tags.BINARY_CLASSIFICATION), "target"])

outputs = user_id+item_id+item_features+user_features+targets

workflow_fit_transform(outputs, train_path, test_path, output_path, 'workflow_ranking')



CPU times: user 18.3 s, sys: 19.8 s, total: 38.2 s
Wall time: 40.5 s


We also use a utility function, `workflow_fit_transform`, that wraps fitting and applying the workflow into a single line of code.
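
The fit-then-transform pattern behind `Categorify` can be illustrated in plain Python. This is only a conceptual sketch, not NVTabular's implementation: the function names are hypothetical, and assigning ids in order of first appearance with `0` reserved for unseen values is an assumption made for the example.

```python
def fit_categorify(values):
    """Learn a value -> integer id mapping from training data (ids start at 1)."""
    mapping = {}
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping) + 1
    return mapping


def transform_categorify(values, mapping):
    """Encode values with a fitted mapping; unseen values fall back to id 0."""
    return [mapping.get(value, 0) for value in values]


train_brands = ["nike", "adidas", "nike", "puma"]
test_brands = ["puma", "reebok"]  # "reebok" never appears in the training data

mapping = fit_categorify(train_brands)  # statistics come from train only
encoded_train = transform_categorify(train_brands, mapping)
encoded_test = transform_categorify(test_brands, mapping)
print(encoded_train, encoded_test)
```

Fitting on the training split and reusing the mapping on the test split keeps the ids consistent between the two.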

## Building Recommender Systems

NVTabular exported the schema file of our processed dataset. The Merlin Models library relies on a schema object to automatically build all the layers needed to represent, normalize, and aggregate input features. `schema.pbtxt` is a protobuf text file that contains feature metadata, including statistics such as cardinality and min and max values, as well as tags based on feature characteristics and dtypes (e.g., categorical, continuous, list, item_id). The metadata and tags loaded from the schema are used to automatically set the parameters of Merlin models.

We use the `schema` object to define our model.

In [5]:
schema = TensorflowMetadata.from_proto_text_file(output_path + '/train/').to_merlin_schema()

In [6]:
target_column = schema.select_by_tag(Tags.TARGET).column_names[0]

In [7]:
schema.column_names

['user_id',
 'item_id',
 'item_category',
 'item_shop',
 'item_brand',
 'user_shops',
 'user_profile',
 'user_group',
 'user_gender',
 'user_age',
 'user_consumption_2',
 'user_is_occupied',
 'user_geography',
 'user_intentions',
 'user_brands',
 'user_categories',
 'click']
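
Tag-based selection, as used by `schema.select_by_tag(Tags.TARGET)` above, can be pictured with a toy schema. The `Column` class, the string tags, and the `select_by_tag` helper below are hypothetical stand-ins written for illustration; they are not the Merlin `Schema` API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Column:
    """Toy column description: a name plus a set of string tags."""
    name: str
    tags: frozenset = field(default_factory=frozenset)


def select_by_tag(columns, tag):
    """Return the columns carrying the given tag, preserving schema order."""
    return [column for column in columns if tag in column.tags]


toy_schema = [
    Column("user_id", frozenset({"user", "user_id", "categorical"})),
    Column("item_id", frozenset({"item", "item_id", "categorical"})),
    Column("item_brand", frozenset({"item", "categorical"})),
    Column("click", frozenset({"target", "binary_classification"})),
]

target_names = [c.name for c in select_by_tag(toy_schema, "target")]
item_names = [c.name for c in select_by_tag(toy_schema, "item")]
print(target_names, item_names)
```

Because model code asks for columns by role rather than by name, the same model definition can be reused on datasets with different column names.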

### Initialize Dataloaders

We're ready to start training, but first we need to initialize the dataloaders. We'll use the Merlin `BatchedDataset` class to read chunks of Parquet files. `BatchedDataset` asynchronously iterates through CSV or Parquet dataframes on the GPU by leveraging an NVTabular `Dataset`. To read more about Merlin's optimized dataloaders, see the [dataloader source](https://github.com/NVIDIA-Merlin/models/blob/main/merlin/models/tf/dataset.py#L141).

In [8]:
batch_size = 16*1024
train = Dataset(os.path.join(output_path, 'train', '*.parquet'), part_size="500MB")
valid = Dataset(os.path.join(output_path, 'test', '*.parquet'), part_size="500MB")
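
The idea of reading data in chunks and handing fixed-size batches to the model can be sketched with a simple generator. This is a synchronous, CPU-only illustration with a hypothetical `iter_batches` helper; the Merlin dataloader does this asynchronously on the GPU and its internals are not shown here.

```python
def iter_batches(chunks, batch_size):
    """Yield fixed-size batches from a stream of variable-size chunks.

    Rows left over at the end of one chunk are carried into the next,
    so only the final batch can be smaller than batch_size.
    """
    buffer = []
    for chunk in chunks:
        buffer.extend(chunk)
        while len(buffer) >= batch_size:
            yield buffer[:batch_size]
            buffer = buffer[batch_size:]
    if buffer:
        yield buffer


# Three "partitions" of different sizes, standing in for parquet chunks.
chunks = [list(range(0, 5)), list(range(5, 12)), list(range(12, 14))]
batches = list(iter_batches(chunks, batch_size=4))
print([len(batch) for batch in batches])
```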



### Building a Ranking Model with DLRM

The Deep Learning Recommendation Model [(DLRM)](https://arxiv.org/abs/1906.00091) architecture is a popular neural network model originally proposed by Facebook in 2019. It was introduced as a personalization deep learning model that uses embeddings to process sparse features representing categorical data and a multilayer perceptron (MLP) to process dense features, then interacts these features explicitly using the statistical techniques proposed in [this paper](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5694074). To learn more about the DLRM architecture, see the `Exploring-different-models` [notebook](https://github.com/NVIDIA-Merlin/models/blob/main/examples/Exploring-different-models.ipynb) in the Merlin Models GitHub repo.

In [9]:
model = mm.DLRMModel(
    schema,
    embedding_dim=64,
    bottom_block=mm.MLPBlock([128, 64]),
    top_block=mm.MLPBlock([128, 64, 32]),
    prediction_tasks=mm.BinaryClassificationTask(target_column, metrics=[tf.keras.metrics.AUC()])
)
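
The explicit feature interaction at the heart of DLRM can be sketched in a few lines. As described in the DLRM paper, the bottom MLP output and the embedding vectors share one dimension, all pairwise dot products are computed, and the result is concatenated with the dense vector before the top MLP. The function below is a simplified, hypothetical illustration of that step only; it is not the Merlin Models implementation.

```python
from itertools import combinations


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def dlrm_interaction(dense_vector, embeddings):
    """Concatenate the dense vector with all pairwise dot products.

    dense_vector: output of the bottom MLP.
    embeddings:   one vector per categorical feature, same length as dense_vector.
    """
    vectors = [dense_vector] + embeddings
    pairwise = [dot(a, b) for a, b in combinations(vectors, 2)]
    return dense_vector + pairwise  # this is what the top MLP would consume


dense = [1.0, 0.0, 2.0]
embs = [[0.5, 0.5, 0.5], [1.0, -1.0, 0.0]]
features = dlrm_interaction(dense, embs)
print(features)
```

With `n` vectors there are `n * (n - 1) / 2` interaction terms, so the top MLP input grows quadratically with the number of categorical features.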

In [10]:
%%time
model.compile('adam', run_eagerly=False)
model.fit(train, validation_data=valid, batch_size=batch_size, epochs=1)

2022-03-24 18:35:10.284582: W tensorflow/python/util/util.cc:368] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.




2022-03-24 18:39:28.921898: W tensorflow/core/grappler/optimizers/loop_optimizer.cc:907] Skipping loop optimization for Merge node with control input: cond/then/_0/cond/cond/branch_executed/_161


CPU times: user 7min 51s, sys: 1min 33s, total: 9min 24s
Wall time: 5min 12s


<keras.callbacks.History at 0x7fecc60cbf70>

Save the model

In [11]:
model.save('dlrm')

2022-03-24 18:40:27.823089: W tensorflow/core/framework/cpu_allocator_impl.cc:82] Allocation of 788046592 exceeds 10% of free system memory.
2022-03-24 18:40:28.426818: W tensorflow/core/framework/cpu_allocator_impl.cc:82] Allocation of 788046592 exceeds 10% of free system memory.
2022-03-24 18:40:29.060094: W tensorflow/core/framework/cpu_allocator_impl.cc:82] Allocation of 788046592 exceeds 10% of free system memory.


INFO:tensorflow:Assets written to: dlrm/assets




## Building a Retrieval Model with Two-Tower Model

In [12]:
output_path = '/workspace/processed/retrieval/'

We keep only positive interactions, so we remove rows where `click == 0` from the dataset with the `Filter()` op.

In [13]:
user_id = ["user_id"] >> Categorify() >> TagAsUserID()
item_id = ["item_id"] >> Categorify() >> TagAsItemID()

item_features = ["item_category", "item_shop", "item_brand"] >> nvt.ops.Categorify() >> AddTags(tags=[Tags.ITEM])

user_features = ['user_shops', 'user_profile', 'user_group', 
       'user_gender', 'user_age', 'user_consumption_2', 'user_is_occupied',
       'user_geography', 'user_intentions', 'user_brands', 'user_categories'] \
        >> nvt.ops.Categorify() >> AddTags(tags=[Tags.USER])

inputs = user_id + item_id + item_features + user_features + ['click'] 

outputs = inputs >> Filter(f=lambda df: df["click"] == 1)

workflow_fit_transform(outputs, train_path, test_path, output_path, 'workflow_retrieval')



In [14]:
train_tt = Dataset(os.path.join(output_path, 'train', '*.parquet'), part_size="500MB")
valid_tt = Dataset(os.path.join(output_path, 'test', '*.parquet'), part_size="500MB")

schema = train.schema



In [15]:
schema = schema.select_by_tag([Tags.ITEM_ID, Tags.USER_ID, Tags.ITEM, Tags.USER])

In [16]:
model = mm.TwoTowerModel(
    schema,
    query_tower=mm.MLPBlock([128, 64], no_activation_last_layer=True),        
    loss="categorical_crossentropy",  
    samplers=[mm.InBatchSampler()],
    embedding_options = mm.EmbeddingOptions(infer_embedding_sizes=True),
    metrics=[mm.RecallAt(10), mm.NDCGAt(10)]
)
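
Training with in-batch negatives and a categorical cross-entropy loss, as configured above, can be illustrated as follows: for each user in a batch, the item from the same row is the positive and the items from all other rows act as negatives. This is a plain-Python sketch with a hypothetical function name; any corrections Merlin Models may apply (for example for item popularity or accidental duplicate positives) are not modeled here.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def in_batch_softmax_loss(user_vecs, item_vecs):
    """Mean cross-entropy where row i's positive is item i of the same batch."""
    losses = []
    for i, user in enumerate(user_vecs):
        logits = [dot(user, item) for item in item_vecs]
        # Numerically stable log-sum-exp over the batch of candidate items.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


users = [[1.0, 0.0], [0.0, 1.0]]
aligned_items = [[5.0, 0.0], [0.0, 5.0]]      # each user matches its own item
misaligned_items = [[0.0, 5.0], [5.0, 0.0]]   # each user matches the other item

aligned_loss = in_batch_softmax_loss(users, aligned_items)
misaligned_loss = in_batch_softmax_loss(users, misaligned_items)
print(aligned_loss, misaligned_loss)
```

The loss is small when each user vector scores highest against its own item and large otherwise, which is what pushes the two towers toward a shared embedding space.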

In [17]:
model.set_retrieval_candidates_for_evaluation(train)

opt = tf.keras.optimizers.Adagrad(learning_rate=0.003)
model.compile(optimizer=opt, run_eagerly=False)
model.fit(train_tt, validation_data=valid_tt, batch_size=4096, epochs=2)

Epoch 1/2
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: module, class, method, function, traceback, frame, or code object was expected, got cython_function_or_method



2022-03-24 18:41:31.056288: W tensorflow/core/framework/cpu_allocator_impl.cc:82] Allocation of 1034311152 exceeds 10% of free system memory.


INFO:tensorflow:Assets written to: /tmp/tmpxj2sh2ay/assets


2022-03-24 18:41:34.284865: W tensorflow/core/framework/cpu_allocator_impl.cc:82] Allocation of 1034311152 exceeds 10% of free system memory.




2022-03-24 18:42:15.171486: W tensorflow/core/grappler/optimizers/loop_optimizer.cc:907] Skipping loop optimization for Merge node with control input: cond/then/_0/cond/cond/branch_executed/_184


Epoch 2/2



INFO:tensorflow:Assets written to: /tmp/tmpejxazd24/assets










<keras.callbacks.History at 0x7fecb2386550>
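
The `RecallAt(10)` and `NDCGAt(10)` metrics passed to the two-tower model measure how well the top of a ranked list covers the relevant items. The functions below are a textbook-style sketch with binary relevance and hypothetical names; they are not the Merlin Models metric classes, whose exact conventions may differ.

```python
import math


def recall_at_k(ranked_items, relevant, k):
    """Fraction of relevant items that appear in the top k of the ranking."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked_items[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked_items, relevant, k):
    """Discounted gain of the top k, normalized by the best achievable value."""
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked_items[:k])
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg else 0.0


ranked = [7, 3, 9, 1, 5]   # model's ranking, best first
relevant = {3, 5}          # items the user actually interacted with

recall = recall_at_k(ranked, relevant, k=3)
ndcg = ndcg_at_k(ranked, relevant, k=3)
print(recall, ndcg)
```

Recall only counts whether relevant items made the cut, while NDCG also rewards placing them closer to the top.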

## Exporting Retrieval Models

In [18]:
query_tower = model.retrieval_block.query_block()
query_tower.save('query_tower')



INFO:tensorflow:Assets written to: query_tower/assets




In [19]:
from merlin.models.utils.dataset import unique_rows_by_features
user_features = unique_rows_by_features(train, Tags.USER, Tags.USER_ID).compute().reset_index(drop=True)
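
Conceptually, building a per-user feature table means collapsing the interaction log to one row per id while keeping only the feature columns. The helper below is a hypothetical plain-Python illustration of that idea; keeping the first occurrence of each id is an assumption, and the behavior of `unique_rows_by_features` itself is not shown in this notebook.

```python
def unique_rows_by_id(rows, id_column, feature_columns):
    """Keep one row per id (the first seen), restricted to the given columns."""
    unique = {}
    for row in rows:
        key = row[id_column]
        if key not in unique:
            unique[key] = {c: row[c] for c in [id_column, *feature_columns]}
    return list(unique.values())


interactions = [
    {"user_id": 1, "user_age": 3, "item_id": 10, "click": 1},
    {"user_id": 2, "user_age": 5, "item_id": 11, "click": 0},
    {"user_id": 1, "user_age": 3, "item_id": 12, "click": 0},
]

user_table = unique_rows_by_id(interactions, "user_id", ["user_age"])
print(user_table)
```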

In [36]:
user_features.head()

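|  | user_id | user_shops | user_profile | user_group | user_gender | user_age | user_consumption_2 | user_is_occupied | user_geography | user_intentions | user_brands | user_categories |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 1 | 5 | 2 | 2 | 2 | 1 | 0 | 0 | 0 | 0 |
| 1 | 1 | 109 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 69 | 131 | 9 |
| 2 | 2 | 301 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 57 | 4709 | 57 |
| 3 | 3 | 1876 | 23 | 7 | 2 | 3 | 1 | 1 | 1 | 5 | 63 | 3 |
| 4 | 4 | 534 | 1 | 2 | 1 | 2 | 1 | 1 | 0 | 40 | 22 | 108 |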


In [38]:
user_features.dtypes

user_id               int64
user_shops            int64
user_profile          int64
user_group            int64
user_gender           int64
user_age              int64
user_consumption_2    int64
user_is_occupied      int64
user_geography        int64
user_intentions       int64
user_brands           int64
user_categories       int64
dtype: object

In [37]:
user_features.columns

Index(['user_id', 'user_shops', 'user_profile', 'user_group', 'user_gender',
       'user_age', 'user_consumption_2', 'user_is_occupied', 'user_geography',
       'user_intentions', 'user_brands', 'user_categories'],
      dtype='object')

In [82]:
user_features.shape

(294736, 12)

In [89]:
from datetime import datetime
user_features["datetime"] = datetime.now()
user_features["datetime"] = user_features["datetime"].astype("datetime64[ns]")
user_features["created"] = datetime.now()
user_features["created"] = user_features["created"].astype("datetime64[ns]")

In [93]:
user_features.to_parquet('user_features.parquet')

In [21]:
item_features = unique_rows_by_features(train, Tags.ITEM, Tags.ITEM_ID).compute().reset_index(drop=True)

In [34]:
item_features.head()

|  | item_id | item_category | item_shop | item_brand |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 |
| 1 | 1 | 441 | 432 | 474 |
| 2 | 2 | 193 | 1159 | 125 |
| 3 | 3 | 3 | 1463 | 872 |
| 4 | 4 | 282 | 2479 | 555 |


In [35]:
item_features.dtypes

item_id          int64
item_category    int64
item_shop        int64
item_brand       int64
dtype: object

In [81]:
item_features.shape

(3078306, 4)

In [94]:
item_features["datetime"] = datetime.now()
item_features["datetime"] = item_features["datetime"].astype("datetime64[ns]")
item_features["created"] = datetime.now()
item_features["created"] = item_features["created"].astype("datetime64[ns]")

In [96]:
item_features.dtypes

item_id                   int64
item_category             int64
item_shop                 int64
item_brand                int64
datetime         datetime64[ns]
created          datetime64[ns]
dtype: object

In [None]:
item_features

In [97]:
# save to disk
item_features.to_parquet('item_features.parquet')

#### Extract and save Item embeddings

In [98]:
item_embs = model.item_embeddings(Dataset(item_features, schema=schema), batch_size=1024)
item_embs_df = item_embs.compute(scheduler="synchronous")



INFO:tensorflow:Assets written to: /tmp/tmpmhpp3f_e/assets








In [99]:
# select only embedding columns
item_embeddings = item_embs_df.iloc[:, 4:]

In [100]:
item_embeddings

|  | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 54 | 55 | 56 | 57 | 58 | 59 | 60 | 61 | 62 | 63 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -0.240219 | 0.098013 | 0.084337 | 0.031100 | 0.104367 | 0.045661 | 0.020319 | 0.153839 | 0.161770 | 0.107052 | ... | 0.121493 | -0.000128 | -0.094518 | 0.092909 | 0.088235 | 0.044954 | 0.202433 | 0.001573 | 0.005804 | -0.083102 |
| 1 | 0.036755 | 0.225635 | -0.104459 | -0.086914 | 0.100140 | 0.118539 | -0.090942 | 0.049130 | 0.242738 | 0.154544 | ... | 0.263599 | 0.091065 | 0.030586 | -0.024852 | -0.089785 | 0.167049 | -0.036791 | 0.087586 | -0.109924 | -0.129684 |
| 2 | -0.243588 | 0.090928 | 0.021481 | 0.127052 | -0.061280 | 0.090714 | 0.106807 | 0.065844 | 0.241243 | 0.020420 | ... | 0.207998 | -0.010261 | -0.282395 | 0.022748 | -0.179967 | -0.136605 | 0.132614 | -0.036602 | 0.240175 | 0.093345 |
| 3 | 0.160735 | 0.022391 | -0.068275 | -0.015373 | 0.125100 | 0.148184 | -0.064653 | 0.038877 | -0.051233 | 0.210893 | ... | 0.166989 | -0.000811 | -0.079048 | 0.106997 | -0.023781 | 0.167731 | -0.114966 | -0.052225 | -0.161464 | -0.138233 |
| 4 | -0.044245 | 0.171201 | -0.008067 | 0.048566 | -0.148960 | 0.023509 | -0.013836 | 0.056129 | 0.312029 | -0.116087 | ... | 0.221192 | -0.033822 | -0.076613 | -0.138665 | -0.221075 | -0.067987 | 0.029779 | 0.184153 | 0.200796 | -0.025561 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 3078301 | -0.127434 | 0.234514 | 0.033737 | 0.114038 | -0.170819 | 0.085047 | 0.121316 | 0.106980 | 0.268950 | 0.072199 | ... | 0.120395 | -0.113239 | 0.157824 | 0.116763 | -0.052521 | 0.032848 | -0.010210 | -0.031877 | 0.021030 | 0.072824 |
| 3078302 | -0.001478 | 0.322349 | -0.017481 | -0.122504 | 0.096282 | 0.098336 | 0.035148 | -0.077741 | 0.198080 | -0.164761 | ... | 0.262214 | -0.177650 | -0.180257 | -0.069942 | -0.149382 | 0.134342 | 0.043282 | 0.077701 | -0.048704 | -0.029332 |
| 3078303 | -0.095164 | 0.051032 | -0.096254 | 0.015315 | -0.069524 | 0.099273 | -0.039192 | 0.158641 | 0.295329 | -0.067302 | ... | 0.167899 | 0.100052 | 0.088500 | -0.131238 | -0.189871 | -0.080607 | 0.100726 | 0.014591 | 0.111981 | -0.011644 |
| 3078304 | -0.123624 | 0.173730 | 0.060116 | 0.008632 | -0.076002 | 0.146545 | 0.003808 | 0.101340 | 0.099717 | 0.014800 | ... | 0.014092 | -0.088312 | 0.018992 | -0.050800 | -0.126568 | -0.138310 | 0.115050 | -0.082769 | 0.190359 | -0.071905 |


In [26]:
# save to disk
# the deployment section below loads this file as "item_embeddings.parquet"
item_embeddings.to_parquet('item_embeddings.parquet')

## Deploying the Model into Production with Merlin Systems and Triton Inference Server

In [1]:
from nvtabular.loader.tf_utils import configure_tensorflow, get_dataset_schema_from_feature_columns

configure_tensorflow()



<function tensorflow.python.dlpack.dlpack.from_dlpack(dlcapsule)>

In [1]:
# import os
# os.environ["TF_MEMORY_ALLOCATION"]="0.5"

In [2]:
base_path = "/models/examples"
faiss_index_path = './tmp' + "/index.faiss"
feast_repo_path = base_path + "/feature_repo/"
retrieval_model_path = base_path + "/query_tower/"
ranking_model_path = base_path + "/dlrm/"

In [3]:
faiss_index_path, feast_repo_path, retrieval_model_path

('./tmp/index.faiss',
 '/models/examples/feature_repo/',
 '/models/examples/query_tower/')

In [4]:
from nvtabular import ColumnSchema, Schema

from merlin.systems.dag.ensemble import Ensemble
from merlin.systems.dag.ops.session_filter import FilterCandidates
from merlin.systems.dag.ops.softmax_sampling import SoftmaxSampling
from merlin.systems.dag.ops.tensorflow import PredictTensorflow
from merlin.systems.dag.ops.unroll_features import UnrollFeatures

In [5]:
from run_ensemble_triton import _run_ensemble_on_tritonserver
import pandas as pd

In [6]:
import numpy as np
import cudf
import feast
import faiss

from merlin.systems.dag.ops.faiss import QueryFaiss, setup_faiss 
from merlin.systems.dag.ops.feast import QueryFeast 


request_schema = Schema(
    [
        ColumnSchema("user_id", dtype=np.int64),
    ]
)

item_embeddings = np.ascontiguousarray(
    pd.read_parquet(base_path + "/item_embeddings.parquet").to_numpy()
)

feature_store = feast.FeatureStore(feast_repo_path)
setup_faiss(item_embeddings, str(faiss_index_path))

user_features = ["user_id"] >> QueryFeast.from_feature_view(
    store=feature_store,
    path=feast_repo_path,
    view="user_features",
    column="user_id",
    include_id=True,
)

03/24/2022 10:27:15 PM INFO:Loading faiss with AVX2 support.
03/24/2022 10:27:15 PM INFO:Could not load library with AVX2 support due to:
ModuleNotFoundError("No module named 'faiss.swigfaiss_avx2'")
03/24/2022 10:27:15 PM INFO:Loading faiss.
03/24/2022 10:27:15 PM INFO:Successfully loaded faiss.


In [7]:
retrieval = (
    user_features
    >> PredictTensorflow(retrieval_model_path)
    >> QueryFaiss(faiss_index_path, topk=100)
)
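
The retrieval stage above turns the user into a query vector and asks the FAISS index for the `topk` closest item embeddings. What such a lookup computes can be shown with an exact brute-force search. The sketch below assumes inner-product similarity and uses a hypothetical helper name; the index type and distance metric that `setup_faiss` configures are not shown in this notebook, and FAISS itself is far more efficient than this loop.

```python
import heapq


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def top_k_inner_product(query, item_vectors, k):
    """Return the indices of the k items with the largest inner product."""
    scored = ((dot(query, vector), index) for index, vector in enumerate(item_vectors))
    best = heapq.nlargest(k, scored)
    return [index for _, index in best]


# Toy item embeddings; row position plays the role of the candidate id.
toy_item_embeddings = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.9, 0.1],
    [-1.0, 0.0],
]
query_vector = [1.0, 0.0]  # stand-in for the query tower's output

candidate_ids = top_k_inner_product(query_vector, toy_item_embeddings, k=2)
print(candidate_ids)
```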

2022-03-24 22:27:20.329563: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-03-24 22:27:21.732087: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 16254 MB memory:  -> device: 0, name: Quadro GV100, pci bus id: 0000:15:00.0, compute capability: 7.0
2022-03-24 22:27:23.928631: W tensorflow/core/framework/cpu_allocator_impl.cc:82] Allocation of 1034311152 exceeds 10% of free system memory.






In [8]:
item_features = retrieval["candidate_ids"] >> QueryFeast.from_feature_view(
    store=feature_store,
    path=feast_repo_path,
    view="item_features",
    column="candidate_ids",
    output_prefix="item",
    include_id=True,
)

user_features_to_unroll = [
    "user_id",
    "user_shops",
    "user_profile",
    "user_group",
    "user_gender",
    "user_age",
    "user_consumption_2",
    "user_is_occupied",
    "user_geography",
    "user_intentions",
    "user_brands",
    "user_categories",
]
combined_features = item_features >> UnrollFeatures(
    "item_id", user_features[user_features_to_unroll]
)
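
The ranking model expects one row per (user, item) pair, but the request carries a single user and retrieval returns many candidate items. Conceptually, the unrolling step repeats the user's features next to every candidate. The snippet below is a hypothetical plain-Python picture of that reshaping, not the `UnrollFeatures` implementation.

```python
def unroll_user_features(user_row, candidate_rows):
    """Attach the same user features to every candidate item row."""
    return [{**user_row, **item_row} for item_row in candidate_rows]


user_row = {"user_id": 1, "user_age": 3, "user_gender": 2}
candidate_rows = [
    {"item_id": 692064, "item_brand": 12},
    {"item_id": 917011, "item_brand": 7},
    {"item_id": 1903152, "item_brand": 12},
]

ranking_input = unroll_user_features(user_row, candidate_rows)
for row in ranking_input:
    print(row)
```

The feature values here are invented; only the shape of the result matters: as many rows as candidates, each with both user and item columns.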

In [9]:
ranking = combined_features >> PredictTensorflow(ranking_model_path)

ordering = combined_features["item_id"] >> SoftmaxSampling(
    relevance_col=ranking["output_1"], topk=10, temperature=20.0
)
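
The ordering stage samples the final list from the ranking scores instead of always returning the strict top-k, which adds some exploration. One common way to sample without replacement from a softmax distribution is the Gumbel-top-k trick, sketched below. This is an illustration under stated assumptions: the function name is hypothetical, logits are divided by the temperature (the conventional definition), and the `SoftmaxSampling` operator may define temperature or implement sampling differently.

```python
import math
import random


def softmax_sample_top_k(item_ids, scores, topk, temperature, rng):
    """Sample topk distinct items with probabilities following softmax(scores / T)."""
    keyed = []
    for item, score in zip(item_ids, scores):
        # -log(Exp(1)) is standard Gumbel noise; the max() guards against log(0).
        noise = -math.log(max(rng.expovariate(1.0), 1e-300))
        keyed.append((score / temperature + noise, item))
    keyed.sort(reverse=True)
    return [item for _, item in keyed[:topk]]


items = ["a", "b", "c", "d"]
scores = [0.9, 0.1, 0.5, 0.3]
rng = random.Random(0)

explore = softmax_sample_top_k(items, scores, topk=3, temperature=1.0, rng=rng)
greedy = softmax_sample_top_k(items, scores, topk=3, temperature=1e-6, rng=rng)
print(explore, greedy)
```

Under this convention a very small temperature makes the noise negligible, so the result collapses to the deterministic top-k, while larger temperatures make the ordering more random.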

In [10]:
export_path = str("./test_poc")

ensemble = Ensemble(ordering, request_schema)
ens_config, node_configs = ensemble.export(export_path)

In [11]:
from merlin.core.dispatch import make_df
request = make_df({"user_id": [1]})
request["user_id"] = request["user_id"].astype(np.int64)

response = _run_ensemble_on_tritonserver(
    export_path, ensemble.graph.output_schema.column_names, request, "ensemble_model"
)

I0324 22:27:44.104774 1676 tensorflow.cc:2176] TRITONBACKEND_Initialize: tensorflow
I0324 22:27:44.104855 1676 tensorflow.cc:2186] Triton TRITONBACKEND API version: 1.8
I0324 22:27:44.104859 1676 tensorflow.cc:2192] 'tensorflow' TRITONBACKEND API version: 1.8
I0324 22:27:44.104863 1676 tensorflow.cc:2216] backend configuration:
{"cmdline":{"version":"2"}}
I0324 22:27:44.248887 1676 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7f64d8000000' with size 268435456
I0324 22:27:44.249253 1676 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0324 22:27:44.254555 1676 model_repository_manager.cc:994] loading: 0_queryfeast:1
I0324 22:27:44.355003 1676 model_repository_manager.cc:994] loading: 1_predicttensorflow:1
I0324 22:27:44.358371 1676 backend.cc:46] TRITONBACKEND_Initialize: nvtabular
I0324 22:27:44.358406 1676 backend.cc:53] Triton TRITONBACKEND API version: 1.8
I0324 22:27:44.358418 1676 backend.cc:56] 'nvtabular' TRITONBACKEND 

Signal (2) received.


I0324 22:28:01.013929 1676 server.cc:267] Timeout 29: Found 7 live models and 0 in-flight non-inference requests
I0324 22:28:02.048991 1676 server.cc:267] Timeout 28: Found 7 live models and 0 in-flight non-inference requests
I0324 22:28:03.074960 1676 server.cc:267] Timeout 27: Found 7 live models and 0 in-flight non-inference requests
 0# 0x0000559BD7E6B299 in /opt/tritonserver/bin/tritonserver
 1# 0x00007F656DC9F210 in /usr/lib/x86_64-linux-gnu/libc.so.6
 2# 0x00007F65114F1F2E in /usr/lib/x86_64-linux-gnu/libpython3.8.so.1.0
 3# TRITONBACKEND_ModelInstanceFinalize in /opt/tritonserver/backends/nvtabular/libtriton_nvtabular.so
 4# 0x00007F656E83CFC4 in /opt/tritonserver/bin/../lib/libtritonserver.so
 5# 0x00007F656E8363B9 in /opt/tritonserver/bin/../lib/libtritonserver.so
 6# 0x00007F656E836B1D in /opt/tritonserver/bin/../lib/libtritonserver.so
 7# 0x00007F656E6BA0D7 in /opt/tritonserver/bin/../lib/libtritonserver.so
 8# 0x00007F656E08DDE4 in /usr/lib/x86_64-linux-gnu/libstdc++.so.6


In [19]:
output = response.as_numpy('ordered_ids')

In [20]:
output

array([[ 692064],
       [ 917011],
       [1903152],
       [2711317],
       [1864711],
       [ 332748],
       [2036044],
       [2639642],
       [2510817],
       [1556993]])