# Offline Batch Recommender System

In this notebook, we will build a simple offline batch recsys that writes results to Redis for later access. The architecture diagram below shows how this system comes together.

![](./img/OfflineBatchRecsys.png)

## Candidate Retrieval Model

Now about the model itself. Many recommender systems use a *two-stage pipeline*:
1) A fast **candidate retrieval** model quickly narrows the large item catalog down to a relevant set of hundreds (or thousands) of options.
2) A more powerful, finely tuned **ranking model** orders those candidates by how likely the user is to interact with them.
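
The two stages can be sketched in a few lines of plain Python. This is only a toy illustration, not the Merlin implementation: the embeddings, the `retrieve` and `rank` helpers, and the `ranker_scores` lookup table (standing in for a heavier ranking model) are all made up for this example.

```python
import heapq

def retrieve(user_vec, item_vecs, n_candidates):
    """Stage 1: cheap dot-product scoring over the whole catalog."""
    scored = (
        (sum(u * v for u, v in zip(user_vec, vec)), item_id)
        for item_id, vec in item_vecs.items()
    )
    return [item_id for _, item_id in heapq.nlargest(n_candidates, scored)]

def rank(candidates, ranking_score, k):
    """Stage 2: re-score only the retrieved candidates with a costlier model."""
    return sorted(candidates, key=ranking_score, reverse=True)[:k]

# Toy catalog: item_id -> 2-d embedding (made-up numbers)
item_vecs = {1: [0.9, 0.1], 2: [0.2, 0.8], 3: [0.7, 0.6], 4: [-0.5, 0.3]}
user_vec = [1.0, 0.5]

# Hypothetical ranker: a lookup table standing in for a heavier model
ranker_scores = {1: 0.30, 2: 0.10, 3: 0.90, 4: 0.05}

candidates = retrieve(user_vec, item_vecs, n_candidates=3)
top_items = rank(candidates, ranker_scores.get, k=2)
print(candidates, top_items)
```

The ranking stage only ever sees the handful of candidates that survive retrieval, which is what makes an expensive ranker affordable.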

In this notebook, we will build a simple **Two-Tower** candidate retrieval model with TensorFlow and the Merlin/NVTabular helper utilities that can score millions of items for a given user. The Two-Tower model is a neural network architecture with two MLP towers: user features are fed into one tower and item features into the other, and each tower outputs an embedding.

Though we skip the ranking model step for now, you will pick that up in the [Multi-Stage Recommender System]() example notebook.

*This notebook was created using the latest stable [merlin-tensorflow](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/merlin/containers/merlin-tensorflow/tags) container and was heavily based on the work done by the NVIDIA Merlin team [here](https://github.com/NVIDIA-Merlin/models/blob/main/examples/05-Retrieval-Model.ipynb)*

## About the Dataset

In this notebook, we use a synthetic dataset that mimics the [Ali-CCP: Alibaba Click and Conversion Prediction](https://tianchi.aliyun.com/dataset/dataDetail?dataId=408#1) dataset. Because the data is synthetic, we can tune it to our exact needs for demonstration and learning purposes.


### Importing Libraries

In [1]:
import os
import logging

import nvtabular as nvt
import merlin.models.tf as mm
import tensorflow as tf

from nvtabular.ops import *

from merlin.datasets.synthetic import generate_data
from merlin.datasets.ecommerce import transform_aliccp
from merlin.models.utils.example_utils import workflow_fit_transform
from merlin.models.utils.dataset import unique_rows_by_features
from merlin.schema.tags import Tags
from merlin.io.dataset import Dataset


# disable WARNING, INFO, and DEBUG logging everywhere
logging.disable(logging.WARNING)


## Generate Synthetic Ali-CCP Dataset

In [4]:
def generate_aliccp_data(num_rows: int, train_size: float, valid_size: float):
    train, valid = generate_data("aliccp-raw", num_rows, set_sizes=(train_size, valid_size))
    train = train.to_ddf().compute()
    valid = valid.to_ddf().compute()
    return train, valid


In [27]:
# Generate the data
NUM_ROWS = 1000000
TRAIN_SIZE = 0.7
VALID_SIZE = 0.3

train, valid = generate_aliccp_data(NUM_ROWS, TRAIN_SIZE, VALID_SIZE)

In [28]:
# Truncate datasets to only "positive" click interactions between User/Item pairs
train = train.loc[train['click']==1].reset_index(drop=True)
valid = valid.loc[valid['click']==1].reset_index(drop=True)

# Drop the "target" interaction fields -- no longer need them
train = train.drop(['click', 'conversion'], axis=1)
valid = valid.drop(['click', 'conversion'], axis=1)

**Note:** To learn from this implicit feedback, we make the naive assumption that the items a user interacted with are **more relevant** to that user than the items they did not interact with.

This simplifying assumption lets us use a negative sampling technique.

In [29]:
# Use the Merlin Dataset wrapper to create dataset objects
train = Dataset(train)
valid = Dataset(valid)

In [30]:
# Define output path for data
DATA_DIR = os.path.join(os.environ["PWD"], "data")
OUTPUT_DATA_DIR = os.path.join(DATA_DIR, "processed")
CATEGORY_TEMP_DIR = os.path.join(DATA_DIR, "categories")

In [31]:
# Define Feature Transformation Pipeline

user_id = ["user_id"] >> Categorify(out_path=CATEGORY_TEMP_DIR) >> TagAsUserID()
item_id = ["item_id"] >> Categorify(out_path=CATEGORY_TEMP_DIR) >> TagAsItemID()

item_features = ["item_category", "item_shop", "item_brand"] >> Categorify(out_path=CATEGORY_TEMP_DIR) >> TagAsItemFeatures()

user_features = (
    [
        "user_shops",
        "user_profile",
        "user_group",
        "user_gender",
        "user_age",
        "user_consumption_2",
        "user_is_occupied",
        "user_geography",
        "user_intentions",
        "user_brands",
        "user_categories",
    ]
    >> Categorify(out_path=CATEGORY_TEMP_DIR)
    >> TagAsUserFeatures()
)

outputs = user_id + item_id + item_features + user_features

With the `transform_aliccp` function, we can run `fit()` and `transform()` on the raw dataset, applying the operators defined in the NVTabular workflow above. The processed parquet files are saved to the output path (`OUTPUT_DATA_DIR` here).
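
To make the fit/transform split concrete, here is a rough plain-Python sketch of what a categorical encoder does conceptually. It is not NVTabular's `Categorify` implementation: the helper names are hypothetical, and both the id ordering and the use of `0` for unseen values are assumptions of this sketch.

```python
def fit_categories(values):
    """'fit': learn a value -> integer id mapping from the training data."""
    mapping = {}
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping) + 1  # ids start at 1
    return mapping

def transform_categories(values, mapping, unknown_id=0):
    """'transform': apply the learned mapping; unseen values get unknown_id."""
    return [mapping.get(value, unknown_id) for value in values]

train_brands = ["nike", "adidas", "nike", "puma"]
valid_brands = ["puma", "reebok"]

mapping = fit_categories(train_brands)
train_encoded = transform_categories(train_brands, mapping)
valid_encoded = transform_categories(valid_brands, mapping)
print(train_encoded, valid_encoded)
```

The mapping is learned once and then reused, so the training and validation sets share the same integer ids.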

In [32]:
# Transform data and create files
transform_aliccp((train, valid), OUTPUT_DATA_DIR, nvt_workflow=outputs)

## Building a Two-Tower Model

We will use a Two-Tower model to retrieve a subset of relevant items from a large item corpus for a given user.

A Two-Tower model consists of an item (candidate) encoder tower and a user (query) encoder tower. With two towers, the model can learn representations (embeddings) for queries and candidates separately.
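
As a rough illustration of the idea, the sketch below runs a forward pass through two tiny, randomly initialized MLP towers in plain Python. It is a simplification under stated assumptions: the real model embeds categorical features before the MLP and is trained, whereas here the inputs are raw floats, the weights are random, and all helper names are hypothetical.

```python
import random

def dense(x, weights, biases, relu=True):
    """One fully connected layer; `weights` holds one row per output unit."""
    out = [sum(xi * w for xi, w in zip(x, row)) + b for row, b in zip(weights, biases)]
    return [max(0.0, v) for v in out] if relu else out

def make_layer(rng, n_in, n_out):
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def make_tower(rng, n_in, hidden, out_dim):
    return [make_layer(rng, n_in, hidden), make_layer(rng, hidden, out_dim)]

def encode(tower, x):
    (w1, b1), (w2, b2) = tower
    h = dense(x, w1, b1, relu=True)
    return dense(h, w2, b2, relu=False)  # no activation on the last layer

rng = random.Random(0)
user_tower = make_tower(rng, n_in=3, hidden=8, out_dim=4)  # query tower
item_tower = make_tower(rng, n_in=2, hidden=8, out_dim=4)  # candidate tower

user_emb = encode(user_tower, [1.0, 0.0, 0.5])
item_emb = encode(item_tower, [0.3, 0.7])

# The towers never see each other's inputs; they only meet at the dot product
score = sum(u * v for u, v in zip(user_emb, item_emb))
print(len(user_emb), len(item_emb), score)
```

Because the towers are independent, item embeddings can be computed once and stored, and only the user tower has to run per query.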

<img src="./images/TwoTower.png"  width="30%">

Image Adapted from: [Off-policy Learning in Two-stage Recommender Systems](https://dl.acm.org/doi/abs/10.1145/3366423.3380130)

In [40]:
# Load from file
train = Dataset(os.path.join(OUTPUT_DATA_DIR, "train", "*.parquet"))
valid = Dataset(os.path.join(OUTPUT_DATA_DIR, "valid", "*.parquet"))



Use the `schema` object to define our model. Select the features with user and item tags, and be sure to exclude the target column.

In [41]:
# Schema will consist of the User ID, Item ID, User Features, and Item Features (as defined above)
schema = train.schema.select_by_tag([Tags.ITEM_ID, Tags.USER_ID, Tags.ITEM, Tags.USER])

# Set the schema for our datasets
train.schema = schema
valid.schema = schema

Inspect the column names in the schema here:

In [43]:
schema

Unnamed: 0,name,tags,dtype,is_list,is_ragged,properties.num_buckets,properties.freq_threshold,properties.max_size,properties.start_index,properties.cat_path,properties.embedding_sizes.cardinality,properties.embedding_sizes.dimension,properties.domain.min,properties.domain.max,properties.domain.name
0,user_id,"(Tags.CATEGORICAL, Tags.USER_ID, Tags.ID, Tags...",int64,False,False,,0.0,0.0,0.0,/workdir/data/categories/categories/unique.use...,639.0,60.0,0,638,user_id
1,item_id,"(Tags.ITEM, Tags.CATEGORICAL, Tags.ITEM_ID, Ta...",int64,False,False,,0.0,0.0,0.0,/workdir/data/categories/categories/unique.ite...,673.0,61.0,0,672,item_id
2,item_category,"(Tags.ITEM, Tags.CATEGORICAL)",int64,False,False,,0.0,0.0,0.0,/workdir/data/categories/categories/unique.ite...,673.0,61.0,0,672,item_category
3,item_shop,"(Tags.ITEM, Tags.CATEGORICAL)",int64,False,False,,0.0,0.0,0.0,/workdir/data/categories/categories/unique.ite...,673.0,61.0,0,672,item_shop
4,item_brand,"(Tags.ITEM, Tags.CATEGORICAL)",int64,False,False,,0.0,0.0,0.0,/workdir/data/categories/categories/unique.ite...,673.0,61.0,0,672,item_brand
5,user_shops,"(Tags.CATEGORICAL, Tags.USER)",int64,False,False,,0.0,0.0,0.0,/workdir/data/categories/categories/unique.use...,639.0,60.0,0,638,user_shops
6,user_profile,"(Tags.CATEGORICAL, Tags.USER)",int64,False,False,,0.0,0.0,0.0,/workdir/data/categories/categories/unique.use...,54.0,16.0,0,53,user_profile
7,user_group,"(Tags.CATEGORICAL, Tags.USER)",int64,False,False,,0.0,0.0,0.0,/workdir/data/categories/categories/unique.use...,12.0,16.0,0,11,user_group
8,user_gender,"(Tags.CATEGORICAL, Tags.USER)",int64,False,False,,0.0,0.0,0.0,/workdir/data/categories/categories/unique.use...,3.0,16.0,0,2,user_gender
9,user_age,"(Tags.CATEGORICAL, Tags.USER)",int64,False,False,,0.0,0.0,0.0,/workdir/data/categories/categories/unique.use...,8.0,16.0,0,7,user_age


As expected, the schema should not contain any label/target columns yet:

In [44]:
label_names = schema.select_by_tag(Tags.TARGET).column_names
label_names

[]

### About Negative Sampling

Many datasets for recommender systems contain implicit feedback, such as logs of clicks, add-to-cart events, purchases, and music listening events, rather than explicit ratings that reflect user preferences over items. Such logs record only positive interactions, so negative examples have to be sampled.

Merlin Models provides some scalable negative sampling algorithms for the item retrieval task. In this example, we use the `in-batch` sampling algorithm, which uses the items interacted with by other users in the same mini-batch as negatives.
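
The general idea of in-batch negatives can be sketched as a softmax cross-entropy over the batch: for row `i`, item `i` is the positive and every other item in the batch acts as a negative. This is a generic sketch in plain Python, not the Merlin Models implementation, which may differ in its details (for example, in how it handles accidental duplicates of the positive item).

```python
import math

def in_batch_softmax_loss(user_embs, item_embs):
    """Mean cross-entropy where row i's positive is item i and the
    remaining items in the batch serve as negatives."""
    losses = []
    for i, user in enumerate(user_embs):
        logits = [sum(a * b for a, b in zip(user, item)) for item in item_embs]
        # Numerically stable log-sum-exp over all items in the batch
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)

users = [[1.0, 0.0], [0.0, 1.0]]
items = [[2.0, 0.0], [0.0, 2.0]]

loss_aligned = in_batch_softmax_loss(users, items)        # positives score highest
loss_swapped = in_batch_softmax_loss(users, items[::-1])  # negatives score highest
print(loss_aligned, loss_swapped)
```

The loss is small when each user's own item scores higher than the other items in the batch, and large otherwise.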

### Model Architecture

The **Two-Tower** model consists of a **User tower** (where all user features are fed) and an **Item tower** (where all item features are fed).

Each tower generates an embedding for its input: the User tower for the User, and the Item tower for the Item. The model then computes the interaction "score" (likelihood of an interaction event) as the dot product between the User embedding and the Item embedding, both for the positive Item and for the sampled "negative" Items within the batch.

In [45]:
def create_two_tower(tower_dim: int, encoder_dim: int, optimizer: str, k: int, tags) -> mm.TwoTowerModelV2:
    # User/Query Tower
    user_schema = schema.select_by_tag(tags.USER)
    # create user (query) tower input block
    user_inputs = mm.InputBlockV2(user_schema)
    # create user (query) encoder block
    query = mm.Encoder(
        user_inputs,
        mm.MLPBlock([encoder_dim, tower_dim], no_activation_last_layer=True)
    )

    # Item/Candidate Tower
    item_schema = schema.select_by_tag(tags.ITEM)
    # create item (candidate) tower input block
    item_inputs = mm.InputBlockV2(item_schema)
    # create item (candidate) encoder block
    candidate = mm.Encoder(
        item_inputs,
        mm.MLPBlock([encoder_dim, tower_dim], no_activation_last_layer=True)
    )
    
    # Build Model Class
    model = mm.TwoTowerModelV2(query, candidate)
    model.compile(optimizer=optimizer, run_eagerly=False, metrics=[mm.RecallAt(k), mm.NDCGAt(k)])
    return model

**Notes:**
- `no_activation_last_layer`: when set to `True`, no activation is applied to the top hidden layer. Learn more [here](https://storage.googleapis.com/pub-tools-public-publication-data/pdf/b9f4e78a8830fe5afcf2f0452862fb3c0d6584ea.pdf).
- We did not set the `negative_samplers` argument when constructing `TwoTowerModelV2`. By default, it uses contrastive learning with the `in-batch` negative sampling strategy.
- Two metrics are used to judge the quality of the recommendations: **Normalized Discounted Cumulative Gain (NDCG@K)** and **Recall@K**.
    - NDCG@K accounts for the rank of the relevant item in the recommendation list, which makes it more fine-grained than HitRate@K (HR), which only checks whether the relevant item is among the top-k items.
    - Recall@K is equivalent to HitRate@K when there is only one relevant item per recommendation list: it just checks whether the relevant item is among the top-k items.
- When we set `validation_data=valid` in `model.fit()`, evaluation metrics are computed on the validation set using the same negative sampling strategy used for training.
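
For the single-relevant-item case described above, both metrics reduce to short formulas. The sketch below is plain Python with hypothetical helper names; it is not how `mm.RecallAt` or `mm.NDCGAt` are implemented, only the textbook definitions applied to one ranked list.

```python
import math

def recall_at_k(ranked_items, relevant_item, k):
    """1.0 if the single relevant item is in the top k, else 0.0."""
    return 1.0 if relevant_item in ranked_items[:k] else 0.0

def ndcg_at_k(ranked_items, relevant_item, k):
    """With one relevant item the ideal DCG is 1, so NDCG is the
    discount at the item's position: 1 / log2(rank + 1), rank 1-based."""
    for position, item in enumerate(ranked_items[:k]):
        if item == relevant_item:
            return 1.0 / math.log2(position + 2)
    return 0.0

ranked = [44, 116, 41, 82, 176]

hit_third = (recall_at_k(ranked, 41, k=3), ndcg_at_k(ranked, 41, k=3))
hit_first = (recall_at_k(ranked, 44, k=3), ndcg_at_k(ranked, 44, k=3))
miss = (recall_at_k(ranked, 82, k=3), ndcg_at_k(ranked, 82, k=3))
print(hit_third, hit_first, miss)
```

Recall treats a hit at position 1 and a hit at position 3 the same, while NDCG rewards the higher position.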

In [49]:
# Initialize model
model = create_two_tower(
    tower_dim=64,
    encoder_dim=128,
    optimizer="adam",
    k=10,
    tags=Tags
)

# Fit model
model.fit(train, validation_data=valid, batch_size=4096, epochs=2)

Epoch 1/2
Epoch 2/2


<keras.callbacks.History at 0x7f6d85deadf0>

### Evaluate the model accuracy

The validation metric values during training are calculated given the positive and negative scores in each batch, and then averaged over batches per epoch. **That means validation metrics are not computed using the entire item catalog.**

To determine the exact accuracy, we need to compute the similarity score between a given query and all possible candidates. Below, by using the `topk_model` we can evaluate the trained retrieval model using the entire item catalog (brute force).
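
A toy example shows why the two evaluation modes can disagree. The embeddings and the `rank_of_positive` helper below are made up for illustration: the positive item looks perfect when ranked only against the negatives that happen to be in its batch, but falls to third place once the whole catalog is scored.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rank_of_positive(user_vec, positive_id, candidate_ids, item_vecs):
    """1-based rank of the positive item among the given candidates."""
    pos_score = dot(user_vec, item_vecs[positive_id])
    higher = sum(
        1
        for item_id in candidate_ids
        if item_id != positive_id and dot(user_vec, item_vecs[item_id]) > pos_score
    )
    return 1 + higher

# Toy catalog of five items (made-up embeddings)
item_vecs = {
    1: [1.0, 0.0],
    2: [0.0, 1.0],
    3: [2.0, 0.0],
    4: [1.5, 0.5],
    5: [-1.0, 0.0],
}
user_vec = [1.0, 0.0]
positive_id = 1

# Only items 2 and 5 happen to share the batch with the positive
in_batch_rank = rank_of_positive(user_vec, positive_id, [1, 2, 5], item_vecs)
full_catalog_rank = rank_of_positive(user_vec, positive_id, list(item_vecs), item_vecs)
print(in_batch_rank, full_catalog_rank)
```

Metrics computed against in-batch negatives can therefore look better than metrics computed against the full catalog.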

In [54]:
# Create candidate/item features for evaluation
candidate_features = unique_rows_by_features(train, Tags.ITEM, Tags.ITEM_ID)

In [55]:
# Here's a display of all of the items
candidate_features.to_ddf().compute()

Unnamed: 0,item_id,item_category,item_shop,item_brand
6,1,1,1,1
64,2,2,2,2
1,3,3,3,3
36,4,4,4,4
15,5,5,5,5
...,...,...,...,...
242971,668,668,668,668
10928,669,669,669,669
199735,670,670,670,670
343407,671,671,671,671


In [56]:
# Convert model to a top_k_encoder
topk_model = model.to_top_k_encoder(candidate_features, k=20, batch_size=128)

# we can set the `metrics` param in `compile()` if we want
topk_model.compile(run_eagerly=False)



In [57]:
# Create data loader for validation data
eval_loader = mm.Loader(valid, batch_size=1024).map(mm.ToTarget(schema, "item_id"))

# Evaluation
metrics = topk_model.evaluate(eval_loader, return_dict=True)
metrics



{'loss': 0.46805083751678467,
 'recall_at_10': 0.09828343242406845,
 'mrr_at_10': 0.03562566637992859,
 'ndcg_at_10': 0.0501684807240963,
 'map_at_10': 0.03562433272600174,
 'precision_at_10': 0.009822357445955276,
 'regularization_loss': 0.0,
 'loss_batch': 0.42506444454193115}

### Generate top-K recommendations

Let's generate top-K (k=20 in our example) recommendations for a batch of 8 users. The `to_top_k_encoder()` method uses the item/candidate features dataset to compute all item/candidate embeddings and store them in an index. The forward method of `topk_model` takes the query/user features as input and computes the dot-product scores between the given query/user embeddings and all the candidates in the top-k index. It then returns the top-k (k=20) item ids with the highest scores.

In [189]:
# Build a batch of 8 unique users from the training set
user_features = unique_rows_by_features(train, Tags.USER, Tags.USER_ID)
loader = mm.Loader(user_features, batch_size=8, shuffle=False)
batch = next(iter(loader))
print(batch[0]['user_id'])

tf.Tensor(
[[1]
 [2]
 [3]
 [4]
 [5]
 [6]
 [7]
 [8]], shape=(8, 1), dtype=int64)


The recommended top 20 item ids are returned below for each of the 8 selected users (taken from the unique users in the training set). The output is a named tuple, `TopKPrediction`, whose first element holds the dot-product scores and whose second element holds the encoded item ids (not the original ids).

In [195]:
scores, recommended_item_ids = topk_model(batch[0])

In [196]:
# Encoded item ids of the top-20 recommendations for each user
recommended_item_ids

<tf.Tensor: shape=(8, 20), dtype=int32, numpy=
array([[ 44, 116,  41,  82, 176, 427, 120,  46,  13,   9, 141, 435,  27,
         30,   1, 309, 190, 302, 501,  79],
       [  2, 342, 309,   3,  15, 255, 334, 666,  66, 562, 319, 171, 341,
         17, 439, 205, 590, 234, 509,  27],
       [154,  15, 166, 156, 261,  64, 326, 293, 214, 360,  10, 359, 134,
        425, 379, 348,  27, 513,   3, 351],
       [  5,   6,   9, 433, 382, 185, 290,   8, 341,  11, 466, 226, 297,
        309,  20,   7, 440, 180,   4, 176],
       [  3,  10,  21, 208, 256, 441, 129,   2, 353, 321, 486, 192,   5,
        150, 123,  16, 188, 285, 379,  69],
       [  5,  10,  18,   8,   6, 150,  21, 286, 157, 128, 256,  95, 311,
        358,  22, 623, 604, 584, 129, 237],
       [205, 326,  14,  10, 166,  15,   6,  19,  18, 457,  22, 297, 662,
        256, 154, 530, 162,   4, 102, 185],
       [ 13, 171,   1, 141, 120, 309, 226,  41, 494,  17, 427,  62,   6,
         44,   2,   4,  25, 243, 280,   9]], dtype=int32)>

In [197]:
# Dot-product scores of the top-20 recommendations for each user
scores

<tf.Tensor: shape=(8, 20), dtype=float32, numpy=
array([[0.11971331, 0.11730451, 0.11710918, 0.10144109, 0.10136084,
        0.10009474, 0.09451863, 0.09426868, 0.09191871, 0.09190184,
        0.08844166, 0.08825772, 0.08708625, 0.08590761, 0.08502369,
        0.08320396, 0.08169447, 0.08130737, 0.08104654, 0.08101486],
       [0.1007937 , 0.05819228, 0.05676133, 0.05507875, 0.05385191,
        0.05231218, 0.05026827, 0.04818096, 0.04429016, 0.04413381,
        0.04370643, 0.04094318, 0.04007391, 0.03957719, 0.03950412,
        0.03864555, 0.03795424, 0.03625087, 0.03609602, 0.03584536],
       [0.04929588, 0.04792584, 0.04485624, 0.04083929, 0.0402455 ,
        0.03739437, 0.03208239, 0.0317454 , 0.03147587, 0.02995438,
        0.02948501, 0.0286586 , 0.02753394, 0.02735972, 0.02705404,
        0.02617747, 0.02584481, 0.02583223, 0.02566369, 0.02544329],
       [0.05400515, 0.05316183, 0.04638377, 0.04348372, 0.04185567,
        0.04057237, 0.0401042 , 0.03878526, 0.03799238, 0.037126

## Writing Recommendations to the Inference Store

Redis, a low-latency key-value store, is used to persist the recommendations for each User.


In [198]:
import asyncio
import redis.asyncio as redis

from redis.commands.json.path import Path


def generate_topk_recs(topk_model, user_features, K: int, batch_size: int):
    loader = mm.Loader(user_features, batch_size=batch_size, shuffle=False)
    for batch in loader:
        users = batch[0]['user_id']
        
        scores, topk_items = topk_model(batch[0])
        for user, recs in zip(users.numpy(), topk_items.numpy()):
            user_id = user[0]
            yield user_id, recs.tolist()[:K]

In [199]:
# Test Recommendation Generator
next(generate_topk_recs(topk_model, valid, K=10, batch_size=32))

# SEE BELOW: User ID --> Top K Item IDs

(4, [5, 6, 9, 433, 382, 185, 290, 8, 341, 11])

In [203]:
async def store_recommendations(topk_model, valid, n: int, redis_conn: redis.Redis):
    """
    Store recommendations generated for each User.
    """
    semaphore = asyncio.Semaphore(n)
    async def store(user_id: int, recs: list):
        """
        Store an individual User's latest recommendations in Redis.
        """
        async with semaphore:
            entry = {
                "user_id": int(user_id),
                "recommendations": [int(rec) for rec in recs]
            }
            # Set the JSON object in Redis
            await redis_conn.json().set(f"USER:{user_id}", Path.root_path(), entry)
    
    # create generator
    topk_recs_per_user = generate_topk_recs(topk_model, valid, K=10, batch_size=32)
    # gather with "concurrency"
    await asyncio.gather(*[store(user_id, recs) for user_id, recs in topk_recs_per_user])
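
The semaphore-plus-`gather` pattern used above can be tried without a Redis server. In this sketch a plain dict stands in for Redis and `asyncio.sleep(0)` stands in for the network round trip; `store_all` and its parameters are hypothetical names for this example only.

```python
import asyncio

async def store_all(records, store, max_concurrency):
    """Write every (key, value) pair with at most `max_concurrency`
    writes in flight; returns the peak number of concurrent writes."""
    semaphore = asyncio.Semaphore(max_concurrency)
    in_flight = 0
    peak = 0

    async def store_one(key, value):
        nonlocal in_flight, peak
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0)  # stand-in for the network round trip
            store[key] = value      # stand-in for the Redis write
            in_flight -= 1

    await asyncio.gather(*(store_one(key, value) for key, value in records))
    return peak

fake_store = {}
records = [
    (f"USER:{i}", {"user_id": i, "recommendations": [i + 1, i + 2]})
    for i in range(10)
]

# In a script; inside a notebook cell, use `await store_all(...)` instead
peak = asyncio.run(store_all(records, fake_store, max_concurrency=3))
print(len(fake_store), peak)
```

All tasks are created up front, but the semaphore limits how many of them are inside the write section at any moment.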

In [204]:
redis_conn = redis.Redis(
    host="redis-inference-store",
    port=6379,
    decode_responses=True
)

# Run the process
await store_recommendations(topk_model, valid, n=100, redis_conn=redis_conn)

In [205]:
!redis-cli -h redis-inference-store -p 6379 JSON.GET USER:1

/bin/bash: redis-cli: command not found


## Conclusion

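Now you have all of the tools to **train** a Two-Tower retrieval model, generate top-K recommendations for each User in batch, and write them to Redis for later access.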

## Exporting Retrieval Models

So far we have trained and evaluated our retrieval model. The next step is to deploy the model and generate top-K recommendations for a given user (query). We can serve the model efficiently by indexing the trained item embeddings in an **Approximate Nearest Neighbors (ANN)** engine. At serve time, we pass the user features through the user tower of the retrieval model to generate a user query vector, then run an ANN search with that vector to find the ids of the nearest item vectors, which become the top-K recommendations.

To do so, we need to export:
 
- user (query) tower
- item and user features
- item embeddings
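
To give a feel for what an ANN lookup over the exported item embeddings does, here is a toy sketch based on hyperplane hashing. It is an assumption-laden illustration, not what any particular ANN engine does: production engines use other index structures, real hyperplane hashing draws random planes (fixed axis-aligned planes are used here for reproducibility), and this scheme groups vectors by direction rather than by dot product directly.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def signature(vec, planes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(dot(vec, plane) >= 0 for plane in planes)

def build_index(item_vecs, planes):
    """Group item ids into buckets keyed by their signature."""
    buckets = {}
    for item_id, vec in item_vecs.items():
        buckets.setdefault(signature(vec, planes), []).append(item_id)
    return buckets

def ann_search(query, item_vecs, planes, buckets, k):
    """Score only the items that share the query's bucket."""
    candidates = buckets.get(signature(query, planes), [])
    return sorted(candidates, key=lambda i: dot(query, item_vecs[i]), reverse=True)[:k]

# Fixed planes and made-up 2-d item embeddings
planes = [[1.0, 0.0], [0.0, 1.0]]
item_vecs = {1: [1.0, 1.0], 2: [0.2, 2.0], 3: [-1.0, 1.0], 4: [-1.0, -1.0]}

buckets = build_index(item_vecs, planes)
nearest = ann_search([1.0, 0.5], item_vecs, planes, buckets, k=2)
print(nearest)
```

Items 3 and 4 are never scored for this query, which is the trade-off ANN makes: fewer comparisons in exchange for possibly missing a true nearest neighbor.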

#### Save and Load User (query) tower

We can save the user tower to disk as a TF model. The user tower is needed to generate a user embedding vector when a user feature vector *x* is fed into it.

In [167]:
query_tower = model.query_encoder
query_tower.save(os.path.join(DATA_DIR, "query_tower"))

# we can load the saved model back with the following line
#query_tower_loaded = tf.keras.models.load_model(os.path.join(DATA_DIR, 'query_tower'))

#### Extract and save User features

With the `unique_rows_by_features` utility function, we can easily extract the unique user and item feature tables as cuDF dataframes. Note that for the user features table, we use the `USER` and `USER_ID` tags.

In [168]:
user_features = (
    unique_rows_by_features(train, Tags.USER, Tags.USER_ID).compute().reset_index(drop=True)
)

In [169]:
user_features.head()

Unnamed: 0,user_id,user_shops,user_profile,user_group,user_gender,user_age,user_consumption_2,user_is_occupied,user_geography,user_intentions,user_brands,user_categories
0,1,1,1,1,1,1,1,1,1,1,1,1
1,2,2,1,1,1,1,1,1,1,2,2,2
2,3,3,1,1,1,1,1,1,1,3,3,3
3,4,4,1,1,1,1,1,1,1,4,4,4
4,5,5,1,1,1,1,1,1,1,5,5,5


In [171]:
# save to disk
user_features.to_parquet(os.path.join(DATA_DIR, "user_features.parquet"))

#### Generate Query embeddings for entire user catalog

In [172]:
queries = model.query_embeddings(Dataset(user_features, schema=schema), batch_size=1024, index=Tags.USER_ID)
query_embs_df = queries.compute(scheduler="synchronous").reset_index()



In [173]:
query_embs_df.head()

Unnamed: 0,user_id,0,1,2,3,4,5,6,7,8,...,54,55,56,57,58,59,60,61,62,63
0,1,-0.044415,-0.079928,0.141791,-0.054618,0.07108,-0.129675,-0.022908,-0.098638,-0.165534,...,0.069087,-0.041935,-0.088211,0.009338,0.042672,0.064693,0.089439,-0.046101,0.019635,0.063172
1,2,0.050006,0.048904,0.122381,-0.045254,0.087561,-0.083287,0.047019,-0.062766,-0.025593,...,-0.034007,0.008304,0.003901,-0.041683,0.066743,0.026556,0.051612,-0.048407,0.101967,0.01646
2,3,0.032441,0.068938,0.14422,0.02558,-0.021241,-0.087041,-0.005831,-0.011544,-0.028608,...,-0.039868,0.02217,-0.067877,-0.04548,0.029166,0.085423,-0.011185,-0.077492,0.052504,0.085522
3,4,-5.7e-05,0.027755,0.110519,0.004052,0.026682,-0.023924,0.000899,-0.079054,-0.12101,...,0.042694,-0.068973,-0.055938,-0.039705,0.065993,-0.022096,0.04686,-0.075874,0.105141,0.074089
4,5,0.038788,0.093953,0.085692,-0.030298,0.056857,-0.07618,0.046644,-0.035246,-0.080857,...,-0.020631,-0.004258,0.006572,-0.032787,0.025109,-0.013951,-0.019807,-0.078905,0.125748,0.012741


#### Extract and save Item features

In [174]:
item_features = (
    unique_rows_by_features(train, Tags.ITEM, Tags.ITEM_ID).compute().reset_index(drop=True)
)

In [175]:
item_features.head()

Unnamed: 0,item_id,item_category,item_shop,item_brand
0,1,1,1,1
1,2,2,2,2
2,3,3,3,3
3,4,4,4,4
4,5,5,5,5


In [177]:
# save to disk
item_features.to_parquet(os.path.join(DATA_DIR, "item_features.parquet"))

#### Extract and save Item embeddings

In [178]:
item_embs = model.candidate_embeddings(Dataset(item_features, schema=schema), batch_size=1024, index=Tags.ITEM_ID)



In [179]:
item_embs_df = item_embs.compute(scheduler="synchronous")

In [180]:
item_embs_df

item_id,0,1,2,3,4,5,6,7,8,9,...,54,55,56,57,58,59,60,61,62,63
1,-0.086588,-0.035170,0.038872,-0.030351,-0.005060,-0.026615,0.036075,-0.019011,-0.025655,0.061539,...,0.079430,-0.023591,0.039861,0.001400,0.047819,-0.046945,0.080522,-0.049821,0.011490,-0.034052
2,-0.020607,-0.028564,0.052904,-0.044290,0.044060,-0.077663,0.063523,-0.002706,-0.049997,-0.046659,...,-0.019446,0.017111,0.047408,0.014948,0.037164,-0.055093,0.080557,-0.004503,0.046443,-0.029852
3,0.014424,0.062466,0.026331,-0.016829,0.067588,-0.040853,0.049609,-0.068072,-0.040935,-0.016965,...,0.029904,0.049761,0.036740,-0.054622,0.017131,-0.079658,-0.063603,-0.020487,0.013681,0.013386
4,-0.100455,-0.087459,-0.002279,0.015967,-0.054807,0.021265,-0.003342,0.098855,-0.029481,0.007833,...,0.119475,0.023623,0.032304,-0.038052,0.066611,-0.063890,0.049579,-0.140459,0.029618,-0.030336
5,-0.010816,0.033060,0.055615,0.017870,0.008282,-0.040606,0.008903,-0.037174,-0.029034,0.012322,...,0.061865,-0.025471,0.040039,-0.002353,0.021088,-0.013709,0.019025,-0.025305,-0.004278,0.054150
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
668,-0.011978,0.008819,0.016523,-0.073915,0.054216,-0.067495,0.008583,-0.067215,0.014997,0.033973,...,0.052676,0.002763,0.043437,-0.001684,0.025481,-0.051444,0.030094,-0.034564,-0.005135,-0.015033
669,-0.051683,-0.021606,0.018060,-0.038789,-0.007095,-0.028825,0.026450,0.028714,-0.007321,0.013074,...,0.093724,0.010164,0.032347,-0.009498,-0.022861,-0.095878,0.034653,-0.067983,-0.050209,-0.018708
670,-0.014077,-0.005224,0.057278,-0.001175,-0.005469,0.012073,0.023982,-0.001092,-0.052760,0.024767,...,0.066937,-0.007940,0.025360,-0.028201,0.024175,-0.053981,0.062025,-0.054338,-0.030390,0.019311
671,-0.001167,-0.023767,0.016283,-0.048372,0.009309,-0.022098,0.010793,0.003111,-0.034449,0.045869,...,0.045344,0.004631,-0.002546,-0.008917,-0.022978,0.001993,0.026103,-0.005468,-0.046113,0.056712


In [182]:
# save to disk
item_embs_df.to_parquet(os.path.join(DATA_DIR, "item_embeddings.parquet"))

That's it. You have learned how to train and evaluate a Two-Tower retrieval model and how to export the components required to deploy it and generate recommendations. To learn more about serving a model with [Triton Inference Server](https://github.com/triton-inference-server/server), explore the examples in the [Merlin](https://github.com/NVIDIA-Merlin/Merlin) and [Merlin Systems](https://github.com/NVIDIA-Merlin/systems) repos.