<img src="https://developer.download.nvidia.com/notebooks/dlsw-notebooks/merlin_merlin_01-building-recommender-systems-with-merlin/nvidia_logo.png" style="width: 90px; float: right;"> 

# Building Online Multi-Stage Recsys Components
The figure below represents a **four-stage recommender system**. This is more complex process than only training a single model and deploying it, and it is much more realistic and closer to what's happening in the real-world recommender production systems. The models and data to perform the bottom row of tasks were previously completed in the first notebook [here](...).

![img](./img/OnlineMultiStageRecsys.png)

In this notebook, we are going to prepare the assets on the bottom row to deploy a four-stage recommender system on [Triton Inference Server](https://github.com/triton-inference-server/server). 

To learn more about the four-stage recommender systems, you can listen to Even Oldridge's [Moving Beyond Recommender Models talk](https://www.youtube.com/watch?v=5qjiY-kLwFY&list=PL65MqKWg6XcrdN4TJV0K1PdLhF_Uq-b43&index=7) at KDD'21 and read more [in this blog post](https://eugeneyan.com/writing/system-design-for-discovery/).

In addition to NVIDIA Merlin libraries and the Triton Inference Server client library, we use two external libraries in these series of examples:

- [Feast](https://docs.feast.dev/): an end-to-end open source feature store library for machine learning
- [Redis](https://github.com/redis/redis-py): a low-latency key-value store and ANN index

## Steps
1) [**Feature Store Setup**](#Feature-Store-Setup)
2) [**Redis ANN Index Setup**](#Redis-ANN-Index-Setup)

### Import required libraries and functions

*These notebooks are developed and tested using `merlin-tensorflow:22.11` container on [NVIDIA's docker registry](https://catalog.ngc.nvidia.com/containers?filters=&orderBy=dateModifiedDESC&query=merlin).*

In [1]:
import os
import merlin.models.tf as mm
import nvtabular as nvt
import numpy as np
import tensorflow as tf

from merlin.datasets.ecommerce import transform_aliccp
from merlin.schema.tags import Tags
from merlin.io.dataset import Dataset
from nvtabular.ops import *


# for running this example on CPU, comment out the line below
# os.environ["TF_GPU_ALLOCATOR"] = "cuda_malloc_async"

2023-01-13 19:42:05.080627: I tensorflow/core/util/util.cc:169] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
  from .autonotebook import tqdm as notebook_tqdm
2023-01-13 19:42:08.796452: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:991] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-01-13 19:42:08.798022: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:991] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-01-13 19:42:08.799213: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:991] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node z

In [2]:
# disable INFO and DEBUG logging everywhere
import logging

logging.disable(logging.WARNING)

First, we define our input path and feature repo path.

In [8]:
# Define output path for data
BASE_DIR = os.environ['PWD']
DATA_DIR = os.path.join(BASE_DIR, "data")
DLRM_DIR = os.path.join(DATA_DIR, "dlrm")
QUERY_TOWER_DIR = os.path.join(DATA_DIR, "query_tower")
OUTPUT_DATA_DIR = os.path.join(DATA_DIR, "processed")
OUTPUT_RETRIEVAL_DATA_DIR = os.path.join(OUTPUT_DATA_DIR, "retrieval")
CATEGORY_TEMP_DIR = os.path.join(DATA_DIR, "categories")

Next, we need to load the previously trained assets. If you have your own great, make sure they end up in the same folder structure as the ones we will pull from the publically hosted S3 bucket below

In [4]:
# Pull data/models from S3
# TODO

## Feature Store Setup

We need to create a Feast feature repository. [Feast](https://feast.dev/) is an end-to-end open source feature store for machine learning. Feast (Feature Store) is a customizable operational data system that re-uses existing infrastructure to manage and serve machine learning features to real-time models.

We will create the feature repo in the current working directory, which is `BASE_DIR` for us.

In [24]:
BASE_DIR

'/workdir'

In [33]:
!rm -rf $BASE_DIR/feature_repo
!cd $BASE_DIR && feast init feature_repo


Creating a new Feast repository in [1m[32m/workdir/feature_repo[0m.



You should be seeing a message like <i>Creating a new Feast repository in ... </i> printed out above. Now, we navigate to the `feature_repo` folder and remove the demo parquet file created by default, and `examples.py` file.

In [34]:
# Cleanup
feature_repo_path = os.path.join(BASE_DIR, "feature_repo", "feature_repo")
if os.path.exists(f"{feature_repo_path}/example_repo.py"):
    os.remove(f"{feature_repo_path}/example_repo.py")
if os.path.exists(f"{feature_repo_path}/data/driver_stats.parquet"):
    os.remove(f"{feature_repo_path}/data/driver_stats.parquet")

### Prepare User and Item features

In [17]:
from merlin.models.utils.dataset import unique_rows_by_features

# Load pre-generated User features file
user_features = Dataset(os.path.join(DATA_DIR, "user_features.parquet")).to_ddf().compute()
user_features.head()



Unnamed: 0,user_id,user_shops,user_profile,user_group,user_gender,user_age,user_consumption_2,user_is_occupied,user_geography,user_intentions,user_brands,user_categories,user_id_raw
0,1,1,1,1,1,1,1,1,1,1,1,1,7
1,2,2,1,1,1,1,1,1,1,2,2,2,6
2,3,3,1,1,1,1,1,1,1,3,3,3,8
3,4,4,1,1,1,1,1,1,1,4,4,4,9
4,5,5,1,1,1,1,1,1,1,5,5,5,5


We will artificially add `datetime` and `created` timestamp columns to our user_features dataframe. This required by Feast to track the user-item features and their creation time and to determine which version to use when we query Feast.

In [20]:
from datetime import datetime

user_features["datetime"] = datetime.now()
user_features["datetime"] = user_features["datetime"].astype("datetime64[ns]")
user_features["created"] = datetime.now()
user_features["created"] = user_features["created"].astype("datetime64[ns]")
user_features.head()

Unnamed: 0,user_id,user_shops,user_profile,user_group,user_gender,user_age,user_consumption_2,user_is_occupied,user_geography,user_intentions,user_brands,user_categories,user_id_raw,datetime,created
0,1,1,1,1,1,1,1,1,1,1,1,1,7,2023-01-13 19:57:29.959151,2023-01-13 19:57:29.959949
1,2,2,1,1,1,1,1,1,1,2,2,2,6,2023-01-13 19:57:29.959151,2023-01-13 19:57:29.959949
2,3,3,1,1,1,1,1,1,1,3,3,3,8,2023-01-13 19:57:29.959151,2023-01-13 19:57:29.959949
3,4,4,1,1,1,1,1,1,1,4,4,4,9,2023-01-13 19:57:29.959151,2023-01-13 19:57:29.959949
4,5,5,1,1,1,1,1,1,1,5,5,5,5,2023-01-13 19:57:29.959151,2023-01-13 19:57:29.959949


In [35]:
# Write parquet file to feature_repo
user_features.to_parquet(
    os.path.join(feature_repo_path, "data", "user_features.parquet")
)

In [36]:
# Load pre-generated Item features file
item_features = Dataset(os.path.join(DATA_DIR, "item_features.parquet")).to_ddf().compute()
item_features.head()



Unnamed: 0,item_id,item_category,item_shop,item_brand,item_id_raw
0,1,1,1,1,7
1,2,2,2,2,8
2,3,3,3,3,6
3,4,4,4,4,5
4,5,5,5,5,9


In [38]:
# Append timestamps
item_features["datetime"] = datetime.now()
item_features["datetime"] = item_features["datetime"].astype("datetime64[ns]")
item_features["created"] = datetime.now()
item_features["created"] = item_features["created"].astype("datetime64[ns]")
item_features.head()

Unnamed: 0,item_id,item_category,item_shop,item_brand,item_id_raw,datetime,created
0,1,1,1,1,7,2023-01-13 20:17:15.088307,2023-01-13 20:17:15.089394
1,2,2,2,2,8,2023-01-13 20:17:15.088307,2023-01-13 20:17:15.089394
2,3,3,3,3,6,2023-01-13 20:17:15.088307,2023-01-13 20:17:15.089394
3,4,4,4,4,5,2023-01-13 20:17:15.088307,2023-01-13 20:17:15.089394
4,5,5,5,5,9,2023-01-13 20:17:15.088307,2023-01-13 20:17:15.089394


In [39]:
# Write parquet file to feature_repo
item_features.to_parquet(
    os.path.join(feature_repo_path, "data", "item_features.parquet")
)

### Create feature definitions 

Now we will create our user and item features definitions in the user_features.py and item_features.py files and save these files in the feature_repo.

In [57]:
with open(os.path.join(feature_repo_path, "user_features.py"), "w") as f:
    f.write(f"""
import datetime

from google.protobuf.duration_pb2 import Duration
from feast import Entity, Feature, FeatureView, ValueType
from feast.infra.offline_stores.file_source import FileSource

user_features = FileSource(
    path={os.path.join(feature_repo_path, "data", "user_features.parquet")},
    event_timestamp_column="datetime",
    created_timestamp_column="created",
)

user_raw = Entity(name="user_id_raw", value_type=ValueType.INT32, description="user id raw",)

user_features_view = FeatureView(
    name="user_features",
    entities=["user_id_raw"],
    ttl=Duration(seconds=86400 * 7),
    features=[
        Feature(name="user_shops", dtype=ValueType.INT32),
        Feature(name="user_profile", dtype=ValueType.INT32),
        Feature(name="user_group", dtype=ValueType.INT32),
        Feature(name="user_gender", dtype=ValueType.INT32),
        Feature(name="user_age", dtype=ValueType.INT32),
        Feature(name="user_consumption_2", dtype=ValueType.INT32),
        Feature(name="user_is_occupied", dtype=ValueType.INT32),
        Feature(name="user_geography", dtype=ValueType.INT32),
        Feature(name="user_intentions", dtype=ValueType.INT32),
        Feature(name="user_brands", dtype=ValueType.INT32),
        Feature(name="user_categories", dtype=ValueType.INT32),
        Feature(name="user_id", dtype=ValueType.INT32),
    ],
    online=True,
    input=user_features,
    tags=dict(),
)
""")

In [56]:
with open(os.path.join(feature_repo_path, "item_features.py"), "w") as f:
    f.write(f"""
import datetime

from google.protobuf.duration_pb2 import Duration
from feast import Entity, Feature, FeatureView, ValueType
from feast.infra.offline_stores.file_source import FileSource

item_features = FileSource(
    path={os.path.join(feature_repo_path, "data", "item_features.parquet")},
    event_timestamp_column="datetime",
    created_timestamp_column="created",
)

item = Entity(name="item_id", value_type=ValueType.INT32, description="item id",)

item_features_view = FeatureView(
    name="item_features",
    entities=["item_id"],
    ttl=Duration(seconds=86400 * 7),
    features=[
        Feature(name="item_category", dtype=ValueType.INT32),
        Feature(name="item_shop", dtype=ValueType.INT32),
        Feature(name="item_brand", dtype=ValueType.INT32),
        Feature(name="item_id_raw", dtype=ValueType.INT32),
    ],
    online=True,
    input=item_features,
    tags=dict(),
)
""")

In [55]:
with open(os.path.join(feature_repo_path, "feature_store.yaml"), "w") as f:
    f.write("""
project: merlin_feature_store
registry: data/registry.db
provider: local
online_store:
  type: redis
  redis_type: redis
  connection_string: "172.20.0.20:6379"
entity_key_serialization_version: 2
""")

Let's checkout our Feast feature repository structure.

In [58]:
import seedir as sd

sd.seedir(
    feature_repo_path,
    style="lines",
    itemlimit=10,
    depthlimit=3,
    exclude_folders=[".ipynb_checkpoints", "__pycache__"],
    sort=True,
)

feature_repo/
├─__init__.py
├─data/
│ ├─item_features.parquet
│ └─user_features.parquet
├─feature_store.yaml
├─item_features.py
├─test_workflow.py
└─user_features.py


## Redis ANN Index Setup

### Load Item Embeddings
We will load the pre-generated Item embeddings from file in preparation for loading into the Redis Server.

In [59]:
item_embeddings = Dataset(os.path.join(DATA_DIR, "item_embeddings.parquet")).to_ddf().compute()
item_embeddings.head()



Unnamed: 0,item_id,0,1,2,3,4,5,6,7,8,...,54,55,56,57,58,59,60,61,62,63
0,1,0.05304,-0.0728,0.029477,0.061942,-0.083255,-0.027927,0.040919,0.049997,-0.029252,...,0.049286,-0.087432,0.0097,-0.071424,0.024802,-0.012824,-0.045329,0.098826,0.028681,-0.000962
1,2,-0.022205,-0.073701,0.084334,-0.087279,0.050656,-0.088896,0.0246,0.115385,-0.007804,...,0.069289,0.089851,0.059622,0.00228,-0.102191,0.056542,0.06724,-0.019867,-0.04911,-0.09991
2,3,0.009324,0.004293,-0.021806,0.011273,0.020458,0.00037,0.035672,0.035399,-0.019392,...,-0.066249,-0.038092,0.028947,-0.004527,-0.007591,0.090114,-0.041121,0.030358,-0.016289,0.000246
3,4,0.028158,-0.01505,-0.010841,0.057088,-0.019317,0.015477,0.011777,-0.020076,-0.026838,...,-0.00481,-0.044477,-0.003194,0.012644,-0.043281,0.032553,-0.01137,0.024755,-0.060624,-0.010108
4,5,-0.041253,0.041424,0.004513,-0.033554,0.051555,-0.000238,0.004218,0.086332,-0.030636,...,-0.083374,0.008296,-0.003563,0.006302,0.002813,0.013661,0.040882,0.018407,-0.014042,-0.012307


In [132]:
import asyncio
import redis.asyncio as redis
from redis.commands.search.query import Query
from redis.commands.search.indexDefinition import IndexDefinition, IndexType
from redis.commands.search.field import VectorField

# Connect to the Redis client
host, port = os.environ["FEATURE_STORE_ADDRESS"].split(":")
redis_conn = redis.Redis(host=host, port=port)

In [133]:
# Define Redis ANN Index Params and Fields
INDEX_NAME = "candidate_index"

vector_field = VectorField(
    "item_embedding",
    "HNSW", {
        "TYPE": "FLOAT32",
        "DIM": 64,
        "DISTANCE_METRIC": "IP",
        "INITIAL_CAP": len(item_embeddings),
    }
)

# Create ANN Index
await redis_conn.ft(INDEX_NAME).create_index(
    fields = [vector_field],
    definition= IndexDefinition(prefix=["ITEM:"], index_type=IndexType.HASH)
)

b'OK'

In [134]:
# Function to write item embeddings to Redis
async def write_item_embeddings(embs, n: int, redis_conn: redis.Redis):
    semaphore = asyncio.Semaphore(n)
    async def write(row):
        async with semaphore:
            item_id = int(row.pop("item_id"))
            entry = {
                "item_id": item_id,
                "item_embeddings": np.array(row.values, dtype=np.float32).tobytes()
            }
            await redis_conn.hset(f"ITEM:{item_id}", mapping=entry)
    asyncio.gather(*[write(row[1]) for row in embs.iterrows()])

In [135]:
# Write embeddings to Redis ANN Index created above
await write_item_embeddings(item_embeddings, 100, redis_conn)

In [140]:
# Verify Index Construction
await redis_conn.ft(INDEX_NAME).info()

{'index_name': 'candidate_index',
 'index_options': [],
 'index_definition': [b'key_type',
  b'HASH',
  b'prefixes',
  [b'ITEM:'],
  b'default_score',
  b'1'],
 'attributes': [[b'identifier',
   b'item_embedding',
   b'attribute',
   b'item_embedding',
   b'type',
   b'VECTOR']],
 'num_docs': '670',
 'max_doc_id': '1340',
 'num_terms': '0',
 'num_records': '0',
 'inverted_sz_mb': '0',
 'vector_index_sz_mb': '0.27555465698242188',
 'total_inverted_index_blocks': '0',
 'offset_vectors_sz_mb': '0',
 'doc_table_size_mb': '0.04718017578125',
 'sortable_values_size_mb': '0',
 'key_table_size_mb': '0.022125244140625',
 'records_per_doc_avg': '0',
 'bytes_per_record_avg': '-nan',
 'offsets_per_term_avg': '-nan',
 'offset_bits_per_record_avg': '-nan',
 'hash_indexing_failures': '0',
 'total_indexing_time': '2.5049999999999999',
 'indexing': '0',
 'percent_indexed': '1',
 'number_of_uses': 5,
 'gc_stats': [b'bytes_collected',
  b'0',
  b'total_ms_run',
  b'1',
  b'total_cycles',
  b'1',
  b'aver

In [142]:
# Fetch an Item ID
item_id = (await redis_conn.keys())[0]

# Fetch a testing input vector
test_vector = await redis_conn.hget(item_id.decode("utf"), "item_embeddings")

# Create a Redis VSS Query
query = Query("*=>[KNN 10 @item_embeddings $vec_param AS vector_score]") \
    .no_content()\
    .dialect(2)

# Search for KNN
k_nearest_neighbors = await redis_conn.ft(INDEX_NAME).search(query, query_params={"vec_param": test_vector})

In [143]:
k_nearest_neighbors # LOL why getting 0 here??? - investigate later

Result{0 total, docs: []}

### Next Steps
In this notebook we created our Feast Feature Store and setup the Redis ANN Index. Next, we will deploy our trained models into [Triton Inference Server (TIS)](https://github.com/triton-inference-server/server) with Merlin Systems library.

For the next step, move on to the [`02-Deploying-Online-Multi-Stage-Recsys-with-Triton.ipynb`](./02-Deploying-Online-Multi-Stage-Redsys-with-Triton.ipynb) notebook to deploy our saved models as an ensemble to TIS and obtain prediction results for a given request.