# Combined VSS + Feature Store deployment

In addition to removing the Feast SDK, retrieving item features through the VSS (vector similarity search) index significantly improved the throughput of the recommender system.

This notebook walks through how to set up Redis with just a VSS index for retrieving item features.

## Steps
1) [**Feature Store Setup**](#Feature-Store-Setup)
2) [**Redis ANN Index Setup**](#Redis-ANN-Index-Setup)

### Import required libraries and functions

*These notebooks are developed and tested using `merlin-tensorflow:22.11` container on [NVIDIA's docker registry](https://catalog.ngc.nvidia.com/containers?filters=&orderBy=dateModifiedDESC&query=merlin).*

In [1]:
import warnings
warnings.filterwarnings("ignore")

import os
import merlin.models.tf as mm
import nvtabular as nvt
import numpy as np
import tensorflow as tf

from merlin.datasets.ecommerce import transform_aliccp
from merlin.schema.tags import Tags
from merlin.io.dataset import Dataset
from nvtabular.ops import *

# for running this example on CPU, comment out the line below
# os.environ["TF_GPU_ALLOCATOR"] = "cuda_malloc_async"

2023-02-08 22:03:48.725134: I tensorflow/core/util/util.cc:169] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-02-08 22:03:52.259741: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:991] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-02-08 22:03:52.260940: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:991] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-02-08 22:03:52.261704: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:991] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-02-08 22:03:52.492599: I tensorflow/core/

First, we define our input path and feature repo path.

In [3]:
# Define output path for data
DATA_DIR = "/model-data/aliccp"
BASE_DIR = "/workdir"

Next, we need to load the previously trained assets. If you have your own, great; just make sure they end up in the same folder structure as the ones we will pull from the publicly hosted S3 bucket below.
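Before going further, it can help to confirm that the assets are where the later cells expect them. The sketch below is only an illustration: `missing_assets` and `EXPECTED_ASSETS` are hypothetical names, and the file list is simply the set of files this notebook reads from `DATA_DIR`.

```python
from pathlib import Path

# File names taken from the cells in this notebook that read from DATA_DIR.
EXPECTED_ASSETS = (
    "user_features.parquet",
    "item_features.parquet",
    "item_embeddings.parquet",
)


def missing_assets(data_dir, expected=EXPECTED_ASSETS):
    """Return the expected asset names that do not exist under data_dir."""
    root = Path(data_dir)
    # exists() is true for both single parquet files and parquet directories.
    return [name for name in expected if not (root / name).exists()]
```

Calling `missing_assets(DATA_DIR)` should return an empty list when all three assets are in place.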

## Feature Store Setup

We need to create a Feast feature repository. [Feast](https://feast.dev/) is an end-to-end open source feature store for machine learning. Feast (Feature Store) is a customizable operational data system that re-uses existing infrastructure to manage and serve machine learning features to real-time models.

Our feature repo will live at the defined path below:

### Prepare User and Item features

In [4]:
from merlin.models.utils.dataset import unique_rows_by_features

# Load pre-generated User features file
user_features = Dataset(os.path.join(DATA_DIR, "user_features.parquet")).to_ddf().compute()
user_features.head()

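| | user_id | user_shops | user_profile | user_group | user_gender | user_age | user_consumption_2 | user_is_occupied | user_geography | user_intentions | user_brands | user_categories | user_id_raw |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 7 |
| 1 | 2 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 8 |
| 2 | 3 | 3 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 3 | 3 | 3 | 6 |
| 3 | 4 | 4 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 4 | 4 | 4 | 9 |
| 4 | 5 | 5 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 5 | 5 | 5 | 5 |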


We will artificially add `datetime` and `created` timestamp columns to our user_features dataframe. This is required by Feast to track the user-item features and their creation time, and to determine which version to use when we query Feast.

In [5]:
# Load pre-generated Item features file
item_features = Dataset(os.path.join(DATA_DIR, "item_features.parquet")).to_ddf().compute()
item_features.head()

| | item_id | item_category | item_shop | item_brand | item_id_raw |
|---|---|---|---|---|---|
| 0 | 1 | 1 | 1 | 1 | 7 |
| 1 | 2 | 2 | 2 | 2 | 6 |
| 2 | 3 | 3 | 3 | 3 | 8 |
| 3 | 4 | 4 | 4 | 4 | 9 |
| 4 | 5 | 5 | 5 | 5 | 5 |


In [6]:
# Write parquet file to feature_repo
user_features.to_parquet(
    os.path.join(BASE_DIR, "feature_repo/data", "user_features.parquet")
)
item_features.to_parquet(
    os.path.join(BASE_DIR, "feature_repo/data", "item_features.parquet")
)

In [4]:
import pandas as pd
from redis import client

# FEATURE_STORE_ADDRESS is expected in "host:port" form
host, port = os.environ.get("FEATURE_STORE_ADDRESS", "localhost:6379").split(":")

redis_client = client.Redis(host=host, port=port, decode_responses=True)
data_path = os.path.join(BASE_DIR, "feature_repo/data")

def prepare_feature_store(data_path):
    user_dataset = pd.read_parquet(f"{data_path}/user_features.parquet")
    item_dataset = pd.read_parquet(f"{data_path}/item_features.parquet")

    load_dataframe("user", "user_id_raw", user_dataset)
    load_dataframe("item", "item_id_raw", item_dataset)
    
def load_dataframe(feature_name, key_name, df):
    records = df.to_dict(orient="records")
    pipe = redis_client.pipeline()
    for record in records:
        key = ":".join((feature_name, str(record[key_name])))
        pipe.hset(key, mapping=record)
    pipe.execute()
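
The cell above relies on two conventions: the feature store address is a single `host:port` string, and every record is stored under a `<feature_name>:<raw id>` key. The sketch below isolates both so they can be checked without a Redis server. `parse_address` and `make_key` are hypothetical helper names, and the default port of 6379 is an assumption taken from the fallback address used above.

```python
def parse_address(address, default_port=6379):
    """Split a 'host:port' string into (host, port as int).

    If no port is given, fall back to default_port (an assumption).
    """
    host, sep, port = address.rpartition(":")
    if not sep:
        return address, default_port
    return host, int(port)


def make_key(feature_name, raw_id):
    """Build the Redis key used for one record, e.g. 'user:8'."""
    return ":".join((feature_name, str(raw_id)))
```

For example, `parse_address("localhost:6379")` yields `("localhost", 6379)`, and `make_key("user", 8)` yields the `"user:8"` key that is read back later in this notebook.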
    

In [21]:
%%time
# this should take about 2 minutes
prepare_feature_store(data_path)

CPU times: user 41.9 ms, sys: 9.48 ms, total: 51.4 ms
Wall time: 64.1 ms


In [22]:
pd.DataFrame([redis_client.hgetall("user:8"), redis_client.hgetall("user:9")]).astype(int).info()


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 13 columns):
 #   Column              Non-Null Count  Dtype
---  ------              --------------  -----
 0   user_id             2 non-null      int64
 1   user_shops          2 non-null      int64
 2   user_profile        2 non-null      int64
 3   user_group          2 non-null      int64
 4   user_gender         2 non-null      int64
 5   user_age            2 non-null      int64
 6   user_consumption_2  2 non-null      int64
 7   user_is_occupied    2 non-null      int64
 8   user_geography      2 non-null      int64
 9   user_intentions     2 non-null      int64
 10  user_brands         2 non-null      int64
 11  user_categories     2 non-null      int64
 12  user_id_raw         2 non-null      int64
dtypes: int64(13)
memory usage: 336.0 bytes
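
Redis hashes store every value as a string, which is why the cell above casts the result with `.astype(int)`. For a single record the same conversion can be done without pandas. This is a minimal sketch: `decode_record` is a hypothetical helper, and it assumes every field in the hash holds an integer, as the user and item feature columns above do.

```python
def decode_record(raw):
    """Convert a hash returned by hgetall into a dict of str -> int.

    Accepts both str and bytes, so it works whether or not the client
    was created with decode_responses=True.
    """
    record = {}
    for field, value in raw.items():
        if isinstance(field, bytes):
            field = field.decode()
        if isinstance(value, bytes):
            value = value.decode()
        record[field] = int(value)
    return record
```

It would not apply to hashes that also carry a binary field such as an embedding; those fields need to be handled separately.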


In [23]:
%%timeit
redis_client.hgetall("user:8")
# Fast feature retrieval from Redis!

323 µs ± 9.07 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


### Explore Feature Repo Structure

## Redis ANN Index Setup

### Load Item Embeddings
We will load the pre-generated Item embeddings from file in preparation for loading into the Redis Server.

In [8]:
item_embeddings = Dataset(os.path.join(DATA_DIR, "item_embeddings.parquet")).to_ddf().compute()
item_embeddings.head()

| | item_id | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | ... | 54 | 55 | 56 | 57 | 58 | 59 | 60 | 61 | 62 | 63 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | -0.034885 | -0.000131 | 0.018455 | 0.03743 | 0.026332 | 0.012729 | 0.00676 | 0.069112 | 0.044133 | ... | -0.027153 | -0.02995 | -0.02007 | -0.067773 | 0.00242 | -0.001353 | -0.055582 | 0.042481 | 0.013875 | 0.021228 |
| 1 | 2 | 0.021357 | -0.026375 | 0.06909 | -0.011445 | 0.025277 | -0.010337 | 0.008437 | 0.042574 | 0.060663 | ... | -0.03727 | -0.039209 | 0.013558 | -0.006484 | -0.029601 | 0.073999 | 0.009857 | -0.022534 | -0.00944 | -0.025069 |
| 2 | 3 | -0.018197 | 0.017502 | 0.002263 | 0.008534 | 0.015912 | 0.00636 | -0.00166 | 0.007613 | 0.054932 | ... | -0.045789 | 0.033707 | -0.025606 | -0.020231 | 0.068983 | 0.030158 | -0.054312 | -0.006741 | 0.026637 | -0.040934 |
| 3 | 4 | -0.018756 | -0.057435 | 0.027142 | 0.069214 | -0.014137 | 0.063484 | 0.049648 | -0.000459 | 0.04144 | ... | -0.050948 | -0.007804 | 0.001069 | -0.059237 | -0.018273 | -0.005572 | -0.017192 | 0.033178 | 0.05067 | 0.040354 |
| 4 | 5 | 0.044985 | 0.015847 | -0.041081 | -0.00662 | -0.003196 | -0.04521 | -0.031615 | -0.093638 | 0.007464 | ... | -0.014779 | 0.057923 | -0.015743 | -0.048929 | 0.000438 | -0.043618 | -0.137103 | -0.01958 | 0.025585 | 0.028937 |


In [5]:
import asyncio
import redis.asyncio as redis
from redis.commands.search.query import Query
from redis.commands.search.indexDefinition import IndexDefinition, IndexType
from redis.commands.search.field import VectorField

# Connect to the Redis client
redis_conn = redis.Redis(host=host, port=port)

In [6]:
# Define Redis ANN Index Params and Fields
INDEX_NAME = "candidate_index_2"
VECTOR_FIELD_NAME = "item_embedding"

# Function to write item embeddings to Redis
async def write_item_embeddings(embs, n: int, redis_conn: redis.Redis):
    semaphore = asyncio.Semaphore(n)
    async def write(row):
        async with semaphore:
            item_id = int(row.pop("item_id"))
            entry = {
                VECTOR_FIELD_NAME: np.array(row.values, dtype=np.float32).tobytes()
            }
            await redis_conn.hset(f"item:{item_id}", mapping=entry)
    # Await the gather so the function returns only after every write has finished
    await asyncio.gather(*[write(row[1]) for row in embs.iterrows()])
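
The writer above bounds the number of in-flight Redis calls with an `asyncio.Semaphore`. The sketch below shows the same pattern in isolation, using only the standard library. `run_bounded`, `demo`, and `fake_write` are hypothetical names, and `fake_write` is a stand-in for the real `hset` call so the example runs without a server.

```python
import asyncio


async def run_bounded(items, worker, limit):
    """Apply the async worker to every item, with at most `limit` running at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(item):
        async with semaphore:
            return await worker(item)

    # Awaiting gather makes the caller wait until every task has completed.
    return await asyncio.gather(*(guarded(item) for item in items))


async def demo():
    active = 0
    peak = 0

    async def fake_write(item):
        # Stand-in for a Redis write; tracks how many calls overlap.
        nonlocal active, peak
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0)  # yield control, as a network call would
        active -= 1
        return item

    results = await run_bounded(range(20), fake_write, limit=5)
    return results, peak
```

In a notebook you would run `await demo()`; in a plain script, `asyncio.run(demo())`. The observed peak concurrency should never exceed the limit passed to `run_bounded`.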

In [9]:
# Write embeddings to Redis; the ANN index over these keys is created in the next cell
await write_item_embeddings(item_embeddings, 100, redis_conn)

In [10]:


vector_field = VectorField(
    VECTOR_FIELD_NAME,
    "HNSW", {
        "TYPE": "FLOAT32",
        "DIM": 64,
        "DISTANCE_METRIC": "IP",
        "INITIAL_CAP": len(item_embeddings),
    }
)

# Create ANN Index
await redis_conn.ft(INDEX_NAME).create_index(
    fields = [vector_field],
    definition= IndexDefinition(prefix=["item:"], index_type=IndexType.HASH)
)

b'OK'
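
Once the index exists, candidates can be retrieved with a KNN query against the vector field. The sketch below only builds the query string, so it runs without a server. `knn_query_string` is a hypothetical helper, and the parameter name `vec`, the alias `score`, and the `user_embedding` variable in the commented usage are all assumptions made for illustration.

```python
def knn_query_string(vector_field, k, param_name="vec", score_alias="score"):
    """Build a RediSearch KNN clause over all documents in the index."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return f"*=>[KNN {k} @{vector_field} ${param_name} AS {score_alias}]"


# Hypothetical usage against the index created above. It needs a running
# Redis with the search module and a query vector `user_embedding`:
#
# query = (
#     Query(knn_query_string(VECTOR_FIELD_NAME, 10))
#     .sort_by("score")
#     .return_fields("score")
#     .dialect(2)
# )
# results = await redis_conn.ft(INDEX_NAME).search(
#     query,
#     query_params={"vec": np.asarray(user_embedding, dtype=np.float32).tobytes()},
# )
```

The query vector has to be serialized the same way as the stored embeddings: 64 `float32` values, matching the `DIM` and `TYPE` of the index.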

In [11]:
await redis_conn.hgetall("item:6")

{b'item_brand': b'2',
 b'item_embedding': b';\x1f7=+\x08\x90<b2\xb5;\x8f\xb3\xc8;\xef\xb2\x93;bh\xe8\xbb\xfe\x91;;\x0c\xf0<\xbc\x82\x96L=\xce\xec\x01=\xad\xd3(<\xf1eC\xbd}\x90\xf0<\xceZ\x82\xba\xa1\x15\x98\xbb\xf9m\xad\xbc\x05\x8b.\xbc|c\xa39\x13\xd4\xbc\xbb\xfd\x18:\xbc2C}\xbcQ\xdc\xd5<\xf5=\xdd<\xd3R\x9f<\xb9#\xf1\xbc\x1d?\xd4\xbc\x8a"\x96\xbc\xccy\x98\xbd\xae\xcfx\xbc\xd3<\x93\xbd\xac\xc7\xa2<\xee\x0b\x00\xbbT6\xef:\x1b\xb7\x12=\x0e\xf2\xc9\xbak0\xb7<\x05\xc7#=\xbb[\x97\xbcd\x91O;\xeb:\x98<\x05\x08\x0f\xbdx95=\x0f\xbe\x05\xbcT\xf1\xad;b\xf7\xa2\xbcNW\xca\xbcO\xe6\xcf\xbbu9m<T\x92\xdd<\x9e$\x8a\xbcwj.\xbc\xb4%\xb5\xbc\n\t\x8b<\xc8e\xd6\xb9N\xbe\x15\xbdX/\x8d<0\xfc\xda\xbck\xbe!\xbc\xe5_\x06;\r\xdf\x16=\xf1\x18\xa3\xbc\x94\xe4\':\xed?\x99<7\xc5\xe7<',
 b'item_id_raw': b'6',
 b'item_shop': b'2',
 b'item_category': b'2',
 b'item_id': b'2'}
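
The `item_embedding` field above is the raw `float32` buffer written by `write_item_embeddings`. The sketch below shows how such a buffer maps back to numbers, using only the standard library. `floats_to_bytes` and `bytes_to_floats` are hypothetical helpers, and they assume little-endian byte order, which is what `np.float32` produces on common x86 and ARM machines.

```python
import struct


def floats_to_bytes(values):
    """Pack numbers as little-endian float32, 4 bytes per value."""
    return struct.pack(f"<{len(values)}f", *values)


def bytes_to_floats(blob):
    """Unpack a little-endian float32 buffer into a list of Python floats."""
    if len(blob) % 4 != 0:
        raise ValueError("buffer length must be a multiple of 4 bytes")
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))
```

With NumPy available, `np.frombuffer(blob, dtype=np.float32)` does the same decoding in one call. A 64-dimensional embedding therefore occupies 256 bytes in the hash.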

### Next Steps
In this notebook we created our combined Feature Store and set up the Redis ANN Index. Next, we will deploy our trained models into [Triton Inference Server (TIS)](https://github.com/triton-inference-server/server).

For the next step, move on to the [`02-Deploying-Online-Multi-Stage-Recsys-with-Triton.ipynb`](./02-Deploying-Online-Multi-Stage-Recsys-with-Triton.ipynb) notebook to deploy our saved models as an ensemble to TIS and obtain prediction results for a given request.