# ML Feature Store Quickstart Tutorial - Getting Started with Feast using Redis

This tutorial provides a step-by-step **Feast for Redis quickstart** that walks you through an end-to-end example of using Feast with Redis as its online feature store for machine learning. It is based on the [Feast Quickstart tutorial](https://docs.feast.dev/getting-started/quickstart), but instead of using the default online store, it **uses the Redis online store** for delivering real-time predictions at scale. If you are not familiar with Feast or Redis, this tutorial is the fastest way to get started with both. For a high-level introduction to feature stores and Feast using Redis, see [this blog article](https://redis.com/blog/building-feature-stores-with-redis-introduction-to-feast-with-redis). More detailed information on Redis and Feast, as well as additional resources, is available at the end of this tutorial.

In this tutorial you will:

1.   Deploy a local feature store with a Parquet file offline store and Redis online store.
2.   Build a training dataset using the demo time series features from the Parquet files.
3.   Materialize (load) feature values from the offline store into the Redis online store.
4.   Read the latest features from the Redis online store for inference.



> **Feast in a nutshell:**



> Feast (**Fea**ture **st**ore) is an open source feature store and [is part of the Linux Foundation AI & Data Foundation](https://lfaidata.foundation/blog/2020/11/10/feast-joins-lf-ai-data-as-new-incubation-project/). It can serve feature data to models from a low-latency online store (for real-time serving) or an offline store (for model training or batch serving). It also provides a central registry so **machine learning engineers** and **data scientists** can discover the relevant features for ML use cases. Feast is a Python library with an optional CLI. You can install it with pip, as described in Step 1 of this tutorial.




> **Redis in a nutshell:**

> Redis is an open source (BSD licensed), **in-memory** data structure store, used as a database, cache, and message broker. [Redis](https://redis.com) provides data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes, and streams. Redis has built-in replication, Lua scripting, LRU eviction, transactions, and different levels of on-disk persistence, and provides high availability via Redis Sentinel and automatic partitioning with Redis Cluster.


## Demo Scenario and Tutorial Steps
In this tutorial, we use feature stores to generate training data and power online model inference for a **ride-sharing driver satisfaction** prediction model. In the demo data scenario:

- We have surveyed some drivers to determine how satisfied they are with their experience using a ride-sharing app.
- We want to generate predictions for driver satisfaction for the rest of the users so we can reach out to potentially dissatisfied users.

Tutorial Steps:
1.   Install Feast and Redis and run Redis-Server in the background
2.   Create a feature repository and configure Redis as the online store
3.   Register feature definitions and deploy your feature store
4.   Generate training data
5.   Load features into your Redis online store
6.   Fetch feature vectors for inference from Redis online store

## Step 1: Install Feast and Redis and run Redis-Server in the background


### **Step 1a**: Install Feast for Redis (and Pygments for pretty printing) using pip

In [None]:
%%sh
pip install feast[redis] -U -q
pip install Pygments -q
echo "Please restart your runtime now (Runtime -> Restart runtime). This ensures that the correct dependencies are loaded."

Please restart your runtime now (Runtime -> Restart runtime). This ensures that the correct dependencies are loaded.


**Reminder**: Please restart your runtime after installing Feast (Runtime -> Restart runtime). This ensures that the correct dependencies are loaded.


### **Step 1b**: Install Redis using pip

In [None]:
!pip install redis redis-server

Collecting redis
  Downloading redis-4.0.2-py3-none-any.whl (119 kB)
Collecting redis-server
  Downloading redis_server-6.0.9-202010301343-cp37

Note: Additional information on alternative ways for installing Redis can be found here: https://redis.io/download#installation. Additional configuration information can be found in the Redis Quick Start guide (https://redis.io/topics/quickstart).

### **Step 1c**: Start Redis Server

In [None]:
import subprocess
import redis_server

subprocess.Popen([redis_server.REDIS_SERVER_PATH]) 

<subprocess.Popen at 0x7fc5c7a9ba10>
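`subprocess.Popen` returns immediately, so the server may still be starting when the next cell runs. A minimal sketch of a readiness check, assuming Redis listens on its default port 6379 on `localhost`; the helper name `wait_for_port` is hypothetical and uses only the Python standard library:

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening yet
            with socket.create_connection((host, port), timeout=0.5):
                return True
        except OSError:
            time.sleep(0.1)  # brief pause before retrying
    return False
```

Calling `wait_for_port("localhost", 6379)` right after starting the server should return `True` once Redis accepts connections. The check only confirms that something is listening on the port, not that it is Redis.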

## Step 2: Create a feature repository and configure Redis as the online store

A feature repository is a directory that contains the configuration of the feature store and individual features. 

### **Step 2a**: Create a feature repository

The easiest way to create a new feature repository is to use the `feast init` command. This creates a scaffold with initial demo data.

In [None]:
!feast init feature_repo
%cd feature_repo

Feast is an open source project that collects anonymized error reporting and usage statistics. To opt out or learn more see https://docs.feast.dev/reference/usage

Creating a new Feast repository in /content/feature_repo.

/content/feature_repo



Let's take a look at the demo repo itself. It breaks down into:


*   `data/` contains raw demo parquet data
*   `example.py` contains demo feature definitions
*   `feature_store.yaml` contains a demo setup configuring where data sources are



In [None]:
!ls -R

.:
data  example.py  feature_store.yaml

./data:
driver_stats.parquet


### **Step 2b**: Configure Redis as the online store in the YAML configuration file
To configure Redis as the online store we need to set the `type` and `connection_string` values for `online_store`  in `feature_store.yaml` as follows:

In [None]:
%%writefile feature_store.yaml
project: feature_repo
registry: data/registry.db
provider: local
online_store:
    type: redis
    connection_string: localhost:6379

Overwriting feature_store.yaml


The `provider` defines where the raw data exists (for generating training data and feature values for serving); in this demo, that is the local machine. The `online_store` defines where feature values are materialized (loaded) for serving.

Note that the above configuration is different from the default YAML file provided for the tutorial that instead uses the default online store.

With these two `online_store` settings (`type: redis` and `connection_string: localhost:6379`) in the YAML file, Feast can read from and write to Redis as its online store. The Redis online store is part of the Feast core code, so Feast knows how to use Redis out of the box.
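To make the `connection_string` value more concrete, here is a minimal sketch that splits such a string into endpoints and options. It assumes the `host:port` form used in this tutorial, optionally followed by comma-separated `key=value` options as described in Feast's Redis online store documentation; check that format against your Feast version. The function `parse_connection_string` is hypothetical and is not Feast's own parser.

```python
from typing import Dict, List, Tuple


def parse_connection_string(conn: str) -> Tuple[List[Tuple[str, int]], Dict[str, str]]:
    """Split 'host:port[,host:port][,key=value]' into endpoints and options."""
    endpoints: List[Tuple[str, int]] = []
    options: Dict[str, str] = {}
    for part in conn.split(","):
        part = part.strip()
        if not part:
            continue
        if "=" in part:
            # key=value option; split once so values may contain '='
            key, value = part.split("=", 1)
            options[key.strip()] = value.strip()
        else:
            # host:port endpoint; the port is everything after the last ':'
            host, _, port = part.rpartition(":")
            endpoints.append((host, int(port)))
    return endpoints, options


endpoints, options = parse_connection_string("localhost:6379")
print(endpoints, options)
```

For the value used in this tutorial, the sketch yields a single endpoint (`localhost`, port 6379) and no options.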



### **Step 2c**: Inspect feature definitions
Let’s take a look at the demo feature definitions at `example.py`:

In [None]:
!pygmentize -f terminal16m example.py

# This is an example feature definition file

from google.protobuf.duration_pb2 import Duration

from feast import Entity, Feature, FeatureView, FileSource, ValueType

# Read data from parquet files. Parquet is convenient for local development mode. For
# production, you can use your favorite DWH, such as BigQuery. See Feast documentation
# for more info.
driver_hourly_stats = FileSource(
    path="/content/feature_repo/data/driver_stats.parquet",
    event_timestamp_column="event_timestamp",
    created_timestamp_column=

### **Step 2d**: Inspect the raw data

The raw feature data we have in this demo is stored in a local parquet file. The dataset captures hourly stats of a driver in a ride-sharing app.

In [None]:
import pandas as pd

pd.read_parquet("data/driver_stats.parquet")

| | event_timestamp | driver_id | conv_rate | acc_rate | avg_daily_trips | created |
|---|---|---|---|---|---|---|
| 0 | 2021-11-26 13:00:00+00:00 | 1005 | 0.313361 | 0.610319 | 321 | 2021-12-11 13:54:37.581 |
| 1 | 2021-11-26 14:00:00+00:00 | 1005 | 0.660652 | 0.601616 | 995 | 2021-12-11 13:54:37.581 |
| 2 | 2021-11-26 15:00:00+00:00 | 1005 | 0.165863 | 0.381302 | 808 | 2021-12-11 13:54:37.581 |
| 3 | 2021-11-26 16:00:00+00:00 | 1005 | 0.583375 | 0.237953 | 913 | 2021-12-11 13:54:37.581 |
| 4 | 2021-11-26 17:00:00+00:00 | 1005 | 0.720630 | 0.322882 | 224 | 2021-12-11 13:54:37.581 |
| ... | ... | ... | ... | ... | ... | ... |
| 1802 | 2021-12-11 11:00:00+00:00 | 1001 | 0.752774 | 0.603291 | 747 | 2021-12-11 13:54:37.581 |
| 1803 | 2021-12-11 12:00:00+00:00 | 1001 | 0.803639 | 0.674685 | 541 | 2021-12-11 13:54:37.581 |
| 1804 | 2021-04-12 07:00:00+00:00 | 1001 | 0.755037 | 0.422243 | 390 | 2021-12-11 13:54:37.581 |
| 1805 | 2021-12-04 01:00:00+00:00 | 1003 | 0.657862 | 0.652138 | 184 | 2021-12-11 13:54:37.581 |


## Step 3: Register feature definitions and deploy your feature store

Now we run `feast apply` to register the feature views and entities defined in `example.py`. The apply command scans Python files in the current directory for feature view and entity definitions, registers the objects, and deploys infrastructure. In this example, it reads `example.py` (shown above) and sets up the Redis online store. Note that we specified Redis as the online store in `feature_store.yaml` (in *Step 2b* above).



In [None]:
!feast apply

Registered entity driver_id
Registered feature view driver_hourly_stats
Deploying infrastructure for driver_hourly_stats


## Step 4: Generate training data

To train a model, we need features and labels. Often, this label data is stored separately (e.g. you have one table storing user survey results and another set of tables with feature values). 

The user can query that table of labels with timestamps and pass it into Feast as an *entity dataframe* for training data generation. In many cases, Feast will also intelligently join relevant tables to create the relevant feature vectors.
- Note that we include timestamps because we want the features for the same driver at various timestamps to be used in a model.

In [None]:
from datetime import datetime, timedelta
import pandas as pd

from feast import FeatureStore

# The entity dataframe is the dataframe we want to enrich with feature values
entity_df = pd.DataFrame.from_dict(
    {
        "driver_id": [1001, 1002, 1003],
        "label_driver_reported_satisfaction": [1, 5, 3], 
        "event_timestamp": [
            datetime.now() - timedelta(minutes=11),
            datetime.now() - timedelta(minutes=36),
            datetime.now() - timedelta(minutes=73),
        ],
    }
)

store = FeatureStore(repo_path=".")

training_df = store.get_historical_features(
    entity_df=entity_df,
    features=[
        "driver_hourly_stats:conv_rate",
        "driver_hourly_stats:acc_rate",
        "driver_hourly_stats:avg_daily_trips",
    ],
).to_df()

print("----- Feature schema -----\n")
print(training_df.info())

print()
print("----- Example features -----\n")
print(training_df.head())

----- Feature schema -----

<class 'pandas.core.frame.DataFrame'>
Int64Index: 3 entries, 0 to 2
Data columns (total 6 columns):
 #   Column                              Non-Null Count  Dtype              
---  ------                              --------------  -----              
 0   event_timestamp                     3 non-null      datetime64[ns, UTC]
 1   driver_id                           3 non-null      int64              
 2   label_driver_reported_satisfaction  3 non-null      int64              
 3   conv_rate                           3 non-null      float32            
 4   acc_rate                            3 non-null      float32            
 5   avg_daily_trips                     3 non-null      int32              
dtypes: datetime64[ns, UTC](1), float32(2), int32(1), int64(2)
memory usage: 132.0 bytes
None

----- Example features -----

                   event_timestamp  driver_id  ...  acc_rate  avg_daily_trips
0 2021-12-11 12:42:22.122271+00:00       1003  ...  0
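The reason the entity dataframe carries timestamps is that each label row should be joined with the feature values that were known at that moment, not with later ones. The following is a minimal sketch of such a point-in-time lookup for a single driver, using only the standard library. The feature rows and the helper `point_in_time_value` are hypothetical illustrations, not Feast's implementation, which may apply additional rules.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical hourly feature rows for one driver: (event_timestamp, conv_rate),
# sorted by event_timestamp.
feature_rows = [
    (datetime(2021, 12, 11, 10), 0.31),
    (datetime(2021, 12, 11, 11), 0.66),
    (datetime(2021, 12, 11, 12), 0.17),
]


def point_in_time_value(rows, as_of):
    """Return the latest value with event_timestamp <= as_of, or None if none exists."""
    timestamps = [ts for ts, _ in rows]
    # Index of the last row whose timestamp is not after as_of
    idx = bisect_right(timestamps, as_of) - 1
    if idx < 0:
        return None
    return rows[idx][1]


# A label recorded at 11:30 is joined with the 11:00 feature row, not the 12:00 one.
print(point_in_time_value(feature_rows, datetime(2021, 12, 11, 11, 30)))
```

A label timestamp that precedes every feature row gets no value, which is why rows in a training dataframe can contain nulls.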

## Step 5: Load features into your Redis online store

We will now load (materialize) feature data into your Redis online store so we can serve the latest features to models for online prediction. The `materialize` command materializes features over a specific historical time range into the online store. It queries the batch sources for all feature views over the provided time range and loads the latest feature values into the configured online store. The `materialize-incremental` command ingests only new data that has arrived in the offline store since the last materialize call.

In [None]:
from datetime import datetime
!feast materialize-incremental {datetime.now().isoformat()}

Materializing 1 feature views to 2021-12-11 13:55:34+00:00 into the redis online store.

driver_hourly_stats from 2021-12-10 13:55:35+00:00 to 2021-12-11 13:55:34+00:00:
100%|███████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 2060.88it/s]
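Conceptually, materialization keeps only the newest row per entity within the time range and writes it to a key-value store. The sketch below illustrates that idea with a plain dict standing in for Redis. The sample rows, the `materialize` helper, and the choice of a half-open `(start, end]` range are assumptions made for illustration, not Feast's implementation.

```python
from datetime import datetime

# Hypothetical offline rows: (driver_id, event_timestamp, conv_rate).
offline_rows = [
    (1001, datetime(2021, 12, 11, 11), 0.75),
    (1001, datetime(2021, 12, 11, 12), 0.80),
    (1003, datetime(2021, 12, 4, 1), 0.66),
]


def materialize(rows, start, end, online_store):
    """Write the newest row per driver within (start, end] into online_store."""
    for driver_id, ts, conv_rate in rows:
        if not (start < ts <= end):
            continue  # outside the requested time range
        current = online_store.get(driver_id)
        if current is None or ts > current["event_timestamp"]:
            online_store[driver_id] = {"event_timestamp": ts, "conv_rate": conv_rate}
    return online_store


online_store = {}  # a plain dict standing in for Redis
materialize(
    offline_rows,
    datetime(2021, 12, 10, 13, 55),
    datetime(2021, 12, 11, 13, 55),
    online_store,
)
print(online_store)
```

Here driver 1001 ends up with its 12:00 row, while driver 1003 is skipped because its only row falls outside the range. An incremental run corresponds to calling the helper again with `start` set to the previous `end`.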


## Step 6: Fetch feature vectors for inference
At inference time, we need to quickly read the latest feature values for different drivers (which otherwise might have existed only in batch sources) from the Redis online feature store using `get_online_features()`. These feature vectors can then be fed to the model.

In [None]:
from pprint import pprint
from feast import FeatureStore

store = FeatureStore(repo_path=".")

feature_vector = store.get_online_features(
    features=[
        "driver_hourly_stats:conv_rate",
        "driver_hourly_stats:acc_rate",
        "driver_hourly_stats:avg_daily_trips",
    ],
    entity_rows=[
        {"driver_id": 1004},
        {"driver_id": 1005},
    ],
).to_dict()

pprint(feature_vector)

{'acc_rate': [0.36325469613075256, 0.6221743822097778],
 'avg_daily_trips': [691, 773],
 'conv_rate': [0.6506150364875793, 0.2967696785926819],
 'driver_id': [1004, 1005]}
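The dictionary above is column-oriented: each key maps to a list with one entry per requested driver. Many models expect one feature list per row instead, so a small reshaping step is often needed. A minimal sketch, reusing the output shown above; the feature order in `FEATURE_ORDER` and the helper `to_model_rows` are hypothetical and would need to match whatever order your model was trained with:

```python
# Output of get_online_features(...).to_dict() as shown above: one list per column.
feature_vector = {
    "acc_rate": [0.36325469613075256, 0.6221743822097778],
    "avg_daily_trips": [691, 773],
    "conv_rate": [0.6506150364875793, 0.2967696785926819],
    "driver_id": [1004, 1005],
}

# Hypothetical model input order
FEATURE_ORDER = ["conv_rate", "acc_rate", "avg_daily_trips"]


def to_model_rows(columns, feature_order):
    """Turn a column-oriented dict into one feature list per entity row."""
    n_rows = len(columns[feature_order[0]])
    return [[columns[name][i] for name in feature_order] for i in range(n_rows)]


rows = to_model_rows(feature_vector, FEATURE_ORDER)
for driver_id, row in zip(feature_vector["driver_id"], rows):
    print(driver_id, row)
```

Each element of `rows` can then be passed to a model's prediction function alongside the corresponding `driver_id`.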


# Tutorial recap:

In this tutorial you’ve deployed a local feature store with a Parquet file offline store and Redis online store. You then built a training dataset using time series features from Parquet files. Next, you materialized feature values from the offline store into the Redis online store. Finally, you read the latest features from the Redis online store for inference. With Redis as the online store, you can read the latest features for real-time ML use cases with low latency and high throughput at scale.


# Next steps

- Read the Feast [Concepts](https://docs.feast.dev/getting-started/concepts) page to understand the Feast data model, and read the Feast [Architecture](https://docs.feast.dev/getting-started/architecture-and-components) page.
- Read the full [configuration](https://rtd.feast.dev/en/master/#module-feast.infra.online_stores.redis) guide for Feast with Redis, and the [data model](https://github.com/feast-dev/feast/blob/master/docs/specs/online_store_format.md) used to store feature values in Redis.
- Case studies - learn from your peers: Learn how companies are using feature stores with Redis as the online store ([Wix](https://youtu.be/E8839ENL-WY), [Swiggy](https://bytes.swiggy.com/enabling-data-science-at-scale-at-swiggy-the-dsp-story-208c2d85faf9), [Comcast](https://cdn.oreillystatic.com/en/assets/1/event/300/Automating%20ML%20model%20training%20and%20deployments%20via%20metadata-driven%20data%2C%20infrastructure%2C%20feature%20engineering%2C%20and%20model%20management%20Presentation.pdf), [Zomato](https://www.zomato.com/blog/elements-of-scalable-machine-learning), [AT&T](https://youtu.be/AXQt_oW9JEc), [DoorDash](https://doordash.engineering/2020/11/19/building-a-gigascale-ml-feature-store-with-redis/), [iFood](https://databricks.com/session_na20/building-a-real-time-feature-store-at-ifood)), and specifically how they are using Feast with Redis for their online store ([Gojek](https://youtu.be/DaNv-Wf1MBA?t=836), [Udaan](https://hasgeek.com/fifthelephant/mlops-conference/schedule/managed-feature-store-improving-data-reusability-providing-a-means-for-low-latency-real-time-prediction-at-udaan-HsZnfC4VUNdWUyJXXwfp5m), [Robinhood](https://www.applyconf.com/agenda/how-robinhood-built-a-feature-store-using-feast/)).
- Read about [Azure Managed Feature Store with Feast and Redis](https://github.com/Azure/feast-azure) and follow the [Getting started with Feast on Azure tutorial](https://github.com/Azure/feast-azure/tree/main/provider/tutorial) as well as other Feast tutorials.
- You can also look for more info on Feast or Redis in the general product introduction pages on [Feast](https://docs.feast.dev/) and [Redis](https://redis.io/topics/introduction) respectively.
- Join other Feast users and contributors in [Slack](https://slack.feast.dev) and become part of the community! 