## Running Feast Online Feature Store in a Kubernetes Cluster
This notebook accompanies this [demo](https://github.com/tedhtchang/populate_feast_online_store), which populates a Feast online store and runs the Feast online serving REST API server in a Kubernetes cluster.

Note: This notebook can be run inside a Kubernetes cluster or locally.

## Clone the feature repository

In [None]:
# The driver_rank_repo on GitHub acts as the source of truth for the feature repository
!git clone https://github.com/tedhtchang/driver_rank_repo

In [None]:
!pip install "feast[redis,aws]"

The Feast SDK/CLI can run as a standalone feature store without any additional services. See the [quickstart](https://docs.feast.dev/quickstart) for details.

In [None]:
!feast --help

## Inside the feature repository

The feature repository has already been initialized. If you change the repo config file (feature_store.yaml) or the feature definitions (driver_repo.py), re-run `feast apply`.
```
.
├── data
│   ├── driver_stats.parquet
│   └── registry.db
├── driver_repo.py
└── feature_store.yaml

1 directory, 4 files
```

In [None]:
# If you update feature_store.yaml or the feature definitions in driver_repo.py, re-run feast apply.
!feast -c ./driver_rank_repo apply

### feature_store.yaml
```
project: driver_ranking
registry: data/registry.db
provider: local
online_store:
  type: redis
  connection_string: "redis-service.default.svc.cluster.local:6379"
```
If you are running the notebook from outside the cluster, replace the host in the connection string with the external IP address of the Redis service, which you can determine using this [method]().
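
One way to avoid editing the connection string by hand when switching between in-cluster and local runs is to read it from the environment. The following is a minimal sketch, not part of Feast: the helper `redis_connection_string` and the variable names `REDIS_HOST` and `REDIS_PORT` are hypothetical choices for this example, and the defaults are the in-cluster values from the feature_store.yaml above.

```python
import os

# In-cluster address and port taken from feature_store.yaml above.
DEFAULT_HOST = "redis-service.default.svc.cluster.local"
DEFAULT_PORT = "6379"


def redis_connection_string(env=None):
    """Return "host:port", preferring REDIS_HOST / REDIS_PORT when they are set.

    REDIS_HOST and REDIS_PORT are hypothetical names used only in this sketch.
    """
    env = os.environ if env is None else env
    host = env.get("REDIS_HOST", DEFAULT_HOST)
    port = env.get("REDIS_PORT", DEFAULT_PORT)
    return f"{host}:{port}"


# With neither variable set, the in-cluster address is returned.
print(redis_connection_string({}))
```

The returned string could then be passed as the `connection_string` value wherever the notebook configures the Redis online store.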

### driver_repo.py
```
from datetime import timedelta
from feast import FileSource, Entity, Feature, FeatureView, ValueType
driver = Entity(name="driver_id", join_key="driver_id", value_type=ValueType.INT64,)

driver_stats_source = FileSource(
    path="driver_rank_repo/data/driver_stats.parquet",
    event_timestamp_column="event_timestamp",
    created_timestamp_column="created",
)

driver_stats_fv = FeatureView(
    name="driver_hourly_stats",
    entities=["driver_id"],
    ttl=timedelta(weeks=52),
    features=[
        Feature(name="conv_rate", dtype=ValueType.FLOAT),
        Feature(name="acc_rate", dtype=ValueType.FLOAT),
        Feature(name="avg_daily_trips", dtype=ValueType.INT64),
    ],
    batch_source=driver_stats_source,
    tags={"team": "driver_performance"},
)
```

Feast can be configured to read the registry.db and the offline data from S3. Please see [here]() for details.

## Move the features from Offline to Online store (materialize)

In [None]:
!feast -c ./driver_rank_repo materialize 2021-08-03T03:03:07 $(date -u +"%Y-%m-%dT%H:%M:%S")

That's it. You have moved the offline features into the online store!
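
The materialize command above takes a start and an end timestamp, with the end computed by the shell's `date`. If you prefer to build the same command in Python, a sketch along these lines may help. The helper name `materialize_command` is hypothetical, and the end timestamp in the example is an arbitrary value chosen for illustration.

```python
import shlex
from datetime import datetime, timezone

# Same timestamp format the shell `date -u` call above produces.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S"


def materialize_command(start, end=None, repo_path="./driver_rank_repo"):
    """Return the `feast materialize` argument list for a time range.

    If no end is given, the current UTC time is used, as in the shell version.
    """
    end = end or datetime.now(timezone.utc)
    return [
        "feast", "-c", repo_path, "materialize",
        start.strftime(TIMESTAMP_FORMAT),
        end.strftime(TIMESTAMP_FORMAT),
    ]


cmd = materialize_command(
    start=datetime(2021, 8, 3, 3, 3, 7),
    end=datetime(2021, 8, 12, 0, 0, 0),  # arbitrary end time for illustration
)
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```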

## Initializing Feast SDK
The Feast SDK can be initialized as shown below. You need to provide a connection_string (the Redis server's external IP or hostname; see [here]() for details) that is reachable from wherever the notebook is running.

In [None]:
import os
from pprint import pprint
from feast import FeatureStore, RepoConfig
from feast.infra.online_stores.redis import RedisOnlineStoreConfig

store = FeatureStore(
    config=RepoConfig(
        registry=os.path.join(os.getcwd(), "./driver_rank_repo/data/registry.db"),
        project="driver_ranking",
        provider="local",
        online_store=RedisOnlineStoreConfig(
            connection_string="redis-service.default.svc.cluster.local:6379"
        ),
    )
)

## Get Historical Features for batch training

In [None]:
from datetime import datetime

import pandas as pd

from feast import FeatureStore

entity_df = pd.DataFrame.from_dict(
    {
        "driver_id": [1001, 1002, 1003, 1004, 1005],
        "event_timestamp": [
            datetime(2021, 8, 12, 10, 59, 42),
            datetime(2021, 8, 12, 8, 12, 10),
            datetime(2021, 8, 12, 16, 40, 26),
            datetime(2021, 8, 12, 15, 1, 12),
            datetime(2021, 8, 12, 15, 1, 12),
        ],
    }
)

training_df = store.get_historical_features(
    entity_df=entity_df,
    features=[
        "driver_hourly_stats:conv_rate",
        "driver_hourly_stats:acc_rate",
        "driver_hourly_stats:avg_daily_trips",
    ],
).to_df()

print(training_df.head())

## Online features from the Online Store using SDK

In [None]:

feature_vector = store.get_online_features(
    features=[
        "driver_hourly_stats:conv_rate",
        "driver_hourly_stats:acc_rate",
        "driver_hourly_stats:avg_daily_trips",
    ],
    entity_rows=[{"driver_id": 1002}],
).to_dict()

pprint(feature_vector)
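
The dictionary returned by `.to_dict()` maps each name to a list with one value per entity row. When several drivers are requested, it can be easier to read one record per driver. The helper below is a small sketch for that; `to_rows` is a hypothetical name, and the sample values are made up to show the shape only (the exact keys depend on your Feast version and on whether full feature names are requested).

```python
def to_rows(feature_dict):
    """Turn {"name": [v1, v2, ...]} into [{"name": v1, ...}, {"name": v2, ...}]."""
    names = list(feature_dict)
    return [dict(zip(names, values)) for values in zip(*feature_dict.values())]


# Made-up values in a column-oriented shape; real output depends on your data.
sample = {
    "driver_id": [1001, 1002],
    "conv_rate": [0.25, 0.5],
    "avg_daily_trips": [100, 200],
}

for row in to_rows(sample):
    print(row)
```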


## Online features from the Feature Server Service using REST API

In [None]:
!curl -X GET "http://feature-server-service:6566/get-online-features/" -H "Content-type: application/json" -H "Accept: application/json" -d '{"features": ["driver_hourly_stats:conv_rate","driver_hourly_stats:acc_rate","driver_hourly_stats:avg_daily_trips"],"entities": {"driver_id": [1001, 1002, 1003]},"full_feature_names": true}'|jq
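
The same request can also be issued from Python. The sketch below only builds the request so that it can be inspected without a running cluster; it mirrors the curl command above (same URL, method, headers, and JSON body). The helper name `build_request` is hypothetical, and the service hostname resolves only from inside the cluster.

```python
import json
import urllib.request

# URL copied from the curl command above; resolvable only inside the cluster.
URL = "http://feature-server-service:6566/get-online-features/"

FEATURES = [
    "driver_hourly_stats:conv_rate",
    "driver_hourly_stats:acc_rate",
    "driver_hourly_stats:avg_daily_trips",
]


def build_request(driver_ids, url=URL):
    """Build (but do not send) the request made by the curl command."""
    payload = {
        "features": FEATURES,
        "entities": {"driver_id": list(driver_ids)},
        "full_feature_names": True,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="GET",  # mirrors `curl -X GET` above
    )


req = build_request([1001, 1002, 1003])
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# To send it from inside the cluster:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```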