In [1]:
# WE MUST ENSURE PYTHON CONSISTENCY BETWEEN NOTEBOOK AND FEAST SERVERS
# LAUNCH THIS NOTEBOOK FROM A CLEAN PYTHON ENVIRONMENT >3.9
%pip install feast==0.40.1

Note: you may need to restart the kernel to use updated packages.


# Install Feast on Kind
## Objective

Provide a reference implementation of a runbook to deploy a Feast development environment on a Kubernetes cluster using [Kind](https://kind.sigs.k8s.io/docs/user/quick-start).


## Prerequisites
* [Kind](https://kind.sigs.k8s.io/) cluster and a Docker-compatible container runtime.
* [kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl) Kubernetes CLI tool.
* [Helm](https://helm.sh/) Kubernetes package manager.
* [yq](https://github.com/mikefarah/yq) YAML processor.

## Install Prerequisites
The following commands install and configure all the prerequisites on macOS. You can find the
equivalent instructions for other platforms on the official documentation pages:
* Install Kind and a Docker runtime (e.g. [Colima](https://github.com/abiosoft/colima)).
* Create a Kind cluster named `feast`.
* Install `kubectl` and set up its context.
* Install `Helm`.
* Install `yq`.
```bash
brew install colima
colima start
brew install kind
kind create cluster --name feast
brew install helm
brew install kubectl
kubectl config use-context kind-feast
brew install yq
```

Additionally, we create a `feast` namespace and use it as the default for the `kubectl` CLI:

In [2]:
!kubectl create ns feast
!kubectl config set-context --current --namespace feast

namespace/feast created
Context "kind-feast" modified.


Validate the cluster setup:

In [3]:
!kubectl get ns

NAME                 STATUS   AGE
default              Active   26h
feast                Active   3s
kube-node-lease      Active   26h
kube-public          Active   26h
kube-system          Active   26h
local-path-storage   Active   26h


## Deployment Architecture
The primary objective of this runbook is to guide the deployment of Feast services on a Kubernetes Kind cluster, using the default `postgres` template to set up a basic feature store.

> 🚀 We will also add instructions to repeat the example with a custom project, for a personalized experience.

In this notebook, we will deploy a distributed topology of Feast services, which includes:

* `Registry Server`: Exposes endpoints at the [default port 6570](https://github.com/feast-dev/feast/blob/89bc5512572130510dd18690309b5a392aaf73b1/sdk/python/feast/constants.py#L39) and handles metadata storage for feature definitions.
* `Online Store Server`: Exposes endpoints at the [default port 6566](https://github.com/feast-dev/feast/blob/4a6b663f80bc91d6de35ed2ec428d34811d17a18/sdk/python/feast/cli.py#L871-L872). This service uses the `Registry Server` to query metadata and is responsible for low-latency serving of features.
* `Offline Store Server`: Exposes endpoints at the [default port 8815](https://github.com/feast-dev/feast/blob/89bc5512572130510dd18690309b5a392aaf73b1/sdk/python/feast/constants.py#L42). It uses the `Registry Server` to query metadata and provides access to batch data for historical feature retrieval.

Each service is backed by a `PostgreSQL` database, which is also deployed within the same Kind cluster.

Finally, port forwarding will be configured to expose these Feast services locally. This will allow a local client, implemented in the accompanying client notebook, to interact with the deployed services.

## Install PostgreSQL
Apply the [reference deployment](./postgres/postgres.yaml) to install and configure a simple PostgreSQL database.

In [4]:
!kubectl apply -f postgres/postgres.yaml
!kubectl wait --for=condition=available deployment/postgres --timeout=2m

secret/postgres-secret created
persistentvolume/postgres-volume created
persistentvolumeclaim/postgres-volume-claim created
deployment.apps/postgres created
service/postgres created
deployment.apps/postgres condition met


In [5]:
!kubectl get pods
!kubectl get svc

NAME                       READY   STATUS    RESTARTS   AGE
postgres-76c8d94d6-pngvm   1/1     Running   0          8s
NAME       TYPE       CLUSTER-IP    EXTERNAL-IP   PORT(S)          AGE
postgres   NodePort   10.96.231.4   <none>        5432:30565/TCP   8s


## Create the feature store project
Use the `feast init` command to create the default project.

We also start port forwarding for the `postgres` service to populate the tables with default data.

> 🚀 If you want to use a custom configuration, place it under the `sample/feature_repo` folder and skip this section.

In [6]:
from src.utils import port_forward
psql_process = port_forward("postgres", 5432, 5432)

Port-forwarding postgres with process ID: 9611


Forwarding from 127.0.0.1:5432 -> 5432
Forwarding from [::1]:5432 -> 5432
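
The `port_forward` helper is imported from `src/utils.py`, which is not shown in this notebook. The following is a minimal sketch of what such a helper might look like, assuming it simply wraps `kubectl port-forward` in a background process. The function names `build_port_forward_cmd` and `port_forward_sketch`, and the `local_port`/`remote_port` argument order, are assumptions for illustration and may differ from the real helper.

```python
import subprocess


def build_port_forward_cmd(service, local_port, remote_port):
    """Return the kubectl arguments that forward local_port to the service's remote_port."""
    return ["kubectl", "port-forward", f"service/{service}", f"{local_port}:{remote_port}"]


def port_forward_sketch(service, local_port, remote_port):
    """Start `kubectl port-forward` in the background and return the process handle.

    Hypothetical stand-in for src.utils.port_forward; requires kubectl and a running cluster.
    """
    process = subprocess.Popen(build_port_forward_cmd(service, local_port, remote_port))
    print(f"Port-forwarding {service} with process ID: {process.pid}")
    return process


# Only the command is built here, so this line runs without a cluster.
print(build_port_forward_cmd("postgres", 5432, 5432))
```

Returning the `Popen` handle is what allows the forwarding to be stopped later with `terminate()`.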


We emulate the `feast init -t postgres sample` command in Python code. Running it this way lets us mock the interactive prompts
that ask for the DB connection parameters and confirm the upload of example data to the Postgres tables.

In [7]:
from feast.repo_operations import init_repo
from unittest import mock
from feast.templates.postgres.bootstrap import bootstrap

project_directory = "sample"
template = "postgres"

with mock.patch("click.prompt", side_effect=["localhost", "5432", "feast", "public", "feast", "feast"]):
  with mock.patch("click.confirm", side_effect=[True]):
    init_repo(project_directory, template)

Handling connection for 5432
Handling connection for 5432

Creating a new Feast repository in /Users/dmartino/projects/AI/feast/feast/examples/kind-quickstart/sample.
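
To illustrate how the mocked prompts work, here is a small self-contained sketch. It uses a plain `mock.Mock` as a stand-in for `click.prompt` and assumes the prompts are asked in the order host, port, database, schema, user, password, which is consistent with the generated `feature_store.yaml` but is not confirmed here against the Feast source. The `fields` list and `fake_prompt` name are illustrative only.

```python
from unittest import mock

# Answers supplied to the mocked prompt, in the order they are consumed
answers = ["localhost", "5432", "feast", "public", "feast", "feast"]
# Assumed meaning of each answer (hypothetical labels, for illustration)
fields = ["host", "port", "database", "db_schema", "user", "password"]

# With a list as side_effect, each call returns the next item of the list
fake_prompt = mock.Mock(side_effect=answers)

connection = {field: fake_prompt(f"Postgres {field}") for field in fields}
print(connection)
```

Once the list is exhausted, a further call raises `StopIteration`, so the number of answers has to match the number of prompts.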



Verify that the DB includes the expected tables with pre-populated data.

In [8]:
!PSQL_POD=$(kubectl get pods -l app=postgres -oname) && kubectl exec $PSQL_POD -- psql -h localhost -U feast feast -c '\dt'
!PSQL_POD=$(kubectl get pods -l app=postgres -oname) && kubectl exec $PSQL_POD -- psql -h localhost -U feast feast -c 'select count(*) from feast_driver_hourly_stats'

                 List of relations
 Schema |           Name            | Type  | Owner 
--------+---------------------------+-------+-------
 public | feast_driver_hourly_stats | table | feast
(1 row)

 count 
-------
  1807
(1 row)



Finally, let's stop port forwarding.

In [9]:
psql_process.terminate()
!ps -ef | grep port-forward

  501 10392  6947   0  1:12PM ttys051    0:00.12 /bin/zsh -c ps -ef | grep port-forward
  501 10394 10392   0  1:12PM ttys051    0:00.00 grep port-forward


### Generate server configurations
Each server has its own configuration, which we derive from the `feature_store.yaml` initialized in the previous section.

We use `yq` to edit the original configuration and generate one server-specific file per service.

Note: from now on, we assume that the Feast service names will be as follows:
* For `Registry Server`: `registry-server`
* For `Online Store`: `online-server`
* For `Offline Store`: `offline-server`

> 🚀 If you used different service names, replace the `host` parameter in the following `yq` commands.

In [10]:
%env FEATURE_REPO_DIR=sample/feature_repo
# Adjust the database host to match the postgres service
!yq -i '.registry.path="postgresql://feast:feast@postgres:5432/feast"' $FEATURE_REPO_DIR/feature_store.yaml
!yq -i '.online_store.host="postgres"' $FEATURE_REPO_DIR/feature_store.yaml
!yq -i '.offline_store.host="postgres"' $FEATURE_REPO_DIR/feature_store.yaml
!cat $FEATURE_REPO_DIR/feature_store.yaml

env: FEATURE_REPO_DIR=sample/feature_repo
project: sample
provider: local
registry:
  registry_type: sql
  path: postgresql://feast:feast@postgres:5432/feast
  cache_ttl_seconds: 60
  sqlalchemy_config_kwargs:
    echo: false
    pool_pre_ping: true
online_store:
  type: postgres
  host: postgres
  port: 5432
  database: feast
  db_schema: public
  user: feast
  password: feast
offline_store:
  type: postgres
  host: postgres
  port: 5432
  database: feast
  db_schema: public
  user: feast
  password: feast
entity_key_serialization_version: 2


In [11]:
# Registry server has only `registry` section
!cat $FEATURE_REPO_DIR/feature_store.yaml | yq '.project | {key: .}, .registry | {key: .}, .provider | {key: .}, .entity_key_serialization_version | {key: .}' > registry_feature_store.yaml
! cat registry_feature_store.yaml

project: sample
registry:
  registry_type: sql
  path: postgresql://feast:feast@postgres:5432/feast
  cache_ttl_seconds: 60
  sqlalchemy_config_kwargs:
    echo: false
    pool_pre_ping: true
provider: local
entity_key_serialization_version: 2


In [12]:
# Online server has `online_store` section, a remote `registry` and a remote `offline_store`
!cat $FEATURE_REPO_DIR/feature_store.yaml | yq '.project | {key: .}, .provider | {key: .}, .online_store  | {key: .}, .entity_key_serialization_version | {key: .}' > online_feature_store.yaml
!yq -i '.registry.path="registry-server:80"' online_feature_store.yaml
!yq -i '.registry.registry_type="remote"' online_feature_store.yaml
!yq -i '.offline_store.type="remote"' online_feature_store.yaml
!yq -i '.offline_store.host="offline-server"' online_feature_store.yaml
!yq -i '.offline_store.port=80' online_feature_store.yaml

!cat online_feature_store.yaml

project: sample
provider: local
online_store:
  type: postgres
  host: postgres
  port: 5432
  database: feast
  db_schema: public
  user: feast
  password: feast
entity_key_serialization_version: 2
registry:
  path: registry-server:80
  registry_type: remote
offline_store:
  type: remote
  host: offline-server
  port: 80


In [13]:
# Offline server has `offline_store` section and a remote `registry`
!cat $FEATURE_REPO_DIR/feature_store.yaml | yq '.project | {key: .}, .provider | {key: .}, .offline_store | {key: .}, .entity_key_serialization_version | {key: .}' > offline_feature_store.yaml
!yq -i '.registry.path="registry-server:80"' offline_feature_store.yaml
!yq -i '.registry.registry_type="remote"' offline_feature_store.yaml
!cat offline_feature_store.yaml

project: sample
provider: local
offline_store:
  type: postgres
  host: postgres
  port: 5432
  database: feast
  db_schema: public
  user: feast
  password: feast
entity_key_serialization_version: 2
registry:
  path: registry-server:80
  registry_type: remote
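
The three `yq` pipelines above boil down to selecting a few top-level keys from the base configuration and adding pointers to the remote services. The sketch below expresses the same derivation with plain Python dictionaries. The `base` dictionary is an abbreviated copy of the configuration (some fields are omitted), and `pick`, `REMOTE_REGISTRY` and `REMOTE_OFFLINE` are names introduced here for illustration.

```python
import copy

# Abbreviated copy of sample/feature_repo/feature_store.yaml (some fields omitted)
base = {
    "project": "sample",
    "provider": "local",
    "registry": {"registry_type": "sql", "path": "postgresql://feast:feast@postgres:5432/feast"},
    "online_store": {"type": "postgres", "host": "postgres", "port": 5432},
    "offline_store": {"type": "postgres", "host": "postgres", "port": 5432},
    "entity_key_serialization_version": 2,
}


def pick(config, keys):
    """Return a new dict holding deep copies of the selected top-level keys."""
    return {key: copy.deepcopy(config[key]) for key in keys}


REMOTE_REGISTRY = {"path": "registry-server:80", "registry_type": "remote"}
REMOTE_OFFLINE = {"type": "remote", "host": "offline-server", "port": 80}
common = ["project", "provider", "entity_key_serialization_version"]

# Registry server: keeps only the SQL registry section
registry_cfg = pick(base, common + ["registry"])
# Offline server: local offline store plus a remote registry
offline_cfg = {**pick(base, common + ["offline_store"]), "registry": dict(REMOTE_REGISTRY)}
# Online server: local online store plus remote registry and remote offline store
online_cfg = {
    **pick(base, common + ["online_store"]),
    "registry": dict(REMOTE_REGISTRY),
    "offline_store": dict(REMOTE_OFFLINE),
}

for name, cfg in [("registry", registry_cfg), ("offline", offline_cfg), ("online", online_cfg)]:
    print(name, sorted(cfg))
```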


### Encode configuration files
The next step is to encode the configuration file of each server in base64. We store the results in environment variables so that the `helm` commands in the following section can read them.

In [14]:
import base64
import os

def base64_file(path):
  # Read the file as raw bytes and return its base64 encoding as text
  with open(path, 'rb') as config_file:
    yaml_content = config_file.read()
  return base64.b64encode(yaml_content).decode('utf-8')

os.environ['REGISTRY_CONFIG_BASE64'] = base64_file('registry_feature_store.yaml')
os.environ['ONLINE_CONFIG_BASE64'] = base64_file('online_feature_store.yaml')
os.environ['OFFLINE_CONFIG_BASE64'] = base64_file('offline_feature_store.yaml')

In [15]:
!env | grep BASE64

REGISTRY_CONFIG_BASE64=cHJvamVjdDogc2FtcGxlCnJlZ2lzdHJ5OgogIHJlZ2lzdHJ5X3R5cGU6IHNxbAogIHBhdGg6IHBvc3RncmVzcWw6Ly9mZWFzdDpmZWFzdEBwb3N0Z3Jlczo1NDMyL2ZlYXN0CiAgY2FjaGVfdHRsX3NlY29uZHM6IDYwCiAgc3FsYWxjaGVteV9jb25maWdfa3dhcmdzOgogICAgZWNobzogZmFsc2UKICAgIHBvb2xfcHJlX3Bpbmc6IHRydWUKcHJvdmlkZXI6IGxvY2FsCmVudGl0eV9rZXlfc2VyaWFsaXphdGlvbl92ZXJzaW9uOiAyCg==
ONLINE_CONFIG_BASE64=cHJvamVjdDogc2FtcGxlCnByb3ZpZGVyOiBsb2NhbApvbmxpbmVfc3RvcmU6CiAgdHlwZTogcG9zdGdyZXMKICBob3N0OiBwb3N0Z3JlcwogIHBvcnQ6IDU0MzIKICBkYXRhYmFzZTogZmVhc3QKICBkYl9zY2hlbWE6IHB1YmxpYwogIHVzZXI6IGZlYXN0CiAgcGFzc3dvcmQ6IGZlYXN0CmVudGl0eV9rZXlfc2VyaWFsaXphdGlvbl92ZXJzaW9uOiAyCnJlZ2lzdHJ5OgogIHBhdGg6IHJlZ2lzdHJ5LXNlcnZlcjo4MAogIHJlZ2lzdHJ5X3R5cGU6IHJlbW90ZQpvZmZsaW5lX3N0b3JlOgogIHR5cGU6IHJlbW90ZQogIGhvc3Q6IG9mZmxpbmUtc2VydmVyCiAgcG9ydDogODAK
OFFLINE_CONFIG_BASE64=cHJvamVjdDogc2FtcGxlCnByb3ZpZGVyOiBsb2NhbApvZmZsaW5lX3N0b3JlOgogIHR5cGU6IHBvc3RncmVzCiAgaG9zdDogcG9zdGdyZXMKICBwb3J0OiA1NDMyCiAgZGF0YWJhc2U6IGZlYXN0CiAgZGJfc2NoZW1hOiBwdWJs
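
A quick way to sanity-check an encoded value is to decode it back to text. The snippet below is a minimal sketch: `decode_config` is a helper name introduced here, and the short encoded string is just the first 20 characters of the `REGISTRY_CONFIG_BASE64` value printed above.

```python
import base64


def decode_config(encoded):
    """Decode a base64 string back into the YAML text it was built from."""
    return base64.b64decode(encoded).decode("utf-8")


# Round trip on a small YAML fragment
sample_yaml = "project: sample\nprovider: local\n"
encoded = base64.b64encode(sample_yaml.encode("utf-8")).decode("utf-8")
assert decode_config(encoded) == sample_yaml

# First 20 characters of the REGISTRY_CONFIG_BASE64 value shown above
print(decode_config("cHJvamVjdDogc2FtcGxl"))
```

In the notebook, the same helper could be applied to `os.environ['REGISTRY_CONFIG_BASE64']` to print the full registry configuration.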

## Install servers
We'll use the charts defined in this local repository to install the servers.

The installation order reflects the dependency between the deployments:
* `Registry Server` starts first because it has no dependencies.
* `Offline Server` comes next, as it depends only on the `Registry Server`.
* `Online Server` comes last, as it depends on both of the other servers.
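
The ordering above is a topological sort of the dependency graph. As an illustrative sketch (not part of the deployment itself), the standard-library `graphlib` module, available from Python 3.9, can derive it from the dependencies. The `dependencies` mapping uses the service names assumed earlier in this notebook.

```python
from graphlib import TopologicalSorter

# Each service maps to the set of services it depends on
dependencies = {
    "registry-server": set(),
    "offline-server": {"registry-server"},
    "online-server": {"registry-server", "offline-server"},
}

# static_order() yields every service after all of its dependencies
install_order = list(TopologicalSorter(dependencies).static_order())
print(install_order)
```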

In [16]:
%env FEAST_IMAGE_REPO=feastdev/feature-server
%env FEAST_IMAGE_VERSION=0.40.1

env: FEAST_IMAGE_REPO=feastdev/feature-server
env: FEAST_IMAGE_VERSION=0.40.1


In [17]:
# Registry
!helm upgrade --install feast-registry ../../infra/charts/feast-feature-server \
--set fullnameOverride=registry-server --set feast_mode=registry \
--set image.repository=${FEAST_IMAGE_REPO} --set image.tag=${FEAST_IMAGE_VERSION} \
--set feature_store_yaml_base64=$REGISTRY_CONFIG_BASE64

!kubectl wait --for=condition=available deployment/registry-server --timeout=2m

Release "feast-registry" does not exist. Installing it now.
NAME: feast-registry
LAST DEPLOYED: Tue Sep 17 13:14:05 2024
NAMESPACE: feast
STATUS: deployed
REVISION: 1
TEST SUITE: None
deployment.apps/registry-server condition met


In [18]:
# Offline
!helm upgrade --install feast-offline ../../infra/charts/feast-feature-server \
--set fullnameOverride=offline-server --set feast_mode=offline \
--set image.repository=${FEAST_IMAGE_REPO} --set image.tag=${FEAST_IMAGE_VERSION} \
--set feature_store_yaml_base64=$OFFLINE_CONFIG_BASE64

!kubectl wait --for=condition=available deployment/offline-server --timeout=2m

Release "feast-offline" does not exist. Installing it now.
NAME: feast-offline
LAST DEPLOYED: Tue Sep 17 13:14:33 2024
NAMESPACE: feast
STATUS: deployed
REVISION: 1
TEST SUITE: None
deployment.apps/offline-server condition met


In [19]:
# Online
!helm upgrade --install feast-online ../../infra/charts/feast-feature-server \
--set fullnameOverride=online-server --set feast_mode=online \
--set image.repository=${FEAST_IMAGE_REPO} --set image.tag=${FEAST_IMAGE_VERSION} \
--set feature_store_yaml_base64=$ONLINE_CONFIG_BASE64

!kubectl wait --for=condition=available deployment/online-server --timeout=2m

Release "feast-online" does not exist. Installing it now.
NAME: feast-online
LAST DEPLOYED: Tue Sep 17 13:14:55 2024
NAMESPACE: feast
STATUS: deployed
REVISION: 1
TEST SUITE: None
deployment.apps/online-server condition met
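
The three `helm` commands differ only in the release name, the service name, the mode and the encoded configuration. A minimal sketch of how they could be generated from Python is shown below. `helm_install_args` is a hypothetical helper introduced here, and it only builds the argument list using the same flags as the cells above.

```python
# Chart path used by the helm commands in this notebook
CHART_PATH = "../../infra/charts/feast-feature-server"


def helm_install_args(release, service_name, mode, config_base64, image_repo, image_tag):
    """Build the `helm upgrade --install` argument list for one Feast server."""
    return [
        "helm", "upgrade", "--install", release, CHART_PATH,
        "--set", f"fullnameOverride={service_name}",
        "--set", f"feast_mode={mode}",
        "--set", f"image.repository={image_repo}",
        "--set", f"image.tag={image_tag}",
        "--set", f"feature_store_yaml_base64={config_base64}",
    ]


# Placeholder value instead of a real encoded configuration
args = helm_install_args(
    "feast-registry", "registry-server", "registry",
    "<base64-config>", "feastdev/feature-server", "0.40.1",
)
print(" ".join(args))
```

The resulting list could be passed to `subprocess.run(args, check=True)` to perform the installation.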


### Validate deployment
First, validate the application and service status:

In [20]:
!kubectl get svc
!kubectl get deployments
!kubectl get pods

NAME              TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)          AGE
offline-server    ClusterIP   10.96.24.216   <none>        80/TCP           44s
online-server     ClusterIP   10.96.36.113   <none>        80/TCP           22s
postgres          NodePort    10.96.231.4    <none>        5432:30565/TCP   4m14s
registry-server   ClusterIP   10.96.128.48   <none>        80/TCP           71s
NAME              READY   UP-TO-DATE   AVAILABLE   AGE
offline-server    1/1     1            1           44s
online-server     1/1     1            1           22s
postgres          1/1     1            1           4m14s
registry-server   1/1     1            1           71s
NAME                               READY   STATUS    RESTARTS   AGE
offline-server-6c59467c75-9jvq7    1/1     Running   0          45s
online-server-76968bbc48-qlvvj     1/1     Running   0          23s
postgres-76c8d94d6-pngvm           1/1     Running   0          4m15s
registry-server-597c5cd445-nrm75   1/1     Runn

Then verify the content of the configuration file on each server (it is stored in a randomly named subfolder under `/tmp/`).

In [21]:
!kubectl exec deployment/registry-server -- find /tmp -name feature_store.yaml -exec cat {} \;

project: sample
registry:
  registry_type: sql
  path: postgresql://feast:feast@postgres:5432/feast
  cache_ttl_seconds: 60
  sqlalchemy_config_kwargs:
    echo: false
    pool_pre_ping: true
provider: local
entity_key_serialization_version: 2


In [22]:
!kubectl exec deployment/offline-server -- find /tmp -name feature_store.yaml -exec cat {} \;

project: sample
provider: local
offline_store:
  type: postgres
  host: postgres
  port: 5432
  database: feast
  db_schema: public
  user: feast
  password: feast
entity_key_serialization_version: 2
registry:
  path: registry-server:80
  registry_type: remote


In [23]:
!kubectl exec deployment/online-server -- find /tmp -name feature_store.yaml -exec cat {} \;

project: sample
provider: local
online_store:
  type: postgres
  host: postgres
  port: 5432
  database: feast
  db_schema: public
  user: feast
  password: feast
entity_key_serialization_version: 2
registry:
  path: registry-server:80
  registry_type: remote
offline_store:
  type: remote
  host: offline-server
  port: 80


Finally, let's verify the `feast` version in each server:

In [24]:
!kubectl exec deployment/registry-server -- feast version
!kubectl exec deployment/offline-server -- feast version
!kubectl exec deployment/online-server -- feast version

<jemalloc>: MADV_DONTNEED does not work (memset will be used instead)
<jemalloc>: (This is the expected behaviour if you are running under QEMU)
Feast SDK Version: "0.40.1"
<jemalloc>: MADV_DONTNEED does not work (memset will be used instead)
<jemalloc>: (This is the expected behaviour if you are running under QEMU)
Feast SDK Version: "0.40.1"
<jemalloc>: MADV_DONTNEED does not work (memset will be used instead)
<jemalloc>: (This is the expected behaviour if you are running under QEMU)
Feast SDK Version: "0.40.1"
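
Since the notebook and the servers are expected to run the same Feast version, the check above can be automated. The following sketch parses the version strings out of the command output and ignores the `jemalloc` warnings. `extract_versions` and `sample_output` are names introduced here for illustration, and `sample_output` is a shortened copy of the output shown above.

```python
import re

# Matches lines such as: Feast SDK Version: "0.40.1"
VERSION_PATTERN = re.compile(r'Feast SDK Version: "([^"]+)"')


def extract_versions(output):
    """Return every Feast SDK version found in the given command output."""
    return VERSION_PATTERN.findall(output)


# Shortened copy of the `feast version` output from two servers
sample_output = """<jemalloc>: MADV_DONTNEED does not work (memset will be used instead)
Feast SDK Version: "0.40.1"
<jemalloc>: (This is the expected behaviour if you are running under QEMU)
Feast SDK Version: "0.40.1"
"""

versions = extract_versions(sample_output)
# All servers should report the version installed in the notebook environment
assert set(versions) == {"0.40.1"}
print(versions)
```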
