# Hazelcast Feast Integration Streaming Features Demo

## Setup

* Offline store: PostgreSQL
* Online store: Hazelcast

Hazelcast runs as a single-member cluster.

## Demo Data

The demo uses synthetically generated credit card transaction data encoded as [JSON Lines](https://jsonlines.org/).

The data is streamed to the `transaction` Kafka topic. A sample record, pretty-printed here for readability:
```json
{
  "acct_num": "BELQ94233230477440",
  "amt": 1217.6299129927615,
  "unix_time": 1721610473
}
```
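
To make the record format concrete, here is a minimal Python sketch that parses JSON Lines text with the three fields shown above. The function name `parse_transactions` and the added `event_time` key are illustrative assumptions, not part of the demo code.

```python
import json
from datetime import datetime, timezone


def parse_transactions(lines):
    """Parse an iterable of JSON Lines strings into transaction dicts."""
    transactions = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # JSON Lines files may contain blank lines; skip them
        record = json.loads(line)
        # unix_time is seconds since the epoch; keep it and add a UTC datetime.
        # "event_time" is a hypothetical key added only for this illustration.
        record["event_time"] = datetime.fromtimestamp(
            record["unix_time"], tz=timezone.utc
        )
        transactions.append(record)
    return transactions


sample = [
    '{"acct_num": "BELQ94233230477440", "amt": 1217.6299129927615, "unix_time": 1721610473}'
]
parsed = parse_transactions(sample)
print(parsed[0]["acct_num"], parsed[0]["event_time"].isoformat())
```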

You can peek at the data streaming into the `transaction` topic using the following command:

In [None]:
! kafkactl consume transaction --tail 5

## Offline Feature Storage

Offline features are stored in PostgreSQL tables.
The tables are populated by Hazelcast Jet jobs that read data from files and transform it to create the features.

The tables are defined as follows:

```sql
create table user_transaction_count_7d (
    id serial primary key,
    user_id text,
    transaction_count_7d integer,
    feature_timestamp timestamp
);
```

This table was created when the PostgreSQL container for this demo started.
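
As an illustration of what the `transaction_count_7d` column could hold, the sketch below counts transactions per account over a trailing seven-day window in plain Python. It assumes that `user_id` corresponds to the `acct_num` field of a transaction and that the window ends at a given reference time; the actual Jet jobs may define the window differently.

```python
from collections import Counter

SEVEN_DAYS_SECONDS = 7 * 24 * 60 * 60


def transaction_count_7d(transactions, now):
    """Count transactions per account in the 7 days ending at `now`.

    `transactions` is an iterable of dicts with "acct_num" and "unix_time"
    keys; `now` is a Unix timestamp in seconds. The window is assumed to be
    (now - 7 days, now], which is an illustrative choice.
    """
    window_start = now - SEVEN_DAYS_SECONDS
    counts = Counter()
    for tx in transactions:
        if window_start < tx["unix_time"] <= now:
            counts[tx["acct_num"]] += 1
    return dict(counts)
```

Each key/value pair of the returned dictionary would map to the `user_id` and `transaction_count_7d` columns of one row, with `now` as the `feature_timestamp`.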

## Feast Setup

The Feast project is in the `feature_repo` directory.
You can take a look at the Feast configuration using the following command:

In [None]:
! cat feature_repo/feature_store.yaml

The feature views are defined in the `features.py` file.
Run the following command to see its contents:

In [None]:
! cat feature_repo/features.py

Before you can use the features, you must register them by running the following command:

In [None]:
! feast -c feature_repo apply

At this point, you are ready to start the feature server.
Because a `!` command blocks the notebook cell until it exits, the server has to run in a separate background process.
The Python code below is equivalent to running:
```
feast -c feature_repo serve -h 0.0.0.0 -p 6566 --no-access-log
```

In [None]:
import subprocess
feature_server = subprocess.Popen(
    ["feast", "-c", "feature_repo", "serve", "-h", "0.0.0.0", "-p",  "6566", "--no-access-log"],
    stderr=subprocess.DEVNULL
)
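
Because the server starts in the background, the next cells may run before it is ready to accept requests. The following sketch polls the port until a TCP connection succeeds. The helper name `wait_for_port` and its default timeout values are assumptions made for illustration; they are not part of the demo.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out; wait briefly and retry.
            time.sleep(interval)
    return False
```

Calling `wait_for_port("localhost", 6566)` after starting the server would block until the port opens or 30 seconds pass. When you are done with the demo, `feature_server.terminate()` stops the background process.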

## Jet Job

The Jet job creates `user_transaction_count` from the transactions streamed to the Kafka topic `transaction`.
You can see how the pipeline is defined by running the following command:

In [None]:
! cat jet/streaming_features/src/main/java/com/example/Main.java

You have to compile the Java code that creates the Jet pipeline.
A script is provided to do that from within this Jupyter Notebook:

In [None]:
! run build_jet streaming_features

You can now create the Jet pipeline and run the job:

In [None]:
! clc job submit --name transform_features build/jet/streaming_features/libs/*.jar http://demo:6566 kafka:19092

You can list the jobs and check their status using the following command:

In [None]:
! clc job list

## Checking the Created Features

The Jet job created several maps in the Hazelcast cluster, one for each feature.
You can list them using the following command:

In [None]:
! clc object list map

You can also inspect the entries of one of the feature maps:

In [None]:
! clc map -n feast_streaming_user_transaction_count_7d entry-set | head -10

You can retrieve features from the feature server in a human-readable format:

In [None]:
! curl "http://localhost:6566/get-online-features" -d \
'{\
    "features": [\
      "user_transaction_count_7d:transaction_count_7d"\
    ],\
    "entities": {\
      "user_id": ["EBJD80665876768751", "YVCV56500100273531", "QRQP56813768247223"]\
    }\
 }' | jq
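
The same request can be sent from Python using only the standard library. This is a minimal sketch that mirrors the `curl` call above; the function names `build_online_features_request` and `get_online_features` are illustrative and do not come from the demo or from Feast.

```python
import json
import urllib.request


def build_online_features_request(base_url, features, entities):
    """Build a POST request for the feature server's /get-online-features endpoint."""
    payload = {"features": features, "entities": entities}
    return urllib.request.Request(
        url=f"{base_url}/get-online-features",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def get_online_features(base_url, features, entities, timeout=10):
    """Send the request and return the decoded JSON response."""
    request = build_online_features_request(base_url, features, entities)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the feature server running, `get_online_features("http://localhost:6566", ["user_transaction_count_7d:transaction_count_7d"], {"user_id": ["EBJD80665876768751"]})` should return the same JSON document that `curl` prints.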

## What's Next?

(Link to Hazelcast Feast documentation.)