# Hazelcast Feast Integration Batch Features Demo

## Setup

* Offline store: PostgreSQL
* Online store: Hazelcast

Hazelcast runs as a single-member cluster.

## Demo Data

`demo_data.jsonl` contains randomly generated credit card transaction data encoded as [JSON lines](https://jsonlines.org/).
Each line of the file holds one transaction. A record looks like this (pretty-printed here for readability):

```json
{
  "acct_num": "UKFH75714629958700",
  "amt": 189.22,
  "unix_time": 1722574800
}
```

The data file exists in the member container, so it can be accessed by Jet jobs.
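To inspect records of this shape outside of Jet, a minimal Python sketch like the one below can help. It parses an in-memory stand-in for `demo_data.jsonl` rather than the real file, and it assumes that `unix_time` is expressed in seconds since the epoch (UTC), which the demo does not state explicitly. The helper name `read_transactions` is made up for this illustration.

```python
import io
import json
from datetime import datetime, timezone


def read_transactions(lines):
    """Yield one dict per non-empty JSON Lines record."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# In-memory stand-in for demo_data.jsonl: one JSON object per line.
sample = io.StringIO(
    '{"acct_num": "UKFH75714629958700", "amt": 189.22, "unix_time": 1722574800}\n'
)

transactions = list(read_transactions(sample))
first = transactions[0]

# Assumption: unix_time is in seconds since the epoch, interpreted as UTC.
event_time = datetime.fromtimestamp(first["unix_time"], tz=timezone.utc)
print(first["acct_num"], first["amt"], event_time.isoformat())
# prints: UKFH75714629958700 189.22 2024-08-02T05:00:00+00:00
```

To read the real file, pass an open file handle instead of the `io.StringIO` object.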

## Offline Feature Storage

In this demo, offline features are stored in a single PostgreSQL table.
The table is populated by a Hazelcast Jet job that reads the data from `demo_data.jsonl` and transforms it to create the features.

The table is defined as follows:

```sql
create table user_transaction_count_7d (
    id serial primary key,
    user_id text,
    transaction_count_7d integer,
    feature_timestamp timestamp
);
```

The table was already created when the PostgreSQL container started.
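The actual transformation is implemented by the Jet job in Java. As a rough illustration only, the sketch below shows one plausible reading of `transaction_count_7d`: the number of transactions per account in the 7 days ending at a reference time. The function name, the window boundaries, the reference time `as_of`, and the second account number are assumptions made for this example and may not match what the Jet pipeline does.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def transaction_count_7d(transactions, as_of):
    """Count transactions per account in the 7 days ending at `as_of`.

    Assumption: the window is (as_of - 7 days, as_of], and unix_time is
    in seconds since the epoch (UTC).
    """
    window_start = as_of - timedelta(days=7)
    counts = Counter()
    for tx in transactions:
        ts = datetime.fromtimestamp(tx["unix_time"], tz=timezone.utc)
        if window_start < ts <= as_of:
            counts[tx["acct_num"]] += 1
    return dict(counts)


base = 1722574800  # 2024-08-02T05:00:00 UTC, taken from the sample record
example_transactions = [
    {"acct_num": "UKFH75714629958700", "amt": 189.22, "unix_time": base},
    {"acct_num": "UKFH75714629958700", "amt": 20.00, "unix_time": base - 3600},
    # Older than 7 days, so it falls outside the window:
    {"acct_num": "UKFH75714629958700", "amt": 5.00, "unix_time": base - 8 * 86400},
    # Hypothetical second account, for illustration only:
    {"acct_num": "HYPOTHETICAL0001", "amt": 42.00, "unix_time": base - 86400},
]

as_of = datetime(2024, 8, 2, 6, 0, tzinfo=timezone.utc)
feature_rows = transaction_count_7d(example_transactions, as_of)
print(feature_rows)
# prints: {'UKFH75714629958700': 2, 'HYPOTHETICAL0001': 1}
```

In terms of the table above, each key would map to `user_id`, each value to `transaction_count_7d`, and `as_of` to `feature_timestamp`, again under the assumptions stated here.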

In order to access PostgreSQL tables from Jet jobs, a data connection must be created.
You can do that by running an SQL script using CLC:

In [None]:
! clc script run --echo etc/create_data_connection.sql

You can verify that the data connection is active by running the following command:

In [None]:
! clc sql "show resources for demo"

## Feast Setup

The Feast project is in the `feature_repo` directory.
You can take a look at the Feast configuration using the following command:

In [None]:
! cat feature_repo/feature_store.yaml

The feature views are defined in `fraud_features.py`.
Run the following command to see its contents:

In [None]:
! cat feature_repo/fraud_features.py

Before you can use the features, you must register them with Feast by running the following command:

In [None]:
! feast -c feature_repo apply

## Jet Job

The Jet job reads transactions from `demo_data.jsonl` and populates the `user_transaction_count_7d` table in the PostgreSQL database.
You can see how the pipeline is defined by running the following command:

In [None]:
! cat jet/batch_features/src/main/java/com/example/Main.java

You have to compile the Java code that creates the Jet pipeline.
A helper script is provided to do that from this Jupyter notebook:

In [None]:
! run build_jet batch_features

You can now create the Jet pipeline and run the job:

In [None]:
! clc job submit build/jet/batch_features/libs/*.jar /home/hazelcast/data

You can list the jobs and verify that they completed successfully using the following command:

In [None]:
! clc job list

## Materialization

Materialization is the process of transferring features from the offline store to the online store. In this case, that means from PostgreSQL to Hazelcast.
Run the following command to do that:

In [None]:
! feast -c feature_repo materialize-incremental "2024-07-24T08:00:00"

Running the command above creates an IMap in the Hazelcast cluster that corresponds to the `user_transaction_count_7d` feature.
You can list it using the following command:

In [None]:
! clc object list map

Inspect the contents of the feature IMap to see the data written by Feast:

In [None]:
! clc map -n feast_batch_user_transaction_count_7d entry-set | head -10
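Once the features are materialized, they can also be read back through the Feast Python SDK. The following is a minimal sketch, not taken from the demo itself. It assumes that the feature view is named `user_transaction_count_7d`, that its feature is `transaction_count_7d`, and that the entity join key is `user_id`; check `feature_repo/fraud_features.py` for the actual names. The helper `fetch_transaction_count` is hypothetical.

```python
def fetch_transaction_count(store, user_id):
    """Look up transaction_count_7d for one user in the online store.

    `store` is expected to be a feast.FeatureStore. The feature view,
    feature, and entity key names below are assumptions for this sketch.
    """
    response = store.get_online_features(
        features=["user_transaction_count_7d:transaction_count_7d"],
        entity_rows=[{"user_id": user_id}],
    )
    # to_dict() returns a mapping from column name to a list of values,
    # with one list element per entity row.
    values = response.to_dict()
    return values["transaction_count_7d"][0]


# Usage from the notebook (needs the feast package and the running cluster):
#
#   from feast import FeatureStore
#   store = FeatureStore(repo_path="feature_repo")
#   print(fetch_transaction_count(store, "UKFH75714629958700"))
```

Passing the store in as a parameter keeps the helper free of connection setup, so it can be reused with whichever `FeatureStore` instance the notebook already holds.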

## What's Next?

(Link to Hazelcast Feast documentation.)