# ANOVOS - Feast Integration
This notebook demonstrates the Feast integration provided by the ANOVOS package and shows how to invoke it.
It also contains the code needed for a minimal data flow.
* [Read Dataset](#Read-Dataset)
* [Write Datasets and export feature definitions](#Write-Datasets-and-export-feature-definitions)

**Setting Spark Session**

In [None]:
# The wildcard import provides the `spark` session and the `sc` context used below
from anovos.shared.spark import *

sc.setLogLevel("ERROR")
import warnings
warnings.filterwarnings('ignore')

**Input/Output Path**

In [None]:
inputPath = "../data/income_dataset/csv"
inputPath_parq = "../data/income_dataset/parquet"
inputPath_join = "../data/income_dataset/join"
outputPath = "../output/income_dataset/"

# Read Dataset

- The API specification of the **read_dataset** function can be found <a href="https://docs.anovos.ai/api/data_ingest/data_ingest.html">here</a>
- Currently supported file types: csv, parquet, avro

In [None]:
from anovos.data_ingest.data_ingest import read_dataset

In [None]:
df = read_dataset(
    spark,
    file_path=inputPath,
    file_type="csv",
    file_configs={"header": "True", "delimiter": ",", "inferSchema": "True"},
)
df.toPandas().head(5)

# Write Datasets and export feature definitions

- A description of the feature store configuration can be found <a href="https://docs.anovos.ai/using-anovos/feature_store.html">here</a>
- The API specification of the **generate_feature_description** function can be found <a href="https://docs.anovos.ai/api/feature_store/feast_exporter.html">here</a>
- Limitations:
    - The file output must be written with `repartition` set to 1
    - Incremental updates are not possible
       

In [None]:
from anovos.feature_store import feast_exporter

In [None]:
# Example 1 - add timestamp columns to df
entity_config = {
    "name": "income",
    "id_col": "ifa",
    "description": "write_feast_features",
}

file_source_config = {
    "owner": "test@owner.com",
    "description": "data source description",
    "timestamp_col": "event_time",
    "create_timestamp_col": "create_time_col",
}

feature_view_config = {
    "name": "income_view",
    "ttl_in_seconds": 3600000,
    "owner": "view@owner.com",
    "create_timestamps": True,
}

write_feast_features = {
    "entity": entity_config,
    "file_source": file_source_config,
    "feature_view": feature_view_config,
    "file_path": "../data/feast_repo",
    "service_name": "income_feature_service"
}
# In practice, this configuration would be read from a YAML file


file_source_config = write_feast_features["file_source"]
df = feast_exporter.add_timestamp_columns(df, file_source_config)
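
Before handing the configuration to the exporter, it can help to sanity-check it. The following is a minimal sketch using only the standard library. The set of expected keys is an assumption taken from the example configuration in this notebook, not a schema published by ANOVOS, and the helper names `missing_config_keys` and `ttl_as_timedelta` are hypothetical. The second helper converts `ttl_in_seconds` into a `timedelta`, which makes it easier to see that 3600000 seconds is a little under 42 days.

```python
from datetime import timedelta

# Top-level keys used by the configuration in this notebook.
# Assumption: this mirrors the example above, not an official schema.
EXPECTED_KEYS = {"entity", "file_source", "feature_view", "file_path", "service_name"}


def missing_config_keys(config):
    """Return the expected top-level keys that are absent from config, sorted."""
    return sorted(EXPECTED_KEYS - set(config))


def ttl_as_timedelta(feature_view_config):
    """Convert the ttl_in_seconds entry into a timedelta for easier reading."""
    return timedelta(seconds=feature_view_config["ttl_in_seconds"])


# A deliberately incomplete configuration: "service_name" is left out.
example_config = {
    "entity": {"name": "income", "id_col": "ifa"},
    "file_source": {"timestamp_col": "event_time"},
    "feature_view": {"name": "income_view", "ttl_in_seconds": 3600000},
    "file_path": "../data/feast_repo",
}

print(missing_config_keys(example_config))  # ['service_name']
print(ttl_as_timedelta(example_config["feature_view"]))  # 41 days, 16:00:00
```

Calling `missing_config_keys(write_feast_features)` on the full configuration from the cell above should return an empty list.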

In [None]:
from anovos.data_ingest.data_ingest import write_dataset

In [None]:
write_dataset(df, outputPath, "parquet", {"repartition": 1, "mode": "overwrite"})

In [None]:
import os 
import glob

In [None]:
# Example 1 - write the Feast feature configuration into the Feast repository
# The dataset was written to outputPath above, so the part file is looked up there
path = os.path.join(outputPath, "part*")
filename = glob.glob(path)[0]
feast_exporter.generate_feature_description(df.dtypes, write_feast_features, filename)
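
Taking the first glob match silently picks an arbitrary file if the output was written with more than one partition, and raises an `IndexError` if nothing was written. As a hedged sketch, a small helper (the name `find_single_part_file` is hypothetical and not part of ANOVOS) can make the `repartition` limitation explicit by requiring exactly one part file. The demonstration uses a temporary directory as a stand-in for the Spark output folder.

```python
import glob
import os
import tempfile


def find_single_part_file(output_dir):
    """Return the only part file in output_dir, or raise if there is not exactly one."""
    matches = sorted(glob.glob(os.path.join(output_dir, "part*")))
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one part file in {output_dir!r}, found {len(matches)}"
        )
    return matches[0]


# Demonstration on a temporary directory standing in for the Spark output folder.
with tempfile.TemporaryDirectory() as tmp_dir:
    open(os.path.join(tmp_dir, "part-00000.snappy.parquet"), "w").close()
    open(os.path.join(tmp_dir, "_SUCCESS"), "w").close()  # marker file, not matched by "part*"
    found = find_single_part_file(tmp_dir)
    print(os.path.basename(found))  # part-00000.snappy.parquet
```

In this notebook, the result of `find_single_part_file(outputPath)` could be passed as the file name argument in place of the first glob match.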