## How-to guide for Feature Store on the Abacus.AI platform
This notebook provides a hands-on environment for building and deploying a feature store with Abacus.AI.

We'll be using the [User Movie Ratings](https://s3.amazonaws.com/abacusai.exampledatasets/user_recommendations/user_movie_ratings.csv), [Movies Metadata](https://s3.amazonaws.com/abacusai.exampledatasets/user_recommendations/movies_metadata.csv) and [User Metadata](https://s3.amazonaws.com/abacusai.exampledatasets/user_recommendations/users_metadata.csv) datasets.

1. Install the Abacus.AI library.

In [None]:
!pip install abacusai

2. Add your Abacus.AI [API Key](https://abacus.ai/app/profile/apikey) generated using the API dashboard as follows:

In [None]:
#@title Abacus.AI API Key

api_key = ''  #@param {type: "string"}
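
A key pasted into a notebook cell is easy to leak when the notebook is shared. A minimal sketch of an alternative, assuming the key has been exported in an environment variable beforehand; the variable name `ABACUS_API_KEY` and the helper `load_api_key` are hypothetical and not part of the Abacus.AI library:

```python
import os


def load_api_key(env=None, var_name="ABACUS_API_KEY"):
    """Return the API key stored in an environment variable.

    `var_name` is a hypothetical name chosen for this sketch.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable to your API key.")
    return key


# Usage (uncomment once the variable is set):
# api_key = load_api_key()
```

The helper fails loudly when the variable is missing, so an empty key does not reach `ApiClient`.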

3. Import the Abacus.AI library and instantiate a client.

In [None]:
from abacusai import ApiClient, ApiException
client = ApiClient(api_key)

## 1. Create a Project



In this notebook, we're going to create and deploy a feature store that automatically featurizes input data using the Movies Interaction and User/Item Metadata datasets.

In [None]:
project = client.create_project(name='Demo Movie Feature Store Project', use_case='FEATURE_STORE')

## 2. Creating Datasets

Abacus.AI can read datasets directly from blob storage.

We are using three datasets in this notebook:
- [User Movie Ratings](https://s3.amazonaws.com/abacusai.exampledatasets/user_recommendations/user_movie_ratings.csv): This dataset contains information about user interactions with movies.
- [Movies Metadata](https://s3.amazonaws.com/abacusai.exampledatasets/user_recommendations/movies_metadata.csv): This dataset contains information about the movies.
- [User Metadata](https://s3.amazonaws.com/abacusai.exampledatasets/user_recommendations/users_metadata.csv): This dataset contains information about users.


### Data Preview

In [None]:
!pip install fsspec
!pip install s3fs
import pandas as pd

In [None]:
pd.read_csv('s3://abacusai.exampledatasets/user_recommendations/user_movie_ratings.csv')

In [None]:
pd.read_csv('s3://abacusai.exampledatasets/user_recommendations/movies_metadata.csv')

In [None]:
pd.read_csv('s3://abacusai.exampledatasets/user_recommendations/users_metadata.csv')

### Add the datasets to Abacus.AI


Using the Create Dataset API, we point Abacus.AI at the public S3 URI of each dataset.



In [None]:
# if the datasets already exist, skip creation
try: 
  events_dataset = client.describe_dataset(client.describe_feature_group_by_table_name('user_movie_ratings').dataset_id)
  movies_dataset = client.describe_dataset(client.describe_feature_group_by_table_name('movies_metadata').dataset_id)
  users_dataset = client.describe_dataset(client.describe_feature_group_by_table_name('users_metadata').dataset_id)
except ApiException: # datasets not found
  events_dataset = client.create_dataset_from_file_connector(name='User Movie Ratings', table_name='user_movie_ratings', location='s3://abacusai.exampledatasets/user_recommendations/user_movie_ratings.csv')
  movies_dataset = client.create_dataset_from_file_connector(name='Movie Metadata', table_name='movies_metadata', location='s3://abacusai.exampledatasets/user_recommendations/movies_metadata.csv')
  users_dataset = client.create_dataset_from_file_connector(name='User Metadata', table_name='users_metadata', location='s3://abacusai.exampledatasets/user_recommendations/users_metadata.csv')

events_feature_group = client.describe_feature_group_by_table_name(table_name=events_dataset.feature_group_table_name)
events_feature_group.set_indexing_config(lookup_keys=['user_id'], update_timestamp_key='timestamp')

movies_feature_group = client.describe_feature_group_by_table_name(table_name=movies_dataset.feature_group_table_name)
movies_feature_group.set_indexing_config(primary_key='movie_id')

users_feature_group = client.describe_feature_group_by_table_name(table_name=users_dataset.feature_group_table_name)
users_feature_group.set_indexing_config(primary_key='user_id')
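
The `set_indexing_config` calls above use two kinds of key. As a rough mental model (an assumption based on the parameter names, not a description of Abacus.AI internals), a `primary_key` identifies at most one row per value, while `lookup_keys` may match many rows, which an `update_timestamp_key` can order. The plain-Python sketch below illustrates that difference on made-up rows; the helper names and sample values are hypothetical.

```python
from collections import defaultdict

# Made-up rows; only the key column names come from the notebook.
users = [{"user_id": "1", "age": 30}, {"user_id": "2", "age": 41}]
ratings = [
    {"user_id": "1", "movie_id": "10", "timestamp": 200},
    {"user_id": "1", "movie_id": "11", "timestamp": 100},
    {"user_id": "2", "movie_id": "10", "timestamp": 150},
]


def index_by_primary_key(rows, key):
    """Map each key value to exactly one row (later rows overwrite earlier ones)."""
    return {row[key]: row for row in rows}


def index_by_lookup_key(rows, key, timestamp_key):
    """Map each key value to all matching rows, oldest first."""
    index = defaultdict(list)
    for row in rows:
        index[row[key]].append(row)
    for matches in index.values():
        matches.sort(key=lambda row: row[timestamp_key])
    return dict(index)


users_index = index_by_primary_key(users, "user_id")
ratings_index = index_by_lookup_key(ratings, "user_id", "timestamp")

print(users_index["1"])         # a single row for user 1
print(len(ratings_index["1"]))  # number of rating rows for user 1
```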

In [None]:
for dataset in [events_dataset, movies_dataset, users_dataset]:
    dataset.wait_for_inspection()

In [None]:
# Join ratings with movie and user metadata into one combined feature group
combined_sql = (
    'SELECT * FROM user_movie_ratings '
    'JOIN movies_metadata USING (movie_id) '
    'JOIN users_metadata ON (user_movie_ratings.user_id = users_metadata.user_id)'
)
feature_group = client.create_feature_group(table_name='combined_user_movie_ratings', sql=combined_sql)
feature_group.add_to_project(project.project_id)
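
To see what the join in the SQL statement above produces, it can help to run the same query on a few toy rows. The sketch below uses Python's built-in `sqlite3` module as a stand-in engine. This is an assumption for illustration only: Abacus.AI's SQL engine may treat column names differently, and the `rating`, `title`, and `age` columns and all values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE user_movie_ratings (user_id TEXT, movie_id TEXT, rating REAL);
    CREATE TABLE movies_metadata (movie_id TEXT, title TEXT);
    CREATE TABLE users_metadata (user_id TEXT, age INTEGER);

    INSERT INTO user_movie_ratings VALUES ('1', '10', 5.0), ('2', '11', 3.0), ('3', '10', 4.0);
    INSERT INTO movies_metadata VALUES ('10', 'Movie A'), ('11', 'Movie B');
    INSERT INTO users_metadata VALUES ('1', 30), ('2', 41);
    """
)

# Same query text as the feature group definition
query = (
    "SELECT * FROM user_movie_ratings "
    "JOIN movies_metadata USING (movie_id) "
    "JOIN users_metadata ON (user_movie_ratings.user_id = users_metadata.user_id)"
)
cursor = conn.execute(query)
column_names = [col[0] for col in cursor.description]
rows = cursor.fetchall()

print(column_names)
for row in rows:
    print(row)
```

Both joins are inner joins, so the rating from user `'3'`, who has no row in `users_metadata`, is dropped. `USING (movie_id)` merges `movie_id` into a single output column; the printed column names show how the `ON` join handles `user_id` in this engine.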

### Managing tags and security

We can add or remove user-friendly tags on this feature group, and lock it to prevent unauthorized editing.

In [None]:
feature_group.add_tag('user interactions with movies')
feature_group.add_tag('test')
print(feature_group.refresh().tags)
feature_group.remove_tag('test')
print(feature_group.refresh().tags)

In [None]:
feature_group.set_modifier_lock(True)
print(feature_group.list_modifiers())
feature_group.add_user_to_modifiers('austin@abacus.ai')
print(feature_group.list_modifiers())
feature_group.remove_user_from_modifiers('austin@abacus.ai')
feature_group.set_modifier_lock(False)

### Materialize Feature Group Data

In [None]:
feature_group.set_indexing_config(lookup_keys=['user_id', 'movie_id'])

In [None]:
feature_group_version = feature_group.create_version()
feature_group_version.wait_for_results()

### Inspect Data locally

In [None]:
df = feature_group_version.load_as_pandas()
df

### Export Feature Group Data

Before you can export the feature group data, you must first authorize Abacus.AI to write to your blob storage provider [here](https://abacus.ai/profile/connectors).

In [None]:
import time
# Use whole seconds so the file name does not contain a fractional part
external_export_uri = f's3://featurestore-export/demo/export_{int(time.time())}.csv'
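
An epoch-seconds suffix is unique but hard to read. A minimal sketch of a variant that builds a sortable, human-readable file name from a UTC timestamp; the helper `build_export_uri` is hypothetical, and the bucket path is the one used in this notebook:

```python
from datetime import datetime, timezone


def build_export_uri(prefix="s3://featurestore-export/demo", now=None):
    """Return a CSV export URI with a UTC timestamp such as 20240131T235959Z."""
    now = datetime.now(timezone.utc) if now is None else now
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{prefix.rstrip('/')}/export_{stamp}.csv"


print(build_export_uri())
```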

In [None]:
feature_group_version.export_to_file_connector(external_export_uri, export_file_format='CSV')

### Deploy feature group for online featurization

In [None]:
deployment_token = client.create_deployment_token(project_id=project.project_id).deployment_token
deployment = client.create_deployment(feature_group_id=feature_group.feature_group_id, project_id=project.project_id) 
deployment.wait_for_deployment()

In [None]:
client.lookup_features(deployment_id=deployment.deployment_id, deployment_token=deployment_token, query_data={'user_id': '1'})
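
The lookup above fetches features for a single user. A minimal sketch of repeating it for several users follows; the helper `lookup_users` is hypothetical and only wraps the `lookup_features` call with the same keyword arguments used in this notebook:

```python
def lookup_users(client, deployment_id, deployment_token, user_ids):
    """Call lookup_features once per user ID and collect the results by ID."""
    results = {}
    for user_id in user_ids:
        results[user_id] = client.lookup_features(
            deployment_id=deployment_id,
            deployment_token=deployment_token,
            query_data={"user_id": user_id},
        )
    return results


# Usage inside this notebook (requires the deployment created above):
# features = lookup_users(client, deployment.deployment_id, deployment_token, ['1', '2'])
```

Passing the client in as an argument keeps the helper independent of how the client was created.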

## Cleanup Script

In [None]:
#@title Abacus.AI API Key

api_key = ''  #@param {type: "string"}

!pip install abacusai
from abacusai import ApiClient, ApiException
client = ApiClient(api_key)

delete_project_names = ['Demo Movie Feature Store Project', 'Demo Feature Store Streaming Project', 'Demo Feature Store Point In Time Streaming Project', 'Demo Python Model']
for project in client.list_projects():
  if project.name in delete_project_names:
    # Stop every deployment before deleting the project
    for deployment in project.list_deployments():
      deployment.stop()
    project.delete()

fgs_to_delete = ['interactions_joined_items', 'python_interactions_joined_items', 'combined_user_movie_ratings', 'demo_views_interactions_joined_items', 'demo_add_to_cart_interactions_joined_items', 'demo_transactions', 'demo_interactions_joined_items', 'concrete_by_flyash']
for fg_name in fgs_to_delete:
  try:
    fg = client.describe_feature_group_by_table_name(fg_name)
    fg.delete()
  except ApiException:
    pass