### SageMaker Feature Store Notebook showing use of Point-in-Time Queries

This notebook is part of an AWS blog, written in collaboration with GoDaddy.com, that shows how to use "Point-in-Time" queries by leveraging SageMaker Feature Store. The blog walks through a fraud detection use case where we ingest transaction data. 

This notebook (#2) creates two feature groups, one for consumer transaction data and one for credit card transaction data. Both are created from schema files stored in the `schema` sub-directory. The schemas align with the previous notebook, `1_generate_creditcard_transactions.ipynb`, which generated the raw transaction data for the fraud detection use case.
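
To make the idea of a "Point-in-Time" query concrete, here is a minimal pure-Python sketch. It is not how Feature Store implements such queries; it only illustrates the semantics: given a cutoff timestamp, return the most recent feature value recorded at or before that time, so no later data leaks in. The function name `point_in_time_lookup` and the sample history are hypothetical.

```python
from bisect import bisect_right


def point_in_time_lookup(history, cutoff):
    """Return the latest value whose event time is <= cutoff, or None.

    history: list of (event_time, value) tuples, sorted by event_time.
    """
    times = [event_time for event_time, _ in history]
    # Index of the first record strictly after the cutoff
    idx = bisect_right(times, cutoff)
    return history[idx - 1][1] if idx > 0 else None


# Hypothetical history for one credit card: (event_time, transaction count)
card_history = [(100, 1), (200, 2), (300, 5)]

print(point_in_time_lookup(card_history, 250))  # value as of time 250
print(point_in_time_lookup(card_history, 50))   # no record exists yet
```

The lookup ignores the record at time 300 when asked about time 250, which is the property that matters when building training sets for fraud detection.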

In [None]:
from sagemaker import get_execution_role
import sagemaker
import boto3
import time
import json
import sys

role = get_execution_role()
sm_client = boto3.Session().client(service_name='sagemaker')
smfs_runtime = boto3.Session().client(service_name='sagemaker-featurestore-runtime')

#### Start by Deleting Feature Groups that we will re-create

In [None]:
# Use SageMaker default bucket
BUCKET = sagemaker.Session().default_bucket()
BASE_PREFIX = "sagemaker-featurestore-blog"

# Note that FeatureStore will append this pattern to base prefix -> "{account_id}/sagemaker/{region}/offline-store/"
OFFLINE_STORE_BASE_URI = f's3://{BUCKET}/{BASE_PREFIX}'

print(OFFLINE_STORE_BASE_URI)

CONS_FEATURE_GROUP = "consumer-fg"
CARD_FEATURE_GROUP = "credit-card-fg"

In [None]:
# Delete feature groups (in case the names already exist)

for fg_name in [CARD_FEATURE_GROUP, CONS_FEATURE_GROUP]:
    try:
        sm_client.delete_feature_group(FeatureGroupName=fg_name)
        print('deleted feature group: ' + fg_name)
    except Exception as e:
        # Usually the feature group does not exist yet; print the error
        # so that other failures (e.g. permissions) remain visible
        print(f'Could not delete feature group {fg_name}, it may not exist: {e}')
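
Feature group deletion is asynchronous, so re-creating a group with the same name immediately afterwards may fail while the old one is still being removed. The following is a sketch of one way to wait, assuming the client raises `ResourceNotFound` from `describe_feature_group` once the group is gone. The helper name `wait_for_feature_group_deletion` and its `poll_seconds` and `max_polls` parameters are hypothetical.

```python
import time


def wait_for_feature_group_deletion(client, feature_group_name,
                                    poll_seconds=5, max_polls=60):
    """Poll until the feature group can no longer be described.

    Returns True once the group is gone, False if it is still
    present after max_polls attempts.
    """
    for _ in range(max_polls):
        try:
            client.describe_feature_group(FeatureGroupName=feature_group_name)
        except client.exceptions.ResourceNotFound:
            return True  # the feature group no longer exists
        time.sleep(poll_seconds)
    return False


# Example usage with the notebook's client (not executed here):
# wait_for_feature_group_deletion(sm_client, CARD_FEATURE_GROUP)
# wait_for_feature_group_deletion(sm_client, CONS_FEATURE_GROUP)
```

Passing the client in as an argument keeps the helper independent of the notebook's global state.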
    

#### Recreate the Feature Groups using Schema definition files
Each feature group contains configuration parameters for Offline and Online stores. The feature group uses a schema definition file (JSON) that dictates the feature names and types. Below we display these local schema files.
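
As a rough illustration of how a schema file drives the feature definitions, the sketch below uses a made-up schema. The keys mirror the ones read by `create_feature_group_from_schema` in this notebook; the feature names, description, and tags are hypothetical and are not taken from the actual schema files.

```python
import json

# Hypothetical schema: the keys match what this notebook reads,
# the values are invented for illustration only
example_schema = json.loads("""
{
  "description": "Example feature group",
  "features": [
    {"name": "cc_num", "type": "bigint"},
    {"name": "avg_amt_last_1w", "type": "double"},
    {"name": "trans_time", "type": "string"}
  ],
  "record_identifier_feature_name": "cc_num",
  "event_time_feature_name": "trans_time",
  "tags": [{"Key": "Environment", "Value": "DEV"}]
}
""")

# Schema types map to Feature Store types; anything else becomes String
TYPE_MAP = {"double": "Fractional", "bigint": "Integral"}

feature_defs = [
    {"FeatureName": f["name"], "FeatureType": TYPE_MAP.get(f["type"], "String")}
    for f in example_schema["features"]
]
print(feature_defs)
```

The dictionary lookup with a `String` default is equivalent to an if/elif/else chain over the two numeric types.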

#### Schema files in the local `schema` folder

In [None]:
!pygmentize schema/consumer-fg-schema.json

In [None]:
!pygmentize schema/credit-card-fg-schema.json

In [None]:
def create_feature_group_from_schema(filename, fg_name, role_arn=None, s3_uri=None):
    # Load the JSON schema; the context manager closes the file handle
    with open(filename) as f:
        schema = json.load(f)
    
    feature_defs = []
    
    for col in schema['features']:
        feature = {'FeatureName': col['name']}
        if col['type'] == 'double':
            feature['FeatureType'] = 'Fractional'
        elif col['type'] == 'bigint':
            feature['FeatureType'] = 'Integral'
        else:
            feature['FeatureType'] = 'String'
        feature_defs.append(feature)

    record_identifier_name = schema['record_identifier_feature_name']
    event_time_name = schema['event_time_feature_name']

    if role_arn is None:
        role_arn = get_execution_role()

    if s3_uri is None:
        offline_config = {}
    else:
        print(f'Creating Offline Store at: {s3_uri}')
        offline_config = {'OfflineStoreConfig': {'S3StorageConfig': {'S3Uri': s3_uri}}}
        
    sm_client.create_feature_group(
        FeatureGroupName = fg_name,
        RecordIdentifierFeatureName = record_identifier_name,
        EventTimeFeatureName = event_time_name,
        FeatureDefinitions = feature_defs,
        Description = schema['description'],
        Tags = schema['tags'],
        OnlineStoreConfig = {'EnableOnlineStore': True},
        RoleArn = role_arn,
        **offline_config)

#### Create the new Feature Groups using the schema definition 
Now we create both feature groups as defined by their schema files. Because feature group creation can take a few minutes, the last cell below polls until the status is no longer `Creating`.

In [None]:
create_feature_group_from_schema('schema/consumer-fg-schema.json', CONS_FEATURE_GROUP, 
                                 role_arn=role, s3_uri=OFFLINE_STORE_BASE_URI)

In [None]:
create_feature_group_from_schema('schema/credit-card-fg-schema.json', CARD_FEATURE_GROUP, 
                                 role_arn=role, s3_uri=OFFLINE_STORE_BASE_URI)

In [None]:
# Wait for status to change to 'Created'

def wait_for_feature_group_creation_complete(feature_group_name):
    response = sm_client.describe_feature_group(FeatureGroupName=feature_group_name)
    status = response['FeatureGroupStatus']
    while status == "Creating":
        print("Waiting for Feature Group Creation")
        time.sleep(5)
        response = sm_client.describe_feature_group(FeatureGroupName=feature_group_name)
        status = response['FeatureGroupStatus']
    if status != "Created":
        raise RuntimeError(f"Failed to create feature group {feature_group_name}: status is {status}")
    print(f"FeatureGroup {feature_group_name} successfully created.")

wait_for_feature_group_creation_complete(feature_group_name=CONS_FEATURE_GROUP)
wait_for_feature_group_creation_complete(feature_group_name=CARD_FEATURE_GROUP)

#### Make sure the new Feature Groups exist

In [None]:
sm_client.list_feature_groups()
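
In an account with many feature groups, a single `list_feature_groups` call may return only the first page of results. The sketch below follows `NextToken` to collect every name so that the two new groups can be checked explicitly. The helper name `list_all_feature_group_names` is hypothetical.

```python
def list_all_feature_group_names(client):
    """Collect feature group names across all pages of list_feature_groups."""
    names = []
    kwargs = {}
    while True:
        response = client.list_feature_groups(**kwargs)
        names.extend(
            summary["FeatureGroupName"]
            for summary in response["FeatureGroupSummaries"]
        )
        token = response.get("NextToken")
        if not token:
            return names
        # Request the next page on the following iteration
        kwargs = {"NextToken": token}


# Example usage with the notebook's client (not executed here):
# existing = list_all_feature_group_names(sm_client)
# print(CONS_FEATURE_GROUP in existing, CARD_FEATURE_GROUP in existing)
```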

#### Describe configuration of feature group
Note that each feature group gets its own ARN, allowing you to manage IAM policies that control access to individual feature groups. The feature names and types are displayed, and the required record identifier and event time features are called out specifically. Notice that when we created the Feature Group above, we passed in the `s3_uri` parameter. This parameter dictates the base S3 location where the Offline Store data is written, and can be retrieved from the `describe_feature_group` output within the `OfflineStoreConfig` dictionary. 
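
The fields mentioned above can be pulled out of the `describe_feature_group` response with a small helper. This is only a sketch: the function name `summarize_feature_group` and the keys of the returned dictionary are hypothetical, while the response keys it reads are those of the `describe_feature_group` output.

```python
def summarize_feature_group(response):
    """Extract the key fields from a describe_feature_group response."""
    # OfflineStoreConfig is absent when no offline store was configured
    storage = response.get("OfflineStoreConfig", {}).get("S3StorageConfig", {})
    return {
        "arn": response["FeatureGroupArn"],
        "record_identifier": response["RecordIdentifierFeatureName"],
        "event_time": response["EventTimeFeatureName"],
        "features": {
            d["FeatureName"]: d["FeatureType"]
            for d in response["FeatureDefinitions"]
        },
        "offline_store_s3_uri": storage.get("S3Uri"),
    }


# Example usage with the notebook's client (not executed here):
# summarize_feature_group(
#     sm_client.describe_feature_group(FeatureGroupName=CONS_FEATURE_GROUP))
```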

In [None]:
sm_client.describe_feature_group(FeatureGroupName=CONS_FEATURE_GROUP)

In [None]:
sm_client.describe_feature_group(FeatureGroupName=CARD_FEATURE_GROUP)