# Update feature metadata - Add tags to features

### Recap of what is in place
1. In [notebook 0](./0_prepare_transactions_dataset.ipynb), we generated a synthetic dataset of transactions, including simulated fraud attacks.
2. In [notebook 1](./1_setup.ipynb), we created our two feature groups. In that same notebook, we also created a Kinesis data stream and a Kinesis Data Analytics SQL application that consumes the transaction stream and produces aggregate features. These features are provided to Lambda in near real time, and they look back over a 10-minute window.
3. In [notebook 2](./2_batch_ingestion-chime.ipynb), we used a SageMaker Processing Job to create aggregated features and used them to feed both the training dataset and an online feature group. We also used a Glue interactive session to ingest transaction data into the offline feature store.
4. In [notebook 3](./3_train_and_deploy_model-chime.ipynb), we used the offline feature store to train an XGBoost model for fraud detection, then deployed it.
5. In [notebook 4](./4_streaming_predictions-chime.ipynb), we sent transactions to the feature store in near real time and predicted whether each one was fraudulent.
6. In [notebook 5](./5_update_feature_group_chime.ipynb), we updated the online batch feature store and ingested the updated dataset.
7. In [notebook 6](./6_streaming_predictions-update-chime.ipynb), we updated the online feature store and streamed the new dataset to it.
8. In [notebook 7](./7_update_feature_group_version_dwflow.ipynb), we created a new version of the feature store from the old one using Data Wrangler.


In [None]:
from sagemaker.feature_store.feature_group import FeatureGroup
from time import gmtime, strftime, sleep
from random import randint
import pandas as pd
import numpy as np
import subprocess
import sagemaker
import importlib
import logging
import time
import sys
import boto3
from datetime import datetime, timezone, date

In [None]:
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
logger.addHandler(logging.StreamHandler())

In [None]:
logger.info(f'Using SageMaker version: {sagemaker.__version__}')
logger.info(f'Using Boto3 version: {boto3.__version__}')

In [None]:
sagemaker_session = sagemaker.Session()
role = sagemaker.get_execution_role()
default_bucket = 'sm-fs-demo'
logger.info(f'Default S3 bucket = {default_bucket}')
prefix = 'sagemaker-feature-store'
region = sagemaker_session.boto_region_name

boto_session = boto3.Session(region_name=region)
sagemaker_client = boto_session.client(service_name='sagemaker', region_name=region)
featurestore_runtime = boto_session.client(service_name='sagemaker-featurestore-runtime', region_name=region)

In [None]:
def generate_event_timestamp():
    # naive datetime representing local time
    naive_dt = datetime.now()
    # take timezone into account
    aware_dt = naive_dt.astimezone()
    # time in UTC
    utc_dt = aware_dt.astimezone(timezone.utc)
    # transform to ISO-8601 format
    event_time = utc_dt.isoformat(timespec='milliseconds')
    event_time = event_time.replace('+00:00', 'Z')
    return event_time
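
The timestamp helper above can also be written more compactly. The following is a minimal sketch, not the notebook's own method: `utc_event_time` and its optional `now` argument are hypothetical names introduced here so the format can be checked against a fixed instant.

```python
from datetime import datetime, timezone


def utc_event_time(now=None):
    """Return an ISO-8601 UTC timestamp with millisecond precision and a 'Z' suffix.

    `now` is a hypothetical hook for passing a fixed, timezone-aware datetime;
    when omitted, the current UTC time is used.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    # Normalize to UTC, keep milliseconds, and swap the numeric offset for 'Z'
    utc_dt = now.astimezone(timezone.utc)
    return utc_dt.isoformat(timespec='milliseconds').replace('+00:00', 'Z')


fixed = datetime(2024, 1, 2, 3, 4, 5, 678000, tzinfo=timezone.utc)
print(utc_event_time(fixed))  # 2024-01-02T03:04:05.678Z
```

Starting from `datetime.now(timezone.utc)` skips the naive-to-aware conversion while producing the same string format as `generate_event_timestamp()`.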

In [None]:
feature_group_name = 'cc-agg-fg'

In [None]:
print('describe fg', generate_event_timestamp())

# Read the current metadata (description and key/value parameters) of the feature
ret = sagemaker_client.describe_feature_metadata(
    FeatureGroupName=feature_group_name,
    FeatureName="name"
)

print('update fg', generate_event_timestamp(), '\n', ret['Parameters'])

# Attach two key/value parameters (tags) to the feature
sagemaker_client.update_feature_metadata(
    FeatureGroupName=feature_group_name,
    FeatureName="name",
    ParameterAdditions=[
        {"Key": "team", "Value": "mlops"},
        {"Key": "org1", "Value": "customer fin team"},
    ]
)

print('updated fg', generate_event_timestamp())

# Read the metadata again to confirm that the new parameters are present
ret = sagemaker_client.describe_feature_metadata(
    FeatureGroupName=feature_group_name,
    FeatureName="name"
)

print('describe modified fg', generate_event_timestamp(), '\n', ret['Parameters'])
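
Besides `ParameterAdditions`, the boto3 `update_feature_metadata` call also accepts `ParameterRemovals` (a list of keys) and `Description`. The sketch below shows one way to assemble such a request from plain Python values. `build_metadata_update` is a hypothetical helper, not part of the SageMaker SDK, and the actual API call is left commented out because it needs AWS credentials and an existing feature group.

```python
def build_metadata_update(feature_group_name, feature_name,
                          add=None, remove=None, description=None):
    """Build keyword arguments for sagemaker_client.update_feature_metadata.

    Hypothetical helper: `add` is a dict of tags to attach, `remove` is an
    iterable of tag keys to delete, `description` is optional free text.
    """
    kwargs = {
        'FeatureGroupName': feature_group_name,
        'FeatureName': feature_name,
    }
    if add:
        # The API expects a list of {"Key": ..., "Value": ...} dictionaries
        kwargs['ParameterAdditions'] = [
            {'Key': key, 'Value': str(value)} for key, value in add.items()
        ]
    if remove:
        # Removals are given as a plain list of keys
        kwargs['ParameterRemovals'] = list(remove)
    if description is not None:
        kwargs['Description'] = description
    return kwargs


request = build_metadata_update(
    'cc-agg-fg', 'name', add={'team': 'mlops'}, remove=['org1']
)
print(request)
# sagemaker_client.update_feature_metadata(**request)  # requires AWS access
```

Keeping the request construction separate from the call makes it easy to inspect what will be sent before changing any metadata.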

In [None]:
# Search function that returns the features whose name, description, or parameters (key/value pairs) match the search criteria
def search_features_using_string(search_string):
    response = sagemaker_client.search(
        Resource="FeatureMetadata",
        SearchExpression={
            'Filters': [
                {
                    'Name': 'FeatureName',
                    'Operator': 'Contains',
                    'Value': search_string
                },
                {
                    'Name': 'Description',
                    'Operator': 'Contains',
                    'Value': search_string
                },
                {
                    'Name': 'AllParameters',
                    'Operator': 'Contains',
                    'Value': search_string
                }
            ],
            "Operator": "Or"
        },
    )
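    # Display the results in a DataFrame
    df = pd.json_normalize(response['Results'], max_level=1)
    # Strip the "FeatureMetadata." prefix from the column names
    df.columns = df.columns.map(lambda col: col.split(".", 1)[-1])
    # errors='ignore' avoids a KeyError when the search returns no results
    df = df.drop(columns='FeatureGroupArn', errors='ignore')
    return df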

# Search for features that contain the string "name" in the feature name, description, or parameters
search_string = "name"
search_features_using_string(search_string)
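
The search function above reads only the first page of results. The `search` API can return a `NextToken` when more results are available, so a larger catalog may need a loop. The following is a sketch under that assumption: `collect_feature_metadata` and `StubClient` are hypothetical names, and the stub stands in for `sagemaker_client` so the loop can run without AWS access.

```python
def collect_feature_metadata(client, search_expression):
    """Follow NextToken until all FeatureMetadata results are collected."""
    records, token = [], None
    while True:
        kwargs = {
            'Resource': 'FeatureMetadata',
            'SearchExpression': search_expression,
        }
        if token:
            kwargs['NextToken'] = token
        response = client.search(**kwargs)
        # Each result wraps its fields in a "FeatureMetadata" dictionary
        records.extend(r['FeatureMetadata'] for r in response.get('Results', []))
        token = response.get('NextToken')
        if not token:
            return records


class StubClient:
    """Hypothetical stand-in for sagemaker_client that serves two pages."""

    def __init__(self):
        self.pages = [
            {'Results': [{'FeatureMetadata': {'FeatureName': 'name'}}],
             'NextToken': 'page-2'},
            {'Results': [{'FeatureMetadata': {'FeatureName': 'name_length'}}]},
        ]
        self.calls = 0

    def search(self, **kwargs):
        page = self.pages[self.calls]
        self.calls += 1
        return page


records = collect_feature_metadata(StubClient(), {'Filters': []})
print([r['FeatureName'] for r in records])
```

With a real client, the returned list of dictionaries can be passed to `pd.DataFrame(records)` for display.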