# Setup

**NOTE:** Before running this notebook, be sure to set the stack name in the first code cell to match the name of the CloudFormation stack you used to create this notebook instance. If you used the default stack name, you should not need to make any updates.

This notebook performs the following setup actions for this example use of Amazon SageMaker Feature Store:

1. Create online-only feature groups
2. Create an Amazon Kinesis data stream
3. Create an Amazon Kinesis Data Analytics (KDA) application

### Get ARNs of Lambda functions from CloudFormation stack outputs
1. `PredictLambdaFunctionARN` (the `InvokeFraudEndpointLambda` function)
2. `IngestLambdaFunctionARN` (the `StreamingIngestAggFeatures` function)

In [1]:
STACK_NAME = 'sm-fs-streaming-agg-stack' # if you're not using the default stack name, replace this
%store STACK_NAME

Stored 'STACK_NAME' (str)


In [2]:
import sys
import boto3

cf_client = boto3.client('cloudformation')

try:
    outputs = cf_client.describe_stacks(StackName=STACK_NAME)['Stacks'][0]['Outputs']
    for o in outputs:
        if o['OutputKey'] == 'IngestLambdaFunctionARN':
            lambda_to_fs_arn = o['OutputValue']
        if o['OutputKey'] == 'PredictLambdaFunctionARN':
            lambda_to_model_arn = o['OutputValue']
        if o['OutputKey'] == 'PredictLambdaFunctionName':
            predict_lambda_name = o['OutputValue']

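except Exception as err:
    # describe_stacks raises if the stack does not exist; include the underlying error for context
    msg = f'CloudFormation stack {STACK_NAME} was not found ({err}). Please set the STACK_NAME properly and re-run this cell'
    sys.exit(ValueError(msg))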

In [3]:
print(f'lambda_to_model_arn: {lambda_to_model_arn}')
print(f'lambda_to_fs_arn: {lambda_to_fs_arn}')
print(f'predict_lambda_name: {predict_lambda_name}')

lambda_to_model_arn: arn:aws:lambda:us-east-1:461312420708:function:InvokeFraudEndpointLambda
lambda_to_fs_arn: arn:aws:lambda:us-east-1:461312420708:function:StreamingIngestAggFeatures
predict_lambda_name: InvokeFraudEndpointLambda
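
The lookup of stack outputs can also be written as a dictionary, which makes a missing output key fail loudly instead of leaving a variable undefined. This is a minimal sketch only: `stack_outputs_to_dict` is a hypothetical helper, and the sample ARNs use a placeholder account ID rather than values from this stack.

```python
def stack_outputs_to_dict(outputs, required_keys):
    """Index a CloudFormation 'Outputs' list by OutputKey and check required keys."""
    values = {o['OutputKey']: o['OutputValue'] for o in outputs}
    missing = [key for key in required_keys if key not in values]
    if missing:
        raise KeyError(f'Stack outputs not found: {missing}')
    return values


# Sample data shaped like describe_stacks(...)['Stacks'][0]['Outputs'] (placeholder values)
sample_outputs = [
    {'OutputKey': 'IngestLambdaFunctionARN',
     'OutputValue': 'arn:aws:lambda:us-east-1:111122223333:function:ingest'},
    {'OutputKey': 'PredictLambdaFunctionARN',
     'OutputValue': 'arn:aws:lambda:us-east-1:111122223333:function:predict'},
    {'OutputKey': 'PredictLambdaFunctionName', 'OutputValue': 'predict'},
]

stack_values = stack_outputs_to_dict(
    sample_outputs,
    ['IngestLambdaFunctionARN', 'PredictLambdaFunctionARN', 'PredictLambdaFunctionName'])
print(stack_values['PredictLambdaFunctionName'])
```

In the notebook, the `outputs` list returned by `describe_stacks` would take the place of `sample_outputs`.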


In [4]:
%store lambda_to_model_arn

Stored 'lambda_to_model_arn' (str)


In [5]:
%store predict_lambda_name

Stored 'predict_lambda_name' (str)


In [6]:
# to get the latest sagemaker python sdk
#!pip install -U sagemaker

In [7]:
from IPython.display import display_html
def restartkernel() :
    display_html("<script>Jupyter.notebook.kernel.restart()</script>",raw=True)
#restartkernel()

### Imports and other setup

In [4]:
from sagemaker import get_execution_role
import sagemaker
import boto3
import json

role = get_execution_role()
sm = boto3.Session().client(service_name='sagemaker')
smfs_runtime = boto3.Session().client(service_name='sagemaker-featurestore-runtime')

## Create online-only feature groups
When using Amazon SageMaker Feature Store, a core design decision is the definition of feature groups. For our credit card fraud detection use case, we have decided to use two of them:

1. `cc-agg-chime-fg` - holds aggregate features that will be updated in near real-time (streaming ingestion)
2. `cc-agg-batch-chime-fg` - holds aggregate features that will be updated in batch

Establishing a feature group is a one-time step and is done using the `CreateFeatureGroup` API. 

Feature groups can be created as **online-only**, **offline-only**, or both **online and offline**, which replicates updates from an online store to an offline store in Amazon S3. Since our focus in this example is on demonstrating the use of the feature store for online inference and streaming aggregation of features, we make each of our feature groups online-only.

In addition to a feature group name, we provide metadata about each feature in the group. We use a JSON file to define the schema. This is not a requirement, but it demonstrates how you might capture the feature group definitions so that you can recreate them consistently as you move from a development environment to a test or production environment. In our schema file, we also identify the record identifier and the event timestamp. All feature groups must have these two features, but you get to decide how to name them.
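
A schema file of this shape can be checked before it is used to create a feature group. The sketch below is an assumption-laden illustration: `validate_schema` is a hypothetical helper, and `example_schema` is modeled on the feature names shown later in this notebook rather than copied from the actual schema files.

```python
def validate_schema(schema):
    """Return a list of problems found in a feature group schema dict."""
    problems = []
    required = ('description', 'features',
                'record_identifier_feature_name', 'event_time_feature_name')
    for key in required:
        if key not in schema:
            problems.append(f'missing key: {key}')

    # The record identifier and event time must both name a defined feature
    names = [feature.get('name') for feature in schema.get('features', [])]
    for key in ('record_identifier_feature_name', 'event_time_feature_name'):
        if key in schema and schema[key] not in names:
            problems.append(f'{key} refers to unknown feature: {schema[key]}')
    return problems


# Illustrative schema (not the contents of the real schema files)
example_schema = {
    'description': 'Example aggregate features',
    'features': [
        {'name': 'cc_num', 'type': 'bigint'},
        {'name': 'avg_amt_last_10m', 'type': 'double'},
        {'name': 'trans_time', 'type': 'double'},
    ],
    'record_identifier_feature_name': 'cc_num',
    'event_time_feature_name': 'trans_time',
}

print(validate_schema(example_schema))
```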

Here is a visual summary of the feature groups we will create below.

<img src="./images/feature_groups.png" />

#### cc-agg-fg schema

In [8]:
!pygmentize schema/cc-agg-fg-chime-schema.json

{
    "description": "Aggregated features for each credit card, batch ingestion nightly",
    "features": [
          {
              "name": "cc_num",
              "type": "bigint",
              "description": "Credit Card Number (Unique)"
          },
          {
              "name": "num_trans_last_10m",
              "type": "bigint",

#### cc-agg-batch-fg schema

In [9]:
!pygmentize schema/cc-agg-batch-fg-chime-schema.json

{
    "description": "Aggregated features for each credit card, streamed intraday",

    "features": [
          {
              "name": "cc_num",
              "type": "bigint",
              "description": "Credit Card Number (Unique)"
          },
          {
              "name": "num_trans_last_1w",
              "type": "bigint"

In [47]:
%%writefile schema/cc-train-fg-chime-schema.json
{
    "description": "Feature with credit card at 10m and 1week average",
    
    "features": [
          {
              "name": "tid",
              "type": "bigint",
              "description": "Transaction ID (Unique)"
          },
          {
              "name": "cc_num",
              "type": "bigint",
              "description": "Credit Card Number "
          },
          {
              "name": "amount",
              "type": "bigint",
              "description": "Transaction Amount"
          },
          {
              "name": "fraud_label",
              "type": "int",
              "description": "Is fraud"
          },
          {
              "name": "num_trans_last_10m",
              "type": "bigint",
              "description": "Aggregated Metric: Average number of transactions for the card aggregated by past 10 minutes"
          },
          {
              "name": "avg_amt_last_10m",
              "type": "double",
              "description": "Aggregated Metric: Average transaction amount for the card aggregated by past 10 minutes"
          },
          {
              "name": "num_trans_last_1w",
              "type": "bigint",
              "description": "Aggregated Metric: Average number of transactions for the card aggregated by past 1 week"
          },
          {
              "name": "avg_amt_last_1w",
              "type": "double",
              "description": "Aggregated Metric: Average transaction amount for the card aggregated by past 1 week"
          },
          {
              "name": "amt_ratio1",
              "type": "double",
              "description": "avg_amt_last_10m by avg_amt_last_1w"
          },
          {
              "name": "amt_ratio2",
              "type": "double",
              "description": "count by avg_amt_last_1w"
          },
          {
              "name": "count_ratio",
              "type": "double",
              "description": "num_trans_last_10m by num_trans_last_1w"
          },
          {
              "name": "datetime",
              "type": "double",
              "description": "Required feature for event timestamp"
          }
      ],
    
      "record_identifier_feature_name": "tid",
      "event_time_feature_name": "datetime",
      "tags": [{"Key": "Environment", "Value" : "DEV"}, 
               {"Key": "IngestionType", "Value": "Batch"},
               {"Key": "CostCenter", "Value": "C18"}]
}

Writing schema/cc-train-fg-chime-schema.json


#### Utility functions to simplify creation of feature groups
`schema_to_defs` takes our schema file and returns feature definitions, and the names of the record identifier and event timestamp feature.

In [3]:
def schema_to_defs(filename):
    # Read the schema file and close it when done
    with open(filename) as f:
        schema = json.load(f)

    feature_definitions = []

    # The schema files use a lowercase 'features' key
    for col in schema['features']:
        feature = {'FeatureName': col['name']}
        if col['type'] == 'double':
            feature['FeatureType'] = 'Fractional'
        elif col['type'] == 'bigint':
            feature['FeatureType'] = 'Integral'
        else:
            feature['FeatureType'] = 'String'
        feature_definitions.append(feature)

    return feature_definitions, schema['record_identifier_feature_name'], schema['event_time_feature_name']

`create_feature_group_from_schema` creates a feature group from a schema file. If no S3 URI is passed, an online-only feature group is created.

In [6]:
def create_feature_group_from_schema(filename, fg_name, role_arn=None, s3_uri=None):
    schema = json.loads(open(filename).read())
    
    feature_defs = []
    
    for col in schema['features']:
        feature = {'FeatureName': col['name']}
        if col['type'] == 'double':
            feature['FeatureType'] = 'Fractional'
        elif col['type'] == 'bigint':
            feature['FeatureType'] = 'Integral'
        else:
            feature['FeatureType'] = 'String'
        feature_defs.append(feature)

    record_identifier_name = schema['record_identifier_feature_name']
    event_time_name = schema['event_time_feature_name']

    if role_arn is None:
        role_arn = get_execution_role()

    if s3_uri is None:
        offline_config = {}
    else:
        offline_config = {'OfflineStoreConfig': {'S3StorageConfig': {'S3Uri': s3_uri}}}
        
    sm.create_feature_group(
        FeatureGroupName = fg_name,
        RecordIdentifierFeatureName = record_identifier_name,
        EventTimeFeatureName = event_time_name,
        FeatureDefinitions = feature_defs,
        Description = schema['description'],
        Tags = schema['tags'],
        OnlineStoreConfig = {'EnableOnlineStore': True},
        RoleArn = role_arn,
        **offline_config)
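
The request assembled by the function above can be sketched as a pure function, which makes the online-only versus online-plus-offline choice easy to inspect without calling AWS. This is an illustration under stated assumptions: `build_create_feature_group_kwargs` and `TYPE_MAP` are hypothetical names, the role ARN is a placeholder, and mapping `int` to `Integral` is an assumption (the helper above maps only `bigint` to `Integral`).

```python
# Assumption: 'int' is treated like 'bigint'; unknown types fall back to 'String'
TYPE_MAP = {'double': 'Fractional', 'bigint': 'Integral', 'int': 'Integral'}


def build_create_feature_group_kwargs(schema, fg_name, role_arn, s3_uri=None):
    """Build keyword arguments for the CreateFeatureGroup API from a schema dict."""
    feature_defs = [
        {'FeatureName': col['name'],
         'FeatureType': TYPE_MAP.get(col['type'], 'String')}
        for col in schema['features']
    ]
    kwargs = {
        'FeatureGroupName': fg_name,
        'RecordIdentifierFeatureName': schema['record_identifier_feature_name'],
        'EventTimeFeatureName': schema['event_time_feature_name'],
        'FeatureDefinitions': feature_defs,
        'Description': schema['description'],
        'OnlineStoreConfig': {'EnableOnlineStore': True},
        'RoleArn': role_arn,
    }
    if 'tags' in schema:
        kwargs['Tags'] = schema['tags']
    # Only add an offline store when an S3 URI is supplied
    if s3_uri is not None:
        kwargs['OfflineStoreConfig'] = {'S3StorageConfig': {'S3Uri': s3_uri}}
    return kwargs


# Illustrative schema and placeholder role ARN
demo_schema = {
    'description': 'Example aggregate features',
    'features': [{'name': 'cc_num', 'type': 'bigint'},
                 {'name': 'avg_amt_last_10m', 'type': 'double'},
                 {'name': 'trans_time', 'type': 'double'}],
    'record_identifier_feature_name': 'cc_num',
    'event_time_feature_name': 'trans_time',
}
demo_role = 'arn:aws:iam::111122223333:role/example-role'

online_only = build_create_feature_group_kwargs(demo_schema, 'demo-fg', demo_role)
with_offline = build_create_feature_group_kwargs(
    demo_schema, 'demo-fg', demo_role, s3_uri='s3://example-bucket/prefix')
```

The resulting dictionary could be passed as `sm.create_feature_group(**online_only)`.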

In [2]:
prefix = 'chime-fs'
default_bucket = 'chime-fs-demo'

#### Create the feature groups

In [45]:
create_feature_group_from_schema('schema/cc-agg-fg-chime-schema.json', 'cc-agg-chime-fg',s3_uri=f's3://{default_bucket}/{prefix}')

In [46]:
create_feature_group_from_schema('schema/cc-agg-batch-fg-chime-schema.json', 'cc-agg-batch-chime-fg',s3_uri=f's3://{default_bucket}/{prefix}')

In [5]:
create_feature_group_from_schema('schema/cc-train-fg-chime-schema.json', 'cc-train-chime-fg',s3_uri=f's3://{default_bucket}/{prefix}')
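
Feature group creation is asynchronous, so it can help to wait for the status to reach `Created` before using a group. The following is a minimal sketch, not part of the original notebook: `wait_for_feature_group` is a hypothetical helper that takes the describe call as a parameter, and `fake_describe` stands in for `sm.describe_feature_group` so the example runs without AWS access.

```python
import time


def wait_for_feature_group(describe, fg_name, poll_seconds=5, max_polls=60,
                           sleep=time.sleep):
    """Poll a describe callable until the feature group status is 'Created'."""
    for _ in range(max_polls):
        response = describe(FeatureGroupName=fg_name)
        status = response['FeatureGroupStatus']
        if status == 'Created':
            return response
        if status != 'Creating':
            reason = response.get('FailureReason', 'no reason given')
            raise RuntimeError(f'{fg_name} entered status {status}: {reason}')
        sleep(poll_seconds)
    raise TimeoutError(f'{fg_name} was not created after {max_polls} polls')


# Stand-in for sm.describe_feature_group (returns canned statuses)
_statuses = iter(['Creating', 'Creating', 'Created'])


def fake_describe(FeatureGroupName):
    return {'FeatureGroupName': FeatureGroupName,
            'FeatureGroupStatus': next(_statuses)}


result = wait_for_feature_group(fake_describe, 'cc-agg-chime-fg',
                                sleep=lambda seconds: None)
print(result['FeatureGroupStatus'])
```

With a real client, the call would look like `wait_for_feature_group(sm.describe_feature_group, 'cc-agg-chime-fg')`.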

#### Show that the feature store is aware of the new feature groups

In [21]:
from utilities.feature_store_helper import FeatureStore

fs = FeatureStore()
fs.list_feature_groups('chime')

[{'FeatureGroupName': 'cc-agg-chime-fg',
  'FeatureGroupArn': 'arn:aws:sagemaker:us-east-1:461312420708:feature-group/cc-agg-chime-fg',
  'CreationTime': datetime.datetime(2023, 4, 2, 22, 8, 57, 659000, tzinfo=tzlocal()),
  'FeatureGroupStatus': 'Created'},
 {'FeatureGroupName': 'cc-agg-batch-chime-fg',
  'FeatureGroupArn': 'arn:aws:sagemaker:us-east-1:461312420708:feature-group/cc-agg-batch-chime-fg',
  'CreationTime': datetime.datetime(2023, 4, 2, 22, 8, 59, 327000, tzinfo=tzlocal()),
  'FeatureGroupStatus': 'Created'}]

#### Describe each feature group
Note that each feature group gets its own ARN, allowing you to manage IAM policies that control access to individual feature groups. The feature names and types are displayed, and the record identifier and event time features are called out specifically. Notice that there is only an `OnlineStoreConfig` and no `OfflineStoreConfig`, as we have decided not to replicate features offline for these groups.

In [15]:
sm.describe_feature_group(FeatureGroupName='cc-agg-chime-fg')

{'FeatureGroupArn': 'arn:aws:sagemaker:us-east-1:461312420708:feature-group/cc-agg-chime-fg',
 'FeatureGroupName': 'cc-agg-chime-fg',
 'RecordIdentifierFeatureName': 'cc_num',
 'EventTimeFeatureName': 'trans_time',
 'FeatureDefinitions': [{'FeatureName': 'cc_num', 'FeatureType': 'Integral'},
  {'FeatureName': 'num_trans_last_10m', 'FeatureType': 'Integral'},
  {'FeatureName': 'avg_amt_last_10m', 'FeatureType': 'Fractional'},
  {'FeatureName': 'trans_time', 'FeatureType': 'Fractional'}],
 'CreationTime': datetime.datetime(2023, 4, 2, 22, 8, 57, 659000, tzinfo=tzlocal()),
 'OnlineStoreConfig': {'EnableOnlineStore': True},
 'RoleArn': 'arn:aws:iam::461312420708:role/sm-fs-streaming-agg-stack-SageMakerRole-WU81JV183YQ2',
 'FeatureGroupStatus': 'Created',
 'Description': 'Aggregated features for each credit card, batch ingestion nightly',
 'OnlineStoreTotalSizeBytes': 0,
 'ResponseMetadata': {'RequestId': 'eba72ab7-7931-42cf-ba42-5343e0cab3d1',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-

In [16]:
sm.describe_feature_group(FeatureGroupName='cc-agg-batch-chime-fg')

{'FeatureGroupArn': 'arn:aws:sagemaker:us-east-1:461312420708:feature-group/cc-agg-batch-chime-fg',
 'FeatureGroupName': 'cc-agg-batch-chime-fg',
 'RecordIdentifierFeatureName': 'cc_num',
 'EventTimeFeatureName': 'trans_time',
 'FeatureDefinitions': [{'FeatureName': 'cc_num', 'FeatureType': 'Integral'},
  {'FeatureName': 'num_trans_last_1w', 'FeatureType': 'Integral'},
  {'FeatureName': 'avg_amt_last_1w', 'FeatureType': 'Fractional'},
  {'FeatureName': 'trans_time', 'FeatureType': 'Fractional'}],
 'CreationTime': datetime.datetime(2023, 4, 2, 22, 8, 59, 327000, tzinfo=tzlocal()),
 'OnlineStoreConfig': {'EnableOnlineStore': True},
 'RoleArn': 'arn:aws:iam::461312420708:role/sm-fs-streaming-agg-stack-SageMakerRole-WU81JV183YQ2',
 'FeatureGroupStatus': 'Created',
 'Description': 'Aggregated features for each credit card, streamed intraday',
 'OnlineStoreTotalSizeBytes': 0,
 'ResponseMetadata': {'RequestId': 'ff300873-62ed-4648-a39f-1f5266f66ded',
  'HTTPStatusCode': 200,
  'HTTPHeaders': 
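
Whether a group has an online store, an offline store, or both can be read directly from the describe response. The sketch below is illustrative: `store_summary` is a hypothetical helper, and `sample_description` reproduces only a subset of the fields shown in the output above.

```python
def store_summary(description):
    """Summarize the store configuration in a describe_feature_group response."""
    online_config = description.get('OnlineStoreConfig', {})
    return {
        'name': description['FeatureGroupName'],
        'record_identifier': description['RecordIdentifierFeatureName'],
        'event_time': description['EventTimeFeatureName'],
        'online': online_config.get('EnableOnlineStore', False),
        # An offline store is configured only if this key is present
        'offline': 'OfflineStoreConfig' in description,
    }


# Subset of the describe output shown above
sample_description = {
    'FeatureGroupName': 'cc-agg-chime-fg',
    'RecordIdentifierFeatureName': 'cc_num',
    'EventTimeFeatureName': 'trans_time',
    'OnlineStoreConfig': {'EnableOnlineStore': True},
    'FeatureGroupStatus': 'Created',
}

summary = store_summary(sample_description)
print(summary)
```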

## Create an Amazon Kinesis Data Stream

In [22]:
kinesis_client = boto3.client('kinesis')

In [23]:
kinesis_client.create_stream(StreamName='cc-chime-stream', ShardCount=1)

{'ResponseMetadata': {'RequestId': 'e2e04f9c-9e2d-a4ec-be44-4fe85a225df9',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': 'e2e04f9c-9e2d-a4ec-be44-4fe85a225df9',
   'x-amz-id-2': 'sDpvwQABK5a9U7yzHq8KYESAk+ekWPOP12XLGg8V9F+rln39hzqT3VZ+uF5pymrnn2fb7BGHl8G6/2/yh9Z2tmrWBTR2aau4SrudwIqKXLE=',
   'date': 'Sun, 02 Apr 2023 22:14:12 GMT',
   'content-type': 'application/x-amz-json-1.1',
   'content-length': '0'},
  'RetryAttempts': 0}}

In [24]:
kinesis_client.list_streams()

{'StreamNames': ['cc-chime-stream', 'cc-stream'],
 'HasMoreStreams': False,
 'StreamSummaries': [{'StreamName': 'cc-chime-stream',
   'StreamARN': 'arn:aws:kinesis:us-east-1:461312420708:stream/cc-chime-stream',
   'StreamStatus': 'CREATING',
   'StreamModeDetails': {'StreamMode': 'PROVISIONED'},
   'StreamCreationTimestamp': datetime.datetime(2023, 4, 2, 22, 14, 11, tzinfo=tzlocal())},
  {'StreamName': 'cc-stream',
   'StreamARN': 'arn:aws:kinesis:us-east-1:461312420708:stream/cc-stream',
   'StreamStatus': 'ACTIVE',
   'StreamModeDetails': {'StreamMode': 'PROVISIONED'},
   'StreamCreationTimestamp': datetime.datetime(2023, 3, 24, 16, 57, 54, tzinfo=tzlocal())}],
 'ResponseMetadata': {'RequestId': 'e1b650a3-02b9-a2c1-bd12-50d5c6b65bd4',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': 'e1b650a3-02b9-a2c1-bd12-50d5c6b65bd4',
   'x-amz-id-2': 'gZOnCidn2xUk11am7tIp97HT5Yr9R0Sh8NjBo318PWM5q+TQfr3RsX5gWJgp+Kq7m3isZDJkYKaozzaMVYjajSmqJridqTCu4x9qFEg2NT4=',
   'date': 'Sun, 02 

In [25]:
kinesis_client.describe_stream(StreamName='cc-chime-stream')

{'StreamDescription': {'StreamName': 'cc-chime-stream',
  'StreamARN': 'arn:aws:kinesis:us-east-1:461312420708:stream/cc-chime-stream',
  'StreamStatus': 'ACTIVE',
  'StreamModeDetails': {'StreamMode': 'PROVISIONED'},
  'Shards': [{'ShardId': 'shardId-000000000000',
    'HashKeyRange': {'StartingHashKey': '0',
     'EndingHashKey': '340282366920938463463374607431768211455'},
    'SequenceNumberRange': {'StartingSequenceNumber': '49639470341898735315633020849736896402179764604945563650'}}],
  'HasMoreShards': False,
  'RetentionPeriodHours': 24,
  'StreamCreationTimestamp': datetime.datetime(2023, 4, 2, 22, 14, 11, tzinfo=tzlocal()),
  'EnhancedMonitoring': [{'ShardLevelMetrics': []}],
  'EncryptionType': 'NONE'},
 'ResponseMetadata': {'RequestId': 'c7427708-f360-f08b-9be6-777502b5183a',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': 'c7427708-f360-f08b-9be6-777502b5183a',
   'x-amz-id-2': '8UrmIGndYSAYvBvBow/J3jcAXNkERYABpDWvIcrZWzNC5Z3josbM0eq8kxDxltTMQ5Y3hduqHswLG+/kH

In [26]:
import time

active_stream = False
while not active_stream:
    status = kinesis_client.describe_stream(StreamName='cc-chime-stream')['StreamDescription']['StreamStatus']
    if status == 'CREATING':
        print('Waiting for the Kinesis stream to become active...')
        time.sleep(20)
    elif status == 'ACTIVE':
        active_stream = True
        print('ACTIVE')
    else:
        # Any other status (for example DELETING) would otherwise loop forever without sleeping
        raise RuntimeError(f'Unexpected stream status: {status}')

ACTIVE
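
As an alternative to a hand-written polling loop, boto3 provides a `stream_exists` waiter for Kinesis that polls `DescribeStream` until the stream is active. The sketch below wraps it in a hypothetical helper, `wait_until_stream_active`; the stub classes exist only so the example runs without AWS access, and the delay and attempt values are arbitrary assumptions.

```python
def wait_until_stream_active(kinesis_client, stream_name, delay=10, max_attempts=18):
    """Block until the stream is ACTIVE using the client's 'stream_exists' waiter."""
    waiter = kinesis_client.get_waiter('stream_exists')
    waiter.wait(StreamName=stream_name,
                WaiterConfig={'Delay': delay, 'MaxAttempts': max_attempts})


class _RecordingWaiter:
    """Stand-in waiter that records how it was called."""
    def __init__(self):
        self.calls = []

    def wait(self, **kwargs):
        self.calls.append(kwargs)


class _StubKinesisClient:
    """Stand-in for boto3.client('kinesis'), used only for this illustration."""
    def __init__(self):
        self.waiter = _RecordingWaiter()
        self.requested_waiter = None

    def get_waiter(self, name):
        self.requested_waiter = name
        return self.waiter


stub_client = _StubKinesisClient()
wait_until_stream_active(stub_client, 'cc-chime-stream')
print(stub_client.waiter.calls)
```

With the real client, `wait_until_stream_active(kinesis_client, 'cc-chime-stream')` would replace the loop.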


In [27]:
stream_arn = kinesis_client.describe_stream(StreamName='cc-chime-stream')['StreamDescription']['StreamARN']

In [28]:
stream_arn

'arn:aws:kinesis:us-east-1:461312420708:stream/cc-chime-stream'

## Map the Kinesis stream as an event source for Lambda fraud detection

In [29]:
lambda_client = boto3.client('lambda')

lambda_client.create_event_source_mapping(EventSourceArn=stream_arn,
                                          FunctionName=lambda_to_model_arn,
                                          StartingPosition='LATEST',
                                          Enabled=True,
                                          MaximumRecordAgeInSeconds=60*10
                                          ) #DestinationConfig would handle discarded records

{'ResponseMetadata': {'RequestId': 'a4367e08-a2ab-496d-8746-10053a9fe55d',
  'HTTPStatusCode': 202,
  'HTTPHeaders': {'date': 'Sun, 02 Apr 2023 22:14:59 GMT',
   'content-type': 'application/json',
   'content-length': '964',
   'connection': 'keep-alive',
   'x-amzn-requestid': 'a4367e08-a2ab-496d-8746-10053a9fe55d'},
  'RetryAttempts': 0},
 'UUID': '603b4506-b295-4c10-bc8c-d5a1cb96b8d8',
 'StartingPosition': 'LATEST',
 'BatchSize': 100,
 'MaximumBatchingWindowInSeconds': 0,
 'ParallelizationFactor': 1,
 'EventSourceArn': 'arn:aws:kinesis:us-east-1:461312420708:stream/cc-chime-stream',
 'FunctionArn': 'arn:aws:lambda:us-east-1:461312420708:function:InvokeFraudEndpointLambda',
 'LastModified': datetime.datetime(2023, 4, 2, 22, 14, 59, 77000, tzinfo=tzlocal()),
 'LastProcessingResult': 'No records processed',
 'State': 'Creating',
 'StateTransitionReason': 'User action',
 'DestinationConfig': {'OnFailure': {}},
 'MaximumRecordAgeInSeconds': 600,
 'BisectBatchOnFunctionError': False,
 'M
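
The comment in the cell above mentions that a `DestinationConfig` would handle discarded records. A possible way to assemble the mapping arguments with an on-failure destination is sketched below. `build_event_source_mapping_kwargs` is a hypothetical helper, and the ARNs are placeholders, not resources created by this notebook.

```python
def build_event_source_mapping_kwargs(stream_arn, function_name,
                                      on_failure_arn=None,
                                      max_record_age_seconds=600):
    """Build arguments for lambda_client.create_event_source_mapping."""
    kwargs = {
        'EventSourceArn': stream_arn,
        'FunctionName': function_name,
        'StartingPosition': 'LATEST',
        'Enabled': True,
        'MaximumRecordAgeInSeconds': max_record_age_seconds,
    }
    # Send details of discarded batches to an SQS queue or SNS topic, if given
    if on_failure_arn is not None:
        kwargs['DestinationConfig'] = {'OnFailure': {'Destination': on_failure_arn}}
    return kwargs


# Placeholder ARNs for illustration only
mapping_kwargs = build_event_source_mapping_kwargs(
    'arn:aws:kinesis:us-east-1:111122223333:stream/example-stream',
    'arn:aws:lambda:us-east-1:111122223333:function:example-function',
    on_failure_arn='arn:aws:sqs:us-east-1:111122223333:example-discarded-records')
print(mapping_kwargs['DestinationConfig'])
```

The dictionary could then be passed as `lambda_client.create_event_source_mapping(**mapping_kwargs)`.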

## Create an Amazon Kinesis Data Analytics (KDA) application

In [30]:
kda_client = boto3.client('kinesisanalytics')

In [31]:
sql_code = 'CREATE OR REPLACE STREAM "DESTINATION_SQL_STREAM" (\n' + \
                '"cc_num"              BIGINT,\n' + \
                '"num_trans_last_10m"  SMALLINT,\n' + \
                '"avg_amt_last_10m"    REAL\n);\n\n' + \
            'CREATE OR REPLACE PUMP "STREAM_PUMP" AS\n' + \
            'INSERT INTO "DESTINATION_SQL_STREAM"\n' + \
                'SELECT STREAM "cc_num", \n' + \
                    'COUNT(*) OVER LAST_10_MINUTES, \n' + \
                    'AVG("amount") OVER LAST_10_MINUTES\n' + \
                    'FROM "SOURCE_SQL_STREAM_001"\n' + \
                    'WINDOW LAST_10_MINUTES AS (\n' + \
                        'PARTITION BY "cc_num"\n' + \
                        'RANGE INTERVAL \'10\' MINUTE PRECEDING);\n'
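
To make the intent of the SQL above concrete, the same per-card sliding-window aggregation can be simulated in plain Python. This is only an approximation of what the KDA application computes: `sliding_window_aggregates` is a hypothetical function, timestamps are supplied explicitly, and the handling of records exactly on the window boundary is an assumption.

```python
from collections import defaultdict, deque


def sliding_window_aggregates(records, window_seconds=600):
    """For each record, return (cc_num, count, average amount) over the
    preceding window for the same card, including the current record."""
    windows = defaultdict(deque)  # cc_num -> deque of (timestamp, amount)
    results = []
    for timestamp, cc_num, amount in records:  # assumed sorted by timestamp
        window = windows[cc_num]
        window.append((timestamp, amount))
        # Drop entries that fell out of the window (boundary handling is an assumption)
        while window[0][0] < timestamp - window_seconds:
            window.popleft()
        count = len(window)
        average = sum(a for _, a in window) / count
        results.append((cc_num, count, average))
    return results


# Illustrative transactions: (seconds since start, card number, amount)
transactions = [
    (0, 1111, 10.0),
    (60, 1111, 30.0),
    (120, 2222, 5.0),
    (700, 1111, 50.0),
]
aggregates = sliding_window_aggregates(transactions)
for row in aggregates:
    print(row)
```

Each output row corresponds to one row of `DESTINATION_SQL_STREAM`: the card number, the number of transactions in the last 10 minutes, and their average amount.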

In [32]:
kda_inputs = [{
                'NamePrefix': 'SOURCE_SQL_STREAM',
                'KinesisStreamsInput': {
                       'ResourceARN': stream_arn,
                       'RoleARN': role
                },
                'InputSchema': {
                      'RecordFormat': {
                          'RecordFormatType': 'JSON',
                          'MappingParameters': {
                              'JSONMappingParameters': {
                                  'RecordRowPath': '$'
                              }
                          },
                      },
                      'RecordEncoding': 'UTF-8',
                      'RecordColumns': [
                          {'Name': 'cc_num',  'Mapping': '$.cc_num',   'SqlType': 'DECIMAL(1,1)'},
                          {'Name': 'merchant','Mapping': '$.merchant', 'SqlType': 'VARCHAR(64)'},
                          {'Name': 'amount', 'Mapping': '$.amount', 'SqlType': 'REAL'},
                          {'Name': 'zip_code', 'Mapping': '$.zip_code', 'SqlType': 'INTEGER'}
                      ]
                }
              }                         
             ]
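
Because every column in the input schema maps a top-level JSON field of the same name, the `RecordColumns` list can be generated from a compact specification. The helper below, `record_columns`, is a hypothetical convenience, and it assumes each mapping is simply `$.<name>`, as in the cell above.

```python
def record_columns(columns):
    """Build a KDA 'RecordColumns' list from (name, SQL type) pairs.

    Assumes each column maps the top-level JSON field with the same name.
    """
    return [{'Name': name, 'Mapping': f'$.{name}', 'SqlType': sql_type}
            for name, sql_type in columns]


# Same columns and SQL types as in the input definition above
columns = record_columns([
    ('cc_num', 'DECIMAL(1,1)'),
    ('merchant', 'VARCHAR(64)'),
    ('amount', 'REAL'),
    ('zip_code', 'INTEGER'),
])
print(columns[0])
```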

### Create the Kinesis Data Analytics application

Using the Lambda ARNs looked up from the CloudFormation stack outputs earlier, we create a Kinesis Data Analytics application whose output is connected to the streaming Lambda function. This function ingests the aggregated records and writes them to the SageMaker feature group.

In [33]:
kda_outputs = [{'LambdaOutput': {'ResourceARN': lambda_to_fs_arn, 'RoleARN': role},
                'Name': 'DESTINATION_SQL_STREAM',
                'DestinationSchema': {'RecordFormatType': 'JSON'}}]

In [34]:
kda_outputs

[{'LambdaOutput': {'ResourceARN': 'arn:aws:lambda:us-east-1:461312420708:function:StreamingIngestAggFeatures',
   'RoleARN': 'arn:aws:iam::461312420708:role/sm-fs-streaming-agg-stack-SageMakerRole-WU81JV183YQ2'},
  'Name': 'DESTINATION_SQL_STREAM',
  'DestinationSchema': {'RecordFormatType': 'JSON'}}]
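
The Lambda function's code is not shown in this notebook, but the Feature Store `PutRecord` API expects each record as a list of `FeatureName` and `ValueAsString` pairs. The sketch below shows one way an aggregated row could be converted into that shape. `to_feature_record` is a hypothetical helper, and the way the event time is supplied is an assumption.

```python
def to_feature_record(row, event_time_name, event_time):
    """Convert an aggregated row (dict) into the Record list used by PutRecord."""
    record = [{'FeatureName': name, 'ValueAsString': str(value)}
              for name, value in row.items()]
    # Every record must carry the event time feature of the feature group
    record.append({'FeatureName': event_time_name,
                   'ValueAsString': str(event_time)})
    return record


# Illustrative row shaped like the DESTINATION_SQL_STREAM columns
aggregated_row = {'cc_num': 1234, 'num_trans_last_10m': 3, 'avg_amt_last_10m': 12.5}
feature_record = to_feature_record(aggregated_row, 'trans_time', 1680473700.0)
print(feature_record)
```

A record in this form could be written with `smfs_runtime.put_record(FeatureGroupName='cc-agg-chime-fg', Record=feature_record)`.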

In [35]:
kda_client.create_application(ApplicationName='cc-agg-chime-app', 
                              Inputs=kda_inputs,
                              Outputs=kda_outputs,
                              ApplicationCode=sql_code)

{'ApplicationSummary': {'ApplicationName': 'cc-agg-chime-app',
  'ApplicationARN': 'arn:aws:kinesisanalytics:us-east-1:461312420708:application/cc-agg-chime-app',
  'ApplicationStatus': 'READY'},
 'ResponseMetadata': {'RequestId': '7d221bf4-9c80-4c82-a491-e813de943266',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '7d221bf4-9c80-4c82-a491-e813de943266',
   'content-type': 'application/x-amz-json-1.1',
   'content-length': '185',
   'date': 'Sun, 02 Apr 2023 22:15:24 GMT'},
  'RetryAttempts': 0}}

In [36]:
kda_client.describe_application(ApplicationName='cc-agg-chime-app')

{'ApplicationDetail': {'ApplicationName': 'cc-agg-chime-app',
  'ApplicationARN': 'arn:aws:kinesisanalytics:us-east-1:461312420708:application/cc-agg-chime-app',
  'ApplicationStatus': 'READY',
  'CreateTimestamp': datetime.datetime(2023, 4, 2, 22, 15, 24, tzinfo=tzlocal()),
  'LastUpdateTimestamp': datetime.datetime(2023, 4, 2, 22, 15, 24, tzinfo=tzlocal()),
  'InputDescriptions': [{'InputId': '1.1',
    'NamePrefix': 'SOURCE_SQL_STREAM',
    'InAppStreamNames': ['SOURCE_SQL_STREAM_001'],
    'KinesisStreamsInputDescription': {'ResourceARN': 'arn:aws:kinesis:us-east-1:461312420708:stream/cc-chime-stream',
     'RoleARN': 'arn:aws:iam::461312420708:role/sm-fs-streaming-agg-stack-SageMakerRole-WU81JV183YQ2'},
    'InputSchema': {'RecordFormat': {'RecordFormatType': 'JSON',
      'MappingParameters': {'JSONMappingParameters': {'RecordRowPath': '$'}}},
     'RecordEncoding': 'UTF-8',
     'RecordColumns': [{'Name': 'cc_num',
       'Mapping': '$.cc_num',
       'SqlType': 'DECIMAL(1,1)'},

In [37]:
kda_client.start_application(ApplicationName='cc-agg-chime-app',
                             InputConfigurations=[{'Id': '1.1',
                                                   'InputStartingPositionConfiguration': 
                                                     {'InputStartingPosition':'NOW'}}])

{'ResponseMetadata': {'RequestId': 'fbded9a8-9704-42db-9e19-4ac6b621f2c9',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': 'fbded9a8-9704-42db-9e19-4ac6b621f2c9',
   'content-type': 'application/x-amz-json-1.1',
   'content-length': '2',
   'date': 'Sun, 02 Apr 2023 22:15:32 GMT'},
  'RetryAttempts': 0}}
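
The input `Id` passed to `start_application` is hard-coded as `'1.1'` above. It could instead be read from the `describe_application` response, as sketched here. `input_configurations` is a hypothetical helper, and `sample_app_description` contains only the fields needed for the illustration.

```python
def input_configurations(app_description, starting_position='NOW'):
    """Build InputConfigurations for start_application from a describe response."""
    inputs = app_description['ApplicationDetail']['InputDescriptions']
    return [
        {'Id': app_input['InputId'],
         'InputStartingPositionConfiguration': {
             'InputStartingPosition': starting_position}}
        for app_input in inputs
    ]


# Minimal subset of a describe_application response
sample_app_description = {
    'ApplicationDetail': {
        'ApplicationName': 'cc-agg-chime-app',
        'InputDescriptions': [{'InputId': '1.1',
                               'NamePrefix': 'SOURCE_SQL_STREAM'}],
    }
}

configs = input_configurations(sample_app_description)
print(configs)
```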