# Building an Automated Pipeline for Amazon Forecast
Build a pipeline that automatically runs data import, training, forecasting, and export of predictions in Amazon Forecast whenever a file with training data is put in S3.
We use Step Functions and Lambda.

# Configure SageMaker role
Attach policies for the services you use and set up the trust relationships, as shown in the figures below.

![IAMroles_Permissions](https://user-images.githubusercontent.com/27226946/89102049-7d18e380-d440-11ea-91a6-6ab2c7e63870.png)

![IAMroles_TrustRelationships](https://user-images.githubusercontent.com/27226946/89102054-84d88800-d440-11ea-9199-0583be09aa1c.png)

Under "Edit trust relationship", the policy document should be written as follows.

In [None]:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "sagemaker.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    },
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "forecast.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    },
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "lambda.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    },
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "events.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    },
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "states.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
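The five statements above differ only in the service principal, so the document can also be generated rather than typed by hand. The following is a minimal sketch under that assumption; the helper name `build_trust_policy` and the role name in the commented call are placeholders, not part of the original notebook.

```python
import json

# Service principals taken from the trust relationship above.
SERVICES = [
    "sagemaker.amazonaws.com",
    "forecast.amazonaws.com",
    "lambda.amazonaws.com",
    "events.amazonaws.com",
    "states.amazonaws.com",
]


def build_trust_policy(services):
    """Return a trust policy that lets each listed service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": service},
                "Action": "sts:AssumeRole",
            }
            for service in services
        ],
    }


trust_policy = build_trust_policy(SERVICES)
print(json.dumps(trust_policy, indent=2))

# To apply it with boto3 (the role name is a placeholder):
# import boto3
# boto3.client("iam").update_assume_role_policy(
#     RoleName="<your-sagemaker-role-name>",
#     PolicyDocument=json.dumps(trust_policy),
# )
```

Generating the document keeps the statements uniform if another service has to be added later.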

## Getting SageMaker role

In [1]:
from sagemaker import get_execution_role

role_sm = get_execution_role()

In [2]:
role_sm

'arn:aws:iam::805433377179:role/service-role/AmazonSageMaker-ExecutionRole-20200716T084970'

# 1. Create Lambda functions
We will create the Lambda functions. The source files we will use are located in lambdas/.

In [3]:
import boto3

In [4]:
lambda_ = boto3.client('lambda')

In [5]:
!rm -f lambdas/createdatasetimport/datasetimport.zip
!cd lambdas/createdatasetimport; zip -r datasetimport .

zip_file = open("lambdas/createdatasetimport/datasetimport.zip", "rb").read()


lambda_.create_function(
    FunctionName="datasetimport",
    Runtime="python3.7",
    Role=role_sm,
    Handler="datasetimport.lambda_handler",
    Code={"ZipFile": zip_file},
    Timeout=60*15,
    MemorySize=3008
)

  adding: datasetimport.py (deflated 54%)
  adding: .ipynb_checkpoints/ (stored 0%)
  adding: .ipynb_checkpoints/datasetimport-checkpoint.py (deflated 54%)


{'ResponseMetadata': {'RequestId': 'f50b2b66-af46-4ac1-8ae1-1bbb4ba5e8cd',
  'HTTPStatusCode': 201,
  'HTTPHeaders': {'date': 'Sat, 01 Aug 2020 13:19:08 GMT',
   'content-type': 'application/json',
   'content-length': '894',
   'connection': 'keep-alive',
   'x-amzn-requestid': 'f50b2b66-af46-4ac1-8ae1-1bbb4ba5e8cd'},
  'RetryAttempts': 0},
 'FunctionName': 'datasetimport',
 'FunctionArn': 'arn:aws:lambda:us-east-1:805433377179:function:datasetimport',
 'Runtime': 'python3.7',
 'Role': 'arn:aws:iam::805433377179:role/service-role/AmazonSageMaker-ExecutionRole-20200716T084970',
 'Handler': 'datasetimport.lambda_handler',
 'CodeSize': 1330,
 'Description': '',
 'Timeout': 900,
 'MemorySize': 3008,
 'LastModified': '2020-08-01T13:19:08.367+0000',
 'CodeSha256': 'J3eD8ZAuZBSsgYctOvOKYsZ14tLjFmZYQIlUNH9/Hfg=',
 'Version': '$LATEST',
 'TracingConfig': {'Mode': 'PassThrough'},
 'RevisionId': 'b8d34d4e-e05d-4124-a7d3-c52c91d58497',
 'State': 'Active',
 'LastUpdateStatus': 'Successful'}
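The `zip -r` output above shows that the `.ipynb_checkpoints/` directory was packaged together with the handler. One way to avoid that is to build the archive in Python. This is a sketch only; `zip_directory` is a hypothetical helper that does not exist in the original notebook.

```python
import io
import os
import zipfile


def zip_directory(src_dir, skip_dirs=(".ipynb_checkpoints",)):
    """Zip the files under src_dir in memory, skipping unwanted directories."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, dirs, files in os.walk(src_dir):
            # Prune skipped directories in place so os.walk does not descend.
            dirs[:] = [d for d in dirs if d not in skip_dirs]
            for name in files:
                if name.endswith(".zip"):
                    continue  # do not package a previously built archive
                path = os.path.join(root, name)
                archive.write(path, os.path.relpath(path, src_dir))
    return buffer.getvalue()
```

The returned bytes can be passed directly as `Code={"ZipFile": zip_directory("lambdas/createdatasetimport")}`.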

In [6]:
!rm -f lambdas/GetStatusImport/getstatusimport.zip
!cd lambdas/GetStatusImport; zip -r getstatusimport .

zip_file = open("lambdas/GetStatusImport/getstatusimport.zip", "rb").read()

lambda_.create_function(
    FunctionName="getstatusimport",
    Runtime="python3.7",
    Role=role_sm,
    Handler="getstatusimport.lambda_handler",
    Code={"ZipFile": zip_file},
    Timeout=60*15,
    MemorySize=3008
)

  adding: .ipynb_checkpoints/ (stored 0%)
  adding: .ipynb_checkpoints/getstatusimport-checkpoint.py (deflated 46%)
  adding: getstatusimport.py (deflated 46%)


{'ResponseMetadata': {'RequestId': 'c596aa8f-d100-4c8e-860a-5283d9eeea9f',
  'HTTPStatusCode': 201,
  'HTTPHeaders': {'date': 'Sat, 01 Aug 2020 13:19:10 GMT',
   'content-type': 'application/json',
   'content-length': '899',
   'connection': 'keep-alive',
   'x-amzn-requestid': 'c596aa8f-d100-4c8e-860a-5283d9eeea9f'},
  'RetryAttempts': 0},
 'FunctionName': 'getstatusimport',
 'FunctionArn': 'arn:aws:lambda:us-east-1:805433377179:function:getstatusimport',
 'Runtime': 'python3.7',
 'Role': 'arn:aws:iam::805433377179:role/service-role/AmazonSageMaker-ExecutionRole-20200716T084970',
 'Handler': 'getstatusimport.lambda_handler',
 'CodeSize': 976,
 'Description': '',
 'Timeout': 900,
 'MemorySize': 3008,
 'LastModified': '2020-08-01T13:19:10.390+0000',
 'CodeSha256': 'KgnL0F8/xZlha2Zso7TAWHKESF9w8jBzSp22XPzN3Qc=',
 'Version': '$LATEST',
 'TracingConfig': {'Mode': 'PassThrough'},
 'RevisionId': '4f5a5dba-71aa-4ebd-844c-7608e3256f76',
 'State': 'Active',
 'LastUpdateStatus': 'Successful'}
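The source of `getstatusimport.py` is not shown in this notebook. The state machine in section 2 branches on `$.is_active_import`, so the handler presumably returns that flag. The sketch below shows one way such a handler could look; the event key `DatasetImportJobArn` and the factory `make_handler` are assumptions, and only `describe_dataset_import_job` with its `Status` field comes from the Amazon Forecast API.

```python
def make_handler(forecast_client):
    """Build a Lambda-style handler around an Amazon Forecast client."""

    def lambda_handler(event, context):
        # "DatasetImportJobArn" as an event key is an assumption.
        response = forecast_client.describe_dataset_import_job(
            DatasetImportJobArn=event["DatasetImportJobArn"]
        )
        # Pass the input through and add the flag the Choice state reads.
        result = dict(event)
        result["is_active_import"] = response["Status"] == "ACTIVE"
        return result

    return lambda_handler


# In a real Lambda the handler would be created once at import time:
# import boto3
# lambda_handler = make_handler(boto3.client("forecast"))
```

Passing the client in makes the handler easy to exercise locally with a stub before deploying it.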

In [7]:
!rm -f lambdas/createpredictor/predictor.zip
!cd lambdas/createpredictor; zip -r predictor .

zip_file = open("lambdas/createpredictor/predictor.zip", "rb").read()

lambda_.create_function(
    FunctionName="predictor",
    Runtime="python3.7",
    Role=role_sm,
    Handler="predictor.lambda_handler",
    Code={"ZipFile": zip_file},
    Timeout=60*15,
    MemorySize=3008
)

  adding: predictor.py (deflated 56%)


{'ResponseMetadata': {'RequestId': '5379297f-7fb3-44d8-bbbc-e52a3b56d28e',
  'HTTPStatusCode': 201,
  'HTTPHeaders': {'date': 'Sat, 01 Aug 2020 13:19:11 GMT',
   'content-type': 'application/json',
   'content-length': '881',
   'connection': 'keep-alive',
   'x-amzn-requestid': '5379297f-7fb3-44d8-bbbc-e52a3b56d28e'},
  'RetryAttempts': 0},
 'FunctionName': 'predictor',
 'FunctionArn': 'arn:aws:lambda:us-east-1:805433377179:function:predictor',
 'Runtime': 'python3.7',
 'Role': 'arn:aws:iam::805433377179:role/service-role/AmazonSageMaker-ExecutionRole-20200716T084970',
 'Handler': 'predictor.lambda_handler',
 'CodeSize': 645,
 'Description': '',
 'Timeout': 900,
 'MemorySize': 3008,
 'LastModified': '2020-08-01T13:19:11.822+0000',
 'CodeSha256': 'ozekRr7hwZFxQxduOnRc7zUB8yRzFMFFIkSxbDr80cg=',
 'Version': '$LATEST',
 'TracingConfig': {'Mode': 'PassThrough'},
 'RevisionId': 'ce0614ac-f1ba-4992-8df7-c050f58ea273',
 'State': 'Active',
 'LastUpdateStatus': 'Successful'}

In [8]:
!rm -f lambdas/GetStatusPredictor/getstatuspredictor.zip
!cd lambdas/GetStatusPredictor; zip -r getstatuspredictor .

zip_file = open("lambdas/GetStatusPredictor/getstatuspredictor.zip", "rb").read()

lambda_.create_function(
    FunctionName="getstatuspredictor",
    Runtime="python3.7",
    Role=role_sm,
    Handler="getstatuspredictor.lambda_handler",
    Code={"ZipFile": zip_file},
    Timeout=60*15,
    MemorySize=3008
)

  adding: getstatuspredictor.py (deflated 46%)


{'ResponseMetadata': {'RequestId': 'ec66c66c-a133-48ee-ad4b-f8ede27cc69b',
  'HTTPStatusCode': 201,
  'HTTPHeaders': {'date': 'Sat, 01 Aug 2020 13:19:13 GMT',
   'content-type': 'application/json',
   'content-length': '908',
   'connection': 'keep-alive',
   'x-amzn-requestid': 'ec66c66c-a133-48ee-ad4b-f8ede27cc69b'},
  'RetryAttempts': 0},
 'FunctionName': 'getstatuspredictor',
 'FunctionArn': 'arn:aws:lambda:us-east-1:805433377179:function:getstatuspredictor',
 'Runtime': 'python3.7',
 'Role': 'arn:aws:iam::805433377179:role/service-role/AmazonSageMaker-ExecutionRole-20200716T084970',
 'Handler': 'getstatuspredictor.lambda_handler',
 'CodeSize': 382,
 'Description': '',
 'Timeout': 900,
 'MemorySize': 3008,
 'LastModified': '2020-08-01T13:19:13.283+0000',
 'CodeSha256': 'U6OA69b4ByECTijrgQPYBZWZt0zq1em5fs/aX19mcJg=',
 'Version': '$LATEST',
 'TracingConfig': {'Mode': 'PassThrough'},
 'RevisionId': '36157fcc-3edb-42ec-b308-0ec016464116',
 'State': 'Active',
 'LastUpdateStatus': 'Succe

In [9]:
!rm -f lambdas/createforecast/forecast.zip
!cd lambdas/createforecast; zip -r forecast .

zip_file = open("lambdas/createforecast/forecast.zip", "rb").read()

lambda_.create_function(
    FunctionName="forecast",
    Runtime="python3.7",
    Role=role_sm,
    Handler="forecast.lambda_handler",
    Code={"ZipFile": zip_file},
    Timeout=60*15,
    MemorySize=3008
)

  adding: forecast.py (deflated 45%)


{'ResponseMetadata': {'RequestId': '8b8cc7f8-1516-4f06-9ba4-6741f1905b60',
  'HTTPStatusCode': 201,
  'HTTPHeaders': {'date': 'Sat, 01 Aug 2020 13:19:14 GMT',
   'content-type': 'application/json',
   'content-length': '878',
   'connection': 'keep-alive',
   'x-amzn-requestid': '8b8cc7f8-1516-4f06-9ba4-6741f1905b60'},
  'RetryAttempts': 0},
 'FunctionName': 'forecast',
 'FunctionArn': 'arn:aws:lambda:us-east-1:805433377179:function:forecast',
 'Runtime': 'python3.7',
 'Role': 'arn:aws:iam::805433377179:role/service-role/AmazonSageMaker-ExecutionRole-20200716T084970',
 'Handler': 'forecast.lambda_handler',
 'CodeSize': 374,
 'Description': '',
 'Timeout': 900,
 'MemorySize': 3008,
 'LastModified': '2020-08-01T13:19:14.518+0000',
 'CodeSha256': 'aozu0Z3fO0iTdTir/My4WCE9/wQ3/Tg58r7FTq5wytc=',
 'Version': '$LATEST',
 'TracingConfig': {'Mode': 'PassThrough'},
 'RevisionId': 'e2929354-384a-458c-a025-1d3d4b1e7d0f',
 'State': 'Active',
 'LastUpdateStatus': 'Successful'}

In [10]:
!rm -f lambdas/GetStatusForecast/getstatusforecast.zip
!cd lambdas/GetStatusForecast; zip -r getstatusforecast .

zip_file = open("lambdas/GetStatusForecast/getstatusforecast.zip", "rb").read()

lambda_.create_function(
    FunctionName="getstatusforecast",
    Runtime="python3.7",
    Role=role_sm,
    Handler="getstatusforecast.lambda_handler",
    Code={"ZipFile": zip_file},
    Timeout=60*15,
    MemorySize=3008
)

  adding: getstatusforecast.py (deflated 47%)


{'ResponseMetadata': {'RequestId': 'b4854f1f-1827-4e9d-a890-6b3fef891e4f',
  'HTTPStatusCode': 201,
  'HTTPHeaders': {'date': 'Sat, 01 Aug 2020 13:19:16 GMT',
   'content-type': 'application/json',
   'content-length': '905',
   'connection': 'keep-alive',
   'x-amzn-requestid': 'b4854f1f-1827-4e9d-a890-6b3fef891e4f'},
  'RetryAttempts': 0},
 'FunctionName': 'getstatusforecast',
 'FunctionArn': 'arn:aws:lambda:us-east-1:805433377179:function:getstatusforecast',
 'Runtime': 'python3.7',
 'Role': 'arn:aws:iam::805433377179:role/service-role/AmazonSageMaker-ExecutionRole-20200716T084970',
 'Handler': 'getstatusforecast.lambda_handler',
 'CodeSize': 374,
 'Description': '',
 'Timeout': 900,
 'MemorySize': 3008,
 'LastModified': '2020-08-01T13:19:15.837+0000',
 'CodeSha256': 'm93syuTvFZ9cc4CyWa7RDnw5uCl3z6UxqnxpGa4AVp8=',
 'Version': '$LATEST',
 'TracingConfig': {'Mode': 'PassThrough'},
 'RevisionId': '43635002-3a50-4900-9ca9-b50f67969c18',
 'State': 'Active',
 'LastUpdateStatus': 'Successf

In [11]:
!rm -f lambdas/createforecastexportjob/forecastexportjob.zip
!cd lambdas/createforecastexportjob; zip -r forecastexportjob .

zip_file = open("lambdas/createforecastexportjob/forecastexportjob.zip", "rb").read()

lambda_.create_function(
    FunctionName="forecastexportjob",
    Runtime="python3.7",
    Role=role_sm,
    Handler="forecastexportjob.lambda_handler",
    Code={"ZipFile": zip_file},
    Timeout=60*15,
    MemorySize=3008
)

  adding: forecastexportjob.py (deflated 53%)


{'ResponseMetadata': {'RequestId': '8e704695-62a3-4164-9d2d-d1d3e679aead',
  'HTTPStatusCode': 201,
  'HTTPHeaders': {'date': 'Sat, 01 Aug 2020 13:19:17 GMT',
   'content-type': 'application/json',
   'content-length': '905',
   'connection': 'keep-alive',
   'x-amzn-requestid': '8e704695-62a3-4164-9d2d-d1d3e679aead'},
  'RetryAttempts': 0},
 'FunctionName': 'forecastexportjob',
 'FunctionArn': 'arn:aws:lambda:us-east-1:805433377179:function:forecastexportjob',
 'Runtime': 'python3.7',
 'Role': 'arn:aws:iam::805433377179:role/service-role/AmazonSageMaker-ExecutionRole-20200716T084970',
 'Handler': 'forecastexportjob.lambda_handler',
 'CodeSize': 508,
 'Description': '',
 'Timeout': 900,
 'MemorySize': 3008,
 'LastModified': '2020-08-01T13:19:17.079+0000',
 'CodeSha256': '30oXiVATLm2ex6B2FQiMagL8mYdEUPh2cr7rTDJnFSQ=',
 'Version': '$LATEST',
 'TracingConfig': {'Mode': 'PassThrough'},
 'RevisionId': 'f4d9f39e-817a-4fab-b8ee-94858184d7c4',
 'State': 'Active',
 'LastUpdateStatus': 'Successf

In [12]:
!rm -f lambdas/GetStatusForecastExportJob/getstatusforecastexportjob.zip
!cd lambdas/GetStatusForecastExportJob; zip -r getstatusforecastexportjob .

zip_file = open("lambdas/GetStatusForecastExportJob/getstatusforecastexportjob.zip", "rb").read()

lambda_.create_function(
    FunctionName="getstatusforecastexportjob",
    Runtime="python3.7",
    Role=role_sm,
    Handler="getstatusforecastexportjob.lambda_handler",
    Code={"ZipFile": zip_file},
    Timeout=60*15,
    MemorySize=3008
)

  adding: getstatusforecastexportjob.py (deflated 47%)


{'ResponseMetadata': {'RequestId': 'c3e7af4a-a9fa-47e2-a0e8-bd1f3fcb3cc4',
  'HTTPStatusCode': 201,
  'HTTPHeaders': {'date': 'Sat, 01 Aug 2020 13:19:18 GMT',
   'content-type': 'application/json',
   'content-length': '932',
   'connection': 'keep-alive',
   'x-amzn-requestid': 'c3e7af4a-a9fa-47e2-a0e8-bd1f3fcb3cc4'},
  'RetryAttempts': 0},
 'FunctionName': 'getstatusforecastexportjob',
 'FunctionArn': 'arn:aws:lambda:us-east-1:805433377179:function:getstatusforecastexportjob',
 'Runtime': 'python3.7',
 'Role': 'arn:aws:iam::805433377179:role/service-role/AmazonSageMaker-ExecutionRole-20200716T084970',
 'Handler': 'getstatusforecastexportjob.lambda_handler',
 'CodeSize': 406,
 'Description': '',
 'Timeout': 900,
 'MemorySize': 3008,
 'LastModified': '2020-08-01T13:19:18.360+0000',
 'CodeSha256': '7gfhIm0E6+FmQlVZCsXHR08X42ddNQ7ioaYPIRIG2OU=',
 'Version': '$LATEST',
 'TracingConfig': {'Mode': 'PassThrough'},
 'RevisionId': '7d1710c7-b5dc-47ef-b71a-26a9b8c4dbd7',
 'State': 'Active',
 'L

In [13]:
!rm -f lambdas/NotifyUser/notifyuser.zip
!cd lambdas/NotifyUser; zip -r notifyuser .

zip_file = open("lambdas/NotifyUser/notifyuser.zip", "rb").read()

lambda_.create_function(
    FunctionName="notifyuser",
    Runtime="python3.7",
    Role=role_sm,
    Handler="notifyuser.lambda_handler",
    Code={"ZipFile": zip_file},
    Timeout=60*15,
    MemorySize=3008
)

  adding: notifyuser.py (deflated 12%)


{'ResponseMetadata': {'RequestId': 'a525ede3-eb80-44c8-b701-91ce53382497',
  'HTTPStatusCode': 201,
  'HTTPHeaders': {'date': 'Sat, 01 Aug 2020 13:19:19 GMT',
   'content-type': 'application/json',
   'content-length': '884',
   'connection': 'keep-alive',
   'x-amzn-requestid': 'a525ede3-eb80-44c8-b701-91ce53382497'},
  'RetryAttempts': 0},
 'FunctionName': 'notifyuser',
 'FunctionArn': 'arn:aws:lambda:us-east-1:805433377179:function:notifyuser',
 'Runtime': 'python3.7',
 'Role': 'arn:aws:iam::805433377179:role/service-role/AmazonSageMaker-ExecutionRole-20200716T084970',
 'Handler': 'notifyuser.lambda_handler',
 'CodeSize': 263,
 'Description': '',
 'Timeout': 900,
 'MemorySize': 3008,
 'LastModified': '2020-08-01T13:19:19.563+0000',
 'CodeSha256': 'KIb96AGobdF9EmRtk+NH6jWCOH7e/DZSChpqr2FgFrU=',
 'Version': '$LATEST',
 'TracingConfig': {'Mode': 'PassThrough'},
 'RevisionId': 'd68b5471-f65d-4f50-96a4-8dc79d2cbd01',
 'State': 'Active',
 'LastUpdateStatus': 'Successful'}
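The nine cells above repeat the same call with a different directory and function name. As a sketch, the repetition could be folded into a loop. `LAMBDA_SPECS`, `create_function_kwargs`, `deploy_all` and the `read_zip` callback are hypothetical names; the argument values mirror the cells above.

```python
# (source directory, function name) pairs used in the cells above.
LAMBDA_SPECS = [
    ("lambdas/createdatasetimport", "datasetimport"),
    ("lambdas/GetStatusImport", "getstatusimport"),
    ("lambdas/createpredictor", "predictor"),
    ("lambdas/GetStatusPredictor", "getstatuspredictor"),
    ("lambdas/createforecast", "forecast"),
    ("lambdas/GetStatusForecast", "getstatusforecast"),
    ("lambdas/createforecastexportjob", "forecastexportjob"),
    ("lambdas/GetStatusForecastExportJob", "getstatusforecastexportjob"),
    ("lambdas/NotifyUser", "notifyuser"),
]


def create_function_kwargs(name, zip_bytes, role_arn):
    """Build the keyword arguments passed to lambda_.create_function."""
    return {
        "FunctionName": name,
        "Runtime": "python3.7",
        "Role": role_arn,
        "Handler": name + ".lambda_handler",
        "Code": {"ZipFile": zip_bytes},
        "Timeout": 60 * 15,
        "MemorySize": 3008,
    }


def deploy_all(lambda_client, role_arn, read_zip):
    """Create every function; read_zip(src_dir, name) returns the zip bytes."""
    arns = []
    for src_dir, name in LAMBDA_SPECS:
        kwargs = create_function_kwargs(name, read_zip(src_dir, name), role_arn)
        arns.append(lambda_client.create_function(**kwargs)["FunctionArn"])
    return arns
```

With the notebook's variables this would be called as `deploy_all(lambda_, role_sm, read_zip)`, where `read_zip` reads the archive built for each directory.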

# 2. Step Functions: Create state machine
First write the definition of the state machine, then create the state machine from that definition.

In [14]:
import sagemaker

In [15]:
sagemaker_session = sagemaker.Session()

In [16]:
sagemaker_session.boto_region_name

'us-east-1'

In [17]:
sts = boto3.client('sts')
id_info = sts.get_caller_identity()

In [18]:
id_info['Account']

'805433377179'

In [19]:
import json

In [20]:
def_sfn={
  "Comment": "Amazon Forecast example of the Amazon States Language using an AWS Lambda Function",
  "StartAt": "datasetimport",
  "States": {
    "datasetimport": {
      "Type": "Task",
      "InputPath":"$",
      "Resource": "arn:aws:lambda:" + sagemaker_session.boto_region_name + ":" + id_info['Account'] + ":function:datasetimport",
      "ResultPath":"$",
      "Next": "GetStatusImport"
    },
    "GetStatusImport": {
      "Type": "Task",
      "InputPath":"$",
      "Resource": "arn:aws:lambda:" + sagemaker_session.boto_region_name + ":" + id_info['Account'] + ":function:getstatusimport",
      "ResultPath":"$",
      "Next": "CheckStatusImport"
    },
    "CheckStatusImport": {
        "Type": "Choice",
        "InputPath":"$",
        "Choices": [
        {
        "Variable": "$.is_active_import",
        "BooleanEquals": True,
        "Next": "predictor"
        }
        ],
        "Default": "SleepCheckStatusImport"
        },
    "SleepCheckStatusImport": {
        "Type": "Wait",
        "Seconds": 300,
        "Next": "GetStatusImport"
    },
    "predictor": {
      "Type": "Task",
      "InputPath":"$",
      "Resource": "arn:aws:lambda:" + sagemaker_session.boto_region_name + ":" + id_info['Account'] + ":function:predictor",
      "ResultPath":"$",
      "Next": "GetStatusPredictor"
    },
    "GetStatusPredictor": {
      "Type": "Task",
      "InputPath":"$",
      "Resource": "arn:aws:lambda:" + sagemaker_session.boto_region_name + ":" + id_info['Account'] + ":function:getstatuspredictor",
      "ResultPath":"$",
      "Next": "CheckStatusPredictor"
    },
    "CheckStatusPredictor": {
        "Type": "Choice",
        "InputPath":"$",
        "Choices": [
        {
        "Variable": "$.is_active_predictor",
        "BooleanEquals": True,
        "Next": "forecast"
        }
        ],
        "Default": "SleepCheckStatusPredictor"
        },
    "SleepCheckStatusPredictor": {
        "Type": "Wait",
        "Seconds": 300,
        "Next": "GetStatusPredictor"
    },
    "forecast": {
      "Type": "Task",
      "InputPath":"$",
      "Resource": "arn:aws:lambda:" + sagemaker_session.boto_region_name + ":" + id_info['Account'] + ":function:forecast",
      "ResultPath":"$",
      "Next": "GetStatusForecast"
    },
    "GetStatusForecast": {
      "Type": "Task",
      "InputPath":"$",
      "Resource": "arn:aws:lambda:" + sagemaker_session.boto_region_name + ":" + id_info['Account'] + ":function:getstatusforecast",
      "ResultPath":"$",
      "Next": "CheckStatusForecast"
    },  
    "CheckStatusForecast": {
        "Type": "Choice",
        "InputPath":"$",
        "Choices": [
        {
        "Variable": "$.is_active_forecast",
        "BooleanEquals": True,
        "Next": "forecastexportjob"
        }
        ],
        "Default": "SleepCheckStatusForecast"
        },
    "SleepCheckStatusForecast": {
        "Type": "Wait",
        "Seconds": 300,
        "Next": "GetStatusForecast"
    },
    "forecastexportjob": {
      "Type": "Task",
      "InputPath":"$",
      "Resource": "arn:aws:lambda:" + sagemaker_session.boto_region_name + ":" + id_info['Account'] + ":function:forecastexportjob",
      "ResultPath":"$",
      "Next": "GetStatusForecastExportjob"
    },
    "GetStatusForecastExportjob": {
      "Type": "Task",
      "InputPath":"$",
      "Resource": "arn:aws:lambda:" + sagemaker_session.boto_region_name + ":" + id_info['Account'] + ":function:getstatusforecastexportjob",
      "ResultPath":"$",
      "Next": "CheckStatusExport"
    },  
    "CheckStatusExport": {
        "Type": "Choice",
        "InputPath":"$",
        "Choices": [
        {
        "Variable": "$.is_active_export",
        "BooleanEquals": True,
        "Next": "NotifyUser"
        }
        ],
        "Default": "SleepCheckStatusExport"
        },
    "SleepCheckStatusExport": {
        "Type": "Wait",
        "Seconds": 300,
        "Next": "GetStatusForecastExportjob"
    },
    "NotifyUser": {
      "Type": "Task",
      "InputPath":"$",
      "Resource": "arn:aws:lambda:" + sagemaker_session.boto_region_name + ":" + id_info['Account'] + ":function:notifyuser",
      "ResultPath":"$",
      "End": True
    }
  }
}
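Each stage in the definition above is followed by the same polling loop: a Task that fetches the status, a Choice that checks a boolean flag, and a Wait that sleeps before retrying. The sketch below generates one such loop. `polling_states` is a hypothetical helper, its state names are regularized (the original uses `GetStatusForecastExportjob` and `CheckStatusExport` for the export stage), and it omits `InputPath` and `ResultPath` because `"$"` is their default in the Amazon States Language.

```python
def polling_states(suffix, resource_arn, flag, next_state, wait_seconds=300):
    """Return the Task, Choice and Wait states for one polling loop."""
    get_state = "GetStatus" + suffix
    check_state = "CheckStatus" + suffix
    sleep_state = "SleepCheckStatus" + suffix
    return {
        get_state: {
            "Type": "Task",
            "Resource": resource_arn,
            "Next": check_state,
        },
        check_state: {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$." + flag, "BooleanEquals": True, "Next": next_state}
            ],
            # Anything other than True loops back through the Wait state.
            "Default": sleep_state,
        },
        sleep_state: {"Type": "Wait", "Seconds": wait_seconds, "Next": get_state},
    }
```

For the import stage this would be called as `polling_states("Import", arn, "is_active_import", "predictor")` and merged into `def_sfn["States"]`.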


In [21]:
with open('./definition.json', 'w') as f:
    json.dump(def_sfn, f, indent=2, ensure_ascii=False)
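Because the state names are typed by hand, a misspelled `Next` target would otherwise only surface when the state machine is created. A small local check, sketched here with a hypothetical helper `find_dangling_transitions`, can catch that before calling the API. It only inspects `StartAt`, `Next`, `Default` and `Choices`, so it is not a full validator.

```python
def find_dangling_transitions(definition):
    """Return (state, target) pairs whose target state is not defined."""
    states = definition["States"]
    dangling = []
    if definition["StartAt"] not in states:
        dangling.append(("StartAt", definition["StartAt"]))
    for name, state in states.items():
        targets = [state.get("Next"), state.get("Default")]
        targets += [choice.get("Next") for choice in state.get("Choices", [])]
        for target in targets:
            if target is not None and target not in states:
                dangling.append((name, target))
    return dangling
```

Running `find_dangling_transitions(def_sfn)` should return an empty list when every transition points at a defined state.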

In [22]:
import boto3
sfn = boto3.client('stepfunctions')

In [23]:
sfn.create_state_machine(
        name="demo-forecast",
        definition=open("definition.json").read(),
        roleArn=role_sm
)

{'stateMachineArn': 'arn:aws:states:us-east-1:805433377179:stateMachine:demo-forecast',
 'creationDate': datetime.datetime(2020, 8, 1, 13, 20, 15, 742000, tzinfo=tzlocal()),
 'ResponseMetadata': {'RequestId': 'a9e068ed-3946-4fa0-8141-ed476092a6c9',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': 'a9e068ed-3946-4fa0-8141-ed476092a6c9',
   'content-type': 'application/x-amz-json-1.0',
   'content-length': '118'},
  'RetryAttempts': 0}}

# 3. AWS CloudTrail: Create a trail
https://docs.aws.amazon.com/step-functions/latest/dg/tutorial-cloudwatch-events-s3.html

Configure CloudTrail and CloudWatch Events so that placing an object in S3 triggers Step Functions, as described in the guide linked above.

https://boto3.amazonaws.com/v1/documentation/api/1.9.42/reference/services/cloudtrail.html#CloudTrail.Client.create_trail

In [24]:
! pip freeze | grep boto3

boto3==1.14.16


In [25]:
cloudtrail = boto3.client('cloudtrail')

In [26]:
sts = boto3.client('sts')
id_info = sts.get_caller_identity()
print(id_info['Account'])

805433377179


In [27]:
bucket_name = 'demo-forecast-' + id_info['Account']

In [28]:
bucket_name

'demo-forecast-805433377179'

In [29]:
output_trail_bucket = bucket_name + '-trail'

In [30]:
s3 = boto3.client('s3')
s3.create_bucket(Bucket=output_trail_bucket)

{'ResponseMetadata': {'RequestId': '870DF6C2D4340C06',
  'HostId': 'zy5hflw1mCT/7iIyIyD/5hP8YZriRTPGwkwATe0EWV1PctLfZ/jhXYWn75cSuB3KO8vSI0RFYas=',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amz-id-2': 'zy5hflw1mCT/7iIyIyD/5hP8YZriRTPGwkwATe0EWV1PctLfZ/jhXYWn75cSuB3KO8vSI0RFYas=',
   'x-amz-request-id': '870DF6C2D4340C06',
   'date': 'Sat, 01 Aug 2020 13:20:29 GMT',
   'location': '/demo-forecast-805433377179-trail',
   'content-length': '0',
   'server': 'AmazonS3'},
  'RetryAttempts': 0},
 'Location': '/demo-forecast-805433377179-trail'}
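The call above works as written because the notebook runs in us-east-1. In any other region, S3 requires a `CreateBucketConfiguration` with a matching `LocationConstraint`. The sketch below builds the arguments for either case; `create_bucket_kwargs` is a hypothetical helper.

```python
def create_bucket_kwargs(bucket, region):
    """Build arguments for s3.create_bucket that work in any region."""
    kwargs = {"Bucket": bucket}
    if region != "us-east-1":
        # Outside us-east-1, S3 requires an explicit LocationConstraint.
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs
```

With the notebook's variables this would be used as `s3.create_bucket(**create_bucket_kwargs(output_trail_bucket, sagemaker_session.boto_region_name))`.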

## Set bucket policy
Setting a bucket policy with boto3:  
https://boto3.amazonaws.com/v1/documentation/api/latest/guide/s3-example-bucket-policies.html

In [31]:
import json

In [32]:
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AWSCloudTrailAclCheck20150319",
            "Effect": "Allow",
            "Principal": {
                "Service": "cloudtrail.amazonaws.com"
            },
            "Action": "s3:GetBucketAcl",
            "Resource": "arn:aws:s3:::" + bucket_name + "-trail"
        },
        {
            "Sid": "AWSCloudTrailWrite20150319",
            "Effect": "Allow",
            "Principal": {
                "Service": "cloudtrail.amazonaws.com"
            },
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::" + bucket_name + "-trail/AWSLogs/" + id_info['Account'] + "/*",
            "Condition": {
                "StringEquals": {
                    "s3:x-amz-acl": "bucket-owner-full-control"
                }
            }
        }
    ]
}

# Convert the policy from JSON dict to string
bucket_policy = json.dumps(bucket_policy)

In [33]:
# Set the new policy
s3 = boto3.client('s3')
s3.put_bucket_policy(Bucket=output_trail_bucket, Policy=bucket_policy)

{'ResponseMetadata': {'RequestId': '1D9D6F91598DB542',
  'HostId': 'dCV/RaH/P2RibIpug2cLQk4HynWdOC1fhtO7wLqFWMtWFkP9PWt+W1Ly+dQQz3JJXadxK0rnfSg=',
  'HTTPStatusCode': 204,
  'HTTPHeaders': {'x-amz-id-2': 'dCV/RaH/P2RibIpug2cLQk4HynWdOC1fhtO7wLqFWMtWFkP9PWt+W1Ly+dQQz3JJXadxK0rnfSg=',
   'x-amz-request-id': '1D9D6F91598DB542',
   'date': 'Sat, 01 Aug 2020 13:20:34 GMT',
   'server': 'AmazonS3'},
  'RetryAttempts': 0}}

In [34]:
cloudtrail.create_trail(
    Name='forecast-trail',
    S3BucketName=output_trail_bucket,
    EnableLogFileValidation=True
)

{'Name': 'forecast-trail',
 'S3BucketName': 'demo-forecast-805433377179-trail',
 'IncludeGlobalServiceEvents': True,
 'IsMultiRegionTrail': False,
 'TrailARN': 'arn:aws:cloudtrail:us-east-1:805433377179:trail/forecast-trail',
 'LogFileValidationEnabled': True,
 'IsOrganizationTrail': False,
 'ResponseMetadata': {'RequestId': '1f1cbd4e-4827-4683-bd21-0416b4a6eb56',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '1f1cbd4e-4827-4683-bd21-0416b4a6eb56',
   'content-type': 'application/x-amz-json-1.1',
   'content-length': '272',
   'date': 'Sat, 01 Aug 2020 13:20:34 GMT'},
  'RetryAttempts': 0}}

In [35]:
cloudtrail.put_event_selectors(
    TrailName='forecast-trail',
    EventSelectors=[
        {
            'ReadWriteType': 'All',
            'IncludeManagementEvents': True,
            'DataResources': [
                {
                    'Type': 'AWS::S3::Object',
                    'Values': [
                        f'arn:aws:s3:::{bucket_name}/',
                    ]
                },
            ]
        },
    ]
)

{'TrailARN': 'arn:aws:cloudtrail:us-east-1:805433377179:trail/forecast-trail',
 'EventSelectors': [{'ReadWriteType': 'All',
   'IncludeManagementEvents': True,
   'DataResources': [{'Type': 'AWS::S3::Object',
     'Values': ['arn:aws:s3:::demo-forecast-805433377179/']}],
   'ExcludeManagementEventSources': []}],
 'ResponseMetadata': {'RequestId': 'da6c7e1a-c103-4cf6-bfb8-b64ffd372168',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': 'da6c7e1a-c103-4cf6-bfb8-b64ffd372168',
   'content-type': 'application/x-amz-json-1.1',
   'content-length': '285',
   'date': 'Sat, 01 Aug 2020 13:20:36 GMT'},
  'RetryAttempts': 0}}
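The selector above records both read and write data events for every object in the bucket. If only uploads under one prefix should be logged, the selector can be narrowed. This is a sketch under that assumption; `s3_write_selector` is a hypothetical helper, and the `input/` prefix is taken from the upload key used in section 5.

```python
def s3_write_selector(bucket, prefix=""):
    """Build an event selector that logs only S3 write events under a prefix."""
    return {
        "ReadWriteType": "WriteOnly",
        "IncludeManagementEvents": True,
        "DataResources": [
            {
                "Type": "AWS::S3::Object",
                # An empty prefix covers the whole bucket, as in the cell above.
                "Values": [f"arn:aws:s3:::{bucket}/{prefix}"],
            }
        ],
    }
```

It would be passed as `EventSelectors=[s3_write_selector(bucket_name, "input/")]` in `put_event_selectors`.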

### Enable logging
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/cloudtrail.html#CloudTrail.Client.start_logging

In [36]:
cloudtrail.start_logging(Name='forecast-trail')

{'ResponseMetadata': {'RequestId': '3c4ed1be-4a8a-4f4d-93d4-25f7a84ddea6',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '3c4ed1be-4a8a-4f4d-93d4-25f7a84ddea6',
   'content-type': 'application/x-amz-json-1.1',
   'content-length': '2',
   'date': 'Sat, 01 Aug 2020 13:20:38 GMT'},
  'RetryAttempts': 0}}

# 4. CloudWatch Events: Build a rule
https://boto3.amazonaws.com/v1/documentation/api/1.9.42/reference/services/events.html


put_rule(): creates a rule  
https://boto3.amazonaws.com/v1/documentation/api/1.9.42/reference/services/events.html#CloudWatchEvents.Client.put_rule


In [37]:
bucket_name

'demo-forecast-805433377179'

In [38]:
cwe = boto3.client('events')

In [39]:
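# Build the event pattern as a dict and serialize it with json.dumps,
# which avoids hand-written quoting around the bucket name.
ep_str = json.dumps({
    "source": ["aws.s3"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["s3.amazonaws.com"],
        "eventName": ["PutObject", "CompleteMultipartUpload"],
        "requestParameters": {"bucketName": [bucket_name]}}})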

In [40]:
cwe.put_rule(
    Name='demo-forecast',
    EventPattern=ep_str,
    State='ENABLED'
)

{'RuleArn': 'arn:aws:events:us-east-1:805433377179:rule/demo-forecast',
 'ResponseMetadata': {'RequestId': '22981581-3982-4491-b752-3f008910c28c',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '22981581-3982-4491-b752-3f008910c28c',
   'content-type': 'application/x-amz-json-1.1',
   'content-length': '70',
   'date': 'Sat, 01 Aug 2020 13:20:44 GMT'},
  'RetryAttempts': 0}}
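To see which events the rule is meant to select, the pattern can be checked locally against a sample event. The matcher below is a deliberately simplified sketch: it handles only exact-value lists and nested objects, not the full EventBridge pattern syntax, and both the bucket name and the sample event are made up for illustration.

```python
def matches(pattern, event):
    """Simplified matcher: exact-value lists and nested objects only."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            # Nested object: every sub-key must match as well.
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            # Leaf: the event value must be one of the listed values.
            return False
    return True


# Same shape as the pattern in the cell above, with a placeholder bucket.
pattern = {
    "source": ["aws.s3"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["s3.amazonaws.com"],
        "eventName": ["PutObject", "CompleteMultipartUpload"],
        "requestParameters": {"bucketName": ["example-bucket"]},
    },
}

# Hypothetical, heavily trimmed CloudTrail event.
sample_event = {
    "source": "aws.s3",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {
        "eventSource": "s3.amazonaws.com",
        "eventName": "PutObject",
        "requestParameters": {"bucketName": "example-bucket", "key": "input/a.csv"},
    },
}

print(matches(pattern, sample_event))
```

Changing `eventName` in the sample to `GetObject` makes the match fail, which is the behavior the rule relies on to ignore reads.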

In [41]:
cwe.put_targets(
    Rule='demo-forecast',
    Targets=[
        {
            'Id': 'forecast',
            'Arn': "arn:aws:states:" + sagemaker_session.boto_region_name + ":" + id_info['Account'] + ":stateMachine:demo-forecast",
            'RoleArn': role_sm
        }
    ]
)

{'FailedEntryCount': 0,
 'FailedEntries': [],
 'ResponseMetadata': {'RequestId': '0f8f6d09-bf7a-4e96-b286-b76451b89fa5',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '0f8f6d09-bf7a-4e96-b286-b76451b89fa5',
   'content-type': 'application/x-amz-json-1.1',
   'content-length': '41',
   'date': 'Sat, 01 Aug 2020 13:20:45 GMT'},
  'RetryAttempts': 0}}

# 5. Put an additional file into S3
When you upload a file to S3, the Step Functions state machine runs.

The trigger may not fire if you run all the cells at once, so wait about 30 seconds before uploading. The file can be uploaded to S3 repeatedly.
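The pause can be made explicit in the notebook. The sketch below first confirms that the trail is logging and then sleeps for the settle time. `wait_until_logging` is a hypothetical helper and the 30-second default is the estimate from the text, not a documented guarantee; `get_trail_status` and its `IsLogging` field come from the CloudTrail API.

```python
import time


def wait_until_logging(cloudtrail_client, trail_name, settle_seconds=30, sleep=time.sleep):
    """Check that the trail is logging, then pause before uploading."""
    status = cloudtrail_client.get_trail_status(Name=trail_name)
    if not status["IsLogging"]:
        raise RuntimeError(f"Trail {trail_name} is not logging")
    # Give the trail and the rule time to take effect.
    sleep(settle_seconds)
```

In this notebook it would be called as `wait_until_logging(cloudtrail, 'forecast-trail')` just before the upload cell.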

In [42]:
s3 = boto3.resource('s3')
bucket = s3.Bucket(bucket_name)

bucket.upload_file('./output/tr_target_add_20091201_20101209.csv', 'input/tr_target_add_20091201_20101209.csv')
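After the upload, it can be useful to confirm that the rule actually started an execution. The sketch below lists the most recent executions of the state machine; `latest_executions` is a hypothetical helper, while `list_executions` and its `executions` field come from the Step Functions API.

```python
def latest_executions(sfn_client, state_machine_arn, limit=5):
    """Return (name, status) for the most recent executions."""
    response = sfn_client.list_executions(
        stateMachineArn=state_machine_arn, maxResults=limit
    )
    return [(e["name"], e["status"]) for e in response["executions"]]
```

With the notebook's variables this would be called with `sfn` and the `stateMachineArn` returned by `create_state_machine`. A `RUNNING` entry shortly after the upload indicates that the trigger fired.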

# 6. Next
When the Step Functions execution completes, the prediction results are written to S3. Import them into QuickSight as a data source using the same procedure as before, and visualize them.