# Automate Model Retraining & Deployment Using the AWS Step Functions Data Science SDK

1. [Introduction](#Introduction)
1. [Setup](#Setup)
1. [Create Resources](#Create-Resources)
1. [Build a Machine Learning Workflow](#Build-a-Machine-Learning-Workflow)
1. [Run the Workflow](#Run-the-Workflow)
1. [Clean Up](#Clean-Up)

## Introduction

This notebook describes how to use the AWS Step Functions Data Science SDK to create a machine learning model retraining workflow. The Step Functions SDK is an open source library that allows data scientists to easily create and execute machine learning workflows using AWS Step Functions and Amazon SageMaker. For more information, please see the following resources:
* [AWS Step Functions](https://aws.amazon.com/step-functions/)
* [AWS Step Functions Developer Guide](https://docs.aws.amazon.com/step-functions/latest/dg/welcome.html)
* [AWS Step Functions Data Science SDK](https://aws-step-functions-data-science-sdk.readthedocs.io)

In this notebook, we will use the SDK to create steps that capture and transform data using AWS Glue, incorporate this data into the training of a machine learning model, deploy the model to a SageMaker endpoint, link these steps together to create a workflow, and then execute the workflow in AWS Step Functions.

## Setup

First, we'll install and load the required modules. Then we'll create fine-grained IAM roles for the Lambda, Glue, and Step Functions resources used in this notebook.

In [1]:
import sys

!{sys.executable} -m pip install --upgrade stepfunctions

Collecting stepfunctions
  Downloading stepfunctions-2.2.0.tar.gz (64 kB)
Building wheels for collected packages: stepfunctions
  Building wheel for stepfunctions (setup.py) ... done
  Created wheel for stepfunctions: filename=stepfunctions-2.2.0-py2.py3-none-any.whl size=74960 sha256=2e4fb43184db1f8c0477281eb7015df1ed869b41ff978fa1ae920f1235365b2f
  Stored in directory: /home/ec2-user/.cache/pip/wheels/f5/5e/bb/79fb2362e3b81874d0065521c886ed3ad0dd2ecfd230012617
Successfully built stepfunctions
Installing collected packages: stepfunctions
Successfully installed stepfunctions-2.2.0
You should consider upgrading via the '/home/ec2-user/anaconda3/envs/python3/bin/python -m pip install --upgrade pip' command.
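
To confirm which version of the SDK the install step picked up, a small check like the following may help. This is a minimal sketch that uses only the standard library; `installed_version` is a hypothetical helper name and is not part of the Step Functions SDK.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string if the package is installed, otherwise None.
print(installed_version("stepfunctions"))
```

If this prints `None`, the install most likely targeted a different Python environment than the one running the kernel.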


### Import the Required Modules

In [2]:
import uuid
import logging
import stepfunctions
import boto3
import sagemaker

# from sagemaker.amazon.amazon_estimator import image_uris
# from sagemaker.inputs import TrainingInput
# from sagemaker.s3 import S3Uploader
# from stepfunctions import steps
# from stepfunctions.steps import TrainingStep, ModelStep
# from stepfunctions.inputs import ExecutionInput
# from stepfunctions.workflow import Workflow
from sagemaker import get_execution_role
session = sagemaker.Session()
stepfunctions.set_stream_logger(level=logging.INFO)

notebook_role = get_execution_role()

region = boto3.Session().region_name
bucket = session.default_bucket()
id = uuid.uuid4().hex

# Create a unique name for the AWS Glue job to be created. If you change the
# default name, you may need to change the Step Functions execution role.
# job_name = "glue-customer-churn-etl-{}".format(id)

# Create a unique name for the AWS Lambda function to be created. If you change
# the default name, you may need to change the Step Functions execution role.
# function_name = "query-training-status-{}".format(id)
# model_register_function = "register-model-version-{}".format(id)
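
The comments above warn that changing the default names may require changing the Step Functions execution role. The reason is that the execution policy used later in this notebook scopes its permissions with name patterns such as `glue-customer-churn-etl*`. The sketch below illustrates that relationship with the standard library's `fnmatch` as a rough stand-in for IAM's `*` wildcard; `unique_name` and `is_covered` are hypothetical helpers, and real IAM evaluation has more rules than this.

```python
import uuid
from fnmatch import fnmatchcase

# Resource-name patterns copied from the Step Functions execution policy in this notebook.
ALLOWED_PATTERNS = {
    "glue_job": "glue-customer-churn-etl*",
    "status_function": "query-training-status*",
    "register_function": "register-model-version*",
}


def unique_name(prefix, suffix=None):
    """Build '<prefix>-<suffix>', using a random hex suffix when none is given."""
    suffix = suffix or uuid.uuid4().hex
    return "{}-{}".format(prefix, suffix)


def is_covered(name, pattern):
    """Approximate check that a resource name falls under a wildcard pattern."""
    return fnmatchcase(name, pattern)


job_name = unique_name("glue-customer-churn-etl")
print(job_name, is_covered(job_name, ALLOWED_PATTERNS["glue_job"]))

# A renamed job would fall outside the pattern, so the policy would need updating.
print(is_covered(unique_name("my-etl-job"), ALLOWED_PATTERNS["glue_job"]))
```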

In [3]:
notebook_role

'arn:aws:iam::667350535149:role/TeamRole'

Next, we'll create fine-grained IAM roles for the Lambda, Glue, and Step Functions resources. The IAM roles grant the services permissions within your AWS environment.

### Add permissions to your notebook role in IAM

The IAM role assumed by your notebook requires permission to create and run workflows in AWS Step Functions. If this notebook is running on a SageMaker notebook instance, do the following to provide IAM permissions to the notebook:

1. Open the Amazon [SageMaker console](https://console.aws.amazon.com/sagemaker/). 
2. Select **Notebook instances** and choose the name of your notebook instance.
3. Under **Permissions and encryption** select the role ARN to view the role on the IAM console.
4. Copy and save the IAM role ARN for later use. 
5. Choose **Attach policies** and search for `AWSStepFunctionsFullAccess`.
6. Select the check box next to `AWSStepFunctionsFullAccess` and choose **Attach policy**.

We also need to give the notebook instance permission to create an AWS Lambda function and an AWS Glue job. We will edit the managed policy attached to our role directly to incorporate these specific permissions:

1. Under **Permissions policies** expand the AmazonSageMaker-ExecutionPolicy-******** policy and choose **Edit policy**.
2. Select **Add additional permissions**. Choose **IAM** for Service and **PassRole** for Actions.
3. Under Resources, choose **Specific**. Select **Add ARN** and enter `query_training_status-role` for **Role name with path** and choose **Add**. You will create this role later on in this notebook.
4. Select **Add additional permissions** a second time. Choose **Lambda** for Service, **Write** for Access level, and **All resources** for Resources.
5. Select **Add additional permissions** a final time. Choose **Glue** for Service, **Write** for Access level, and **All resources** for Resources.
6. Choose **Review policy** and then **Save changes**.

If you are running this notebook outside of SageMaker, the SDK will use your configured AWS CLI configuration. For more information, see [Configuring the AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html).

In [4]:
iam = boto3.client("iam")
iam.attach_role_policy(
    PolicyArn='arn:aws:iam::aws:policy/AWSStepFunctionsFullAccess',
    RoleName=notebook_role.split('/')[-1]
)

iam.attach_role_policy(
    PolicyArn='arn:aws:iam::aws:policy/AWSLambda_FullAccess',
    RoleName=notebook_role.split('/')[-1]
)

iam.attach_role_policy(
    PolicyArn='arn:aws:iam::aws:policy/AmazonEventBridgeFullAccess',
    RoleName=notebook_role.split('/')[-1]
)

iam.attach_role_policy(
    PolicyArn='arn:aws:iam::aws:policy/CloudWatchEventsFullAccess',
    RoleName=notebook_role.split('/')[-1]
)

{'ResponseMetadata': {'RequestId': 'acd972c8-9baf-422d-85aa-16561849425d',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': 'acd972c8-9baf-422d-85aa-16561849425d',
   'content-type': 'text/xml',
   'content-length': '212',
   'date': 'Wed, 10 Nov 2021 09:45:06 GMT'},
  'RetryAttempts': 0}}
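
The cell above derives the role name from the ARN with `notebook_role.split('/')[-1]`. A slightly more defensive version, sketched below, also rejects strings that are not IAM role ARNs. `role_name_from_arn` is a hypothetical helper introduced here for illustration.

```python
def role_name_from_arn(role_arn):
    """Extract the role name from an IAM role ARN, ignoring any path prefix."""
    parts = role_arn.split(":", 5)
    # A role ARN looks like arn:<partition>:iam::<account-id>:role/<optional/path/><name>
    if len(parts) != 6 or parts[2] != "iam" or not parts[5].startswith("role/"):
        raise ValueError("Not an IAM role ARN: {!r}".format(role_arn))
    return parts[5].rsplit("/", 1)[-1]


print(role_name_from_arn("arn:aws:iam::667350535149:role/TeamRole"))
```

With a validated name in hand, the four `attach_role_policy` calls could share a single `RoleName` variable instead of repeating the split.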

Next, let's create an execution role in IAM for Step Functions. 

### Create an Execution Role for Step Functions

Your Step Functions workflow requires an IAM role to interact with other services in your AWS environment. 

1. Go to the [IAM console](https://console.aws.amazon.com/iam/).
2. Select **Roles** and then **Create role**.
3. Under **Choose the service that will use this role** select **Step Functions**.
4. Choose **Next** until you can enter a **Role name**.
5. Enter a name such as `AmazonSageMaker-StepFunctionsWorkflowExecutionRole` and then select **Create role**.

Next, create and attach a policy to the role you created. As a best practice, the following steps will attach a policy that only provides access to the specific resources and actions needed for this solution.

1. Under the **Permissions** tab, click **Attach policies** and then **Create policy**.
2. Enter the following in the **JSON** tab:

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "NOTEBOOK_ROLE_ARN",
            "Condition": {
                "StringEquals": {
                    "iam:PassedToService": "sagemaker.amazonaws.com"
                }
            }
        },
        {
            "Effect": "Allow",
            "Action": [
                "sagemaker:CreateModel",
                "sagemaker:DeleteEndpointConfig",
                "sagemaker:DescribeTrainingJob",
                "sagemaker:CreateEndpoint",
                "sagemaker:StopTrainingJob",
                "sagemaker:CreateTrainingJob",
                "sagemaker:UpdateEndpoint",
                "sagemaker:CreateEndpointConfig",
                "sagemaker:DeleteEndpoint"
            ],
            "Resource": [
                "arn:aws:sagemaker:*:*:*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "events:DescribeRule",
                "events:PutRule",
                "events:PutTargets"
            ],
            "Resource": [
                "arn:aws:events:*:*:rule/StepFunctionsGetEventsForSageMakerTrainingJobsRule"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "lambda:InvokeFunction"
            ],
            "Resource": [
                "arn:aws:lambda:*:*:function:query-training-status*", "arn:aws:lambda:*:*:function:register-model-version*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "glue:StartJobRun",
                "glue:GetJobRun",
                "glue:BatchStopJobRun",
                "glue:GetJobRuns"
            ],
            "Resource": "arn:aws:glue:*:*:job/glue-customer-churn-etl*"
        }
    ]
}
```

3. Replace **NOTEBOOK_ROLE_ARN** with the ARN of your notebook's IAM role, which you saved in the previous section.
4. Choose **Review policy** and give the policy a name such as `AmazonSageMaker-StepFunctionsWorkflowExecutionPolicy`.
5. Choose **Create policy**.
6. Select **Roles** and search for your `AmazonSageMaker-StepFunctionsWorkflowExecutionRole` role.
7. Under the **Permissions** tab, click **Attach policies**.
8. Search for your newly created `AmazonSageMaker-StepFunctionsWorkflowExecutionPolicy` policy and select the check box next to it.
9. Choose **Attach policy**. You will then be redirected to the details page for the role.
10. Copy the AmazonSageMaker-StepFunctionsWorkflowExecutionRole **Role ARN** at the top of the Summary.

In [5]:
import json 

role_name = "AmazonSageMaker-StepFunctionsWorkflowExecutionRole"
assume_role_policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "Service": ["states.amazonaws.com"]
          },
          "Action": "sts:AssumeRole"
        }
    ]
}
create_role_response = iam.create_role(
    RoleName = role_name,
    AssumeRolePolicyDocument = json.dumps(assume_role_policy_document)
)


In [6]:
create_role_response

{'Role': {'Path': '/',
  'RoleName': 'AmazonSageMaker-StepFunctionsWorkflowExecutionRole',
  'RoleId': 'AROAZWYJRBPWUI3ELOXZA',
  'Arn': 'arn:aws:iam::667350535149:role/AmazonSageMaker-StepFunctionsWorkflowExecutionRole',
  'CreateDate': datetime.datetime(2021, 11, 10, 9, 45, 16, tzinfo=tzlocal()),
  'AssumeRolePolicyDocument': {'Version': '2012-10-17',
   'Statement': [{'Effect': 'Allow',
     'Principal': {'Service': ['states.amazonaws.com']},
     'Action': 'sts:AssumeRole'}]}},
 'ResponseMetadata': {'RequestId': '39d008cf-46fc-4772-af0d-5573594f9222',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '39d008cf-46fc-4772-af0d-5573594f9222',
   'content-type': 'text/xml',
   'content-length': '860',
   'date': 'Wed, 10 Nov 2021 09:45:16 GMT'},
  'RetryAttempts': 0}}

In [7]:
stepfunction_exec_role_arn = create_role_response['Role']['Arn']
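
Note that `create_role` raises an error if a role with the same name already exists, so re-running the cells above in the same account will fail. One way to make the step repeatable is sketched below: fall back to `get_role` when the role is already there. `get_or_create_role` is a hypothetical helper; it assumes `iam_client` is a boto3 IAM client like the `iam` object created earlier.

```python
import json


def get_or_create_role(iam_client, role_name, trust_policy):
    """Create the role if needed and return its ARN either way."""
    try:
        response = iam_client.create_role(
            RoleName=role_name,
            AssumeRolePolicyDocument=json.dumps(trust_policy),
        )
    except iam_client.exceptions.EntityAlreadyExistsException:
        # The role is already present (for example from an earlier run), so reuse it.
        response = iam_client.get_role(RoleName=role_name)
    return response["Role"]["Arn"]


# Example usage with the names defined in this notebook (requires AWS credentials):
# stepfunction_exec_role_arn = get_or_create_role(iam, role_name, assume_role_policy_document)
```

This sketch does not check that an existing role has the expected trust policy; if the role was created differently, you may still need to inspect it in the IAM console.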

In [8]:
stepfunction_exec_role_policy = '''{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "NOTEBOOK_ROLE_ARN",
            "Condition": {
                "StringEquals": {
                    "iam:PassedToService": "sagemaker.amazonaws.com"
                }
            }
        },
        {
            "Effect": "Allow",
            "Action": [
                "sagemaker:CreateModel",
                "sagemaker:DeleteEndpointConfig",
                "sagemaker:DescribeTrainingJob",
                "sagemaker:CreateEndpoint",
                "sagemaker:StopTrainingJob",
                "sagemaker:CreateTrainingJob",
                "sagemaker:UpdateEndpoint",
                "sagemaker:CreateEndpointConfig",
                "sagemaker:DeleteEndpoint"
            ],
            "Resource": [
                "arn:aws:sagemaker:*:*:*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "events:DescribeRule",
                "events:PutRule",
                "events:PutTargets"
            ],
            "Resource": [
                "arn:aws:events:*:*:rule/StepFunctionsGetEventsForSageMakerTrainingJobsRule"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "lambda:InvokeFunction"
            ],
            "Resource": [
                "arn:aws:lambda:*:*:function:query-training-status*", "arn:aws:lambda:*:*:function:register-model-version*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "glue:StartJobRun",
                "glue:GetJobRun",
                "glue:BatchStopJobRun",
                "glue:GetJobRuns"
            ],
            "Resource": "arn:aws:glue:*:*:job/glue-customer-churn-etl*"
        }
    ]
}
'''

stepfunction_exec_role_policy = stepfunction_exec_role_policy.replace("NOTEBOOK_ROLE_ARN", notebook_role)

stepfunction_exec_role_policy

'{\n    "Version": "2012-10-17",\n    "Statement": [\n        {\n            "Effect": "Allow",\n            "Action": "iam:PassRole",\n            "Resource": "arn:aws:iam::667350535149:role/TeamRole",\n            "Condition": {\n                "StringEquals": {\n                    "iam:PassedToService": "sagemaker.amazonaws.com"\n                }\n            }\n        },\n        {\n            "Effect": "Allow",\n            "Action": [\n                "sagemaker:CreateModel",\n                "sagemaker:DeleteEndpointConfig",\n                "sagemaker:DescribeTrainingJob",\n                "sagemaker:CreateEndpoint",\n                "sagemaker:StopTrainingJob",\n                "sagemaker:CreateTrainingJob",\n                "sagemaker:UpdateEndpoint",\n                "sagemaker:CreateEndpointConfig",\n                "sagemaker:DeleteEndpoint"\n            ],\n            "Resource": [\n                "arn:aws:sagemaker:*:*:*"\n            ]\n        },\n        {\n   
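
Because the policy is built by string replacement, a typo in the template or a placeholder that was never replaced would only surface when IAM rejects the document. The following sketch shows one way to catch both problems locally before calling `create_policy`. `render_policy` is a hypothetical helper, and the short template is a stand-in for the full policy above.

```python
import json


def render_policy(template, replacements):
    """Fill placeholders in a JSON policy template and verify the result parses."""
    rendered = template
    for placeholder, value in replacements.items():
        if placeholder not in rendered:
            raise ValueError("Placeholder not found in template: " + placeholder)
        rendered = rendered.replace(placeholder, value)
    # json.loads raises json.JSONDecodeError if the rendered text is not valid JSON.
    json.loads(rendered)
    return rendered


# Shortened stand-in for the policy template used in this notebook.
template = """{
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "iam:PassRole", "Resource": "NOTEBOOK_ROLE_ARN"}
    ]
}"""

policy_text = render_policy(
    template, {"NOTEBOOK_ROLE_ARN": "arn:aws:iam::667350535149:role/TeamRole"}
)
print(json.loads(policy_text)["Statement"][0]["Resource"])
```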

In [9]:
response = iam.create_policy(
    PolicyName='AmazonSageMaker-StepFunctionsWorkflowExecutionPolicy',
    PolicyDocument=stepfunction_exec_role_policy
)


In [10]:
policy_arn = response['Policy']['Arn']

In [11]:
iam.attach_role_policy(
    RoleName=role_name,
    PolicyArn=policy_arn
)

{'ResponseMetadata': {'RequestId': '33806f59-c481-4b59-8363-53f1f9443213',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '33806f59-c481-4b59-8363-53f1f9443213',
   'content-type': 'text/xml',
   'content-length': '212',
   'date': 'Wed, 10 Nov 2021 09:45:36 GMT'},
  'RetryAttempts': 0}}

### Configure Execution Roles

In [12]:
# ARN of the AmazonSageMaker-StepFunctionsWorkflowExecutionRole created above
# (paste the ARN here instead if you created the role in the console)
workflow_execution_role = stepfunction_exec_role_arn

# SageMaker Execution Role
# You can use sagemaker.get_execution_role() if running inside sagemaker's notebook instance
sagemaker_execution_role = (
    sagemaker.get_execution_role()
)  # Replace with ARN if not in an AWS SageMaker notebook

#### Create a Glue IAM Role
You need to create an IAM role so that you can create and execute an AWS Glue Job on your data in Amazon S3.

1. Go to the [IAM console](https://console.aws.amazon.com/iam/).
2. Select **Roles** and then **Create role**.
3. Under **Choose the service that will use this role** select **Glue**.
4. Choose **Next** until you can enter a **Role name**.
5. Enter a name such as `AWS-Glue-S3-Bucket-Access` and then select **Create role**.

Next, create and attach a policy to the role you created. The following steps attach a managed policy that provides Glue access to the specific S3 bucket holding your data.

1. Under the **Permissions** tab, click **Attach policies** and then **Create policy**.
2. Enter the following in the **JSON** tab:

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ListObjectsInBucket",
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": ["arn:aws:s3:::BUCKET-NAME"]
        },
        {
            "Sid": "AllObjectActions",
            "Effect": "Allow",
            "Action": "s3:*Object",
            "Resource": ["arn:aws:s3:::BUCKET-NAME/*"]
        }
    ]
}
```

3. Run the next cell (below) to retrieve the specific **S3 bucket name** that we will grant permissions to.

In [13]:
glue_role_name = "AWS-Glue-S3-Bucket-Access"
glue_assume_role_policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "Service": ["glue.amazonaws.com"]
          },
          "Action": "sts:AssumeRole"
        }
    ]
}
create_role_response = iam.create_role(
    RoleName = glue_role_name,
    AssumeRolePolicyDocument = json.dumps(glue_assume_role_policy_document)
)
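
The trust policies for the Step Functions, Glue, and Lambda roles in this notebook differ only in the service principal. As a minimal sketch, a small builder can generate all three; `trust_policy_for` is a hypothetical helper name.

```python
import json


def trust_policy_for(service_principal):
    """Return a trust policy that lets one AWS service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": [service_principal]},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# The three service principals used in this notebook.
for principal in ("states.amazonaws.com", "glue.amazonaws.com", "lambda.amazonaws.com"):
    print(json.dumps(trust_policy_for(principal)))
```

The resulting dictionary can be passed through `json.dumps` as the `AssumeRolePolicyDocument` argument, exactly as the cells in this notebook do with their hand-written documents.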


In [14]:
session = sagemaker.Session()
bucket = session.default_bucket()
print(bucket)

sagemaker-us-east-1-667350535149


In [15]:
glue_exec_role_policy = '''{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ListObjectsInBucket",
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": ["arn:aws:s3:::BUCKET-NAME"]
        },
        {
            "Sid": "AllObjectActions",
            "Effect": "Allow",
            "Action": "s3:*Object",
            "Resource": ["arn:aws:s3:::BUCKET-NAME/*"]
        }
    ]
}'''
glue_exec_role_policy = glue_exec_role_policy.replace("BUCKET-NAME", bucket)
glue_exec_role_policy

'{\n    "Version": "2012-10-17",\n    "Statement": [\n        {\n            "Sid": "ListObjectsInBucket",\n            "Effect": "Allow",\n            "Action": ["s3:ListBucket"],\n            "Resource": ["arn:aws:s3:::sagemaker-us-east-1-667350535149"]\n        },\n        {\n            "Sid": "AllObjectActions",\n            "Effect": "Allow",\n            "Action": "s3:*Object",\n            "Resource": ["arn:aws:s3:::sagemaker-us-east-1-667350535149/*"]\n        }\n    ]\n}'
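
The `s3:*Object` action in this policy is a wildcard, so it is worth seeing which actions it covers. The sketch below approximates IAM's `*` matching with the standard library's `fnmatch`, lower-casing both sides because IAM treats action names case-insensitively. `matching_actions` is a hypothetical helper, the action list is only a small sample, and this is an illustration rather than a full policy evaluator.

```python
from fnmatch import fnmatchcase


def matching_actions(pattern, candidate_actions):
    """Return the candidate actions that a wildcard action pattern would cover."""
    return [
        action
        for action in candidate_actions
        if fnmatchcase(action.lower(), pattern.lower())
    ]


# A small sample of S3 action names for illustration.
SAMPLE_ACTIONS = [
    "s3:GetObject",
    "s3:PutObject",
    "s3:DeleteObject",
    "s3:GetObjectAcl",
    "s3:ListBucket",
]

print(matching_actions("s3:*Object", SAMPLE_ACTIONS))
# prints ['s3:GetObject', 's3:PutObject', 's3:DeleteObject']
```

Only names that end in `Object` match, which is why the policy needs a separate statement for `s3:ListBucket` on the bucket itself.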

In [16]:
response = iam.create_policy(
    PolicyName='SageMakerStepFunctionGlueS3Policy',
    PolicyDocument=glue_exec_role_policy
)

In [17]:
iam.attach_role_policy(
    RoleName=glue_role_name,
    PolicyArn=response['Policy']['Arn']
)

{'ResponseMetadata': {'RequestId': 'fe133e59-2c08-4e3e-afa2-c33779735665',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': 'fe133e59-2c08-4e3e-afa2-c33779735665',
   'content-type': 'text/xml',
   'content-length': '212',
   'date': 'Wed, 10 Nov 2021 09:45:53 GMT'},
  'RetryAttempts': 0}}

4. Copy the output of the above cell and replace the **two occurrences** of **BUCKET-NAME** in the JSON text that you entered.
5. Choose **Review policy** and give the policy a name such as `S3BucketAccessPolicy`.
6. Choose **Create policy**.
7. Select **Roles**, then search for and select your `AWS-Glue-S3-Bucket-Access` role.
8. Under the **Permissions** tab, click **Attach policies**.
9. Search for your newly created `S3BucketAccessPolicy` policy and select the check box next to it.
10. Choose **Attach policy**. You will then be redirected to the details page for the role.
11. Copy the **Role ARN** at the top of the Summary tab.

In [18]:
# ARN of the AWS-Glue-S3-Bucket-Access role created above
# (paste the ARN here instead if you created the role in the console)
glue_role = create_role_response['Role']['Arn']

In [19]:
glue_role

'arn:aws:iam::667350535149:role/AWS-Glue-S3-Bucket-Access'

#### Create a Lambda IAM Role
You also need to create an IAM role so that you can create and execute an AWS Lambda function stored in Amazon S3.

1. Go to the [IAM console](https://console.aws.amazon.com/iam/).
2. Select **Roles** and then **Create role**.
3. Under **Choose the service that will use this role** select **Lambda**.
4. Choose **Next** until you can enter a **Role name**.
5. Enter a name such as `query_training_status-role` and then select **Create role**.

Next, attach policies to the role you created. The following steps attach policies that give the Lambda function basic execution permissions and access to SageMaker. Full access is used here rather than read-only access, matching the code cell below.

1. Under the **Permissions** tab, click **Attach Policies**.
2. In the search box, type **SageMaker** and select **AmazonSageMakerFullAccess** from the populated list.
3. In the search box type **AWSLambda** and select **AWSLambdaBasicExecutionRole** from the populated list.
4. Choose **Attach policy**. You will then be redirected to the details page for the role.
5. Copy the **Role ARN** at the top of the **Summary**.


In [20]:
lambda_role_name = "LambdaExecutionRole"
lambda_assume_role_policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "Service": ["lambda.amazonaws.com"]
          },
          "Action": "sts:AssumeRole"
        }
    ]
}
create_role_response = iam.create_role(
    RoleName = lambda_role_name,
    AssumeRolePolicyDocument = json.dumps(lambda_assume_role_policy_document)
)

In [21]:
iam.attach_role_policy(
    PolicyArn='arn:aws:iam::aws:policy/AmazonSageMakerFullAccess',
    RoleName=lambda_role_name
)

{'ResponseMetadata': {'RequestId': '841f7a83-b785-490e-8583-ee294e5c5aa2',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '841f7a83-b785-490e-8583-ee294e5c5aa2',
   'content-type': 'text/xml',
   'content-length': '212',
   'date': 'Wed, 10 Nov 2021 09:46:40 GMT'},
  'RetryAttempts': 0}}
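
The console steps above attach two managed policies, while the code cell attaches only `AmazonSageMakerFullAccess`. If you want the role created in code to also carry `AWSLambdaBasicExecutionRole`, a loop such as the following sketch could attach both. `attach_managed_policies` is a hypothetical helper; it assumes `iam_client` is a boto3 IAM client like the `iam` object used throughout this notebook.

```python
# AWS managed policies named in the console steps for the Lambda role.
LAMBDA_ROLE_POLICY_ARNS = [
    "arn:aws:iam::aws:policy/AmazonSageMakerFullAccess",
    "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole",
]


def attach_managed_policies(iam_client, role_name, policy_arns):
    """Attach each policy ARN to the role and return the list that was attached."""
    attached = []
    for policy_arn in policy_arns:
        iam_client.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn)
        attached.append(policy_arn)
    return attached


# Example usage with the names defined in this notebook (requires AWS credentials):
# attach_managed_policies(iam, lambda_role_name, LAMBDA_ROLE_POLICY_ARNS)
```

The same helper could replace the four separate `attach_role_policy` calls made for the notebook role earlier in this section.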

In [22]:
# ARN of the Lambda role created above (named LambdaExecutionRole in the code cell,
# query_training_status-role in the console steps)
lambda_role = create_role_response['Role']['Arn']

In [23]:
lambda_role

'arn:aws:iam::667350535149:role/LambdaExecutionRole'

In [24]:
%store notebook_role 
%store stepfunction_exec_role_arn
%store glue_role
%store lambda_role 




Stored 'notebook_role' (str)
Stored 'stepfunction_exec_role_arn' (str)
Stored 'glue_role' (str)
Stored 'lambda_role' (str)


---