# SageMaker Bring Your Own Algorithm Container

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Test-on-Local-Machine" data-toc-modified-id="Test-on-Local-Machine-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Test on Local Machine</a></span><ul class="toc-item"><li><span><a href="#Build-Docker-Image" data-toc-modified-id="Build-Docker-Image-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Build Docker Image</a></span></li><li><span><a href="#Local-Test" data-toc-modified-id="Local-Test-1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>Local Test</a></span><ul class="toc-item"><li><span><a href="#train_local.sh" data-toc-modified-id="train_local.sh-1.2.1"><span class="toc-item-num">1.2.1&nbsp;&nbsp;</span><code>train_local.sh</code></a></span></li><li><span><a href="#serve_local.sh" data-toc-modified-id="serve_local.sh-1.2.2"><span class="toc-item-num">1.2.2&nbsp;&nbsp;</span><code>serve_local.sh</code></a></span></li><li><span><a href="#predict.sh" data-toc-modified-id="predict.sh-1.2.3"><span class="toc-item-num">1.2.3&nbsp;&nbsp;</span><code>predict.sh</code></a></span></li></ul></li><li><span><a href="#Publish-Image-to-ECR" data-toc-modified-id="Publish-Image-to-ECR-1.3"><span class="toc-item-num">1.3&nbsp;&nbsp;</span>Publish Image to ECR</a></span><ul class="toc-item"><li><span><a href="#Manual-Steps" data-toc-modified-id="Manual-Steps-1.3.1"><span class="toc-item-num">1.3.1&nbsp;&nbsp;</span>Manual Steps</a></span></li></ul></li></ul></li><li><span><a href="#Train-Model-in-SageMaker" data-toc-modified-id="Train-Model-in-SageMaker-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Train Model in SageMaker</a></span><ul class="toc-item"><li><span><a href="#Setup-the-environment" data-toc-modified-id="Setup-the-environment-2.1"><span class="toc-item-num">2.1&nbsp;&nbsp;</span>Setup the environment</a></span><ul class="toc-item"><li><span><a href="#Copy-Files-to-S3" data-toc-modified-id="Copy-Files-to-S3-2.1.1"><span class="toc-item-num">2.1.1&nbsp;&nbsp;</span>Copy Files to 
S3</a></span></li><li><span><a href="#Create-IAM-Role" data-toc-modified-id="Create-IAM-Role-2.1.2"><span class="toc-item-num">2.1.2&nbsp;&nbsp;</span>Create IAM Role</a></span></li><li><span><a href="#Setup-SageMaker-Notebook-Instance" data-toc-modified-id="Setup-SageMaker-Notebook-Instance-2.1.3"><span class="toc-item-num">2.1.3&nbsp;&nbsp;</span>Setup SageMaker Notebook Instance</a></span></li></ul></li><li><span><a href="#Initialize-Variables" data-toc-modified-id="Initialize-Variables-2.2"><span class="toc-item-num">2.2&nbsp;&nbsp;</span>Initialize Variables</a></span></li><li><span><a href="#Create-an-Estimator-and-Fit-the-Model" data-toc-modified-id="Create-an-Estimator-and-Fit-the-Model-2.3"><span class="toc-item-num">2.3&nbsp;&nbsp;</span>Create an Estimator and Fit the Model</a></span></li></ul></li><li><span><a href="#Create-Endpoint-in-SageMaker" data-toc-modified-id="Create-Endpoint-in-SageMaker-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Create Endpoint in SageMaker</a></span><ul class="toc-item"><li><span><a href="#Test-by-Using-Predictor" data-toc-modified-id="Test-by-Using-Predictor-3.1"><span class="toc-item-num">3.1&nbsp;&nbsp;</span>Test by Using Predictor</a></span></li><li><span><a href="#Test-by-Invoking-Endpoint" data-toc-modified-id="Test-by-Invoking-Endpoint-3.2"><span class="toc-item-num">3.2&nbsp;&nbsp;</span>Test by Invoking Endpoint</a></span></li><li><span><a href="#Optional-Cleanup" data-toc-modified-id="Optional-Cleanup-3.3"><span class="toc-item-num">3.3&nbsp;&nbsp;</span>Optional Cleanup</a></span></li></ul></li><li><span><a href="#Run-Batch-Transform-Job" data-toc-modified-id="Run-Batch-Transform-Job-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Run Batch Transform Job</a></span><ul class="toc-item"><li><span><a href="#Create-a-Transform-Job" data-toc-modified-id="Create-a-Transform-Job-4.1"><span class="toc-item-num">4.1&nbsp;&nbsp;</span>Create a Transform Job</a></span></li><li><span><a href="#View-Output" 
data-toc-modified-id="View-Output-4.2"><span class="toc-item-num">4.2&nbsp;&nbsp;</span>View Output</a></span></li></ul></li></ul></div>

## Test on Local Machine

### Build Docker Image

Build the image using the Dockerfile in the `container` folder.

```sh
cd container
docker build -t sagemaker-bring-your-own . 
```

### Local Test

To test the algorithm and the Docker image, use the three shell scripts in the **`test`** folder. These scripts build the image and run it in a container to train and test the model, mounting a directory structure that mimics production.

#### `train_local.sh`

- Run the script with the name of the image.
- The script maps the `test_dir` folder to the `/opt/ml` folder in the container.
- Place the test data in `test_dir/input/data`.
- (Optional) Modify `test_dir/input/config/hyperparameters.json` to set the hyperparameters that you want to test (as strings).
- The trained model is saved to the `test_dir/models` folder.

```sh
./train_local.sh sagemaker-bring-your-own
```
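The input side of the mounted folder can also be prepared from Python. The following is a minimal sketch, not part of the repository's scripts: it creates the `input/data` and `input/config` folders described above under a temporary directory and writes a `hyperparameters.json` whose values are all strings. The hyperparameter names (`max_depth`, `learning_rate`) and the helper functions are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def make_test_dir(root, hyperparameters):
    """Create the input folders that the container sees under /opt/ml."""
    root = Path(root)
    (root / "input" / "data").mkdir(parents=True, exist_ok=True)
    config_dir = root / "input" / "config"
    config_dir.mkdir(parents=True, exist_ok=True)
    # SageMaker hands hyperparameters to the container as strings,
    # so the local file uses strings too.
    as_strings = {name: str(value) for name, value in hyperparameters.items()}
    config_file = config_dir / "hyperparameters.json"
    config_file.write_text(json.dumps(as_strings, indent=2))
    return config_file


def load_hyperparameters(config_file):
    """Read the file back the way a training script would."""
    return json.loads(Path(config_file).read_text())


with tempfile.TemporaryDirectory() as tmp:
    path = make_test_dir(tmp, {"max_depth": 5, "learning_rate": 0.1})
    params = load_hyperparameters(path)
    # The training code has to convert the strings back to numbers itself.
    max_depth = int(params["max_depth"])
    learning_rate = float(params["learning_rate"])
    print(params, max_depth, learning_rate)
```

Because every value arrives as a string, the training code is the place to cast each hyperparameter to the type it needs.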

#### `serve_local.sh`

- Run this script with the name of the image to serve the model after it has been trained.

```sh
./serve_local.sh sagemaker-bring-your-own
```

#### `predict.sh`

- Run this script with the name of a payload file and (optionally) the HTTP content type you want. The content type defaults to `text/csv`. For example:

```sh
./predict.sh test/data/payload.csv text/csv
```

- Alternatively, run the following `curl` command to test the prediction. Use the full path to the payload file in the command.

```sh
curl --data-binary @test/data/payload.csv -H "Content-Type: text/csv" -v http://localhost:8080/invocations
```
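The same request can be sent from Python. The sketch below is an illustration under assumptions: `StubHandler` is a hypothetical stand-in for the container that only counts the rows it receives, so that the example runs without Docker. Only the `invoke` function corresponds to the `curl` command above; against the real container you would point it at `http://localhost:8080`.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class StubHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the container: replies with the row count."""

    def do_GET(self):
        # Health check: an empty 200 response on /ping.
        self.send_response(200 if self.path == "/ping" else 404)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("utf-8")
        if self.path != "/invocations":
            self.send_response(404)
            self.send_header("Content-Length", "0")
            self.end_headers()
            return
        reply = f"received {len(body.splitlines())} rows\n".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/csv")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, *args):
        pass  # keep the example output quiet


def invoke(base_url, payload, content_type="text/csv"):
    """POST the payload bytes to /invocations, like the curl command."""
    request = urllib.request.Request(
        f"{base_url}/invocations",
        data=payload,
        headers={"Content-Type": content_type},
        method="POST",
    )
    # An empty ProxyHandler bypasses any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as response:
        return response.status, response.read().decode("utf-8")


# Port 0 lets the OS pick a free port for the stub.
server = HTTPServer(("127.0.0.1", 0), StubHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    url = f"http://127.0.0.1:{server.server_port}"
    status, text = invoke(url, b"5.0,3.5,1.3,0.3\n6.8,3.2,5.9,2.3\n")
    print(status, text.strip())
finally:
    server.shutdown()
    server.server_close()
```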

### Publish Image to ECR

Run the `build_and_push.sh <IMAGE_NAME>` script in the `container` folder.

```sh
./build_and_push.sh sagemaker-bring-your-own
```


#### Manual Steps

For debugging purposes, you can also run the following commands one by one.

1. With AWS CLI v2, log in to Amazon ECR.

```sh
aws ecr get-login-password --region ap-southeast-1 | docker login --username AWS --password-stdin <ACCOUNT_ID>.dkr.ecr.ap-southeast-1.amazonaws.com
```

2. Tag the local image with the full ECR image name. The first argument must match the name of the image you built locally.

```sh
docker tag sagemaker_bring_your_own <ACCOUNT_ID>.dkr.ecr.ap-southeast-1.amazonaws.com/sagemaker_bring_your_own:latest
```

3. Push the image to ECR.

```sh
docker push <ACCOUNT_ID>.dkr.ecr.ap-southeast-1.amazonaws.com/sagemaker_bring_your_own:latest
```
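The full ECR image name used in the commands above follows a fixed pattern. As a small sketch, the hypothetical helper below assembles it from its parts; the account ID `123456789012` is a placeholder.

```python
def ecr_image_uri(account_id, region, image_name, tag="latest"):
    """Build the full ECR image name used by `docker tag` and `docker push`."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{image_name}:{tag}"


# Placeholder account ID; substitute your own.
uri = ecr_image_uri("123456789012", "ap-southeast-1", "sagemaker_bring_your_own")
print(uri)
```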

## Train Model in SageMaker

After testing locally, use SageMaker to train the model, then use the trained model for hosting or batch transform.

### Setup the environment

#### Copy Files to S3

- Copy the `test/test_dir/input` folder into the working S3 bucket, e.g. `s3://temp-305326993135/sagemaker_bring_your_own/`.

```sh
aws s3 cp test/test_dir/input s3://temp-305326993135/sagemaker_bring_your_own/input --recursive
```

- Copy the `test/data` folder into the working S3 bucket, e.g. `s3://temp-305326993135/sagemaker_bring_your_own/`.

```sh
aws s3 cp test/data s3://temp-305326993135/sagemaker_bring_your_own/data --recursive
```
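A recursive copy keeps each file's path relative to the source folder and appends it to the destination prefix. The sketch below only illustrates that mapping and uploads nothing; the helper name and the file names (`hyperparameters.json`, `train/iris.csv`) are assumptions made for the example.

```python
import tempfile
from pathlib import Path


def plan_recursive_upload(local_dir, s3_prefix):
    """Return sorted (local_path, s3_key) pairs for every file under local_dir."""
    local_dir = Path(local_dir)
    prefix = s3_prefix.strip("/")
    pairs = []
    for path in sorted(local_dir.rglob("*")):
        if path.is_file():
            # S3 keys always use forward slashes, whatever the local OS uses.
            relative = path.relative_to(local_dir).as_posix()
            pairs.append((path, f"{prefix}/{relative}"))
    return pairs


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "input"
    (root / "config").mkdir(parents=True)
    (root / "data" / "train").mkdir(parents=True)
    (root / "config" / "hyperparameters.json").write_text("{}")
    (root / "data" / "train" / "iris.csv").write_text("5.0,3.5,1.3,0.3\n")
    keys = [key for _, key in plan_recursive_upload(root, "sagemaker_bring_your_own/input")]
    print(keys)
```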

#### Create IAM Role

Create an IAM role, e.g. `u-SageMakerExecutionRole`, with SageMaker as the trusted entity. Attach the following policy together with the `AmazonSageMakerFullAccess` managed policy.

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "cloudwatch:PutMetricData",
                "logs:CreateLogStream",
                "logs:PutLogEvents",
                "logs:CreateLogGroup",
                "logs:DescribeLogStreams",
                "s3:GetObject",
                "s3:PutObject",
                "s3:ListBucket",
                "ecr:GetAuthorizationToken",
                "ecr:BatchCheckLayerAvailability",
                "ecr:GetDownloadUrlForLayer",
                "ecr:BatchGetImage"
            ],
            "Resource": "*",
            "Effect": "Allow"
        }
    ]
}
```
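The policy above covers permissions only; the trusted entity is set in the role's separate trust relationship. As a sketch, the snippet below builds a trust policy that allows the SageMaker service to assume the role and prints it as JSON, so it can be pasted into the IAM console.

```python
import json

# Trust relationship that lets the SageMaker service assume the role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "sagemaker.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

trust_policy_json = json.dumps(trust_policy, indent=4)
print(trust_policy_json)
```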

#### Setup SageMaker Notebook Instance

- Create a SageMaker notebook instance.
- Start Jupyter from this instance.
- Create a folder with the same name as the Docker image, e.g. `sagemaker-bring-your-own`, and `cd` into it.
- Upload this Jupyter notebook file into that folder.
- Upload the `payload.csv` file (in the `test/data` folder) into that folder.

### Initialize Variables

- Import libraries
- Get SageMaker execution role
- Get current AWS region

```python
import boto3
import re
import os
import numpy as np
import pandas as pd
import sagemaker
import json
from time import gmtime, strftime

role = sagemaker.get_execution_role()
account_id = boto3.client('sts').get_caller_identity().get('Account')
region = boto3.Session().region_name
```

- Set up the S3 paths for the input training data and the output model artifacts

```python
s3_client = boto3.client('s3')

IMAGE_NAME = 'sagemaker_bring_your_own'

# Where the training data is located
input_bucket = 'temp-305326993135'
input_data_prefix = f'{IMAGE_NAME}/input/data'
input_config_prefix = f'{IMAGE_NAME}/input/config'

# Where to save code and model artifacts
# output_bucket = sagemaker.Session().default_bucket()
# TEST
output_bucket = 'temp-305326993135'
output_prefix = f'{IMAGE_NAME}'
```

- Read the hyperparameters for the model from S3

```python
hyperparameters = {}
hyper_param_file = f'{input_config_prefix}/hyperparameters.json'
try:
    response = s3_client.get_object(Bucket=input_bucket, Key=hyper_param_file)
    hyperparameters = json.loads(response['Body'].read())
except Exception as ex:
    # Continue with no hyperparameters, but report why the file could not be read
    print(f'Could not load s3://{input_bucket}/{hyper_param_file}: {ex}')
```

### Create an Estimator and Fit the Model

To train our algorithm with SageMaker, we create an `Estimator` that defines how to use the container for training. It includes the configuration we need to invoke SageMaker training:

* The __container name__, constructed as in the shell commands above.
* The __role__, as defined above.
* The __instance count__, which is the number of machines to use for training.
* The __instance type__, which is the type of machine to use for training.
* The __output path__, which determines where the model artifact will be written.
* The __session__, which is the SageMaker session object.

Then we call `fit()` on the estimator to train against the data that we uploaded above.

```python
full_image_name = f"{account_id}.dkr.ecr.{region}.amazonaws.com/{IMAGE_NAME}:latest"

model = sagemaker.estimator.Estimator(
    base_job_name=IMAGE_NAME.replace('_', '-'),
    image_uri=full_image_name,
    role=role,
    instance_count=1,
    instance_type="ml.c4.2xlarge",
    volume_size=5,  # GB
    output_path=f"s3://{output_bucket}/{output_prefix}/output",
    sagemaker_session=sagemaker.Session()
)

# Set hyperparameters for the model training
model.set_hyperparameters(**hyperparameters)

# Specify the S3 folder which contains the training data and its content type
train_input = sagemaker.TrainingInput(s3_data=f's3://{input_bucket}/{input_data_prefix}/train/',
                                      content_type='text/csv')

model.fit({'train': train_input}, wait=True)
```


```
2022-01-18 07:53:46 Starting - Starting the training job...
2022-01-18 07:54:09 Starting - Launching requested ML instances
ProfilerReport-1642492425: InProgress
...
2022-01-18 07:54:44 Starting - Preparing the instances for training.........
2022-01-18 07:56:10 Downloading - Downloading input data
2022-01-18 07:56:10 Training - Downloading the training image..
Rows in training data 150
Starting the training.
Training complete.

2022-01-18 07:56:46 Uploading - Uploading generated training model
2022-01-18 07:57:10 Completed - Training job completed
Training seconds: 61
Billable seconds: 61
```
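
The estimator's `base_job_name` replaces underscores with hyphens because SageMaker job names accept only alphanumeric characters and hyphens. The sketch below checks a name against the pattern documented for training job names; `to_base_job_name` is a hypothetical helper that generalizes the `replace('_', '-')` call.

```python
import re

# Pattern documented for SageMaker training job names (1 to 63 characters).
JOB_NAME_PATTERN = re.compile(r"^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}$")


def to_base_job_name(image_name):
    """Turn an image name into a prefix that is valid inside a job name."""
    return re.sub(r"[^a-zA-Z0-9]+", "-", image_name).strip("-")


image_name = "sagemaker_bring_your_own"
base_job_name = to_base_job_name(image_name)
print(base_job_name)
print(bool(JOB_NAME_PATTERN.match(image_name)))     # underscores are rejected
print(bool(JOB_NAME_PATTERN.match(base_job_name)))  # hyphens are accepted
```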


## Create Endpoint in SageMaker
You can use the trained model to get real-time predictions through an HTTP endpoint. The following steps walk you through the process.

Deploying the model to SageMaker hosting just requires a `deploy` call on the fitted model. This call takes an instance count, instance type, and optionally serializer and deserializer functions. These are used when the resulting predictor is created on the endpoint.

```python
from sagemaker.deserializers import JSONDeserializer
from sagemaker.serializers import CSVSerializer

predictor = model.deploy(initial_instance_count=1,
                         instance_type="ml.m4.xlarge",
                         serializer=CSVSerializer())
```

```
----!
```

```python
endpoint_name = predictor.endpoint_name
print(f'Endpoint: {endpoint_name}')
```

```
Endpoint: sagemaker-bring-your-own-2022-01-18-08-36-25-531
```


### Test by Using Predictor

In order to do some predictions, we'll extract some of the data we used for training and do predictions against it. This is, of course, bad statistical practice, but a good way to see how the mechanism works.

```python
test_data = pd.read_csv("payload.csv", header=None)
test_data.sample(3)
```

```
      0    1    2    3
0   5.0  3.5  1.3  0.3
23  6.8  3.2  5.9  2.3
21  6.9  3.1  5.1  2.3
```


Prediction is as easy as calling predict with the predictor we got back from deploy and the data we want to do predictions with. The serializers take care of doing the data conversions for us.

```python
result = predictor.predict(data=test_data.values).decode('utf-8')
result = result.split()
print(result)
```

```
['setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica', 'virginica']
```
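
To make the data conversion concrete, the sketch below shows roughly what happens on each side of the call: rows are written as CSV text for the request body, and the raw response bytes are decoded and split into one label per row. It is an illustration only, not the SDK's implementation, and the response bytes are made up for the example.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize a sequence of rows into a CSV request body."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


def parse_labels(response_body):
    """Decode the raw response bytes and split them into one label per row."""
    return response_body.decode("utf-8").split()


body = rows_to_csv([[5.0, 3.5, 1.3, 0.3], [6.8, 3.2, 5.9, 2.3]])
print(repr(body))
# Hypothetical response bytes, shaped like the endpoint output.
labels = parse_labels(b"setosa\nvirginica\n")
print(labels)
```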


### Test by Invoking Endpoint

Load test data, e.g. `payload.csv` file.

```python
with open('payload.csv') as f:
    payload = f.read()
```

Invoke endpoint and print result.

```python
# Invoke the endpoint with the CSV payload
sagemaker_runtime = boto3.client("sagemaker-runtime", region_name=region)
response = sagemaker_runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType='text/csv',
    Body=payload,
)
```

```python
# Unpack the response: the container returns plain text, one predicted label per line
result = response['Body'].read().decode('utf-8').split()
print(result)
```

### Optional Cleanup
When you're done with the endpoint, you'll want to clean it up.

```python
sagemaker.Session().delete_endpoint(predictor.endpoint_name)
```

## Run Batch Transform Job
You can use a trained model to get inference on large data sets by using [Amazon SageMaker Batch Transform](https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-batch.html). A batch transform job takes your input data S3 location and outputs the predictions to the specified S3 output folder. Similar to hosting, you can extract inferences for training data to test batch transform.

### Create a Transform Job
We'll create a `Transformer` that defines how to use the container to get inference results on a data set. This includes the configuration we need to invoke SageMaker batch transform:

* The __instance count__ which is the number of machines to use to extract inferences
* The __instance type__ which is the type of machine to use to extract inferences
* The __output path__ determines where the inference results will be written

```python
transform_bucket = 'temp-305326993135'
transform_prefix = f'{IMAGE_NAME}/data'
transform_output_path = f's3://{transform_bucket}/{transform_prefix}'
transformer = model.transformer(
    instance_count=1,
    instance_type="ml.m4.xlarge",
    assemble_with="Line",
    accept="text/csv",
    output_path=transform_output_path,
)
```

We call `transform()` on the transformer to get inference results for the data that we uploaded. You can use these options when invoking the transformer:

* The __data__ argument, which is the S3 location of the input data
* The __content_type__, which is the content type set when making the HTTP request to the container to get predictions
* The __split_type__, which is the delimiter used for splitting the input data
* The __input_filter__, which here (`$[1:]`) drops the first column (ID) of the input before the HTTP request is made to the container

```python
transform_input = f's3://{transform_bucket}/{transform_prefix}/iris.csv'
transformer.transform(
    data=transform_input, content_type="text/csv", split_type="Line", input_filter="$[1:]"
)
transformer.wait()
```

```
...........................
Starting the inference server with 4 workers.
[2022-01-18 08:57:32 +0000] [10] [INFO] Starting gunicorn 20.1.0
[2022-01-18 08:57:32 +0000] [10] [INFO] Listening at: unix:/tmp/gunicorn.sock (10)
[2022-01-18 08:57:32 +0000] [10] [INFO] Using worker: sync
[2022-01-18 08:57:32 +0000] [14] [INFO] Booting worker with pid: 14
[2022-01-18 08:57:32 +0000] [15] [INFO] Booting worker with pid: 15
[2022-01-18 08:57:32 +0000] [17] [INFO] Booting worker with pid: 17
[2022-01-18 08:57:32 +0000] [18] [INFO] Booting worker with pid: 18
Load model from /opt/ml/model/model.pkl
169.254.255.130 - - [18/Jan/2022:08:57:38 +0000] "GET /ping HTTP/1.1" 200 2 "-" "Go-http-client/1.1"
169.254.255.130 - - [18/Jan/2022:08:57:38 +0000] "GET /execution-parameters HTTP/1.1" 404 2 "-" "Go-http-client/1.1"
Invoked with 150 records
Perform prediction on 150 rows of input
16
```

For more information on the configuration options, see [CreateTransformJob API](https://docs.aws.amazon.com/sagemaker/latest/dg/API_CreateTransformJob.html)
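
To illustrate what `split_type="Line"` and `input_filter="$[1:]"` do to the input, the sketch below applies the same two steps to a small CSV string in plain Python. It only mimics the behavior for CSV records and is not how SageMaker implements it; the sample rows, with an ID in the first column, are hypothetical.

```python
def split_lines(data):
    """Split the input object into records, one per non-empty line."""
    return [line for line in data.splitlines() if line.strip()]


def drop_first_column(record):
    """Mimic input_filter="$[1:]" for one CSV record: keep columns 1 onward."""
    return ",".join(record.split(",")[1:])


# Hypothetical input with an ID in the first column.
raw = "1,5.0,3.5,1.3,0.3\n2,6.8,3.2,5.9,2.3\n"
records = [drop_first_column(r) for r in split_lines(raw)]
print(records)
```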

### View Output

The output file will be the input file name + `.out`.
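
As a small sketch of that naming rule for a single input file, the hypothetical helper below derives the output location from the input URI and the transformer's output path.

```python
import posixpath


def transform_output_uri(input_uri, output_path):
    """Return the S3 URI where the results for input_uri are written."""
    file_name = posixpath.basename(input_uri)
    return f"{output_path.rstrip('/')}/{file_name}.out"


output_uri = transform_output_uri(
    "s3://temp-305326993135/sagemaker_bring_your_own/data/iris.csv",
    "s3://temp-305326993135/sagemaker_bring_your_own/data",
)
print(output_uri)
```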

```python
s3_client = sagemaker.Session().boto_session.client("s3")
output_file = f"{transform_prefix}/iris.csv.out"

result = []
response = s3_client.list_objects(Bucket=transform_bucket, Prefix=output_file)
# 'Contents' is missing when no object matches the prefix
for o in response.get('Contents', []):
    data = s3_client.get_object(Bucket=transform_bucket, Key=o.get('Key'))
    contents = data['Body'].read()
    result.extend(contents.decode("utf-8").split())

print(result)
```

['setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'virginica', 'versicolor', 'virginica', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor', 'virginica', 'versicolor', 'versicolor'