# sagemaker-demo-notebook

This notebook is an interactive companion to the article. In it we will do the following:

* Build a machine learning model image and store it on ECR, Amazon's container registry service.
* Train a machine learning model based on the image we just pushed.
* Deploy that model to a web endpoint.
* Deploy an arbitrary SageMaker-compliant model artifact to a web endpoint.
* Perform a batch classification job using a SageMaker-compliant model artifact (unfinished?).

You may run this notebook either locally or in an AWS SageMaker instance.

If you are running locally, make sure that the account you are running this notebook under has all of the necessary permissions: `S3ReadOnlyAccess`, `SagemakerFullAccess`, `iam:GetRole`, and `ECRFullAccess`.

If you are running on AWS SageMaker, make sure that the role you pass to the notebook instance has all of these permissions available. Note that the default SageMaker execution context is **not** enough; it has the first two permissions in the list above but not the latter two. You need to attach those permissions to the instance's role yourself.


## Getting the code

We start by downloading the code from [its repository](https://github.com/ResidentMario/quilt-sagemaker-demo) on GitHub.

In [1]:
!rm -rf quilt-sagemaker-demo > /dev/null 2>&1
!git clone https://github.com/ResidentMario/quilt-sagemaker-demo

Cloning into 'quilt-sagemaker-demo'...
remote: Enumerating objects: 108, done.
remote: Counting objects: 100% (108/108), done.
remote: Compressing objects: 100% (72/72), done.
remote: Total 108 (delta 58), reused 82 (delta 32), pack-reused 0
Receiving objects: 100% (108/108), 861.00 KiB | 35.87 MiB/s, done.
Resolving deltas: 100% (58/58), done.


In [2]:
%ls quilt-sagemaker-demo

app.py       health-check-data.csv  sagemaker-demo-notebook.ipynb
build.ipynb  requirements.txt
Dockerfile   run.sh*


The files are:
* `build.ipynb` &mdash; A Jupyter notebook that walks through building and training a model for classifying clothing that is based on the Fashion MNIST dataset.
* `app.py` &mdash; A simple `flask` app that serves a SageMaker-compliant model-as-an-app.
* `health-check-data.csv` &mdash; A small sample dataset used to ping the web service for health checks.
* `Dockerfile` &mdash; A Dockerfile that builds an image suitable for distribution on SageMaker.
* `run.sh` &mdash; The image runtime entrypoint.
* `requirements.txt` &mdash; A list of dependencies necessary for building or running the model (locally or remotely).

...and this notebook.

## Pushing the container

The following shell script, inlined in this notebook, builds the Docker image we've imported and stores it in ECR.

In [3]:
%%sh

# construct the ECR name.
account=$(aws sts get-caller-identity --query Account --output text)
region=$(aws configure get region)
fullname="${account}.dkr.ecr.${region}.amazonaws.com/quiltdata/sagemaker-demo:latest"

# If the repository doesn't exist in ECR, create it.
# The redirection sends stdout to /dev/null and then points stderr at stdout,
# so both streams are discarded. It's just there to silence the error.
aws ecr describe-repositories --repository-names "quiltdata/sagemaker-demo" > /dev/null 2>&1

# Check the exit code: if it's non-zero, the command failed and we assume the repo does not exist.
if [ $? -ne 0 ]
then
    aws ecr create-repository --repository-name "quiltdata/sagemaker-demo" > /dev/null
fi

# Get the login command from ECR and execute it directly
$(aws ecr get-login --region ${region} --no-include-email)

# Build the docker image, tag it with the full name, and push it to ECR
docker build  -t "quiltdata/sagemaker-demo" quilt-sagemaker-demo/
docker tag "quiltdata/sagemaker-demo" ${fullname}

docker push ${fullname}

Login Succeeded
Sending build context to Docker daemon  1.351MB
Step 1/14 : FROM python:3.6
3.6: Pulling from library/python
cd8eada9c7bb: Pulling fs layer
c2677faec825: Pulling fs layer
fcce419a96b1: Pulling fs layer
045b51e26e75: Pulling fs layer
3b969ad6f147: Pulling fs layer
6992ba8c827e: Pulling fs layer
15cdf2df3fc4: Pulling fs layer
2929c9fb25e5: Pulling fs layer
bf6c76496fdd: Pulling fs layer
045b51e26e75: Waiting
3b969ad6f147: Waiting
6992ba8c827e: Waiting
15cdf2df3fc4: Waiting
2929c9fb25e5: Waiting
bf6c76496fdd: Waiting
fcce419a96b1: Verifying Checksum
fcce419a96b1: Download complete
c2677faec825: Verifying Checksum
c2677faec825: Download complete
cd8eada9c7bb: Verifying Checksum
cd8eada9c7bb: Download complete
6992ba8c827e: Verifying Checksum
6992ba8c827e: Download complete
15cdf2df3fc4: Verifying Checksum
15cdf2df3fc4: Download complete
045b51e26e75: Verifying Checksum
045b51e26e75: Download complete
2929c9fb25e5: Verifying Checksum
2929c9fb25e5: Download complete
bf6c764

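The shell script above treats any failure of `describe-repositories` as "the repository does not exist". As a minimal sketch, the same create-if-missing step can be written in Python so that only the "not found" error triggers creation. The helper name `ensure_ecr_repository` is hypothetical; it assumes it is handed a boto3 ECR client (`boto3.client("ecr")`), whose `describe_repositories` and `create_repository` calls it uses.

```python
def ensure_ecr_repository(ecr_client, name):
    """Create the ECR repository `name` if it is missing.

    Returns True if the repository was created, False if it already existed.
    `ecr_client` is assumed to be a boto3 ECR client.
    """
    try:
        ecr_client.describe_repositories(repositoryNames=[name])
        return False
    except ecr_client.exceptions.RepositoryNotFoundException:
        # Only a "not found" error leads to creation; permission or
        # network errors propagate instead of being silently swallowed.
        ecr_client.create_repository(repositoryName=name)
        return True


# Usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   ensure_ecr_repository(boto3.client("ecr"), "quiltdata/sagemaker-demo")
```

Passing the client in as an argument keeps the helper easy to exercise with a stub object, without touching a real AWS account.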



## Training a model

We use a `sagemaker.estimator.Estimator` object to perform model training.

Note that the `Estimator` object is parameterized with the image URI (the image's location in ECR), a role and session (passed down from the role executing this notebook instance), an instance count and instance type, and an output path.

The `output_path` argument is an interesting case. The algorithms that come packaged with SageMaker write a `*.tar.gz` model artifact to an S3 bucket by default, and the presence of this argument encourages you to follow the same design pattern when using a custom image.

Our image serializes model objects itself instead of relying on SageMaker to do it for us, which makes this argument redundant. However, it is unwise to omit it: if you do, SageMaker will automatically create a fresh run-dependent bucket for you.

**User note**: change `output_path` in the `Estimator` code cell below to an S3 bucket that you own, or to a bucket name that hasn't been claimed yet.

In [4]:
import boto3
import re

import os
import numpy as np
import pandas as pd

from sagemaker import get_execution_role
import sagemaker as sage

In [9]:
# This line of code requires the additional iam:GetRole permission.
role = get_execution_role()

sess = sage.Session()

account = sess.boto_session.client('sts').get_caller_identity()['Account']
region = sess.boto_session.region_name
image = '{}.dkr.ecr.{}.amazonaws.com/quiltdata/sagemaker-demo'.format(account, region)
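The image name built above follows the ECR pattern `<account>.dkr.ecr.<region>.amazonaws.com/<repository>`. As a minimal sketch, the same construction can be wrapped in a small helper. `ecr_image_uri` and its `tag` parameter are hypothetical names introduced here, not part of the SageMaker SDK, and the sketch assumes the standard `amazonaws.com` domain.

```python
def ecr_image_uri(account, region, repository, tag=None):
    """Build an ECR image URI; when `tag` is omitted, no tag suffix is added."""
    uri = "{}.dkr.ecr.{}.amazonaws.com/{}".format(account, region, repository)
    return "{}:{}".format(uri, tag) if tag else uri


# Example with a made-up 12-digit account ID.
print(ecr_image_uri("123456789012", "us-east-1", "quiltdata/sagemaker-demo"))
print(ecr_image_uri("123456789012", "us-east-1", "quiltdata/sagemaker-demo", tag="latest"))
```

An untagged image reference is resolved by Docker as `:latest`, which matches the tag the shell script pushes.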

Once the model is defined, training is performed via `Estimator.fit`, mimicking the `scikit-learn` API.

In [6]:
clf = sage.estimator.Estimator(image,
                               role, 1, 'ml.c4.2xlarge',
                               output_path="s3://quilt-example/quilt/quilt_sagemaker_demo/model",
                               sagemaker_session=sess)

clf.fit()

INFO:sagemaker:Creating training-job with name: sagemaker-demo-2019-01-18-22-01-23-156


2019-01-18 22:01:23 Starting - Starting the training job...
2019-01-18 22:01:31 Starting - Launching requested ML instances......
2019-01-18 22:02:37 Starting - Preparing the instances for training......
2019-01-18 22:03:47 Downloading - Downloading input data
2019-01-18 22:03:47 Training - Downloading the training image.....
[NbConvertApp] Converting notebook build.ipynb to notebook
[NbConvertApp] Executing notebook with kernel: python3

2019-01-18 22:04:29 Training - Training image download completed. Training in progress.
2019-01-18 22:04:58.244211: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
[NbConvertApp] Writing 401021 bytes to build.ipynb

2019-01-18 22:05:31 Uploading - Uploading generated training model
2019-01-18 22:05:31 Completed - Training job completed
Billable seconds: 111


Running this code block trains our model and deposits it in a `clf.tar.gz` file in an S3 bucket.
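The `model_data` path used in the pre-trained model section of this notebook suggests that SageMaker stores the training job's own artifact under `<output_path>/<job name>/output/model.tar.gz`. A minimal sketch of that layout, assuming it holds for your job; `model_artifact_uri` is a hypothetical helper, not an SDK function:

```python
def model_artifact_uri(output_path, job_name):
    """Compose the expected S3 location of a training job's model artifact."""
    return "{}/{}/output/model.tar.gz".format(output_path.rstrip("/"), job_name)


# Values taken from the training run logged above.
print(model_artifact_uri(
    "s3://quilt-example/quilt/quilt_sagemaker_demo/model",
    "sagemaker-demo-2019-01-18-22-01-23-156",
))
```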

## Deploying a model

### Deploy a fitted model as an endpoint

In [17]:
from sagemaker.predictor import csv_serializer
predictor = clf.deploy(1, 'ml.m4.xlarge', serializer=csv_serializer)

INFO:sagemaker:Creating model with name: sagemaker-demo-2019-01-18-22-11-52-395
INFO:sagemaker:Creating endpoint with name sagemaker-demo-2019-01-18-22-01-23-156


----------------------------------------------------------------------------!

In [30]:
# This fails because it lacks an authentication token.
# It might be possible to reconstruct the actual POST request being made.
# predictor.sagemaker_session.boto_session.get_credentials().token
# But the AWS docs are unclear about what name this header has.

# !curl -X "POST" -H "Content-Type: text/csv" -d @health-check-data.csv URI
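A plain `curl` request fails because SageMaker endpoints expect requests signed with AWS Signature Version 4 rather than a single token header. One way around reconstructing the signature by hand is to let boto3 sign the request through its `sagemaker-runtime` client. The sketch below assumes such a client is passed in; `invoke_csv` is a hypothetical helper name.

```python
def invoke_csv(runtime_client, endpoint_name, csv_payload):
    """POST a CSV string to a SageMaker endpoint and return the response text.

    `runtime_client` is assumed to be boto3.client("sagemaker-runtime"),
    which signs the request with the caller's AWS credentials.
    """
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=csv_payload.encode("utf-8"),
    )
    # The response body is a stream; read it fully and decode it as text.
    return response["Body"].read().decode("utf-8")


# Usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   with open("health-check-data.csv") as f:
#       print(invoke_csv(boto3.client("sagemaker-runtime"), predictor.endpoint, f.read()))
```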

In [61]:
X_test = pd.read_csv("./fashion-mnist_train.csv").head().iloc[:, 1:].values

In [18]:
sess.delete_endpoint(predictor.endpoint)

INFO:sagemaker:Deleting endpoint with name: sagemaker-demo-2019-01-18-22-01-23-156


### Deploy a pre-trained model artifact as an endpoint

In [65]:
from sagemaker import Model

In [9]:
model = Model(
    model_data='s3://quilt-example/quilt/quilt_sagemaker_demo/model/sagemaker-demo-2019-01-18-22-01-23-156/output/model.tar.gz',
    image=image,
    role=role,
    sagemaker_session=sess
)
model.deploy(1, 'ml.c4.2xlarge')

In [8]:
predictor = sage.predictor.RealTimePredictor(
    'sagemaker-demo-2019-01-18-22-48-00-247', 
    sagemaker_session=sess, 
    content_type="text/csv")

In [7]:
inp = "\n".join([",".join(l) for l in X_test.astype('str').tolist()])
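The cell above turns the array into the `text/csv` request body: one sample per line, comma-separated values, no header row. A minimal sketch of the same transformation on plain Python lists; `rows_to_csv` is a hypothetical helper name, and the pixel values are made up.

```python
def rows_to_csv(rows):
    """Join a 2-D sequence of values into a header-less CSV string, one row per line."""
    return "\n".join(",".join(str(value) for value in row) for row in rows)


# Two tiny fake "images" of three pixels each.
payload = rows_to_csv([[0, 12, 255], [7, 0, 3]])
print(payload)
```

For the real data, `rows_to_csv(X_test.tolist())` should produce the same kind of string as the one-liner above.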

In [6]:
response = predictor.predict(inp)

In [5]:
response

### Use a model artifact to perform a batch prediction run

A batch transform requires an existing SageMaker model, which the `Transformer` references by name.

In [4]:
transformer = sage.transformer.Transformer(
    base_transform_job_name='Batch-Transform',
    model_name='sagemaker-demo-2019-01-18-22-48-00-247',  # take this from a past training session
    instance_count=1,
    instance_type='ml.c4.xlarge',
    output_path='s3://quilt-example/quilt/quilt_sagemaker_demo/model',
    sagemaker_session=sess
)

In [3]:
# start the job
# note: this requires that the input data be in exactly the format expected by the model!
transformer.transform(
    's3://alpha-quilt-storage/aleksey/fashion_mnist/fashion-mnist_train.csv', 
    content_type='text/csv', 
    split_type='Line'
)

# wait until transform job is completed
transformer.wait()
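By default, a batch transform job writes one output object per input object under `output_path`, named after the input file with `.out` appended. A minimal sketch for predicting where the results of the job above should land, assuming that default naming; `transform_output_uri` is a hypothetical helper name.

```python
import posixpath


def transform_output_uri(output_path, input_uri):
    """Compose the expected S3 URI of a batch transform result for one input object."""
    filename = posixpath.basename(input_uri)
    return "{}/{}.out".format(output_path.rstrip("/"), filename)


print(transform_output_uri(
    "s3://quilt-example/quilt/quilt_sagemaker_demo/model",
    "s3://alpha-quilt-storage/aleksey/fashion_mnist/fashion-mnist_train.csv",
))
```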

In [1]:
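import boto3
# download_file is a method of the low-level S3 client, not of boto3.resource('s3').
s3_client = boto3.client('s3')

In [2]:
# Arguments are (bucket name, object key, local filename); the bucket is given without the s3:// prefix.
# The key is left elided as in the original; 'transform-output.out' is a placeholder local filename.
s3_client.download_file('quilt-example', 'quilt_sagemaker_demo/model/[...]', 'transform-output.out')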
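boto3's `download_file` takes the bucket name and the object key as separate arguments rather than a single `s3://` URI. A minimal sketch of splitting a URI into those two parts with the standard library; `split_s3_uri` is a hypothetical helper name.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/some/key' into ('bucket', 'some/key')."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("not an s3:// URI: {!r}".format(uri))
    return parsed.netloc, parsed.path.lstrip("/")


bucket, key = split_s3_uri("s3://quilt-example/quilt/quilt_sagemaker_demo/model")
print(bucket)
print(key)
```

The resulting pair can then be passed as the first two arguments of `download_file`, followed by a local filename.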