# Building a Docker container for training/deploying our classifier
In this exercise we'll create a Docker image that contains the code required to train and deploy an ML model. In this example, we'll use scikit-learn (https://scikit-learn.org/) and its **Random Forest** implementation to train a flower classifier. The dataset used in this experiment is a toy dataset called Iris (http://archive.ics.uci.edu/ml/datasets/iris). The challenge itself is very basic, so you can focus on the mechanics and the features of this automated environment.

At the end of this exercise, a first pipeline will run automatically. It takes the assets you push to a Git repo, builds this image, and pushes it to ECR, the Docker image registry used by SageMaker.

> **Question**: Why would I create a scikit-learn container from scratch if SageMaker already offers one (https://docs.aws.amazon.com/sagemaker/latest/dg/sklearn.html)?  
> **Answer**: This is an exercise and the idea here is also to show you how you can create your own container. In a real-life scenario, the best approach is to use the native container offered by SageMaker.

In [1]:
import boto3
import sagemaker
from sagemaker import get_execution_role

ecr_repository_name = 'iris-model'
role = get_execution_role()
account_id = role.split(':')[4]
region = boto3.Session().region_name
sagemaker_session = sagemaker.session.Session()
bucket = sagemaker_session.default_bucket()

print('ecr_repository_name:', ecr_repository_name)
print('account_id:',account_id)
print('region:',region)
print('role:',role)
print('bucket:',bucket)

ecr_repository_name: iris-model
account_id: 725879053979
region: us-east-1
role: arn:aws:iam::725879053979:role/MLOps
bucket: sagemaker-us-east-1-725879053979
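
The expression `role.split(':')[4]` in the cell above works because an ARN has a fixed, colon-separated layout: `arn:partition:service:region:account-id:resource`. The sketch below makes that layout explicit. The `Arn` tuple and the `parse_arn` helper are illustrative names introduced here; they are not part of boto3 or the SageMaker SDK.

```python
from typing import NamedTuple


class Arn(NamedTuple):
    partition: str
    service: str
    region: str
    account_id: str
    resource: str


def parse_arn(arn: str) -> Arn:
    """Split an ARN into its colon-separated fields (hypothetical helper)."""
    # maxsplit=5 keeps any colons inside the resource part intact
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError("Not a valid ARN: {}".format(arn))
    return Arn(*parts[1:])


arn = parse_arn("arn:aws:iam::725879053979:role/MLOps")
print(arn.account_id)    # 725879053979
print(arn.region == "")  # True: the region field is empty in an IAM role ARN
```

Naming the fields makes it harder to pick the wrong index by accident, which is the main risk of the positional `split(':')[4]` form.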


## PART 1 - Creating the assets required to build/test a docker image

### 1.1 Let's start by creating the training script!

As you can see, this is a very basic example of Scikit-Learn. Nothing fancy.

In [2]:
!rm -rf package
!rm -rf docker

!mkdir docker
!mkdir docker/code

!mkdir package
!mkdir package/src
!mkdir package/src/custom_lightgbm_inference
!touch package/src/custom_lightgbm_inference/__init__.py

In [3]:
# %%writefile src/my_training.py
# import os
# import pandas as pd
# import re
# import joblib
# import json
# from sklearn.ensemble import RandomForestClassifier

# def load_dataset(path):
#     # Take the set of files and read them all into a single pandas dataframe
#     files = [ os.path.join(path, file) for file in os.listdir(path) ]
    
#     if len(files) == 0:
#         raise ValueError("Invalid # of files in dir: {}".format(path))

#     raw_data = [ pd.read_csv(file, sep=",", header=None ) for file in files ]
#     data = pd.concat(raw_data)

#     # labels are in the first column
#     y = data.iloc[:,0]
#     X = data.iloc[:,1:]
#     return X,y
    
# def main(args):
#     print("Training mode")

#     try:
#         X_train, y_train = load_dataset(args.training)
#         X_test, y_test = load_dataset(args.testing)
        
#         hyperparameters = {
#             "max_depth": args.max_depth,
#             "verbose": 1, # show all logs
#             "n_jobs": args.n_jobs,
#             "n_estimators": args.n_estimators
#         }
#         print("Training the classifier")
#         model = RandomForestClassifier()
#         model.set_params(**hyperparameters)
#         model.fit(X_train, y_train)
#         print("Score: {}".format( model.score(X_test, y_test)) )
#         joblib.dump(model, open(os.path.join(args.model_dir, "iris_model.pkl"), "wb"))
    
#     except Exception as e:
#         # Write out an error file. This will be returned as the failureReason in the
#         # DescribeTrainingJob result.
#         trc = traceback.format_exc()
#         with open(os.path.join(output_path, "failure"), "w") as s:
#             s.write("Exception during training: " + str(e) + "\\n" + trc)
            
#         # Printing this causes the exception to be in the training job logs, as well.
#         print("Exception during training: " + str(e) + "\\n" + trc, file=sys.stderr)
        
#         # A non-zero exit code causes the training job to be marked as Failed.
#         sys.exit(255)

### 1.2 OK. Let's then create the handler. The **Inference Handler** is how we use the SageMaker Inference Toolkit to encapsulate our code and expose it as a SageMaker container.
SageMaker Inference Toolkit: https://github.com/aws/sagemaker-inference-toolkit

In [4]:
# %%writefile src/handler_service.py
# from sagemaker_inference.default_handler_service import DefaultHandlerService
# from sagemaker_inference.transformer import Transformer
# from inference_handler import CustomInferenceHandler

# class HandlerService(DefaultHandlerService):
#     def __init__(self):
#         transformer = Transformer(default_inference_handler=CustomInferenceHandler())
#         super(HandlerService, self).__init__(transformer=transformer)

In [5]:
# %%writefile src/inference_handler.py
# import os
# import sys
# import joblib
# from sagemaker_inference.default_inference_handler import DefaultInferenceHandler
# from sagemaker_inference import content_types, errors, transformer, encoder, decoder

# class CustomInferenceHandler(DefaultInferenceHandler):    
#     ## Loads the model from the disk
#     def default_model_fn(self, model_dir):
#         model_filename = os.path.join(model_dir, "model.joblib")
#         return joblib.load(model_filename)
    
#     ## Parse and check the format of the input data
#     def default_input_fn(self, input_data, content_type):
#         if content_type != "text/csv":
#             raise Exception("Invalid content-type: %s" % content_type)
#         return decoder.decode(input_data, content_type).reshape(1,-1)
    
#     ## Run our model and do the prediction
#     def default_predict_fn(self, payload, model):
#         return model.predict( payload ).tolist()
    
#     ## Gets the prediction output and format it to be returned to the user
#     def default_output_fn(self, prediction, accept):
#         if accept != "text/csv":
#             raise Exception("Invalid accept: %s" % accept)
#         return encoder.encode(prediction, accept)
    

In [6]:
%%writefile package/src/custom_lightgbm_inference/handler.py
# !pygmentize package/src/custom_lightgbm_inference/handler.py
import os
import sys
import joblib
from sagemaker_inference.default_inference_handler import DefaultInferenceHandler
from sagemaker_inference.default_handler_service import DefaultHandlerService
from sagemaker_inference import content_types, errors, transformer, encoder, decoder

class HandlerService(DefaultHandlerService, DefaultInferenceHandler):
    def __init__(self):
        op = transformer.Transformer(default_inference_handler=self)
        super(HandlerService, self).__init__(transformer=op)
    
    ## Loads the model from the disk
    def default_model_fn(self, model_dir):
        model_filename = os.path.join(model_dir, "model.joblib")
        return joblib.load(model_filename)
    
    ## Parse and check the format of the input data
    def default_input_fn(self, input_data, content_type):
        if content_type != "text/csv":
            raise Exception("Invalid content-type: %s" % content_type)
        return decoder.decode(input_data, content_type).reshape(1,-1)
    
    ## Run our model and do the prediction
    def default_predict_fn(self, payload, model):
        return model.predict( payload ).tolist()
    
    ## Gets the prediction output and format it to be returned to the user
    def default_output_fn(self, prediction, accept):
        if accept != "text/csv":
            raise Exception("Invalid accept: %s" % accept)
        return encoder.encode(prediction, accept)

Writing package/src/custom_lightgbm_inference/handler.py
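
The handler written above overrides four hooks that are applied in sequence to every request: load the model, decode the input, predict, encode the output. The sketch below imitates that flow with plain functions so it can be run without the toolkit. It is an illustration of the idea only, not the toolkit's actual code: `ThresholdModel` is a made-up stand-in for the fitted classifier, and `handle` is a hypothetical name for the chaining step.

```python
import csv
import io


class ThresholdModel:
    """Stand-in for a fitted scikit-learn model (assumption: illustration only)."""

    def predict(self, rows):
        # Toy rule: class 1.0 if the third feature is above 2.5, else 0.0
        return [1.0 if row[2] > 2.5 else 0.0 for row in rows]


def input_fn(body, content_type):
    """Decode a text/csv request body into a list of float rows."""
    if content_type != "text/csv":
        raise ValueError("Invalid content-type: %s" % content_type)
    reader = csv.reader(io.StringIO(body))
    return [[float(v) for v in row] for row in reader if row]


def predict_fn(payload, model):
    """Run the model on the decoded payload."""
    return model.predict(payload)


def output_fn(prediction, accept):
    """Encode the predictions as a text/csv response body."""
    if accept != "text/csv":
        raise ValueError("Invalid accept: %s" % accept)
    return ",".join(str(p) for p in prediction)


def handle(body, content_type, accept, model):
    """Chain the three per-request hooks in the order described above."""
    return output_fn(predict_fn(input_fn(body, content_type), model), accept)


print(handle("5.1,3.5,1.4,0.2", "text/csv", "text/csv", ThresholdModel()))  # 0.0
```

Keeping each stage in its own function is what lets you replace only the hook you need, for example swapping the CSV decoder without touching the prediction code.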


### 1.3 Now we need to create the entrypoint of our container. The main function

We'll use **SageMaker Training Toolkit** (https://github.com/aws/sagemaker-training-toolkit) to work with the arguments and environment variables defined by SageMaker. This library will make our code simpler.

In [7]:
%%writefile package/src/custom_lightgbm_inference/my_serving.py
# !pygmentize package/src/custom_lightgbm_inference/my_serving.py

from sagemaker_inference import model_server
from custom_lightgbm_inference import handler

HANDLER_SERVICE = handler.__name__

def main():
    print('Running handler service:', HANDLER_SERVICE)
    model_server.start_model_server(handler_service=HANDLER_SERVICE)


Writing package/src/custom_lightgbm_inference/my_serving.py


In [8]:
%%writefile package/setup.py
# !pygmentize package/setup.py

from __future__ import absolute_import

from glob import glob
import os
from os.path import basename
from os.path import splitext

from setuptools import find_packages, setup

setup(
    name='custom_lightgbm_inference',
    version='0.1.0',
    description='Custom container serving package for SageMaker.',
    keywords="custom container serving package SageMaker",

    packages=find_packages(where='src'),
    package_dir={'': 'src'},
    py_modules=[splitext(basename(path))[0] for path in glob('src/*.py')],
    
    install_requires=[
        'sagemaker-inference==1.3.0',
        'multi-model-server==1.1.2'
    ],
    entry_points={"console_scripts": ["serve=custom_lightgbm_inference.my_serving:main"]},
)


Writing package/setup.py
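
The `console_scripts` entry in the `setup.py` above maps the command name `serve` to the string `custom_lightgbm_inference.my_serving:main`. Conceptually, the generated `serve` script imports the module before the colon and calls the attribute after it. The sketch below shows that lookup in simplified form; `resolve_entry_point` is a hypothetical helper written for this explanation, and it only handles the plain `module:function` shape used in this notebook.

```python
import importlib


def resolve_entry_point(spec):
    """Turn a 'package.module:function' string into the callable it names."""
    module_name, _, attr = spec.partition(":")
    if not module_name or not attr:
        raise ValueError("Expected 'module:function', got %r" % spec)
    module = importlib.import_module(module_name)
    return getattr(module, attr)


# Demonstrated with a standard-library target so the sketch runs anywhere;
# in this notebook the spec is 'custom_lightgbm_inference.my_serving:main'.
dumps = resolve_entry_point("json:dumps")
print(dumps({"ok": True}))  # {"ok": true}
```

This is why running `serve` inside the container ends up calling `main()` in `my_serving.py` once the package has been installed with pip.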


In [9]:
# %%writefile setup.py

# from __future__ import absolute_import

# from glob import glob
# import os
# from os.path import basename
# from os.path import splitext

# from setuptools import find_packages, setup

# setup(
#     name='sagemaker-custom',
#     version='0.1.0',
#     description='Custom container serving package for SageMaker.',
#     keywords="custom container serving package SageMaker",

#     packages=find_packages(where='src'),
#     package_dir={'': 'src'},
#     py_modules=[splitext(basename(path))[0] for path in glob('src/*.py')],
    
#     install_requires=['sagemaker-inference==1.3.0'],
#     entry_points={"console_scripts": ["serve=custom_inference.cli.init_serve:main"]},
# )

In [10]:
# %%writefile src/cli/init_serve.py

# import os

# def main():
#     serving_module = os.environ.get('SAGEMAKER_SERVING_MODULE')

#     serving_name, entry_point_name = serving_module.split(":")

#     serving = importlib.import_module(serving_name)

#     # the logger is configured after importing the framework library, allowing
#     # the framework to configure logging at import time.
#     logging_config.configure_logger(env.log_level)
#     logger.info("Using serving module %s", serving_name)

#     entrypoint = getattr(serving, entry_point_name)
#     entrypoint()
    
# if __name__=='__main__':
#     main()
    

In [11]:
# !pip install sagemaker-training sagemaker-inference multi-model-server

In [12]:
!cd package && pip install -e .

Obtaining file:///home/ec2-user/SageMaker/sagemaker_custom/1_custom_inference/package
Installing collected packages: custom-lightgbm-inference
  Attempting uninstall: custom-lightgbm-inference
    Found existing installation: custom-lightgbm-inference 0.1.0
    Uninstalling custom-lightgbm-inference-0.1.0:
      Successfully uninstalled custom-lightgbm-inference-0.1.0
  Running setup.py develop for custom-lightgbm-inference
Successfully installed custom-lightgbm-inference
You should consider upgrading via the '/home/ec2-user/anaconda3/envs/python3/bin/python -m pip install --upgrade pip' command.


In [13]:
!cd package/ && python setup.py sdist && cp dist/custom_lightgbm_inference-0.1.0.tar.gz ../docker/code/


running sdist
running egg_info
writing src/custom_lightgbm_inference.egg-info/PKG-INFO
writing dependency_links to src/custom_lightgbm_inference.egg-info/dependency_links.txt
writing entry points to src/custom_lightgbm_inference.egg-info/entry_points.txt
writing requirements to src/custom_lightgbm_inference.egg-info/requires.txt
writing top-level names to src/custom_lightgbm_inference.egg-info/top_level.txt
reading manifest file 'src/custom_lightgbm_inference.egg-info/SOURCES.txt'
writing manifest file 'src/custom_lightgbm_inference.egg-info/SOURCES.txt'

running check


creating custom_lightgbm_inference-0.1.0
creating custom_lightgbm_inference-0.1.0/src
creating custom_lightgbm_inference-0.1.0/src/custom_lightgbm_inference
creating custom_lightgbm_inference-0.1.0/src/custom_lightgbm_inference.egg-info
copying files to custom_lightgbm_inference-0.1.0...
copying setup.py -> custom_lightgbm_inference-0.1.0
copying src/custom_lightgbm_inference/__init__.py -> custom_lightgbm_inference-0.

In [14]:
!serve

Running handler service: custom_lightgbm_inference.handler
ERROR - Given model-path /opt/ml/model is not a valid directory. Point to a valid model-path directory.
Traceback (most recent call last):
  File "/home/ec2-user/anaconda3/envs/python3/bin/serve", line 11, in <module>
    load_entry_point('custom-lightgbm-inference', 'console_scripts', 'serve')()
  File "/home/ec2-user/SageMaker/sagemaker_custom/1_custom_inference/package/src/custom_lightgbm_inference/my_serving.py", line 10, in main
    model_server.start_model_server(handler_service=HANDLER_SERVICE)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker_inference/model_server.py", line 75, in start_model_server
    _adapt_to_mms_format(handler_service)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker_inference/model_server.py", line 122, in _adapt_to_mms_format
    subprocess.check_call(model_archiver_cmd)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3

### 1.4 Then, we can create the Dockerfile
Pay attention to the packages we'll install in our container. Here, we'll use the **SageMaker Inference Toolkit** (https://github.com/aws/sagemaker-inference-toolkit) and the **SageMaker Training Toolkit** (https://github.com/aws/sagemaker-training-toolkit) to prepare the container for training and serving our model. **Serving** here means exposing our model as a web service that can be invoked through an API call.

In [15]:
# %%writefile Dockerfile
# FROM python:3.7-buster

# # Set a docker label to advertise multi-model support on the container
# LABEL com.amazonaws.sagemaker.capabilities.multi-models=false
# # Set a docker label to enable container to use SAGEMAKER_BIND_TO_PORT environment variable if present
# LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true

# RUN apt-get update -y && apt-get -y install --no-install-recommends default-jdk
# RUN rm -rf /var/lib/apt/lists/*

# RUN pip --no-cache-dir install multi-model-server sagemaker-inference sagemaker-training
# RUN pip --no-cache-dir install pandas numpy scipy scikit-learn

# COPY dist/sagemaker-custom-0.1.0.tar.gz /sagemaker-custom-0.1.0.tar.gz
# RUN pip --no-cache install /sagemaker-custom-0.1.0.tar.gz && \
#     rm /sagemaker-custom-0.1.0.tar.gz

# ENV PYTHONUNBUFFERED=TRUE
# ENV PYTHONDONTWRITEBYTECODE=TRUE
# ENV PYTHONPATH="/opt/ml/code:${PATH}"

# #####################
# # Required ENV vars #
# #####################
# # Set SageMaker training environment variables
# ENV SM_INPUT /opt/ml/input
# ENV SM_INPUT_TRAINING_CONFIG_FILE $SM_INPUT/config/hyperparameters.json
# ENV SM_INPUT_DATA_CONFIG_FILE $SM_INPUT/config/inputdataconfig.json
# ENV SM_CHECKPOINT_CONFIG_FILE $SM_INPUT/config/checkpointconfig.json

# # Set SageMaker serving environment variables
# ENV SM_MODEL_DIR /opt/ml/model

# ENV CODE_DIR /opt/ml/code
# # COPY main.py $CODE_DIR/main.py
# COPY src/my_training.py $CODE_DIR/my_training.py
# COPY src/my_serving.py $CODE_DIR/my_serving.py

# COPY src/handler_service.py $CODE_DIR/handler_service.py
# COPY src/inference_handler.py $CODE_DIR/inference_handler.py

# ENV SAGEMAKER_TRAINING_MODULE my_training:main
# ENV SAGEMAKER_SERVING_MODULE my_serving:main

# # ENTRYPOINT ["python", "/opt/ml/code/main.py"]


In [16]:
%%writefile docker/Dockerfile
# !pygmentize docker/Dockerfile

# Part of the implementation of this container is based on the Amazon SageMaker Apache MXNet container.
# https://github.com/aws/sagemaker-mxnet-container
FROM sagemaker-training-containers/framework-container:latest

# Defining some variables used at build time to install Python3
ARG PYTHON=python3
ARG PYTHON_PIP=python3-pip
ARG PIP=pip3
ARG PYTHON_VERSION=3.6.6


# Framework Training docker
# COPY code/custom_lightgbm_framework-1.0.0.tar.gz /custom_lightgbm_framework-1.0.0.tar.gz

# Installing numpy, pandas, scikit-learn, scipy
# RUN ${PIP} install --no-cache --upgrade \
#         /custom_lightgbm_framework-1.0.0.tar.gz && \
#     rm /custom_lightgbm_framework-1.0.0.tar.gz

# Setting some environment variables.
# ENV PYTHONDONTWRITEBYTECODE=1 \
#     PYTHONUNBUFFERED=1 \
#     LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:/usr/local/lib" \
#     PYTHONIOENCODING=UTF-8 \
#     LANG=C.UTF-8 \
#     LC_ALL=C.UTF-8


# Set a docker label to advertise multi-model support on the container
LABEL com.amazonaws.sagemaker.capabilities.multi-models=false
# Set a docker label to enable container to use SAGEMAKER_BIND_TO_PORT environment variable if present
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true


# Previous inference docker
RUN apt-get update -y && apt-get -y install --no-install-recommends default-jdk
RUN rm -rf /var/lib/apt/lists/*

# RUN pip --no-cache-dir install multi-model-server sagemaker-inference sagemaker-training
# RUN pip --no-cache-dir install pandas numpy scipy scikit-learn

COPY code/custom_lightgbm_inference-0.1.0.tar.gz /custom_lightgbm_inference-0.1.0.tar.gz
        
RUN ${PIP} install --no-cache --upgrade \
        /custom_lightgbm_inference-0.1.0.tar.gz && \
    rm /custom_lightgbm_inference-0.1.0.tar.gz

# ENV PYTHONUNBUFFERED=TRUE
# ENV PYTHONDONTWRITEBYTECODE=TRUE
# ENV PYTHONPATH="/opt/ml/code:${PATH}"

#####################
# Required ENV vars #
#####################
# Set SageMaker training environment variables
# ENV SM_INPUT /opt/ml/input
# ENV SM_INPUT_TRAINING_CONFIG_FILE $SM_INPUT/config/hyperparameters.json
# ENV SM_INPUT_DATA_CONFIG_FILE $SM_INPUT/config/inputdataconfig.json
# ENV SM_CHECKPOINT_CONFIG_FILE $SM_INPUT/config/checkpointconfig.json

# Set SageMaker serving environment variables
ENV SM_MODEL_DIR /opt/ml/model

ENV CODE_DIR /opt/ml/code
# COPY main.py $CODE_DIR/main.py

# COPY src/my_serving.py $CODE_DIR/my_serving.py

# COPY src/handler_service.py $CODE_DIR/handler_service.py
# COPY src/inference_handler.py $CODE_DIR/inference_handler.py

# ENV SAGEMAKER_TRAINING_MODULE my_training:main

#Not injecting inference code, hence no need for env var
# ENV SAGEMAKER_SERVING_MODULE my_serving:main

# ENTRYPOINT ["python", "/opt/ml/code/main.py"]


Writing docker/Dockerfile


### 1.5 Finally, let's create the buildspec
This file will be used by CodeBuild for creating our Container image.  
With this file, CodeBuild will run the "docker build" command, using the assets we created above, and deploy the image to the Registry.  
As you can see, each command is a bash command that will be executed from inside a Linux Container.

In [17]:
%%writefile buildspec.yml
version: 0.2

phases:
  install:
    runtime-versions:
      docker: 18

  pre_build:
    commands:
      - echo Logging in to Amazon ECR...
      - $(aws ecr get-login --no-include-email --region $AWS_DEFAULT_REGION)
  build:
    commands:
      - echo Build started on `date`
      - echo Building the Docker image...
      - docker build -t $IMAGE_REPO_NAME:$IMAGE_TAG .
      - docker tag $IMAGE_REPO_NAME:$IMAGE_TAG $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG

  post_build:
    commands:
      - echo Build completed on `date`
      - echo Pushing the Docker image...
      - echo docker push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG
      - docker push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG
      - echo $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG > image.url
      - echo Done
artifacts:
  files:
    - image.url
  name: image_url
  discard-paths: yes

Overwriting buildspec.yml
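
The buildspec above assembles the image URI from four environment variables: `AWS_ACCOUNT_ID`, `AWS_DEFAULT_REGION`, `IMAGE_REPO_NAME` and `IMAGE_TAG`. The sketch below reproduces that string composition in Python so you can check the resulting URI before running a build. `image_uri` and `REQUIRED` are illustrative names, not part of CodeBuild or boto3.

```python
REQUIRED = ("AWS_ACCOUNT_ID", "AWS_DEFAULT_REGION", "IMAGE_REPO_NAME", "IMAGE_TAG")


def image_uri(env):
    """Compose the ECR image URI the same way the buildspec commands do."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise KeyError("Missing build variables: " + ", ".join(missing))
    return "{}.dkr.ecr.{}.amazonaws.com/{}:{}".format(*(env[name] for name in REQUIRED))


build_env = {
    "AWS_ACCOUNT_ID": "725879053979",
    "AWS_DEFAULT_REGION": "us-east-1",
    "IMAGE_REPO_NAME": "iris-model",
    "IMAGE_TAG": "latest",
}
print(image_uri(build_env))
# 725879053979.dkr.ecr.us-east-1.amazonaws.com/iris-model:latest
```

Failing early on a missing variable is more helpful than letting `docker tag` run with an empty repository name or tag.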


## PART 2 - Local Test: Let's build the image locally and do some tests
### 2.1 Building the image locally, first
Each SageMaker Jupyter Notebook already has a **Docker** environment pre-installed, so we can work with Docker containers in the same environment.

Building and pushing to ECR

In [18]:
!docker build -f docker/Dockerfile -t iris_model:1.0 ./docker

Sending build context to Docker daemon   7.68kB
Step 1/13 : FROM sagemaker-training-containers/framework-container:latest
 ---> 6e6a47cf0f36
Step 2/13 : ARG PYTHON=python3
 ---> Using cache
 ---> a8cb49fa3ec4
Step 3/13 : ARG PYTHON_PIP=python3-pip
 ---> Using cache
 ---> 7096781dc9d8
Step 4/13 : ARG PIP=pip3
 ---> Using cache
 ---> 30b0cfbde1dc
Step 5/13 : ARG PYTHON_VERSION=3.6.6
 ---> Using cache
 ---> 65862c7be643
Step 6/13 : LABEL com.amazonaws.sagemaker.capabilities.multi-models=false
 ---> Using cache
 ---> 3c6656952914
Step 7/13 : LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true
 ---> Using cache
 ---> 8c6ac00dd42a
Step 8/13 : RUN apt-get update -y && apt-get -y install --no-install-recommends default-jdk
 ---> Running in 076974884a35
Get:1 http://security.ubuntu.com/ubuntu xenial-security InRelease [109 kB]
Hit:2 http://archive.ubuntu.com/ubuntu xenial InRelease
Hit:3 http://ppa.launchpad.net/deadsnakes/ppa/ubuntu xenial InRelease
Get:4 http://security.ubuntu

In [19]:
! scripts/build_and_push.sh $account_id $region $ecr_repository_name

Building docker image...
Sending build context to Docker daemon   7.68kB
Step 1/13 : FROM sagemaker-training-containers/framework-container:latest
 ---> 6e6a47cf0f36
Step 2/13 : ARG PYTHON=python3
 ---> Using cache
 ---> a8cb49fa3ec4
Step 3/13 : ARG PYTHON_PIP=python3-pip
 ---> Using cache
 ---> 7096781dc9d8
Step 4/13 : ARG PIP=pip3
 ---> Using cache
 ---> 30b0cfbde1dc
Step 5/13 : ARG PYTHON_VERSION=3.6.6
 ---> Using cache
 ---> 65862c7be643
Step 6/13 : LABEL com.amazonaws.sagemaker.capabilities.multi-models=false
 ---> Using cache
 ---> 3c6656952914
Step 7/13 : LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true
 ---> Using cache
 ---> 8c6ac00dd42a
Step 8/13 : RUN apt-get update -y && apt-get -y install --no-install-recommends default-jdk
 ---> Using cache
 ---> 26fc46725d87
Step 9/13 : RUN rm -rf /var/lib/apt/lists/*
 ---> Using cache
 ---> 18e1c6168f09
Step 10/13 : COPY code/custom_lightgbm_inference-0.1.0.tar.gz /custom_lightgbm_inference-0.1.0.tar.gz
 ---> Using ca

---
TESTS

In [20]:
container_image_uri = '725879053979.dkr.ecr.us-east-1.amazonaws.com/iris-model:latest'

In [21]:
train_config = 's3://sagemaker-us-east-1-725879053979/sagemaker-custom/data/iris_train.csv'
test_config = 's3://sagemaker-us-east-1-725879053979/sagemaker-custom/data/iris_test.csv'


In [22]:
sources = 's3://sagemaker-us-east-1-725879053979/sagemaker-custom/code/sourcedir.tar.gz'

In [23]:
import sagemaker
import json

# JSON encode hyperparameters.
def json_encode_hyperparameters(hyperparameters):
    return {str(k): json.dumps(v) for (k, v) in hyperparameters.items()}

hyperparameters = json_encode_hyperparameters({
    "sagemaker_program": "train.py",
    "sagemaker_submit_directory": sources})
#     "hp1": "value1",
#     "hp2": 300,
#     "hp3": 0.001}
# )

estimator = sagemaker.estimator.Estimator(container_image_uri,
                                    role,
                                    train_instance_count=1, 
                                    train_instance_type='local',
                                    base_job_name='iris',
                                    hyperparameters=hyperparameters,
                                         )



In [24]:
estimator.fit({'train': train_config, 'validation': test_config })



Creating tmpubddauzb_algo-1-l8422_1 ... done
Attaching to tmpubddauzb_algo-1-l8422_1
algo-1-l8422_1  | 2020-08-11 17:46:10,870 sagemaker-training-toolkit INFO     Imported framework custom_lightgbm_framework.training
algo-1-l8422_1  | 2020-08-11 17:46:10,872 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
algo-1-l8422_1  | 2020-08-11 17:46:10,885 custom_lightgbm_framework.training INFO     Invoking user training script.
algo-1-l8422_1  | 2020-08-11 17:46:11,038 sagemaker-training-toolkit INFO     Module train.py does not provide a setup.py.
algo-1-l8422_1  | Generating setup.py
algo-1-l8422_1  | 2020-08-11 17:46:11,038 sagemaker-training-toolkit INFO     Generating setup.cfg
algo-1-l8422_1  | 2020-08-11 17:46:11,039 sagemaker-training-toolkit INFO     Generating MANIFEST.in
algo-1-l8422_1  | 2020-08-11 17:46:11,039 sagemaker-training-toolkit INFO     Installing module w

In [26]:
predictor = estimator.deploy(initial_instance_count=1,
                 instance_type='local',
                )



Attaching to tmp9g6c9zke_algo-1-5r1gr_1
algo-1-5r1gr_1  | Running handler service: custom_lightgbm_inference.handler
algo-1-5r1gr_1  | 2020-08-11 17:47:12,843 [INFO ] main com.amazonaws.ml.mms.ModelServer -
algo-1-5r1gr_1  | MMS Home: /usr/local/lib/python3.6/site-packages
algo-1-5r1gr_1  | Current directory: /
algo-1-5r1gr_1  | Temp directory: /tmp
algo-1-5r1gr_1  | Number of GPUs: 0
algo-1-5r1gr_1  | Number of CPUs: 4
algo-1-5r1gr_1  | Max heap size: 3566 M
algo-1-5r1gr_1  | Python executable: /usr/local/bin/python3.6
algo-1-5r1gr_1  | Config file: /etc/sagemaker-mms.properties
algo-1-5r1gr_1  | Inference address: http://0.0.0.0:8080
algo-1-5r1gr_1  | Management address: http://0.0.0.0:8080
algo-1-5r1gr_1  | Model Store: /.sagemaker/mms/models
algo-1-5r1gr_1  | Initial Models: ALL
algo-1-5r1gr_1  | Log dir: /logs
algo-1-5r1gr_1  | Metrics di

In [41]:
import pandas as pd
import random
from sagemaker.predictor import csv_serializer, csv_deserializer

# configure the predictor to do everything for us
predictor.content_type = 'text/csv'
predictor.accept = 'text/csv'
predictor.serializer = csv_serializer
predictor.deserializer = None

# load the testing data from the validation csv
validation = pd.read_csv('../0_custom_train/notebook/data/iris_test.csv', header=None)
idx = random.randint(0,len(validation)-5)
req = validation.iloc[idx:idx+5].values

# cut a sample with 5 lines from our dataset and then split the label from the features.
X = req[:,0:-1].tolist()
y = req[:,-1].tolist()

# call the local endpoint
for features,label in zip(X,y):
    prediction = float(predictor.predict(features).decode('utf-8').strip())

    # compare the results
    print("\nRESULT: {} == {} ? {}\n".format( label, prediction, label == prediction ) )

algo-1-5r1gr_1  | 2020-08-11 17:56:20,175 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 1
algo-1-5r1gr_1  | 2020-08-11 17:56:20,176 [INFO ] W-9000-model ACCESS_LOG - /172.18.0.1:50784 "POST /invocations HTTP/1.1" 200 3

RESULT: 1.0 == 1.0 ? True

algo-1-5r1gr_1  | 2020-08-11 17:56:20,181 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 1
algo-1-5r1gr_1  | 2020-08-11 17:56:20,182 [INFO ] W-9000-model ACCESS_LOG - /172.18.0.1:50784 "POST /invocations HTTP/1.1" 200 2

RESULT: 0.0 == 0.0 ? True

algo-1-5r1gr_1  | 2020-08-11 17:56:20,185 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 1
algo-1-5r1gr_1  | 2020-08-11 17:56:20,185 [INFO ] W-9000-model ACCESS_LOG - /172.18.0.1:50784 "POST /invocations HTTP/1.1" 200 2

RESULT: 1.0 == 1.0 ? True

algo-1-5r1gr_1  | 2020-08-11 17:56:20,189 [INFO ] W-9000-model com.amazonaws.ml.mm

Take a look at the open-source implementations of a few SageMaker containers:

https://github.com/aws/sagemaker-scikit-learn-container

https://github.com/aws/sagemaker-xgboost-container

---

### 2.2 Now that we have the algorithm image we can run it to train/deploy a model

### Then, we need to prepare the dataset
You'll see that we're splitting the dataset into training and validation subsets and saving each subset to a CSV file. These files will then be uploaded to an S3 bucket and shared with SageMaker.

In [21]:
!rm -rf input
!mkdir -p input/data/training
!mkdir -p input/data/testing

import pandas as pd
import numpy as np

from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()

dataset = np.insert(iris.data, 0, iris.target,axis=1)

df = pd.DataFrame(data=dataset, columns=["iris_id"] + iris.feature_names)
X = df.iloc[:,1:]
y = df.iloc[:,0]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

train_df = X_train.copy()
train_df.insert(0, "iris_id", y_train)
train_df.to_csv("input/data/training/training.csv", sep=",", header=None, index=None)

test_df = X_test.copy()
test_df.insert(0, "iris_id", y_test)
test_df.to_csv("input/data/testing/testing.csv", sep=",", header=None, index=None)

df.head()

Unnamed: 0,iris_id,sepal length (cm),sepal width (cm),petal length (cm),petal width (cm)
0,0.0,5.1,3.5,1.4,0.2
1,0.0,4.9,3.0,1.4,0.2
2,0.0,4.7,3.2,1.3,0.2
3,0.0,4.6,3.1,1.5,0.2
4,0.0,5.0,3.6,1.4,0.2
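
The CSV files written above have no header, and the label (`iris_id`) sits in the first column with the four features after it. Any code that reads them back has to rely on that layout. The sketch below shows the split using only the standard library; `split_label_first` is a hypothetical helper name, and the two sample rows are taken from the `df.head()` output above.

```python
import csv
import io


def split_label_first(csv_text):
    """Return (X, y) from headerless CSV text whose first column is the label."""
    X, y = [], []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        y.append(float(row[0]))
        X.append([float(v) for v in row[1:]])
    return X, y


sample = "0.0,5.1,3.5,1.4,0.2\n0.0,4.9,3.0,1.4,0.2\n"
X, y = split_label_first(sample)
print(y)     # [0.0, 0.0]
print(X[0])  # [5.1, 3.5, 1.4, 0.2]
```

If a later step expects the label in the last column instead, the features and the label will be silently mixed up, so it is worth checking the layout at every point where the files are read.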


### 2.3 Just a basic local test, using the local Docker daemon
Here we will simulate SageMaker calling our docker container for training and serving. We'll do that using the built-in Docker Daemon of the Jupyter Notebook Instance.

In [22]:
!rm -rf input/config && mkdir -p input/config

In [23]:
%%writefile input/config/hyperparameters.json
{"max_depth": 20, "n_jobs": 4, "n_estimators": 120}

Writing input/config/hyperparameters.json


In [24]:
%%writefile input/config/resourceconfig.json
{"current_host": "localhost", "hosts": ["algo-1-kipw9"]}

Writing input/config/resourceconfig.json


In [25]:
%%writefile input/config/inputdataconfig.json
{"training": {"TrainingInputMode": "File"}, "testing": {"TrainingInputMode": "File"}}

Writing input/config/inputdataconfig.json
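
The three files written above end up under `/opt/ml/input/config` inside the container, because the `input` directory is mounted at `/opt/ml/input`. The sketch below shows one way a training script could read `hyperparameters.json`. It is a minimal example under two assumptions: all three hyperparameters are integers, and values may arrive as strings, which is how a managed SageMaker training job passes them. `load_hyperparameters` is a hypothetical helper, and a temporary directory stands in for the real config path.

```python
import json
import os
import tempfile


def load_hyperparameters(config_dir):
    """Read hyperparameters.json and coerce every value to int."""
    path = os.path.join(config_dir, "hyperparameters.json")
    with open(path) as f:
        raw = json.load(f)
    # int(str(v)) accepts both 20 and "20"
    return {key: int(str(value)) for key, value in raw.items()}


# A temporary directory stands in for /opt/ml/input/config
with tempfile.TemporaryDirectory() as config_dir:
    with open(os.path.join(config_dir, "hyperparameters.json"), "w") as f:
        json.dump({"max_depth": "20", "n_jobs": 4, "n_estimators": "120"}, f)
    params = load_hyperparameters(config_dir)

print(params)  # {'max_depth': 20, 'n_jobs': 4, 'n_estimators': 120}
```

Coercing the values in one place keeps the local test, where the JSON holds numbers, consistent with a real training job, where it holds strings.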


In [26]:
%%time
!rm -rf model/
!mkdir -p model

print( "Training...")
!docker run --rm --name "my_model" \
    -v "$PWD/model:/opt/ml/model" \
    -v "$PWD/input:/opt/ml/input" iris_model:1.0 train

Training...
2020-07-28 15:53:41,354 sagemaker-training-toolkit INFO     Imported framework my_training
CPU times: user 39.6 ms, sys: 13.9 ms, total: 53.5 ms
Wall time: 2.21 s


### 2.4 This is the serving test. It simulates an Endpoint exposed by SageMaker

After you execute the next cell, this Jupyter notebook will block until you stop the cell. A web service will be exposed on port 8080.

In [28]:
!docker run --rm --name "my_model" \
    -p 8080:8080 \
    -v "$PWD/model:/opt/ml/model" \
    -v "$PWD/input:/opt/ml/input" iris_model:1.0 serve

Traceback (most recent call last):
  File "/usr/local/bin/serve", line 5, in <module>
    from custom_inference.cli.init_serve import main
ModuleNotFoundError: No module named 'custom_inference'


> While the above cell is running, click here [TEST NOTEBOOK](02_Testing%20our%20local%20model%20server.ipynb) to run some tests.

> After you finish the tests, press **STOP**
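
While the container is serving, requests go to `POST /invocations` on port 8080 with a `text/csv` body, the same path that appears in the access log earlier in this notebook. The sketch below builds such a request with the standard library so its shape can be inspected without a running server. `build_invocation_request` is a hypothetical helper, and the host value assumes the port mapping `-p 8080:8080` used above.

```python
import urllib.request


def build_invocation_request(features, host="http://localhost:8080"):
    """Build a text/csv POST request for the container's /invocations route."""
    body = ",".join(str(v) for v in features).encode("utf-8")
    return urllib.request.Request(
        url=host + "/invocations",
        data=body,
        headers={"Content-Type": "text/csv", "Accept": "text/csv"},
        method="POST",
    )


request = build_invocation_request([5.1, 3.5, 1.4, 0.2])
print(request.get_method(), request.full_url)  # POST http://localhost:8080/invocations
print(request.data)  # b'5.1,3.5,1.4,0.2'

# Sending it requires the container to be up:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.read().decode("utf-8"))
```

The handler in this notebook rejects any content type or accept value other than `text/csv`, so both headers need to be set explicitly.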

## PART 3 - Integrated Test: Everything seems ok, now it's time to put it all together

We'll start by running a local **CodeBuild** test, to check the buildspec and also deploy this image into the container registry. Remember that SageMaker will only see images published to ECR.


In [43]:
import boto3

sts_client = boto3.client("sts")
session = boto3.session.Session()

account_id = sts_client.get_caller_identity()["Account"]
region = session.region_name
credentials = session.get_credentials()
credentials = credentials.get_frozen_credentials()

repo_name="iris-model"
image_tag="test"

In [None]:
!sudo rm -rf tests && mkdir -p tests
# !cp handler.py main.py train.py Dockerfile buildspec.yml tests/
!cp handler_service.py inference_handler.py serving.py training.py Dockerfile buildspec.yml tests/
with open("tests/vars.env", "w") as f:
    f.write("AWS_ACCOUNT_ID=%s\n" % account_id)
    f.write("IMAGE_TAG=%s\n" % image_tag)
    f.write("IMAGE_REPO_NAME=%s\n" % repo_name)
    f.write("AWS_DEFAULT_REGION=%s\n" % region)
    f.write("AWS_ACCESS_KEY_ID=%s\n" % credentials.access_key)
    f.write("AWS_SECRET_ACCESS_KEY=%s\n" % credentials.secret_key)
    f.write("AWS_SESSION_TOKEN=%s\n" % credentials.token)

!cat tests/vars.env

In [20]:
%%time

!/tmp/aws-codebuild/local_builds/codebuild_build.sh \
    -a "$PWD/tests/output" \
    -s "$PWD/tests" \
    -i "samirsouza/aws-codebuild-standard:3.0" \
    -e "$PWD/tests/vars.env" \
    -c

Build Command:

docker run -it -v /var/run/docker.sock:/var/run/docker.sock -e "IMAGE_NAME=samirsouza/aws-codebuild-standard:3.0" -e "ARTIFACTS=/home/ec2-user/SageMaker/tmp_ars/amazon-sagemaker-mlops-workshop/lab/01_CreateAlgorithmContainer/scikit_based/tests/output" -e "SOURCE=/home/ec2-user/SageMaker/tmp_ars/amazon-sagemaker-mlops-workshop/lab/01_CreateAlgorithmContainer/scikit_based/tests" -v "/home/ec2-user/SageMaker/tmp_ars/amazon-sagemaker-mlops-workshop/lab/01_CreateAlgorithmContainer/scikit_based/tests:/LocalBuild/envFile/" -e "ENV_VAR_FILE=vars.env" -e "AWS_CONFIGURATION=/home/ec2-user/.aws" -e "AWS_CLOUDWATCH_HOME=/opt/aws/apitools/mon" -e "AWS_PATH=/opt/aws" -e "AWS_AUTO_SCALING_HOME=/opt/aws/apitools/as" -e "AWS_ELB_HOME=/opt/aws/apitools/elb" -e "INITIATOR=ec2-user" amazon/aws-codebuild-local:latest

Removing agent-resources_build_1 ... 
Removing agent-resources_agent_1 ... 
Removing network agent-resources_default
Removing volume agent-resources_source_volume
Re

> Now that we have an image deployed in the ECR repo we can also run some local tests using the SageMaker Estimator.

> Click on this [TEST NOTEBOOK](03_Testing%20the%20container%20using%20SageMaker%20Estimator.ipynb) to run some tests.

> After you finish the tests, come back to **this notebook** to push the assets to the Git repo.


## PART 4 - Let's push all the assets to the Git Repo connected to the Build pipeline
A CodePipeline is configured to listen to this Git repo and start a new build with CodeBuild whenever changes are pushed.

In [None]:
%%bash
cd ../../../mlops
git checkout iris_model
cp $OLDPWD/buildspec.yml $OLDPWD/handler.py $OLDPWD/train.py $OLDPWD/main.py $OLDPWD/Dockerfile .

git add --all
git commit -a -m " - files for building an iris model image"
git push

> Alright, now open the AWS console and go to the **CodePipeline** dashboard. Look for a pipeline called **mlops-iris-model**. This pipeline will deploy the final image to an ECR repo. When this process finishes, open the **Elastic Container Registry** dashboard in the AWS console and check whether you have an image called **iris-model:latest**. If yes, you can go to the next exercise. If not, wait a little longer.
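
The same check can be done from the notebook instead of the console. The sketch below wraps the ECR `describe_images` call and treats "image not found" as a normal `False` answer. `image_is_published` is a hypothetical helper; it takes the client as a parameter so the logic can be exercised without AWS access, and it assumes the repository itself already exists.

```python
def image_is_published(ecr_client, repository, tag="latest"):
    """Return True if `repository:tag` exists in ECR, False otherwise."""
    try:
        response = ecr_client.describe_images(
            repositoryName=repository,
            imageIds=[{"imageTag": tag}],
        )
    except ecr_client.exceptions.ImageNotFoundException:
        # The repository exists but has no image with this tag yet
        return False
    return len(response.get("imageDetails", [])) > 0


# Usage inside the notebook (needs AWS credentials and ecr:DescribeImages permission):
# import boto3
# print(image_is_published(boto3.client("ecr"), "iris-model"))
```

If the pipeline is still running, the function returns `False`; calling it again after a short wait mirrors the "wait a little longer" advice above.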