# Building a Docker container for training/deploying our classifier
In this exercise we'll create a Docker image containing the code required to train and deploy an ML model. In this example, we'll use scikit-learn (https://scikit-learn.org/) and its Random Forest classifier to train a flower classifier. The dataset is a toy dataset called Iris (http://archive.ics.uci.edu/ml/datasets/iris). The challenge itself is very basic, so you can focus on the mechanics and features of this automated environment.

At the end of this exercise, a first pipeline will run automatically. It will take the assets you push to a Git repo, build this image, and push it to ECR, the Docker image registry used by SageMaker.

Question: Why would I create a Scikit-learn container from scratch if SageMaker already offers one (https://docs.aws.amazon.com/sagemaker/latest/dg/sklearn.html)?
Answer: This is an exercise, and the idea is also to show you how to create your own container. In a real-life scenario, the best approach is to use the native container offered by SageMaker.

## PART 1 - Creating the assets required to build/test a docker image
### 1.1 Let's start by creating the training script!
As you can see, this is a very basic example of Scikit-Learn. Nothing fancy.

In [1]:
%%writefile train.py
import os
import sys
import traceback  # needed by the exception handler in start()
import pandas as pd
import joblib
from sklearn.ensemble import RandomForestClassifier

def load_dataset(path):
    # Take the set of files and read them all into a single pandas dataframe
    files = [ os.path.join(path, file) for file in os.listdir(path) ]
    
    if len(files) == 0:
        raise ValueError("Invalid # of files in dir: {}".format(path))

    raw_data = [ pd.read_csv(file, sep=",", header=None ) for file in files ]
    data = pd.concat(raw_data)

    # labels are in the first column
    y = data.iloc[:,0]
    X = data.iloc[:,1:]
    return X,y
    
def start(args):
    print("Training mode")

    try:
        X_train, y_train = load_dataset(args.train)
        X_test, y_test = load_dataset(args.validation)
        
        hyperparameters = {
            "max_depth": args.max_depth,
            "verbose": 1, # show all logs
            "n_jobs": args.n_jobs,
            "n_estimators": args.n_estimators
        }
        print("Training the classifier")
        model = RandomForestClassifier()
        model.set_params(**hyperparameters)
        model.fit(X_train, y_train)
        print("Score: {}".format( model.score(X_test, y_test)) )
        # Passing a path lets joblib open and close the file itself
        joblib.dump(model, os.path.join(args.model_dir, "iris_model.pkl"))
    
    except Exception as e:
        # Write out an error file. This will be returned as the failureReason in the
        # DescribeTrainingJob result.
        trc = traceback.format_exc()
        with open(os.path.join(args.output_dir, "failure"), "w") as s:
            s.write("Exception during training: " + str(e) + "\n" + trc)
            
        # Printing this causes the exception to be in the training job logs, as well.
        print("Exception during training: " + str(e) + "\n" + trc, file=sys.stderr)
        
        # A non-zero exit code causes the training job to be marked as Failed.
        sys.exit(255)

Overwriting train.py
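
The error handling in `train.py` follows a simple pattern: catch the exception, write it to a `failure` file in the output directory, echo it to stderr, and exit with a non-zero code. The sketch below isolates that pattern using only the standard library. The names `run_and_report` and `broken_job` are hypothetical helpers introduced for illustration, and a temporary directory stands in for the real output directory.

```python
import os
import sys
import tempfile
import traceback


def run_and_report(job, output_dir):
    """Run job(); on error, write <output_dir>/failure and return 255."""
    try:
        job()
        return 0
    except Exception as e:
        trc = traceback.format_exc()
        message = "Exception during training: " + str(e) + "\n" + trc
        # Same file name that train.py writes into the output directory
        with open(os.path.join(output_dir, "failure"), "w") as f:
            f.write(message)
        # Echo to stderr so the error also shows up in the logs
        print(message, file=sys.stderr)
        return 255


def broken_job():
    # Hypothetical stand-in for a training step that fails
    raise ValueError("Invalid # of files in dir: /tmp/empty")


with tempfile.TemporaryDirectory() as output_dir:
    exit_code = run_and_report(broken_job, output_dir)
    with open(os.path.join(output_dir, "failure")) as f:
        failure_text = f.read()
```

In the real script the return value would be passed to `sys.exit`, so the container stops with a non-zero status.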


### 1.2 OK. Let's create the handler
The Inference Handler is how we use the SageMaker Inference Toolkit to encapsulate our code and expose it as a SageMaker container.
SageMaker Inference Toolkit: https://github.com/aws/sagemaker-inference-toolkit

In [2]:
%%writefile handler.py
import os
import sys
import joblib
from sagemaker_inference.default_inference_handler import DefaultInferenceHandler
from sagemaker_inference.default_handler_service import DefaultHandlerService
from sagemaker_inference import content_types, errors, transformer, encoder, decoder

class HandlerService(DefaultHandlerService, DefaultInferenceHandler):
    def __init__(self):
        op = transformer.Transformer(default_inference_handler=self)
        super(HandlerService, self).__init__(transformer=op)
    
    ## Loads the model from the disk
    def default_model_fn(self, model_dir):
        model_filename = os.path.join(model_dir, "iris_model.pkl")
        return joblib.load(model_filename)
    
    ## Parse and check the format of the input data
    def default_input_fn(self, input_data, content_type):
        if content_type != "text/csv":
            raise Exception("Invalid content-type: %s" % content_type)
        return decoder.decode(input_data, content_type).reshape(1,-1)
    
    ## Run our model and do the prediction
    def default_predict_fn(self, payload, model):
        return model.predict( payload ).tolist()
    
    ## Gets the prediction output and format it to be returned to the user
    def default_output_fn(self, prediction, accept):
        if accept != "text/csv":
            raise Exception("Invalid accept: %s" % accept)
        return encoder.encode(prediction, accept)


Overwriting handler.py
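
The handler above is a four-stage pipeline: load the model, decode the request, predict, and encode the response. The following sketch mimics the last three stages in plain Python so the data flow is easier to follow. It is not the toolkit's implementation: `DummyModel` and `handle` are hypothetical stand-ins, and the CSV parsing here is a simplified substitute for `decoder.decode` and `encoder.encode`.

```python
def input_fn(input_data, content_type):
    # Accept only CSV, like default_input_fn in handler.py
    if content_type != "text/csv":
        raise ValueError("Invalid content-type: %s" % content_type)
    row = [float(value) for value in input_data.strip().split(",")]
    return [row]  # one sample, analogous to reshape(1, -1)


class DummyModel:
    """Hypothetical model: class 0 if petal length < 2.5, otherwise class 1."""

    def predict(self, rows):
        return [0.0 if row[2] < 2.5 else 1.0 for row in rows]


def predict_fn(payload, model):
    return list(model.predict(payload))


def output_fn(prediction, accept):
    # Return only CSV, like default_output_fn in handler.py
    if accept != "text/csv":
        raise ValueError("Invalid accept: %s" % accept)
    return ",".join(str(p) for p in prediction)


def handle(request_body, content_type, accept, model):
    # Chain the stages in the order the handler methods are named
    payload = input_fn(request_body, content_type)
    prediction = predict_fn(payload, model)
    return output_fn(prediction, accept)


response = handle("5.1,3.5,1.4,0.2", "text/csv", "text/csv", DummyModel())
```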


### 1.3 Now we need to create the entrypoint of our container: the main function
We'll use SageMaker Training Toolkit (https://github.com/aws/sagemaker-training-toolkit) to work with the arguments and environment variables defined by SageMaker. This library will make our code simpler.

In [3]:
%%writefile main.py
import train
import argparse
import sys
import os
import traceback
from sagemaker_inference import model_server
from sagemaker_training import environment

if __name__ == "__main__":
    if len(sys.argv) < 2 or sys.argv[1] not in ("serve", "train"):
        raise Exception("Invalid argument: you must pass 'train' for training mode or 'serve' for serving mode")
        
    if sys.argv[1] == "train":
        
        env = environment.Environment()
        
        parser = argparse.ArgumentParser()
        # https://github.com/aws/sagemaker-training-toolkit/blob/master/ENVIRONMENT_VARIABLES.md
        parser.add_argument("--max-depth", type=int, default=10)
        parser.add_argument("--n-jobs", type=int, default=env.num_cpus)
        parser.add_argument("--n-estimators", type=int, default=120)
        
        # reads input channels training and testing from the environment variables
        parser.add_argument("--train", type=str, default=env.channel_input_dirs["train"])
        parser.add_argument("--validation", type=str, default=env.channel_input_dirs["validation"])

        parser.add_argument("--model-dir", type=str, default=env.model_dir)
        parser.add_argument("--output-dir", type=str, default=env.output_dir)
        
        args,unknown = parser.parse_known_args()
        train.start(args)
    else:
        model_server.start_model_server(handler_service="serving.handler")

Overwriting main.py
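
`main.py` calls `parse_known_args` instead of `parse_args`, so the leading `train` word and any unrecognized flags do not abort the script. A minimal sketch of that behavior follows. The default of 2 for `--n-jobs` and the `--foo` flag are assumptions made for the example; in `main.py` the default comes from `env.num_cpus`.

```python
import argparse


def build_parser(default_n_jobs=2):
    # Same option names as main.py; default_n_jobs is an assumed value
    parser = argparse.ArgumentParser()
    parser.add_argument("--max-depth", type=int, default=10)
    parser.add_argument("--n-jobs", type=int, default=default_n_jobs)
    parser.add_argument("--n-estimators", type=int, default=120)
    return parser


parser = build_parser()

# "train" is the mode word; "--foo bar" is a hypothetical unknown option
args, unknown = parser.parse_known_args(
    ["train", "--max-depth", "20", "--foo", "bar"]
)

# argparse converts "--max-depth" into the attribute name "max_depth"
print(args.max_depth, args.n_jobs, args.n_estimators, unknown)
```

Options that are not passed keep their defaults, and everything the parser does not recognize is returned in the second value.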


### 1.4 Then, we can create the Dockerfile
Pay attention to the packages we'll install in our container. Here, we'll use SageMaker Inference Toolkit (https://github.com/aws/sagemaker-inference-toolkit) and SageMaker Training Toolkit (https://github.com/aws/sagemaker-training-toolkit) to prepare the container for training/serving our model. Serving here means exposing our model as a web service that can be called through an API.

In [5]:
%%writefile Dockerfile
FROM python:3.7-buster

# Set a docker label to advertise multi-model support on the container
LABEL com.amazonaws.sagemaker.capabilities.multi-models=false
# Set a docker label to enable container to use SAGEMAKER_BIND_TO_PORT environment variable if present
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true

RUN apt-get update -y && apt-get -y install --no-install-recommends default-jdk
RUN rm -rf /var/lib/apt/lists/*

RUN pip --no-cache-dir install multi-model-server sagemaker-inference sagemaker-training
RUN pip --no-cache-dir install pandas numpy scipy scikit-learn

ENV PYTHONUNBUFFERED=TRUE
ENV PYTHONDONTWRITEBYTECODE=TRUE
ENV PYTHONPATH="/opt/ml/code:${PATH}"

COPY main.py /opt/ml/code/main.py
COPY train.py /opt/ml/code/train.py
COPY handler.py /opt/ml/code/serving/handler.py

ENTRYPOINT ["python", "/opt/ml/code/main.py"]

Overwriting Dockerfile


### 1.5 Finally, let's create the buildspec
CodeBuild uses this file to create our container image.
With this file, CodeBuild will run the "docker build" command, using the assets we created above, and push the image to the registry.
As you can see, each command is a shell command that will be executed inside a Linux container.

In [6]:
%%writefile buildspec.yml
version: 0.2

phases:
  install:
    runtime-versions:
      docker: 18

  pre_build:
    commands:
      - echo Logging in to Amazon ECR...
      - $(aws ecr get-login --no-include-email --region $AWS_DEFAULT_REGION)
  build:
    commands:
      - echo Build started on `date`
      - echo Building the Docker image...
      - docker build -t $IMAGE_REPO_NAME:$IMAGE_TAG .
      - docker tag $IMAGE_REPO_NAME:$IMAGE_TAG $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG

  post_build:
    commands:
      - echo Build completed on `date`
      - echo Pushing the Docker image...
      - echo docker push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG
      - docker push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG
      - echo $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG > image.url
      - echo Done
artifacts:
  files:
    - image.url
  name: image_url
  discard-paths: yes

Overwriting buildspec.yml
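
The buildspec assembles the same ECR image URI several times from four environment variables. As a small sketch, the function below builds that URI in Python, which can be handy when you later need to reference the image from a notebook. The function name `ecr_image_uri` is hypothetical, and the account ID and region in the example are placeholder values.

```python
def ecr_image_uri(account_id, region, repo_name, image_tag):
    # Mirrors the pattern used in the buildspec:
    # $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG
    return "{}.dkr.ecr.{}.amazonaws.com/{}:{}".format(
        account_id, region, repo_name, image_tag
    )


# Placeholder account ID and region, for illustration only
uri = ecr_image_uri("123456789012", "us-east-1", "iris-model", "test")
print(uri)
```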


## PART 2 - Local Test: Let's build the image locally and do some tests
### 2.1 Building the image locally, first
Each SageMaker Jupyter Notebook already has a Docker environment pre-installed, so we can work with Docker containers in the same environment.

In [7]:
!docker build -f Dockerfile -t iris_model:1.0 .

Sending build context to Docker daemon    233kB
Step 1/14 : FROM python:3.7-buster
 ---> de1fe4b12444
Step 2/14 : LABEL com.amazonaws.sagemaker.capabilities.multi-models=false
 ---> Using cache
 ---> 7f7394407f70
Step 3/14 : LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true
 ---> Running in db7d90104242
Removing intermediate container db7d90104242
 ---> b6962f4b2fb1
Step 4/14 : RUN apt-get update -y && apt-get -y install --no-install-recommends default-jdk
 ---> Running in 07407b4da94d
Get:1 http://deb.debian.org/debian buster InRelease [122 kB]
Get:2 http://deb.debian.org/debian-security buster/updates InRelease [34.8 kB]
Get:3 http://deb.debian.org/debian buster-updates InRelease [56.6 kB]
Get:4 http://deb.debian.org/debian buster/main amd64 Packages [7911 kB]
Get:5 http://deb.debian.org/debian-security buster/updates/main amd64 Packages [347 kB]
Get:6 http://deb.debian.org/debian buster-updates/main amd64 Packages [8788 B]
Fetched 8479 kB in 2s (5022 kB/s)
Reading 

Selecting previously unselected package libavahi-common3:amd64.
Preparing to unpack .../02-libavahi-common3_0.7-4+deb10u1_amd64.deb ...
Unpacking libavahi-common3:amd64 (0.7-4+deb10u1) ...
Selecting previously unselected package libdbus-1-3:amd64.
Preparing to unpack .../03-libdbus-1-3_1.12.20-0+deb10u1_amd64.deb ...
Unpacking libdbus-1-3:amd64 (1.12.20-0+deb10u1) ...
Selecting previously unselected package libavahi-client3:amd64.
Preparing to unpack .../04-libavahi-client3_0.7-4+deb10u1_amd64.deb ...
Unpacking libavahi-client3:amd64 (0.7-4+deb10u1) ...
Selecting previously unselected package libcups2:amd64.
Preparing to unpack .../05-libcups2_2.2.10-6+deb10u6_amd64.deb ...
Unpacking libcups2:amd64 (2.2.10-6+deb10u6) ...
Selecting previously unselected package libnspr4:amd64.
Preparing to unpack .../06-libnspr4_2%3a4.20-1_amd64.deb ...
Unpacking libnspr4:amd64 (2:4.20-1) ...
Selecting previously unselected package libnss3:amd64.
Preparing to unpack .../07-libnss3_2%3a3.42.1-1+deb10u5_a

Setting up libpciaccess0:amd64 (0.14-1) ...
Setting up libxi6:amd64 (2:1.7.9-1) ...
Setting up java-common (0.71) ...
Setting up libglvnd0:amd64 (1.1.0-1) ...
Setting up libxtst6:amd64 (2:1.2.3-1) ...
Setting up libxcb-glx0:amd64 (1.13.1-2) ...
Setting up libsensors-config (1:3.5.0-3) ...
Setting up libxxf86vm1:amd64 (1:1.1.4-1+b2) ...
Setting up libxcb-present0:amd64 (1.13.1-2) ...
Setting up libasound2-data (1.1.8-1) ...
Setting up libnspr4:amd64 (2:4.20-1) ...
Setting up libxfixes3:amd64 (1:5.0.3-1) ...
Setting up libxcb-sync1:amd64 (1.13.1-2) ...
Setting up libavahi-common-data:amd64 (0.7-4+deb10u1) ...
Setting up libdbus-1-3:amd64 (1.12.20-0+deb10u1) ...
Setting up libpcsclite1:amd64 (1.8.24-1) ...
Setting up libsensors5:amd64 (1:3.5.0-3) ...
Setting up libglapi-mesa:amd64 (18.3.6-2+deb10u1) ...
Setting up libxcb-dri2-0:amd64 (1.13.1-2) ...
Setting up libgif7:amd64 (5.1.4-3) ...
Setting up libxshmfence1:amd64 (1.3-1) ...
Setting up libasound2:amd64 (1.1.8-1) ...
Setting up libllvm

done.
Setting up default-jre-headless (2:1.11-71) ...
Processing triggers for hicolor-icon-theme (0.17-2) ...
Processing triggers for libc-bin (2.28-10+deb10u1) ...
Processing triggers for ca-certificates (20200601~deb10u2) ...
Updating certificates in /etc/ssl/certs...
0 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d...

done.
done.
Processing triggers for mime-support (3.62) ...
Setting up openjdk-11-jre-headless:amd64 (11.0.16+8-1~deb10u1) ...
update-alternatives: using /usr/lib/jvm/java-11-openjdk-amd64/bin/java to provide /usr/bin/java (java) in auto mode
update-alternatives: using /usr/lib/jvm/java-11-openjdk-amd64/bin/jjs to provide /usr/bin/jjs (jjs) in auto mode
update-alternatives: using /usr/lib/jvm/java-11-openjdk-amd64/bin/keytool to provide /usr/bin/keytool (keytool) in auto mode
update-alternatives: using /usr/lib/jvm/java-11-openjdk-amd64/bin/rmid to provide /usr/bin/rmid (rmid) in auto mode
update-alternatives: using /usr/lib/jvm/java-11-openjdk

Collecting protobuf<3.20,>=3.9.2
  Downloading protobuf-3.19.4-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 273.0 MB/s eta 0:00:00
Collecting cryptography>=2.5
  Downloading cryptography-37.0.4-cp36-abi3-manylinux_2_24_x86_64.whl (4.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.1/4.1 MB 93.0 MB/s eta 0:00:00
Collecting bcrypt>=3.1.3
  Downloading bcrypt-4.0.0-cp36-abi3-manylinux_2_28_x86_64.whl (594 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 594.4/594.4 KB 272.3 MB/s eta 0:00:00
Collecting pynacl>=1.0.1
  Downloading PyNaCl-1.5.0-cp36-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (856 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 856.7/856.7 KB 161.9 MB/s eta 0:00:00
Collecting MarkupSafe>=2.1.1
  Downloading MarkupSafe-2.1.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (25 kB)
Collecting botocore<1.28.0,>=1.27.64
  Downloading botocore-1.27.64-py3-none-a

You should consider upgrading via the '/usr/local/bin/python -m pip install --upgrade pip' command.
Removing intermediate container c6ff4ba230af
 ---> 3b65fc9296bb
Step 8/14 : ENV PYTHONUNBUFFERED=TRUE
 ---> Running in 4eaf51ecd6af
Removing intermediate container 4eaf51ecd6af
 ---> f9d171426e2c
Step 9/14 : ENV PYTHONDONTWRITEBYTECODE=TRUE
 ---> Running in 08a242a089bf
Removing intermediate container 08a242a089bf
 ---> f18ef4a1c1db
Step 10/14 : ENV PYTHONPATH="/opt/ml/code:${PATH}"
 ---> Running in a2ac724f0932
Removing intermediate container a2ac724f0932
 ---> ad291ebd58ce
Step 11/14 : COPY main.py /opt/ml/code/main.py
 ---> 8c54bd046f39
Step 12/14 : COPY train.py /opt/ml/code/train.py
 ---> e442e17d2d95
Step 13/14 : COPY handler.py /opt/ml/code/serving/handler.py
 ---> cee1f2c9a78e
Step 14/14 : ENTRYPOINT ["python", "/opt/ml/code/main.py"]
 ---> Running in f1228ec0ce12
Removing intermediate container f1228ec0ce12
 ---> 4d5e03e7f9d2
Successfully built 4d5e03e7f9d2
Successfully tagg

### 2.2 Now that we have the algorithm image, we can run it to train/deploy a model
First, we need to prepare the dataset.
You'll see that we're splitting the dataset into training and validation and also saving these two subsets of the dataset into csv files. These files will be then uploaded to an S3 Bucket and shared with SageMaker.

In [8]:
!rm -rf input
!mkdir -p input/data/train
!mkdir -p input/data/validation

import pandas as pd
import numpy as np

from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()

dataset = np.insert(iris.data, 0, iris.target, axis=1)

df= pd.DataFrame(data=dataset, columns=['iris_id'] + iris.feature_names)
X = df.iloc[:, 1:]
y = df.iloc[:, 0]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=123)

train_df = X_train.copy()
train_df.insert(0, "iris_id", y_train)
train_df.to_csv("input/data/train/training.csv", sep=',', header=None, index=None)

test_df = X_test.copy()
test_df.insert(0, "iris_id", y_test)
test_df.to_csv("input/data/validation/testing.csv", sep=',', header=None, index=None)

df.head()

| | iris_id | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) |
|---|---|---|---|---|---|
| 0 | 0.0 | 5.1 | 3.5 | 1.4 | 0.2 |
| 1 | 0.0 | 4.9 | 3.0 | 1.4 | 0.2 |
| 2 | 0.0 | 4.7 | 3.2 | 1.3 | 0.2 |
| 3 | 0.0 | 4.6 | 3.1 | 1.5 | 0.2 |
| 4 | 0.0 | 5.0 | 3.6 | 1.4 | 0.2 |
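
The CSV files written above have no header and keep the label in the first column, which is the layout `load_dataset` in `train.py` expects. The sketch below reproduces that layout with the standard `csv` module instead of pandas, purely to make the convention explicit. The helper names `write_rows` and `read_label_first` are hypothetical, and the two rows are taken from the `df.head()` output.

```python
import csv
import os
import tempfile

# Two rows from the table above: label first, then the four features
rows = [
    [0.0, 5.1, 3.5, 1.4, 0.2],
    [0.0, 4.9, 3.0, 1.4, 0.2],
]


def write_rows(path, rows):
    # No header row, comma separated
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)


def read_label_first(path):
    # Split each record into label (column 0) and features (the rest)
    X, y = [], []
    with open(path, newline="") as f:
        for record in csv.reader(f):
            values = [float(v) for v in record]
            y.append(values[0])
            X.append(values[1:])
    return X, y


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "training.csv")
    write_rows(path, rows)
    X, y = read_label_first(path)
```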


### 2.3 Just a basic local test, using the local Docker daemon
Here we will simulate SageMaker calling our docker container for training and serving. We'll do that using the built-in Docker Daemon of the Jupyter Notebook Instance.

In [9]:
!rm -rf input/config && mkdir -p input/config

In [10]:
%%writefile input/config/hyperparameters.json
{"max_depth":20, "n_jobs":4, "n_estimators":120}

Writing input/config/hyperparameters.json
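
Note that `main.py`, as written, takes its hyperparameters from command-line arguments and their defaults, so this file only has an effect if something reads it. In a SageMaker training job the values in this file arrive as strings, so they need to be cast before use. The following is a minimal sketch of reading and casting them, assuming a hypothetical helper `load_hyperparameters` and a hand-written type map. It is not part of the toolkit.

```python
import json
import os
import tempfile


def load_hyperparameters(path, types):
    """Read a hyperparameters JSON file and cast the known keys."""
    with open(path) as f:
        raw = json.load(f)
    # Keys missing from the type map are ignored
    return {name: types[name](value) for name, value in raw.items() if name in types}


# Hypothetical type map for the three hyperparameters used in this exercise
types = {"max_depth": int, "n_jobs": int, "n_estimators": int}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "hyperparameters.json")
    with open(path, "w") as f:
        # String values, as a training job would provide them
        json.dump({"max_depth": "20", "n_jobs": "4", "n_estimators": "120"}, f)
    hyperparameters = load_hyperparameters(path, types)
```

The resulting dictionary could then be merged into the values passed to `model.set_params`.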


In [11]:
%%writefile input/config/resourceconfig.json
{"current_host":"localhost", "hosts":["algo-1-kipw9"]}

Writing input/config/resourceconfig.json


In [12]:
%%writefile input/config/inputdataconfig.json
{"train":{"TrainingInputMode":"File"}, "validation":{"TrainingInputMode":"File"}}

Writing input/config/inputdataconfig.json
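
Each key in this file names an input channel. Because the local run mounts `$PWD/input` at `/opt/ml/input`, the files saved under `input/data/train` and `input/data/validation` show up inside the container under `/opt/ml/input/data/<channel>`. A small sketch of that mapping, with `channel_dirs` as a hypothetical helper:

```python
import json
import posixpath


def channel_dirs(input_data_config, base_dir="/opt/ml/input/data"):
    # One directory per channel name declared in inputdataconfig.json
    return {
        channel: posixpath.join(base_dir, channel)
        for channel in input_data_config
    }


config = json.loads(
    '{"train":{"TrainingInputMode":"File"}, "validation":{"TrainingInputMode":"File"}}'
)
dirs = channel_dirs(config)
```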


In [13]:
%%time
!rm -rf model
!mkdir -p model
print("Training...")
!docker run --rm --name "my_model" \
    -v "$PWD/model:/opt/ml/model" \
    -v "$PWD/model:/opt/ml/output" \
    -v "$PWD/input:/opt/ml/input" iris_model:1.0 train

Training...
Training mode
Training the classifier
[Parallel(n_jobs=2)]: Using backend ThreadingBackend with 2 concurrent workers.
[Parallel(n_jobs=2)]: Done  46 tasks      | elapsed:    0.1s
[Parallel(n_jobs=2)]: Done 120 out of 120 | elapsed:    0.1s finished
[Parallel(n_jobs=2)]: Using backend ThreadingBackend with 2 concurrent workers.
[Parallel(n_jobs=2)]: Done  46 tasks      | elapsed:    0.0s
[Parallel(n_jobs=2)]: Done 120 out of 120 | elapsed:    0.0s finished
Score: 0.94
CPU times: user 35.7 ms, sys: 20.8 ms, total: 56.5 ms
Wall time: 2.53 s


### 2.4 This is the serving test. It simulates an Endpoint exposed by SageMaker
After you execute the next cell, it will keep running and block this Jupyter notebook. A web service will be exposed on port 8080.

In [14]:
!docker run --rm --name "my_model" \
    -p 8080:8080 \
    -v "$PWD/model:/opt/ml/model" \
    -v "$PWD/input:/opt/ml/input" iris_model:1.0 serve

2022-09-01T19:11:10,933 [INFO ] main com.amazonaws.ml.mms.ModelServer - 
MMS Home: /usr/local/lib/python3.7/site-packages
Current directory: /
Temp directory: /tmp
Number of GPUs: 0
Number of CPUs: 2
Max heap size: 966 M
Python executable: /usr/local/bin/python
Config file: /etc/sagemaker-mms.properties
Inference address: http://0.0.0.0:8080
Management address: http://0.0.0.0:8080
Model Store: /.sagemaker/mms/models
Initial Models: ALL
Log dir: null
Metrics dir: null
Netty threads: 0
Netty client threads: 0
Default workers per model: 2
Blacklist Regex: N/A
Maximum Response Size: 6553500
Maximum Request Size: 6553500
Preload model: false
Prefer direct buffer: false
2022-09-01T19:11:11,078 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-9000-model
2022-09-01T19:11:11,217 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - model_service_worker started with args: --sock-type unix --sock-name /tmp/.mms.sock.9000 --handler ser

While the above cell is running, run some tests using **02_test_local_model_server.ipynb**

After you finish the tests, press **STOP**
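
For reference, a request to the local server could look like the sketch below. SageMaker serving containers answer inference requests on `/invocations`, and the handler in this exercise accepts and returns `text/csv`. The function names are hypothetical, the host assumes the `-p 8080:8080` mapping used above, and this is not necessarily how the test notebook does it.

```python
import urllib.request


def build_invocation_request(features, host="http://localhost:8080"):
    # One CSV row with the four flower measurements
    body = ",".join(str(v) for v in features).encode("utf-8")
    return urllib.request.Request(
        url=host + "/invocations",
        data=body,
        method="POST",
        headers={"Content-Type": "text/csv", "Accept": "text/csv"},
    )


def invoke(request, timeout=5):
    # Only call this while the container from the previous cell is running
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


request = build_invocation_request([5.1, 3.5, 1.4, 0.2])
```

Calling `invoke(request)` from another notebook while the server is up should return the predicted class as CSV text.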

## PART 3 - Integrated Test: Everything seems OK, now it's time to put it all together
We'll start by running a local CodeBuild test, to check the buildspec and also deploy this image into the container registry. Remember that SageMaker will only see images published to ECR.

In [16]:
import boto3

sts_client = boto3.client("sts")
session = boto3.session.Session()

account_id = sts_client.get_caller_identity()["Account"]
region = session.region_name
credentials = session.get_credentials()
credentials = credentials.get_frozen_credentials()

repo_name = "iris-model"
image_tag = "test"

In [17]:
!sudo rm -rf tests && mkdir -p tests
!cp handler.py main.py train.py Dockerfile buildspec.yml tests/
# The with block closes the file automatically, so no explicit close() is needed
with open("tests/vars.env", "w") as f:
    f.write("AWS_ACCOUNT_ID=%s\n" % account_id)
    f.write("IMAGE_TAG=%s\n" % image_tag)
    f.write("IMAGE_REPO_NAME=%s\n" % repo_name)
    f.write("AWS_DEFAULT_REGION=%s\n" % region)
    f.write("AWS_ACCESS_KEY_ID=%s\n" % credentials.access_key)
    f.write("AWS_SECRET_ACCESS_KEY=%s\n" % credentials.secret_key)
    f.write("AWS_SESSION_TOKEN=%s\n" % credentials.token)

!cat tests/vars.env

AWS_ACCOUNT_ID=523666378432
IMAGE_TAG=test
IMAGE_REPO_NAME=iris-model
AWS_DEFAULT_REGION=ap-south-1
AWS_ACCESS_KEY_ID=<redacted>
AWS_SECRET_ACCESS_KEY=<redacted>
AWS_SESSION_TOKEN=<redacted>
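
Printing `vars.env` with `cat` exposes the temporary credentials in the notebook output. One possible alternative, sketched below, is to mask sensitive values before displaying the file. The helper `mask_env_line` and the marker list are assumptions made for this example, and the sample lines are made-up values.

```python
# Substrings that mark a variable name as sensitive (an assumed list)
SENSITIVE_MARKERS = ("SECRET", "TOKEN", "ACCESS_KEY")


def mask_env_line(line):
    """Return the line with its value hidden if the key looks sensitive."""
    text = line.rstrip("\n")
    key, sep, _value = text.partition("=")
    if sep and any(marker in key for marker in SENSITIVE_MARKERS):
        return key + "=****"
    return text


# Made-up sample lines, not real credentials
lines = [
    "IMAGE_TAG=test\n",
    "AWS_ACCESS_KEY_ID=not-a-real-key\n",
    "AWS_SECRET_ACCESS_KEY=not-a-real-secret\n",
]
masked = [mask_env_line(line) for line in lines]
print("\n".join(masked))
```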


In [20]:
%%time

!/tmp/aws-codebuild/local_builds/codebuild_build.sh \
    -a "$PWD/tests/output" \
    -s "$PWD/tests" \
    -i "samirsouza/aws-codebuild-standard:3.0" \
    -e "$PWD/tests/vars.env" \
    -c

/bin/bash: /tmp/aws-codebuild/local_builds/codebuild_build.sh: No such file or directory
CPU times: user 6.15 ms, sys: 3.79 ms, total: 9.94 ms
Wall time: 122 ms


Now that we have an image deployed in the ECR repo we can also run some local tests using the SageMaker Estimator.

Open the test notebook (**03_test_container_using_SageMaker.ipynb**) to run some tests.

After you finish the tests, come back to this notebook to push the assets to the Git repo.

## PART 4 - Let's push all the assets to the Git Repo connected to the Build pipeline
A CodePipeline pipeline is configured to listen to this Git repo and start a new build with CodeBuild.

In [19]:
%%bash
cd ../../../mlops
git checkout iris_model
cp $OLDPWD/buildspec.yml $OLDPWD/handler.py $OLDPWD/train.py $OLDPWD/main.py $OLDPWD/Dockerfile .

git add --all
git commit -a -m " - files for building an iris model image"
git push

[aws 390fe2f]  - files for building an iris model image
 24 files changed, 3714 insertions(+), 12 deletions(-)
 create mode 100644 01_CreateAlgorithmContainer/.ipynb_checkpoints/02_test_local_model_server-checkpoint.ipynb
 create mode 100644 01_CreateAlgorithmContainer/.ipynb_checkpoints/03_test_container_using_SageMaker-checkpoint.ipynb
 create mode 100644 01_CreateAlgorithmContainer/.ipynb_checkpoints/Untitled-checkpoint.ipynb
 create mode 100644 01_CreateAlgorithmContainer/02_test_local_model_server.ipynb
 create mode 100644 01_CreateAlgorithmContainer/03_test_container_using_SageMaker.ipynb
 create mode 100644 01_CreateAlgorithmContainer/Dockerfile
 create mode 100644 01_CreateAlgorithmContainer/Untitled.ipynb
 create mode 100644 01_CreateAlgorithmContainer/buildspec.yml
 create mode 100644 01_CreateAlgorithmContainer/handler.py
 create mode 100644 01_CreateAlgorithmContainer/input/config/hyperparameters.json
 create mode 100644 01_CreateAlgorithmContainer/input/config/inputdatacon

bash: line 1: cd: ../../../mlops: No such file or directory
error: pathspec 'iris_model' did not match any file(s) known to git
cp: cannot stat ‘/buildspec.yml’: No such file or directory
cp: cannot stat ‘/handler.py’: No such file or directory
cp: cannot stat ‘/train.py’: No such file or directory
cp: cannot stat ‘/main.py’: No such file or directory
cp: cannot stat ‘/Dockerfile’: No such file or directory
fatal: could not read Username for 'https://github.com/CrookedNoob/aws_mlops_practice.git': No such device or address


CalledProcessError: Command 'b'cd ../../../mlops\ngit checkout iris_model\ncp $OLDPWD/buildspec.yml $OLDPWD/handler.py $OLDPWD/train.py $OLDPWD/main.py $OLDPWD/Dockerfile .\n\ngit add --all\ngit commit -a -m " - files for building an iris model image"\ngit push\n'' returned non-zero exit status 128.