# Deploy Scikit-learn Models to Amazon SageMaker with the SageMaker Python SDK using Script mode
> The aim of this notebook is to demonstrate how to train and deploy a scikit-learn model in Amazon SageMaker using script mode.

- toc: true 
- badges: true
- comments: true
- categories: [aws, ml, sagemaker]
- keyword: [aws, ml, sagemaker, sklearn, scikit-learn, python]
- image: images/copied_from_nb/images/2022-07-07-sagemaker-script-mode.jpeg

![](images/2022-07-07-sagemaker-script-mode.jpeg)

# Introduction
You may have trained a model with your favorite ML framework, and now you are asked to move your code to Amazon SageMaker. The good news is that SageMaker's fully managed training works well with many popular ML frameworks, including `scikit-learn`. In addition, SageMaker provides a prebuilt container for the scikit-learn framework, enabling us to seamlessly port our scripts to SageMaker and benefit from its training and deployment capabilities. The SageMaker Scikit-learn Container is an open source library for making the scikit-learn framework run on the Amazon SageMaker platform. You can read more about its features on its GitHub page [SageMaker Scikit-learn Container](https://github.com/aws/sagemaker-scikit-learn-container).

Amazon SageMaker also provides an open source Python SDK to train and deploy models on SageMaker. The SDK provides several high-level abstractions (classes), including:
* `Session` Provides a collection of methods for working with SageMaker resources 
* `Estimators` Encapsulate training on SageMaker
* `Predictors` Provide real-time inference and transformation using Python data types against a SageMaker endpoint

You can read more about the SageMaker Python SDK on its official site [Amazon SageMaker Python SDK](https://sagemaker.readthedocs.io/en/stable/overview.html).

This approach of using a custom training script with SageMaker's prebuilt container is commonly called **Script Mode**. Training a scikit-learn model with the SageMaker Python SDK involves three steps:

1. **Prepare a training script**. The training script is similar to any other scikit-learn training script that you might use outside of SageMaker
2. **Create an Estimator object from class `sagemaker.sklearn.SKLearn`**. The scikit-learn estimator class handles end-to-end training and deployment of custom scikit-learn code. We pass our training script to the SKLearn estimator, and it executes the script within a SageMaker Training Job. The training job runs an Amazon-built Docker container that executes the functions defined in the provided Python script.
3. **Call the Estimator's `fit` method on training data**. Training is started by calling `fit()` on this Estimator. After training is complete, calling `deploy()` creates a hosted SageMaker endpoint and returns a `SKLearnPredictor` instance that can be used to perform inference against the hosted model. We will discuss the `SKLearn` Estimator in more detail later in this post.

To read more about using scikit-learn with the SageMaker Python SDK, you may refer to the official documentation [using Scikit-learn with the SageMaker Python SDK](https://sagemaker.readthedocs.io/en/stable/frameworks/sklearn/using_sklearn.html). The official documentation is valuable, and I would highly recommend checking it and keeping it as a reference.

In this post we will build a scikit-learn [RandomForestClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html) on the [iris public dataset](https://archive.ics.uci.edu/ml/datasets/iris). There is a similar example in the SageMaker documentation: [Train a SKLearn Model using Script Mode](https://sagemaker-examples.readthedocs.io/en/latest/sagemaker-script-mode/sklearn/sklearn_byom_outputs.html). However, it does not discuss many important aspects of the scikit-learn container and its environment. In this post, we will learn about them and cover all the details of training a scikit-learn model with script mode. I also noted that the example in the documentation uses `RandomForestRegressor` on a classification problem, which I believe is a mistake.

We have much to cover and learn, so let's start.

# Environment
This notebook was prepared on an Amazon SageMaker notebook instance of type `ml.t3.medium` with the "conda_python3" kernel.

In [1]:
!aws --version

aws-cli/1.22.97 Python/3.8.12 Linux/5.10.102-99.473.amzn2.x86_64 botocore/1.24.46


In [2]:
!cat /etc/os-release

NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"


In [3]:
!python3 --version

Python 3.8.12


In [4]:
#collapse-output
!conda env list

# conda environments:
#
base                     /home/ec2-user/anaconda3
JupyterSystemEnv         /home/ec2-user/anaconda3/envs/JupyterSystemEnv
R                        /home/ec2-user/anaconda3/envs/R
amazonei_mxnet_p36       /home/ec2-user/anaconda3/envs/amazonei_mxnet_p36
amazonei_pytorch_latest_p37     /home/ec2-user/anaconda3/envs/amazonei_pytorch_latest_p37
amazonei_tensorflow2_p36     /home/ec2-user/anaconda3/envs/amazonei_tensorflow2_p36
mxnet_p37                /home/ec2-user/anaconda3/envs/mxnet_p37
python3               *  /home/ec2-user/anaconda3/envs/python3
pytorch_p38              /home/ec2-user/anaconda3/envs/pytorch_p38
tensorflow2_p38          /home/ec2-user/anaconda3/envs/tensorflow2_p38



# Prepare training and test data
We will use the **Iris flower dataset**. It includes three iris species (Iris setosa, Iris virginica, and Iris versicolor) with 50 samples each. Four features were measured for each sample: the length and the width of the sepals and petals, in centimeters. We can train a model to distinguish the species from each other based on the combination of these four features. You can read more about this dataset at [Iris flower data set](https://en.wikipedia.org/wiki/Iris_flower_data_set). The dataset has five columns:
1. sepal length in cm
2. sepal width in cm
3. petal length in cm
4. petal width in cm
5. class: Iris Setosa, Iris Versicolour, Iris Virginica

## Download and preprocess data

In [5]:
##
# download dataset
import boto3
import pandas as pd
import numpy as np

s3 = boto3.client("s3")
s3.download_file(
    "sagemaker-sample-files", "datasets/tabular/iris/iris.data", "iris.data"
)

df = pd.read_csv(
    "iris.data",
    header=None,
    names=["sepal_len", "sepal_wid", "petal_len", "petal_wid", "class"],
)
df.head()

|   | sepal_len | sepal_wid | petal_len | petal_wid | class |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa |


In [6]:
##
# Convert the three classes from strings to integers in {0,1,2}
df["class_cat"] = df["class"].astype("category").cat.codes
categories_map = dict(enumerate(df["class"].astype("category").cat.categories))
print(categories_map)
df.head()

{0: 'Iris-setosa', 1: 'Iris-versicolor', 2: 'Iris-virginica'}


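|   | sepal_len | sepal_wid | petal_len | petal_wid | class | class_cat |
|---|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa | 0 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa | 0 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa | 0 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa | 0 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa | 0 |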
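
For readers less familiar with pandas categoricals, here is a minimal plain-Python sketch of what the encoding cell does. It assumes string labels, for which pandas orders the categories lexically. The helper name `encode_labels` is hypothetical and not part of pandas.

```python
def encode_labels(labels):
    """Map each distinct label to an integer code, in sorted label order."""
    categories = sorted(set(labels))
    code_of = {label: code for code, label in enumerate(categories)}
    codes = [code_of[label] for label in labels]
    # Same shape as `categories_map` in the notebook: code -> label
    categories_map = dict(enumerate(categories))
    return codes, categories_map


labels = ["Iris-virginica", "Iris-setosa", "Iris-versicolor", "Iris-setosa"]
codes, categories_map = encode_labels(labels)
print(codes)
print(categories_map)
```

Keeping `categories_map` around is useful later, because it lets us translate integer predictions back to species names.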


## Prepare and store train and test sets as CSV files

In [7]:
##
# split the data into train and test set
from sklearn.model_selection import train_test_split

train, test = train_test_split(df, test_size=0.2, random_state=42)

print(f"train.shape: {train.shape}")
print(f"test.shape: {test.shape}")

train.shape: (120, 6)
test.shape: (30, 6)


We have our dataset ready. Let's define a local directory `local_path` to keep all the files and artifacts related to this post. I will refer to this directory as 'workspace'.

In [8]:
##
# `local_path` will be the root directory for this post.
local_path = "./datasets/2022-07-07-sagemaker-script-mode"

We have train and test sets ready. Let's create two more directories in our workspace and store our data in them.

In [9]:
from pathlib import Path

# local paths
local_train_path = local_path + "/train"
local_test_path = local_path + "/test"

# create local directories
Path(local_train_path).mkdir(parents=True, exist_ok=True)
Path(local_test_path).mkdir(parents=True, exist_ok=True)

print("local_train_path: ", local_train_path)
print("local_test_path: ", local_test_path)

# local file names
local_train_file = local_train_path + "/train.csv"
local_test_file = local_test_path + "/test.csv"

# write train and test CSV files
train.to_csv(local_train_file, index=False)
test.to_csv(local_test_file, index=False)

print("local_train_file: ", local_train_file)
print("local_test_file: ", local_test_file)

local_train_path:  ./datasets/2022-07-07-sagemaker-script-mode/train
local_test_path:  ./datasets/2022-07-07-sagemaker-script-mode/test
local_train_file:  ./datasets/2022-07-07-sagemaker-script-mode/train/train.csv
local_test_file:  ./datasets/2022-07-07-sagemaker-script-mode/test/test.csv


## Create SageMaker session

In [10]:
import sagemaker

session = sagemaker.Session()
role = sagemaker.get_execution_role()
bucket = session.default_bucket()
region = session.boto_region_name

print("sagemaker.__version__: ", sagemaker.__version__)
print("Session: ", session)
print("Role: ", role)
print("Bucket: ", bucket)
print("Region: ", region)

sagemaker.__version__:  2.99.0
Session:  <sagemaker.session.Session object at 0x7f6b5415c640>
Role:  arn:aws:iam::801598032724:role/service-role/AmazonSageMakerServiceCatalogProductsUseRole
Bucket:  sagemaker-us-east-1-801598032724
Region:  us-east-1


Here is what we have done:
* imported the SageMaker Python SDK into our runtime
* created a session to work with the SageMaker API and other AWS services
* retrieved the execution role associated with the user profile. This is the same profile the user works with from the console UI, and it has the `AmazonSageMakerFullAccess` policy attached.
* retrieved the name of the default bucket. The default bucket name has the format `sagemaker-{region}-{account_id}`. If the bucket doesn't exist, the session will create it automatically. You may also use any other bucket in its place, provided you have permission to read from and write to it.
* retrieved the region name attached to our session
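
As a small illustration of the default bucket naming format mentioned above, the sketch below rebuilds the name from its two parts. The function `default_bucket_name` is a hypothetical helper written for this post; in practice `session.default_bucket()` does this for us.

```python
def default_bucket_name(region: str, account_id: str) -> str:
    """Build the default bucket name: sagemaker-{region}-{account_id}."""
    return f"sagemaker-{region}-{account_id}"


# Values taken from the session output printed above
print(default_bucket_name("us-east-1", "801598032724"))
```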

Next, we will use this session to upload data to our default bucket. 

## Upload data to Amazon S3 bucket

In [11]:
##
# You may choose any other prefix for your bucket.
# All the data related to this post will be under this prefix.
bucket_prefix = "2022-07-07-sagemaker-script-mode"

Now upload the data. In the output, we will get the complete path (S3 URI) for our uploaded data.

In [12]:
s3_train_uri = session.upload_data(local_train_file, key_prefix=bucket_prefix + "/data")
s3_test_uri = session.upload_data(local_test_file, key_prefix=bucket_prefix + "/data")

print("s3_train_uri: ", s3_train_uri)
print("s3_test_uri: ", s3_test_uri)

s3_train_uri:  s3://sagemaker-us-east-1-801598032724/2022-07-07-sagemaker-script-mode/data/train.csv
s3_test_uri:  s3://sagemaker-us-east-1-801598032724/2022-07-07-sagemaker-script-mode/data/test.csv


At this point, our data preparation step is complete. Train and test CSV files are available on the local system and in our default Amazon S3 bucket.

# Prepare SageMaker local environment
The Amazon SageMaker training environment is managed, but SageMaker Python SDK also supports **local mode**, allowing you to train and deploy models to your local environment. This is a great way to test training scripts before running them in SageMaker's managed training or hosting environment.

## How the SageMaker managed environment works
When you send a request to the SageMaker API (a `fit` or `deploy` call), SageMaker
* spins up new instances with the provided specification
* loads the algorithm container
* pulls the data from S3
* runs the training code
* stores the results and trained model artifacts to S3
* terminates the new instances

All this happens behind the scenes with a single line of code and is a huge advantage. Spinning up new hardware every time can be good for repeatability and security, but it can add some friction while testing and debugging our code. We can test our code on a small dataset in our local environment with SageMaker local mode and then switch seamlessly to SageMaker managed environment by changing a single line of code.
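
To make the "single line" switch concrete, here is a minimal sketch of how the choice between local and managed runs could be isolated in one place. The helper `estimator_settings` is hypothetical, and `ml.m5.large` is only an example of a managed instance type; pick whatever fits your workload.

```python
def estimator_settings(run_locally: bool) -> dict:
    """Return the estimator keyword arguments that depend on where we train."""
    return {
        # "local" runs the container on this machine through Docker.
        # "ml.m5.large" is an example of a managed instance type.
        "instance_type": "local" if run_locally else "ml.m5.large",
        "instance_count": 1,
    }


print(estimator_settings(run_locally=True))
print(estimator_settings(run_locally=False))
```

The returned dictionary could be unpacked into the estimator constructor with `**`, so the rest of the notebook stays the same in both modes.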

## Steps to prepare Amazon SageMaker local environment
Install the following pre-requisites if you want to set up Amazon SageMaker on your local system.
1. Install required Python packages:
    ```
    pip install boto3 sagemaker pandas scikit-learn
    pip install 'sagemaker[local]'
    ```
2. Make sure Docker Desktop is installed and running on your computer:
    ```
    docker ps
    ```
3. You should have AWS credentials configured on your local machine to be able to pull the docker image from ECR.

### Instructions for SageMaker notebook instances
You can also set up SageMaker's local environment in SageMaker notebook instances. The required Python packages and the Docker service are already there. You only need to upgrade the `sagemaker[local]` Python package.

In [13]:
#collapse_output
# this is required for SageMaker notebook instances
!pip install 'sagemaker[local]' --upgrade

Looking in indexes: https://pypi.org/simple, https://pip.repos.neuron.amazonaws.com
You should consider upgrading via the '/home/ec2-user/anaconda3/envs/python3/bin/python -m pip install --upgrade pip' command.

### Instructions for SageMaker Studio environment
Note that SageMaker `local` mode will not work in the SageMaker Studio environment, as the provided instances do not have the Docker service installed.

## Create SageMaker local session

A SageMaker local session is required for working in a local environment. Let's create it.

In [14]:
from sagemaker.local import LocalSession

session_local = LocalSession()
session_local

<sagemaker.local.local_session.LocalSession at 0x7f6b4f3169a0>

In [15]:
##
# configure local session
session_local.config = {"local": {"local_code": True}}

# Prepare SageMaker training script

We will call our training script `train_and_serve.py` and place it in our workspace under the `/src` folder. Then, we will start with a simple `Hello World` message code. After that, we will update and complete our training script as we learn more about the SageMaker `scikit-learn` container environment.

In [16]:
script_file_name = "train_and_serve.py"
script_path = local_path + "/src"
script_file = script_path + "/" + script_file_name

print("script_file_name: ", script_file_name)
print("script_path: ", script_path)
print("script_file: ", script_file)

script_file_name:  train_and_serve.py
script_path:  ./datasets/2022-07-07-sagemaker-script-mode/src
script_file:  ./datasets/2022-07-07-sagemaker-script-mode/src/train_and_serve.py


In [17]:
##
# make sure that the directory exists
Path(script_path).mkdir(parents=True, exist_ok=True)

Now the training script.

In [18]:
%%writefile $script_file

if __name__ == "__main__":
    print("*** Hello from the SageMaker script mode***")

Overwriting ./datasets/2022-07-07-sagemaker-script-mode/src/train_and_serve.py


# Prepare SageMaker SKLearn estimator

To create an SKLearn Estimator object, we need to pass it the following items:
* **`entry_point (str)`** Path (absolute or relative) to the Python source file, which should be executed as the entry point to training
* **`framework_version (str)`** Scikit-learn version you want to use for executing your model training code
* **`role (str)`** An AWS IAM role (either name or full ARN)
* **`instance_type (str)`** Type of instance to use for training. For local mode use string **`local`**
* **`instance_count (int)`** Number of instances to use for training. Since we will train in the local environment and have a single instance, we will use '1' here

You can read more about the SKLearn Estimator class in the official documentation [Scikit Learn Estimator](https://sagemaker.readthedocs.io/en/stable/frameworks/sklearn/sagemaker.sklearn.html).

Let's find the SKLearn framework version.

In [19]:
import sklearn

print(sklearn.__version__)

1.0.1


Note that version number `1.0.1` has to be provided to the SKLearn estimator class as **`1.0-1`**. Otherwise, you will get the following error message.
```
ValueError: Unsupported sklearn version: 1.0.1. You may need to upgrade your SDK version (pip install -U sagemaker) for newer sklearn versions. Supported sklearn version(s): 0.20.0, 0.23-1, 1.0-1.
```
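
To avoid typing the version string by hand, a small helper could pick the matching entry from the supported list. This is only a sketch: `to_framework_version` is a hypothetical function, the supported list is copied from the error message above and will change with SDK releases, and matching on major.minor is my assumption about how the two naming schemes line up.

```python
SUPPORTED_VERSIONS = ["0.20.0", "0.23-1", "1.0-1"]  # copied from the error message


def to_framework_version(installed: str, supported=SUPPORTED_VERSIONS) -> str:
    """Return the supported version string whose major.minor matches `installed`."""
    major_minor = ".".join(installed.split(".")[:2])
    for candidate in supported:
        # "1.0-1" -> "1.0", "0.20.0" -> "0.20"
        candidate_major_minor = ".".join(candidate.split("-")[0].split(".")[:2])
        if candidate_major_minor == major_minor:
            return candidate
    raise ValueError(f"No supported framework_version matches sklearn {installed}")


print(to_framework_version("1.0.1"))
```

Note that the patch version inside the container may still differ from the one installed locally, so treat the result as a starting point.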

Now let us create the SageMaker SKLearn estimator object and pass our training script to it.

In [20]:
#collapse-output
from sagemaker.sklearn import SKLearn

sk_estimator = SKLearn(
    entry_point=script_file,
    role=role,
    instance_count=1,
    instance_type="local",
    framework_version="1.0-1"
)

sk_estimator.fit()

Creating l5ul3ase2t-algo-1-ff4c7 ... 
Creating l5ul3ase2t-algo-1-ff4c7 ... done
Attaching to l5ul3ase2t-algo-1-ff4c7
l5ul3ase2t-algo-1-ff4c7 | 2022-07-17 12:25:06,483 sagemaker-containers INFO     Imported framework sagemaker_sklearn_container.training
l5ul3ase2t-algo-1-ff4c7 | 2022-07-17 12:25:06,487 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
l5ul3ase2t-algo-1-ff4c7 | 2022-07-17 12:25:06,497 sagemaker_sklearn_container.training INFO     Invoking user training script.
l5ul3ase2t-algo-1-ff4c7 | 2022-07-17 12:25:06,728 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
l5ul3ase2t-algo-1-ff4c7 | 2022-07-17 12:25:06,742 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
l5ul3ase2t-algo-1-ff4c7 | 2022-07-17 12:25:06,755 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
l5ul3ase2t-algo-1-ff4c7 |

In [21]:
##
# The estimator will pick a local session when we use instance_type='local'
sk_estimator.sagemaker_session

<sagemaker.local.local_session.LocalSession at 0x7f6b4f051130>

When you first run the SKLearn estimator, executing it may take some time as it has to download the scikit-learn container to the local docker environment. You will get the container logs in the output when the container completes the execution. The logs show that the container has successfully run the training script, and the `hello` message is also printed. But there is a lot more information available in the logs. We will discuss it in the coming section.

![sklearn-output-1](images/2022-07-07-sagemaker-script-mode/sklearn-output-1.png)

# Understanding SKLearn container output and environment variables
From the SKLearn estimator output, we can see that our `train_and_serve.py` script is executed by the container with the following command.

```
/miniconda3/bin/python train_and_serve.py
```

## Inspecting SageMaker SKLearn docker image
Since the container was executed in the local environment, we can also inspect the SageMaker SKLearn local image.

In [22]:
!docker images

REPOSITORY                                                            TAG             IMAGE ID       CREATED       SIZE
683313688378.dkr.ecr.us-east-1.amazonaws.com/sagemaker-scikit-learn   1.0-1-cpu-py3   8a6ea8272ad0   10 days ago   3.7GB


Let's also inspect the docker image. Notice multiple container environment variables and their default values in the output.

In [23]:
#collapse-output
!docker inspect 8a6ea8272ad0

[
    {
        "Id": "sha256:8a6ea8272ad003ec816569b0f879b16c770116584301161565f065aadb99436c",
        "RepoTags": [
            "683313688378.dkr.ecr.us-east-1.amazonaws.com/sagemaker-scikit-learn:1.0-1-cpu-py3"
        ],
        "RepoDigests": [
            "683313688378.dkr.ecr.us-east-1.amazonaws.com/sagemaker-scikit-learn@sha256:fc8c3a617ff0e436c25f3b64d03e1f485f1d159478c26757f3d1d267fc849445"
        ],
        "Parent": "",
        "Comment": "",
        "Created": "2022-07-06T18:55:02.854297671Z",
        "Container": "11b9a5fec2d61294aee63e549100ed18ceb7aa0de6a4ff198da2f556dfe3ec2f",
        "ContainerConfig": {
            "Hostname": "11b9a5fec2d6",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "ExposedPorts": {
                "8080/tcp": {}
            },
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
    

## Pass hyperparameters to SKLearn estimator

Let's pass some dummy hyperparameters to the estimator and see how it affects the output.

In [24]:
#collapse-output
sk_estimator = SKLearn(
    entry_point=script_file,
    role=role,
    instance_count=1,
    instance_type='local',
    framework_version="1.0-1",
    hyperparameters={"dummy_param_1":"val1","dummy_param_2":"val2"},
)

sk_estimator.fit()

Creating itmzsttsc9-algo-1-pikbz ... 
Creating itmzsttsc9-algo-1-pikbz ... done
Attaching to itmzsttsc9-algo-1-pikbz
itmzsttsc9-algo-1-pikbz | 2022-07-17 12:25:09,883 sagemaker-containers INFO     Imported framework sagemaker_sklearn_container.training
itmzsttsc9-algo-1-pikbz | 2022-07-17 12:25:09,888 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
itmzsttsc9-algo-1-pikbz | 2022-07-17 12:25:09,896 sagemaker_sklearn_container.training INFO     Invoking user training script.
itmzsttsc9-algo-1-pikbz | 2022-07-17 12:25:10,093 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
itmzsttsc9-algo-1-pikbz | 2022-07-17 12:25:10,106 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
itmzsttsc9-algo-1-pikbz | 2022-07-17 12:25:10,119 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
itmzsttsc9-algo-1-pikbz |

![sklearn-output-hyperparams](images/2022-07-07-sagemaker-script-mode/sklearn-output-hyperparams.png)

From the output we can see that our hyperparameters are passed to our training script as command line arguments. This is an important point and we will update our script using this information.
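
Since hyperparameters reach the script as command line arguments, the natural way to read them is `argparse`. Below is a minimal sketch, assuming the arguments arrive as `--name value` pairs. The function name `parse_hyperparameters` and the default values are hypothetical.

```python
import argparse


def parse_hyperparameters(argv):
    """Parse the dummy hyperparameters from a list of command line arguments."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--dummy_param_1", type=str, default="default1")
    parser.add_argument("--dummy_param_2", type=str, default="default2")
    # parse_known_args ignores unexpected arguments instead of exiting
    args, _unknown = parser.parse_known_args(argv)
    return args


# Simulate the arguments the container would pass to the script
args = parse_hyperparameters(["--dummy_param_1", "val1", "--dummy_param_2", "val2"])
print(args.dummy_param_1, args.dummy_param_2)
```

Inside a real training script we would call `parser.parse_known_args()` without arguments, so that it reads from `sys.argv`.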


## SageMaker SKLearn container environment variables
Let's now discuss some important environment variables we see in the output.

### SM_MODULE_DIR
```
SM_MODULE_DIR=s3://sagemaker-us-east-1-801598032724/sagemaker-scikit-learn-2022-07-13-13-05-48-675/source/sourcedir.tar.gz
```
`SM_MODULE_DIR` points to a location in the S3 bucket where SageMaker will automatically backup our source code for that particular run. SageMaker will create a separate folder in the default bucket for each new run. The default value is `s3://sagemaker-{aws-region}-{aws-id}/{training-job-name}/source/sourcedir.tar.gz`

**Note**: We used `local_code` for the SKLearn estimator, so why is the source code backed up to the S3 bucket? Should it not be kept on the local system, bypassing S3 altogether in local mode? That would be the expected default, but the SageMaker SDK appears to behave otherwise: even in local mode it uses the S3 bucket to store the source code. You can read more about this behavior in the issue ticket [Model repack always uploads data to S3 bucket regardless of local mode settings](https://github.com/aws/sagemaker-python-sdk/issues/3031).
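
If you want to fetch the backed-up source archive yourself, the S3 URI first has to be split into a bucket name and an object key. Here is a minimal sketch using only the standard library; `split_s3_uri` is a hypothetical helper, not an SDK function.

```python
from urllib.parse import urlparse


def split_s3_uri(uri: str):
    """Split 's3://bucket/some/key' into ('bucket', 'some/key')."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri}")
    return parsed.netloc, parsed.path.lstrip("/")


bucket_name, key = split_s3_uri(
    "s3://sagemaker-us-east-1-801598032724/"
    "sagemaker-scikit-learn-2022-07-13-13-05-48-675/source/sourcedir.tar.gz"
)
print(bucket_name)
print(key)
```

The resulting pair can then be handed to the same boto3 `download_file` call we used earlier to download the dataset.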

### SM_MODEL_DIR
```
SM_MODEL_DIR=/opt/ml/model
```
`SM_MODEL_DIR` points to a directory located inside the container. When the training job finishes, the container and its file system will be deleted, except for the `/opt/ml/model` and `/opt/ml/output` directories. Use `/opt/ml/model` to save the trained model artifacts. These artifacts are uploaded to S3 for model hosting.

### SM_OUTPUT_DATA_DIR
```
SM_OUTPUT_DATA_DIR=/opt/ml/output/data
```
`SM_OUTPUT_DATA_DIR` points to a directory in the container for writing output artifacts. Output artifacts may include checkpoints, graphs, and other files to save, not including model artifacts. These artifacts are compressed and uploaded to S3 under the same S3 prefix as the model artifacts.

### SM_CHANNELS
```
SM_CHANNELS='["testing","training"]'
```
A channel is a named input source that training algorithms can consume. You can partition your training data into different logical "channels" when you run training. Depending on your problem, some common channel ideas are: "training", "testing", "evaluation" or "images" and "labels". You can read more about the channels from SageMaker API reference [Channel](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_Channel.html)

### SM_CHANNEL_{channel_name}
```
SM_CHANNEL_TRAIN='/opt/ml/input/data/train'
SM_CHANNEL_TEST='/opt/ml/input/data/test'
```
Suppose you pass two input channels, 'train' and 'test', to the scikit-learn estimator's `fit()` method. The following environment variables will then be set, following the format `SM_CHANNEL_[channel_name]`:
* **`SM_CHANNEL_TRAIN`**: it points to the directory in the container that has the *train* channel data downloaded
* **`SM_CHANNEL_TEST`**: Same as above, but for the *test* channel

Note that the channel names `train` and `test` are only conventions. You can use any name here, and the environment variables will be created accordingly. It is important to know that the SageMaker container automatically downloads the data from the provided input channels and makes it available in the respective local directories once it starts executing. The training script can then load the data from the local container directories.
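
The naming rule can be summarized in a few lines of Python. This sketch only mirrors the convention described above (upper-cased channel name in the variable, channel name as the directory under `/opt/ml/input/data`); the container sets these variables itself, and `channel_env` is a hypothetical helper.

```python
def channel_env(channel_names):
    """Map each channel name to its environment variable and container directory."""
    return {
        name: (f"SM_CHANNEL_{name.upper()}", f"/opt/ml/input/data/{name}")
        for name in channel_names
    }


for name, (variable, directory) in channel_env(["train", "test", "validation"]).items():
    print(f"{name}: {variable}={directory}")
```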

There are more environment variables available, and you can read about them from [Environment variables](https://github.com/aws/sagemaker-training-toolkit/blob/master/ENVIRONMENT_VARIABLES.md)

# Pass input channel to SKLearn estimator

Now that we understand the SKLearn container environment better, let's pass the training data channel to the estimator and see if the data becomes available inside the container directory.

Update our script to list all the files in the `SM_CHANNEL_TRAIN` directory.

In [25]:
%%writefile $script_file
import argparse, os, sys

if __name__ == "__main__":
    print(" *** Hello from SageMaker script container *** ")

    training_dir = os.environ.get("SM_CHANNEL_TRAIN")
    dir_list = os.listdir(training_dir)

    print("training_dir files list: ", dir_list)


Overwriting ./datasets/2022-07-07-sagemaker-script-mode/src/train_and_serve.py


In [26]:
#collapse-output
sk_estimator = SKLearn(
    entry_point=script_file,
    role=role,
    instance_count=1,
    instance_type='local',
    framework_version="1.0-1",
    hyperparameters={"dummy_param_1":"val1","dummy_param_2":"val2"},
)

sk_estimator.fit({"train": f"file://{local_train_path}"})

Creating 7oowjuanoi-algo-1-xcbv0 ... 
Creating 7oowjuanoi-algo-1-xcbv0 ... done
Attaching to 7oowjuanoi-algo-1-xcbv0
7oowjuanoi-algo-1-xcbv0 | 2022-07-17 12:25:12,701 sagemaker-containers INFO     Imported framework sagemaker_sklearn_container.training
7oowjuanoi-algo-1-xcbv0 | 2022-07-17 12:25:12,706 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
7oowjuanoi-algo-1-xcbv0 | 2022-07-17 12:25:12,715 sagemaker_sklearn_container.training INFO     Invoking user training script.
7oowjuanoi-algo-1-xcbv0 | 2022-07-17 12:25:12,894 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
7oowjuanoi-algo-1-xcbv0 | 2022-07-17 12:25:12,908 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
7oowjuanoi-algo-1-xcbv0 | 2022-07-17 12:25:12,920 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
7oowjuanoi-algo-1-xcbv0 |

![sklearn-output-traincsv](images/2022-07-07-sagemaker-script-mode/sklearn-output-traincsv.png)

From the output, we can see that `train.csv`, which was in our local environment, is now available inside the container on path `SM_CHANNEL_TRAIN=/opt/ml/input/data/train`. 
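
Once the files are in the channel directory, the training script can load them like any local file. Below is a minimal sketch that reads every CSV file in a directory. It uses the standard `csv` module to stay self-contained (in a real script `pandas.read_csv` would be the more usual choice), and a temporary directory stands in for `/opt/ml/input/data/train`. The helper `load_channel_rows` is hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def load_channel_rows(channel_dir):
    """Read every CSV file in a channel directory into a list of dict rows."""
    rows = []
    for csv_file in sorted(Path(channel_dir).glob("*.csv")):
        with open(csv_file, newline="") as f:
            rows.extend(csv.DictReader(f))
    return rows


# Demo: a throwaway directory plays the role of the train channel.
# Inside the container we would pass os.environ["SM_CHANNEL_TRAIN"] instead.
with tempfile.TemporaryDirectory() as demo_dir:
    Path(demo_dir, "train.csv").write_text(
        "sepal_len,sepal_wid,petal_len,petal_wid,class,class_cat\n"
        "5.1,3.5,1.4,0.2,Iris-setosa,0\n"
    )
    rows = load_channel_rows(demo_dir)

print(len(rows), rows[0]["class"])
```

Keep in mind that `csv.DictReader` returns every value as a string, so numeric columns would still need to be converted before training.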

Let's also test the same with our training data on the S3 bucket.

In [27]:
#collapse-output
sk_estimator = SKLearn(
    entry_point=script_file,
    role=role,
    instance_count=1,
    instance_type='local',
    framework_version="1.0-1",
    hyperparameters={"dummy_param_1":"val1","dummy_param_2":"val2"},
)

sk_estimator.fit({"train": s3_train_uri})

Creating vjizgx2zqm-algo-1-m06ka ... 
Creating vjizgx2zqm-algo-1-m06ka ... done
Attaching to vjizgx2zqm-algo-1-m06ka
vjizgx2zqm-algo-1-m06ka | 2022-07-17 12:25:15,654 sagemaker-containers INFO     Imported framework sagemaker_sklearn_container.training
vjizgx2zqm-algo-1-m06ka | 2022-07-17 12:25:15,659 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
vjizgx2zqm-algo-1-m06ka | 2022-07-17 12:25:15,667 sagemaker_sklearn_container.training INFO     Invoking user training script.
vjizgx2zqm-algo-1-m06ka | 2022-07-17 12:25:15,847 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
vjizgx2zqm-algo-1-m06ka | 2022-07-17 12:25:15,862 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
vjizgx2zqm-algo-1-m06ka | 2022-07-17 12:25:15,880 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
vjizgx2zqm-algo-1-m06ka |

Again, the results are the same. SageMaker downloads the data from the S3 bucket and makes it available in the container. In the environment variables section we also learned that two directories are special: `/opt/ml/model` and `/opt/ml/output`. The container environment variables `SM_MODEL_DIR` and `SM_OUTPUT_DATA_DIR` point to `/opt/ml/model` and to the `data` subdirectory of `/opt/ml/output`, respectively. Whatever artifacts we put in them will be stored on the S3 bucket when the training job finishes. `SM_MODEL_DIR` is for trained models, and `SM_OUTPUT_DATA_DIR` is for other artifacts like logs, graphs, plots, results, etc. Let's update our training script and put some dummy data in these directories. Once the job is complete, we will verify the stored artifacts on the S3 bucket.

In [28]:
%%writefile $script_file
import argparse, os, sys

if __name__ == "__main__":
    print(" *** Hello from SageMaker script container *** ")

    # list files in SM_CHANNEL_TRAIN
    training_dir = os.environ.get("SM_CHANNEL_TRAIN")
    dir_list = os.listdir(training_dir)
    print("training_dir files list: ", dir_list)

    # write dummy model file to SM_MODEL_DIR
    sm_model_dir = os.environ.get("SM_MODEL_DIR")
    with open(f"{sm_model_dir}/dummy-model.txt", "w") as f:
        f.write("this is a dummy model")

    # write dummy artifact file to SM_OUTPUT_DATA_DIR
    sm_output_data_dir = os.environ.get("SM_OUTPUT_DATA_DIR")
    with open(f"{sm_output_data_dir}/dummy-output-data.txt", "w") as f:
        f.write("this is a dummy output data")

Overwriting ./datasets/2022-07-07-sagemaker-script-mode/src/train_and_serve.py


In [29]:
#collapse-output
sk_estimator = SKLearn(
    entry_point=script_file,
    role=role,
    instance_count=1,
    instance_type='local',
    framework_version="1.0-1",
    hyperparameters={"dummy_param_1":"val1","dummy_param_2":"val2"},
)

sk_estimator.fit({"train": s3_train_uri})

Creating 90hhy5fi8d-algo-1-r944l ... 
Creating 90hhy5fi8d-algo-1-r944l ... done
Attaching to 90hhy5fi8d-algo-1-r944l
90hhy5fi8d-algo-1-r944l | 2022-07-17 12:25:18,739 sagemaker-containers INFO     Imported framework sagemaker_sklearn_container.training
90hhy5fi8d-algo-1-r944l | 2022-07-17 12:25:18,743 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
90hhy5fi8d-algo-1-r944l | 2022-07-17 12:25:18,752 sagemaker_sklearn_container.training INFO     Invoking user training script.
90hhy5fi8d-algo-1-r944l | 2022-07-17 12:25:18,940 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
90hhy5fi8d-algo-1-r944l | 2022-07-17 12:25:18,960 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
90hhy5fi8d-algo-1-r944l | 2022-07-17 12:25:18,981 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
90hhy5fi8d-algo-1-r944l |

Failed to delete: /tmp/tmp5_rbrzm2/algo-1-r944l Please remove it manually.


===== Job Complete =====


Our training job is now complete. Let us check the S3 bucket to see if our dummy model and other artifacts are present.

First, we need the S3 URI for these artifacts. For our dummy model (from SM_MODEL_DIR), we can use our estimator object to get the URI.

In [30]:
model_data = sk_estimator.model_data
model_data

's3://sagemaker-us-east-1-801598032724/sagemaker-scikit-learn-2022-07-17-12-25-16-297/model.tar.gz'

Let's download `model_data` from S3 to a local directory for verification. First, create a local `tmp` directory (under `local_path`) to store the downloaded files.

In [31]:
local_tmp_path = local_path + "/tmp"
print(local_tmp_path)

# create the local '/tmp' directory
Path(local_tmp_path).mkdir(parents=True, exist_ok=True)

./datasets/2022-07-07-sagemaker-script-mode/tmp


We will use the SageMaker `S3Downloader` class to download the model file.

In [32]:
from sagemaker.s3 import S3Downloader

S3Downloader.download(
    s3_uri=model_data, local_path=local_tmp_path, sagemaker_session=session
)

File is downloaded. Let's uncompress it to verify the model file.

In [33]:
!tar -xzvf $local_tmp_path/model.tar.gz -C $local_tmp_path

dummy-model.txt


Yes, the "dummy-model.txt" file is present. This tells us that SageMaker automatically uploads the files from the model directory (SM_MODEL_DIR) to the S3 bucket. Let's run the same check for the output data directory (SM_OUTPUT_DATA_DIR). The estimator object does not expose this S3 URI directly the way `model_data` does for the model, but we can build it ourselves from the output path and the training job name. So let's do that next.

In [34]:
print("estimator.output_path: ", sk_estimator.output_path)
print("estimator.latest_training_job.name: ", sk_estimator.latest_training_job.name)

estimator.output_path:  s3://sagemaker-us-east-1-801598032724/
estimator.latest_training_job.name:  sagemaker-scikit-learn-2022-07-17-12-25-16-297


In [35]:
def get_s3_output_uri(estimator):
    return estimator.output_path + estimator.latest_training_job.name
    
get_s3_output_uri(sk_estimator)

's3://sagemaker-us-east-1-801598032724/sagemaker-scikit-learn-2022-07-17-12-25-16-297'

In [36]:
##
# S3 URI for output data artifacts
s3_output_uri = get_s3_output_uri(sk_estimator) + '/output.tar.gz'
s3_output_uri

's3://sagemaker-us-east-1-801598032724/sagemaker-scikit-learn-2022-07-17-12-25-16-297/output.tar.gz'

In [37]:
## 
# S3 URI for model artifact. We have already verified it.
s3_model_uri = get_s3_output_uri(sk_estimator) + '/model.tar.gz'
s3_model_uri

's3://sagemaker-us-east-1-801598032724/sagemaker-scikit-learn-2022-07-17-12-25-16-297/model.tar.gz'

In [38]:
##
# S3 URI for source code
s3_source_uri = get_s3_output_uri(sk_estimator) + '/source/sourcedir.tar.gz'
s3_source_uri

's3://sagemaker-us-east-1-801598032724/sagemaker-scikit-learn-2022-07-17-12-25-16-297/source/sourcedir.tar.gz'
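
The three URIs above share one prefix, so they can be built in a single step. The sketch below is a hypothetical helper (`artifact_uris` is not an SDK function) that also tolerates an output path without a trailing slash. It assumes the archive layout observed in this notebook's local-mode run. Other setups may place the archives differently, so prefer `estimator.model_data` where it is available.

```python
def artifact_uris(output_path, job_name):
    """Build S3 URIs for the archives of a training job.

    Assumes the layout seen in this notebook: archives sit directly
    under <output_path>/<job_name>/.
    """
    base = output_path.rstrip("/") + "/" + job_name
    return {
        "model": base + "/model.tar.gz",
        "output": base + "/output.tar.gz",
        "source": base + "/source/sourcedir.tar.gz",
    }


# Placeholder bucket and job name, for illustration only.
uris = artifact_uris("s3://example-bucket/", "example-job")
for name, uri in uris.items():
    print(name, uri)
```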

Let's download these artifacts to our local '/tmp' directory for verification.

In [39]:
!aws s3 cp $s3_output_uri $local_tmp_path
!aws s3 cp $s3_source_uri $local_tmp_path

download: s3://sagemaker-us-east-1-801598032724/sagemaker-scikit-learn-2022-07-17-12-25-16-297/output.tar.gz to datasets/2022-07-07-sagemaker-script-mode/tmp/output.tar.gz
download: s3://sagemaker-us-east-1-801598032724/sagemaker-scikit-learn-2022-07-17-12-25-16-297/source/sourcedir.tar.gz to datasets/2022-07-07-sagemaker-script-mode/tmp/sourcedir.tar.gz


In [40]:
##
# extract the output data files from 'output.tar.gz'
!tar -xzvf $local_tmp_path/output.tar.gz -C $local_tmp_path

data/
data/dummy-output-data.txt
success


In [41]:
##
# extract the source code files from 'sourcedir.tar.gz'
!tar -xzvf $local_tmp_path/sourcedir.tar.gz -C $local_tmp_path

train_and_serve.py
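
The `tar` commands above can also be replaced with Python's standard `tarfile` module, which is handy when the check needs to run inside a script rather than a notebook cell. The sketch below builds a small stand-in archive so that it runs on its own. The helper name `list_archive` is an assumption, and the archive contents only mimic the dummy model file used earlier.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def list_archive(archive_path):
    """Return the sorted names of regular files inside a .tar.gz archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(m.name for m in tar.getmembers() if m.isfile())


# Build a stand-in for the downloaded model.tar.gz.
workdir = Path(tempfile.mkdtemp())
archive = workdir / "model.tar.gz"
payload = b"this is a dummy model"
with tarfile.open(archive, "w:gz") as tar:
    info = tarfile.TarInfo(name="dummy-model.txt")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

print(list_archive(archive))
```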


# Summary till now
Let's summarize what we have learned till now.
* We can use SageMaker SKLearn local mode to test our code in a local environment
* The SKLearn container executes our provided script with the command `/miniconda3/bin/python train_and_serve.py`
* Hyperparameters passed to the container are passed to our script as command line arguments
* Data from input channels will be downloaded by the container and made available for our script to load and process
* The '/opt/ml/model' and '/opt/ml/output' directories are special. Anything stored in them is automatically uploaded to the S3 bucket when the job finishes. The container environment variables 'SM_MODEL_DIR' and 'SM_OUTPUT_DATA_DIR' point to '/opt/ml/model' and to the 'data' subdirectory of '/opt/ml/output', respectively. SM_MODEL_DIR should be used to write model artifacts. SM_OUTPUT_DATA_DIR should be used to write any other supporting artifacts.
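
The point about hyperparameters arriving as command line arguments can be sketched in a few lines. The function `to_cli_args` below is a hypothetical illustration of the idea and not the container's actual implementation, which is not shown in this post. The parsing side uses `argparse.parse_known_args`, so arguments the script does not declare are collected instead of causing an error.

```python
import argparse


def to_cli_args(hyperparameters):
    """Turn a hyperparameter dict into a flat list of '--name value' strings."""
    args = []
    for name, value in hyperparameters.items():
        args.extend([f"--{name}", str(value)])
    return args


cli_args = to_cli_args({"estimators": 10, "dummy_param_1": "val1"})

parser = argparse.ArgumentParser()
parser.add_argument("--estimators", type=int, default=15)

# Declared arguments are parsed; undeclared ones land in `unknown`.
known, unknown = parser.parse_known_args(cli_args)
print(known.estimators, unknown)
```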

Let's use this knowledge to update our script to train a RandomForestClassifier on the Iris flower dataset.

In [42]:
##
# cleanup /tmp directory before moving to next section
!rm -r $local_tmp_path/*

# Prepare training script for RandomForestClassifier
Let's update our training script to train a scikit-learn random forest classifier on the Iris data set. The script reads the training and test data from the input data channel directories and trains a classifier on the training data. It then saves the model to the model directory and the predictions for the test set ('y_pred.csv') to the output data directory. Notice that the script also accepts the container environment variables as command line arguments. This makes sense for hyperparameters ('--estimators') because we know they are passed to the script as command line parameters. For the other values (e.g. 'SM_MODEL_DIR'), the script uses the command line argument if one is given and otherwise falls back to the environment variable. This lets us test the script locally from the command line without setting the environment variables.

In [43]:
%%writefile $script_file

import argparse, os
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn import metrics
import joblib

if __name__ == "__main__":

    # Pass in environment variables and hyperparameters
    parser = argparse.ArgumentParser()

    # Hyperparameters
    parser.add_argument("--estimators", type=int, default=15)

    # sm_model_dir: model artifacts stored here after training
    # sm-channel-train: input training data location
    # sm-channel-test: input test data location
    # sm-output-data-dir: output artifacts location
    parser.add_argument("--sm-model-dir", type=str, default=os.environ.get("SM_MODEL_DIR"))
    parser.add_argument("--sm-channel-train", type=str, default=os.environ.get("SM_CHANNEL_TRAIN"))
    parser.add_argument("--sm-channel-test", type=str, default=os.environ.get("SM_CHANNEL_TEST"))
    parser.add_argument("--sm-output-data-dir", type=str, default=os.environ.get("SM_OUTPUT_DATA_DIR"))

    args, _ = parser.parse_known_args()

    print("command line arguments: ", args)

    estimators = args.estimators
    sm_model_dir = args.sm_model_dir
    training_dir = args.sm_channel_train
    testing_dir = args.sm_channel_test
    output_data_dir = args.sm_output_data_dir

    print(f"training_dir: {training_dir}")
    print(f"training_dir files list: {os.listdir(training_dir)}")
    print(f"testing_dir: {testing_dir}")
    print(f"testing_dir files list: {os.listdir(testing_dir)}")
    print(f"sm_model_dir: {sm_model_dir}")
    print(f"output_data_dir: {output_data_dir}")

    # Read in data
    df_train = pd.read_csv(training_dir + "/train.csv", sep=",")
    df_test = pd.read_csv(testing_dir + "/test.csv", sep=",")

    # Preprocess data
    X_train = df_train.drop(["class", "class_cat"], axis=1)
    y_train = df_train["class_cat"]
    X_test = df_test.drop(["class", "class_cat"], axis=1)
    y_test = df_test["class_cat"]

    print(f"X_train.shape: {X_train.shape}")
    print(f"y_train.shape: {y_train.shape}")
    print(f"X_test.shape: {X_test.shape}")
    print(f"y_test.shape: {y_test.shape}")

    sc = StandardScaler()
    X_train = sc.fit_transform(X_train)
    X_test = sc.transform(X_test)

    # Build model
    regressor = RandomForestClassifier(n_estimators=estimators)
    regressor.fit(X_train, y_train)
    y_pred = regressor.predict(X_test)

    # Save the model
    joblib.dump(regressor, sm_model_dir + "/model.joblib")

    # Save the results
    pd.DataFrame(y_pred).to_csv(output_data_dir + "/y_pred.csv")

Overwriting ./datasets/2022-07-07-sagemaker-script-mode/src/train_and_serve.py


Now give proper execution rights to the script.

In [44]:
!chmod +x $script_file

Let's test this script locally before passing it to the SKLearn estimator. We will invoke this script from a command line and pass the required parameters similar to how an estimator container will execute it. For testing this script, we need to pass four directory paths:
* **sm-model-dir** This will point to a directory where our script will store the trained model. We can point it to '/tmp' directory for test purposes
* **sm-channel-train** This will point to a directory containing training data. We already have it as 'local_train_path'
* **sm-channel-test** This will point to a directory containing test data. We also have it as 'local_test_path'
* **sm-output-data-dir** This will point to a directory where our script will store other artifacts. We can also point it to '/tmp' directory for test purposes

Once the script is successfully run, we will find the trained model file 'model.joblib' and 'y_pred.csv' in the '/tmp' directory.

In [45]:
#collapse-output
!python3 $script_file \
    --sm-model-dir $local_tmp_path \
    --sm-channel-train $local_train_path \
    --sm-channel-test $local_test_path \
    --sm-output-data-dir $local_tmp_path \
    --estimators 10

command line arguments:  Namespace(estimators=10, sm_channel_test='./datasets/2022-07-07-sagemaker-script-mode/test', sm_channel_train='./datasets/2022-07-07-sagemaker-script-mode/train', sm_model_dir='./datasets/2022-07-07-sagemaker-script-mode/tmp', sm_output_data_dir='./datasets/2022-07-07-sagemaker-script-mode/tmp')
training_dir: ./datasets/2022-07-07-sagemaker-script-mode/train
training_dir files list: ['train.csv']
testing_dir: ./datasets/2022-07-07-sagemaker-script-mode/test
testing_dir files list: ['test.csv']
sm_model_dir: ./datasets/2022-07-07-sagemaker-script-mode/tmp
output_data_dir: ./datasets/2022-07-07-sagemaker-script-mode/tmp
X_train.shape: (120, 4)
y_train.shape: (120,)
X_train.shape: (30, 4)
y_train.shape: (30,)


Let's check the local '/tmp' directory for artifacts.

In [46]:
!ls $local_tmp_path

model.joblib  y_pred.csv


Now that we have tested our script and it works as expected, let's pass it to the SKLearn container.

In [47]:
#collapse-output
sk_estimator = SKLearn(
    entry_point=script_file,
    role=role,
    instance_count=1,
    instance_type='local',
    framework_version="1.0-1",
    hyperparameters={"estimators":10},
)

sk_estimator.fit({"train": s3_train_uri, "test": s3_test_uri})

Creating 8zzvvv6e73-algo-1-byidf ... 
Creating 8zzvvv6e73-algo-1-byidf ... done
Attaching to 8zzvvv6e73-algo-1-byidf
8zzvvv6e73-algo-1-byidf | 2022-07-17 12:25:25,289 sagemaker-containers INFO     Imported framework sagemaker_sklearn_container.training
8zzvvv6e73-algo-1-byidf | 2022-07-17 12:25:25,294 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
8zzvvv6e73-algo-1-byidf | 2022-07-17 12:25:25,303 sagemaker_sklearn_container.training INFO     Invoking user training script.
8zzvvv6e73-algo-1-byidf | 2022-07-17 12:25:25,483 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
8zzvvv6e73-algo-1-byidf | 2022-07-17 12:25:25,497 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
8zzvvv6e73-algo-1-byidf | 2022-07-17 12:25:25,510 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
8zzvvv6e73-algo-1-byidf |

Failed to delete: /tmp/tmppg_g0dd2/algo-1-byidf Please remove it manually.


===== Job Complete =====


In [48]:
##
# cleanup /tmp directory before moving to next section
!rm -r $local_tmp_path/*

# Passing custom libraries and dependencies to SKLearn container

We have successfully trained our classifier, but assume we have an additional task. One of our colleagues has created a library that takes a confusion matrix array and plots it with the [seaborn visualization library](https://seaborn.pydata.org/). We have been asked to use this custom library in the training script and save the confusion matrix plot to the output data directory.

Let's prepare code for this custom library to take an array and return a confusion matrix plot from seaborn.

In [49]:
# create a path to store the custom library code
custom_library_path = local_path + "/my_custom_library"
custom_library_file = custom_library_path + "/seaborn_confusion_matrix.py"

print(f"custom_library_path: {custom_library_path}")
print(f"custom_library_file: {custom_library_file}")

# make sure the path exists
Path(custom_library_path).mkdir(parents=True, exist_ok=True)

custom_library_path: ./datasets/2022-07-07-sagemaker-script-mode/my_custom_library
custom_library_file: ./datasets/2022-07-07-sagemaker-script-mode/my_custom_library/seaborn_confusion_matrix.py


Now the code to plot the confusion matrix.

In [50]:
%%writefile $custom_library_file

import seaborn as sns
import numpy as np
import argparse, os


def save_confusion_matrix(cf_matrix, path="./"):
    sns_plot = sns.heatmap(cf_matrix, annot=True)
    sns_plot.figure.savefig(path + "/output_cm.png")


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--path", type=str, default="./")
    args, _ = parser.parse_known_args()
    path = args.path

    dummy_cm = np.array([[23, 5], [3, 30]])
    save_confusion_matrix(dummy_cm, path)

Overwriting ./datasets/2022-07-07-sagemaker-script-mode/my_custom_library/seaborn_confusion_matrix.py


Convert the directory containing the seaborn code into a Python package by adding an `__init__.py` file.

In [51]:
%%writefile $custom_library_path/__init__.py

from .seaborn_confusion_matrix import *

Overwriting ./datasets/2022-07-07-sagemaker-script-mode/my_custom_library/__init__.py
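
The re-export in `__init__.py` is what later allows a short import such as `from my_custom_library import save_confusion_matrix`. The sketch below demonstrates the same mechanism on a throwaway package. The package name `demo_custom_library`, the module `plots.py`, and the function `save_plot` are all made up for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a throwaway package on disk (hypothetical names throughout).
root = Path(tempfile.mkdtemp())
pkg = root / "demo_custom_library"
pkg.mkdir()
(pkg / "plots.py").write_text(
    "def save_plot(path):\n"
    "    return path + '/output_cm.png'\n"
)
# The star import re-exports every public name from plots.py at package level.
(pkg / "__init__.py").write_text("from .plots import *\n")

# Make the parent directory importable, then import the package by name.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
demo = importlib.import_module("demo_custom_library")
print(demo.save_plot("/tmp"))
```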


Our custom library depends on the seaborn Python package. So let's create 'requirements.txt' and list all our dependencies in it. Later it will be passed to the SKLearn container, which installs them during initialization.

In [52]:
%%writefile $script_path/requirements.txt

seaborn==0.11.2

Overwriting ./datasets/2022-07-07-sagemaker-script-mode/src/requirements.txt
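
Before relying on pinned versions, it can help to compare them with what is installed in the current environment. The following is a minimal sketch using only the standard library. The helpers `parse_pins` and `check_pins` are hypothetical, and the parser only understands simple `name==version` lines, not the full requirements file syntax.

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Extract 'name==version' pins from requirements text (simple lines only)."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def check_pins(pins):
    """Map each package to (wanted version, installed version or None)."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed)
    return report


pins = parse_pins("\nseaborn==0.11.2\n")
for name, (wanted, installed) in check_pins(pins).items():
    print(name, "wanted:", wanted, "installed:", installed)
```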


Let's test this library in our local environment first. It should plot a dummy confusion matrix in the local temp directory.

In [53]:
#collapse-output
# install the dependencies first
!pip install -r $script_path/requirements.txt

Looking in indexes: https://pypi.org/simple, https://pip.repos.neuron.amazonaws.com
You should consider upgrading via the '/home/ec2-user/anaconda3/envs/python3/bin/python -m pip install --upgrade pip' command.

In [54]:
##
# test the custom library
!python3 $custom_library_file --path $local_tmp_path

In [55]:
##
# verify the custom library output from the /tmp directory
!ls $local_tmp_path

output_cm.png


So our custom library code works. Let's update our script to use it.

In [56]:
%%writefile $script_file

import argparse, os
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
import joblib

from my_custom_library import save_confusion_matrix

if __name__ == "__main__":

    # Pass in environment variables and hyperparameters
    parser = argparse.ArgumentParser()

    # Hyperparameters
    parser.add_argument("--estimators", type=int, default=15)

    # sm_model_dir: model artifacts stored here after training
    # sm-channel-train: input training data location
    # sm-channel-test: input test data location
    # sm-output-data-dir: output artifacts location
    parser.add_argument("--sm-model-dir", type=str, default=os.environ.get("SM_MODEL_DIR"))
    parser.add_argument("--sm-channel-train", type=str, default=os.environ.get("SM_CHANNEL_TRAIN"))
    parser.add_argument("--sm-channel-test", type=str, default=os.environ.get("SM_CHANNEL_TEST"))
    parser.add_argument("--sm-output-data-dir", type=str, default=os.environ.get("SM_OUTPUT_DATA_DIR"))

    args, _ = parser.parse_known_args()

    print("command line arguments: ", args)

    estimators = args.estimators
    sm_model_dir = args.sm_model_dir
    training_dir = args.sm_channel_train
    testing_dir = args.sm_channel_test
    output_data_dir = args.sm_output_data_dir

    print(f"training_dir: {training_dir}")
    print(f"training_dir files list: {os.listdir(training_dir)}")  
    print(f"testing_dir: {testing_dir}")
    print(f"testing_dir files list: {os.listdir(testing_dir)}")
    print(f"sm_model_dir: {sm_model_dir}")
    print(f"output_data_dir: {output_data_dir}")

    # Read in data
    df_train = pd.read_csv(training_dir + "/train.csv", sep=",")
    df_test = pd.read_csv(testing_dir + "/test.csv", sep=",")

    # Preprocess data
    X_train = df_train.drop(["class", "class_cat"], axis=1)
    y_train = df_train["class_cat"]
    X_test = df_test.drop(["class", "class_cat"], axis=1)
    y_test = df_test["class_cat"]

    print(f"X_train.shape: {X_train.shape}")
    print(f"y_train.shape: {y_train.shape}")
    print(f"X_test.shape: {X_test.shape}")
    print(f"y_test.shape: {y_test.shape}")

    sc = StandardScaler()
    X_train = sc.fit_transform(X_train)
    X_test = sc.transform(X_test)

    # Build model
    regressor = RandomForestClassifier(n_estimators=estimators)
    regressor.fit(X_train, y_train)
    y_pred = regressor.predict(X_test)

    # Save the model
    joblib.dump(regressor, sm_model_dir + "/model.joblib")

    # Save the results
    pd.DataFrame(y_pred).to_csv(output_data_dir + "/y_pred.csv")

    # save the confusion matrix
    cf_matrix = confusion_matrix(y_test, y_pred)
    save_confusion_matrix(cf_matrix, output_data_dir)

    # print sm_model_dir info
    print(f"sm_model_dir: {sm_model_dir}")
    print(f"sm_model_dir files list: {os.listdir(sm_model_dir)}")

    # print output_data_dir info
    print(f"output_data_dir: {output_data_dir}")
    print(f"output_data_dir files list: {os.listdir(output_data_dir)}")

Overwriting ./datasets/2022-07-07-sagemaker-script-mode/src/train_and_serve.py


Finally, all the ingredients are ready. Let's run our script from the SKLearn container.

In the next cell, you can see that we have passed two extra parameters to the estimator.
* **source_dir** this path points to the directory containing the **entry_point** script `train_and_serve.py` and `requirements.txt`. If a `requirements.txt` file is present in this directory, the estimator will pick it up and install the listed packages in the container during initialization.
* **dependencies** this is a list of paths to additional libraries (here, our custom library) that we want available in the container.

Our local directory structure is shown below.

```
local_path/
├── my_custom_library/
│   ├── seaborn_confusion_matrix.py
│   └── __init__.py
└── src/
    ├── train_and_serve.py
    └── requirements.txt
```
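
A quick sanity check of this layout before starting a job can save a failed container run. The sketch below recreates the tree in a temporary directory and validates it. The function `check_layout` is a hypothetical helper and not part of the SageMaker SDK, and treating a missing `requirements.txt` as a problem is a choice made for this example only.

```python
import tempfile
from pathlib import Path


def check_layout(source_dir, entry_point, dependencies):
    """Return a list of problems found in the expected directory layout."""
    problems = []
    source_dir = Path(source_dir)
    if not (source_dir / entry_point).is_file():
        problems.append(f"missing entry point: {entry_point}")
    if not (source_dir / "requirements.txt").is_file():
        problems.append("no requirements.txt in source_dir")
    for dep in dependencies:
        if not (Path(dep) / "__init__.py").is_file():
            problems.append(f"not a package: {dep}")
    return problems


# Recreate the tree shown above with empty files.
root = Path(tempfile.mkdtemp())
(root / "src").mkdir()
(root / "src" / "train_and_serve.py").touch()
(root / "src" / "requirements.txt").touch()
(root / "my_custom_library").mkdir()
(root / "my_custom_library" / "seaborn_confusion_matrix.py").touch()
(root / "my_custom_library" / "__init__.py").touch()

print(check_layout(root / "src", "train_and_serve.py", [root / "my_custom_library"]))
```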

In [57]:
#collapse-output
sk_estimator = SKLearn(
    entry_point=script_file_name,
    source_dir=script_path,
    dependencies=[custom_library_path],
    role=role,
    instance_count=1,
    instance_type='local',
    framework_version="1.0-1",
    hyperparameters={"estimators":10},
)

sk_estimator.fit({"train": s3_train_uri, "test": s3_test_uri})

Creating lrh7l3x6jy-algo-1-4avl6 ... 
Creating lrh7l3x6jy-algo-1-4avl6 ... done
Attaching to lrh7l3x6jy-algo-1-4avl6
lrh7l3x6jy-algo-1-4avl6 | 2022-07-17 12:25:33,735 sagemaker-containers INFO     Imported framework sagemaker_sklearn_container.training
lrh7l3x6jy-algo-1-4avl6 | 2022-07-17 12:25:33,740 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
lrh7l3x6jy-algo-1-4avl6 | 2022-07-17 12:25:33,749 sagemaker_sklearn_container.training INFO     Invoking user training script.
lrh7l3x6jy-algo-1-4avl6 | 2022-07-17 12:25:33,924 sagemaker-training-toolkit INFO     Installing dependencies from requirements.txt:
lrh7l3x6jy-algo-1-4avl6 | /miniconda3/bin/python -m pip install -r requirements.txt
lrh7l3x6jy-algo-1-4avl6 | Collecting seaborn==0.11.2
lrh7l3x6jy-algo-1-4avl6 |   Downloading seaborn-0.11.2-py3-none-any.whl (292 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 292.8/292.8

Failed to delete: /tmp/tmpo6timl04/algo-1-4avl6 Please remove it manually.


===== Job Complete =====


![sklearn-output-train-complete](images/2022-07-07-sagemaker-script-mode/sklearn-output-train-complete.png)

SKLearn container output shows that our classifier is successfully trained, and the model and output artifacts are placed in their respective folders. We know from the first section of this post that these artifacts will automatically be uploaded to the S3 bucket. This concludes the model training part of our implementation. Let's now proceed to model serving part of our solution.

# Serve SKLearn model in local mode

At this point, we have our trained model ready. Can we deploy it already?

The answer is no. If we try to deploy this model using the command
```
sk_predictor = sk_estimator.deploy(
    initial_instance_count=1,
    instance_type='local'
)
```
This call produces an exception message telling us that the model server does not know how to load the model. So we need to tell it how by implementing the `model_fn` function in our script.

```
[2022-07-09 06:15:45 +0000] [31] [ERROR] Error handling request /ping
Traceback (most recent call last):
  File "/miniconda3/lib/python3.8/site-packages/sagemaker_containers/_functions.py", line 93, in wrapper
    return fn(*args, **kwargs)
  File "/miniconda3/lib/python3.8/site-packages/sagemaker_sklearn_container/serving.py", line 43, in default_model_fn
    return transformer.default_model_fn(model_dir)
  File "/miniconda3/lib/python3.8/site-packages/sagemaker_containers/_transformer.py", line 35, in default_model_fn
    raise NotImplementedError(
NotImplementedError: 
Please provide a model_fn implementation.
See documentation for model_fn at https://github.com/aws/sagemaker-python-sdk
```

The *model_fn* has the following signature:
```
def model_fn(model_dir)
```

Besides loading the model, we also need to tell the model server how to get predictions from the loaded model. For this, we need to implement the second function *predict_fn*, which has the following signature.
```
def predict_fn(input_data, model)
```

After we have called the `fit` function on our SKLearn estimator, we can deploy it by calling the `deploy` function to create an inference endpoint. Calling `deploy` on the estimator creates two objects:
* SageMaker scikit-learn Endpoint: This Endpoint encapsulates a model server running under it. The model server will load the model saved during training and perform inference on it. It requires two helper functions to load the model and make inferences on it: model_fn and predict_fn.
* Predictor object: This object is returned in response to the deploy call. It can be used to do inference on the Endpoint hosting our SKLearn model.
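
The division of labor between the two functions can be illustrated with a toy harness: the model is loaded once at startup, and `predict_fn` is then called for each request. This is a simplified sketch and not the real model server. The class `MajorityClassModel` is a hypothetical stand-in for a fitted estimator, and the model is stored as JSON rather than with joblib so that the example needs no third-party packages.

```python
import json
import os
import tempfile


class MajorityClassModel:
    """Hypothetical stand-in for a fitted estimator: always predicts one label."""

    def __init__(self, label):
        self.label = label

    def predict(self, rows):
        return [self.label for _ in rows]


def model_fn(model_dir):
    """Load the model from model_dir (JSON here, joblib in the real script)."""
    with open(os.path.join(model_dir, "model.json")) as f:
        return MajorityClassModel(**json.load(f))


def predict_fn(input_data, model):
    """Run inference on already deserialized input data."""
    return model.predict(input_data)


# "Training" step: write the model artifact to a temporary model directory.
model_dir = tempfile.mkdtemp()
with open(os.path.join(model_dir, "model.json"), "w") as f:
    json.dump({"label": 2}, f)

# "Serving" step: load once, then answer each request with predict_fn.
model = model_fn(model_dir)
for request in ([[5.1, 3.5, 1.4, 0.2]], [[6.3, 3.3, 6.0, 2.5], [5.8, 2.7, 5.1, 1.9]]):
    print(predict_fn(request, model))
```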

Let's update our script and add these two functions.

In [58]:
%%writefile $script_file

import argparse, os
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
import joblib

from my_custom_library import save_confusion_matrix

if __name__ == "__main__":

    # Pass in environment variables and hyperparameters
    parser = argparse.ArgumentParser()

    # Hyperparameters
    parser.add_argument("--estimators", type=int, default=15)

    # sm_model_dir: model artifacts stored here after training
    # sm-channel-train: input training data location
    # sm-channel-test: input test data location
    # sm-output-data-dir: output artifacts location
    parser.add_argument("--sm-model-dir", type=str, default=os.environ.get("SM_MODEL_DIR"))
    parser.add_argument("--sm-channel-train", type=str, default=os.environ.get("SM_CHANNEL_TRAIN"))
    parser.add_argument("--sm-channel-test", type=str, default=os.environ.get("SM_CHANNEL_TEST"))
    parser.add_argument("--sm-output-data-dir", type=str, default=os.environ.get("SM_OUTPUT_DATA_DIR"))

    args, _ = parser.parse_known_args()

    print("command line arguments: ", args)

    estimators = args.estimators
    sm_model_dir = args.sm_model_dir
    training_dir = args.sm_channel_train
    testing_dir = args.sm_channel_test
    output_data_dir = args.sm_output_data_dir

    print(f"training_dir: {training_dir}")
    print(f"training_dir files list: {os.listdir(training_dir)}")  
    print(f"testing_dir: {testing_dir}")
    print(f"testing_dir files list: {os.listdir(testing_dir)}")
    print(f"sm_model_dir: {sm_model_dir}")
    print(f"output_data_dir: {output_data_dir}")

    # Read in data
    df_train = pd.read_csv(training_dir + "/train.csv", sep=",")
    df_test = pd.read_csv(testing_dir + "/test.csv", sep=",")

    # Preprocess data
    X_train = df_train.drop(["class", "class_cat"], axis=1)
    y_train = df_train["class_cat"]
    X_test = df_test.drop(["class", "class_cat"], axis=1)
    y_test = df_test["class_cat"]

    print(f"X_train.shape: {X_train.shape}")
    print(f"y_train.shape: {y_train.shape}")
    print(f"X_test.shape: {X_test.shape}")
    print(f"y_test.shape: {y_test.shape}")

    sc = StandardScaler()
    X_train = sc.fit_transform(X_train)
    X_test = sc.transform(X_test)

    # Build model
    regressor = RandomForestClassifier(n_estimators=estimators)
    regressor.fit(X_train, y_train)
    y_pred = regressor.predict(X_test)

    # Save the model
    joblib.dump(regressor, sm_model_dir + "/model.joblib")

    # Save the results
    pd.DataFrame(y_pred).to_csv(output_data_dir + "/y_pred.csv")

    # save the confusion matrix
    cf_matrix = confusion_matrix(y_test, y_pred)
    save_confusion_matrix(cf_matrix, output_data_dir)

    # print sm_model_dir info
    print(f"sm_model_dir: {sm_model_dir}")
    print(f"sm_model_dir files list: {os.listdir(sm_model_dir)}")

    # print output_data_dir info
    print(f"output_data_dir: {output_data_dir}")
    print(f"output_data_dir files list: {os.listdir(output_data_dir)}")


# Model serving
def model_fn(model_dir):
    """Deserialize the fitted model stored in model_dir."""
    print(f"model_fn model_dir: {model_dir}")
    model = joblib.load(os.path.join(model_dir, "model.joblib"))
    return model


def predict_fn(input_data, model):
    """Return predictions for one request.

    input_data: array produced by input_fn (the default implementation here)
    model: sklearn model returned by model_fn
    """
    return model.predict(input_data)

Overwriting ./datasets/2022-07-07-sagemaker-script-mode/src/train_and_serve.py


In [59]:
#collapse-output
sk_estimator = SKLearn(
    entry_point=script_file_name,
    source_dir=script_path,
    dependencies=[custom_library_path],
    role=role,
    instance_count=1,
    instance_type='local',
    framework_version="1.0-1",
    hyperparameters={"estimators":10},
)

sk_estimator.fit({"train": s3_train_uri, "test": s3_test_uri})

Creating lofp48o42k-algo-1-nqtfs ... 
Creating lofp48o42k-algo-1-nqtfs ... done
Attaching to lofp48o42k-algo-1-nqtfs
lofp48o42k-algo-1-nqtfs | 2022-07-17 12:25:44,053 sagemaker-containers INFO     Imported framework sagemaker_sklearn_container.training
lofp48o42k-algo-1-nqtfs | 2022-07-17 12:25:44,057 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
lofp48o42k-algo-1-nqtfs | 2022-07-17 12:25:44,067 sagemaker_sklearn_container.training INFO     Invoking user training script.
lofp48o42k-algo-1-nqtfs | 2022-07-17 12:25:44,316 sagemaker-training-toolkit INFO     Installing dependencies from requirements.txt:
lofp48o42k-algo-1-nqtfs | /miniconda3/bin/python -m pip install -r requirements.txt
lofp48o42k-algo-1-nqtfs | Collecting seaborn==0.11.2
lofp48o42k-algo-1-nqtfs |   Downloading seaborn-0.11.2-py3-none-any.whl (292 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 292.8/292.

Failed to delete: /tmp/tmpv9xn65bm/algo-1-nqtfs Please remove it manually.


lofp48o42k-algo-1-nqtfs exited with code 0
Aborting on container exit...
===== Job Complete =====


Our model is trained. Let's also deploy it in local mode. So that `model_fn` can load the model, SageMaker downloads the model artifacts from S3 and mounts them at `/opt/ml/model`. This way, our script can load the model from within the container.

In [60]:
#collapse-output
sk_predictor = sk_estimator.deploy(
    initial_instance_count=1,
    instance_type='local'
)

Attaching to mt618rjy05-algo-1-0hal1
mt618rjy05-algo-1-0hal1 | 2022-07-17 12:25:54,247 INFO - sagemaker-containers - No GPUs detected (normal if no gpus installed)
mt618rjy05-algo-1-0hal1 | 2022-07-17 12:25:54,251 INFO - sagemaker-containers - No GPUs detected (normal if no gpus installed)
mt618rjy05-algo-1-0hal1 | 2022-07-17 12:25:54,252 INFO - sagemaker-containers - nginx config:
mt618rjy05-algo-1-0hal1 | worker_processes auto;
mt618rjy05-algo-1-0hal1 | daemon off;
mt618rjy05-algo-1-0hal1 | pid /tmp/nginx.pid;
mt618rjy05-algo-1-0hal1 | error_log  /dev/stderr;
mt618rjy05-algo-1-0hal1 |
mt618rjy05-algo-1-0hal1 | worker_rlimit_nofile 4096;
mt618rjy05-algo-1-0hal1 |
mt618rjy05-algo-1-0hal1 | events {
mt618rjy05-algo-1-0hal1 |   worker_connections 2048;
mt618rjy05-algo-1-0hal1 | }
mt618rjy05-algo-1-0hal1 |
mt618rjy05-algo-1-0hal1 | http {
mt618rjy

Let's create a sample request and get a prediction from our local inference endpoint.

In [61]:
request = [[9.0, 3571, 1976, 0.525]]

response  = sk_predictor.predict(request)
response = int(response[0])
response

mt618rjy05-algo-1-0hal1 | 2022-07-17 12:26:09,633 INFO - sagemaker-containers - No GPUs detected (normal if no gpus installed)
mt618rjy05-algo-1-0hal1 | model_fn model_dir: /opt/ml/model


2

mt618rjy05-algo-1-0hal1 | 172.18.0.1 - - [17/Jul/2022:12:26:11 +0000] "POST /invocations HTTP/1.1" 200 136 "-" "python-urllib3/1.26.8"


In [62]:
##
# map response to correct category type
print("Predicted class category {} ({})".format(response, categories_map[response]))

Predicted class category 2 (Iris-virginica)


Since the endpoint is running in the local environment, we can observe the web server running in a Docker container.

In [63]:
!docker ps

CONTAINER ID   IMAGE                                                                               COMMAND   CREATED          STATUS          PORTS                                       NAMES
22f06fd3d03f   683313688378.dkr.ecr.us-east-1.amazonaws.com/sagemaker-scikit-learn:1.0-1-cpu-py3   "serve"   18 seconds ago   Up 17 seconds   0.0.0.0:8080->8080/tcp, :::8080->8080/tcp   mt618rjy05-algo-1-0hal1


In [64]:
##
# delete the local endpoint
sk_predictor.delete_endpoint()

Gracefully stopping... (press Ctrl+C again to force)


Note that in local mode we can only serve a single model at a time.

# SKLearn model server input and output processing

The SageMaker model server processes an incoming request in three steps:
1. input processing
2. prediction, and
3. output processing

In the last section, we saw that the `predict_fn` function in the source script defines model prediction. Similarly, SageMaker provides two additional functions to control input and output processing: `input_fn` and `output_fn`. Both functions have default implementations, which we can override by defining our own versions in the source script. If the script does not define them, the SageMaker scikit-learn model server falls back to the defaults.

* **`input_fn`**: Takes request data and deserializes the data into an object for prediction.
* **`output_fn`**: Takes the prediction result and serializes this according to the response content type.
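
To make the order of these steps concrete, the sketch below chains the three functions by hand. It is an illustration only, not the model server's actual code: `StubModel` and `handle_request` are hypothetical names, and the `{"Input": ...}` / `{"Output": ...}` JSON envelope is an assumed request and response shape.

```python
import json


class StubModel:
    """Hypothetical stand-in for a fitted scikit-learn model."""

    def predict(self, rows):
        # Always predict class 2, one label per input row.
        return [2 for _ in rows]


def input_fn(request_body, request_content_type):
    # Step 1: deserialize the request body into an object for prediction.
    if request_content_type != "application/json":
        raise ValueError("This sketch only supports application/json input")
    return json.loads(request_body)["Input"]


def predict_fn(input_data, model):
    # Step 2: run the model on the deserialized input.
    return model.predict(input_data)


def output_fn(prediction, content_type):
    # Step 3: serialize the first prediction as a JSON string.
    return json.dumps({"Output": int(prediction[0])})


def handle_request(model, request_body, content_type, accept):
    """Chain the three steps in the order described above."""
    data = input_fn(request_body, content_type)
    prediction = predict_fn(data, model)
    return output_fn(prediction, accept)


response_body = handle_request(
    StubModel(),
    json.dumps({"Input": [[9.0, 3571, 1976, 0.525]]}),
    "application/json",
    "application/json",
)
print(response_body)  # prints {"Output": 2}
```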

Let's update our script so that it accepts the input request and returns the output response as JSON objects.

In [65]:
%%writefile $script_file

import argparse, os
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
import joblib
import json

from my_custom_library import save_confusion_matrix

if __name__ == "__main__":

    # Pass in environment variables and hyperparameters
    parser = argparse.ArgumentParser()

    # Hyperparameters
    parser.add_argument("--estimators", type=int, default=15)

    # sm_model_dir: model artifacts stored here after training
    # sm-channel-train: input training data location
    # sm-channel-test: input test data location
    # sm-output-data-dir: output artifacts location
    parser.add_argument("--sm-model-dir", type=str, default=os.environ.get("SM_MODEL_DIR"))
    parser.add_argument("--sm-channel-train", type=str, default=os.environ.get("SM_CHANNEL_TRAIN"))
    parser.add_argument("--sm-channel-test", type=str, default=os.environ.get("SM_CHANNEL_TEST"))
    parser.add_argument("--sm-output-data-dir", type=str, default=os.environ.get("SM_OUTPUT_DATA_DIR"))

    args, _ = parser.parse_known_args()

    print("command line arguments: ", args)

    estimators = args.estimators
    sm_model_dir = args.sm_model_dir
    training_dir = args.sm_channel_train
    testing_dir = args.sm_channel_test
    output_data_dir = args.sm_output_data_dir

    print(f"training_dir: {training_dir}")
    print(f"training_dir files list: {os.listdir(training_dir)}")  
    print(f"testing_dir: {testing_dir}")
    print(f"testing_dir files list: {os.listdir(testing_dir)}")
    print(f"sm_model_dir: {sm_model_dir}")
    print(f"output_data_dir: {output_data_dir}")

    # Read in data
    df_train = pd.read_csv(training_dir + "/train.csv", sep=",")
    df_test = pd.read_csv(testing_dir + "/test.csv", sep=",")

    # Preprocess data
    X_train = df_train.drop(["class", "class_cat"], axis=1)
    y_train = df_train["class_cat"]
    X_test = df_test.drop(["class", "class_cat"], axis=1)
    y_test = df_test["class_cat"]

    print(f"X_train.shape: {X_train.shape}")
    print(f"y_train.shape: {y_train.shape}")
    print(f"X_test.shape: {X_test.shape}")
    print(f"y_test.shape: {y_test.shape}")

    sc = StandardScaler()
    X_train = sc.fit_transform(X_train)
    X_test = sc.transform(X_test)

    # Build model
    regressor = RandomForestClassifier(n_estimators=estimators)
    regressor.fit(X_train, y_train)
    y_pred = regressor.predict(X_test)

    # Save the model
    joblib.dump(regressor, sm_model_dir + "/model.joblib")

    # Save the results
    pd.DataFrame(y_pred).to_csv(output_data_dir + "/y_pred.csv")

    # save the confusion matrix
    cf_matrix = confusion_matrix(y_test, y_pred)
    save_confusion_matrix(cf_matrix, output_data_dir)

    # print sm_model_dir info
    print(f"sm_model_dir: {sm_model_dir}")
    print(f"sm_model_dir files list: {os.listdir(sm_model_dir)}")

    # print output_data_dir info
    print(f"output_data_dir: {output_data_dir}")
    print(f"output_data_dir files list: {os.listdir(output_data_dir)}")
    
# Model serving
def model_fn(model_dir):
    """Deserialize the fitted model."""
    print(f"model_fn model_dir: {model_dir}")
    model = joblib.load(os.path.join(model_dir, "model.joblib"))
    return model


def predict_fn(input_data, model):
    """
    predict_fn
        input_data: the deserialized request returned by input_fn
        model: (sklearn model) the model loaded by model_fn
    """
    return model.predict(input_data)


def input_fn(request_body, request_content_type):
    """
    input_fn
        request_body: The body of the request sent to the model.
        request_content_type: (string) specifies the format/variable type of the request
    """
    if request_content_type == "application/json":
        request_body = json.loads(request_body)
        inpVar = request_body["Input"]
        return inpVar
    else:
        raise ValueError("This model only supports application/json input")


def output_fn(prediction, content_type):
    """
    output_fn
        prediction: the returned value from predict_fn
        content_type: the content type the endpoint expects to be returned. Ex: JSON, string
    """
    res = int(prediction[0])
    respJSON = {"Output": res}
    return respJSON

Overwriting ./datasets/2022-07-07-sagemaker-script-mode/src/train_and_serve.py
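
One thing to keep in mind with the script above: the `StandardScaler` is fit during training, but only the classifier is written to `model.joblib`, so `predict_fn` receives the request features unscaled. A possible way to keep training and serving consistent is to bundle both steps in a scikit-learn `Pipeline` and save that single object. The sketch below is an assumption about how this could look, not part of the original script. It uses scikit-learn's bundled Iris data and a temporary directory in place of `SM_MODEL_DIR`.

```python
import os
import tempfile

import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: stand-in data; the notebook reads train.csv and test.csv instead.
X, y = load_iris(return_X_y=True)

# Scaler and classifier are fit together and travel as one object.
pipeline = make_pipeline(
    StandardScaler(),
    RandomForestClassifier(n_estimators=10, random_state=0),
)
pipeline.fit(X, y)

# Assumption: a temporary directory plays the role of SM_MODEL_DIR.
model_dir = tempfile.mkdtemp()
joblib.dump(pipeline, os.path.join(model_dir, "model.joblib"))

# What model_fn would load: the restored pipeline scales raw features itself.
restored = joblib.load(os.path.join(model_dir, "model.joblib"))
prediction = restored.predict(X[:1])
print(list(restored.named_steps), prediction.shape)
```

With this layout, `model_fn` and `predict_fn` can stay exactly as they are, since the loaded object still exposes `predict`.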


In [66]:
##
# train and deploy model with input and output as JSON objects
sk_estimator = SKLearn(
    entry_point=script_file_name,
    source_dir=script_path,
    dependencies=[custom_library_path],
    role=role,
    instance_count=1,
    instance_type='local',
    framework_version="1.0-1",
    hyperparameters={"estimators":10},
)

sk_estimator.fit({"train": s3_train_uri, "test": s3_test_uri})

sk_predictor = sk_estimator.deploy(
    initial_instance_count=1,
    instance_type='local'
)

Creating 3494su1ags-algo-1-zcktn ... 
Creating 3494su1ags-algo-1-zcktn ... done
Attaching to 3494su1ags-algo-1-zcktn
3494su1ags-algo-1-zcktn | 2022-07-17 12:26:14,634 sagemaker-containers INFO     Imported framework sagemaker_sklearn_container.training
3494su1ags-algo-1-zcktn | 2022-07-17 12:26:14,638 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
3494su1ags-algo-1-zcktn | 2022-07-17 12:26:14,647 sagemaker_sklearn_container.training INFO     Invoking user training script.
3494su1ags-algo-1-zcktn | 2022-07-17 12:26:14,849 sagemaker-training-toolkit INFO     Installing dependencies from requirements.txt:
3494su1ags-algo-1-zcktn | /miniconda3/bin/python -m pip install -r requirements.txt
3494su1ags-algo-1-zcktn | Collecting seaborn==0.11.2
3494su1ags-algo-1-zcktn |   Downloading seaborn-0.11.2-py3-none-any.whl (292 kB)

Failed to delete: /tmp/tmpok7mhpfa/algo-1-zcktn Please remove it manually.


===== Job Complete =====
Attaching to vb2p0ax589-algo-1-h2nqv
vb2p0ax589-algo-1-h2nqv | 2022-07-17 12:26:24,668 INFO - sagemaker-containers - No GPUs detected (normal if no gpus installed)
vb2p0ax589-algo-1-h2nqv | 2022-07-17 12:26:24,672 INFO - sagemaker-containers - No GPUs detected (normal if no gpus installed)
vb2p0ax589-algo-1-h2nqv | 2022-07-17 12:26:24,674 INFO - sagemaker-containers - nginx config: 
vb2p0ax589-algo-1-h2nqv | worker_processes auto;
vb2p0ax589-algo-1-h2nqv | daemon off;
vb2p0ax589-algo-1-h2nqv | pid /tmp/nginx.pid;
vb2p0ax589-algo-1-h2nqv | error_log  /dev/stderr;
vb2p0ax589-algo-1-h2nqv | 
vb2p0ax589-algo-1-h2nqv | worker_rlimit_nofile 4096;
vb2p0ax589-algo-1-h2nqv | 
vb2p0ax589-algo-1-h2nqv | events {
vb2p0ax589-algo-1-h2nqv |   worker_connections 2048;
vb2p0ax589-algo-1-h2nqv | }
vb2p0ax589-algo-1-h2nqv | 
vb2p0ax589-algo-1-h2nqv |

In [67]:
sk_endpoint_name = sk_predictor.endpoint_name
sk_endpoint_name

'sagemaker-scikit-learn-2022-07-17-12-26-22-608'

In [68]:
##
# send JSON request to endpoint
import json

client = session_local.sagemaker_runtime_client

request_body = {"Input": [[9.0, 3571, 1976, 0.525]]}
# serialize the request to a JSON string
payload = json.dumps(request_body)

response = client.invoke_endpoint(
    EndpointName=sk_endpoint_name, ContentType="application/json", Body=payload
)

result = json.loads(response["Body"].read().decode())["Output"]
result

vb2p0ax589-algo-1-h2nqv | 2022-07-17 12:26:39,338 INFO - sagemaker-containers - No GPUs detected (normal if no gpus installed)
vb2p0ax589-algo-1-h2nqv | model_fn model_dir: /opt/ml/model


2

vb2p0ax589-algo-1-h2nqv | 172.18.0.1 - - [17/Jul/2022:12:26:40 +0000] "POST /invocations HTTP/1.1" 200 13 "-" "python-urllib3/1.26.8"


In [69]:
##
# map the JSON response to the correct category type
print("Predicted class category {} ({})".format(result, categories_map[result]))

Predicted class category 2 (Iris-virginica)
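
The request and response handling above can be wrapped in a small helper so the same call works for any runtime client. This is a sketch under stated assumptions: `invoke_json` and `FakeRuntimeClient` are hypothetical names, and the fake client only imitates the shape of the `invoke_endpoint` response used in this notebook (a dict whose `"Body"` can be read as bytes) so the helper can be tried without a running endpoint.

```python
import io
import json


def invoke_json(runtime_client, endpoint_name, rows):
    """Send rows as {"Input": rows} and return the value stored under "Output"."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps({"Input": rows}),
    )
    return json.loads(response["Body"].read().decode())["Output"]


class FakeRuntimeClient:
    """Hypothetical stand-in that mimics the response shape of invoke_endpoint."""

    def invoke_endpoint(self, EndpointName, ContentType, Body):
        # Check that the payload has the expected envelope, then
        # answer with a fixed class label wrapped like output_fn does.
        rows = json.loads(Body)["Input"]
        assert len(rows) >= 1
        return {"Body": io.BytesIO(json.dumps({"Output": 2}).encode())}


demo_result = invoke_json(
    FakeRuntimeClient(), "my-endpoint", [[9.0, 3571, 1976, 0.525]]
)
print(demo_result)  # prints 2
```

In the notebook, the first argument would be `session_local.sagemaker_runtime_client` and the second `sk_endpoint_name`.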


In [None]:
sk_predictor.delete_endpoint()

# SKLearn model training and serving in SageMaker managed environment

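We have successfully tested our script in a local environment, and now we are ready to train and serve in the SageMaker managed environment. For this, we only need to change the `instance_type` passed to the SKLearn estimator (and later to `deploy`) from `'local'` to a SageMaker ML instance type. With a non-local instance type and no explicit session, the estimator uses a regular SageMaker session instead of a local one.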
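
Comparing the estimator arguments used in this notebook, the only one that differs between the local run and the managed run is `instance_type`. The sketch below makes that switch explicit with a single flag. `estimator_settings` is a hypothetical helper, and the values are simply the ones used in this notebook.

```python
def estimator_settings(run_locally, managed_instance_type="ml.m5.large"):
    """Return the estimator arguments, switching only the instance type."""
    instance_type = "local" if run_locally else managed_instance_type
    return {
        "instance_count": 1,
        "instance_type": instance_type,
        "framework_version": "1.0-1",
        "hyperparameters": {"estimators": 10},
    }


local_settings = estimator_settings(run_locally=True)
managed_settings = estimator_settings(run_locally=False)

# Collect the keys whose values differ between the two environments.
changed = {
    key for key in local_settings if local_settings[key] != managed_settings[key]
}
print(changed)  # prints {'instance_type'}
```

The returned dictionary could then be unpacked into the estimator call alongside the script arguments, for example `SKLearn(entry_point=script_file_name, source_dir=script_path, role=role, **estimator_settings(run_locally=False))`.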

In [70]:
##
# train the model on a SageMaker managed instance
sk_estimator = SKLearn(
    entry_point=script_file_name,
    source_dir=script_path,
    dependencies=[custom_library_path],
    role=role,
    instance_count=1,
    instance_type='ml.m5.large',
    framework_version="1.0-1",
    hyperparameters={"estimators":10},
)

sk_estimator.fit({"train": s3_train_uri, "test": s3_test_uri})

2022-07-17 12:26:40 Starting - Starting the training job...
2022-07-17 12:27:04 Starting - Preparing the instances for training
ProfilerReport-1658060800: InProgress
......
2022-07-17 12:28:04 Downloading - Downloading input data......
2022-07-17 12:29:04 Training - Downloading the training image...
2022-07-17 12:29:30 Training - Training image download completed. Training in progress.
2022-07-17 12:29:33,369 sagemaker-containers INFO     Imported framework sagemaker_sklearn_container.training
2022-07-17 12:29:33,375 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
2022-07-17 12:29:33,400 sagemaker_sklearn_container.training INFO     Invoking user training script.
2022-07-17 12:29:33,837 sagemaker-training-toolkit INFO     Installing dependencies from requirements.txt:
/miniconda3/bin/python -m pip install -r requirements.txt
Collecting seaborn==0.11.2
  Downloading seaborn-0.11.2-py3-none-any.whl (292 kB

In [72]:
sk_estimator.sagemaker_session

<sagemaker.session.Session at 0x7f6b47fdd880>

In [73]:
sk_predictor = sk_estimator.deploy(
    initial_instance_count=1,
    instance_type='ml.t2.medium'
)

-----------!

In [77]:
##
# send JSON request to endpoint
import json

client = session.sagemaker_runtime_client

request_body = {"Input": [[9.0, 3571, 1976, 0.525]]}
# serialize the request to a JSON string
payload = json.dumps(request_body)

response = client.invoke_endpoint(
    EndpointName=sk_predictor.endpoint_name, 
    ContentType="application/json", 
    Body=payload
)

result = json.loads(response["Body"].read().decode())["Output"]
result

2

In [78]:
##
# map the JSON response to the correct category type
print("Predicted class category {} ({})".format(result, categories_map[result]))

Predicted class category 2 (Iris-virginica)


In [79]:
sk_predictor.delete_endpoint()