# Amazon SageMaker + WhyLabs

This example shows how to deploy a SageMaker endpoint with WhyLabs integration.

In [1]:
!pip install boto3==1.18.39 python-dotenv==0.19.0 scikit-learn==0.24.2 pandas==1.3.2



In [10]:
import os
import json
import urllib.request as urllib
from joblib import dump
import pandas as pd
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
import boto3
from dotenv import dotenv_values
from utils import delete_model, delete_endpoint_config, delete_endpoint, is_endpoint_running

## Table of Contents
- [1 - AWS Configuration](#1)
- [2 - Train a Model](#2)
- [3 - Custom image building and pushing to ECR](#3)
- [4 - Create SageMaker Endpoint](#4)
    - [a. Model Creation](#a)
    - [b. Endpoint Config Creation](#b)
    - [c. Endpoint Creation](#c)
- [5 - Test Endpoint](#5)
- [6 - Delete AWS resources](#6)

<a name='1'></a>
## 1 - AWS Configuration

In [3]:
AWS_PROFILE_NAME = "default"
session = boto3.session.Session(profile_name=AWS_PROFILE_NAME)
AWS_REGION_NAME = session.region_name

In [4]:
sts = session.client("sts")
sm = session.client('sagemaker', region_name=AWS_REGION_NAME)
AWS_ACCOUNT_ID = sts.get_caller_identity().get("Account")
DOCKER_IMAGE_NAME = "whylabs-sagemaker"

In [5]:
!cd code/ && chmod +x download_iris.sh && ./download_iris.sh

chmod: download_iris.sh: No such file or directory


<a name='2'></a>
## 2 - Train a Model

Download Iris Species dataset:

In [6]:
# Download Iris dataset and save it as csv
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
raw_data = urllib.urlopen(url)
try:
    os.mkdir("code/dataset/")
except FileExistsError:
    # Only ignore the "directory already exists" case; let other errors surface
    print(" 'dataset' directory already existed. Moving forward")
# Save data as csv
with open('code/dataset/Iris.csv', 'wb') as file:
    file.write(raw_data.read())

 'dataset' directory already existed. Moving forward


Split data set into train and test sets

In [11]:
# Load data
data = pd.read_csv('code/dataset/Iris.csv', header=None)
# Separating the independent variables from dependent variables
X = data.iloc[:, 0:4].values
y = data.iloc[:, -1].values
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.30)

Train the SVM classifier

In [12]:
# Train a classifier
print("Train started.")
model = SVC()
model.fit(x_train, y_train)
print("Train finished.")
# Save the model
dump(model, 'code/model.joblib')
print("Model saved as model.joblib")

Train started.
Train finished.
Model saved as model.joblib
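The held-out split is created above but never used. A minimal sketch of an evaluation step follows, assuming scikit-learn's bundled copy of Iris (`load_iris`) so that it runs without the CSV download. The `random_state` and `stratify` arguments are additions for reproducibility and are not part of the notebook's split, and the bundled labels are integers rather than the `Iris-setosa` style strings in the CSV.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Bundled Iris data (assumption: stands in for code/dataset/Iris.csv)
X, y = load_iris(return_X_y=True)

# Same 70/30 split as above, but seeded and stratified by class
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42, stratify=y
)

# Train with default hyperparameters, as in the notebook
model = SVC()
model.fit(x_train, y_train)

# Score the classifier on rows it did not see during training
accuracy = accuracy_score(y_test, model.predict(x_test))
print(f"Test accuracy: {accuracy:.2f}")
```

In the notebook itself, the same `accuracy_score(y_test, model.predict(x_test))` call could be applied to the existing `x_test` and `y_test` arrays.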



<a name='3'></a>
## 3 - Custom image building and pushing to ECR

In [13]:
os.system(f"./build_push.sh {DOCKER_IMAGE_NAME} {AWS_PROFILE_NAME}")

Image name whylabs-sagemaker
Profile name default
{
    "repositories": [
        {
            "repositoryArn": "arn:aws:ecr:us-east-1:377983720232:repository/whylabs-sagemaker",
            "registryId": "377983720232",
            "repositoryName": "whylabs-sagemaker",
            "repositoryUri": "377983720232.dkr.ecr.us-east-1.amazonaws.com/whylabs-sagemaker",
            "createdAt": "2021-09-08T17:32:14-05:00",
            "imageTagMutability": "MUTABLE",
            "imageScanningConfiguration": {
                "scanOnPush": false
            },
            "encryptionConfiguration": {
                "encryptionType": "AES256"
            }
        }
    ]
}
Login Succeeded


#1 [internal] load build definition from Dockerfile
#1 sha256:1d16191eb0edc2a4ae18df11b0c851eedec9e0344d141dee636dba0535e95c11
#1 transferring dockerfile: 37B 0.0s done
#1 DONE 0.0s

#2 [internal] load .dockerignore
#2 sha256:90845661462f7aaba624dd9d89b221f22e446fb81e5b005fdaa8c7f1e21862f0
#2 transferring context: 35B done
#2 DONE 0.0s

#3 [internal] load metadata for docker.io/library/ubuntu:18.04
#3 sha256:ae46bbb1b755529d0da663ca0256a22acd7c9fe21844946c149800baa67c4e4b
#3 DONE 0.6s

#4 [ 1/11] FROM docker.io/library/ubuntu:18.04@sha256:9bc830af2bef73276515a29aa896eedfa7bdf4bdbc5c1063b4c457a4bbb8cd79
#4 sha256:d1750e31869fe5a60e2fad31896f5d8b06a6c26d3a20b7f5836401e641279689
#4 DONE 0.0s

#7 [internal] load build context
#7 sha256:d7bf9fc60478e826a0a0597dae2b9590d24aa052cdce1d8aacabb0bdbe1a4ad7
#7 transferring context: 10.22kB 0.0s done
#7 DONE 0.0s

#10 [ 6/11] RUN mkdir -p /opt/ml/models
#10 sha256:fae6a7def4bb4bcb0ebe5b2b2dcd6e753fc7906fd8eae35ef7ac9fec9302bbcd
#10 CACHED

#5 [ 2/1

The push refers to repository [377983720232.dkr.ecr.us-east-1.amazonaws.com/whylabs-sagemaker]
5f70bf18a086: Preparing
2705541a770f: Preparing
0db6bc60cf67: Preparing
66d55ab066d7: Preparing
94df9b81d20b: Preparing
1d32553c20b9: Preparing
87953760e835: Preparing
42fafac3315e: Preparing
d20859943999: Preparing
cb8615e15e41: Preparing
6babb56be259: Preparing
42fafac3315e: Waiting
cb8615e15e41: Waiting
d20859943999: Waiting
1d32553c20b9: Waiting
6babb56be259: Waiting
87953760e835: Waiting
0db6bc60cf67: Layer already exists
66d55ab066d7: Layer already exists
5f70bf18a086: Layer already exists
94df9b81d20b: Layer already exists
87953760e835: Layer already exists
1d32553c20b9: Layer already exists
42fafac3315e: Layer already exists
d20859943999: Layer already exists
cb8615e15e41: Layer already exists
6babb56be259: Layer already exists
2705541a770f: Pushed
latest: digest: sha256:89468379b705a94b5486f987a3b544f59ee9779daba20a6ccbbd50f5cd8a490a size: 2615


0
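`os.system` only returns the exit code (the `0` above), so a failed build or push is easy to miss. A hedged alternative is sketched below with `subprocess.run` and `check=True`, which raises on a non-zero exit code. The helper name `run_checked` is hypothetical, and a stand-in command is used so the sketch runs without Docker or AWS credentials.

```python
import subprocess
import sys


def run_checked(command):
    """Run a command and return its stdout; raise CalledProcessError on failure."""
    completed = subprocess.run(
        command, check=True, capture_output=True, text=True
    )
    return completed.stdout


# Stand-in command so the sketch runs anywhere. In the notebook the command
# would be ["./build_push.sh", DOCKER_IMAGE_NAME, AWS_PROFILE_NAME].
output = run_checked([sys.executable, "-c", "print('build ok')"])
print(output.strip())
```

Because `capture_output=True` collects the log instead of streaming it, the build output only appears once the script has finished.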

<a name='4'></a>
## 4 - Create SageMaker Endpoint

The steps to deploy a SageMaker model are:

1. Create a model
2. Create an endpoint configuration
3. Create a SageMaker endpoint

<a name='a'></a>
### a. Model Creation

**Replace the following Role ARN accordingly.**

In [14]:
EXECUTION_ROLE_ARN = f"arn:aws:iam::{AWS_ACCOUNT_ID}:role/SageMakerExecution"

In [15]:
ECR_IMAGE_URI = f"{AWS_ACCOUNT_ID}.dkr.ecr.{AWS_REGION_NAME}.amazonaws.com/{DOCKER_IMAGE_NAME}:latest"
ENDPOINT_NAME = "whylabs-sagemaker"
INSTANCE_TYPE = "ml.m4.xlarge"

Load the variables needed for the __WhyLabs configuration__, defined in the __.env file__, as a dictionary. These values are set as environment variables once the Docker container is running within SageMaker.

In [16]:
# Load .env file as dictionary
environment = dotenv_values("code/.env")

In [17]:
# ECR image to be used
PRIMARY_CONTAINER = {
    'Image': ECR_IMAGE_URI, 
    "Environment": environment,
}
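`dotenv_values` returns `None` for a key that is declared without a value, while the `Environment` map passed to `create_model` must contain only strings. The sketch below shows one way to drop such entries before building the container definition. The helper name `drop_unset` and the example keys are hypothetical and are not taken from the project's `.env` file.

```python
def drop_unset(values):
    """Return a copy of the mapping without entries whose value is None."""
    return {key: value for key, value in values.items() if value is not None}


# Hypothetical contents, shaped like the dict returned by dotenv_values()
example_values = {
    "EXAMPLE_API_KEY": "xxxx",   # KEY=value
    "EXAMPLE_EMPTY": "",         # KEY=   (empty string is still a string)
    "EXAMPLE_NO_VALUE": None,    # KEY    (no "=" in the file)
}

cleaned = drop_unset(example_values)
print(sorted(cleaned))
```

In the notebook this would be applied as `environment = drop_unset(dotenv_values("code/.env"))`.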

In [18]:
try:
    # Create sagemaker model
    r = sm.create_model(
        ModelName=ENDPOINT_NAME,
        ExecutionRoleArn=EXECUTION_ROLE_ARN,
        PrimaryContainer=PRIMARY_CONTAINER,
    )
    print("SageMaker model created.")
except sm.exceptions.ClientError as e:  # only boto3 client errors carry .response
    print(e.response["Error"])

SageMaker model created.


<a name='b'></a>
### b. Endpoint Config Creation

In [19]:
ENDPOINT_CONFIG_NAME = ENDPOINT_NAME + '-config'

In [20]:
try:
    # create endpoint configuration
    _ = sm.create_endpoint_config(
        EndpointConfigName=ENDPOINT_CONFIG_NAME,
        ProductionVariants=[
            {
                'InstanceType': INSTANCE_TYPE,
                'InitialVariantWeight': 1,
                'InitialInstanceCount': 1,
                'ModelName': ENDPOINT_NAME,
                'VariantName': 'AllTraffic'
            }
        ]
    )
    print("Endpoint configuration created.")
except sm.exceptions.ClientError as e:  # only boto3 client errors carry .response
    print(e.response["Error"])

Endpoint configuration created.


<a name='c'></a>
### c. Endpoint Creation

In [21]:
try:
    # create endpoint
    r = sm.create_endpoint(
        EndpointName=ENDPOINT_NAME,
        EndpointConfigName=ENDPOINT_CONFIG_NAME
    )
    print(f"Completed {ENDPOINT_NAME} model endpoint deployment !!!")
except sm.exceptions.ClientError as e:  # only boto3 client errors carry .response
    print(e.response["Error"])

Completed whylabs-sagemaker model endpoint deployment !!!


<a name='5'></a>
## 5 - Test Endpoint 

In [22]:
# Payload for /invocations endpoint
payload = json.dumps({
    "sepal_length_cm": 5.1,
    "sepal_width_cm": 3.5,
    "petal_length_cm": 1.4,
    "petal_width_cm": 0.2
})
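To send rows from the test split rather than a hand-written sample, each row has to be mapped onto the four field names used in the payload above. A minimal sketch follows, assuming the columns keep the order of the UCI file (sepal length, sepal width, petal length, petal width). The helper name `row_to_payload` is hypothetical.

```python
import json

# Field names taken from the payload above, in the CSV's column order
FEATURE_NAMES = [
    "sepal_length_cm",
    "sepal_width_cm",
    "petal_length_cm",
    "petal_width_cm",
]


def row_to_payload(row):
    """Serialize one four-value feature row as a JSON request body."""
    if len(row) != len(FEATURE_NAMES):
        raise ValueError(f"expected {len(FEATURE_NAMES)} values, got {len(row)}")
    # float() also converts NumPy scalars, which json.dumps cannot serialize
    return json.dumps(
        {name: float(value) for name, value in zip(FEATURE_NAMES, row)}
    )


payload_example = row_to_payload([5.1, 3.5, 1.4, 0.2])
print(payload_example)
```

With the notebook's arrays, `row_to_payload(x_test[0])` would build the body for the first held-out row.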

Wait until the endpoint reaches the "InService" status before testing it.
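Instead of re-running the status check by hand, the wait can be delegated to the `endpoint_in_service` waiter that the boto3 SageMaker client provides. The sketch below is one way to wrap it. The function name and the default polling values are assumptions, not part of the notebook's `utils` module.

```python
def wait_until_in_service(sm_client, endpoint_name, delay_seconds=30, max_attempts=60):
    """Block until the endpoint is InService, then return its reported status.

    sm_client is expected to be a boto3 SageMaker client, for example
    session.client("sagemaker"). The waiter raises an exception if the
    endpoint fails or the attempts run out.
    """
    waiter = sm_client.get_waiter("endpoint_in_service")
    waiter.wait(
        EndpointName=endpoint_name,
        WaiterConfig={"Delay": delay_seconds, "MaxAttempts": max_attempts},
    )
    # Read the status back so the caller can log or assert on it
    return sm_client.describe_endpoint(EndpointName=endpoint_name)["EndpointStatus"]
```

In the notebook this would be called as `wait_until_in_service(sm, ENDPOINT_NAME)` before invoking the endpoint.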

In [24]:
# Invoke the endpoint using the SageMaker runtime client
sg = session.client("runtime.sagemaker", region_name=AWS_REGION_NAME)
status = is_endpoint_running(ENDPOINT_NAME, AWS_PROFILE_NAME, AWS_REGION_NAME)
# Check if model was created successfully
if status == "InService":
    response = sg.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        Body=payload,
        ContentType='application/json',
    )
    # Decode the response
    print(json.loads(response["Body"].read().decode("utf-8")))
else:
    print(f"Endpoint status is {status}.")

{'data': {'class': 'Iris-setosa'}, 'message': 'Success'}


Response should look like this:
```python
{'data': {'class': 'Iris-setosa'}, 'message': 'Success'}
```

<a name='6'></a>
## 6 - Delete AWS resources

In [25]:
status = is_endpoint_running(ENDPOINT_NAME, AWS_PROFILE_NAME, AWS_REGION_NAME)

In [26]:
if status == "InService":
    delete_model(sm, ENDPOINT_NAME)
    delete_endpoint_config(sm, ENDPOINT_CONFIG_NAME)
    delete_endpoint(sm, ENDPOINT_NAME)

Model whylabs-sagemaker deleted.
Endpoint configuration whylabs-sagemaker-config deleted.
Endpoint whylabs-sagemaker deleted.
