# Amazon SageMaker + WhyLabs

This example shows how to deploy a SageMaker endpoint with WhyLabs integration.

In [None]:
!pip install boto3==1.18.39 python-dotenv==0.19.0 scikit-learn==0.24.2 pandas==1.3.2

In [None]:
import os
import json
import random
import urllib.request as urllib
from joblib import dump
import pandas as pd
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
import boto3
from dotenv import dotenv_values
from utils import delete_model, delete_endpoint_config, delete_endpoint, is_endpoint_running

## Table of Contents
- [1 - AWS Configuration](#1)
- [2 - Train a Model](#2)
- [3 - Custom image building and pushing to ECR](#3)
- [4 - Create SageMaker Endpoint](#4)
    - [a. Model Creation](#a)
    - [b. Endpoint Config Creation](#b)
    - [c. Endpoint Creation](#c)
- [5 - Test Endpoint](#5)
- [6 - Delete AWS resources](#6)

<a name='1'></a>
## 1 - AWS Configuration

In [None]:
AWS_PROFILE_NAME = "default"
session = boto3.session.Session(profile_name=AWS_PROFILE_NAME)
AWS_REGION_NAME = session.region_name

In [None]:
sts = session.client("sts")
sm = session.client('sagemaker', region_name=AWS_REGION_NAME)
AWS_ACCOUNT_ID = sts.get_caller_identity().get("Account")
DOCKER_IMAGE_NAME = "whylabs-sagemaker"

<a name='2'></a>
## 2 - Train a Model

Download Iris Species dataset:

In [None]:
# Download the Iris dataset and save it as CSV
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
dataset_path = "code/dataset/Iris.csv"
# exist_ok=True keeps reruns from failing when the directory is already there
os.makedirs(os.path.dirname(dataset_path), exist_ok=True)
if os.path.exists(dataset_path):
    print("Dataset already downloaded. Moving forward.")
else:
    # Close the HTTP response and the output file even if the copy fails
    with urllib.urlopen(url) as raw_data, open(dataset_path, "wb") as file:
        file.write(raw_data.read())
    print("Dataset downloaded successfully!")
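Before training, it can help to confirm that the download is the expected CSV rather than, say, an error page. The sketch below is one way to do that with the standard library. `validate_iris_csv` is a hypothetical helper, not part of this example's `utils` module, and the five-column layout (four numeric features followed by the species label) is assumed from how the data is sliced later in the notebook.

```python
import csv
import io


def validate_iris_csv(text, expected_columns=5):
    """Return the number of data rows, raising ValueError on a malformed row.

    Hypothetical helper: assumes four numeric features followed by a label.
    """
    rows = 0
    for line_number, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row:
            # Skip blank lines, which may appear at the end of the file
            continue
        if len(row) != expected_columns:
            raise ValueError(
                f"line {line_number}: expected {expected_columns} fields, got {len(row)}"
            )
        # Every field except the trailing label must parse as a float
        for value in row[:-1]:
            float(value)
        rows += 1
    return rows


sample = "5.1,3.5,1.4,0.2,Iris-setosa\n7.0,3.2,4.7,1.4,Iris-versicolor\n\n"
print(validate_iris_csv(sample))
```

On the notebook's file, this could be called with the contents of `code/dataset/Iris.csv` read as text.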

Split the dataset into train and test sets:

In [None]:
# Load data
data = pd.read_csv('code/dataset/Iris.csv', header=None)
# Separating the independent variables from dependent variables
X = data.iloc[:, 0:4].values
y = data.iloc[:, -1].values
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.30)

Train the SVM classifier:

In [None]:
# Train a classifier
print("Train started.")
model = SVC()
model.fit(x_train, y_train)
print("Train finished.")
# Save the model
dump(model, 'code/model.joblib')
print("Model saved as model.joblib")

<a name='3'></a>
## 3 - Custom image building and pushing to ECR

In [None]:
import subprocess
subprocess.run(["./build_push.sh", DOCKER_IMAGE_NAME, AWS_PROFILE_NAME], check=True)  # raise if the build or push fails

<a name='4'></a>
## 4 - Create SageMaker Endpoint

The steps to deploy a SageMaker model are:

1. Create a model
2. Create an endpoint configuration
3. Create a SageMaker endpoint
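
The three steps above map onto three boto3 SageMaker client calls: `create_model`, `create_endpoint_config`, and `create_endpoint`. As a rough sketch of how the arguments chain together (the model name feeds the endpoint configuration, which in turn feeds the endpoint), the hypothetical helper `build_deployment_requests` only assembles the keyword arguments and makes no AWS calls. The account ID, role, and image URI in the usage lines are placeholders.

```python
def build_deployment_requests(name, image_uri, role_arn, instance_type, environment=None):
    """Assemble keyword arguments for the three SageMaker calls (hypothetical helper)."""
    config_name = f"{name}-config"
    model = {
        "ModelName": name,
        "ExecutionRoleArn": role_arn,
        "PrimaryContainer": {"Image": image_uri, "Environment": dict(environment or {})},
    }
    endpoint_config = {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": name,  # must match the model created in step 1
                "InstanceType": instance_type,
                "InitialInstanceCount": 1,
                "InitialVariantWeight": 1,
            }
        ],
    }
    endpoint = {
        "EndpointName": name,
        "EndpointConfigName": config_name,  # must match the config created in step 2
    }
    return model, endpoint_config, endpoint


# Placeholder values for illustration only
model, endpoint_config, endpoint = build_deployment_requests(
    name="whylabs-sagemaker",
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/whylabs-sagemaker:latest",
    role_arn="arn:aws:iam::123456789012:role/SageMakerExecution",
    instance_type="ml.m4.xlarge",
)
print(endpoint_config["ProductionVariants"][0]["ModelName"])
print(endpoint["EndpointConfigName"])
```

With a client in hand, each dictionary would be unpacked into its call, for example `sm.create_model(**model)`.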

<a name='a'></a>
### a. Model Creation

**Replace the following Role ARN accordingly.**

In [None]:
EXECUTION_ROLE_ARN = f"arn:aws:iam::{AWS_ACCOUNT_ID}:role/SageMakerExecution"

In [None]:
ECR_IMAGE_URI = f"{AWS_ACCOUNT_ID}.dkr.ecr.{AWS_REGION_NAME}.amazonaws.com/{DOCKER_IMAGE_NAME}:latest"
ENDPOINT_NAME = "whylabs-sagemaker"
INSTANCE_TYPE = "ml.m4.xlarge"

Load the variables needed for the __WhyLabs configuration__, defined in the __.env file__, as a dictionary. These values are set as environment variables once the Docker container is running within SageMaker.

In [None]:
# Load .env file as dictionary
environment = dotenv_values("code/.env")
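
`dotenv_values` returns `None` for a key that appears in the file without a value, whereas the container `Environment` map passed to SageMaker expects string values. A minimal sketch of a filter that could be applied before building the container definition follows. `clean_environment` is a hypothetical helper, and the variable names in the usage lines are made up for illustration.

```python
def clean_environment(env):
    """Drop unset entries and coerce the rest to strings (hypothetical helper)."""
    return {key: str(value) for key, value in env.items() if value is not None}


# Made-up variable names; the real ones live in code/.env
raw = {"EXAMPLE_API_KEY": "abc123", "EXAMPLE_FLAG": None, "EXAMPLE_EMPTY": ""}
cleaned = clean_environment(raw)
# Print only the keys so that secret values do not end up in the notebook output
print(sorted(cleaned))
```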

In [None]:
# ECR image to be used
PRIMARY_CONTAINER = {
    'Image': ECR_IMAGE_URI, 
    "Environment": environment,
}

In [None]:
try:
    # Create sagemaker model
    r = sm.create_model(
        ModelName=ENDPOINT_NAME,
        ExecutionRoleArn=EXECUTION_ROLE_ARN,
        PrimaryContainer=PRIMARY_CONTAINER,
    )
    print("SageMaker model created.")
except sm.exceptions.ClientError as e:
    # Only a botocore ClientError carries a `response` dictionary
    print(e.response["Error"])

<a name='b'></a>
### b. Endpoint Config Creation

In [None]:
ENDPOINT_CONFIG_NAME = ENDPOINT_NAME + '-config'

In [None]:
try:
    # create endpoint configuration
    _ = sm.create_endpoint_config(
        EndpointConfigName=ENDPOINT_CONFIG_NAME,
        ProductionVariants=[
            {
                'InstanceType': INSTANCE_TYPE,
                'InitialVariantWeight': 1,
                'InitialInstanceCount': 1,
                'ModelName': ENDPOINT_NAME,
                'VariantName': 'AllTraffic'
            }
        ]
    )
    print("Endpoint configuration created.")
except sm.exceptions.ClientError as e:
    # Only a botocore ClientError carries a `response` dictionary
    print(e.response["Error"])

<a name='c'></a>
### c. Endpoint Creation

In [None]:
try:
    # create endpoint
    r = sm.create_endpoint(
        EndpointName=ENDPOINT_NAME,
        EndpointConfigName=ENDPOINT_CONFIG_NAME
    )
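    # create_endpoint returns immediately; the endpoint is still being provisioned
    print(f"Endpoint {ENDPOINT_NAME} creation started.")
except sm.exceptions.ClientError as e:
    # Only a botocore ClientError carries a `response` dictionary
    print(e.response["Error"])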

<a name='5'></a>
## 5 - Test Endpoint 

Wait until the endpoint reaches the "InService" status before testing it.
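
Endpoint creation is asynchronous, so the status has to be polled. The sketch below shows one way to write that loop. `wait_for_endpoint` is a hypothetical helper that takes any zero-argument callable returning the current status, so it can be tried without AWS access. In the notebook, that callable could wrap `sm.describe_endpoint(EndpointName=ENDPOINT_NAME)["EndpointStatus"]`. boto3 also provides a built-in `endpoint_in_service` waiter on the SageMaker client, which may be simpler if its defaults suit you.

```python
import time


def wait_for_endpoint(get_status, delay=30, max_attempts=60, sleep=time.sleep):
    """Poll get_status() until it returns "InService" (hypothetical helper).

    Raises RuntimeError if the endpoint fails, TimeoutError if attempts run out.
    """
    status = None
    for _ in range(max_attempts):
        status = get_status()
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError("Endpoint creation failed")
        sleep(delay)
    raise TimeoutError(f"Endpoint still {status!r} after {max_attempts} attempts")


# Simulated status sequence standing in for real describe_endpoint calls
statuses = iter(["Creating", "Creating", "InService"])
print(wait_for_endpoint(lambda: next(statuses), delay=0))
```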

In [None]:
labels = ["sepal_length_cm", "sepal_width_cm", "petal_length_cm", "petal_width_cm"]

In [None]:
# Create a SageMaker runtime client to invoke the endpoint
sg = session.client("runtime.sagemaker", region_name=AWS_REGION_NAME)
status = is_endpoint_running(ENDPOINT_NAME, AWS_PROFILE_NAME, AWS_REGION_NAME)
# Only send traffic once the endpoint is in service
if status == "InService":
    # Sends requests until the cell is interrupted
    while True:
        # Build a payload from a randomly chosen test sample
        payload = dict(zip(labels, random.choice(x_test)))
        payload = json.dumps(payload)
        # Send payload to sagemaker endpoint
        response = sg.invoke_endpoint(
            EndpointName=ENDPOINT_NAME,
            Body=payload,
            ContentType='application/json',
        )
        # Decode the response
        print(json.loads(response["Body"].read().decode("utf-8")))
else:
    print(f"Endpoint status is {status}.")

Response should look like this:
```python
{'data': {'class': 'Iris-setosa'}, 'message': 'Success'}
```
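
The request and response are both plain JSON, so the two conversions can be isolated and checked without a live endpoint. The sketch below assumes the response shape shown above. `build_payload` and `extract_class` are hypothetical helper names, and treating any `message` other than `Success` as an error is an assumption about the container's behavior.

```python
import json

LABELS = ["sepal_length_cm", "sepal_width_cm", "petal_length_cm", "petal_width_cm"]


def build_payload(row, labels=LABELS):
    """Serialize one sample as the JSON object sent to the endpoint."""
    if len(row) != len(labels):
        raise ValueError(f"expected {len(labels)} features, got {len(row)}")
    # float() also converts NumPy scalars to plain Python floats
    return json.dumps({label: float(value) for label, value in zip(labels, row)})


def extract_class(body):
    """Pull the predicted class out of a response body (bytes or str)."""
    text = body.decode("utf-8") if isinstance(body, bytes) else body
    parsed = json.loads(text)
    if parsed.get("message") != "Success":
        # Assumption: anything other than "Success" signals a problem
        raise ValueError(f"unexpected response: {parsed}")
    return parsed["data"]["class"]


print(build_payload([5.1, 3.5, 1.4, 0.2]))
print(extract_class(b'{"data": {"class": "Iris-setosa"}, "message": "Success"}'))
```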

<a name='6'></a>
## 6 - Delete AWS resources

In [None]:
status = is_endpoint_running(ENDPOINT_NAME, AWS_PROFILE_NAME, AWS_REGION_NAME)
status

In [None]:
if status in ["InService", "Failed"]:
    delete_model(sm, ENDPOINT_NAME)
    delete_endpoint_config(sm, ENDPOINT_CONFIG_NAME)
    delete_endpoint(sm, ENDPOINT_NAME)
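
The `utils` helpers are not shown here, so the following is only a guess at the kind of cleanup they perform, written against the boto3 calls `delete_endpoint`, `delete_endpoint_config`, and `delete_model`. The hypothetical `delete_all` keeps going when one deletion fails, so a missing resource does not leave the others behind. The `FakeClient` class exists only to make the sketch runnable without AWS.

```python
def delete_all(client, endpoint_name, config_name, model_name):
    """Delete endpoint, config, and model; return a list of (step, error) failures."""
    steps = [
        ("endpoint", lambda: client.delete_endpoint(EndpointName=endpoint_name)),
        ("endpoint config", lambda: client.delete_endpoint_config(EndpointConfigName=config_name)),
        ("model", lambda: client.delete_model(ModelName=model_name)),
    ]
    failures = []
    for step, call in steps:
        try:
            call()
        except Exception as error:  # broad on purpose: cleanup should try every step
            failures.append((step, error))
    return failures


class FakeClient:
    """Stand-in for a boto3 SageMaker client that records calls."""

    def __init__(self):
        self.calls = []

    def delete_endpoint(self, EndpointName):
        self.calls.append(("delete_endpoint", EndpointName))

    def delete_endpoint_config(self, EndpointConfigName):
        raise RuntimeError("config not found")  # simulate a failed deletion

    def delete_model(self, ModelName):
        self.calls.append(("delete_model", ModelName))


fake = FakeClient()
failures = delete_all(fake, "whylabs-sagemaker", "whylabs-sagemaker-config", "whylabs-sagemaker")
print(fake.calls)
print([step for step, _ in failures])
```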