# Amazon SageMaker scikit-learn Bring Your Own Model


---

This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook. 

![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-west-2/advanced_functionality|scikit_learn_bring_your_own_model|scikit_learn_bring_your_own_model.ipynb)

---

_**Hosting a pre-trained scikit-learn Model in Amazon SageMaker scikit-learn Container**_

---


## Background

Amazon SageMaker includes functionality to support a hosted notebook environment, distributed, serverless training, and real-time hosting. We think it works best when all three of these services are used together, but they can also be used independently. Some use cases may only require hosting, for example when the model was trained before Amazon SageMaker existed or in a different service.

This notebook shows how to use a pre-trained scikit-learn model with the Amazon SageMaker scikit-learn container to quickly create a hosted endpoint for that model.
We use the California Housing dataset, present in Scikit-Learn: https://scikit-learn.org/stable/modules/generated/sklearn.datasets.fetch_california_housing.html. The California Housing dataset was originally published in:

> Pace, R. Kelley, and Ronald Barry. "Sparse spatial auto-regressions." Statistics & Probability Letters 33.3 (1997): 291-297.

---
## Setup

Let's start by specifying:

* AWS region.
* The IAM role ARN used to give training and hosting access to your data.
* The S3 bucket that you want to use for training and model data.

In [None]:
!pip install -U sagemaker

In [None]:
import os
import boto3
import re
import json
import pandas as pd
import numpy as np
import sagemaker
from sagemaker import get_execution_role
from sagemaker.sklearn.model import SKLearnModel
from time import strftime, gmtime
import time
from sklearn.model_selection import train_test_split

# REQUIRED BY THE LOCAL TEST CELL BELOW (joblib.load AND mp.tasks)
import joblib
import mediapipe as mp

# import cv2
# import matplotlib.pyplot as plt
# from mediapipe import solutions
# from mediapipe.framework.formats import landmark_pb2
# import pickle
# import random



# GET THE REGION NAME FROM THE SESSION
region = boto3.Session().region_name

# GET THE IAM ROLE DETAILS FOR EXECUTION
role = get_execution_role()

# GET THE DEFAULT BUCKET...WAS SET UP WHEN CREATING NOTEBOOK INSTANCE
bucket = sagemaker.Session().default_bucket()

# SET THE SUB-DIRECTORY FOR THIS INFERENCE MODEL
prefix = "sagemaker/ASL-Test5"


# IGNORE THIS FOR NOW 
# SET UP FROM (https://towardsdatascience.com/deploying-a-pre-trained-sklearn-model-on-amazon-sagemaker-826a2b5ac0b6)
# client = boto3.client(service_name="sagemaker")
# runtime = boto3.client(service_name="sagemaker-runtime")
# boto_session = boto3.session.Session()
# s3 = boto_session.resource('s3')
# region = boto_session.region_name
# sagemaker_session = sagemaker.Session()
# IGNORE THIS FOR NOW 




print(f"role: {role}")
print(f"region: {region}")
print(f"bucket: {bucket}")
print(f"prefix: {prefix}")

In [None]:
import sklearn
print(sklearn.__version__)

In [None]:
# !pip install --upgrade scikit-learn
# !pip install scikit-learn==1.0.2

In [None]:
# import sklearn
# print(sklearn.__version__)

In [None]:
!pip show scikit-learn

In [None]:
# !conda activate conda_python3

In [None]:
# !jupyter kernelspec list

## Prepare data for model inference

Load an image to send to the endpoint. The same image is used later to invoke the SageMaker endpoint.
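One way to prepare such an image for a JSON request is to base64-encode its bytes, since JSON cannot carry raw binary data. The sketch below is a minimal illustration, assuming the endpoint reads the image from an `image` key; the helper names `build_image_payload` and `read_image_payload` are hypothetical and not part of any SageMaker API.

```python
import base64
import json


def build_image_payload(image_bytes: bytes) -> str:
    """Return a JSON string holding the base64-encoded image bytes."""
    # base64 turns arbitrary bytes into ASCII text that is safe inside JSON
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return json.dumps({"image": encoded})


def read_image_payload(payload: str) -> bytes:
    """Inverse of build_image_payload: recover the raw image bytes."""
    return base64.b64decode(json.loads(payload)["image"])


# Round trip with placeholder bytes (not a real image)
sample = b"\xff\xd8\xff\xe0 placeholder bytes"
assert read_image_payload(build_image_payload(sample)) == sample
```

The decoding helper mirrors what an inference script would need to do on the server side before running detection on the image.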

## Test the model locally. 

In [None]:
pwd

In [None]:
labels = {"A":"A","B":"B","C":"C","D":"D","E":"E","F":"F","G":"G","H":"H","I":"I","K":"K",
          "L":"L","M":"M","N":"N","O":"O","P":"P","Q":"Q","R":"R","S":"S","T":"T","U":"U",
          "V":"V","W":"W","X":"X","Y":"Y"}

successCode = 0 # 0 = Success/ 1 = Failure

predictedLetter = ''

# LOAD THE TRAINED MODEL FROM THE PATH (**FOR JOBLIB FILES**)
aslModelDict = joblib.load('./aslModel.joblib')

# LOAD THE TRAINED MODEL FROM THE PATH (**FOR PICKLE FILES**)
# aslModelDict = pickle.load(open(self.MODEL_PATH,'rb'))

aslModel = aslModelDict['model']
# input("press to continue..")

# SET THE OPTIONS FOR THE LANDMARKER INSTANCE WITH THE IMAGE MODE
options = mp.tasks.vision.HandLandmarkerOptions(
    base_options = mp.tasks.BaseOptions('./code/hand_landmarker.task'),
    running_mode = mp.tasks.vision.RunningMode.IMAGE,
    num_hands=2)

# CREATE A HAND LANDMARKER INSTANCE
detector = mp.tasks.vision.HandLandmarker.create_from_options(options)

# READ IN THE IMAGE FROM THE FILE PATH (THIS IS AN IMAGE OF A LETTER 'A')
# userImage = mp.Image.create_from_file('./Local-Data/A0023_test.jpg')
userImage = mp.Image.create_from_file('./Local-Data/4.jpg')

# DETECT THE LANDMARKS
detection_result = detector.detect(userImage)
# print(f'detection result: {detection_result.hand_landmarks}')

# IF HANDS WERE DETECTED
if detection_result.hand_landmarks:
    # success += 1
    detected = []
    # print("inside the detecttion result loop\n")
    # FOR EACH OF THE HANDS DETECTED, ITERATE THROUGH THEM
    # for idx in range(len(detection_result.hand_landmarks))[:1]:
    for idx in range(len(detection_result.hand_landmarks)):
        # FOR EACH LANDMARK, GET THE X AND Y COORDINATE
        for i in detection_result.hand_landmarks[idx]:
#             print('x is', i.x, 'y is', i.y, 'z is', i.z, 'visibility is', i.visibility)
            x = i.x
            y = i.y

            # STORE X AND Y IN THE TEMP ARRAY
            detected.append(x)
            detected.append(y)



    # RUN THE INFERENCE MODEL AGAINST THE LANDMARKS DETECTED
    prediction = aslModel.predict([np.asarray(detected)])

    # OUTPUT THE RESULT BASED ON MATCHES IN THE LABELS DICTIONARY
    predictedLetter = labels[(prediction[0])]
    
    print(f"The predicted letter is : {predictedLetter}")

    # SET THE LOCAL VAR WITH THE RESULT FROM THE INFERENCE MODEL
    successCode = 0
    
    # CREATE JSON RETURN
    pSon = {'SuccessCode': successCode,
           'InferResult': predictedLetter}
    
    jSon = json.dumps(pSon)
    
    print(f"JSON result string: {jSon}")

# IF NO LANDMARKS WERE DETECTED IN THE IMAGE
else:
    print("Inside failed inference classifier")
    predictedLetter = 'None'
    successCode = 1
          
    # CREATE JSON RETURN
    pSon = {'SuccessCode': successCode,
           'InferResult': predictedLetter}
    
    jSon = json.dumps(pSon)
    
    print(f"JSON result string: {jSon}")
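The cell above repeats the JSON-building code in both branches and flattens the landmarks inline. As a sketch only, the same two steps can be factored into small helpers. The `Landmark` tuple here is a stand-in for the MediaPipe landmark objects (only `x` and `y` are read), and the helper names are hypothetical.

```python
import json
from collections import namedtuple

# Stand-in for a MediaPipe landmark; only the x and y attributes are used here
Landmark = namedtuple("Landmark", ["x", "y", "z"])


def flatten_landmarks(hands):
    """Flatten [[Landmark, ...], ...] into [x0, y0, x1, y1, ...]."""
    features = []
    for hand in hands:
        for point in hand:
            features.extend([point.x, point.y])
    return features


def build_result(predicted_letter=None):
    """Build the JSON return string: SuccessCode 0 on success, 1 on failure."""
    if predicted_letter is None:
        return json.dumps({"SuccessCode": 1, "InferResult": "None"})
    return json.dumps({"SuccessCode": 0, "InferResult": predicted_letter})
```

With these helpers, the success branch reduces to `build_result(predictedLetter)` and the failure branch to `build_result()`.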

### Compress the model file to a GZIP tar archive 

Note that the model file name must satisfy the regular expression pattern: `^[a-zA-Z0-9](-*[a-zA-Z0-9])*;`. The model file needs to be tar-zipped. 
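Where the `tar` shell command is not available, the same kind of archive can be produced with Python's standard `tarfile` module. This is a minimal sketch, assuming the model file is a single file on local disk; `make_model_archive` is a hypothetical helper name.

```python
import os
import tarfile


def make_model_archive(model_path: str, archive_path: str = "model.tar.gz") -> list:
    """Write model_path into a gzip-compressed tar archive and return its member names."""
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname stores the file at the archive root, without its directory path
        tar.add(model_path, arcname=os.path.basename(model_path))
    # Re-open the archive to confirm what was written
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()
```

Calling `make_model_archive("model.joblib")` would write `model.tar.gz` in the working directory and return the list of archived file names, which is a quick way to check the archive layout before uploading.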

In [None]:
model_file_name = "model.joblib"

In [None]:
!tar cvpzf model.tar.gz $model_file_name

In [None]:
# # IGNORE FOR NOW
# !tar cvpzf aslModel.tar.gz $model_file_name inference.py

## Upload the pre-trained model `model.tar.gz` file to S3

In [None]:
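key = f"{prefix}/model.tar.gz"  # S3 KEYS ALWAYS USE FORWARD SLASHES

# OPEN THE ARCHIVE IN A CONTEXT MANAGER SO THE FILE HANDLE IS CLOSED AFTER THE UPLOAD
with open("model.tar.gz", "rb") as fObj:
    boto3.Session().resource("s3").Bucket(bucket).Object(key).upload_fileobj(fObj)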

In [None]:
key

## Set up hosting for the model

This involves creating a SageMaker model from the model file previously uploaded to S3.

In [None]:
model_data = "s3://{}/{}".format(bucket, key)
print(f"model data: {model_data}")

### Write the Inference Script

When using endpoints with the Amazon SageMaker managed `Scikit Learn` container, we need to provide an entry point script for inference that will **at least** load the saved model.

After the SageMaker model server has loaded your model by calling `model_fn`, SageMaker will serve your model. Model serving is the process of responding to inference requests, received by SageMaker `InvokeEndpoint` API calls.


We will also implement the `predict_fn()` function, which takes the deserialized request object and performs inference against the loaded model.

We now create this script, name it `inference.py`, and store it at the root of a directory called `code`.

**Note:** You would modify the script below to implement your own inferencing logic.

Additional information on model loading and model serving for scikit-learn on SageMaker can be found in the [SageMaker Scikit-learn Model Server documentation](https://sagemaker.readthedocs.io/en/stable/frameworks/sklearn/using_sklearn.html#deploy-a-scikit-learn-model)

There are also several hosting functions that we won't define:
 - `input_fn()` - Takes request data and deserializes the data into an object for prediction.
 - `output_fn()` - Takes the result of prediction and serializes this according to the response content type.

These take on their default values, as described in the [SageMaker Scikit-learn Serve a Model documentation](https://sagemaker.readthedocs.io/en/stable/frameworks/sklearn/using_sklearn.html#serve-a-model).
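To make the four hooks concrete, here is a minimal sketch of what an `inference.py` could look like. It is not the script used in this notebook. It assumes the archive contains `model.joblib` holding a dict with the estimator under a `model` key, and that requests are JSON with a hypothetical `features` key carrying one flat feature vector. It defines `input_fn` and `output_fn` only to illustrate their signatures; omitting them leaves the container defaults in place.

```python
import json
import os


def model_fn(model_dir):
    """Load the model once, when the model server starts."""
    import joblib  # imported lazily so the other hooks can be exercised without it

    # Assumption: model.joblib is a dict with the fitted estimator under "model"
    return joblib.load(os.path.join(model_dir, "model.joblib"))["model"]


def input_fn(request_body, request_content_type):
    """Deserialize a JSON request into a list containing one feature row."""
    if request_content_type != "application/json":
        raise ValueError(f"Unsupported content type: {request_content_type}")
    if isinstance(request_body, (bytes, bytearray)):
        request_body = request_body.decode("utf-8")
    # "features" is a hypothetical key chosen for this sketch
    return [json.loads(request_body)["features"]]


def predict_fn(input_data, model):
    """Run the loaded model on the deserialized input."""
    return model.predict(input_data)


def output_fn(prediction, accept):
    """Serialize the first prediction as a JSON string."""
    return json.dumps({"InferResult": str(prediction[0])})
```

Keeping each hook small makes it possible to test `input_fn`, `predict_fn`, and `output_fn` locally with a stub model before deploying.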

In [None]:
!pygmentize ./code/inference.py

### Installing additional Python dependencies

It may also be necessary to supply a `requirements.txt` file to ensure that any required dependencies are installed in the container along with the script. For this script, in addition to the Python standard libraries, we showcase how to install the `boto3`, `requests`, and `nltk` libraries.

In [None]:
!pygmentize ./code/requirements.txt

Retrieve sklearn image (from https://towardsdatascience.com/deploying-a-pre-trained-sklearn-model-on-amazon-sagemaker-826a2b5ac0b6)

In [None]:
# IGNORE FOR NOW
image_uri = sagemaker.image_uris.retrieve(
    framework="sklearn",
    region=region,
    version="0.23-1",
    py_version="py3",
    instance_type="ml.t2.medium",
)

In [None]:
# IGNORE FOR NOW
client = boto3.client(service_name="sagemaker")
runtime = boto3.client(service_name="sagemaker-runtime")
boto_session = boto3.session.Session()
s3 = boto_session.resource('s3')
region = boto_session.region_name
sagemaker_session = sagemaker.Session()

Model Creation (from https://towardsdatascience.com/deploying-a-pre-trained-sklearn-model-on-amazon-sagemaker-826a2b5ac0b6)

In [None]:
# IGNORE FOR NOW
model_name = "sklearn-test" + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
print("Model name: " + model_name)
create_model_response = client.create_model(
    ModelName=model_name,
    Containers=[
        {
            "Image": image_uri,
            "Mode": "SingleModel",
            "ModelDataUrl": model_data,
            "Environment": {'SAGEMAKER_SUBMIT_DIRECTORY': model_data,
                           'SAGEMAKER_PROGRAM': 'inference.py'} 
        }
    ],
    ExecutionRoleArn=role,
)
print("Model Arn: " + create_model_response["ModelArn"])

### Deploy with Python SDK

Here we showcase the process of creating a model from S3 artifacts, which can be used to deploy a model that was trained in a different session or even outside SageMaker.

In [None]:
# IGNORE FOR NOW
sklearn_epc_name = "sklearn-epc" + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
endpoint_config_response = client.create_endpoint_config(
    EndpointConfigName=sklearn_epc_name,
    ProductionVariants=[
        {
            "VariantName": "sklearnvariant",
            "ModelName": model_name,
            "InstanceType": "ml.t2.medium",
            "InitialInstanceCount": 1
        },
    ],
)
print("Endpoint Configuration Arn: " + endpoint_config_response["EndpointConfigArn"])

Step 3: Endpoint creation

In [None]:
# IGNORE FOR NOW
import time

In [None]:
# IGNORE FOR NOW
endpoint_name = "sklearn-local-ep" + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
create_endpoint_response = client.create_endpoint(
    EndpointName=endpoint_name,
    EndpointConfigName=sklearn_epc_name,
)
print("Endpoint Arn: " + create_endpoint_response["EndpointArn"])


#Monitor creation
describe_endpoint_response = client.describe_endpoint(EndpointName=endpoint_name)
while describe_endpoint_response["EndpointStatus"] == "Creating":
    describe_endpoint_response = client.describe_endpoint(EndpointName=endpoint_name)
    print(describe_endpoint_response["EndpointStatus"])
    time.sleep(15)
print(describe_endpoint_response)
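The polling loop above runs for as long as the endpoint reports `Creating`. A bounded variant with a timeout can be sketched as follows. The `describe` argument is a hypothetical zero-argument callable, for example `lambda: client.describe_endpoint(EndpointName=endpoint_name)`, and the default timeout is an arbitrary choice for illustration.

```python
import time


def wait_for_endpoint(describe, poll_seconds=15, timeout_seconds=1800, sleep=time.sleep):
    """Poll describe() until EndpointStatus leaves "Creating" or the timeout elapses.

    describe must return a dict with an "EndpointStatus" key.
    Returns the first status other than "Creating"; raises TimeoutError otherwise.
    """
    waited = 0
    while True:
        status = describe()["EndpointStatus"]
        if status != "Creating":
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"Endpoint still Creating after {waited} seconds")
        sleep(poll_seconds)
        waited += poll_seconds
```

Passing `sleep` as a parameter keeps the function testable without real delays. The returned status still needs checking, since an endpoint can leave `Creating` by failing. The boto3 SageMaker client also offers an `endpoint_in_service` waiter, which may be a simpler choice when custom handling is not needed.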

In [None]:
model = SKLearnModel(
    role=role,
    model_data=model_data,
    framework_version="1.2-1",
    py_version="py3",
    source_dir="code",
    entry_point="inference.py",
)


In [None]:
pwd

### Create endpoint
Lastly, create the endpoint that serves the model by specifying the name and configuration defined above. The result is an endpoint that can be validated and incorporated into production applications. This takes 5-10 minutes to complete.

In [None]:
%%time

predictor = model.deploy(instance_type="ml.t2.medium", initial_instance_count=1)

In [None]:
predictor

## Validate the model for use
Now you can obtain the endpoint from the client library using the result from previous operations and generate classifications from the model using that endpoint.

### Invoke with the Python SDK

Let's generate the prediction for a single data point. We'll pick one from the test data generated earlier.

In [None]:
# the SKLearnPredictor does the serialization from pandas for us
predictions = predictor.predict(testX[data.feature_names])
print(predictions)

### Alternative: invoke with `boto3`

This is useful when invoking the model from external clients, e.g. Lambda Functions, or other micro-services.
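An external client such as a Lambda function receives the result as a stream under the `Body` key of the `invoke_endpoint` response. The sketch below shows one way to decode it, assuming the endpoint returns a JSON document; `parse_invoke_response` is a hypothetical helper, and the fake response exists only to exercise it without a live endpoint.

```python
import io
import json


def parse_invoke_response(response):
    """Decode the JSON document returned in an invoke_endpoint response."""
    # response["Body"] is a stream, so its content can only be read once
    raw = response["Body"].read()
    return json.loads(raw.decode("utf-8"))


# Stand-in for a real response, for illustration only
fake_response = {"Body": io.BytesIO(b'{"SuccessCode": 0, "InferResult": "A"}')}
print(parse_invoke_response(fake_response))
```

If the endpoint returns CSV or NPY data instead, the decoding step would need to match that content type.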

In [None]:
runtime = boto3.client("sagemaker-runtime")

#### Option 1: `csv` serialization

In [None]:
# csv serialization
response = runtime.invoke_endpoint(
    EndpointName=predictor.endpoint_name,
    Body=testX[data.feature_names].to_csv(header=False, index=False).encode("utf-8"),
    ContentType="text/csv",
)

print(response["Body"].read())

#### Option 2: `npy` serialization

In [None]:
# npy serialization
from io import BytesIO


# Serialise numpy ndarray as bytes
buffer = BytesIO()
# Assuming testX is a data frame
np.save(buffer, testX[data.feature_names].values)

response = runtime.invoke_endpoint(
    EndpointName=predictor.endpoint_name, Body=buffer.getvalue(), ContentType="application/x-npy"
)

print(response["Body"].read())

#### Option 3: `json` serialization with a base64-encoded image

In [None]:
import base64
import json

image_file = './Local-Data/4.jpg'

# READ THE IMAGE AND BASE64-ENCODE IT SO IT CAN TRAVEL INSIDE A JSON BODY
with open(image_file, "rb") as f:
    im_bytes = f.read()
im_b64 = base64.b64encode(im_bytes).decode("utf8")

payload = json.dumps({"image": im_b64})
response = runtime.invoke_endpoint(
    EndpointName=predictor.endpoint_name, Body=payload, ContentType="application/json"
)

# invoke_endpoint RETURNS A DICT; THE RESULT IS A STREAM UNDER THE "Body" KEY
body = response["Body"].read().decode("utf-8")
try:
    result = json.loads(body)
    print(result)
except json.JSONDecodeError:
    # THE ENDPOINT DID NOT RETURN JSON, SO PRINT THE RAW TEXT
    print(body)

### (Optional) Delete the Endpoint

When you are finished with this notebook, run the `delete_endpoint` line in the cell below. This removes the hosted endpoint you created and avoids charges from a stray instance being left on.

In [None]:
predictor.delete_endpoint()

## Conclusion

In this notebook you successfully deployed a pre-trained scikit-learn model with the Amazon SageMaker scikit-learn container to quickly create a hosted endpoint for that model.
You then used the Python SDK and `boto3` to invoke the endpoint with `csv` payload, and then with `npy` payload to get predictions from the model.

As next steps you can try to [Automatically Scale Amazon SageMaker Models](https://docs.aws.amazon.com/sagemaker/latest/dg/endpoint-auto-scaling.html), [Register and Deploy Models with Model Registry](https://docs.aws.amazon.com/sagemaker/latest/dg/model-registry.html) or [Train your Model with Amazon SageMaker](https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-training.html).


## Notebook CI Test Results

This notebook was tested in multiple regions. The test results are as follows, except for us-west-2 which is shown at the top of the notebook.

![This us-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-east-1/advanced_functionality|scikit_learn_bring_your_own_model|scikit_learn_bring_your_own_model.ipynb)

![This us-east-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-east-2/advanced_functionality|scikit_learn_bring_your_own_model|scikit_learn_bring_your_own_model.ipynb)

![This us-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-west-1/advanced_functionality|scikit_learn_bring_your_own_model|scikit_learn_bring_your_own_model.ipynb)

![This ca-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ca-central-1/advanced_functionality|scikit_learn_bring_your_own_model|scikit_learn_bring_your_own_model.ipynb)

![This sa-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/sa-east-1/advanced_functionality|scikit_learn_bring_your_own_model|scikit_learn_bring_your_own_model.ipynb)

![This eu-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-west-1/advanced_functionality|scikit_learn_bring_your_own_model|scikit_learn_bring_your_own_model.ipynb)

![This eu-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-west-2/advanced_functionality|scikit_learn_bring_your_own_model|scikit_learn_bring_your_own_model.ipynb)

![This eu-west-3 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-west-3/advanced_functionality|scikit_learn_bring_your_own_model|scikit_learn_bring_your_own_model.ipynb)

![This eu-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-central-1/advanced_functionality|scikit_learn_bring_your_own_model|scikit_learn_bring_your_own_model.ipynb)

![This eu-north-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-north-1/advanced_functionality|scikit_learn_bring_your_own_model|scikit_learn_bring_your_own_model.ipynb)

![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-southeast-1/advanced_functionality|scikit_learn_bring_your_own_model|scikit_learn_bring_your_own_model.ipynb)

![This ap-southeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-southeast-2/advanced_functionality|scikit_learn_bring_your_own_model|scikit_learn_bring_your_own_model.ipynb)

![This ap-northeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-northeast-1/advanced_functionality|scikit_learn_bring_your_own_model|scikit_learn_bring_your_own_model.ipynb)

![This ap-northeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-northeast-2/advanced_functionality|scikit_learn_bring_your_own_model|scikit_learn_bring_your_own_model.ipynb)

![This ap-south-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-south-1/advanced_functionality|scikit_learn_bring_your_own_model|scikit_learn_bring_your_own_model.ipynb)
