# Background

This notebook provides sample code showing how to deploy a Keras image classification model (VGG16) on the SageMaker Managed Hosting service. As the model server, we use the SageMaker Multi Model Server (MMS). We use the pre-trained VGG16 model available in the Keras model zoo.

Sample code is provided "as is" without any guarantees.

### Updating SageMaker SDK

We use the SageMaker Python SDK to deploy endpoints. Before we begin, let's update the SDK to the latest version. Note: after the upgrade, restart your Jupyter kernel for the change to take effect.

In [None]:
! pip install --upgrade sagemaker

## Build Custom Serving Container

In this example, we build a custom inference container from scratch. We also package our inference code into this container.

### Initialize SageMaker variables
Below, we import the required packages and define common configuration variables.

In [None]:
import sagemaker, boto3
from sagemaker import get_execution_role

session = sagemaker.Session()
region = session.boto_region_name
role = get_execution_role()
account = boto3.client('sts').get_caller_identity().get('Account')
bucket = session.default_bucket()

model_name = "vgg16-model"
endpoint_name = model_name + "-mms-endpoint"
tag = "v1"
image_uri = f"{account}.dkr.ecr.{region}.amazonaws.com/{model_name}:{tag}"
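The image URI above follows the private ECR naming pattern of registry host, repository name, and tag. As a minimal sketch, the same construction can be wrapped in a small helper; the function name `ecr_image_uri` and the account ID and region used in the call are placeholders for illustration, not values from this notebook.

```python
def ecr_image_uri(account, region, repository, tag="latest"):
    """Build the URI of an image stored in a private ECR registry."""
    # Registry host: <account>.dkr.ecr.<region>.amazonaws.com
    registry = f"{account}.dkr.ecr.{region}.amazonaws.com"
    # Full image reference: <registry>/<repository>:<tag>
    return f"{registry}/{repository}:{tag}"


# Placeholder account ID and region, for illustration only.
print(ecr_image_uri("123456789012", "us-east-1", "vgg16-model", "v1"))
# 123456789012.dkr.ecr.us-east-1.amazonaws.com/vgg16-model:v1
```

Keeping the repository name and tag as separate arguments makes it easy to pass the same values to the build script and to the model definition later on.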

Log in to the private ECR registry.

In [None]:
!aws ecr get-login-password --region {region} | docker login --username AWS --password-stdin {account}.dkr.ecr.{region}.amazonaws.com

### Review serving image

In [None]:
! pygmentize Dockerfile

### Review inference code

When using SageMaker MMS, you need to create two files:
- `dockerd_entrypoint.py` - a handler service, which will be executed by MMS;
- `model_handler.py` - a handler that loads the model, runs predictions, and pre- and post-processes the inference inputs and outputs.

You can find more details about these files and their requirements here: https://github.com/aws/sagemaker-inference-toolkit/#implementation-steps

In [None]:
! pygmentize serving_src/dockerd_entrypoint.py

In [None]:
! pygmentize serving_src/model_handler.py

### Build and push container to ECR registry

We are ready to build and push our custom serving image to ECR!

In [None]:
!./build_and_push.sh {model_name} {tag}

# Deploy and test SageMaker Endpoint

Now that our serving image is available in ECR, we are ready to deploy our endpoint to SageMaker Hosting.

First, we create a SageMaker Model, which defines model parameters such as the model data (we skip it, because the model is loaded using the Keras facility) and the serving image (we set it to our custom serving image).

In [None]:
from sagemaker import Model

mms_model = Model(
    image_uri=image_uri,
    model_data=None,
    role=role,
    name=model_name,
    sagemaker_session=session
)

Once the model is created, we deploy it to an endpoint. A SageMaker Endpoint lets you configure parameters such as the number and type of EC2 instances.

In [None]:
mms_model.deploy(
    initial_instance_count=1,
    instance_type="ml.c5.xlarge", 
    endpoint_name=endpoint_name
)

For testing, we take one of the images from the ImageNet dataset and resize it to 224x224 pixels to fit the VGG16 input requirements.

In [None]:
! wget https://farm1.static.flickr.com/56/152004091_5bfbc69bb3.jpg

In [None]:
%matplotlib inline
import cv2
import numpy as np
from matplotlib import pyplot as plt


img = cv2.imread('152004091_5bfbc69bb3.jpg')
resized_img = cv2.resize(img, dsize=(224, 224), interpolation=cv2.INTER_CUBIC)
resized_filename = "resized_image.jpg"

cv2.imwrite(resized_filename, resized_img)

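# OpenCV loads images as BGR; convert to RGB so matplotlib shows correct colors.
plt.imshow(cv2.cvtColor(cv2.imread(resized_filename), cv2.COLOR_BGR2RGB))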
plt.show()

To send this image for prediction, we use the boto3 `sagemaker-runtime` client. We read the image and send it as the payload to the SageMaker endpoint. We expect to get the most likely label back.

In [None]:
import boto3

client = boto3.client('sagemaker-runtime')
# Accept and ContentType take MIME types.
accept_type = 'application/json'
content_type = 'image/jpeg'

# Read the resized image as raw bytes; the context manager closes the file.
with open(resized_filename, 'rb') as f:
    payload = f.read()

response = client.invoke_endpoint(
    EndpointName=endpoint_name,
    Body=payload,
    ContentType=content_type,
    Accept=accept_type
)


most_likely_label = response['Body'].read()

print(most_likely_label)
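The response body is returned as raw bytes, so it usually needs decoding before further use. The exact response format depends on what `model_handler.py` returns, which this notebook does not spell out, so the sketch below is an assumption: it tries to parse the body as JSON and falls back to plain text. The helper name `decode_prediction` is hypothetical.

```python
import json


def decode_prediction(body_bytes):
    """Decode an endpoint response body into a Python object or plain string."""
    text = body_bytes.decode("utf-8")
    try:
        # Works if the handler returned a JSON document.
        return json.loads(text)
    except json.JSONDecodeError:
        # Otherwise treat the body as a plain-text label.
        return text.strip()


# Example with made-up response bodies:
print(decode_prediction(b'{"label": "tabby"}'))
print(decode_prediction(b"tabby\n"))
```

With the cell above, this would be called as `decode_prediction(most_likely_label)`.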