# Train custom segmentation model with `IceVision`, `OpenImages`, and `SageMaker`
## Serving PyTorch Models In Production Natively With Amazon SageMaker

Sources:
- https://torchserve-on-aws.workshop.aws/en/100-introduction.html
- https://github.com/aws-samples/amazon-sagemaker-endpoint-deployment-of-fastai-model-with-torchserve

## Set Up Your Hosting Environment
The focus of this lab is model serving. In that vein, we have taken care of the data preparation and model training. 
This lab exercise uses a [HuggingFace Transformer](https://huggingface.co/transformers/), which provides us with a general-purpose architecture for Natural Language Understanding (NLU). Specifically, we are presenting you with a [RoBERTa base](https://huggingface.co/roberta-base) transformer that was fine-tuned to perform sentiment analysis. The pre-trained checkpoint loads the additional head layers and will output ``positive``, ``neutral``, and ``negative`` sentiment for text. 

In [44]:
import sagemaker
import boto3
import datetime
from sagemaker.utils import name_from_base
from sagemaker.pytorch import PyTorchModel
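# json_serializer / json_deserializer are deprecated aliases in sagemaker>=2;
# their replacements are sagemaker.serializers.JSONSerializer and
# sagemaker.deserializers.JSONDeserializer.
from sagemaker.predictor import Predictor, json_serializer, json_deserializer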

In [45]:
client = boto3.client('sagemaker')
role = sagemaker.get_execution_role()
sagemaker_session = sagemaker.session.Session()
bucket = sagemaker_session.default_bucket()
name = 'maskrcnn-background-remover'

training_jobs = client.list_training_jobs(
    NameContains='mask-rcnn',
    StatusEquals='Completed',  # only a completed job has a model.tar.gz to deploy
    SortBy='CreationTime',
    SortOrder='Descending',
)

training_job_name = training_jobs['TrainingJobSummaries'][0]['TrainingJobName']
model_artifact = f's3://{bucket}/{training_job_name}/output/model.tar.gz'
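The artifact path above is assembled by hand, which only holds when the training job wrote to the default bucket and prefix. As a minimal sketch, the same lookup can ask SageMaker for the recorded artifact location instead. The helper name `latest_model_artifact` is hypothetical; the `list_training_jobs` and `describe_training_job` calls are the standard boto3 SageMaker client operations.

```python
def latest_model_artifact(sm_client, name_contains):
    """Return the S3 URI of model.tar.gz for the newest completed training job.

    `sm_client` is assumed to be a boto3 'sagemaker' client (or any object
    exposing the same two methods).
    """
    jobs = sm_client.list_training_jobs(
        NameContains=name_contains,
        StatusEquals='Completed',
        SortBy='CreationTime',
        SortOrder='Descending',
        MaxResults=1,
    )['TrainingJobSummaries']
    if not jobs:
        raise LookupError(f'no completed training job matching {name_contains!r}')

    # The job description records where the artifact was actually written.
    description = sm_client.describe_training_job(
        TrainingJobName=jobs[0]['TrainingJobName'])
    return description['ModelArtifacts']['S3ModelArtifacts']
```

With the client created earlier, this would be called as `latest_model_artifact(client, 'mask-rcnn')`.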

## Create Your Endpoint
We will now create and deploy our model. To begin, we construct a new `PyTorchModel` object that points to the pre-trained model artifacts from the step above and to the inference code we wish to use. We then call its `deploy` method to launch the deployment container on our TorchServe-powered Amazon SageMaker endpoint.

In [138]:
class ImageSegmenter(Predictor):
    def __init__(self, endpoint_name, sagemaker_session):
        super().__init__(endpoint_name, sagemaker_session=sagemaker_session, 
                         serializer=json_serializer, deserializer=json_deserializer)

# Create SageMaker model and deploy an endpoint
sm_pytorch_compiled_model = PyTorchModel(
    model_data=model_artifact,
    name=name_from_base(f'{name}-torchserve'),
    role=role,
    entry_point='torchserve-predictor.py',
    source_dir='../2_deployment_code/serving_natively_with_amazon_sagemaker',
    framework_version='1.7.1',
    py_version='py36',
    predictor_cls=ImageSegmenter,
)
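The contents of `torchserve-predictor.py` are not shown in this notebook. As a rough sketch only, assuming the script follows the hook convention of the SageMaker PyTorch serving container (`model_fn`, `input_fn`, `predict_fn`, `output_fn`), the request and response hooks might look like the following. The `img_id` / `img_data` fields mirror the JSON body sent to the endpoint later in this notebook; returning raw image bytes from `input_fn` is an assumption made to keep the example dependency-free.

```python
import base64
import json

JSON_CONTENT_TYPE = 'application/json'


def input_fn(request_body, content_type=JSON_CONTENT_TYPE):
    """Decode a JSON request into (img_id, raw image bytes)."""
    if content_type != JSON_CONTENT_TYPE:
        raise ValueError(f'unsupported content type: {content_type}')
    if isinstance(request_body, (bytes, bytearray)):
        request_body = request_body.decode('utf-8')
    payload = json.loads(request_body)
    # The image travels base64-encoded so that it fits inside a JSON string.
    return payload['img_id'], base64.b64decode(payload['img_data'])


def output_fn(prediction, accept=JSON_CONTENT_TYPE):
    """Serialize a JSON-compatible prediction for the response body."""
    if accept != JSON_CONTENT_TYPE:
        raise ValueError(f'unsupported accept type: {accept}')
    return json.dumps(prediction)
```

A real script would also define `model_fn` to load the weights from `model.tar.gz` and `predict_fn` to run the segmentation model on the decoded image.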

In [139]:
# It will take around 7 minutes for your TorchServe powered endpoint to spin up on Amazon SageMaker 
endpoint_name = name_from_base(f'{name}-model')
instance_type='ml.m5.xlarge'
# instance_type='local'

predictor = sm_pytorch_compiled_model.deploy(
    initial_instance_count=1, 
    instance_type=instance_type,
    endpoint_name=endpoint_name)

Attaching to hdh0k57bfc-algo-1-rjgi1
hdh0k57bfc-algo-1-rjgi1 | ['torchserve', '--start', '--model-store', '/.sagemaker/ts/models', '--ts-config', '/etc/sagemaker-ts.properties', '--log-config', '/opt/conda/lib/python3.6/site-packages/sagemaker_pytorch_serving_container/etc/log4j.properties', '--models', 'model.mar']
hdh0k57bfc-algo-1-rjgi1 | 2021-04-06 18:48:52,844 [INFO ] main org.pytorch.serve.ModelServer - 
hdh0k57bfc-algo-1-rjgi1 | Torchserve version: 0.3.0
hdh0k57bfc-algo-1-rjgi1 | TS Home: /opt/conda/lib/python3.6/site-packages
hdh0k57bfc-algo-1-rjgi1 | Current directory: /
hdh0k57bfc-algo-1-rjgi1 | Temp directory: /home/model-server/tmp
hdh0k57bfc-algo-1-rjgi1 | Number of GPUs: 0
hdh0k57bfc-algo-1-rjgi1 | Number of CPUs: 4
hdh0k57bfc-algo-1-rjgi1 | Max heap size: 1908 M
hdh0k57bfc-algo-1-rjgi1 | Python executable: /opt/conda/bin/python3.6
hdh0k57bfc-algo-1-rjgi1 | Config file: /etc

In [140]:
import requests
import io
import time
import base64

from PIL import Image
from urllib.request import urlopen

In [144]:
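def get_image_bytes(url_or_path: str) -> bytes:
    """Return raw image bytes from an HTTP(S) URL or a local file path."""
    if url_or_path.startswith(('http://', 'https://')):
        # Fetch once and fail loudly on HTTP errors instead of falling back to open()
        response = requests.get(url_or_path, timeout=30)
        response.raise_for_status()
        return response.content
    # Anything else is treated as a local path; the context manager closes the file
    with open(url_or_path, 'rb') as image_file:
        return image_file.read()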

In [145]:
path_to_image = 'https://df2sm3urulav.cloudfront.net/tenants/ca/uploads/images/0-4999/1601/5d82a21c1abf4.jpg'
# path_to_image = 'test_images/FindID_161098.jpg'

In [148]:
payload = get_image_bytes(path_to_image)
inference_response = predictor.predict(data=base64.b64encode(payload).decode('utf-8'), 
                                       initial_args = {"ContentType": "application/json"})

The json_serializer has been renamed in sagemaker>=2.
See: https://sagemaker.readthedocs.io/en/stable/v2.html for details.
The json_deserializer has been renamed in sagemaker>=2.
See: https://sagemaker.readthedocs.io/en/stable/v2.html for details.


hdh0k57bfc-algo-1-rjgi1 | 2021-04-06 18:49:57,636 [INFO ] W-9003-model_1 org.pytorch.serve.wlm.WorkerThread - Backend response time: 1
hdh0k57bfc-algo-1-rjgi1 | 2021-04-06 18:49:57,636 [INFO ] W-9003-model_1 ACCESS_LOG - /172.18.0.1:49878 "POST /invocations HTTP/1.1" 500 2
hdh0k57bfc-algo-1-rjgi1 | 2021-04-06 18:49:57,636 [INFO ] W-9003-model_1 TS_METRICS - Requests5XX.Count:1|#Level:Host|#hostname:04ce76da4041,timestamp:null
hdh0k57bfc-algo-1-rjgi1 | 2021-04-06 18:49:57,637 [INFO ] W-9003-model_1-stdout MODEL_METRICS - PredictionTime.Milliseconds:0.45|#ModelName:model,Level:Model|#hostname:04ce76da4041,requestID:4865609d-9e4c-4e28-b7fc-f4af234a93fc,timestamp:1617734997
hdh0k57bfc-algo-1-rjgi1 | 2021-04-06 18:49:57,637 [INFO ] W-9003-model_1 TS_METRICS - QueueTime.ms:0|#Level:Host|#hostname:04ce76da4041,timestamp:null
hdh0k57bfc-algo-1-rjgi1 | 2021-04-06 18:49:57,637 [INFO ] W-9003-model_1 TS_METRICS - WorkerThreadTime.ms:1|#Level:H

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

In [152]:
import base64
import json

# Endpoints are invoked through the 'sagemaker-runtime' client,
# not the control-plane 'sagemaker' client created earlier
runtime_client = boto3.client('sagemaker-runtime')

# Base64-encode the test image so it can travel inside a JSON body
with open('test_images/FindID_161098.jpg', 'rb') as image_file:
    img_data = base64.b64encode(image_file.read()).decode('utf-8')

request_body = json.dumps({"img_id": 1, "img_data": img_data}).encode('utf-8')

# Reuse endpoint_name from the deploy cell rather than hard-coding the generated name
response = runtime_client.invoke_endpoint(EndpointName=endpoint_name,
                                          ContentType="application/json",
                                          Accept="application/json",
                                          Body=request_body)
result = json.loads(response['Body'].read().decode('utf-8'))
assert result is not None

ValidationError: An error occurred (ValidationError) when calling the InvokeEndpoint operation: Endpoint maskrcnn-background-remover-model-2021-04-06-18-48-26-833 of account 849118573017 not found.

hdh0k57bfc-algo-1-rjgi1 | 2021-04-06 18:52:56,047 [INFO ] pool-2-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:04ce76da4041,timestamp:1617735176
hdh0k57bfc-algo-1-rjgi1 | 2021-04-06 18:52:56,047 [INFO ] pool-2-thread-1 TS_METRICS - DiskAvailable.Gigabytes:0.5941810607910156|#Level:Host|#hostname:04ce76da4041,timestamp:1617735176
hdh0k57bfc-algo-1-rjgi1 | 2021-04-06 18:52:56,047 [INFO ] pool-2-thread-1 TS_METRICS - DiskUsage.Gigabytes:102.53575134277344|#Level:Host|#hostname:04ce76da4041,timestamp:1617735176
hdh0k57bfc-algo-1-rjgi1 | 2021-04-06 18:52:56,047 [INFO ] pool-2-thread-1 TS_METRICS - DiskUtilization.Percent:99.4|#Level:Host|#hostname:04ce76da4041,timestamp:1617735176
hdh0k57bfc-algo-1-rjgi1 | 2021-04-06 18:52:56,048 [INFO ] pool-2-thread-1 TS_METRICS - MemoryAvailable.Megabytes:4580.80859375|#Level:Host|#hostname:04ce76da4041,timestamp:1617735176
hdh0k57bfc-algo-1-rjgi1 | 2021-04-06 18:52:56,048 [IN

## Environment Cleanup: Delete Endpoint, Endpoint Configuration, and Model
To ensure that we are no longer billed for the endpoint or its associated resources, we use the steps below to tear the environment down. 

In [137]:
predictor.delete_endpoint(delete_endpoint_config=True)
predictor.delete_model()

Gracefully stopping... (press Ctrl+C again to force)
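If the `predictor` object is no longer available (for example after a kernel restart), the same teardown can be done with the boto3 SageMaker client alone. The following is a minimal sketch under that assumption; the helper name `teardown_endpoint` is hypothetical, while the `describe_*` and `delete_*` calls are standard boto3 SageMaker client operations.

```python
def teardown_endpoint(sm_client, endpoint_name):
    """Delete an endpoint, its endpoint configuration, and the models behind it.

    `sm_client` is assumed to be a boto3 'sagemaker' client. Returns the
    configuration name and model names that were deleted.
    """
    # Look up the dependent resources before deleting anything.
    endpoint = sm_client.describe_endpoint(EndpointName=endpoint_name)
    config_name = endpoint['EndpointConfigName']
    config = sm_client.describe_endpoint_config(EndpointConfigName=config_name)
    model_names = [variant['ModelName'] for variant in config['ProductionVariants']]

    # The endpoint is the billed resource, so it goes first.
    sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)
    for model_name in model_names:
        sm_client.delete_model(ModelName=model_name)
    return config_name, model_names
```

It would be called as `teardown_endpoint(boto3.client('sagemaker'), endpoint_name)`.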


## Congratulations!