# Serving PyTorch Models In Production Natively With Amazon SageMaker

## Set Up Your Hosting Environment
The focus of this lab is model serving. In that vein, we have taken care of the data preparation and model training for you.
This lab uses a [HuggingFace Transformer](https://huggingface.co/transformers/), which provides a general-purpose architecture for Natural Language Understanding (NLU). Specifically, we are presenting you with a [RoBERTa base](https://huggingface.co/roberta-base) transformer that was fine-tuned to perform sentiment analysis. The checkpoint loads the additional head layers and outputs a ``positive``, ``neutral``, or ``negative`` sentiment for the input text.

In [None]:
import sagemaker
from sagemaker import get_execution_role
from sagemaker.utils import name_from_base
from sagemaker.pytorch import PyTorchModel
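# These names come from version 1 of the SageMaker Python SDK. In version 2, RealTimePredictor was
# renamed to Predictor and the JSON helpers moved to sagemaker.serializers / sagemaker.deserializers.
from sagemaker.predictor import RealTimePredictor, json_serializer, json_deserializer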
import boto3

region = boto3.Session().region_name
sm = boto3.Session().client(service_name='sagemaker', region_name=region)
role = sagemaker.get_execution_role()

#This is the fine-tuned roberta-base transformer hosted on an S3 bucket.
# Have we considered a bring-your-own-model example instead of using a pre-trained model?
# That way, we can also showcase the model archiver tool of TorchServe and how we expect model artifacts to be packaged for deployment on SageMaker
model_artifact = 's3://torchserve-workshop/roberta-fine-tuned.tar.gz'

## Create Your Endpoint
We will now create and deploy our model. To begin, we construct a new PyTorchModel object that points to the fine-tuned model artifact from the step above and to the inference code that we wish to use. We then call the deploy method to launch the serving container on our TorchServe-powered Amazon SageMaker endpoint.

In [None]:
class SentimentAnalysis(RealTimePredictor):
    def __init__(self, endpoint_name, sagemaker_session):
        super().__init__(endpoint_name, sagemaker_session=sagemaker_session, serializer=json_serializer, 
                         deserializer=json_deserializer, content_type='application/json')

model = PyTorchModel(model_data=model_artifact,
                   name=name_from_base('roberta-model'),
                   role=role, 
                   # is it worth having a cell where we 'cat' this file and explain the programming interface?
                   # that way, customers can modify it as needed based on the model they want to use.
                   entry_point='torchserve-predictor.py',
                   source_dir='source_dir',
                   framework_version='1.5.0',
                   predictor_cls=SentimentAnalysis)
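
The contents of `torchserve-predictor.py` are not shown in this lab. As a rough illustration of the programming interface, the SageMaker PyTorch serving container looks for optional hook functions named `model_fn`, `input_fn`, `predict_fn`, and `output_fn` in the entry point script. The sketch below covers only the two JSON-handling hooks. The `LABELS` order and the `{"sentiment": ...}` response shape are assumptions made for this example, not details taken from the workshop script. `model_fn` and `predict_fn` are omitted because they depend on how the model was saved.

```python
import json

JSON_CONTENT_TYPE = "application/json"
# Hypothetical label order; the real mapping depends on how the model was fine-tuned.
LABELS = ["negative", "neutral", "positive"]


def input_fn(request_body, request_content_type=JSON_CONTENT_TYPE):
    """Deserialize the request payload and return the text to classify."""
    if request_content_type != JSON_CONTENT_TYPE:
        raise ValueError(f"Unsupported content type: {request_content_type}")
    if isinstance(request_body, (bytes, bytearray)):
        request_body = request_body.decode("utf-8")
    return json.loads(request_body)["text"]


def output_fn(prediction, accept=JSON_CONTENT_TYPE):
    """Serialize a predicted class index into a JSON response body."""
    if accept != JSON_CONTENT_TYPE:
        raise ValueError(f"Unsupported accept type: {accept}")
    return json.dumps({"sentiment": LABELS[prediction]})
```

With hooks like these, the `{"text": ...}` dictionaries used later in this lab would arrive at `input_fn` as a JSON string, and whatever `predict_fn` returns would be passed to `output_fn`.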

In [None]:
# It will take roughly 6-8 minutes for your TorchServe-powered endpoint to spin up on Amazon SageMaker

# Here we are setting the endpoint name so we can delete it later using Boto3
endpoint_name = name_from_base('roberta-model')
print(endpoint_name)
predictor = model.deploy(initial_instance_count=1, instance_type='ml.m5.xlarge', endpoint_name=endpoint_name)
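
If the notebook kernel is interrupted while the endpoint is spinning up, the endpoint status can be checked separately with the Boto3 SageMaker client (`sm` above). The following is a minimal sketch: `endpoint_status` and `wait_until_in_service` are hypothetical helper names introduced here, while `describe_endpoint` and the `endpoint_in_service` waiter are standard Boto3 SageMaker client features.

```python
def endpoint_status(sm_client, endpoint_name):
    """Return the endpoint's current status string, such as 'Creating' or 'InService'."""
    response = sm_client.describe_endpoint(EndpointName=endpoint_name)
    return response["EndpointStatus"]


def wait_until_in_service(sm_client, endpoint_name):
    """Block until the endpoint is in service, then return its status."""
    waiter = sm_client.get_waiter("endpoint_in_service")
    waiter.wait(EndpointName=endpoint_name)
    return endpoint_status(sm_client, endpoint_name)
```

In this notebook the helpers would be called as `endpoint_status(sm, endpoint_name)` or `wait_until_in_service(sm, endpoint_name)`.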

## Perform Predictions With A TorchServe Backend SageMaker Endpoint
Here, we will pass sample strings of text to the endpoint in order to see the predicted sentiment. We give you one example for each sentiment class, but feel free to play around and change the strings yourself!

In [None]:
# Our endpoint should predict a positive sentiment from the text below
test_data = {"text": "AWS is excited to announce that TorchServe is natively supported in Amazon SageMaker as the default model server for PyTorch inference"}
print(test_data)

In [None]:
prediction = predictor.predict(test_data)

In [None]:
print(f'Review text: {test_data}')
print(f'Sentiment  : {prediction}')

In [None]:
# Our endpoint should predict a neutral sentiment from the text below
test_data = {"text": "TorchServe addresses an industry need."}
print(test_data)

In [None]:
prediction = predictor.predict(test_data)

In [None]:
print(f'Review text: {test_data}')
print(f'Sentiment  : {prediction}')

In [None]:
# Our endpoint should predict a negative sentiment from the text below
test_data = {"text": "I never liked having to convert my models just to deploy them in production!"}
print(test_data)

In [None]:
prediction = predictor.predict(test_data)

In [None]:
print(f'Review text: {test_data}')
print(f'Sentiment  : {prediction}')
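
The `predictor` object is a convenience wrapper. Applications that do not use the SageMaker Python SDK can reach the same endpoint through the Boto3 `sagemaker-runtime` client and its `invoke_endpoint` call. The sketch below assumes the endpoint accepts and returns JSON, as configured in the `SentimentAnalysis` class above. `invoke_sentiment` is a hypothetical helper name.

```python
import json


def invoke_sentiment(runtime_client, endpoint_name, text):
    """Send one text to the endpoint and return the decoded JSON response."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps({"text": text}),
    )
    # The response body is a stream of bytes; read it and parse it as JSON.
    return json.loads(response["Body"].read())
```

A runtime client can be created with `boto3.Session().client('sagemaker-runtime')` and passed in together with `endpoint_name`.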

## Environment Cleanup: Delete Endpoint
To ensure that we are no longer billed for the endpoint that we have spun up, we use the step below to tear it down.

In [None]:
# If the response contains 'HTTPStatusCode': 200, your endpoint has been successfully deleted.
# You can also verify this from within the AWS console by navigating to the Amazon SageMaker service and clicking Endpoints.

sm.delete_endpoint(
    EndpointName=endpoint_name
)
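
Deleting the endpoint leaves the endpoint configuration and the model resource registered in SageMaker. As an optional alternative to the cell above, the sketch below looks up both through the endpoint before deleting all three. It must run while the endpoint still exists, because it starts with `describe_endpoint`. `delete_endpoint_resources` is a hypothetical helper name; the client calls it uses are standard Boto3 SageMaker operations.

```python
def delete_endpoint_resources(sm_client, endpoint_name):
    """Delete an endpoint together with its endpoint config and model(s)."""
    # Look up the related resource names before anything is deleted.
    endpoint = sm_client.describe_endpoint(EndpointName=endpoint_name)
    config_name = endpoint["EndpointConfigName"]
    config = sm_client.describe_endpoint_config(EndpointConfigName=config_name)
    model_names = [variant["ModelName"] for variant in config["ProductionVariants"]]

    sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)
    for model_name in model_names:
        sm_client.delete_model(ModelName=model_name)
    return config_name, model_names
```

In this notebook it would be called as `delete_endpoint_resources(sm, endpoint_name)`.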

## Congratulations!
Please head back to the workshop to learn more about the next lab. 