# Serving PyTorch Models In Production With Amazon SageMaker and TorchServe

# Set Up Your Hosting Environment
This lab focuses on model serving. In that vein, we have already taken care of the data preparation and model training.
This lab uses a [HuggingFace Transformer](https://huggingface.co/transformers/), which provides a general-purpose architecture for Natural Language Understanding (NLU). Specifically, we give you a [RoBERTa base](https://huggingface.co/roberta-base) transformer that was fine-tuned to perform sentiment analysis. The pre-trained checkpoint loads the additional head layers and outputs ``positive``, ``neutral``, or ``negative`` sentiment for text.

In [1]:
import boto3
import sagemaker
import pandas as pd

from sagemaker import get_execution_role
from sagemaker.utils import name_from_base
from sagemaker.pytorch.model import PyTorchModel

from sagemaker.predictor import Predictor
from sagemaker.serializers import JSONLinesSerializer
from sagemaker.deserializers import JSONLinesDeserializer

sess = sagemaker.Session()
bucket = sess.default_bucket()
role = sagemaker.get_execution_role()
region = boto3.Session().region_name

sm = boto3.Session().client(service_name="sagemaker", region_name=region)

In [2]:
%store -r training_job_name

In [3]:
print(training_job_name)

tensorflow-training-2024-03-03-04-02-13-539


In [4]:
%store -r transformer_pytorch_model_dir_s3_uri

In [5]:
print(transformer_pytorch_model_dir_s3_uri)

s3://sagemaker-us-east-1-211125778552/model/tensorflow-training-2024-03-03-04-02-13-539/transformer-pytorch/


# Create Your Endpoint
We will now create and deploy our model. First, we construct a new `PyTorchModel` object that points to the pre-trained model artifacts from the step above and to the inference code we wish to use. We then call the `deploy` method to launch the serving container on our TorchServe-powered Amazon SageMaker endpoint.
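
To make the role of the inference script more concrete, here is a minimal sketch of the handler functions that the SageMaker PyTorch serving container looks for in an entry point (`model_fn`, `input_fn`, `predict_fn`, `output_fn`). This is not the lab's actual `inference.py`: `DummyModel` and its keyword rule are hypothetical stand-ins for the fine-tuned transformer, and the JSON Lines request shape is assumed from the serializers this notebook imports.

```python
import json

JSONLINES = "application/jsonlines"


class DummyModel:
    """Hypothetical stand-in for the fine-tuned transformer."""

    def predict_label(self, text):
        # Toy rule for illustration only; a real model would run a forward pass.
        return 5 if "great" in text.lower() else 1


def model_fn(model_dir):
    # A real handler would load the checkpoint stored under model_dir.
    return DummyModel()


def input_fn(request_body, request_content_type):
    if request_content_type != JSONLINES:
        raise ValueError("Unsupported content type: {}".format(request_content_type))
    if isinstance(request_body, (bytes, bytearray)):
        request_body = request_body.decode("utf-8")
    # One JSON object per non-empty line; each carries a "features" list.
    return [
        json.loads(line)["features"][0]
        for line in request_body.splitlines()
        if line.strip()
    ]


def predict_fn(input_data, model):
    return [{"predicted_label": model.predict_label(text)} for text in input_data]


def output_fn(prediction, accept):
    if accept != JSONLINES:
        raise ValueError("Unsupported accept type: {}".format(accept))
    # Emit one JSON object per line, mirroring the request format.
    return "\n".join(json.dumps(item) for item in prediction)


# Simulate one request passing through the four handlers.
body = '{"features": ["This is great!"]}\n{"features": ["This is bad."]}'
model = model_fn("/opt/ml/model")
response = output_fn(predict_fn(input_fn(body, JSONLINES), model), JSONLINES)
print(response)
```

Splitting the work across four functions keeps model loading, parsing, inference, and response formatting independent, so each one can be tested without a running endpoint.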

In [6]:
class StarRatingPredictor(Predictor):
    def __init__(self, endpoint_name, sagemaker_session):
        super().__init__(
            endpoint_name,
            sagemaker_session=sagemaker_session,
            serializer=JSONLinesSerializer(),
            deserializer=JSONLinesDeserializer(),
        )

In [7]:
import time

timestamp = int(time.time())

pytorch_model_name = "{}-{}-{}".format(training_job_name, "pt", timestamp)

print(pytorch_model_name)

tensorflow-training-2024-03-03-04-02-13-539-pt-1709763530


In [8]:
model = PyTorchModel(
    model_data=transformer_pytorch_model_dir_s3_uri + "model.tar.gz",
    name=pytorch_model_name,
    role=role,
    entry_point="inference.py",
    source_dir="code-pytorch",
    framework_version="1.6.0",
    py_version="py3",
    predictor_cls=StarRatingPredictor,
)

In [9]:
import time

pytorch_endpoint_name = "{}-{}-{}".format(training_job_name, "pt", timestamp)

print(pytorch_endpoint_name)

tensorflow-training-2024-03-03-04-02-13-539-pt-1709763530


In [10]:
predictor = model.deploy(
    initial_instance_count=1, instance_type="ml.m4.xlarge", endpoint_name=pytorch_endpoint_name, wait=False
)

In [11]:
print(predictor)

<__main__.StarRatingPredictor object at 0x7f1ede1c6ad0>


In [12]:
from IPython.display import display, HTML

display(
    HTML(
        '<b>Review <a target="_blank" href="https://console.aws.amazon.com/sagemaker/home?region={}#/endpoints/{}">SageMaker REST Endpoint</a></b>'.format(
            region, pytorch_endpoint_name
        )
    )
)

In [13]:
%%time

waiter = sm.get_waiter("endpoint_in_service")
waiter.wait(EndpointName=pytorch_endpoint_name)

CPU times: user 52.3 ms, sys: 1.05 ms, total: 53.3 ms
Wall time: 4min


# _Wait Until the ^^ Endpoint ^^ is Deployed_
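
The `endpoint_in_service` waiter used above repeatedly checks the endpoint status until it is ready. The sketch below shows the same idea as an explicit polling loop. The function name `wait_for_endpoint`, the default delay and attempt count, and the stubbed `fake_describe` are assumptions made for illustration, not values taken from boto3.

```python
import time


def wait_for_endpoint(describe_fn, endpoint_name, delay_seconds=30,
                      max_attempts=120, sleep_fn=time.sleep):
    """Poll until the endpoint reports InService; raise on failure or timeout."""
    for _ in range(max_attempts):
        status = describe_fn(EndpointName=endpoint_name)["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError("Endpoint {} failed to deploy".format(endpoint_name))
        sleep_fn(delay_seconds)
    raise TimeoutError("Endpoint {} was not in service in time".format(endpoint_name))


# Stubbed describe call so the sketch runs without AWS credentials.
_statuses = iter(["Creating", "Creating", "InService"])


def fake_describe(EndpointName):
    return {"EndpointName": EndpointName, "EndpointStatus": next(_statuses)}


final_status = wait_for_endpoint(fake_describe, "my-endpoint", sleep_fn=lambda s: None)
print(final_status)
```

Against a real account, the call would look like `wait_for_endpoint(sm.describe_endpoint, pytorch_endpoint_name)`.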

In [14]:
pytorch_endpoint_arn = sm.describe_endpoint(EndpointName=pytorch_endpoint_name)["EndpointArn"]
print(pytorch_endpoint_arn)

arn:aws:sagemaker:us-east-1:211125778552:endpoint/tensorflow-training-2024-03-03-04-02-13-539-pt-1709763530


In [15]:
from sagemaker.lineage.visualizer import LineageTableVisualizer

lineage_table_viz = LineageTableVisualizer(sess)
lineage_table_viz_df = lineage_table_viz.show(endpoint_arn=pytorch_endpoint_arn)
lineage_table_viz_df

Unnamed: 0,Name/Source,Direction,Type,Association Type,Lineage Type
0,tensorflow-training-2024-03-03-04-02-13-539-pt...,Input,ModelDeployment,AssociatedWith,action


# Perform Predictions With A TorchServe Backend Amazon SageMaker Endpoint
Here, we pass sample strings of text to the endpoint to see the predicted sentiment. We give you one positive and one negative example, but feel free to change the strings yourself!

In [16]:
import json

inputs = [{"features": ["This is great!"]}, {"features": ["This is bad."]}]

predicted_classes = predictor.predict(inputs)

for predicted_class in predicted_classes:
    print("Predicted star_rating: {}".format(predicted_class))

Predicted star_rating: {'predicted_label': 5}
Predicted star_rating: {'predicted_label': 5}
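
For callers that do not use the SageMaker Python SDK, the same request can be sent through the `sagemaker-runtime` client in boto3. The following is a rough sketch under that assumption: `build_invoke_request`, `parse_jsonlines`, and `invoke` are hypothetical helper names, and `"my-endpoint"` is a placeholder. Only the request construction runs here; `invoke` needs credentials and a deployed endpoint.

```python
import json


def build_invoke_request(endpoint_name, records):
    """Build keyword arguments for the sagemaker-runtime invoke_endpoint call."""
    # JSON Lines: one JSON object per line, separated by newlines.
    body = "\n".join(json.dumps(record) for record in records)
    return {
        "EndpointName": endpoint_name,
        "ContentType": "application/jsonlines",
        "Accept": "application/jsonlines",
        "Body": body.encode("utf-8"),
    }


def parse_jsonlines(payload):
    """Decode a JSON Lines response body into a list of dictionaries."""
    if isinstance(payload, (bytes, bytearray)):
        payload = payload.decode("utf-8")
    return [json.loads(line) for line in payload.splitlines() if line.strip()]


def invoke(endpoint_name, records, region_name=None):
    """Call the live endpoint. Requires boto3, credentials, and a deployed endpoint."""
    import boto3  # imported lazily so the rest of the sketch runs offline

    runtime = boto3.client("sagemaker-runtime", region_name=region_name)
    response = runtime.invoke_endpoint(**build_invoke_request(endpoint_name, records))
    return parse_jsonlines(response["Body"].read())


request = build_invoke_request(
    "my-endpoint",
    [{"features": ["This is great!"]}, {"features": ["This is bad."]}],
)
print(request["Body"].decode("utf-8"))
```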


# Predict the `star_rating` with `review_body` Samples from Our TSVs


In [17]:
import csv
import pandas as pd

df_reviews = pd.read_csv(
    "./data/amazon_reviews_us_Digital_Software_v1_00.tsv.gz",
    delimiter="\t",
    quoting=csv.QUOTE_NONE,
    compression="gzip",
)

df_sample_reviews = df_reviews[["review_body", "star_rating"]].sample(n=50)
df_sample_reviews = df_sample_reviews.reset_index()
df_sample_reviews.shape

(50, 3)

In [18]:
import pandas as pd


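def predict(review_body):
    """Send one review to the endpoint and return its predicted label."""
    # The endpoint expects a list of records, each with a "features" list.
    inputs = [{"features": [review_body]}]
    predicted_classes = predictor.predict(inputs)
    return predicted_classes[0]["predicted_label"]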


df_sample_reviews["predicted_class"] = df_sample_reviews["review_body"].map(predict)
df_sample_reviews.head(5)

Unnamed: 0,index,review_body,star_rating,predicted_class
0,40113,Works Awesome!,5,1
1,44304,Just download it....<br />keep it simple!,5,1
2,59740,"I have used McAffey,Norton,and I like avast. I...",5,1
3,5456,Great very helpful.,5,1
4,71354,It is hard to use not as easy to use as Excel....,2,1
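
A quick way to check how often the predictions agree with the labels is to compute accuracy over the sample. The sketch below uses plain Python and only the five rows displayed above, so it says nothing about the full 50-row sample; `accuracy` is a hypothetical helper. With the DataFrame at hand, an equivalent expression would be `(df_sample_reviews["star_rating"] == df_sample_reviews["predicted_class"]).mean()`.

```python
# (star_rating, predicted_class) pairs copied from the five rows displayed above.
rows = [(5, 1), (5, 1), (5, 1), (5, 1), (2, 1)]


def accuracy(pairs):
    """Return the fraction of pairs whose label and prediction are equal."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("accuracy needs at least one (label, prediction) pair")
    matches = sum(1 for actual, predicted in pairs if actual == predicted)
    return matches / len(pairs)


score = accuracy(rows)
print("Accuracy on displayed rows: {:.2f}".format(score))
```

None of the five displayed rows match, so it may be worth checking how the model's output classes map to star ratings before relying on the endpoint.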


# Pass Variables to the Next Notebook(s)

In [19]:
%store pytorch_endpoint_name

Stored 'pytorch_endpoint_name' (str)


In [20]:
%store

Stored variables and their in-db values:
autopilot_endpoint_arn                                -> 'arn:aws:sagemaker:us-east-1:211125778552:endpoint
autopilot_model_arn                                   -> 'arn:aws:sagemaker:us-east-1:211125778552:model/au
autopilot_train_s3_uri                                -> 's3://sagemaker-us-east-1-211125778552/data/amazon
balance_dataset                                       -> True
balanced_bias_data_jsonlines_s3_uri                   -> 's3://sagemaker-us-east-1-211125778552/bias-detect
balanced_bias_data_s3_uri                             -> 's3://sagemaker-us-east-1-211125778552/bias-detect
best_candidate_tuning_job_name                        -> 0    tensorflow-training-240306-1651-002-2601ee0c

bias_data_s3_uri                                      -> 's3://sagemaker-us-east-1-211125778552/bias-detect
comprehend_endpoint_arn                               -> 'arn:aws:comprehend:us-east-1:211125778552:documen
comprehend_train_s3_uri          

# Release Resources

In [21]:
# sm.delete_endpoint(
#     EndpointName=pytorch_endpoint_name
# )
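
Deleting the endpoint alone leaves its endpoint configuration and the registered model behind. A fuller cleanup might look like the sketch below. `release_endpoint_resources` is a hypothetical helper, and `FakeSageMakerClient` is a stub that only records calls so the example runs without AWS credentials; the three delete methods mirror calls on the boto3 SageMaker client.

```python
def release_endpoint_resources(sm_client, endpoint_name, model_name):
    """Delete an endpoint, its endpoint config, and the model behind it."""
    # Look up the config name before the endpoint is deleted.
    config_name = sm_client.describe_endpoint(EndpointName=endpoint_name)["EndpointConfigName"]
    sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)
    sm_client.delete_model(ModelName=model_name)
    return config_name


class FakeSageMakerClient:
    """Stub that records calls instead of contacting AWS."""

    def __init__(self):
        self.calls = []

    def describe_endpoint(self, EndpointName):
        self.calls.append(("describe_endpoint", EndpointName))
        return {"EndpointName": EndpointName, "EndpointConfigName": EndpointName + "-config"}

    def delete_endpoint(self, EndpointName):
        self.calls.append(("delete_endpoint", EndpointName))

    def delete_endpoint_config(self, EndpointConfigName):
        self.calls.append(("delete_endpoint_config", EndpointConfigName))

    def delete_model(self, ModelName):
        self.calls.append(("delete_model", ModelName))


fake = FakeSageMakerClient()
release_endpoint_resources(fake, "my-endpoint", "my-model")
print(fake.calls)
```

With the real client this would be `release_endpoint_resources(sm, pytorch_endpoint_name, pytorch_model_name)`. Run it only once no later notebook needs the stored endpoint name, since deletion cannot be undone.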

In [22]:
%%html

<p><b>Shutting down your kernel for this notebook to release resources.</b></p>
<button class="sm-command-button" data-commandlinker-command="kernelmenu:shutdown" style="display:none;">Shutdown Kernel</button>
        
<script>
try {
    els = document.getElementsByClassName("sm-command-button");
    els[0].click();
}
catch(err) {
    // NoOp
}    
</script>

In [23]:
%%javascript

try {
    Jupyter.notebook.save_checkpoint();
    Jupyter.notebook.session.delete();
}
catch(err) {
    // NoOp
}

<IPython.core.display.Javascript object>

# Internal - DO NOT RUN - WILL REMOVE SOON

In [24]:
# %%bash

# aws sagemaker-runtime invoke-endpoint \
#     --endpoint-name "tensorflow-training-2021-01-28-01-19-50-987-pt-1611813221" \
#     --content-type application/jsonlines \
#     --accept application/jsonlines \
#     --body $'{"features":["Amazon gift cards are the best"]}\n{"features":["It is the worst"]}' >(cat) 1>/dev/null

In [25]:
# !rm model.tar.gz
# !aws s3 cp s3://sagemaker-us-east-1-835319576252/tensorflow-training-2021-01-28-01-19-50-987/output/model.tar.gz ./

In [26]:
# !rm -rf ./model
# !mkdir -p  ./model
# !tar -xvzf ./model.tar.gz -C model/

In [27]:
# !cp ./code/inference.py model/code/

In [28]:
# !cat model/code/inference.py