# YOLOS Object Detection Deployment with SageMaker Serverless Inference

In this notebook I will show you how to deploy a YOLOS object detection model with Hugging Face and AWS SageMaker.
The process mainly follows [Philipp Schmid's notebook](https://github.com/huggingface/notebooks/tree/main/sagemaker/19_serverless_inference).

In [None]:
import sagemaker
import boto3

from sagemaker.pytorch.model import PyTorchModel
from sagemaker.huggingface.model import HuggingFaceModel, HuggingFacePredictor
from sagemaker.serverless import ServerlessInferenceConfig
from sagemaker.serializers import DataSerializer

import io
from PIL import Image
import torch
import numpy as np

### AWS Setup
After the imports, we need to set up some AWS-specific variables. I usually run this code locally. If you are using SageMaker Notebooks, you will not need to specify all of the values below.

In [None]:
import os  # needed for the environment variable lookups below

boto3_sess = boto3.Session(profile_name=os.environ.get("SAGEMAKER_PROFILE"))
sess = sagemaker.Session(boto_session=boto3_sess)

role = os.environ.get("SAGEMAKER_ROLE")

print(f"sagemaker role arn: {role}")
print(f"sagemaker bucket: {sess.default_bucket()}")
print(f"sagemaker session region: {sess.boto_region_name}")
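Note that `os.environ.get` returns `None` when a variable is missing, which only surfaces later as a confusing AWS error. A minimal sketch of a stricter lookup follows. The helper name `require_env` and the demo values are my own placeholders, not part of SageMaker or boto3.

```python
import os
from typing import Mapping, Optional


def require_env(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the value of an environment variable or fail with a clear message."""
    env = os.environ if env is None else env
    value = env.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set or empty")
    return value


# Demo with a plain dict standing in for os.environ (all values are made up).
demo_env = {
    "SAGEMAKER_PROFILE": "my-profile",
    "SAGEMAKER_ROLE": "arn:aws:iam::123456789012:role/example-role",
}
print(require_env("SAGEMAKER_PROFILE", demo_env))
```

In the notebook you would call `require_env("SAGEMAKER_ROLE")` without the second argument so that the real environment is used.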

### Get the YOLOS weights into S3
Next, we get the YOLOS weights from the Hugging Face Hub and upload them to an S3 bucket.

First we define the Hub repository path where the weights are stored and the S3 location where we want to upload them.

In [None]:
repository = "hustvl/yolos-tiny"
model_id = repository.split("/")[-1]
s3_location = f"s3://{sess.default_bucket()}/custom_inference/{model_id}/model.tar.gz"
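To make the resulting path concrete without an AWS session, here is a small sketch of the same string construction. The function name `build_model_s3_uri` and the bucket name are placeholders for illustration only.

```python
def build_model_s3_uri(bucket: str, repository: str, prefix: str = "custom_inference") -> str:
    """Build the S3 URI of model.tar.gz from a bucket name and a Hub repository id."""
    # The model id is the last path component of the Hub repository, e.g. "yolos-tiny".
    model_id = repository.split("/")[-1]
    return f"s3://{bucket}/{prefix}/{model_id}/model.tar.gz"


# "my-example-bucket" is a made-up bucket name.
print(build_model_s3_uri("my-example-bucket", "hustvl/yolos-tiny"))
# s3://my-example-bucket/custom_inference/yolos-tiny/model.tar.gz
```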

Then we use git (lfs) to clone the weight repo.

In [None]:
!git lfs install
!git clone https://huggingface.co/$repository

The next step is only needed when the inference container runs a transformers version below 4.20.1. Because YOLOS support was only added in that release, we need to copy the custom inference script, located in *code*, into the cloned model repo.

In [None]:
!cp -r code/ $model_id/code/

Then we pack the model weights and all the other files in the repo into a *tar.gz* archive and finally upload it with the AWS CLI.

In [None]:
%cd $model_id
!tar zcvf model.tar.gz *

In [None]:
!aws s3 --profile celapp cp model.tar.gz $s3_location
%cd .. 
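If you prefer to stay in Python instead of shelling out to `tar`, the packing step can be sketched with the standard library as below. This is an illustrative alternative under my own assumptions, not what the cells above run. The helper name `pack_model_dir` and the demo file names are placeholders.

```python
import tarfile
import tempfile
from pathlib import Path


def pack_model_dir(model_dir: str, archive_name: str = "model.tar.gz") -> Path:
    """Pack the visible top-level entries of model_dir into a gzipped tarball."""
    root = Path(model_dir)
    archive_path = root / archive_name
    with tarfile.open(archive_path, "w:gz") as tar:
        for entry in sorted(root.iterdir()):
            # Skip hidden entries such as .git (the shell glob * skips them too)
            # and the archive file itself.
            if entry.name.startswith(".") or entry.name == archive_name:
                continue
            # arcname keeps paths relative to the archive root, e.g. code/...
            tar.add(entry, arcname=entry.name)
    return archive_path


# Demo on a throwaway directory with placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "code").mkdir()
    (Path(tmp) / "code" / "inference.py").write_text("# placeholder\n")
    (Path(tmp) / "config.json").write_text("{}")
    (Path(tmp) / ".git").mkdir()
    archive = pack_model_dir(tmp)
    with tarfile.open(archive) as tar:
        print(sorted(tar.getnames()))
```

The important property is the same as with `tar zcvf model.tar.gz *`: the files sit at the root of the archive rather than inside a nested folder.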

### Define and Deploy to SageMaker Serverless Inference
In the next step we define everything necessary for the serverless endpoint, starting with the *HuggingFaceModel*.
Next we define the endpoint's properties, i.e. its maximum memory size and the maximum number of concurrent invocations it can process.
To make it easy to handle image data, we also create a *DataSerializer*, which is then passed to the *deploy* method.

In [None]:
huggingface_model = HuggingFaceModel(
    model_data=s3_location, 
    sagemaker_session=sess,
    role=role,                   
    transformers_version="4.17", 
    pytorch_version="1.10",        # pytorch version used
    py_version='py38',            # python version used
)

# Specify MemorySizeInMB and MaxConcurrency in the serverless config object
serverless_config = ServerlessInferenceConfig(
    memory_size_in_mb=4096, max_concurrency=10,
)

image_serializer = DataSerializer(content_type='image/x-image')

# deploy the serverless endpoint
yolos_predictor = huggingface_model.deploy(
    endpoint_name="yolos-t-object-detection-serverless",
    serverless_inference_config=serverless_config,
    serializer=image_serializer
)
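Serverless endpoints only accept certain memory sizes. A small pre-flight check like the sketch below catches an invalid value before the deployment call is made. The list of sizes (1 GB steps from 1024 to 6144 MB) reflects the SageMaker documentation as I know it, so treat it as an assumption and check the current docs. The helper name `check_serverless_settings` is my own.

```python
# Assumption: memory sizes documented for SageMaker Serverless Inference (in MB).
ALLOWED_MEMORY_SIZES_MB = (1024, 2048, 3072, 4096, 5120, 6144)


def check_serverless_settings(memory_size_in_mb: int, max_concurrency: int) -> None:
    """Raise ValueError if the settings are obviously invalid."""
    if memory_size_in_mb not in ALLOWED_MEMORY_SIZES_MB:
        raise ValueError(
            f"memory_size_in_mb must be one of {ALLOWED_MEMORY_SIZES_MB}, got {memory_size_in_mb}"
        )
    if max_concurrency < 1:
        raise ValueError(f"max_concurrency must be at least 1, got {max_concurrency}")


# The values used in this notebook pass the check.
check_serverless_settings(memory_size_in_mb=4096, max_concurrency=10)
```

The upper bound for `max_concurrency` depends on your account quotas, so the sketch only checks the lower bound.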

### Call the endpoint
We can simply call the endpoint by invoking *predict* on the predictor returned by the *deploy* method. Because we attached a *DataSerializer*, we can pass the path of an image file directly.

In [None]:
res = yolos_predictor.predict(data="example_resized.jpg")
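The visualization further down reads `res["probabilities"]` and `res["bounding_boxes"]`. If you want to keep only confident detections before plotting, a filter could look like the sketch below. It assumes one row of class probabilities per detection. The example response is made up, and the custom inference script may already apply such a threshold.

```python
import numpy as np


def filter_detections(res, threshold=0.9):
    """Keep detections whose best class probability exceeds the threshold."""
    probs = np.asarray(res["probabilities"], dtype=float)
    boxes = np.asarray(res["bounding_boxes"], dtype=float)
    if probs.size == 0:
        # Nothing detected: return empty arrays with sensible shapes.
        return probs.reshape(0, 0), boxes.reshape(0, 4)
    keep = probs.max(axis=1) > threshold
    return probs[keep], boxes[keep]


# Made-up response with two detections over three classes.
example_res = {
    "probabilities": [[0.02, 0.97, 0.01], [0.40, 0.35, 0.25]],
    "bounding_boxes": [[10.0, 20.0, 110.0, 220.0], [5.0, 5.0, 50.0, 50.0]],
}
kept_probs, kept_boxes = filter_detections(example_res, threshold=0.9)
print(kept_boxes.tolist())  # [[10.0, 20.0, 110.0, 220.0]]
```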

### Clean up
If you're just testing and don't want the model and endpoint to hang around in your SageMaker account, run the following lines to clean them up.

In [None]:
yolos_predictor.delete_model()
yolos_predictor.delete_endpoint()

### Result Visualization
Use the code below to visualize your object detections.
You can also have a look at the *infer_image.py* file, which I used for more lightweight inference that does not need to load the YOLOS model from the transformers library.

In [None]:
import matplotlib.pyplot as plt
from transformers import YolosForObjectDetection

# colors for visualization
COLORS = [[0.000, 0.447, 0.741], [0.850, 0.325, 0.098], [0.929, 0.694, 0.125],
          [0.494, 0.184, 0.556], [0.466, 0.674, 0.188], [0.301, 0.745, 0.933]]
# the model is only loaded here for its id2label mapping (class index -> label name)
model = YolosForObjectDetection.from_pretrained("hustvl/yolos-tiny")

def plot_results(pil_img, prob, boxes):
    plt.figure(figsize=(16,10))
    plt.imshow(pil_img)
    ax = plt.gca()
    colors = COLORS * 100
    for p, (xmin, ymin, xmax, ymax), c in zip(prob, boxes, colors):
        ax.add_patch(plt.Rectangle((xmin, ymin), xmax - xmin, ymax - ymin,
                                   fill=False, color=c, linewidth=3))
        cl = p.argmax()
        text = f'{model.config.id2label[cl.item()]}: {p[cl]:0.2f}'
        ax.text(xmin, ymin, text, fontsize=15,
                bbox=dict(facecolor='yellow', alpha=0.5))
    plt.axis('off')
    plt.show()
    

In [None]:
image = Image.open("example_resized.jpg")
plot_results(image, np.asarray(res["probabilities"]), res["bounding_boxes"])
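`plot_results` expects boxes as `(xmin, ymin, xmax, ymax)` in pixels. YOLOS itself predicts normalized `(center_x, center_y, width, height)` boxes, and whether the custom inference script already converts them is not shown in this notebook. In case it does not, the conversion can be sketched as follows. The function name `cxcywh_to_xyxy_pixels` and the example values are my own.

```python
import numpy as np


def cxcywh_to_xyxy_pixels(boxes, image_width, image_height):
    """Convert normalized (cx, cy, w, h) boxes to pixel (xmin, ymin, xmax, ymax)."""
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    cx, cy, w, h = boxes.T
    # Corner coordinates, still normalized to the range [0, 1].
    corners = np.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], axis=1)
    # Scale x values by the image width and y values by the image height.
    scale = np.array([image_width, image_height, image_width, image_height], dtype=float)
    return corners * scale


# One centered box on a made-up 100 x 200 pixel image.
print(cxcywh_to_xyxy_pixels([[0.5, 0.5, 0.25, 0.5]], image_width=100, image_height=200).tolist())
```

With a PIL image, `image.size` returns `(width, height)`, so the two values can be passed through directly.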