In [None]:
!pip install --upgrade pip
!pip -q install sagemaker awscli boto3 pandas --upgrade 

## Example: TorchServe Performance Tuning on Amazon SageMaker

This example shows how to tune TorchServe performance, build a TorchServe container, and host it on Amazon SageMaker. SageMaker hosting is fully managed: you specify the instance type and the minimum and maximum number of instances, and SageMaker takes care of the rest.

Performance tuning parameters in TorchServe ([configuration documentation](https://github.com/pytorch/serve/blob/master/docs/configuration.md#other-properties)):
* number_of_netty_threads
* netty_client_threads
* async_logging
* minWorkers
* maxWorkers
* batchSize 
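
To make the parameters above concrete, here is a minimal sketch that renders them as `config.properties` text. The helper `render_config_properties` and every value shown are illustrative assumptions, not the contents of this notebook's `config.properties`. The nested `models` JSON layout and the version key `"1.0"` are also assumptions based on the TorchServe configuration documentation, so check them against your TorchServe version.

```python
import json


def render_config_properties(server_settings, model_settings=None):
    """Render TorchServe settings as config.properties text (hypothetical helper)."""
    lines = [f"{key}={value}" for key, value in server_settings.items()]
    if model_settings:
        # Assumption: per-model settings are passed as one JSON object under `models`.
        lines.append("models=" + json.dumps(model_settings))
    return "\n".join(lines) + "\n"


# Illustrative values only; tune them for your instance type and model.
server_settings = {
    "number_of_netty_threads": 4,
    "netty_client_threads": 4,
    "async_logging": "true",
}

# Assumed model version "1.0"; the worker and batch values are placeholders.
model_settings = {
    "TransformerEn2Fr": {
        "1.0": {
            "marName": "TransformerEn2Fr.mar",
            "minWorkers": 1,
            "maxWorkers": 1,
            "batchSize": 4,
        }
    }
}

config_text = render_config_properties(server_settings, model_settings)
print(config_text)
```

The sketch only builds a string, so it does not overwrite the `config.properties` file that ships with this example.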

## config.properties

In [None]:
!cat config.properties

### Clone the TorchServe repository

In [None]:
!git clone https://github.com/pytorch/serve.git

In [None]:
!cd serve && git checkout issue_1107

### Download a PyTorch model 

In [None]:
model_name = "TransformerEn2Fr"
mar_file = f'{model_name}.mar'
mar_url = f'https://torchserve.pytorch.org/mar_files/{mar_file}'
!wget -q {mar_url}
!ls *.mar

### Upload the TransformerEn2Fr.mar archive file to Amazon S3
Amazon SageMaker expects model artifacts in a tar.gz file, so compress TransformerEn2Fr.mar into a tar.gz archive.
Then upload the archive to your default Amazon SageMaker S3 bucket under the models directory.
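
The packaging step can also be done in pure Python. The following is a minimal sketch using the standard-library `tarfile` module; the helper name `make_model_archive` is hypothetical, and the demonstration uses a placeholder file rather than the real model archive.

```python
import tarfile
import tempfile
from pathlib import Path


def make_model_archive(mar_path, archive_path):
    """Pack a .mar file at the root of a gzip-compressed tar archive."""
    mar_path = Path(mar_path)
    with tarfile.open(str(archive_path), "w:gz") as tar:
        # arcname drops any directory prefix so the .mar sits at the archive root.
        tar.add(str(mar_path), arcname=mar_path.name)
    return Path(archive_path)


# Demonstration with a placeholder file; in the notebook, pass the real .mar path.
with tempfile.TemporaryDirectory() as tmp:
    placeholder = Path(tmp) / "TransformerEn2Fr.mar"
    placeholder.write_bytes(b"placeholder")
    archive = make_model_archive(placeholder, Path(tmp) / "TransformerEn2Fr.tar.gz")
    with tarfile.open(str(archive)) as tar:
        members = tar.getnames()

print(members)
```

Keeping the `.mar` file at the root of the archive mirrors what `tar cvfz` produces when it is run in the same directory as the file.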

### Create a boto3 session and specify a role with SageMaker access

In [None]:
import boto3, time, json
sess    = boto3.Session()
sm      = sess.client('sagemaker')
region  = sess.region_name
account = boto3.client('sts').get_caller_identity().get('Account')

In [None]:
import sagemaker
role = sagemaker.get_execution_role()
sagemaker_session = sagemaker.Session(boto_session=sess)

In [None]:
bucket_name = sagemaker_session.default_bucket()
prefix = 'torchserve'

!tar cvfz {model_name}.tar.gz {mar_file}
!aws s3 cp {model_name}.tar.gz s3://{bucket_name}/{prefix}/models/

### Create an Amazon ECR registry
Create a new Amazon ECR repository to hold your TorchServe container images.

In [None]:
registry_name = 'torchserve-perf'
!aws ecr create-repository --repository-name {registry_name}
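
Re-running the command above fails once the repository exists. As a hedged alternative, the sketch below uses the boto3 ECR client to create the repository only when it is missing. The helper name `ensure_ecr_repository` is hypothetical, and the commented usage assumes AWS credentials are configured.

```python
def ensure_ecr_repository(ecr_client, repository_name):
    """Return the repository URI, creating the repository if it is missing."""
    try:
        response = ecr_client.create_repository(repositoryName=repository_name)
        return response["repository"]["repositoryUri"]
    except ecr_client.exceptions.RepositoryAlreadyExistsException:
        # The repository already exists, so look up its URI instead.
        response = ecr_client.describe_repositories(
            repositoryNames=[repository_name]
        )
        return response["repositories"][0]["repositoryUri"]


# Usage in the notebook (requires AWS credentials):
# import boto3
# repository_uri = ensure_ecr_repository(boto3.client("ecr"), "torchserve-perf")
```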

### Build a TorchServe Docker container and push it to Amazon ECR

In [None]:
image_label = 'v1'
image = f'{account}.dkr.ecr.{region}.amazonaws.com/{registry_name}:{image_label}'

!docker build -t {registry_name}:{image_label} .
!aws ecr get-login-password --region {region} | docker login --username AWS --password-stdin {account}.dkr.ecr.{region}.amazonaws.com
!docker tag {registry_name}:{image_label} {image}
!docker push {image}

### Deploy endpoint and make prediction using Amazon SageMaker SDK

In [None]:
from sagemaker.model import Model
from sagemaker.predictor import Predictor

model_data = f's3://{bucket_name}/{prefix}/models/{model_name}.tar.gz'
sm_model_name = f'torchserve-{model_name}'

torchserve_model = Model(model_data = model_data, 
                         image_uri = image,
                         role  = role,
                         predictor_cls=Predictor,
                         name  = sm_model_name)

In [None]:
endpoint_name = 'torchserve-endpoint-' + time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())

predictor = torchserve_model.deploy(instance_type='ml.g4dn.xlarge',
                                    initial_instance_count=1,
                                    endpoint_name = endpoint_name)

### Test the TorchServe hosted model

In [None]:
payload = "Hi James, when are you coming back home? I am waiting for you. Please come as soon as possible."    
response = predictor.predict(data=payload)
print(response)
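
Since this example is about performance tuning, it can help to compare latency before and after changing a setting. The sketch below is one simple way to do that, not a rigorous benchmark. The helper `measure_latency` and the stand-in `fake_predict` are hypothetical; in the notebook you would pass `predictor.predict` instead.

```python
import math
import statistics
import time


def measure_latency(predict_fn, payload, n_requests=20):
    """Call predict_fn(payload) n_requests times and summarize latency in ms."""
    latencies_ms = []
    for _ in range(n_requests):
        start = time.perf_counter()
        predict_fn(payload)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    latencies_ms.sort()
    # Nearest-rank 95th percentile over the sorted samples.
    p95_index = max(0, math.ceil(0.95 * len(latencies_ms)) - 1)
    return {
        "count": len(latencies_ms),
        "mean_ms": statistics.mean(latencies_ms),
        "p50_ms": statistics.median(latencies_ms),
        "p95_ms": latencies_ms[p95_index],
        "max_ms": latencies_ms[-1],
    }


# Stand-in for predictor.predict so the sketch runs without an endpoint.
def fake_predict(data):
    time.sleep(0.001)
    return data.encode("utf-8")


stats = measure_latency(fake_predict, "Hello", n_requests=10)
print(stats)
```

Requests are sent one at a time here, so the numbers reflect single-request latency rather than throughput under concurrent load.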