# SageMaker Batch Transform with TorchServe

This notebook demonstrates how to run a SageMaker batch transform job with TorchServe. The example uses an open-source machine translation model from the [Flores 101 competition](http://www.statmt.org/wmt21/large-scale-multilingual-translation-task.html?fbclid=IwAR20x8ZIe9DeVYmBW7y-H9nLaTAoKqIfd2_KFzw99ru-JZ4NnkylRBTsfJA,), which focuses on low-resource languages, and evaluates the model on the dataset provided in the competition. The TorchServe handler code, Dockerfile, and evaluation dataset are borrowed from the [Flores competition repo](https://github.com/facebookresearch/flores/blob/main/dynalab/handler.py) as well. Thanks to Guillaume Wenzek and the team.

#### Imports

In [None]:
import base64
import json
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
import os
import time
import boto3
import sagemaker

**Initiate session and retrieve region, account details**

In [None]:
sess = boto3.Session()
region = sess.region_name
account = boto3.client("sts").get_caller_identity().get("Account")

In [None]:
sm = sess.client("sagemaker")
role = sagemaker.get_execution_role()

#### Prepare model

In [None]:
model_file_name = "flores_small"
sagemaker_session = sagemaker.Session()
bucket_name = sagemaker_session.default_bucket()
prefix = "Dyna"

In [None]:
!wget https://torchserve.pytorch.org/mar_files/flores_small.mar
!tar cvfz {model_file_name}.tar.gz flores_small.mar
!aws s3 cp {model_file_name}.tar.gz s3://{bucket_name}/{prefix}/models/

In [None]:
model_artifact = f"s3://{bucket_name}/{prefix}/models/{model_file_name}.tar.gz"  # S3 path of the archive uploaded above
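
Before creating the model, it can help to confirm that the archive actually landed at that S3 path. The following is a minimal sketch, not part of the original notebook: `split_s3_uri` and `artifact_exists` are hypothetical helper names, and the check relies on the standard boto3 `head_object` call.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri}")
    return parsed.netloc, parsed.path.lstrip("/")


def artifact_exists(uri, s3_client=None):
    """Return True if the object at `uri` exists (hypothetical helper)."""
    # Imported lazily so the URI parsing above works without AWS libraries.
    import boto3
    from botocore.exceptions import ClientError

    s3_client = s3_client or boto3.client("s3")
    bucket, key = split_s3_uri(uri)
    try:
        s3_client.head_object(Bucket=bucket, Key=key)
        return True
    except ClientError as err:
        # head_object reports a missing key as a 404 error code.
        if err.response["Error"]["Code"] in ("404", "NoSuchKey", "NotFound"):
            return False
        raise
```

Calling `artifact_exists(model_artifact)` should return `True` once the upload cell has run. Other errors, such as missing permissions, are re-raised rather than reported as a missing file.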

In [None]:
model_name = "floressmall-torchserve-sagemaker"

## Build a custom container

In [None]:
%%sh

container_name=flores-torchserve-sagemaker
account=$(aws sts get-caller-identity --query Account --output text)

# Get the region defined in the current configuration (default to us-west-2 if none defined)
region=$(aws configure get region)
region=${region:-us-west-2}

fullname="${account}.dkr.ecr.${region}.amazonaws.com/${container_name}"

# If the repository doesn't exist in ECR, create it.
aws ecr describe-repositories --repository-names "${container_name}" > /dev/null 2>&1
if [ $? -ne 0 ]
then
    aws ecr create-repository --repository-name "${container_name}" > /dev/null
fi

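# Log Docker in to ECR. `aws ecr get-login` was removed in AWS CLI v2,
# so pipe the password from `get-login-password` into `docker login` instead.
aws ecr get-login-password --region ${region} | docker login --username AWS --password-stdin "${account}.dkr.ecr.${region}.amazonaws.com"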

# Build the docker image locally with the image name and then push it to ECR
# with the full name.
docker build -t ${container_name} docker/
docker tag ${container_name} ${fullname}

docker push ${fullname}
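
After the push, a quick check that the image is visible in ECR can save a failed model creation later. This is a sketch under stated assumptions: `ecr_image_uri` and `image_is_pushed` are hypothetical helper names, the tag is assumed to be `latest`, and the lookup uses the boto3 ECR `describe_images` call.

```python
def ecr_image_uri(account, region, repository, tag="latest"):
    """Build the full ECR image URI in the same form the shell cell uses."""
    return f"{account}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


def image_is_pushed(ecr_client, repository, tag="latest"):
    """Return True if `repository:tag` exists in ECR (hypothetical helper)."""
    try:
        ecr_client.describe_images(
            repositoryName=repository, imageIds=[{"imageTag": tag}]
        )
        return True
    except (
        ecr_client.exceptions.ImageNotFoundException,
        ecr_client.exceptions.RepositoryNotFoundException,
    ):
        # Either the tag or the whole repository is missing.
        return False
```

With the session created earlier, `image_is_pushed(sess.client("ecr"), "flores-torchserve-sagemaker")` would be the corresponding call.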

#### Create SageMaker model, deploy and run batch transform

In [None]:
registry_name = "flores-torchserve-sagemaker"
image = f"{account}.dkr.ecr.{region}.amazonaws.com/{registry_name}:latest"

container = {"Image": image, "ModelDataUrl": model_artifact}

create_model_response = sm.create_model(
    ModelName=model_name, ExecutionRoleArn=role, PrimaryContainer=container
)

print(create_model_response["ModelArn"])

### Batch transform jobs

* The S3 bucket is `bucket_name`, the session's default bucket retrieved at the start of the notebook.
* Within that bucket, define the `batch_input` and `batch_output` locations as shown below.
* Make sure the dataset files (the shared input files) are placed under the `batch_input` location.

In [None]:
batch_input = f"s3://{bucket_name}/Dyna/batch_transform_flores_torchserve_sagemaker/"

batch_output = f"s3://{bucket_name}/Dyna/batch_transform_flores_torchserve_sagemaker_output/"

#### Data prep
In this notebook, we'll use data from the FLORES-101 dataset that has already been prepared to work with the Flores model. At a high level, this data was downloaded from the [flores GitHub repo](https://github.com/facebookresearch/flores#download-flores-101-dev-and-devtest-dataset) and prepared by passing the path of the data to this [prepare()](https://github.com/facebookresearch/dynabench/blob/main/evaluation/datasets/mt/flores.py#L311) function.

In [None]:
!mkdir -p flores_inputs
!aws s3 cp --recursive s3://sagemaker-sample-files/datasets/text/flores/ flores_inputs
!aws s3 cp --recursive flores_inputs/ {batch_input}
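
Because the transform request in this notebook splits input by line and sends `application/json`, each line of an input file is expected to be a standalone JSON record. The sketch below checks that assumption on the local copies before the job is submitted. `check_jsonl` is a hypothetical helper, and it makes no assumption about which fields the records contain.

```python
import json


def check_jsonl(path, max_lines=5):
    """Parse the first `max_lines` non-empty lines of a JSON Lines file.

    Returns the sorted top-level keys seen in those records and raises
    ValueError on the first line that is not valid JSON.
    """
    keys = set()
    checked = 0
    with open(path, encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # skip blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as err:
                raise ValueError(f"{path}:{number} is not valid JSON: {err}") from err
            if isinstance(record, dict):
                keys.update(record)
            checked += 1
            if checked >= max_lines:
                break
    return sorted(keys)
```

Running `check_jsonl` on each file under `flores_inputs` shows which fields the prepared records carry, which is also what a custom handler would need to read.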

In [None]:
import time

batch_job_name = "flores-batch-" + time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())
batch_job_name

In [None]:
request = {
    "ModelClientConfig": {
        "InvocationsTimeoutInSeconds": 3600,
        "InvocationsMaxRetries": 1,
    },
    "TransformJobName": batch_job_name,
    "ModelName": model_name,
    "MaxConcurrentTransforms": 1,
    "BatchStrategy": "MultiRecord",
    "TransformOutput": {
        "S3OutputPath": batch_output,
        "AssembleWith": "Line",
        "Accept": "application/json",
    },
    "TransformInput": {
        "DataSource": {"S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": batch_input}},
        "SplitType": "Line",
        "ContentType": "application/json",
    },
    "TransformResources": {"InstanceType": "ml.g4dn.xlarge", "InstanceCount": 1},
}

In [None]:
%%time
sm.create_transform_job(**request)

while True:
    response = sm.describe_transform_job(TransformJobName=batch_job_name)
    status = response["TransformJobStatus"]
    if status == "Completed":
        print("Transform job ended with status: " + status)
        break
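    if status == "Failed":
        message = response["FailureReason"]
        print("Transform failed with the following error: {}".format(message))
        raise Exception("Transform job failed")
    if status == "Stopped":
        # A stopped job never reaches "Completed", so exit instead of polling forever
        print("Transform job ended with status: " + status)
        break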
    print("Transform job is still in status: " + status)
    time.sleep(30)
# The job should complete in approximately 7~10 minutes, depending on the instance type
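
Once the job completes, batch transform writes one object per input object under the output path, named after the input with an `.out` suffix. The sketch below lists those objects and peeks at the first few result lines. The helper names are hypothetical, and the calls used are the standard boto3 S3 `list_objects_v2` paginator and `get_object`.

```python
def list_transform_outputs(s3_client, bucket, prefix):
    """Return the keys under `prefix` that end with '.out'."""
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages with no matching objects have no "Contents" entry.
        for obj in page.get("Contents", []):
            if obj["Key"].endswith(".out"):
                keys.append(obj["Key"])
    return keys


def read_first_lines(s3_client, bucket, key, count=3):
    """Return up to `count` decoded lines from the start of an S3 object."""
    body = s3_client.get_object(Bucket=bucket, Key=key)["Body"]
    lines = []
    for raw in body.iter_lines():
        lines.append(raw.decode("utf-8"))
        if len(lines) >= count:
            break
    return lines
```

For this notebook the call would look like `list_transform_outputs(sess.client("s3"), bucket_name, "Dyna/batch_transform_flores_torchserve_sagemaker_output/")`. Since the request uses `"AssembleWith": "Line"`, each output line should correspond to one record.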

### Stop transform job, if not completed

In [None]:
sm.stop_transform_job(TransformJobName=batch_job_name)

### Conclusion
This notebook showed how to set up a SageMaker batch transform job that uses TorchServe under the hood to serve the model. This is useful for testing production variants, different models, or hyperparameters against a test dataset. To adapt this work to other applications, users can write their own custom TorchServe handlers that define model initialization, data pre- and post-processing, and inference logic, and adjust the batch transform job settings to match.
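
As a starting point for such a handler, here is a minimal skeleton using TorchServe's class-based entry point (`initialize` plus `handle`). It is an illustration only and not the Flores handler: the class name, the `text` and `translation` field names, and the uppercase placeholder "model" are all hypothetical. It also assumes each request body may carry several newline-delimited JSON records, as a `MultiRecord` batch strategy would send.

```python
import json


class SketchTranslationHandler:
    """Hypothetical TorchServe handler skeleton; not the Flores handler."""

    def __init__(self):
        self.model = None
        self.initialized = False

    def initialize(self, context):
        # A real handler would load weights from the model directory that
        # TorchServe passes in via `context`. A placeholder callable that
        # uppercases text stands in for the model here.
        self.model = lambda texts: [text.upper() for text in texts]
        self.initialized = True

    def preprocess(self, data):
        # `data` holds one dict per request; the payload sits under
        # "body" or "data".
        requests = []
        for row in data:
            payload = row.get("body")
            if payload is None:
                payload = row.get("data")
            if isinstance(payload, (bytes, bytearray)):
                payload = payload.decode("utf-8")
            if isinstance(payload, str):
                records = [json.loads(line) for line in payload.splitlines() if line.strip()]
            elif isinstance(payload, dict):
                records = [payload]
            else:
                records = list(payload)
            # "text" is a hypothetical field name.
            requests.append([record["text"] for record in records])
        return requests

    def inference(self, requests):
        return [self.model(texts) for texts in requests]

    def postprocess(self, outputs):
        # One response per request, with one JSON line per input record.
        return [
            "\n".join(json.dumps({"translation": text}) for text in texts)
            for texts in outputs
        ]

    def handle(self, data, context):
        if not self.initialized:
            self.initialize(context)
        return self.postprocess(self.inference(self.preprocess(data)))
```

The one constraint this skeleton takes care to respect is that `handle` returns a list with exactly one entry per incoming request, regardless of how many records each request body contains.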