# Deploy custom inference code for a Transformer model
# Deploy a TensorFlow 2.1 model using a custom inference container

Some sections of this notebook were inspired by the following tutorial and code:

**SageMaker Hello World Inference**

https://medium.com/@marckarp101/sagemaker-hello-world-inference-695655a62193
https://github.com/studiouser/HelloWorld

In [None]:
!pip install dill

In [None]:
import dill as pickle

## Import the libraries

In [1]:
import sagemaker
from sagemaker import get_execution_role
from sagemaker.model import Model
from sagemaker.predictor import RealTimePredictor, csv_serializer, csv_deserializer

## Create a SageMaker Session


In [2]:
sess = sagemaker.Session()
account = sess.boto_session.client('sts').get_caller_identity()['Account']
region = sess.boto_session.region_name
role = get_execution_role()

In [3]:
bucket = 'edumunozsala-ml-sagemaker'
prefix = 'ts-transformer'
model_name='transformer'


## Download our trained model

Previously, we trained a Transformer model in TensorFlow 2 and saved it in SavedModel format to a folder in AWS S3. Now we want to deploy this trained model in a container and define our own inference code.

First, we need to check whether the trained SavedModel is packaged as a .tar.gz file, as SageMaker expects. If not, we download the files that make up the SavedModel, compress them, and upload the archive to S3.

In [9]:
!aws s3 cp s3://$bucket/$prefix/$model_name transformer --recursive

download: s3://edumunozsala-ml-sagemaker/ts-transformer/transformer/variables/variables.index to transformer/variables/variables.index
download: s3://edumunozsala-ml-sagemaker/ts-transformer/transformer/saved_model.pb to transformer/saved_model.pb
download: s3://edumunozsala-ml-sagemaker/ts-transformer/transformer/variables/variables.data-00000-of-00001 to transformer/variables/variables.data-00000-of-00001


### Package our Model to deploy to a SageMaker endpoint
SageMaker requires our Model to be tarred and gzipped.

In [11]:
%%sh

cd transformer
tar -czvf model.tar.gz *

saved_model.pb
variables/
variables/variables.index
variables/variables.data-00000-of-00001
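
The same packaging step can be done from Python with the standard library. The following is a minimal sketch, not part of the original notebook; the function name `package_saved_model` is hypothetical. It stores files relative to the model directory so that `saved_model.pb` sits at the archive root, as in the listing above.

```python
import tarfile
from pathlib import Path


def package_saved_model(model_dir, archive_path):
    """Pack the contents of model_dir into a gzipped tar archive.

    Entries are stored relative to model_dir, so saved_model.pb ends up at
    the archive root. Returns the list of entry names in the archive.
    """
    model_dir = Path(model_dir)
    archive = Path(archive_path).resolve()
    with tarfile.open(archive, "w:gz") as tar:
        for path in sorted(model_dir.rglob("*")):
            # Skip the archive itself in case it is written inside model_dir.
            if path.resolve() == archive:
                continue
            arcname = path.relative_to(model_dir).as_posix()
            # recursive=False: rglob already visits every nested entry.
            tar.add(path, arcname=arcname, recursive=False)
    with tarfile.open(archive, "r:gz") as tar:
        return tar.getnames()


# Example call (assumes the SavedModel was downloaded to ./transformer):
# package_saved_model("transformer", "transformer/model.tar.gz")
```

Skipping the archive itself matters when the output file is written inside the model directory, since otherwise a second run would pack the previous archive into the new one.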


### Upload our Model to S3
Now we can upload our gzipped Model to S3.

In [12]:
!aws s3 cp model.tar.gz s3://$bucket/$prefix/model/model.tar.gz

upload: ./model.tar.gz to s3://edumunozsala-ml-sagemaker/ts-transformer/model/model.tar.gz


### Build and Push our container to ECR
Our custom Model is now in S3. All we need is a container that implements the hosting requirements and the inference logic.
An important file to look at is predictor.py, where we coded the logic to deserialize the Model and make an inference from it. SageMaker fetches our Model from S3 and places it in /opt/ml/model/. Take a look at the get_model() method, which loads the model from the SageMaker model path.

In [12]:
!sed -n '26,31p' Transformer/container/Files/predictor.py

END_TOKEN=[8128]
SUMM_MAX_LENGTH=30



class ScoringService(object):
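
For orientation, the hosting requirements come down to two HTTP routes served on port 8080: `GET /ping`, which must return status 200 for the health check to pass, and `POST /invocations`, which receives the inference payload. The sketch below illustrates that contract as a plain WSGI application using only the standard library. It is not the contents of predictor.py: the real file most likely uses a web framework and loads the TensorFlow SavedModel from /opt/ml/model/, whereas here a placeholder callable stands in for the model so the example runs without TensorFlow.

```python
class ScoringService:
    """Holds a single, lazily loaded model instance (hypothetical sketch)."""

    model = None

    @classmethod
    def get_model(cls):
        # Load once and reuse across requests. A real implementation would
        # load the SavedModel from /opt/ml/model/ here; this placeholder
        # simply upper-cases its input.
        if cls.model is None:
            cls.model = lambda text: text.upper()
        return cls.model

    @classmethod
    def predict(cls, text):
        return cls.get_model()(text)


def app(environ, start_response):
    """WSGI application exposing GET /ping and POST /invocations."""
    method = environ["REQUEST_METHOD"]
    path = environ.get("PATH_INFO", "")

    if method == "GET" and path == "/ping":
        # Report healthy only if the model can be obtained.
        healthy = ScoringService.get_model() is not None
        status = "200 OK" if healthy else "404 Not Found"
        start_response(status, [("Content-Type", "application/json")])
        return [b"\n"]

    if method == "POST" and path == "/invocations":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        payload = environ["wsgi.input"].read(length).decode("utf-8")
        result = ScoringService.predict(payload)
        start_response("200 OK", [("Content-Type", "text/csv")])
        return [result.encode("utf-8")]

    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]
```

If the model fails to load, or the server is not listening when SageMaker probes `/ping`, the endpoint creation fails its health check, so it is worth exercising both routes locally before pushing the image.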


In [4]:
!pwd

/home/ec2-user/SageMaker/MyNotebooks/Text_Summarization_Enc_Dec_Attention


In [5]:
%%sh

# The name of our algorithm
algorithm_name=ts-transformer-inference

cd Transformer/container


chmod +x Files/serve

account=$(aws sts get-caller-identity --query Account --output text)

# Get the region defined in the current configuration (default to us-east-1 if none defined)
region=$(aws configure get region)
region=${region:-us-east-1}

fullname="${account}.dkr.ecr.${region}.amazonaws.com/${algorithm_name}:latest"

# If the repository doesn't exist in ECR, create it.
aws ecr describe-repositories --repository-names "${algorithm_name}" > /dev/null 2>&1

if [ $? -ne 0 ]
then
    aws ecr create-repository --repository-name "${algorithm_name}" > /dev/null
fi

# Get the login command from ECR and execute it directly
$(aws ecr get-login --region ${region} --no-include-email)

# Build the docker image locally with the image name and then push it to ECR
# with the full name.

docker build  -t ${algorithm_name} .
docker tag ${algorithm_name} ${fullname}

docker push ${fullname}

Login Succeeded
Sending build context to Docker daemon  129.3MB
Step 1/8 : FROM python:3.6
3.6: Pulling from library/python
6c33745f49b4: Pulling fs layer
ef072fc32a84: Pulling fs layer
c0afb8e68e0b: Pulling fs layer
d599c07d28e6: Pulling fs layer
f2ecc74db11a: Pulling fs layer
0e7ac7e3db3f: Pulling fs layer
dfd5461cd34f: Pulling fs layer
e6a2d3233da5: Pulling fs layer
099a5f6e48a0: Pulling fs layer
d599c07d28e6: Waiting
f2ecc74db11a: Waiting
0e7ac7e3db3f: Waiting
dfd5461cd34f: Waiting
e6a2d3233da5: Waiting
099a5f6e48a0: Waiting
ef072fc32a84: Verifying Checksum
ef072fc32a84: Download complete
c0afb8e68e0b: Verifying Checksum
c0afb8e68e0b: Download complete
6c33745f49b4: Verifying Checksum
6c33745f49b4: Download complete
d599c07d28e6: Verifying Checksum
d599c07d28e6: Download complete
0e7ac7e3db3f: Verifying Checksum
0e7ac7e3db3f: Download complete
e6a2d3233da5: Verifying Checksum
e6a2d3233da5: Download complete
099a5f6e48a0: Verifying Checksum
099a5f6e48a0: Download complete
dfd5461c

https://docs.docker.com/engine/reference/commandline/login/#credentials-store



### Deploy our Model to an Endpoint
Our container has been pushed to ECR and our Model is in S3, so we now have everything we need to deploy to a SageMaker Endpoint.

In [6]:
# Create a Predictor so we can use the predict() method to invoke our 'model'.
class Predictor(RealTimePredictor):
    def __init__(self, endpoint_name, sagemaker_session=None):
        super(Predictor, self).__init__(
            endpoint_name, sagemaker_session, csv_serializer, csv_deserializer
        )

#### Create a SageMaker Model

In [13]:
# Set the image name
image = '{}.dkr.ecr.{}.amazonaws.com/{}-inference:latest'.format(account, region, prefix)

sagemaker_model = Model(
    sagemaker_session=sess,
    model_data="s3://{}/{}/model/model.tar.gz".format(bucket, prefix),
    image_uri=image,
    role=role,
    predictor_cls=Predictor,
    name=model_name,
)
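
The image URI must match the repository name that the build script pushed to (`ts-transformer-inference`), and the model data URI must match the upload location. As a small illustrative sketch, the two helper functions below (hypothetical names, not part of the SageMaker SDK) make both conventions explicit; the account ID in the usage comment is a made-up placeholder.

```python
def ecr_image_uri(account, region, repository, tag="latest"):
    """Build the ECR image URI in the form used by the build script."""
    return "{}.dkr.ecr.{}.amazonaws.com/{}:{}".format(account, region, repository, tag)


def model_data_uri(bucket, prefix, key="model/model.tar.gz"):
    """Build the S3 URI of the packaged model archive."""
    return "s3://{}/{}/{}".format(bucket, prefix, key)


# Example with placeholder values:
# ecr_image_uri("123456789012", "us-east-1", "ts-transformer-inference")
# model_data_uri("edumunozsala-ml-sagemaker", "ts-transformer")
```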

#### Deploy the Model to an Endpoint

In [14]:
predictor = sagemaker_model.deploy(initial_instance_count= 1,instance_type= 'ml.m4.xlarge')

Using already existing model: transformer


---------------------------------*

UnexpectedStatusException: Error hosting endpoint transformer-2021-01-10-19-15-06-109: Failed. Reason:  The primary container for production variant AllTraffic did not pass the ping health check. Please check CloudWatch logs for this endpoint..

#### Get a prediction from our Endpoint

In [None]:
predictor.predict("Say Hello World!")

#### Optional cleanup
When you're done with the endpoint, you'll want to clean it up.

In [None]:
sess.delete_endpoint(predictor.endpoint)

## Batch Transform Job
Now that we have seen how to deploy our custom model to a real-time Endpoint and request a prediction, let's create a Batch Transform Job to run batch inference.

### Create the input data and upload it to S3

In [None]:
%%writefile batchdata.csv
Say Hello World!
Say Hello World!
Say Hello World!
Say Hello World!
Say Hello World!
Say Hello World!
Say Hello World!
Say Hello World!
Say Hello World!

In [None]:
! aws s3 cp batchdata.csv s3://$bucket/$prefix/batchdata.csv

### Create the Transformer from the SageMaker Model and transform the data we created

In [None]:
transform_output_folder = "batch-transform-output"
output_path="s3://{}/{}/{}".format(sess.default_bucket(),"DEMO-hello-world",transform_output_folder)

transformer = sagemaker_model.transformer(instance_count=1,
                               instance_type='ml.m4.xlarge',
                               output_path=output_path,
                               assemble_with='Line',
                               accept='text/csv')

In [None]:
# Read the input from the location where batchdata.csv was uploaded
input_path = "s3://{}/{}/{}".format(bucket, prefix, "batchdata.csv")


transformer.transform(input_path, content_type='text/csv', split_type='Line')
transformer.wait()
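
To build intuition for `split_type='Line'` and `assemble_with='Line'`, the following sketch mimics the idea on an in-memory string: the input is split into one record per line, each record is passed to a prediction function, and the outputs are joined back with newlines. This is only a conceptual illustration under simplifying assumptions, not SageMaker's implementation; the real service may group several records into one request depending on the batch strategy and payload limits. The name `simulate_line_transform` is hypothetical.

```python
def simulate_line_transform(input_text, predict_fn):
    """Mimic line-based splitting and assembly on an in-memory file."""
    # split_type='Line': every non-empty line is treated as one record.
    records = [line for line in input_text.splitlines() if line.strip()]
    # Each record is sent to the model independently.
    outputs = [predict_fn(record) for record in records]
    # assemble_with='Line': results are written back one per line.
    return "\n".join(outputs) + "\n" if outputs else ""


# Example with a stand-in prediction function:
# simulate_line_transform("Say Hello World!\nSay Hello World!\n", str.upper)
```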

### View the Batch Transform results.

In [None]:
s3_client = sess.boto_session.client('s3')
s3_client.download_file(sess.default_bucket(), "DEMO-hello-world/{}/batchdata.csv.out".format(transform_output_folder), '/tmp/batchdata.csv.out')


with open('/tmp/batchdata.csv.out') as f:
    results = f.readlines()   
print("Transform results: \n{}".format(''.join(results)))