# TODO

* https://medium.com/analytics-vidhya/deploy-huggingface-s-bert-to-production-with-pytorch-serve-27b068026d18
* https://github.com/pytorch/serve/tree/master/examples/Huggingface_Transformers

# Deploying our BERT PyTorch Model as REST EndPoint

In [9]:
!pip install -q transformers==2.8.0
!pip install -q torch==1.5.0 --upgrade --ignore-installed

In [34]:
!pip install torchserve



In [57]:
import boto3
import sagemaker
import pandas as pd

sess   = sagemaker.Session()
bucket = sess.default_bucket()
role = sagemaker.get_execution_role()
region = boto3.Session().region_name
account_id = boto3.client('sts').get_caller_identity().get('Account')

sm = boto3.Session().client(service_name='sagemaker', region_name=region)

# Clone the TorchServe repository and install torch-model-archiver

You'll use `torch-model-archiver` to create a model archive file (.mar). The .mar file packages the serialized model checkpoint, whose `state_dict` is a dictionary that maps each layer to its parameter tensors, together with the handler and any extra files passed to the archiver.
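To see what actually ends up inside an archive, the sketch below lists its contents. It assumes the .mar was written in the archiver's default ZIP-based format and that the manifest lives at `MAR-INF/MANIFEST.json`; the helper name `describe_mar` is ours, not part of TorchServe.

```python
import json
import zipfile


def describe_mar(mar_path):
    """Return (file names, manifest dict or None) for a .mar archive.

    Assumption: the archive is ZIP-based (the archiver's default format).
    """
    with zipfile.ZipFile(mar_path) as archive:
        names = archive.namelist()
        manifest = None
        # Assumption: the archiver stores its manifest at this path.
        if "MAR-INF/MANIFEST.json" in names:
            manifest = json.loads(archive.read("MAR-INF/MANIFEST.json"))
    return names, manifest
```

Once the archive exists, `describe_mar('./DistilBertForSequenceClassification.mar')` should show the serialized weights, the handler, and the extra files side by side.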

In [58]:
!pip install ./src_torchserve/serve/model-archiver/

Processing ./src_torchserve/serve/model-archiver
Building wheels for collected packages: torch-model-archiver
  Building wheel for torch-model-archiver (setup.py) ... done
  Created wheel for torch-model-archiver: filename=torch_model_archiver-0.1.1b20200704-py3-none-any.whl size=15785 sha256=2e7321e25af8b60b2eb4a71a4a2cabd0b3ac20540ae6e623f6274e84f385682e
  Stored in directory: /home/ec2-user/.cache/pip/wheels/00/e2/f8/6382e4aa3a1a20fcfdc7aed73512e6bb8bd55ef9cd0a9c099d
Successfully built torch-model-archiver
Installing collected packages: torch-model-archiver
  Attempting uninstall: torch-model-archiver
    Found existing installation: torch-model-archiver 0.1.1b20200704
    Uninstalling torch-model-archiver-0.1.1b20200704:
      Successfully uninstalled torch-model-archiver-0.1.1b20200704
Successfully installed torch-model-archiver-0.1.1b20200704


# Retrieve PyTorch Models

In [59]:
%store -r s3_pytorch_model_path

In [60]:
print(s3_pytorch_model_path)

s3://sagemaker-us-east-1-835319576252/models/pytorch/pytorch_model.pt


In [61]:
%store -r s3_transformer_pytorch_model_path

In [62]:
print(s3_transformer_pytorch_model_path)

s3://sagemaker-us-east-1-835319576252/models/transformer-pytorch/


In [63]:
!aws s3 cp --recursive $s3_transformer_pytorch_model_path ./Transformer_model/

download: s3://sagemaker-us-east-1-835319576252/models/transformer-pytorch/config.json to Transformer_model/config.json
download: s3://sagemaker-us-east-1-835319576252/models/transformer-pytorch/pytorch_model.bin to Transformer_model/pytorch_model.bin


# Create TorchServe Model Archive File

Once setup_config.json, sample_text.txt and index_to_name.json are set up properly, we can package the model and start serving it. The artifacts related to each operation mode (such as sample_text.txt and index_to_name.json) can be placed in their respective folders.
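As a rough illustration of what index_to_name.json holds, the sketch below builds a mapping from predicted class index to a human-readable label. The five star-rating labels are a hypothetical example, and `build_index_to_name` is our own helper; check the actual file in `Seq_classification_artifacts` for the real mapping.

```python
import json


def build_index_to_name(labels):
    """Map class indices to label names; JSON object keys must be strings."""
    return {str(index): str(label) for index, label in enumerate(labels)}


# Hypothetical labels: star ratings 1 through 5, in class-index order.
index_to_name = build_index_to_name([1, 2, 3, 4, 5])
print(json.dumps(index_to_name, indent=2))
```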

In [64]:
# !torch-model-archiver \
#    --model-name "bert" \
#    --version 1.0 \
#    --serialized-file ./bert_model/pytorch_model.bin \
#    --extra-files "./bert_model/config.json" \
#    --handler "./transformers_classifier_torchserve_handler.py"

In [65]:
model_name = 'DistilBertForSequenceClassification'

In [66]:
!torch-model-archiver \
    --model-name $model_name \
    --version 1.0 \
    --serialized-file Transformer_model/pytorch_model.bin \
    --handler ./src_torchserve/Transformer_handler_generalized.py \
    --extra-files "./Transformer_model/config.json,./src_torchserve/setup_config.json,./src_torchserve/Seq_classification_artifacts/index_to_name.json"

In [67]:
!ls ./*.mar

./DistilBertForSequenceClassification.mar
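If you prefer to drive the archiver from Python instead of a shell cell, a minimal sketch is to assemble the same flags as an argument list. The helper `build_archiver_command` is hypothetical; it only uses the flags already shown in the cell above.

```python
import shlex


def build_archiver_command(model_name, version, serialized_file, handler, extra_files):
    """Assemble the torch-model-archiver call as an argument list."""
    return [
        "torch-model-archiver",
        "--model-name", model_name,
        "--version", str(version),
        "--serialized-file", serialized_file,
        "--handler", handler,
        # The archiver takes extra files as one comma-separated string.
        "--extra-files", ",".join(extra_files),
    ]


command = build_archiver_command(
    model_name="DistilBertForSequenceClassification",
    version="1.0",
    serialized_file="Transformer_model/pytorch_model.bin",
    handler="./src_torchserve/Transformer_handler_generalized.py",
    extra_files=[
        "./Transformer_model/config.json",
        "./src_torchserve/setup_config.json",
        "./src_torchserve/Seq_classification_artifacts/index_to_name.json",
    ],
)
# Print a copy-pasteable shell form of the command.
print(" ".join(shlex.quote(part) for part in command))
```

Passing `command` to `subprocess.run(command, check=True)` would run the archiver and raise if it exits with an error, which is easier to notice than a failed `!` cell.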


# Registering the Model on TorchServe and Running Inference

To register the model on TorchServe using the above model archive file, we run the following commands:

In [68]:
!mkdir -p ./model_store
!mv ./DistilBertForSequenceClassification.mar ./model_store/

# TorchServe requires Java 11, which is not installed by default on SageMaker notebook instances
https://tecadmin.net/install-java-on-amazon-linux/

In [69]:
# %%bash

# sudo amazon-linux-extras install java-openjdk11

In [70]:
# %%bash 

# torchserve \
# --start \
# --model-store ./model_store \
# --models distilbert-pytorch=DistilBertForSequenceClassification.mar &

## To run inference using our registered model, open a new terminal and run:

In [71]:
# !curl -X POST http://127.0.0.1:8080/predictions/distilbert-pytorch -T ./src_torchserve/Seq_classification_artifacts/sample_text.txt
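The same request can be sketched in Python with the standard library. This assumes TorchServe is running locally on port 8080 with the model registered as `distilbert-pytorch`, as in the commented cells above; the helper name `build_prediction_request` is ours.

```python
import urllib.request


def build_prediction_request(model_name, payload, host="http://127.0.0.1:8080"):
    """Build a POST request for the TorchServe predictions API."""
    url = "{}/predictions/{}".format(host, model_name)
    # Like `curl -X POST ... -T file`, the raw bytes are sent as the request body.
    return urllib.request.Request(url, data=payload, method="POST")


def predict(model_name, payload, host="http://127.0.0.1:8080"):
    """Send the request and return the response body as text (needs a live server)."""
    request = build_prediction_request(model_name, payload, host)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the server up, `predict('distilbert-pytorch', open('./src_torchserve/Seq_classification_artifacts/sample_text.txt', 'rb').read())` would be the counterpart of the curl call.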

# Prepare the Model for SageMaker Deployment

## Upload .mar to S3

In [72]:
torchserve_mar = 'DistilBertForSequenceClassification.mar'

In [73]:
s3_torchserve_mar = 's3://{}/models/torchserve/{}'.format(bucket, torchserve_mar)
print(s3_torchserve_mar)

s3://sagemaker-us-east-1-835319576252/models/torchserve/DistilBertForSequenceClassification.mar


In [74]:
!aws s3 cp ./model_store/$torchserve_mar $s3_torchserve_mar

upload: model_store/DistilBertForSequenceClassification.mar to s3://sagemaker-us-east-1-835319576252/models/torchserve/DistilBertForSequenceClassification.mar
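The `aws s3 cp` call can also be done from Python. The sketch below splits an `s3://` URI into bucket and key and hands them to a boto3 S3 client's `upload_file` method. The client is passed in as an argument (for example `boto3.client('s3')`), and both helper names are ours.

```python
from urllib.parse import urlparse


def split_s3_uri(s3_uri):
    """Split 's3://bucket/path/to/key' into (bucket, key)."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("Not an S3 URI: {!r}".format(s3_uri))
    return parsed.netloc, parsed.path.lstrip("/")


def upload_to_s3(s3_client, local_path, s3_uri):
    """Upload a local file to the location named by an S3 URI."""
    bucket, key = split_s3_uri(s3_uri)
    # upload_file(Filename, Bucket, Key) is the boto3 S3 client call.
    s3_client.upload_file(local_path, bucket, key)
    return bucket, key
```

In this notebook that would look like `upload_to_s3(boto3.client('s3'), './model_store/' + torchserve_mar, s3_torchserve_mar)`.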


In [75]:
%store s3_torchserve_mar

Stored 's3_torchserve_mar' (str)


In [76]:
!tar cvfz ./DistilBertForSequenceClassification.tar.gz \
    ./model_store/DistilBertForSequenceClassification.mar


./model_store/DistilBertForSequenceClassification.mar
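For reference, the `tar cvfz` step has a direct equivalent in Python's `tarfile` module. This is a minimal sketch with a hypothetical helper name; it keeps the `model_store/` prefix inside the tarball to mirror the layout produced by the command above (without the leading `./`).

```python
import os
import tarfile


def make_model_tarball(mar_path, tarball_path):
    """Write a gzip-compressed tarball containing model_store/<name>.mar."""
    arcname = "model_store/" + os.path.basename(mar_path)
    with tarfile.open(tarball_path, "w:gz") as tar:
        # arcname controls the path stored inside the archive.
        tar.add(mar_path, arcname=arcname)
    return arcname
```

Keeping the archive layout explicit matters because the serving container has to find the .mar at a known path after the tarball is unpacked.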


In [77]:
s3_torchserve_tar = 's3://{}/models/torchserve/DistilBertForSequenceClassification.tar.gz'.format(bucket)

In [78]:
!aws s3 cp ./DistilBertForSequenceClassification.tar.gz $s3_torchserve_tar

upload: ./DistilBertForSequenceClassification.tar.gz to s3://sagemaker-us-east-1-835319576252/models/torchserve/DistilBertForSequenceClassification.tar.gz


In [79]:
%store s3_torchserve_tar

Stored 's3_torchserve_tar' (str)


### Create an Amazon ECR registry
Create a new Docker container registry for your TorchServe container images.

In [80]:
registry_name = 'torchserve'
!aws ecr create-repository --repository-name {registry_name}


An error occurred (RepositoryAlreadyExistsException) when calling the CreateRepository operation: The repository with name 'torchserve' already exists in the registry with id '835319576252'
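The `RepositoryAlreadyExistsException` above is harmless when re-running the notebook, but it can be avoided. A minimal sketch, assuming a boto3 ECR client is passed in (for example `boto3.client('ecr')`): try to create the repository, and fall back to looking it up if it already exists. The helper name `ensure_ecr_repository` is ours.

```python
def ensure_ecr_repository(ecr_client, repository_name):
    """Create the ECR repository if needed and return its URI."""
    try:
        response = ecr_client.create_repository(repositoryName=repository_name)
        return response["repository"]["repositoryUri"]
    except ecr_client.exceptions.RepositoryAlreadyExistsException:
        # The repository is already there, so look up its URI instead.
        response = ecr_client.describe_repositories(
            repositoryNames=[repository_name]
        )
        return response["repositories"][0]["repositoryUri"]
```

Calling `ensure_ecr_repository(boto3.client('ecr'), registry_name)` then gives the same result on the first run and on every later run.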


### Build a TorchServe Docker container and push it to Amazon ECR

In [81]:
image_label = 'v1'
image = f'{account_id}.dkr.ecr.{region}.amazonaws.com/{registry_name}:{image_label}'

In [82]:
!docker build -t {registry_name}:{image_label} -f ./src_torchserve/Dockerfile ./src_torchserve
!$(aws ecr get-login --no-include-email --region {region})
!docker tag {registry_name}:{image_label} {image}
!docker push {image}

Sending build context to Docker daemon  22.71MB
Step 1/16 : FROM ubuntu:18.04
 ---> 8e4ce0a6ce69
Step 2/16 : ENV PYTHONUNBUFFERED TRUE
 ---> Using cache
 ---> c98cb06ff9fc
Step 3/16 : RUN apt-get update &&     DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y     fakeroot     ca-certificates     dpkg-dev     g++     python3-dev     openjdk-11-jdk     curl     vim     && rm -rf /var/lib/apt/lists/*     && cd /tmp     && curl -O https://bootstrap.pypa.io/get-pip.py     && python3 get-pip.py
 ---> Using cache
 ---> 953456cd5d70
Step 4/16 : RUN update-alternatives --install /usr/bin/python python /usr/bin/python3 1
 ---> Using cache
 ---> 7131decc103f
Step 5/16 : RUN update-alternatives --install /usr/local/bin/pip pip /usr/local/bin/pip3 1
 ---> Using cache
 ---> 0192cbc4a2d5
Step 6/16 : RUN pip install --no-cache-dir psutil                 --no-cache-dir torch                 --no-cache-dir torchvision
 ---> Using cache
 ---> 5563022f6670
Step 7/16 : ADD serve ser

### Deploy endpoint and make prediction using Amazon SageMaker SDK

In [None]:
print(s3_torchserve_tar)

In [84]:
from sagemaker.model import Model
from sagemaker.predictor import RealTimePredictor

sm_model_name = 'distilbert-pytorch'

torchserve_model = Model(model_data = s3_torchserve_tar, 
                         image = image,
                         role  = role,
                         predictor_cls=RealTimePredictor,
                         name  = sm_model_name)

In [None]:
import time

endpoint_name = 'torchserve-endpoint-' + time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())

predictor = torchserve_model.deploy(instance_type='ml.c5.4xlarge',
                                    initial_instance_count=1,
                                    endpoint_name = endpoint_name)

Using already existing model: distilbert-pytorch


--

# _Wait Until the ^^ Endpoint ^^ is Deployed_
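Instead of watching the console, the wait can be scripted. The sketch below polls `describe_endpoint` on a boto3 SageMaker client (such as the `sm` client created at the top of this notebook) until the endpoint reports `InService`. The helper name, polling interval and timeout are our own choices.

```python
import time


def wait_for_endpoint(sm_client, endpoint_name, poll_seconds=30, timeout_seconds=1800):
    """Block until the endpoint is InService; raise if it fails or times out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        description = sm_client.describe_endpoint(EndpointName=endpoint_name)
        status = description["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError("Endpoint {} failed to deploy".format(endpoint_name))
        if time.monotonic() >= deadline:
            raise TimeoutError("Endpoint {} still {}".format(endpoint_name, status))
        time.sleep(poll_seconds)
```

Here that would be `wait_for_endpoint(sm, endpoint_name)`. boto3 also ships a built-in waiter, `sm.get_waiter('endpoint_in_service').wait(EndpointName=endpoint_name)`, which does much the same thing.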

## Test the TorchServe hosted model

In [91]:
#file_name = './src_torchserve/sample_text.txt'
#with open(file_name, 'rb') as f:
#    payload = f.read()
#    payload = payload
#    
#response = predictor.predict(data=payload)
#print(*json.loads(response), sep = '\n')

In [53]:
import json

reviews = ["This is great!",
           "This is terrible."]

# Send one review per request and pair each prediction with its input text.
for review in reviews:
    predicted_class = predictor.predict(review)
    print('[Predicted Star Rating: {}]'.format(predicted_class), review)

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from model with message "{
  "code": 400,
  "type": "BadRequestException",
  "message": "Parameter model_name is required."
}
". See https://us-east-1.console.aws.amazon.com/cloudwatch/home?region=us-east-1#logEventViewer:group=/aws/sagemaker/Endpoints/torchserve-endpoint-2020-07-04-20-42-31 in account 835319576252 for more information.