# Multi Model Endpoint Deployment
In this notebook I import translation models from Hugging Face's pre-trained model hub and save them to an S3 bucket, where a single SageMaker endpoint (a multi model endpoint) can load and serve them. I selected the top 20 languages based on two weeks of streaming Twitter data. I import a model for 19 of the top 20 languages, skipping Portuguese because it did not have its own Portuguese-to-English model.

## Table of contents: <a class="anchor" id="home"></a>

* [1. Set up environment](#env)
  * [1.1 Import modules](#modules)
  * [1.2 Initialize session objects](#sess)
* [2. Import models](#import)
  * [2.1 Prepare model variables](#names)
  * [2.2 Download, compress and send models to S3 bucket](#package)
* [3. Create SageMaker multi model endpoint](#mme)
  * [3.1 Create SageMaker multi data model](#multidatamodel)
  * [3.2 Deploy multi model endpoint](#deploy)

## 1. Set up environment <a class="anchor" id="env"></a>
[Back to top](#home)

#### 1.1 Import modules <a class="anchor" id="modules"></a>
[Back to top](#home)

In [17]:
# Install required packages (-U also upgrades them if already installed)
!pip install -U transformers
!pip install -U sagemaker
!pip install sentencepiece



In [18]:
# Import packages
# Utility
import tarfile
import sentencepiece

# AWS
import boto3
# AWS SageMaker
import sagemaker
from sagemaker.huggingface import HuggingFaceModel
from sagemaker.serializers import JSONSerializer
from sagemaker.deserializers import JSONDeserializer
from sagemaker.multidatamodel import MultiDataModel
from sagemaker.predictor import Predictor

# Hugging Face
import transformers
from transformers import MarianMTModel, MarianTokenizer



#### 1.2 Initialize session objects <a class="anchor" id="sess"></a>
[Back to top](#home)

In [19]:
# Set up SageMaker session object
sess = sagemaker.Session()


# Name of the S3 bucket used for uploading data, models, and logs
# Leave as None to fall back to the session's default bucket
sagemaker_session_bucket = None

# Get the execution role; outside of SageMaker, look it up by name through IAM
try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client('iam')
    role = iam.get_role(RoleName = 'sagemaker_execution_role')['Role']['Arn']

if sagemaker_session_bucket is None:
    # Set to default bucket if no bucket name given
    sagemaker_session_bucket = sess.default_bucket()

# Recreate the session so that it uses the chosen bucket
sess = sagemaker.Session(default_bucket = sagemaker_session_bucket)

# Read the region the session is running in
region = sess.boto_region_name

# Initialize sagemaker client
sm_client = boto3.client('sagemaker')


print(f'sagemaker role arn: {role}')
print(f'sagemaker bucket: {sess.default_bucket()}')
print(f'sagemaker session region: {region}')

sagemaker role arn: arn:aws:iam::034604206629:role/service-role/AmazonSageMaker-ExecutionRole-20220520T212632
sagemaker bucket: sagemaker-ca-central-1-034604206629
sagemaker session region: ca-central-1


## 2. Import models <a class="anchor" id="import"></a>
[Back to top](#home)

#### 2.1 Prepare model variables <a class="anchor" id="names"></a>
(Portuguese is omitted from top 20)<br>
[Back to top](#home)

In [20]:
# Non-English language ISO 639-1 codes
non_en_langs = ['de', 'fr', 'it', 'es', 'ru', 'tr', 'pl', 'ja', 'uk',
                'sv', 'fi', 'tl', 'nl', 'cs', 'da', 'zh', 'ar']
# Destination language for every translator
dest_lang = 'en'

# Map each language code to the Hugging Face model ID to retrieve from the hub
model_dict = {lang:{'HF_MODEL_ID':f'Helsinki-NLP/opus-mt-{lang}-{dest_lang}'} for lang in non_en_langs}
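The naming patterns used throughout this notebook (hub ID, local directory, archive name, S3 location) can be collected in one place. The following is a minimal sketch only: the helper `artifact_names` and the bucket name `example-bucket` are hypothetical and do not appear elsewhere in the notebook.

```python
# Sketch: derive every name the notebook uses for one source language.
# `artifact_names` and 'example-bucket' are hypothetical, for illustration only.

def artifact_names(lang, dest_lang='en', bucket='example-bucket',
                   prefix='hugging_face/translator'):
    """Return the hub ID, local directory, archive name and S3 URI for one language."""
    tarfile_name = f'translator_{lang}_to_{dest_lang}.tar.gz'
    return {
        'HF_MODEL_ID': f'Helsinki-NLP/opus-mt-{lang}-{dest_lang}',
        'local_dir': f'translator_model_{lang}_to_{dest_lang}/',
        'tarfile_name': tarfile_name,
        's3_uri': f's3://{bucket}/{prefix}/{tarfile_name}',
    }


names = artifact_names('de')
print(names['HF_MODEL_ID'])
print(names['s3_uri'])
```

Keeping the patterns in one function makes it harder for the archive name used at upload time to drift from the name requested at inference time.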



#### 2.2 Download, compress and send models to S3 bucket <a class="anchor" id="package"></a>
[Back to top](#home)

In [21]:
# Download models
s3_folder_prefix = 'hugging_face/translator'
dest_lang = 'en'

# Loop through each language, download the model and tokenizer from the HF hub, save them as a tar file, and transfer the tar file to an S3 bucket
# This run covers only non_en_langs[12:] (nl, cs, da, zh, ar); change the slice to process other languages
for lang in non_en_langs[12:]:
    # Load pre-trained model from hugging face hub for language lang
    model = MarianMTModel.from_pretrained(model_dict[lang]['HF_MODEL_ID'])
    tokenizer = MarianTokenizer.from_pretrained(model_dict[lang]['HF_MODEL_ID'])

    # Save model and tokenizer to local directory
    model.save_pretrained(f'translator_model_{lang}_to_{dest_lang}/')
    tokenizer.save_pretrained(f'translator_model_{lang}_to_{dest_lang}/')
    
    # Create file name for tar file
    tarfile_name = f'translator_{lang}_to_{dest_lang}.tar.gz'
    
    # Save the model directory as a gzipped tar file (the with block closes the file)
    with tarfile.open(tarfile_name, 'w:gz') as f:
        f.add(f'translator_model_{lang}_to_{dest_lang}/', arcname='.')
    
    # Transfer tar file to session's bucket with selected s3 bucket folder
    ! aws s3 cp "$tarfile_name" s3://"$sagemaker_session_bucket"/"$s3_folder_prefix"/"$tarfile_name"

upload: ./translator_nl_to_en.tar.gz to s3://sagemaker-ca-central-1-034604206629/hugging_face/translator/translator_nl_to_en.tar.gz
upload: ./translator_cs_to_en.tar.gz to s3://sagemaker-ca-central-1-034604206629/hugging_face/translator/translator_cs_to_en.tar.gz
upload: ./translator_da_to_en.tar.gz to s3://sagemaker-ca-central-1-034604206629/hugging_face/translator/translator_da_to_en.tar.gz
upload: ./translator_zh_to_en.tar.gz to s3://sagemaker-ca-central-1-034604206629/hugging_face/translator/translator_zh_to_en.tar.gz
upload: ./translator_ar_to_en.tar.gz to s3://sagemaker-ca-central-1-034604206629/hugging_face/translator/translator_ar_to_en.tar.gz
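The `arcname='.'` argument in the loop above places the model files at the root of the archive instead of inside a nested directory. As far as I understand, the serving container expects that layout. The sketch below checks the layout on a stand-in directory; the file names `config.json` and `weights.bin` and the language code `xx` are placeholders, not the files that `save_pretrained` actually writes.

```python
import os
import tarfile
import tempfile

# Sketch: archive a tiny stand-in model directory the same way as the loop above,
# then list what ended up inside the archive.
with tempfile.TemporaryDirectory() as tmp:
    model_dir = os.path.join(tmp, 'translator_model_xx_to_en')
    os.makedirs(model_dir)

    # Placeholder files standing in for the saved model and tokenizer
    for file_name in ('config.json', 'weights.bin'):
        with open(os.path.join(model_dir, file_name), 'w') as fh:
            fh.write('{}')

    tar_path = os.path.join(tmp, 'translator_xx_to_en.tar.gz')
    with tarfile.open(tar_path, 'w:gz') as tar:
        tar.add(model_dir, arcname='.')

    with tarfile.open(tar_path, 'r:gz') as tar:
        member_names = tar.getnames()

# Normalise the member paths and drop the entry for the root directory itself
root_files = sorted(
    os.path.normpath(name) for name in member_names
    if os.path.normpath(name) != '.'
)
print(root_files)
```

If the printed paths contained a directory component such as `translator_model_xx_to_en/config.json`, the files would be nested one level too deep.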


## 3. Create SageMaker multi model endpoint <a class="anchor" id="mme"></a>
[Back to top](#home)

#### 3.1 Create SageMaker multi data model  <a class="anchor" id="multidatamodel"></a>
[Back to top](#home)

In [22]:
# Image URI to host model
# This is the URI of the Docker container image (stored in Amazon ECR) that serves the models. The URI components can be found at this link: 
# https://github.com/aws/sagemaker-python-sdk/blob/e0b9d38e1e3b48647a02af23c4be54980e53dc61/src/sagemaker/image_uri_config/huggingface.json
URI = '763104351884.dkr.ecr.ca-central-1.amazonaws.com/huggingface-pytorch-inference:1.10.2-transformers4.17.0-cpu-py38-ubuntu20.04'

# Environment variables passed to the container
# HF_TASK tells the Hugging Face inference toolkit which transformers pipeline task to run
HUB = {'HF_TASK':'translation'}

# Give a name to the model to retrieve later
MODEL_NAME = 'hf-translators'

# Create Multi data model
mme = MultiDataModel(
    name = MODEL_NAME,
    model_data_prefix = f's3://{sagemaker_session_bucket}/{s3_folder_prefix}/',
    image_uri = URI,
    env = HUB,
    predictor_cls = Predictor,
    role = role,
    sagemaker_session = sess
    )
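To make the parts of the image URI easier to read, here is a small sketch that reassembles the URI used above from its components. `build_hf_inference_uri` is a hypothetical helper written for this notebook, not part of the SageMaker SDK, and the split into components is my reading of the URI string.

```python
# Sketch: rebuild the image URI from its parts to show what each segment means.
# `build_hf_inference_uri` is hypothetical and only mirrors the string used above.

def build_hf_inference_uri(account, region, pytorch_version, transformers_version,
                           device='cpu', py_version='py38', os_tag='ubuntu20.04'):
    # Registry host: the AWS account that owns the image, plus the region
    registry = f'{account}.dkr.ecr.{region}.amazonaws.com'
    # Repository name for the Hugging Face PyTorch inference image
    repository = 'huggingface-pytorch-inference'
    # Tag: framework versions, device type, Python version and base OS
    tag = (f'{pytorch_version}-transformers{transformers_version}'
           f'-{device}-{py_version}-{os_tag}')
    return f'{registry}/{repository}:{tag}'


uri = build_hf_inference_uri('763104351884', 'ca-central-1', '1.10.2', '4.17.0')
print(uri)
```

The SDK also has a `sagemaker.image_uris.retrieve` function that may be a more robust way to look the URI up; I have not used it here.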

<i>Review files registered in multi data model</i>

In [23]:
for model in mme.list_models():
    print(model)

translator_ar_to_en.tar.gz
translator_cs_to_en.tar.gz
translator_da_to_en.tar.gz
translator_de_to_en.tar.gz
translator_es_to_en.tar.gz
translator_fr_to_en.tar.gz
translator_it_to_en.tar.gz
translator_ja_to_en.tar.gz
translator_nl_to_en.tar.gz
translator_pl_to_en.tar.gz
translator_ru_to_en.tar.gz
translator_tl_to_en.tar.gz
translator_tr_to_en.tar.gz
translator_uk_to_en.tar.gz
translator_zh_to_en.tar.gz
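The listing above has fewer entries than `non_en_langs`. A quick way to see which archives are missing is to compare the two. This sketch copies the language list and the printed listing by hand, so it runs without AWS access; in the notebook, `registered` could instead come from `mme.list_models()`.

```python
# Sketch: find languages whose archive is not registered under the S3 prefix.
non_en_langs = ['de', 'fr', 'it', 'es', 'ru', 'tr', 'pl', 'ja', 'uk',
                'sv', 'fi', 'tl', 'nl', 'cs', 'da', 'zh', 'ar']

# Copied from the listing printed above
registered = {
    'translator_ar_to_en.tar.gz', 'translator_cs_to_en.tar.gz',
    'translator_da_to_en.tar.gz', 'translator_de_to_en.tar.gz',
    'translator_es_to_en.tar.gz', 'translator_fr_to_en.tar.gz',
    'translator_it_to_en.tar.gz', 'translator_ja_to_en.tar.gz',
    'translator_nl_to_en.tar.gz', 'translator_pl_to_en.tar.gz',
    'translator_ru_to_en.tar.gz', 'translator_tl_to_en.tar.gz',
    'translator_tr_to_en.tar.gz', 'translator_uk_to_en.tar.gz',
    'translator_zh_to_en.tar.gz',
}

# Languages whose expected archive name does not appear in the listing
missing = sorted(
    lang for lang in non_en_langs
    if f'translator_{lang}_to_en.tar.gz' not in registered
)
print(missing)  # ['fi', 'sv']
```

Rerunning the download loop in section 2.2 with a slice that covers the missing languages should upload their archives.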


#### 3.2 Deploy multi model endpoint  <a class="anchor" id="deploy"></a>
[Back to top](#home)

In [24]:
# Deploy multi data model as a multi model endpoint (MME)
predictor = mme.deploy(
    initial_instance_count = 1,
    instance_type = 'ml.m5.large',
    serializer = JSONSerializer(),
    deserializer = JSONDeserializer(),
    endpoint_name = MODEL_NAME,
    wait = False
)
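Once the endpoint is in service (the deployment above returns immediately because of `wait = False`), each request has to name the archive that should handle it. The sketch below shows one way to do that through the `target_model` argument of `predict`. The `translate` helper and the `EchoPredictor` stand-in are hypothetical; the stand-in only exists so the sketch runs without a live endpoint, and the `{'inputs': ...}` payload shape is the convention I understand the Hugging Face inference container to use.

```python
# Sketch: route one request to a specific model on the multi model endpoint.

def translate(predictor, text, lang, dest_lang='en'):
    """Send text to the archive that matches the source language."""
    # Must match the archive name stored under the S3 prefix
    target_model = f'translator_{lang}_to_{dest_lang}.tar.gz'
    payload = {'inputs': text}
    return predictor.predict(data=payload, target_model=target_model)


class EchoPredictor:
    """Hypothetical stand-in for the real predictor; it only echoes the request."""

    def predict(self, data, target_model=None):
        return {'target_model': target_model, 'payload': data}


result = translate(EchoPredictor(), 'Guten Morgen', 'de')
print(result)
```

With the real `predictor` returned by `mme.deploy`, the call would be `translate(predictor, 'Guten Morgen', 'de')`. The first request to a given model is usually slower, since the endpoint has to load that archive from S3 first.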