# FAIRSeq in Amazon SageMaker: Pre-trained English to French translation model

The Facebook AI Research (FAIR) lab has made its state-of-the-art sequence-to-sequence models available through the [FAIRSeq toolkit](https://github.com/pytorch/fairseq).

In this notebook, we will show you how to serve a pre-trained English-to-French translation model based on a fully convolutional architecture. For more information on the model, please refer to the [FAIRSeq documentation](https://github.com/pytorch/fairseq#translation).

## Download pre-trained model

FAIRSeq hosts its pre-trained models in its own Amazon S3 buckets, listed [here](https://github.com/pytorch/fairseq#pre-trained-models). The models are archived as .tar.bz2, so we need to repackage them as .tar.gz, the format Amazon SageMaker expects for model artifacts.

### Convert archive

In [None]:
%%sh

wget https://s3.amazonaws.com/fairseq-py/models/wmt14.v2.en-fr.fconv-py.tar.bz2 

tar xvjf wmt14.v2.en-fr.fconv-py.tar.bz2 > /dev/null
cd wmt14.en-fr.fconv-py
mv model.pt checkpoint_best.pt

tar czvf wmt14.en-fr.fconv-py.tar.gz checkpoint_best.pt dict.en.txt dict.fr.txt bpecodes README.md > /dev/null

The pre-trained model has been downloaded and converted. Amazon SageMaker reads model artifacts from Amazon S3, so we'll upload the archive there first.
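
Before uploading, it can be worth checking that the repackaged archive really contains the files listed in the `tar` command above. The following is a minimal sketch using only the Python standard library; the helper name `missing_from_archive` and the `EXPECTED_FILES` constant are hypothetical and not part of FAIRSeq or the SageMaker SDK.

```python
import tarfile

# File names packed into the archive by the conversion step above.
EXPECTED_FILES = {"checkpoint_best.pt", "dict.en.txt", "dict.fr.txt", "bpecodes", "README.md"}


def missing_from_archive(archive_path, expected=EXPECTED_FILES):
    """Return the sorted list of expected file names absent from a .tar.gz archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        present = set(archive.getnames())
    return sorted(set(expected) - present)
```

Calling `missing_from_archive('wmt14.en-fr.fconv-py/wmt14.en-fr.fconv-py.tar.gz')` should return an empty list if the conversion step succeeded.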

### Upload data to Amazon S3

In [None]:
import sagemaker

sagemaker_session = sagemaker.Session()

# Extract the region and account ID from the underlying boto3 session
region = sagemaker_session.boto_session.region_name
account = sagemaker_session.boto_session.client('sts').get_caller_identity().get('Account')

bucket = sagemaker_session.default_bucket()
prefix = 'sagemaker/DEMO-pytorch-fairseq/pre-trained-models'

role = sagemaker.get_execution_role()

In [None]:
trained_model_location = sagemaker_session.upload_data(
    path='wmt14.en-fr.fconv-py/wmt14.en-fr.fconv-py.tar.gz',
    bucket=bucket,
    key_prefix=prefix)
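
The value stored in `trained_model_location` is an S3 URI that is later passed to the model as `model_data`. As a rough sketch, and assuming that `upload_data` places a single file under `key_prefix` using the file's base name, the expected URI can be reconstructed as follows. The helper `expected_s3_uri` is hypothetical and only illustrates the layout; the value returned by `upload_data` remains the source of truth.

```python
import posixpath


def expected_s3_uri(local_path, bucket, key_prefix):
    """Build the S3 URI at which a single uploaded file is assumed to land."""
    # Assumption: the object key is "<key_prefix>/<file base name>".
    file_name = posixpath.basename(local_path)
    return "s3://{}/{}/{}".format(bucket, key_prefix, file_name)
```

Comparing `expected_s3_uri('wmt14.en-fr.fconv-py/wmt14.en-fr.fconv-py.tar.gz', bucket, prefix)` with `trained_model_location` is a quick way to confirm the archive went where you expect.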

## Build FAIRSeq serving container

Next, we need to register a Docker image that contains the FAIRSeq code. Amazon SageMaker pulls this image at inference time to generate predictions from the pre-trained model we downloaded.

In [None]:
%%sh
chmod +x create_container.sh 

./create_container.sh pytorch-fairseq-serve

The FAIRSeq serving image has been pushed to Amazon ECR, the registry from which Amazon SageMaker will pull the image to serve predictions.

## Hosting the pre-trained model for inference

We first need to define a base JSONPredictor class that will help us send prediction requests to the model once it is hosted on an Amazon SageMaker endpoint.

In [None]:
from sagemaker.predictor import RealTimePredictor, json_serializer, json_deserializer

class JSONPredictor(RealTimePredictor):
    def __init__(self, endpoint_name, sagemaker_session):
        super(JSONPredictor, self).__init__(endpoint_name, sagemaker_session, json_serializer, json_deserializer)
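
To make the role of the serializer and deserializer more concrete, here is a minimal sketch of a JSON round trip using only the standard `json` module. The helpers `to_request_body` and `from_response_body` are hypothetical; the SDK's `json_serializer` and `json_deserializer` may handle more input types, and the sketch assumes the endpoint replies with a JSON-encoded string.

```python
import json


def to_request_body(sentence):
    """Encode an English sentence as the JSON text sent to the endpoint."""
    return json.dumps(sentence)


def from_response_body(body):
    """Decode the JSON text returned by the endpoint (assumed to be a JSON string)."""
    return json.loads(body)


print(to_request_body("I love translation"))  # prints "I love translation" (with the quotes)
```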

We can now use the Model class to wrap the model artifacts (the pre-trained model) and deploy them on a CPU instance. Let's use an `ml.m5.xlarge`.

In [None]:
from sagemaker import Model

image = account + ".dkr.ecr." + region + ".amazonaws.com/pytorch-fairseq-serve:latest"

model = Model(model_data=trained_model_location,
              role=role,
              image=image,
              predictor_cls=JSONPredictor,
             )
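
The `image` string above follows the usual Amazon ECR naming pattern of account, region, repository and tag. As a small sketch, the same string can be built with a helper; `ecr_image_uri` is a hypothetical name, and the pattern assumes the standard `amazonaws.com` domain used by most AWS regions.

```python
def ecr_image_uri(account, region, repository, tag="latest"):
    """Build an ECR image URI, assuming the standard amazonaws.com domain."""
    return "{}.dkr.ecr.{}.amazonaws.com/{}:{}".format(account, region, repository, tag)
```

With the variables defined earlier, `ecr_image_uri(account, region, "pytorch-fairseq-serve")` yields the same value as `image`.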

In [None]:
predictor = model.deploy(initial_instance_count=1, instance_type='ml.m5.xlarge')

Now it's your turn to play. Input a sentence in English and get its French translation by simply calling `predict`!

In [None]:
import html

result = predictor.predict("I love translation")
# Need to unescape as some characters are escaped HTML-style
print(html.unescape(result))
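
To illustrate why the unescape step matters, here is a small standalone example. The escaped string is made up for illustration and is not actual output from the model; it only shows how `html.unescape` turns HTML entities such as `&apos;` and `&amp;` back into plain characters.

```python
import html

# Hypothetical escaped text, similar in form to what an endpoint might return.
escaped = "J&apos;aime la traduction &amp; l&apos;apprentissage"

print(html.unescape(escaped))  # prints: J'aime la traduction & l'apprentissage
```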

Once you're done getting predictions, remember to delete your endpoint so that it stops running when you no longer need it.

## Delete endpoint

In [None]:
model.sagemaker_session.delete_endpoint(predictor.endpoint)

Voila! For more information, you can check out the [FAIRSeq toolkit homepage](https://github.com/pytorch/fairseq). 