# AWS Machine Learning Purpose-built Accelerators Tutorial
## Learn how to use [AWS Trainium](https://aws.amazon.com/machine-learning/trainium/) and [AWS Inferentia](https://aws.amazon.com/machine-learning/inferentia/) with [Amazon SageMaker](https://aws.amazon.com/sagemaker/) to optimize your ML workloads
## Part 3/3 - Compiling and deploying a BERT model to AWS Inferentia2 with SageMaker + [Hugging Face Optimum Neuron](https://huggingface.co/docs/optimum-neuron/index)

**SageMaker Studio kernel: PyTorch 1.13 Python 3.9 CPU - ml.t3.medium**

In this tutorial, you'll learn how to compile a model for AWS Inferentia and then deploy it to a SageMaker real-time endpoint powered by AWS Inferentia2. First, we'll kick off a SageMaker job to compile the model; this only needs to be done once. After that, we can deploy the model to a SageMaker endpoint and get some predictions.

In section 2, you extract metadata from the Optimum Neuron API and render a table of the currently tested and supported models. Similar models that are not listed may also be compatible, but you need to verify that yourself. The table tells you which models can be selected for deployment. If you also need to fine-tune your model, check the similar table in the **Part 2** notebook to see which models can be fine-tuned on AWS Trainium using HF Optimum Neuron. That way you can plan your end-to-end solution and start implementing it right away.
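As a rough illustration of how such a support table can be produced, the sketch below renders a Markdown matrix from a plain dictionary. The `render_support_table` helper and the `sample` mapping are hypothetical stand-ins: the notebook builds the real table from metadata exposed by Optimum Neuron, which is not reproduced here.

```python
from typing import Dict, List


def render_support_table(supported: Dict[str, List[str]]) -> str:
    """Render a model-by-task support matrix as a Markdown table.

    `supported` maps a model type to the tasks it supports. The data is
    supplied by the caller; nothing here queries the Optimum Neuron API.
    """
    # Collect every task once, in order of first appearance, for the columns.
    tasks: List[str] = []
    for model_tasks in supported.values():
        for task in model_tasks:
            if task not in tasks:
                tasks.append(task)

    header = "| Model | " + " | ".join(tasks) + " |"
    divider = "|:--" * (len(tasks) + 1) + "|"
    rows = []
    for model in sorted(supported):
        cells = ["x" if task in supported[model] else "" for task in tasks]
        rows.append("| " + model + " | " + " | ".join(cells) + " |")
    return "\n".join([header, divider] + rows)


# Illustrative sample only (two rows copied from the training table below).
sample = {
    "gpt2": ["SequenceClassification", "TokenClassification", "QuestionAnswering"],
    "vit": ["MaskedImageModeling", "ImageClassification"],
}
table_md = render_support_table(sample)
print(table_md)
```

In a notebook, the resulting string could be passed to `IPython.display.Markdown` to render it as a table.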

## 1) Install some required packages

In [1]:
%pip install -r requirements.txt

Note: you may need to restart the kernel to use updated packages.


In [2]:
%pip install -U sagemaker

Collecting sagemaker
  Downloading sagemaker-2.216.0-py3-none-any.whl.metadata (14 kB)
Downloading sagemaker-2.216.0-py3-none-any.whl (1.5 MB)
Installing collected packages: sagemaker
  Attempting uninstall: sagemaker
    Found existing installation: sagemaker 2.212.0
    Uninstalling sagemaker-2.212.0:
      Successfully uninstalled sagemaker-2.212.0
Successfully installed sagemaker-2.216.0
Note: you may need to restart the kernel to use updated packages.
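To confirm which version pip actually installed, one option is to read the package metadata with the standard library. This is a minimal sketch; the helper name `installed_version` is hypothetical. Note that `importlib.metadata` reads the version recorded on disk, so a module that was already imported in the running kernel keeps its old code until the kernel is restarted.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the on-disk installed version of `package`, or None if missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Reports the version pip recorded, or None if the package is not installed.
print("sagemaker:", installed_version("sagemaker"))
```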


## 2) Supported models/tasks

Models with **[TP]** after the name support tensor parallelism.
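If you want the tensor-parallel models as a list rather than scanning the table by eye, one option is to search the Markdown source for rows that carry the **[TP]** marker. The sketch below assumes rows shaped like the ones rendered in this section (model name inside an `<a>` tag, marker inside `<b>[TP]</b>`). The helper `tensor_parallel_models` and the shortened sample rows are hypothetical.

```python
import re
from typing import List

# Matches the model name in the first cell of a row: | <a ...>name</a>
_MODEL_NAME = re.compile(r"^\|\s*<a [^>]*>([^<]+)</a>")


def tensor_parallel_models(markdown: str) -> List[str]:
    """Return model names whose first table cell carries the [TP] marker."""
    models: List[str] = []
    for line in markdown.splitlines():
        match = _MODEL_NAME.match(line)
        if not match:
            continue  # header, divider, or non-table line
        first_cell = line.split("|")[1]
        # A model can appear in several tables, so keep each name once.
        if "<b>[TP]</b>" in first_cell and match.group(1) not in models:
            models.append(match.group(1))
    return models


# Shortened, illustrative rows in the same shape as the rendered table.
sample_md = """| Model | SequenceClassification |
|:--|:--|
| <a href="https://huggingface.co/models?search=albert">albert</a> | doc |
| <a href="https://huggingface.co/models?search=bert">bert</a> <font style='color: red;'><b>[TP]</b></font> | doc |
| <a href="https://huggingface.co/models?search=t5">t5</a> <font style='color: red;'><b>[TP]</b></font> | doc |
"""

print(tensor_parallel_models(sample_md))  # ['bert', 't5']
```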

In [3]:
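from IPython.display import Markdown, display

# Pass the path via `filename=` so the file's contents are rendered;
# a missing file then raises an error instead of displaying the path as text.
display(Markdown(filename="../docs/optimum_neuron_models.md"))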

# HF Optimum Neuron - Supported Models
**version: 0.0.21**  
Models marked with <font style='color: red;'><b>[TP]</b></font> support **Tensor Parallelism** for training and inference
## Models/tasks for training
| Model                                                                                                                                                                 | PreTraining                                                                                                                                             | SequenceClassification                                                                                                                                                     | MultipleChoice                                                                                                                                                     | TokenClassification                                                                                                                                                     | QuestionAnswering                                                                                                                                                     | MaskedLM                                                                                                                                                     | ConditionalGeneration                                                                                                                                       | CausalLM                                                                                                                                                     | NextSentencePrediction                                                                                                                                       | MaskedImageModeling                                                                                                                                     | ImageClassification                                                                                                                                     |
|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------|
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=albert">albert</a>                                                | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/albert#transformers.AlbertForPreTraining">doc</a>   | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/albert#transformers.AlbertForSequenceClassification">doc</a>           | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/albert#transformers.AlbertForMultipleChoice">doc</a>           | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/albert#transformers.AlbertForTokenClassification">doc</a>           | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/albert#transformers.AlbertForQuestionAnswering">doc</a>           | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/albert#transformers.AlbertForMaskedLM">doc</a>           |                                                                                                                                                             |                                                                                                                                                              |                                                                                                                                                              |                                                                                                                                                         |                                                                                                                                                         |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=bart">bart</a>                                                    |                                                                                                                                                         | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/bart#transformers.BartForSequenceClassification">doc</a>               |                                                                                                                                                                    |                                                                                                                                                                         | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/bart#transformers.BartForQuestionAnswering">doc</a>               |                                                                                                                                                              | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/bart#transformers.BartForConditionalGeneration">doc</a> | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/bart#transformers.BartForCausalLM">doc</a>               |                                                                                                                                                              |                                                                                                                                                         |                                                                                                                                                         |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=bert">bert</a> <font style='color: red;'><b>[TP]</b></font>       | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/bert#transformers.BertForPreTraining">doc</a>       | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/bert#transformers.BertForSequenceClassification">doc</a>               | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/bert#transformers.BertForMultipleChoice">doc</a>               | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/bert#transformers.BertForTokenClassification">doc</a>               | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/bert#transformers.BertForQuestionAnswering">doc</a>               | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/bert#transformers.BertForMaskedLM">doc</a>               |                                                                                                                                                             |                                                                                                                                                              | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/bert#transformers.BertForNextSentencePrediction">doc</a> |                                                                                                                                                         |                                                                                                                                                         |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=camembert">camembert</a>                                          |                                                                                                                                                         | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/camembert#transformers.CamembertForSequenceClassification">doc</a>     | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/camembert#transformers.CamembertForMultipleChoice">doc</a>     | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/camembert#transformers.CamembertForTokenClassification">doc</a>     | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/camembert#transformers.CamembertForQuestionAnswering">doc</a>     | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/camembert#transformers.CamembertForMaskedLM">doc</a>     |                                                                                                                                                             | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/camembert#transformers.CamembertForCausalLM">doc</a>     |                                                                                                                                                              |                                                                                                                                                         |                                                                                                                                                         |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=distilbert">distilbert</a>                                        |                                                                                                                                                         | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/distilbert#transformers.DistilbertForSequenceClassification">doc</a>   | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/distilbert#transformers.DistilbertForMultipleChoice">doc</a>   | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/distilbert#transformers.DistilbertForTokenClassification">doc</a>   | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/distilbert#transformers.DistilbertForQuestionAnswering">doc</a>   | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/distilbert#transformers.DistilbertForMaskedLM">doc</a>   |                                                                                                                                                             |                                                                                                                                                              |                                                                                                                                                              |                                                                                                                                                         |                                                                                                                                                         |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=electra">electra</a>                                              | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/electra#transformers.ElectraForPreTraining">doc</a> | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/electra#transformers.ElectraForSequenceClassification">doc</a>         | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/electra#transformers.ElectraForMultipleChoice">doc</a>         | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/electra#transformers.ElectraForTokenClassification">doc</a>         | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/electra#transformers.ElectraForQuestionAnswering">doc</a>         | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/electra#transformers.ElectraForMaskedLM">doc</a>         |                                                                                                                                                             | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/electra#transformers.ElectraForCausalLM">doc</a>         |                                                                                                                                                              |                                                                                                                                                         |                                                                                                                                                         |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=gpt2">gpt2</a>                                                    |                                                                                                                                                         | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/gpt2#transformers.Gpt2ForSequenceClassification">doc</a>               |                                                                                                                                                                    | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/gpt2#transformers.Gpt2ForTokenClassification">doc</a>               | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/gpt2#transformers.Gpt2ForQuestionAnswering">doc</a>               |                                                                                                                                                              |                                                                                                                                                             |                                                                                                                                                              |                                                                                                                                                              |                                                                                                                                                         |                                                                                                                                                         |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=gpt_neo">gpt_neo</a> <font style='color: red;'><b>[TP]</b></font> |                                                                                                                                                         | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/gpt_neo#transformers.Gpt_NeoForSequenceClassification">doc</a>         |                                                                                                                                                                    | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/gpt_neo#transformers.Gpt_NeoForTokenClassification">doc</a>         | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/gpt_neo#transformers.Gpt_NeoForQuestionAnswering">doc</a>         |                                                                                                                                                              |                                                                                                                                                             | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/gpt_neo#transformers.Gpt_NeoForCausalLM">doc</a>         |                                                                                                                                                              |                                                                                                                                                         |                                                                                                                                                         |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=marian">marian</a>                                                |                                                                                                                                                         |                                                                                                                                                                            |                                                                                                                                                                    |                                                                                                                                                                         |                                                                                                                                                                       |                                                                                                                                                              |                                                                                                                                                             | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/marian#transformers.MarianForCausalLM">doc</a>           |                                                                                                                                                              |                                                                                                                                                         |                                                                                                                                                         |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=roberta">roberta</a> <font style='color: red;'><b>[TP]</b></font> |                                                                                                                                                         | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/roberta#transformers.RobertaForSequenceClassification">doc</a>         | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/roberta#transformers.RobertaForMultipleChoice">doc</a>         | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/roberta#transformers.RobertaForTokenClassification">doc</a>         | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/roberta#transformers.RobertaForQuestionAnswering">doc</a>         | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/roberta#transformers.RobertaForMaskedLM">doc</a>         |                                                                                                                                                             | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/roberta#transformers.RobertaForCausalLM">doc</a>         |                                                                                                                                                              |                                                                                                                                                         |                                                                                                                                                         |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=t5">t5</a> <font style='color: red;'><b>[TP]</b></font>           |                                                                                                                                                         | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/t5#transformers.T5ForSequenceClassification">doc</a>                   |                                                                                                                                                                    |                                                                                                                                                                         | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/t5#transformers.T5ForQuestionAnswering">doc</a>                   |                                                                                                                                                              | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/t5#transformers.T5ForConditionalGeneration">doc</a>     |                                                                                                                                                              |                                                                                                                                                              |                                                                                                                                                         |                                                                                                                                                         |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=vit">vit</a>                                                      |                                                                                                                                                         |                                                                                                                                                                            |                                                                                                                                                                    |                                                                                                                                                                         |                                                                                                                                                                       |                                                                                                                                                              |                                                                                                                                                             |                                                                                                                                                              |                                                                                                                                                              | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/vit#transformers.VitForMaskedImageModeling">doc</a> | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/vit#transformers.VitForImageClassification">doc</a> |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=xlm-roberta">xlm-roberta</a>                                      |                                                                                                                                                         | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/xlm-roberta#transformers.Xlm-RobertaForSequenceClassification">doc</a> | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/xlm-roberta#transformers.Xlm-RobertaForMultipleChoice">doc</a> | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/xlm-roberta#transformers.Xlm-RobertaForTokenClassification">doc</a> | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/xlm-roberta#transformers.Xlm-RobertaForQuestionAnswering">doc</a> | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/xlm-roberta#transformers.Xlm-RobertaForMaskedLM">doc</a> |                                                                                                                                                             | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/docs/transformers/model_doc/xlm-roberta#transformers.Xlm-RobertaForCausalLM">doc</a> |                                                                                                                                                              |                                                                                                                                                         |                                                                                                                                                         |
## Models/tasks for inference
| Model                                                                                                                                                                 | feature-extraction                                                                                                                                       | fill-mask                                                                                                                                       | multiple-choice                                                                                                                                       | question-answering                                                                                                                                       | text-classification                                                                                                                                       | token-classification                                                                                                                                       | text-generation                                                                                                                                   | zero-shot-image-classification                                                                                                                                | text2text-generation                                                                                                                                      |
|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------|
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=albert">albert</a>                                                | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=feature-extraction&sort=trending&search=albert">doc</a>      | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=fill-mask&sort=trending&search=albert">doc</a>      | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=multiple-choice&sort=trending&search=albert">doc</a>      | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=question-answering&sort=trending&search=albert">doc</a>      | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=text-classification&sort=trending&search=albert">doc</a>      | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=token-classification&sort=trending&search=albert">doc</a>      |                                                                                                                                                   |                                                                                                                                                               |                                                                                                                                                           |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=bert">bert</a> <font style='color: red;'><b>[TP]</b></font>       | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=feature-extraction&sort=trending&search=bert">doc</a>        | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=fill-mask&sort=trending&search=bert">doc</a>        | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=multiple-choice&sort=trending&search=bert">doc</a>        | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=question-answering&sort=trending&search=bert">doc</a>        | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=text-classification&sort=trending&search=bert">doc</a>        | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=token-classification&sort=trending&search=bert">doc</a>        |                                                                                                                                                   |                                                                                                                                                               |                                                                                                                                                           |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=bloom">bloom</a>                                                  |                                                                                                                                                          |                                                                                                                                                 |                                                                                                                                                       |                                                                                                                                                          |                                                                                                                                                           |                                                                                                                                                            | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=text-generation&sort=trending&search=bloom">doc</a>   |                                                                                                                                                               |                                                                                                                                                           |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=camembert">camembert</a>                                          | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=feature-extraction&sort=trending&search=camembert">doc</a>   | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=fill-mask&sort=trending&search=camembert">doc</a>   | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=multiple-choice&sort=trending&search=camembert">doc</a>   | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=question-answering&sort=trending&search=camembert">doc</a>   | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=text-classification&sort=trending&search=camembert">doc</a>   | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=token-classification&sort=trending&search=camembert">doc</a>   |                                                                                                                                                   |                                                                                                                                                               |                                                                                                                                                           |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=clip">clip</a>                                                    | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=feature-extraction&sort=trending&search=clip">doc</a>        |                                                                                                                                                 |                                                                                                                                                       |                                                                                                                                                          |                                                                                                                                                           |                                                                                                                                                            |                                                                                                                                                   | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=zero-shot-image-classification&sort=trending&search=clip">doc</a> |                                                                                                                                                           |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=convbert">convbert</a>                                            | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=feature-extraction&sort=trending&search=convbert">doc</a>    | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=fill-mask&sort=trending&search=convbert">doc</a>    | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=multiple-choice&sort=trending&search=convbert">doc</a>    | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=question-answering&sort=trending&search=convbert">doc</a>    | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=text-classification&sort=trending&search=convbert">doc</a>    | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=token-classification&sort=trending&search=convbert">doc</a>    |                                                                                                                                                   |                                                                                                                                                               |                                                                                                                                                           |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=deberta">deberta</a>                                              | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=feature-extraction&sort=trending&search=deberta">doc</a>     | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=fill-mask&sort=trending&search=deberta">doc</a>     |                                                                                                                                                       | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=question-answering&sort=trending&search=deberta">doc</a>     | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=text-classification&sort=trending&search=deberta">doc</a>     | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=token-classification&sort=trending&search=deberta">doc</a>     |                                                                                                                                                   |                                                                                                                                                               |                                                                                                                                                           |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=deberta-v2">deberta-v2</a>                                        | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=feature-extraction&sort=trending&search=deberta-v2">doc</a>  | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=fill-mask&sort=trending&search=deberta-v2">doc</a>  |                                                                                                                                                       | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=question-answering&sort=trending&search=deberta-v2">doc</a>  | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=text-classification&sort=trending&search=deberta-v2">doc</a>  | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=token-classification&sort=trending&search=deberta-v2">doc</a>  |                                                                                                                                                   |                                                                                                                                                               |                                                                                                                                                           |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=distilbert">distilbert</a>                                        | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=feature-extraction&sort=trending&search=distilbert">doc</a>  | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=fill-mask&sort=trending&search=distilbert">doc</a>  | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=multiple-choice&sort=trending&search=distilbert">doc</a>  | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=question-answering&sort=trending&search=distilbert">doc</a>  | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=text-classification&sort=trending&search=distilbert">doc</a>  | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=token-classification&sort=trending&search=distilbert">doc</a>  |                                                                                                                                                   |                                                                                                                                                               |                                                                                                                                                           |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=electra">electra</a>                                              | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=feature-extraction&sort=trending&search=electra">doc</a>     | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=fill-mask&sort=trending&search=electra">doc</a>     | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=multiple-choice&sort=trending&search=electra">doc</a>     | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=question-answering&sort=trending&search=electra">doc</a>     | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=text-classification&sort=trending&search=electra">doc</a>     | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=token-classification&sort=trending&search=electra">doc</a>     |                                                                                                                                                   |                                                                                                                                                               |                                                                                                                                                           |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=esm">esm</a>                                                      | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=feature-extraction&sort=trending&search=esm">doc</a>         | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=fill-mask&sort=trending&search=esm">doc</a>         |                                                                                                                                                       |                                                                                                                                                          | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=text-classification&sort=trending&search=esm">doc</a>         | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=token-classification&sort=trending&search=esm">doc</a>         |                                                                                                                                                   |                                                                                                                                                               |                                                                                                                                                           |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=flaubert">flaubert</a>                                            | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=feature-extraction&sort=trending&search=flaubert">doc</a>    | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=fill-mask&sort=trending&search=flaubert">doc</a>    | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=multiple-choice&sort=trending&search=flaubert">doc</a>    | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=question-answering&sort=trending&search=flaubert">doc</a>    | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=text-classification&sort=trending&search=flaubert">doc</a>    | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=token-classification&sort=trending&search=flaubert">doc</a>    |                                                                                                                                                   |                                                                                                                                                               |                                                                                                                                                           |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=gpt2">gpt2</a>                                                    |                                                                                                                                                          |                                                                                                                                                 |                                                                                                                                                       |                                                                                                                                                          |                                                                                                                                                           |                                                                                                                                                            | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=text-generation&sort=trending&search=gpt2">doc</a>    |                                                                                                                                                               |                                                                                                                                                           |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=mistral">mistral</a> <font style='color: red;'><b>[TP]</b></font> |                                                                                                                                                          |                                                                                                                                                 |                                                                                                                                                       |                                                                                                                                                          |                                                                                                                                                           |                                                                                                                                                            | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=text-generation&sort=trending&search=mistral">doc</a> |                                                                                                                                                               |                                                                                                                                                           |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=mobilebert">mobilebert</a>                                        | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=feature-extraction&sort=trending&search=mobilebert">doc</a>  | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=fill-mask&sort=trending&search=mobilebert">doc</a>  | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=multiple-choice&sort=trending&search=mobilebert">doc</a>  | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=question-answering&sort=trending&search=mobilebert">doc</a>  | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=text-classification&sort=trending&search=mobilebert">doc</a>  | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=token-classification&sort=trending&search=mobilebert">doc</a>  |                                                                                                                                                   |                                                                                                                                                               |                                                                                                                                                           |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=mpnet">mpnet</a>                                                  | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=feature-extraction&sort=trending&search=mpnet">doc</a>       | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=fill-mask&sort=trending&search=mpnet">doc</a>       | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=multiple-choice&sort=trending&search=mpnet">doc</a>       | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=question-answering&sort=trending&search=mpnet">doc</a>       | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=text-classification&sort=trending&search=mpnet">doc</a>       | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=token-classification&sort=trending&search=mpnet">doc</a>       |                                                                                                                                                   |                                                                                                                                                               |                                                                                                                                                           |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=opt">opt</a>                                                      |                                                                                                                                                          |                                                                                                                                                 |                                                                                                                                                       |                                                                                                                                                          |                                                                                                                                                           |                                                                                                                                                            | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=text-generation&sort=trending&search=opt">doc</a>     |                                                                                                                                                               |                                                                                                                                                           |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=llama">llama</a> <font style='color: red;'><b>[TP]</b></font>     |                                                                                                                                                          |                                                                                                                                                 |                                                                                                                                                       |                                                                                                                                                          |                                                                                                                                                           |                                                                                                                                                            | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=text-generation&sort=trending&search=llama">doc</a>   |                                                                                                                                                               |                                                                                                                                                           |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=phi">phi</a>                                                      | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=feature-extraction&sort=trending&search=phi">doc</a>         |                                                                                                                                                 |                                                                                                                                                       |                                                                                                                                                          | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=text-classification&sort=trending&search=phi">doc</a>         | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=token-classification&sort=trending&search=phi">doc</a>         |                                                                                                                                                   |                                                                                                                                                               |                                                                                                                                                           |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=roberta">roberta</a> <font style='color: red;'><b>[TP]</b></font> | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=feature-extraction&sort=trending&search=roberta">doc</a>     | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=fill-mask&sort=trending&search=roberta">doc</a>     | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=multiple-choice&sort=trending&search=roberta">doc</a>     | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=question-answering&sort=trending&search=roberta">doc</a>     | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=text-classification&sort=trending&search=roberta">doc</a>     | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=token-classification&sort=trending&search=roberta">doc</a>     |                                                                                                                                                   |                                                                                                                                                               |                                                                                                                                                           |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=roformer">roformer</a>                                            | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=feature-extraction&sort=trending&search=roformer">doc</a>    | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=fill-mask&sort=trending&search=roformer">doc</a>    | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=multiple-choice&sort=trending&search=roformer">doc</a>    | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=question-answering&sort=trending&search=roformer">doc</a>    | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=text-classification&sort=trending&search=roformer">doc</a>    | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=token-classification&sort=trending&search=roformer">doc</a>    |                                                                                                                                                   |                                                                                                                                                               |                                                                                                                                                           |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=xlm">xlm</a>                                                      | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=feature-extraction&sort=trending&search=xlm">doc</a>         | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=fill-mask&sort=trending&search=xlm">doc</a>         | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=multiple-choice&sort=trending&search=xlm">doc</a>         | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=question-answering&sort=trending&search=xlm">doc</a>         | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=text-classification&sort=trending&search=xlm">doc</a>         | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=token-classification&sort=trending&search=xlm">doc</a>         |                                                                                                                                                   |                                                                                                                                                               |                                                                                                                                                           |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=xlm-roberta">xlm-roberta</a>                                      | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=feature-extraction&sort=trending&search=xlm-roberta">doc</a> | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=fill-mask&sort=trending&search=xlm-roberta">doc</a> | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=multiple-choice&sort=trending&search=xlm-roberta">doc</a> | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=question-answering&sort=trending&search=xlm-roberta">doc</a> | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=text-classification&sort=trending&search=xlm-roberta">doc</a> | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=token-classification&sort=trending&search=xlm-roberta">doc</a> |                                                                                                                                                   |                                                                                                                                                               |                                                                                                                                                           |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=t5-encoder">t5-encoder</a>                                        |                                                                                                                                                          |                                                                                                                                                 |                                                                                                                                                       |                                                                                                                                                          |                                                                                                                                                           |                                                                                                                                                            |                                                                                                                                                   |                                                                                                                                                               | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=text2text-generation&sort=trending&search=t5-encoder">doc</a> |
| <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?sort=trending&search=t5-decoder">t5-decoder</a>                                        |                                                                                                                                                          |                                                                                                                                                 |                                                                                                                                                       |                                                                                                                                                          |                                                                                                                                                           |                                                                                                                                                            |                                                                                                                                                   |                                                                                                                                                               | <a rel="noopener noreferrer" target="_new" href="https://huggingface.co/models?pipeline_tag=text2text-generation&sort=trending&search=t5-decoder">doc</a> |


## 3) Compiling a pre-trained model to AWS Inferentia

**IMPORTANT:** Copy the **SageMaker training job name** from the previous notebook **02_ModelFineTuning** or from your AWS Console/SageMaker and set the variable **training_job_name**. This is required because the fine-tuned model is the input to the compilation job. If a file named **training_job_name.txt** exists in the working directory, the name is read from it automatically.

In [4]:
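import os
import boto3
import sagemaker
from packaging.version import Version

print(sagemaker.__version__)
# Compare parsed versions: a raw string comparison is lexicographic and would rank "2.99.0" above "2.146.0"
if Version(sagemaker.__version__) < Version("2.146.0"): print("You need to upgrade the sagemaker package, or restart the kernel if you already upgraded")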

sess = sagemaker.Session()
role = sagemaker.get_execution_role()
bucket = sess.default_bucket()
region = sess.boto_region_name

training_job_name=""

# Reuse the job name saved by the previous notebook, if available
if os.path.isfile("training_job_name.txt"):
    with open("training_job_name.txt", "r") as f:
        training_job_name = f.read().strip()
assert len(training_job_name)>0, "Please copy the name of the training_job you ran in the previous notebook and set training_job_name"
checkpoint_s3_uri=f"s3://{bucket}/output/{training_job_name}/output/model.tar.gz"

os.makedirs('src', exist_ok=True)

print(f"sagemaker role arn: {role}")
print(f"sagemaker bucket: {bucket}")
print(f"sagemaker session region: {region}")
print(f"Training job name: {training_job_name}")
print(f"Model S3 URI: {checkpoint_s3_uri}")

sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/sagemaker-user/.config/sagemaker/config.yaml
2.216.0
sagemaker role arn: arn:aws:iam::201700346094:role/sagemaker-immersion-day-SageMakerExecutionRole-N5BbGa9t2QPo
sagemaker bucket: sagemaker-us-east-1-201700346094
sagemaker session region: us-east-1
Training job name: pytorch-training-neuronx-2024-04-19-15-03-06-309
Model S3 URI: s3://sagemaker-us-east-1-201700346094/output/pytorch-training-neuronx-2024-04-19-15-03-06-309/output/model.tar.gz
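
The S3 URI above is assembled by hand from the bucket and the job name, which only works if the fine-tuning job wrote its output under `s3://{bucket}/output`. As a hedged alternative, the artifact location can be read from the training job description instead. The sketch below assumes an object that behaves like `boto3.client("sagemaker")`; the helper name `resolve_model_uri` is hypothetical and not part of the original notebook.

```python
def resolve_model_uri(sm_client, job_name):
    """Return the S3 URI of the model artifact produced by a training job.

    `sm_client` is assumed to behave like boto3.client("sagemaker").
    """
    desc = sm_client.describe_training_job(TrainingJobName=job_name)
    status = desc.get("TrainingJobStatus")
    if status != "Completed":
        # An unfinished or failed job has no usable model.tar.gz yet
        raise RuntimeError(
            f"Training job {job_name} is in status {status!r}, expected 'Completed'"
        )
    return desc["ModelArtifacts"]["S3ModelArtifacts"]


# Usage in this notebook (requires AWS credentials):
# checkpoint_s3_uri = resolve_model_uri(boto3.client("sagemaker"), training_job_name)
```

Reading the URI from the job description also fails early when the job name is mistyped or the job did not complete, rather than at compilation time.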


### 3.1) Compilation script that will be invoked by SageMaker

In [5]:
!pygmentize src/compile.py

# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: MIT-0

import os
os.environ['NEURON_RT_NUM_CORES'] = '1'
import sys
import glob
import json
import torch
import shutil
import tarfile
import logging
import argparse
import traceback


In [6]:
!pygmentize src/requirements.txt

--extra-index-url https://pip.repos.neuron.amazonaws.com
evaluate==0.4.1
accelerate==0.23.0
transformers==4.36.2
optimum-neuron==0.0.21
neuronx-distributed==0.7.0
transformers-neuronx==0.10.0.21


### 3.2) SageMaker Estimator
This object will help you to configure the compilation job (SageMaker Training Job).

This job invokes the **compile.py** script, which compiles our model for Inferentia2 and then saves the artifacts for deployment.
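
To make the compilation step more concrete, here is a minimal sketch of how a script could trace and save a sequence classification model with Optimum Neuron. It is not the content of **compile.py**, which may differ. The function names `build_export_kwargs` and `compile_and_save` are hypothetical, and passing `dynamic_batch_size` to `from_pretrained` is an assumption to check against the installed `optimum-neuron` version.

```python
def build_export_kwargs(input_shapes, dynamic_batch_size=False):
    """Merge static input shapes and export flags into keyword arguments."""
    kwargs = {"export": True, "dynamic_batch_size": bool(dynamic_batch_size)}
    # Neuron compiles for fixed shapes, so every shape must be a plain integer
    kwargs.update({name: int(value) for name, value in input_shapes.items()})
    return kwargs


def compile_and_save(model_dir, output_dir, input_shapes, dynamic_batch_size=False):
    """Trace a fine-tuned checkpoint for Neuron and save the compiled artifacts.

    Needs a machine with the Neuron SDK installed (for example trn1 or inf2).
    """
    # Imported lazily so build_export_kwargs stays usable without the Neuron SDK
    from optimum.neuron import NeuronModelForSequenceClassification

    model = NeuronModelForSequenceClassification.from_pretrained(
        model_dir, **build_export_kwargs(input_shapes, dynamic_batch_size)
    )
    model.save_pretrained(output_dir)
    return output_dir
```

Because the shapes are fixed at compile time, inputs at inference time have to be padded or truncated to the compiled `sequence_length`.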

In [7]:
task="SequenceClassification"
# Source: https://huggingface.co/docs/optimum-neuron/guides/export_model#exporting-a-model-to-neuron-via-neuronmodel
input_shapes={"batch_size": 1, "sequence_length": 512}

In [8]:
import json
import logging
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="compile.py", # compilation script executed by the job
    source_dir="src",
    role=role,
    sagemaker_session=sess,
    container_log_level=logging.DEBUG,
    instance_count=1,
    instance_type='ml.trn1.2xlarge',
    output_path=f"s3://{bucket}/output",
    disable_profiler=True,
    
    image_uri=f"763104351884.dkr.ecr.{region}.amazonaws.com/pytorch-training-neuronx:1.13.1-neuronx-py310-sdk2.18.0-ubuntu20.04",
    
    volume_size = 512,
    hyperparameters={     
        "task": task,
        "input_shapes": f"'{json.dumps(input_shapes)}'",
        "dynamic_batch_size": True
    }
)
estimator.framework_version = '1.13.1' # workaround when using image_uri
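
SageMaker hands the `hyperparameters` above to the entry point as command-line arguments, so `input_shapes` arrives as a JSON string wrapped in literal single quotes and `dynamic_batch_size` arrives as text. The following is a sketch of how an entry-point script could parse them, assuming `argparse`; the helper names are hypothetical and the real **compile.py** may do this differently.

```python
import argparse
import json


def str2bool(value):
    """Interpret the string form of a boolean hyperparameter."""
    return str(value).strip().lower() in ("true", "1", "yes")


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--task", type=str, default="SequenceClassification")
    parser.add_argument("--input_shapes", type=str, default="{}")
    parser.add_argument("--dynamic_batch_size", type=str2bool, default=False)
    # Ignore any extra arguments the container may pass
    args, _ = parser.parse_known_args(argv)
    # Strip the single quotes added around the JSON payload in the notebook
    args.input_shapes = json.loads(args.input_shapes.strip("'"))
    return args
```

Parsing the boolean explicitly matters because `bool("False")` is `True` in Python; a plain `type=bool` argument would silently enable the flag.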

In [9]:
estimator.fit({"checkpoint": checkpoint_s3_uri})

INFO:sagemaker:Creating training-job with name: pytorch-training-neuronx-2024-04-19-15-17-58-380


2024-04-19 15:17:58 Starting - Starting the training job...
2024-04-19 15:18:14 Starting - Preparing the instances for training...
2024-04-19 15:18:49 Downloading - Downloading input data...
2024-04-19 15:19:10 Downloading - Downloading the training image.....................
2024-04-19 15:22:41 Training - Training image download completed. Training in progress....
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
2024-04-19 15:23:15,523 sagemaker-training-toolkit INFO     Imported framework sagemaker_pytorch_container.training
2024-04-19 15:23:15,524 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
2024-04-19 15:23:15,552 sagemaker-training-toolkit INFO     Found 2 neurons on this instance
2024-04-19 15:23:15,553 botocore.hooks DEBUG    Changing event name from creating-client-class.iot-data to creating-client-class.iot-data-plane
20

## 4) Deploy a SageMaker real-time endpoint

In [10]:
import logging
from sagemaker.utils import name_from_base
from sagemaker.pytorch.model import PyTorchModel

# The number of NeuronCores available depends on the inf2 instance size you deploy to.
# We ask SageMaker to launch 1 model server worker per NeuronCore.

model_data=estimator.model_data
print(f"Model data: {model_data}")

instance_type_idx=0 # default ml.inf2.xlarge
instance_types=['ml.inf2.xlarge', 'ml.inf2.8xlarge', 'ml.inf2.24xlarge','ml.inf2.48xlarge']
num_workers=[2,2,12,24]

print(f"Instance type: {instance_types[instance_type_idx]}. Num SM workers: {num_workers[instance_type_idx]}")
pytorch_model = PyTorchModel(
    image_uri=f"763104351884.dkr.ecr.{region}.amazonaws.com/pytorch-inference-neuronx:1.13.1-neuronx-py310-sdk2.18.0-ubuntu20.04",
    model_data=model_data,
    role=role,    
    name=name_from_base('bert-spam-classifier'),
    sagemaker_session=sess,
    container_log_level=logging.DEBUG,
    model_server_workers=num_workers[instance_type_idx], # 1 worker per NeuronCore (2 cores per Inferentia2 chip)
    framework_version="1.13.1",
    env = {
        'SAGEMAKER_MODEL_SERVER_TIMEOUT': '3600',
        'TASK': task
    }
    # for production it is important to define vpc_config and use a vpc_endpoint
    #vpc_config={
    #    'Subnets': ['<SUBNET1>', '<SUBNET2>'],
    #    'SecurityGroupIds': ['<SECURITYGROUP1>', '<DEFAULTSECURITYGROUP>']
    #}
)
pytorch_model._is_compiled_model = True

Model data: s3://sagemaker-us-east-1-201700346094/output/pytorch-training-neuronx-2024-04-19-15-17-58-380/output/model.tar.gz
Instance type: ml.inf2.xlarge. Num SM workers: 2
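
The worker counts used above follow from the hardware: each Inferentia2 chip exposes 2 NeuronCores, and the listed instance sizes carry 1, 1, 6 and 12 chips. As a small sketch, the same numbers can be derived by instance name instead of by list position; the names `INF2_CHIPS` and `workers_for` are hypothetical helpers, not part of the SageMaker SDK.

```python
# Inferentia2 chips per instance type
INF2_CHIPS = {
    "ml.inf2.xlarge": 1,
    "ml.inf2.8xlarge": 1,
    "ml.inf2.24xlarge": 6,
    "ml.inf2.48xlarge": 12,
}
CORES_PER_CHIP = 2  # each Inferentia2 chip has 2 NeuronCores


def workers_for(instance_type, workers_per_core=1):
    """Return the number of model server workers for an inf2 instance type."""
    if instance_type not in INF2_CHIPS:
        raise ValueError(f"Unknown inf2 instance type: {instance_type}")
    return INF2_CHIPS[instance_type] * CORES_PER_CHIP * workers_per_core
```

Looking the value up by name avoids keeping two parallel lists and an index in sync.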


In [11]:
predictor = pytorch_model.deploy(
    initial_instance_count=1,
    instance_type=instance_types[instance_type_idx],
    model_data_download_timeout=3600, # it takes some time to download all the artifacts and load the model
    container_startup_health_check_timeout=1800
)

INFO:sagemaker:Creating model with name: bert-spam-classifier-2024-04-19-15-25-13-643
INFO:sagemaker:Creating endpoint-config with name bert-spam-classifier-ml-inf2-2024-04-19-15-25-14-446
INFO:sagemaker:Creating endpoint with name bert-spam-classifier-ml-inf2-2024-04-19-15-25-14-446


ResourceLimitExceeded: An error occurred (ResourceLimitExceeded) when calling the CreateEndpoint operation: The account-level service limit 'ml.inf2.xlarge for endpoint usage' is 0 Instances, with current utilization of 0 Instances and a request delta of 1 Instances. Please use AWS Service Quotas to request an increase for this quota. If AWS Service Quotas is not available, contact AWS support to request an increase for this quota.
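
The error above means the account has no endpoint quota for `ml.inf2.xlarge`, so the remaining cells cannot run until the quota is raised. As a hedged sketch, the deploy call can be wrapped so that this specific failure produces a clear hint while any other error is re-raised. It assumes the exception carries a botocore-style `response` dictionary; the helper names `is_quota_error` and `deploy_or_explain` are hypothetical.

```python
def is_quota_error(exc):
    """Return True if the exception looks like a SageMaker quota error."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code") == "ResourceLimitExceeded"


def deploy_or_explain(deploy_fn):
    """Call deploy_fn(); on a quota error, print a hint and return None."""
    try:
        return deploy_fn()
    except Exception as exc:
        if is_quota_error(exc):
            print(
                "Endpoint quota is exhausted for this instance type. "
                "Request an increase in AWS Service Quotas, then re-run the deploy cell."
            )
            return None
        raise  # anything else is unexpected and should surface unchanged


# Usage in this notebook (requires AWS credentials):
# predictor = deploy_or_explain(lambda: pytorch_model.deploy(
#     initial_instance_count=1, instance_type=instance_types[instance_type_idx]))
```

If `predictor` ends up as `None`, the test and cleanup cells that follow have nothing to call and should be skipped.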

## 5) Run a simple test

In [None]:
from sagemaker.serializers import JSONSerializer
from sagemaker.deserializers import JSONDeserializer
predictor.serializer = JSONSerializer()
predictor.deserializer = JSONDeserializer()

In [None]:
import time

labels={0: "not spam", 1: "spam"}
not_spam=" Deezer.com 10,406,168 Artist DB\n\nWe have scraped the Deezer Artist DB, right now there are 10,406,168 listings according to Deezer.com\n\nPlease note in going through part of the list, it is obvious there are mistakes inside their system.\n\nExamples include and Artist with &amp; in its name might also be found with \"and\" but the Albums for each have different totals etc. Have no clue if there are duplicate albums etc do this error in their system. Even a comma in a name could mean the Artist shows up more than once, I saw in 1 instance that 1 Artist had 6 different ArtistIDs due to spelling errors.\n\nSo what is this DB, very simple, it gives you the ArtistID and the actual name of the Artist in another column. If you want to see the artist you add the baseurl to the ArtistID\n\nAn example is ArtistID 115 is AC/DC\n\n[https://www.deezer.com/us/artist/115](https://www.deezer.com/us/artist/115)\n\nYou do not have to use [https://www.deezer.com/us/artist/](https://www.deezer.com/us/artist/) if your first language is other than English, just see if Deezer supports your language and use that baseref\n\nFrench for example is [https://www.deezer.com/fr/artist/115](https://www.deezer.com/fr/artist/115)\n\nI am providing the DB in 3 different formats:\n\n \n\nI tried posting download links here but it seems Reddit does not like that so get them here:\n\n[https://pastebin\\[DOT\\]com/V3KJbgif](https://pastebin.com/V3KJbgif)\n\n&amp;#x200B;\n\n**Special thanks go to** [**/user/KoalaBear84**](https://www.reddit.com/user/KoalaBear84) **for writing the scraper.**\n\n&amp;#x200B;\n\n**Cross Posted to related Reddit Groups**"
spam="🚨 ATTENTION ALL USERS! 🚨\n\n🆘 Are you looking for a way to GET RICH QUICK? 🆘\n\n💰 Don't waste your time with boring old jobs! 💰\n\n💸 Join our CRAZY MONEY-MAKING SYSTEM today! 💸\n\n🤑 Just sign up and start earning BIG BUCKS right away! 🤑\n\n👉 Plus, if you refer your friends, you'll get even MORE CASH! 👈\n\n🔥 This is the HOTTEST OFFER of the year! 🔥\n\n👍 Don't wait"

for i,text in enumerate([not_spam, spam]):
    t=time.time()
    pred = predictor.predict({"prompt": text})
    elapsed = (time.time()-t)*1000
    print(f"Elapsed time: {elapsed:.1f} ms")
    print(f"Pred: {i} - {labels[pred[0][0]]} / score: {pred[0][1]}")
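
A single timed request, as in the loop above, is noisy and usually includes warm-up effects on the first call. As a rough sketch, the measurement can be repeated and summarized with percentiles. The helper `measure_latency` is hypothetical, and `predict_fn` stands for any callable, such as `predictor.predict`.

```python
import statistics
import time


def measure_latency(predict_fn, payload, warmup=2, runs=20):
    """Call predict_fn repeatedly and return latency statistics in milliseconds."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    for _ in range(warmup):
        predict_fn(payload)  # warm-up calls are not timed
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        predict_fn(payload)
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    # Approximate 90th percentile by index into the sorted samples
    p90_index = min(len(samples) - 1, int(0.9 * len(samples)))
    return {
        "p50_ms": statistics.median(samples),
        "p90_ms": samples[p90_index],
        "max_ms": samples[-1],
    }


# Usage in this notebook (requires a deployed endpoint):
# print(measure_latency(predictor.predict, {"prompt": spam}))
```

The timings include the network round trip from the notebook to the endpoint, so they are an upper bound on the model latency itself.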

## 6) Cleanup

In [None]:
predictor.delete_model()
predictor.delete_endpoint()