# AWS Trainium Distributed Training - "bert-base-cased" for Sentiment Analysis
This notebook shows how to fine-tune a `bert-base-cased` PyTorch model on AWS Trainium (Trn1 instances) using the AWS Neuron SDK. The original implementation is provided by Hugging Face.

Our goal is to build a machine learning model that predicts whether a tweet is negative, neutral, or positive (**Sentiment Analysis**).

The target variable is the **Sentiment**, which can be:
* Neutral
* Positive
* Negative

In this exercise you will:
 - Run distributed training using all the available NeuronCores

The training script used in this example is [trainium-distributed-training](./code/02-trainium-distributed-training/train.py).

It has been tested on a **trn1.32xlarge** instance.

**Reference:** https://huggingface.co/bert-base-cased

***

Verify that this Jupyter notebook is running the Python kernel environment that was set up according to the [PyTorch Installation Guide](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/setup/torch-neuronx.html#setup-torch-neuronx). You can select the kernel from the 'Kernel -> Change Kernel' option at the top of this Jupyter notebook page.

## Step 1 - Install dependencies

Let's install some required dependencies for our environment.

In [None]:
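# Setting TOKENIZERS_PARALLELISM explicitly suppresses tokenizer warnings, making errors easier to detect.
# The comment sits on its own line because %env would otherwise include it in the variable's value.
%env TOKENIZERS_PARALLELISM=True
!pip install "neuronx-cc==2.*" torch-neuronx torchvision datasets transformers==4.40.1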
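
After the install finishes, it can help to confirm which versions ended up in the kernel environment. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical and not part of the Neuron SDK or of `train.py`.

```python
from importlib.metadata import PackageNotFoundError, version

# Packages named in the pip install command above.
PACKAGES = ["neuronx-cc", "torch-neuronx", "torchvision", "datasets", "transformers"]

def installed_versions(packages):
    """Return {package: version string, or None if the package is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found

for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```

A package reported as `NOT INSTALLED` usually means the notebook is running a different kernel than the one the packages were installed into.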

***

## Step 2 - Fine-Tune the model

Let's take a look at our train.py code.

In [None]:
! pygmentize ./code/02-trainium-distributed-training/train.py

To run distributed training on AWS Trainium, we have to define how many NeuronCores to use; `torchrun` starts one worker process per core. The number of NeuronCores N can be 1, 2, 8, or 32. A trn1.32xlarge instance has 32 NeuronCores, so N = 32 uses all of them.

In [None]:
nproc_per_node = 32
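
If you change the value above, a quick check before launching can catch an unsupported worker count early. This is a minimal sketch: the helper `check_worker_count` is hypothetical, and the allowed values are simply the ones listed in this notebook.

```python
# Worker counts listed in this notebook as valid choices (assumption: taken from the text above).
SUPPORTED_WORKER_COUNTS = (1, 2, 8, 32)

def check_worker_count(n):
    """Return n unchanged if it is a supported worker count, otherwise raise ValueError."""
    if n not in SUPPORTED_WORKER_COUNTS:
        raise ValueError(
            f"nproc_per_node must be one of {SUPPORTED_WORKER_COUNTS}, got {n}"
        )
    return n

print(check_worker_count(32))  # prints 32
```

Calling `check_worker_count(nproc_per_node)` before building the `torchrun` command fails fast with a readable message instead of an error from deep inside the launcher.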

In [None]:
import subprocess

print("Train model")
RUN_CMD = f"torchrun --nproc_per_node={nproc_per_node} ./code/02-trainium-distributed-training/train.py"

print(f"Running command: \n{RUN_CMD}")
# subprocess.run returns a CompletedProcess; a non-zero returncode means the command failed.
result = subprocess.run(RUN_CMD, shell=True)
if result.returncode != 0:
    print(f"There was an error with the fine-tune command (exit code {result.returncode})")
else:
    print("Fine-tune Successful!!!")
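
For context, `torchrun` launches one process per worker and tells each process its position through the `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` environment variables. The sketch below shows one way a training script might read them; the helper `read_dist_env` is hypothetical and is not taken from `train.py`.

```python
import os

def read_dist_env(env=None):
    """Read the rank variables exported by torchrun, defaulting to a single-process run."""
    env = os.environ if env is None else env
    return {
        "rank": int(env.get("RANK", 0)),              # global index of this worker
        "local_rank": int(env.get("LOCAL_RANK", 0)),  # index of this worker on its instance
        "world_size": int(env.get("WORLD_SIZE", 1)),  # total number of workers
    }

info = read_dist_env()
print(f"worker {info['rank']} of {info['world_size']}")
```

With `--nproc_per_node=32` on a single instance, each of the 32 processes would see a `WORLD_SIZE` of 32 and its own rank from 0 to 31.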