# AWS Trainium Distributed Training - "bert-base-cased" for Sentiment Analysis
This notebook shows how to fine-tune the `bert-base-cased` PyTorch model on AWS Trainium (Trn1 instances) using the AWS Neuron SDK. The original model implementation is provided by Hugging Face.

Our goal is to build a machine learning model that predicts whether a tweet is negative, neutral, or positive (<b>Sentiment Analysis</b>).

The target variable is the **Sentiment**, which can be:
* Neutral
* Positive
* Negative
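
A classifier works with integer class ids rather than strings, so the three sentiment values need a label mapping. The sketch below shows one way to build it. The id order is an assumption made for illustration, and the mapping used in `train.py` may differ.

```python
# Hypothetical label mapping: the id order is an assumption, not taken from train.py.
LABELS = ["Neutral", "Positive", "Negative"]

# Map each class name to an integer id, and build the inverse lookup.
label2id = {label: idx for idx, label in enumerate(LABELS)}
id2label = {idx: label for label, idx in label2id.items()}

print(label2id)
print(id2label)
```

Keeping both directions makes it easy to encode the target column before training and to decode predictions afterwards.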

In this exercise you will:
 - Run distributed training using all the available NeuronCores

The training script used in this example is [trainium-distributed-training](./code/02-trainium-distributed-training/train.py).

It has been tested on a **trn1.32xlarge** instance.

**Reference:** https://huggingface.co/bert-base-cased

***

## Step 1 - Install dependencies

Let's install some required dependencies for our environment.

In [None]:
# Quote the version specifier so the shell does not treat "<" as input redirection
! pip install datasets "numpy<=1.20.0"

***

## Step 2 - Fine-Tune the model

Let's take a look at our `train.py` code.

In [None]:
! pygmentize ./code/02-trainium-distributed-training/train.py

To run distributed training on AWS Trainium NeuronCores, we have to define how many cores to use. This is the number of worker processes that `torchrun` starts on the instance, one per NeuronCore. The number of NeuronCores N can be 1, 2, 8, or 32.

In [None]:
nproc_per_node = 32
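
Because only a few worker counts are listed as valid, it can help to check the value before launching the job. The following is a minimal sketch. The helper name `check_nproc_per_node` is hypothetical, and the allowed values are copied from the text above.

```python
# Allowed worker counts as stated in this notebook (1, 2, 8, or 32).
SUPPORTED_WORKER_COUNTS = (1, 2, 8, 32)


def check_nproc_per_node(n: int) -> int:
    """Return n if it is a supported worker count, otherwise raise ValueError."""
    if n not in SUPPORTED_WORKER_COUNTS:
        raise ValueError(
            f"nproc_per_node must be one of {SUPPORTED_WORKER_COUNTS}, got {n}"
        )
    return n


# Hypothetical usage: validate the value chosen for this run.
nproc_per_node = check_nproc_per_node(32)
```

Failing early with a clear message is cheaper than waiting for the launcher to fail after the workers have started.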

In [None]:
print("Train model")
# Use the same script path as the pygmentize cell above
RUN_CMD = f"torchrun --nproc_per_node={nproc_per_node} ./code/02-trainium-distributed-training/train.py"

print(f'Running command: \n{RUN_CMD}')
! {RUN_CMD}
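
`torchrun` starts `nproc_per_node` copies of the script and passes each one its identity through the environment variables `RANK`, `LOCAL_RANK`, and `WORLD_SIZE`. The sketch below shows how a worker could read them. Whether `train.py` reads these variables directly is an assumption, since scripts built on PyTorch/XLA often query the XLA runtime instead, and the names `WorkerInfo` and `read_worker_info` are hypothetical.

```python
import os
from typing import Mapping, NamedTuple


class WorkerInfo(NamedTuple):
    rank: int        # global index of this worker
    local_rank: int  # index of this worker on its node
    world_size: int  # total number of workers


def read_worker_info(env: Mapping[str, str] = os.environ) -> WorkerInfo:
    """Read the worker identity set by torchrun, with single-process defaults."""
    return WorkerInfo(
        rank=int(env.get("RANK", "0")),
        local_rank=int(env.get("LOCAL_RANK", "0")),
        world_size=int(env.get("WORLD_SIZE", "1")),
    )


info = read_worker_info()
print(f"worker {info.rank} of {info.world_size} (local rank {info.local_rank})")
```

The defaults let the same code run as a plain single-process script, which is convenient for debugging before launching all workers.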