# XlmRobertaBase - PyTorch
This notebook shows how to fine-tune the `xlm-roberta-base` PyTorch model on AWS Trainium (trn1 instances) using the Neuron SDK. The original model implementation is provided by Hugging Face.

The example has two stages:
1. Compile the model with the `neuron_parallel_compile` utility so that it can run on the AWS Trainium device.
2. Run the fine-tuning script to train the model on the associated task (e.g. mrpc). The training job uses data-parallel workers to speed up training. This notebook sets `num_workers = 32`, which suits a trn1.32xlarge; on a smaller instance, reduce the worker count (for example to 2).

It has been tested on a trn1.32xlarge instance.

**Reference:** https://huggingface.co/xlm-roberta-base

## 1) Install dependencies

Verify that this Jupyter notebook is running the Python kernel environment that was set up according to the [PyTorch Installation Guide](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/setup/torch-neuronx.html#setup-torch-neuronx). You can select the kernel from the 'Kernel -> Change Kernel' option on the top of this Jupyter notebook page.

In [None]:
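# Suppress tokenizer warnings so that real errors are easier to spot.
# (The comment sits on its own line because %env would otherwise store it as part of the value.)
%env TOKENIZERS_PARALLELISM=True
# Install the Hugging Face and evaluation packages used by the fine-tuning script
%pip install -U "protobuf<4" "transformers==4.27.3" datasets scikit-learn evaluate
# If modules fail to load, rerun the install with --force-reinstall
# Restart the kernel after installation so the new packages are picked up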
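
After restarting the kernel, it can help to confirm which package versions are actually visible to it. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical and not part of the original notebook.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Distribution names taken from the %pip install line above.
required = ["protobuf", "transformers", "datasets", "scikit-learn", "evaluate"]
for name, ver in installed_versions(required).items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```

If `transformers` does not report 4.27.3 here, the kernel is probably not the environment that `%pip` installed into.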

In [None]:
# Clone transformers from GitHub
!git clone https://github.com/huggingface/transformers --branch v4.27.3

## 2) Set the parameters

In [None]:
model_name = "xlm-roberta-base"
env_var_options = "XLA_USE_BF16=1 NEURON_CC_FLAGS='--model-type=transformer --verbose=info'"
num_workers = 32
task_name = "mrpc"
batch_size = 16
max_seq_length = 512
learning_rate = 2e-05
num_train_epochs = 100
model_base_name = model_name
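
With data parallelism, each worker processes its own batch, so the number of samples consumed per optimizer step grows with the worker count. The sketch below works through that arithmetic for the values set above. It assumes no gradient accumulation, and the dataset size of 3668 is an assumed figure for the MRPC training split that should be replaced with the real count.

```python
import math


def global_batch_size(num_workers: int, per_device_batch_size: int) -> int:
    """Samples consumed per optimizer step across all data-parallel workers."""
    return num_workers * per_device_batch_size


def steps_per_epoch(num_examples: int, num_workers: int, per_device_batch_size: int) -> int:
    """Approximate optimizer steps per epoch, assuming no gradient accumulation."""
    return math.ceil(num_examples / global_batch_size(num_workers, per_device_batch_size))


# Values from the parameter cell: 32 workers, per-device batch size 16.
print(global_batch_size(32, 16))  # 512

# 3668 is an assumed training-set size, used only for illustration.
print(steps_per_epoch(3668, 32, 16))
```

Changing `num_workers` therefore changes the effective batch size, which may call for revisiting `learning_rate`.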

## 3) Compile the model with neuron_parallel_compile

In [None]:
import subprocess
print("Compile model")
COMPILE_CMD = f"""{env_var_options} neuron_parallel_compile \
torchrun --nproc_per_node={num_workers} \
transformers/examples/pytorch/text-classification/run_glue.py \
--model_name_or_path {model_name} \
--task_name {task_name} \
--do_train \
--max_seq_length {max_seq_length} \
--per_device_train_batch_size {batch_size} \
--learning_rate {learning_rate} \
--max_train_samples 128 \
--overwrite_output_dir \
--output_dir {model_base_name}-{task_name}-{batch_size}bs"""

print(f'Running command: \n{COMPILE_CMD}')
# subprocess.call returns the exit code; check_call would raise instead of returning non-zero
if subprocess.call(COMPILE_CMD, shell=True) != 0:
    print("There was an error with the compilation command")
else:
    print("Compilation Success!!!")
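
The compile and fine-tune commands share most of their flags, and the precompiled graphs are likely to be reused only if settings such as sequence length and batch size match between the two. One way to keep them in sync is to build both commands from a single argument list. The sketch below is an illustration, not the notebook's method: `shared_glue_args` and `torchrun_command` are hypothetical helpers, and the environment variables from `env_var_options` are left out and would still need to be set.

```python
import shlex

SCRIPT = "transformers/examples/pytorch/text-classification/run_glue.py"


def shared_glue_args(model_name, task_name, max_seq_length, batch_size, learning_rate):
    """Flags that both the compile and the fine-tune command pass to run_glue.py."""
    return [
        "--model_name_or_path", model_name,
        "--task_name", task_name,
        "--do_train",
        "--max_seq_length", str(max_seq_length),
        "--per_device_train_batch_size", str(batch_size),
        "--learning_rate", str(learning_rate),
        "--overwrite_output_dir",
    ]


def torchrun_command(num_workers, script_args, prefix=()):
    """Join an optional launcher prefix, torchrun, and the script arguments into a shell string."""
    parts = [*prefix, "torchrun", f"--nproc_per_node={num_workers}", SCRIPT, *script_args]
    return shlex.join(parts)  # requires Python 3.8+


shared = shared_glue_args("xlm-roberta-base", "mrpc", 512, 16, 2e-05)

# Compile stage: wrapped in neuron_parallel_compile, limited to a few samples.
compile_cmd = torchrun_command(
    32,
    shared + ["--max_train_samples", "128", "--output_dir", "xlm-roberta-base-mrpc-16bs"],
    prefix=["neuron_parallel_compile"],
)

# Fine-tune stage: same shared flags, plus evaluation and the epoch count.
train_cmd = torchrun_command(
    32,
    shared + ["--do_eval", "--num_train_epochs", "100", "--output_dir", "xlm-roberta-base-mrpc-32w-16bs"],
)

print(compile_cmd)
print(train_cmd)
```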

## 4) Fine-tune the model

In [None]:
print("Train model")
RUN_CMD = f"""{env_var_options} torchrun --nproc_per_node={num_workers} \
transformers/examples/pytorch/text-classification/run_glue.py \
--model_name_or_path {model_name} \
--task_name {task_name} \
--do_train \
--do_eval \
--max_seq_length {max_seq_length} \
--per_device_train_batch_size {batch_size} \
--learning_rate {learning_rate} \
--num_train_epochs {num_train_epochs} \
--overwrite_output_dir \
--output_dir {model_base_name}-{task_name}-{num_workers}w-{batch_size}bs"""

print(f'Running command: \n{RUN_CMD}')
# subprocess.call returns the exit code; check_call would raise instead of returning non-zero
if subprocess.call(RUN_CMD, shell=True) != 0:
    print("There was an error with the fine-tune command")
else:
    print("Fine-tune Successful!!!")
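
Once fine-tuning finishes with `--do_eval`, the evaluation metrics can be inspected from the output directory. The sketch below assumes that the script saved them as `eval_results.json` in that directory; both the filename and the helper `load_metrics` are assumptions to verify against your own run.

```python
import json
from pathlib import Path


def load_metrics(output_dir, filename="eval_results.json"):
    """Return the metrics dict stored in output_dir/filename, or None if the file is absent."""
    path = Path(output_dir) / filename
    if not path.is_file():
        return None
    with path.open() as f:
        return json.load(f)


# Output directory name produced by the fine-tune command with the parameters above.
metrics = load_metrics("xlm-roberta-base-mrpc-32w-16bs")
if metrics is None:
    print("No metrics file found yet")
else:
    for key, value in sorted(metrics.items()):
        print(f"{key}: {value}")
```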