# Fine-tune with Transformers 🤝 BentoML

In this Jupyter notebook, we will fine-tune a version of [distilroberta-base](https://huggingface.co/distilroberta-base) for emotion detection, a multi-class form of sentiment analysis, from text.

We can then save it to the BentoML local model store for transfer learning. Refer to [Transformers' docs](https://huggingface.co/docs/transformers/custom_datasets#finetune-with-the-trainer-api) on fine-tuning with the `Trainer` API.

## Variable definition
Feel free to change the following constants:

In [13]:
# transformers #
# ------------ #
TASKS = "text-classification"
MODEL = "j-hartmann/emotion-english-distilroberta-base"

# BentoML #
# ------- #
FT_MODEL_NAME = "drobert_ft"

# training parameters #
# ------------------- #
NUM_LABELS = 6
NUM_EPOCHS = 1
NUM_EXAMPLES = 200
BATCH_SIZE = 32
LR = 2e-5
WDECAY = 0.01

LABELS = ["sadness", "joy", "love", "anger", "fear", "surprise"]

## Fine-tuning for multi-class sentiment analysis with different domains
In this section, we will fine-tune a version of [distilroberta-base](https://huggingface.co/distilroberta-base), namely the `j-hartmann/emotion-english-distilroberta-base` checkpoint set in `MODEL`.

### Install requirements

In [None]:
!pip install -r requirements.txt

### Setup pretrained model

In [2]:
import bentoml
import transformers

classifier = transformers.pipeline(TASKS, model=MODEL, return_all_scores=True)  # type: ignore
tag = bentoml.transformers.save("emotion_distilroberta_base", classifier)



In [3]:
import torch
import psutil

from transformers.trainer_utils import set_seed
from datasets.load import load_dataset

torch.set_num_threads(psutil.cpu_count())
set_seed(420)

### Load Dataset and Prepare for training

We will use the [emotion](https://huggingface.co/datasets/emotion) dataset, loaded via [huggingface/datasets](https://huggingface.co/docs/datasets/).

In [4]:
emotion = load_dataset("emotion")




We will load the tokenizer from the BentoML local model store.

In [11]:
pipeline = bentoml.transformers.load(
    "emotion_distilroberta_base:latest", return_all_scores=True
)
tokenizer = getattr(pipeline, "tokenizer")

The following `preprocess_function` will be applied with [map](https://huggingface.co/docs/datasets/package_reference/main_classes.html#datasets.Dataset.map)
to tokenize every text in the dataset. We will use the tokenized version later for training.

In [9]:
def preprocess_function(examples):
    return tokenizer(examples["text"], truncation=True, padding=True)


tokenized_emotion = emotion.map(preprocess_function, batched=True)
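
To make the `batched=True` behaviour concrete, here is a minimal pure-Python sketch of how a batched map hands the function a dict of lists rather than a single row. `toy_tokenizer` and `batched_map` are hypothetical stand-ins written for illustration only; they are not the `transformers` or `datasets` implementations.

```python
from typing import Callable, Dict, List

Batch = Dict[str, List]


def toy_tokenizer(texts: List[str]) -> Batch:
    """Hypothetical whitespace tokenizer: each word's length serves as a fake token id."""
    input_ids = [[len(word) for word in text.split()] for text in texts]
    attention_mask = [[1] * len(ids) for ids in input_ids]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


def preprocess_function(examples: Batch) -> Batch:
    # With batching, examples["text"] is a list of strings, not one string.
    return toy_tokenizer(examples["text"])


def batched_map(rows: Batch, fn: Callable[[Batch], Batch], batch_size: int) -> Batch:
    """Apply fn to consecutive column slices and append the columns it returns."""
    n_rows = len(next(iter(rows.values())))
    out: Batch = {name: list(values) for name, values in rows.items()}
    for start in range(0, n_rows, batch_size):
        batch = {name: values[start:start + batch_size] for name, values in rows.items()}
        for name, values in fn(batch).items():
            out.setdefault(name, []).extend(values)
    return out


toy_rows = {"text": ["i feel great", "so sad", "what a surprise"], "label": [1, 0, 5]}
toy_tokenized = batched_map(toy_rows, preprocess_function, batch_size=2)
```

The original columns are kept and the tokenizer outputs are added alongside them, which is why `text` and `label` still appear next to `input_ids` and `attention_mask` after mapping.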


We will report `accuracy` together with macro-averaged `precision`, `recall`, and `f1` as our model performance metrics.

In [10]:
from sklearn.metrics import precision_recall_fscore_support, accuracy_score


def compute_metrics(pred):
    labels = pred.label_ids
    preds = pred.predictions.argmax(-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="macro"
    )
    acc = accuracy_score(labels, preds)
    return {"accuracy": acc, "f1": f1, "precision": precision, "recall": recall}
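
To illustrate what `average="macro"` means, the sketch below recomputes macro precision, recall, and F1 by hand on a toy example. It is a simplified pure-Python illustration, not scikit-learn's implementation; the function name and the toy labels are made up for this example, and undefined ratios are treated as zero.

```python
from typing import List, Tuple


def macro_precision_recall_f1(labels: List[int], preds: List[int]) -> Tuple[float, float, float]:
    """Unweighted mean of per-class precision, recall and F1 (0.0 when undefined)."""
    classes = sorted(set(labels) | set(preds))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(1 for y, p in zip(labels, preds) if y == c and p == c)
        fp = sum(1 for y, p in zip(labels, preds) if y != c and p == c)
        fn = sum(1 for y, p in zip(labels, preds) if y == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(classes)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy data: three classes, two examples each.
toy_labels = [0, 0, 1, 1, 2, 2]
toy_preds = [0, 1, 1, 1, 2, 0]
macro_p, macro_r, macro_f1 = macro_precision_recall_f1(toy_labels, toy_preds)
```

Because every class contributes equally to the mean, a rare class that is predicted poorly lowers the macro scores as much as a frequent one would, which is one reason macro F1 can sit below accuracy on an imbalanced dataset.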

In [14]:
model = transformers.AutoModelForSequenceClassification.from_pretrained(
    MODEL, num_labels=NUM_LABELS, ignore_mismatched_sizes=True
)

Some weights of RobertaForSequenceClassification were not initialized from the model checkpoint at j-hartmann/emotion-english-distilroberta-base and are newly initialized because the shapes did not match:
- classifier.out_proj.weight: found shape torch.Size([7, 768]) in the checkpoint and torch.Size([6, 768]) in the model instantiated
- classifier.out_proj.bias: found shape torch.Size([7]) in the checkpoint and torch.Size([6]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
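
This warning is expected here: the checkpoint's output projection was trained for 7 labels, while `NUM_LABELS` is 6, so that layer is re-created with a new shape and fresh weights. The sketch below only reproduces the shape arithmetic from the message; `linear_head_shapes` is a hypothetical helper, and the sizes are copied from the warning text.

```python
from typing import Tuple

HIDDEN_SIZE = 768          # second dimension of the shapes printed in the warning
CHECKPOINT_NUM_LABELS = 7  # rows of classifier.out_proj in the checkpoint
NEW_NUM_LABELS = 6         # NUM_LABELS used in this notebook


def linear_head_shapes(num_labels: int, hidden_size: int) -> Tuple[tuple, tuple]:
    """Weight and bias shapes of a linear layer mapping hidden_size -> num_labels."""
    return (num_labels, hidden_size), (num_labels,)


checkpoint_shapes = linear_head_shapes(CHECKPOINT_NUM_LABELS, HIDDEN_SIZE)
new_shapes = linear_head_shapes(NEW_NUM_LABELS, HIDDEN_SIZE)

# Mismatched shapes cannot be loaded, so these parameters start from scratch.
shapes_match = checkpoint_shapes == new_shapes
reinitialized_params = NEW_NUM_LABELS * HIDDEN_SIZE + NEW_NUM_LABELS
```

Only the output projection is affected; the rest of the network keeps its pretrained weights, which is what makes a short fine-tuning run viable.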


In [16]:
tokenized_emotion.set_format("torch", columns=["input_ids", "attention_mask", "label"])
tokenized_emotion["train"].features  # type: ignore

{'text': Value(dtype='string', id=None),
 'label': ClassLabel(num_classes=6, names=['sadness', 'joy', 'love', 'anger', 'fear', 'surprise'], names_file=None, id=None),
 'input_ids': Sequence(feature=Value(dtype='int32', id=None), length=-1, id=None),
 'attention_mask': Sequence(feature=Value(dtype='int8', id=None), length=-1, id=None)}
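
The `label` column stores integer class ids, and the `names` list of the `ClassLabel` feature fixes which id belongs to which emotion. A minimal plain-Python sketch of that two-way mapping follows, assuming ids run from 0 upward in the order of `names` (the same order as `LABELS`). The variable names `id2label` and `label2id` are chosen for illustration.

```python
LABELS = ["sadness", "joy", "love", "anger", "fear", "surprise"]

# Position in the list is the class id.
id2label = dict(enumerate(LABELS))
label2id = {label: idx for idx, label in id2label.items()}

# Converting an id to a name and back should return the original id.
roundtrip_ok = all(label2id[id2label[i]] == i for i in range(len(LABELS)))
```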

Lastly, pad the tokenized text so that all sequences in a batch have a uniform length.
While it is possible to pad the text in the tokenizer function by setting `padding=True`,
it is more efficient to pad dynamically, only up to the length of the longest element in each batch, which is what `DataCollatorWithPadding` does.

In [18]:
data_collator = transformers.DataCollatorWithPadding(tokenizer=tokenizer)
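
The saving from dynamic padding can be seen by counting tokens. The sketch below is a pure-Python illustration of the idea, not the collator's implementation; the helper names and toy sequences are made up, and `PAD_ID = 1` simply mirrors the `pad_token_id` shown in the model config later in this notebook.

```python
from typing import List

PAD_ID = 1  # assumption for this sketch; the real id comes from the tokenizer


def pad_to(seqs: List[List[int]], length: int, pad_id: int = PAD_ID) -> List[List[int]]:
    """Right-pad every sequence with pad_id up to the given length."""
    return [seq + [pad_id] * (length - len(seq)) for seq in seqs]


def pad_globally(seqs: List[List[int]]) -> List[List[int]]:
    """Pad every sequence to the longest sequence in the whole dataset."""
    return pad_to(seqs, max(len(s) for s in seqs))


def pad_per_batch(seqs: List[List[int]], batch_size: int) -> List[List[List[int]]]:
    """Pad each batch only to the longest sequence inside that batch."""
    batches = [seqs[i:i + batch_size] for i in range(0, len(seqs), batch_size)]
    return [pad_to(batch, max(len(s) for s in batch)) for batch in batches]


def count_tokens(batches: List[List[List[int]]]) -> int:
    """Total number of token positions, padding included."""
    return sum(len(seq) for batch in batches for seq in batch)


toy_sequences = [[5, 6], [7], [8, 9, 10, 11, 12, 13], [14, 15, 16]]
global_tokens = count_tokens([pad_globally(toy_sequences)])
dynamic_batches = pad_per_batch(toy_sequences, batch_size=2)
dynamic_tokens = count_tokens(dynamic_batches)
```

In this toy case, global padding processes 24 token positions and per-batch padding only 16, because the short sequences in the first batch are no longer stretched to the length of the longest sequence in the dataset.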

### Fine-tune with `Trainer` API

In [22]:
training_args = transformers.TrainingArguments(
    output_dir="./models",
    num_train_epochs=NUM_EPOCHS,
    learning_rate=LR,
    per_device_train_batch_size=BATCH_SIZE,
    per_device_eval_batch_size=BATCH_SIZE,
    metric_for_best_model="f1",
    weight_decay=WDECAY,
    evaluation_strategy="epoch",
)

In [23]:
trainer = transformers.Trainer(
    model=model,
    args=training_args,
    compute_metrics=compute_metrics,
    train_dataset=tokenized_emotion["train"],
    eval_dataset=tokenized_emotion["validation"],
    data_collator=data_collator,
)

In [24]:
trainer_output = trainer.train()
trainer_output

The following columns in the training set  don't have a corresponding argument in `RobertaForSequenceClassification.forward` and have been ignored: text. If text are not expected by `RobertaForSequenceClassification.forward`,  you can safely ignore this message.
***** Running training *****
  Num examples = 16000
  Num Epochs = 1
  Instantaneous batch size per device = 32
  Total train batch size (w. parallel, distributed & accumulation) = 32
  Gradient Accumulation steps = 1
  Total optimization steps = 500


Epoch,Training Loss,Validation Loss,Accuracy,F1,Precision,Recall
1,0.3633,0.218942,0.925,0.898465,0.89543,0.901926




Saving model checkpoint to ./models/checkpoint-500
Configuration saved in ./models/checkpoint-500/config.json
Model weights saved in ./models/checkpoint-500/pytorch_model.bin
The following columns in the evaluation set  don't have a corresponding argument in `RobertaForSequenceClassification.forward` and have been ignored: text. If text are not expected by `RobertaForSequenceClassification.forward`,  you can safely ignore this message.
***** Running Evaluation *****
  Num examples = 2000
  Batch size = 32


Training completed. Do not forget to share your model on huggingface.co/models =)




TrainOutput(global_step=500, training_loss=0.36329165649414064, metrics={'train_runtime': 3477.2754, 'train_samples_per_second': 4.601, 'train_steps_per_second': 0.144, 'total_flos': 356851229841792.0, 'train_loss': 0.36329165649414064, 'epoch': 1.0})
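
The figures in the training log follow from the constants defined at the top of the notebook. As a sanity check, the sketch below reproduces the step count and throughput, assuming a single device and that the last partial batch is kept; the `Trainer` itself handles more cases than this simplified arithmetic. The variable names are chosen for this example.

```python
import math

NUM_TRAIN_EXAMPLES = 16000   # "Num examples" in the training log
BATCH_SIZE = 32
NUM_EPOCHS = 1
GRAD_ACCUM_STEPS = 1         # "Gradient Accumulation steps" in the training log
TRAIN_RUNTIME_S = 3477.2754  # "train_runtime" in TrainOutput

# One optimization step consumes BATCH_SIZE * GRAD_ACCUM_STEPS examples.
steps_per_epoch = math.ceil(NUM_TRAIN_EXAMPLES / (BATCH_SIZE * GRAD_ACCUM_STEPS))
total_steps = steps_per_epoch * NUM_EPOCHS

samples_per_second = NUM_TRAIN_EXAMPLES * NUM_EPOCHS / TRAIN_RUNTIME_S
steps_per_second = total_steps / TRAIN_RUNTIME_S
```

Rounded to three decimals, these values agree with `global_step`, `train_samples_per_second`, and `train_steps_per_second` in the output above.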

### Evaluate model performance

In [25]:
results = trainer.evaluate()
results

The following columns in the evaluation set  don't have a corresponding argument in `RobertaForSequenceClassification.forward` and have been ignored: text. If text are not expected by `RobertaForSequenceClassification.forward`,  you can safely ignore this message.
***** Running Evaluation *****
  Num examples = 2000
  Batch size = 32


{'eval_loss': 0.21894210577011108,
 'eval_accuracy': 0.925,
 'eval_f1': 0.8984648914439984,
 'eval_precision': 0.8954296915535233,
 'eval_recall': 0.901925970657509,
 'eval_runtime': 90.4788,
 'eval_samples_per_second': 22.105,
 'eval_steps_per_second': 0.696,
 'epoch': 1.0}

### Validation

In [None]:
preds_output = trainer.predict(tokenized_emotion["validation"])
preds_output.metrics

The following columns in the test set  don't have a corresponding argument in `RobertaForSequenceClassification.forward` and have been ignored: text. If text are not expected by `RobertaForSequenceClassification.forward`,  you can safely ignore this message.
***** Running Prediction *****
  Num examples = 2000
  Batch size = 32


We also need to update the label names in the config of the fine-tuned model, so that predictions map to the six emotion labels.

In [9]:
config = getattr(model, "config")
ID2L = config.id2label
L2ID = config.label2id
config.id2label = {k: LABELS[i] for i, k in enumerate(ID2L.keys())}
config.label2id = {LABELS[i]: v for i, v in enumerate(L2ID.values())}
config

RobertaConfig {
  "_name_or_path": "/Users/aarnphm/bentoml/models/drobert_ft/de6a7geufow2jgxi",
  "architectures": [
    "RobertaForSequenceClassification"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "classifier_dropout": null,
  "eos_token_id": 2,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "sadness",
    "1": "joy",
    "2": "love",
    "3": "anger",
    "4": "fear",
    "5": "surprise"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "label2id": {
    "anger": 3,
    "fear": 4,
    "joy": 1,
    "love": 2,
    "sadness": 0,
    "surprise": 5
  },
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
  "model_type": "roberta",
  "num_attention_heads": 12,
  "num_hidden_layers": 6,
  "pad_token_id": 1,
  "position_embedding_type": "absolute",
  "problem_type": "single_label_classification",
  "torch_dtype": "float32",
  "transformers_version": "4.17

### Save our fine-tuned model to the BentoML model store

In [14]:
# dict.update() returns None, so build a new dict with the extra key instead.
metadata = {**results, "fine_tune": True}
tag = bentoml.transformers.save(
    FT_MODEL_NAME, model, tokenizer=tokenizer, metadata=metadata
)
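
One detail worth knowing when attaching evaluation results as metadata: `dict.update` mutates the dict in place and returns `None`, so its return value should not be passed on. The sketch below shows the difference using only two entries copied from the evaluation output; `in_place_return` is a name chosen for this example.

```python
# Abbreviated copy of the evaluation results, for illustration only.
results = {"eval_loss": 0.21894210577011108, "eval_accuracy": 0.925}

# update() changes its dict in place and returns None (called on a copy here).
in_place_return = dict(results).update({"fine_tune": True})

# Unpacking builds a new dict and leaves the original results untouched.
metadata = {**results, "fine_tune": True}
```

Building a new dict also keeps `results` unchanged, in case the raw evaluation metrics are needed again later.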

## What's next?
Go to the [Transfer Learning notebook](./transfer_learning_roberta.ipynb) to see how to perform transfer learning with BentoML.