# Lightweight Fine-Tuning Project

The choices made for this project are:

* PEFT technique: LoRA, with the task type set to SEQ_CLS so that the adapter is configured for sequence classification
* Model: GPT-2 with a sequence classification head (GPT2ForSequenceClassification)
* Evaluation approach: accuracy on a held-out test subset
* Fine-tuning dataset: the dair-ai/emotion dataset from Hugging Face, which contains six emotion classes: sadness, joy, love, anger, fear, and surprise

In [2]:
# install required libraries
!pip install peft

Collecting peft
  Downloading peft-0.10.0-py3-none-any.whl.metadata (13 kB)
Downloading peft-0.10.0-py3-none-any.whl (199 kB)
Installing collected packages: peft
Successfully installed peft-0.10.0





## Loading and Evaluating a Foundation Model


In [2]:
# import required libraries
import torch
from datasets import load_dataset



In [3]:
# load only a subset of the train, validation, and test splits because of limited computing power
emotion_train = load_dataset("dair-ai/emotion", split="train[:2000]")
emotion_validation = load_dataset("dair-ai/emotion", split="validation[:100]")
emotion_test = load_dataset("dair-ai/emotion", split="test[:100]")

In [4]:
# check train dataset
emotion_train

Dataset({
    features: ['text', 'label'],
    num_rows: 2000
})
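
The integer values in the label column stand for the six emotion classes. The sketch below shows one way to map between ids and names. It assumes the ids follow the order in which the classes are listed at the top of this notebook (sadness = 0 through surprise = 5); the names EMOTIONS, id2label, and label2id are illustrative, and the dataset's own feature metadata remains the authoritative source for the mapping.

```python
# Assumption: label ids follow the order in which the classes are listed in this notebook.
EMOTIONS = ["sadness", "joy", "love", "anger", "fear", "surprise"]

# Map each integer id to its class name, and each class name back to its id.
id2label = dict(enumerate(EMOTIONS))
label2id = {name: idx for idx, name in id2label.items()}

print(id2label[0], label2id["surprise"])
```

Dictionaries of this shape are commonly passed as the id2label and label2id arguments when loading a classification model, so that predictions can be reported by name rather than by number.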

In [5]:
# the foundation model for evaluation
model_name = "openai-community/gpt2"

In [44]:
# define the model and the tokenizer
from transformers import AutoTokenizer, GPT2ForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained(model_name)
# GPT-2 has no padding token, so reuse the end-of-sequence token for padding
tokenizer.pad_token = tokenizer.eos_token
# load the pre-trained weights and attach a new 6-class classification head
model = GPT2ForSequenceClassification.from_pretrained(model_name, num_labels=6)
model.config.pad_token_id = model.config.eos_token_id

Some weights of GPT2ForSequenceClassification were not initialized from the model checkpoint at openai-community/gpt2 and are newly initialized: ['score.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [6]:
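# define a function that tokenizes a batch of examples
# note: padding="max_length" without an explicit max_length pads every example to the
# tokenizer's maximum length, which makes batches large and training slow
def preprocess_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True)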
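
The padding strategy has a large effect on how much work each batch requires. The sketch below compares the number of token positions in one batch when every example is padded to a fixed maximum length versus to the longest example in the batch. The token counts are made up for illustration, and the maximum length of 1024 is an assumption based on GPT-2's context size; padded_slots is a hypothetical helper, not part of any library.

```python
# Hypothetical token counts for one batch of four short texts (illustrative numbers).
token_counts = [12, 25, 9, 18]

# Assumption: the tokenizer's maximum length is 1024 tokens, as for GPT-2.
MODEL_MAX_LENGTH = 1024


def padded_slots(lengths, target_length):
    """Total token positions in the batch after padding each example to target_length."""
    return len(lengths) * target_length


fixed = padded_slots(token_counts, MODEL_MAX_LENGTH)     # like padding="max_length"
dynamic = padded_slots(token_counts, max(token_counts))  # pad to the longest example

print(fixed, dynamic, fixed / dynamic)
```

In the Transformers library, per-batch padding is usually achieved by tokenizing without padding and letting DataCollatorWithPadding pad each batch; passing an explicit, smaller max_length to the tokenizer is another option.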

In [46]:
# get the model output on the test dataset
inputs = tokenizer(emotion_test["text"], padding = "max_length", truncation = True, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

predicted_class_id = logits.argmax(dim = 1)

In [48]:
# accuracy of foundation model on test dataset without fine-tuning
from sklearn.metrics import accuracy_score
accuracy = accuracy_score(emotion_test['label'], predicted_class_id)
print(f"accuracy without any finetuning on the test dataset {accuracy * 100} %")

accuracy without any finetuning on the test dataset 5.0 %
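
To put the 5% figure in context, it helps to compare it with trivial baselines. With six classes, uniform guessing is right about one time in six, and always predicting the most frequent class does at least as well as that. The sketch below computes both on made-up labels; toy_labels and majority_baseline_accuracy are illustrative names and the labels are not taken from the emotion dataset.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


# Toy labels for illustration only; not taken from the emotion dataset.
toy_labels = [0, 1, 1, 0, 1, 3, 4, 1, 0, 2]

uniform_chance = 1 / 6  # six emotion classes
print(f"uniform chance: {uniform_chance:.3f}")
print(f"majority baseline: {majority_baseline_accuracy(toy_labels):.3f}")
```

An accuracy below uniform chance is consistent with the warning printed when the model was loaded: the classification head (score.weight) is newly initialized, so before training its predictions carry no information about the emotion labels.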


## Performing Parameter-Efficient Fine-Tuning

In [49]:
# tokenize the train and validation dataset
tokenized_emotion_train = emotion_train.map(preprocess_function, batched = True)
tokenized_emotion_validation = emotion_validation.map(preprocess_function, batched = True)

Map: 100%|██████████| 2000/2000 [00:01<00:00, 1670.18 examples/s]
Map: 100%|██████████| 100/100 [00:00<00:00, 1602.71 examples/s]


In [50]:
# remove the text column as it is not required for training
tokenized_emotion_train = tokenized_emotion_train.remove_columns(["text"])
tokenized_emotion_validation = tokenized_emotion_validation.remove_columns(["text"])

In [51]:
# the model's forward pass takes the targets as "labels" rather than "label", so rename the column accordingly
train_lora = tokenized_emotion_train.rename_column('label', 'labels')
valid_lora = tokenized_emotion_validation.rename_column('label', 'labels')

In [52]:
# load the accuracy metric from the evaluate library
import evaluate
import numpy as np

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    predictions = np.argmax(predictions, axis=1)
    return accuracy.compute(predictions=predictions, references=labels)
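
As the function above shows, the evaluation input unpacks into per-class scores and reference labels, and the predicted class is the index of the largest score in each row. The sketch below reproduces that logic in plain Python on made-up scores so the steps are visible; argmax_accuracy, toy_logits, and toy_labels are hypothetical names, and the real function relies on NumPy and the evaluate library instead.

```python
def argmax_accuracy(logits, labels):
    """Pick the highest-scoring class per row and compare it with the reference labels."""
    predictions = [max(range(len(row)), key=row.__getitem__) for row in logits]
    correct = sum(int(pred == label) for pred, label in zip(predictions, labels))
    return {"accuracy": correct / len(labels)}


# Made-up scores for three examples over six classes (illustration only).
toy_logits = [
    [0.1, 2.0, 0.3, 0.0, 0.0, 0.0],  # highest score at index 1
    [1.5, 0.2, 0.1, 0.0, 0.0, 0.0],  # highest score at index 0
    [0.0, 0.0, 0.0, 0.2, 0.9, 0.1],  # highest score at index 4
]
toy_labels = [1, 0, 3]

result = argmax_accuracy(toy_logits, toy_labels)
print(result)
```

The returned dictionary uses the key accuracy, which is why the training log further down reports the metric under the name eval_accuracy.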

In [53]:
# create a PEFT config with appropriate hyperparameters for the model
from peft import LoraConfig, get_peft_model, TaskType
config = LoraConfig(lora_alpha = 32,
                   lora_dropout = 0.01,
                   r=8,
                   bias="none",
                   task_type=TaskType.SEQ_CLS)

In [54]:
# using the PEFT config and foundation model, create a PEFT model.
lora_model = get_peft_model(model, config)



In [55]:
# check the trainable parameters
lora_model.print_trainable_parameters()

trainable params: 299,520 || all params: 124,743,936 || trainable%: 0.24010786384037136
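
The trainable parameter count can be checked by hand. The sketch below assumes the dimensions of the smallest GPT-2 model (12 layers, hidden size 768), that LoRA is applied only to each block's fused attention projection (c_attn, 768 to 2304 features), and that the bias-free classification head is trained alongside the adapters. These are assumptions rather than something the notebook states, but under them the arithmetic matches the count printed above.

```python
# Assumptions: GPT-2 small dimensions, and LoRA applied only to each block's
# fused attention projection (c_attn), which maps 768 -> 3 * 768 features.
hidden_size = 768
num_layers = 12
c_attn_out = 3 * hidden_size
rank = 8        # r in the LoraConfig above
num_labels = 6

# Each adapted layer adds two low-rank matrices: A (hidden_size x rank) and B (rank x c_attn_out).
lora_params_per_layer = hidden_size * rank + rank * c_attn_out
lora_params = num_layers * lora_params_per_layer

# Assumption: the classification head has no bias, so it holds hidden_size x num_labels weights.
head_params = hidden_size * num_labels

trainable = lora_params + head_params
print(trainable)  # 299520
print(100 * trainable / 124_743_936)  # percentage of all parameters
```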


In [56]:
# define the training arguments
from transformers import TrainingArguments, Trainer

training_args = TrainingArguments(
    output_dir="my_awesome_model_1",
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    num_train_epochs=3,
    weight_decay=0.01,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True
)

trainer = Trainer(
    model=lora_model,
    args=training_args,
    train_dataset=train_lora,
    eval_dataset=valid_lora,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)
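
The arguments above determine how many optimizer steps the run takes and how the learning rate changes over time. The sketch below works this out, assuming a single device, no gradient accumulation, and a learning rate that decays linearly to zero without warmup; linear_decay_lr is a hypothetical helper written for this illustration, not a library function.

```python
import math

num_train_examples = 2000  # train[:2000]
batch_size = 4             # per_device_train_batch_size
num_epochs = 3             # num_train_epochs
initial_lr = 2e-5          # learning_rate

# Assumption: one device and no gradient accumulation.
steps_per_epoch = math.ceil(num_train_examples / batch_size)
total_steps = steps_per_epoch * num_epochs


def linear_decay_lr(step):
    """Learning rate under linear decay to zero with no warmup (assumed schedule)."""
    return initial_lr * (1 - step / total_steps)


for epoch in range(1, num_epochs + 1):
    step = epoch * steps_per_epoch
    print(epoch, step, linear_decay_lr(step))
```

Under these assumptions the run has 1500 steps in total, and the learning rate at the end of each epoch agrees with the values reported in the training log.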

In [57]:
# run the training loop
train_results = trainer.train()

You're using a GPT2TokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.

{'loss': 2.0298, 'learning_rate': 1.3333333333333333e-05, 'epoch': 1.0}
{'eval_loss': 1.593591570854187, 'eval_accuracy': 0.3, 'eval_runtime': 203.0701, 'eval_samples_per_second': 0.492, 'eval_steps_per_second': 0.123, 'epoch': 1.0}

{'loss': 1.6244, 'learning_rate': 6.666666666666667e-06, 'epoch': 2.0}
{'eval_loss': 1.563624620437622, 'eval_accuracy': 0.32, 'eval_runtime': 186.3912, 'eval_samples_per_second': 0.537, 'eval_steps_per_second': 0.134, 'epoch': 2.0}

{'loss': 1.5997, 'learning_rate': 0.0, 'epoch': 3.0}
{'eval_loss': 1.5585345029830933, 'eval_accuracy': 0.32, 'eval_runtime': 213.4295, 'eval_samples_per_second': 0.469, 'eval_steps_per_second': 0.117, 'epoch': 3.0}

100%|██████████| 1500/1500 [6:44:53<00:00, 16.20s/it]

{'train_runtime': 24293.1375, 'train_samples_per_second': 0.247, 'train_steps_per_second': 0.062, 'train_loss': 1.751273681640625, 'epoch': 3.0}





In [58]:
# save the trained model
lora_model.save_pretrained("gpt2-lora")

## Performing Inference with a PEFT Model

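The saved adapter is loaded in two steps. First, PeftConfig.from_pretrained reads the adapter configuration from the local gpt2-lora directory; its base_model_name_or_path field names the base model. Second, the base model (here GPT2ForSequenceClassification) is loaded from that name and passed to PeftModel.from_pretrained, which attaches the fine-tuned weights saved in gpt2-lora.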

In [10]:
# load the fine-tuned PEFT model for inference and get its predictions on the test dataset
from peft import PeftConfig, PeftModel
from transformers import AutoTokenizer, GPT2ForSequenceClassification


config = PeftConfig.from_pretrained("gpt2-lora")
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = GPT2ForSequenceClassification.from_pretrained(config.base_model_name_or_path, num_labels = 6)
model.config.pad_token_id = model.config.eos_token_id

lora_model_saved = PeftModel.from_pretrained(model, "gpt2-lora")

inputs = tokenizer(emotion_test["text"], padding = "max_length", truncation = True, return_tensors="pt")

with torch.no_grad():
    logits = lora_model_saved(**inputs).logits

predicted_class_id_lora = logits.argmax(dim = 1)

Some weights of GPT2ForSequenceClassification were not initialized from the model checkpoint at openai-community/gpt2 and are newly initialized: ['score.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [11]:
# check the accuracy of the fine-tuned model on the test dataset
from sklearn.metrics import accuracy_score
accuracy = accuracy_score(emotion_test['label'], predicted_class_id_lora)
print(f"accuracy with finetuning on the test dataset {accuracy * 100} %")

accuracy with finetuning on the test dataset 35.0 %
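
Because the test subset contains only 100 examples, both accuracy figures carry noticeable sampling noise. The sketch below estimates it with the binomial standard error, which assumes the test examples are independent; accuracy_standard_error is a hypothetical helper written for this illustration.

```python
import math


def accuracy_standard_error(accuracy, n):
    """Binomial standard error of an accuracy estimated from n independent examples."""
    return math.sqrt(accuracy * (1 - accuracy) / n)


n_test = 100  # test[:100]
for label, acc in [("before fine-tuning", 0.05), ("after fine-tuning", 0.35)]:
    se = accuracy_standard_error(acc, n_test)
    print(f"{label}: {acc:.2f} +/- {se:.3f}")
```

The gap between the two accuracies is several standard errors wide, so the improvement is unlikely to be an artifact of the small sample, although the exact figures would shift on a larger test set.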


In [2]:
# accuracy on the test subset increased from 5% to 35% after training on only a sample of the dataset;
# it can likely be improved further by hyperparameter tuning and by increasing the size of the training set

In [1]:
# Based on the reference below:

# https://huggingface.co/docs/peft/main/en/task_guides/ptuning-seq-classification

# in the inference section, loading GPT2ForSequenceClassification first and then passing it to PeftModel.from_pretrained
# also loads the locally fine-tuned adapter, so the predictions come from the locally trained model.
