# Parameter Efficient Fine Tuning

##### Description

This notebook applies parameter-efficient fine-tuning (PEFT) using the Hugging Face `peft` library, in three steps:

1. Load a pre-trained model and evaluate its performance
2. Perform parameter-efficient fine-tuning using the pre-trained model
3. Evaluate the fine-tuned model and compare its performance to the original model


##### Main Components

- PEFT technique: LoRA
- Model: gpt2
- Evaluation approach: Hugging Face `Trainer`
- Dataset: `sms_spam`

 


# Load libraries

In [1]:
from transformers import (
                AutoModelForSequenceClassification, 
                AutoTokenizer,
                Trainer, 
                TrainingArguments, 
                DataCollatorWithPadding
)
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, TaskType, AutoPeftModelForSequenceClassification
import numpy as np
import torch


# 1) Prepare the Foundation Model

## Load the dataset

In [2]:
# Load the sms_spam dataset
# Source: https://huggingface.co/datasets/sms_spam

# The sms_spam dataset only has a train split, so we use the train_test_split 
# method to split it into train and test
dataset = load_dataset("sms_spam", split="train").train_test_split(
    test_size=0.2, shuffle=True, seed=23
)

## Load the tokenizer and preprocess the dataset

In [3]:

tokenizer = AutoTokenizer.from_pretrained("gpt2")
# Set pad token as eos
tokenizer.pad_token = tokenizer.eos_token

# Tokenize the dataset function
def tokenize(batch):
    return tokenizer(batch["sms"], padding="max_length", truncation=True)

# Tokenize the train and test sets
train_dataset = dataset["train"].map(tokenize, batched=True)
test_dataset = dataset["test"].map(tokenize, batched=True)
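
Because `padding="max_length"` is passed without an explicit `max_length`, every message is padded to the tokenizer's model maximum, which is 1024 tokens for GPT-2. The sketch below uses made-up token counts (they are not measured from the dataset) to illustrate how much padding that adds compared with padding each batch only to its longest member, which is what `DataCollatorWithPadding` would do if `padding` were left out of `tokenize`.

```python
# Hypothetical token counts for one batch of short SMS messages (illustrative only).
token_counts = [12, 30, 45, 28]
model_max_length = 1024  # GPT-2 context size, used when padding="max_length"

# Tokens processed when every sequence is padded to the model maximum.
padded_to_max = len(token_counts) * model_max_length

# Tokens processed when the batch is padded only to its longest sequence.
padded_to_longest = len(token_counts) * max(token_counts)

print(padded_to_max, padded_to_longest)
print(f"ratio: {padded_to_max / padded_to_longest:.1f}x")
```

This likely contributes to the long evaluation and training runtimes reported further down, although the notebook itself does not measure it.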



In [4]:
train_dataset

Dataset({
    features: ['sms', 'label', 'input_ids', 'attention_mask'],
    num_rows: 4459
})

In [5]:
# Inspect the first row
print(train_dataset[0]["sms"])
print(train_dataset[0]["label"])

Had your mobile 10 mths? Update to the latest Camera/Video phones for FREE. KEEP UR SAME NUMBER, Get extra free mins/texts. Text YES for a call

1


In [6]:
test_dataset

Dataset({
    features: ['sms', 'label', 'input_ids', 'attention_mask'],
    num_rows: 1115
})

In [7]:
# Inspect the first row
print(test_dataset[0]["sms"])
print(test_dataset[0]["label"])

Yup... Hey then one day on fri we can ask miwa and jiayin take leave go karaoke 

0


In [8]:

foundation_model = AutoModelForSequenceClassification.from_pretrained(
    "gpt2", 
    num_labels=2,
    id2label={0: "not spam", 1: "spam"},
    label2id={"not spam": 0, "spam": 1},
)

foundation_model.config.pad_token_id = tokenizer.pad_token_id

Some weights of GPT2ForSequenceClassification were not initialized from the model checkpoint at gpt2 and are newly initialized: ['score.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
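
The `pad_token_id` assignment matters because a GPT-2 sequence classifier scores each input from the logits at its last non-padding token, so it has to know which id counts as padding. The NumPy sketch below mirrors that idea under the assumption of right-padded inputs; it is an illustration, not the library's code, and all token ids except 50256 are arbitrary.

```python
import numpy as np

pad_token_id = 50256  # GPT-2 eos token id, reused as the pad token above

# Two right-padded toy sequences (token ids other than 50256 are arbitrary).
input_ids = np.array([
    [11, 22, 33, pad_token_id, pad_token_id],
    [44, 55, 66, 77, 88],
])
# Toy per-token logits with shape (batch, seq_len, num_labels).
logits = np.arange(2 * 5 * 2, dtype=float).reshape(2, 5, 2)

# Index of the last non-pad token in each row.
last_idx = (input_ids != pad_token_id).sum(axis=-1) - 1

# Keep only the logits at that position: one (num_labels,) vector per sequence.
pooled = logits[np.arange(len(input_ids)), last_idx]
print(last_idx, pooled.shape)
```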


In [9]:
#print(foundation_model)

## Evaluate the pretrained model

In [10]:
def compute_metrics(eval_pred):
    """
    Compute the accuracy metric.
    :eval_pred: a tuple of (predictions, labels)
    
    :returns: a dictionary with the mean accuracy
    """
    predictions, labels = eval_pred
    # Convert the predictions to discrete labels by taking the argmax,
    # which is the index of the highest value in the prediction (logits).
    predictions = np.argmax(predictions, axis=1)
    # Calculate and return the accuracy as the mean of the instances where
    # predictions match the true labels.
    return {"accuracy": (predictions == labels).mean()}
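
As a quick sanity check, the metric can be exercised on a handful of toy logits before it is handed to the `Trainer`. The values below are invented for illustration, and the function is repeated in compact form so the snippet runs on its own.

```python
import numpy as np

def compute_metrics(eval_pred):
    """Same logic as above: argmax over the logits, then mean accuracy."""
    predictions, labels = eval_pred
    predictions = np.argmax(predictions, axis=1)
    return {"accuracy": (predictions == labels).mean()}

# Four toy examples with two-class logits (made-up numbers).
toy_logits = np.array([[2.0, 0.1], [0.3, 1.2], [0.9, 0.8], [0.1, 3.0]])
toy_labels = np.array([0, 1, 1, 1])

# The argmax predictions are [0, 1, 0, 1], so three of four match the labels.
result = compute_metrics((toy_logits, toy_labels))
print(result)
```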

In [11]:
# The HuggingFace Trainer class handles the training and eval loop for PyTorch.
# Initialize the Trainer, a high-level API for training transformer models.
training_args = TrainingArguments(
    output_dir="./model_output", # Directory where the model outputs will be saved.
    learning_rate=2e-5, # Learning rate for the optimizer.
    per_device_train_batch_size=16, # Batch size for training per device.
    per_device_eval_batch_size=16, # Batch size for evaluation per device.
    num_train_epochs=1, # Number of training epochs.
    weight_decay=0.01, # Weight decay for regularization.
    evaluation_strategy="epoch", # Evaluation is performed at the end of each epoch.
    save_strategy="epoch", # Model is saved at the end of each epoch.
    load_best_model_at_end=True, # Load the best model at the end of training.
)

pretrain_trainer = Trainer(
    model=foundation_model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    tokenizer=tokenizer,
    data_collator=DataCollatorWithPadding(tokenizer=tokenizer),
    compute_metrics=compute_metrics,
)

In [12]:
# Evaluate the model on the test set before fine-tuning
pretrain_results = pretrain_trainer.evaluate()

The following columns in the evaluation set don't have a corresponding argument in `GPT2ForSequenceClassification.forward` and have been ignored: sms. If sms are not expected by `GPT2ForSequenceClassification.forward`,  you can safely ignore this message.
***** Running Evaluation *****
  Num examples = 1115
  Batch size = 16
You're using a GPT2TokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.


In [13]:
print(pretrain_results)

{'eval_loss': 2.675537109375, 'eval_accuracy': 0.12914798206278028, 'eval_runtime': 1033.2615, 'eval_samples_per_second': 1.079, 'eval_steps_per_second': 0.068}


# 2) Perform Lightweight Fine-Tuning

## Create a PEFT config

In [14]:
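config = LoraConfig(
    task_type=TaskType.SEQ_CLS,  # Sequence classification task.
    inference_mode=False,  # The adapter will be trained, not only used for inference.
    r=4,  # Rank of the low-rank update matrices.
    lora_alpha=16,  # Scaling factor; the update is scaled by lora_alpha / r.
    lora_dropout=0.1,  # Dropout probability applied in the LoRA layers.
)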
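
To make the roles of `r` and `lora_alpha` concrete, here is a minimal NumPy sketch of the LoRA idea: the pretrained weight stays frozen and a low-rank product, scaled by `lora_alpha / r`, is added to its output. The layer shape is an assumption chosen to resemble one GPT-2 attention projection, dropout is omitted, and the code uses the usual linear-layer convention rather than the `peft` implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out = 768, 2304   # assumed shape of one GPT-2 attention projection
r, lora_alpha = 4, 16     # values from the LoraConfig above

W = rng.normal(size=(d_out, d_in))       # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01    # trainable, small random init
B = np.zeros((d_out, r))                 # trainable, zero init

def lora_forward(x):
    """Frozen projection plus the scaled low-rank update."""
    scaling = lora_alpha / r
    return x @ W.T + (x @ A.T @ B.T) * scaling

x = rng.normal(size=(2, d_in))
# With B at zero the adapter contributes nothing, so the output equals the frozen layer.
print(np.allclose(lora_forward(x), x @ W.T))
```

Starting `B` at zero means fine-tuning begins from exactly the pretrained behavior, and only `A` and `B` need gradients.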

## Create a trainable PEFT model

In [15]:
lora_model = get_peft_model(foundation_model, config)



In [16]:
lora_model.print_trainable_parameters()

trainable params: 148,992 || all params: 124,590,336 || trainable%: 0.1196
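
The reported count can be reproduced by hand. The arithmetic below rests on two assumptions that are not shown in the notebook output: that LoRA is attached only to the fused attention projection (`c_attn`) in each of the 12 blocks, and that the classification head (`score.weight`, no bias) is kept trainable for the `SEQ_CLS` task type.

```python
n_layer, n_embd = 12, 768      # from the GPT-2 config
r, num_labels = 4, 2

# Assumption: LoRA wraps only c_attn, which maps n_embd -> 3 * n_embd.
d_in, d_out = n_embd, 3 * n_embd
lora_per_layer = r * d_in + d_out * r      # A is (r x d_in), B is (d_out x r)
lora_total = n_layer * lora_per_layer

# Assumption: the classification head (score.weight, no bias) is also trainable.
head = n_embd * num_labels

trainable = lora_total + head
all_params = 124_590_336                   # total reported by print_trainable_parameters()
print(trainable, f"{100 * trainable / all_params:.4f}%")
```

Under these assumptions the result matches the 148,992 trainable parameters printed above.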


## Train the PEFT model

In [17]:
# Initialize the Trainer's arguments
peft_training_args = TrainingArguments(
    output_dir="./results/peft_model",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=1,
    weight_decay=0.01,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
)

# Initialize the Trainer with compute_metrics
peft_trainer = Trainer(
    model=lora_model,
    args=peft_training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    compute_metrics=compute_metrics,
    tokenizer=tokenizer,
    data_collator=DataCollatorWithPadding(tokenizer=tokenizer),
)


PyTorch: setting up devices
The default value for the training argument `--report_to` will change in v5 (from all installed integrations to none). In v5, you will need to use `--report_to all` to get the same behavior as now. You should start updating your code and make this info disappear :-).


In [18]:
# Run the trainer
peft_trainer.train()

The following columns in the training set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: sms. If sms are not expected by `PeftModelForSequenceClassification.forward`,  you can safely ignore this message.
***** Running training *****
  Num examples = 4459
  Num Epochs = 1
  Instantaneous batch size per device = 16
  Total train batch size (w. parallel, distributed & accumulation) = 16
  Gradient Accumulation steps = 1
  Total optimization steps = 279
  Number of trainable parameters = 148992


Epoch,Training Loss,Validation Loss,Accuracy
1,No log,0.684835,0.730045


The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: sms. If sms are not expected by `PeftModelForSequenceClassification.forward`,  you can safely ignore this message.
***** Running Evaluation *****
  Num examples = 1115
  Batch size = 16
Saving model checkpoint to ./results/peft_model/checkpoint-279
Trainer.model is not a `PreTrainedModel`, only saving its state dict.
tokenizer config file saved in ./results/peft_model/checkpoint-279/tokenizer_config.json
Special tokens file saved in ./results/peft_model/checkpoint-279/special_tokens_map.json


Training completed. Do not forget to share your model on huggingface.co/models =)


Loading best model from ./results/peft_model/checkpoint-279 (score: 0.6848347187042236).


TrainOutput(global_step=279, training_loss=1.5568949381510417, metrics={'train_runtime': 21741.7659, 'train_samples_per_second': 0.205, 'train_steps_per_second': 0.013, 'total_flos': 2334326220914688.0, 'train_loss': 1.5568949381510417, 'epoch': 1.0})
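
The figures in the training log are consistent with each other, as the short check below shows using only numbers copied from the output above. The "No log" entry for the training loss is most likely because the run has 279 steps while the `Trainer` logs every 500 steps by default; that explanation is an inference, not something the log states.

```python
import math

num_examples, batch_size, epochs = 4459, 16, 1
train_runtime_s = 21741.7659

# One optimizer step per batch; the last, partial batch still counts.
steps = math.ceil(num_examples / batch_size) * epochs

print(steps)                                      # optimizer steps
print(round(num_examples / train_runtime_s, 3))   # samples per second
print(round(steps / train_runtime_s, 3))          # steps per second
print(round(train_runtime_s / 3600, 2))           # runtime in hours
```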

## Save the PEFT model

In [19]:
lora_model.save_pretrained("gpt-lora")

loading configuration file config.json from cache at /Users/badiaa/.cache/huggingface/hub/models--gpt2/snapshots/607a30d783dfa663caf39e06633721c8d4cfcd7e/config.json
Model config GPT2Config {
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"
  ],
  "attn_pdrop": 0.1,
  "bos_token_id": 50256,
  "embd_pdrop": 0.1,
  "eos_token_id": 50256,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 768,
  "n_head": 12,
  "n_inner": null,
  "n_layer": 12,
  "n_positions": 1024,
  "reorder_and_upcast_attn": false,
  "resid_pdrop": 0.1,
  "scale_attn_by_inverse_layer_idx": false,
  "scale_attn_weights": true,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "max_length": 50
    }
  },
  "transformers_version": "4.25.1",
  "

# 3) Perform Inference Using the Fine-Tuned Model

## Load the saved PEFT model

In [20]:
finetuned_model = AutoPeftModelForSequenceClassification.from_pretrained(
    "gpt-lora",
    num_labels=2,
    id2label={0: "not spam", 1: "spam"},
    label2id={"not spam": 0, "spam": 1},
)

finetuned_model.config.pad_token_id = finetuned_model.config.eos_token_id

# finetuned_model.print_trainable_parameters()

loading configuration file config.json from cache at /Users/badiaa/.cache/huggingface/hub/models--gpt2/snapshots/607a30d783dfa663caf39e06633721c8d4cfcd7e/config.json
Model config GPT2Config {
  "_name_or_path": "gpt2",
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"
  ],
  "attn_pdrop": 0.1,
  "bos_token_id": 50256,
  "embd_pdrop": 0.1,
  "eos_token_id": 50256,
  "id2label": {
    "0": "not spam",
    "1": "spam"
  },
  "initializer_range": 0.02,
  "label2id": {
    "not spam": 0,
    "spam": 1
  },
  "layer_norm_epsilon": 1e-05,
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 768,
  "n_head": 12,
  "n_inner": null,
  "n_layer": 12,
  "n_positions": 1024,
  "reorder_and_upcast_attn": false,
  "resid_pdrop": 0.1,
  "scale_attn_by_inverse_layer_idx": false,
  "scale_attn_weights": true,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "task_spec

## Evaluate the fine-tuned model

In [21]:
peft_training_args = TrainingArguments(
    output_dir="./results/inference_model",
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    num_train_epochs=1,
    weight_decay=0.01,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
)

trainer = Trainer(
    model=finetuned_model, # using the fine-tuned model
    args=peft_training_args,
    # train_dataset=train_dataset,
    eval_dataset=test_dataset,
    compute_metrics=compute_metrics,
    tokenizer=tokenizer,
    data_collator=DataCollatorWithPadding(tokenizer=tokenizer),
)


PyTorch: setting up devices
The default value for the training argument `--report_to` will change in v5 (from all installed integrations to none). In v5, you will need to use `--report_to all` to get the same behavior as now. You should start updating your code and make this info disappear :-).


In [22]:
# Evaluate the fine-tuned model
evaluation_results_peft = trainer.evaluate()
print("Evaluation Results:", evaluation_results_peft)

The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: sms. If sms are not expected by `PeftModelForSequenceClassification.forward`,  you can safely ignore this message.
***** Running Evaluation *****
  Num examples = 1115
  Batch size = 32


Evaluation Results: {'eval_loss': 0.6848347187042236, 'eval_accuracy': 0.7300448430493274, 'eval_runtime': 1054.7201, 'eval_samples_per_second': 1.057, 'eval_steps_per_second': 0.033}


# 4) Conclusion

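On the 1,115-message test split, the foundation model with its newly initialized classification head reaches an accuracy of 0.13, while the LoRA fine-tuned model reaches 0.73 after one epoch of training only 148,992 parameters (about 0.12% of the model), which is a clear improvement.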
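
The comparison can be restated in terms of correctly classified messages. The snippet below is a small sketch that uses only the accuracy values and the test-set size printed earlier in the notebook.

```python
num_test = 1115
pretrain_acc = 0.12914798206278028   # eval_accuracy before fine-tuning
peft_acc = 0.7300448430493274        # eval_accuracy after LoRA fine-tuning

# Convert accuracies back into counts of correct predictions.
pretrain_correct = round(pretrain_acc * num_test)
peft_correct = round(peft_acc * num_test)

print(pretrain_correct, peft_correct)
print(f"accuracy gain: {peft_acc - pretrain_acc:.4f}")
```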