# Lightweight PEFT for Sequence Classification Task

The project involved using Lightweight Parameter-Efficient Fine-Tuning (PEFT) with the LoRA technique to perform a sequence classification task. The RoBERTa model was fine-tuned on the sms-spam-classification task. The evaluation was conducted using the evaluate method with the HuggingFace Trainer, ensuring efficient and targeted fine-tuning for spam sms classification while maintaining model performance with minimal computational resources.

* PEFT technique:   LoRA
* Model: RoBERTa
* Evaluation approach: evaluate method with a Hugging Face Trainer.
* Fine-tuning dataset: sms_spam

### Dependencies Installation

In [1]:
def installer():
    !pip install huggingface_hub
    !huggingface-cli login
    !git config --global credential.helper store
    !pip install -q "datasets==2.15.0"
    !pip install peft
installer()


    _|    _|  _|    _|    _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|_|_|_|    _|_|      _|_|_|  _|_|_|_|
    _|    _|  _|    _|  _|        _|          _|    _|_|    _|  _|            _|        _|    _|  _|        _|
    _|_|_|_|  _|    _|  _|  _|_|  _|  _|_|    _|    _|  _|  _|  _|  _|_|      _|_|_|    _|_|_|_|  _|        _|_|_|
    _|    _|  _|    _|  _|    _|  _|    _|    _|    _|    _|_|  _|    _|      _|        _|    _|  _|        _|
    _|    _|    _|_|      _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|        _|    _|    _|_|_|  _|_|_|_|

    To login, `huggingface_hub` requires a token generated from https://huggingface.co/settings/tokens .
Enter your token (input will not be visible): 
Add token as git credential? (Y/n) Y
Token is valid (permission: fineGrained).
[1m[31mCannot authenticate through git-credential as no helper is defined on your machine.
You might have to re-authenticate when pushing to the Hugging Face Hub.
Run the following command in yo

In [29]:
import torch
import numpy as np
import pandas as pd
from transformers import DataCollatorWithPadding, Trainer, TrainingArguments
from transformers import AutoTokenizer, RobertaForSequenceClassification
import peft
from peft import LoraConfig, TaskType
from peft import AutoPeftModelForSequenceClassification
from tqdm import tqdm
from huggingface_hub import notebook_login
notebook_login()

VBox(children=(HTML(value='<center> <img\nsrc=https://huggingface.co/front/assets/huggingface_logo-noborder.sv…

# Text Classification via Foundation model (RoBERTa)

## Loading and Evaluating a Foundation Model

[RoBERTa](https://huggingface.co/FacebookAI/roberta-base) is selected as the foundation pre-trained model from [🤗](https://huggingface.co/) for finetuning. We evaluate the foundation model performance on text classification before fine-tuning.  \
This step includes loading an appropriate tokenizer and dataset: \
we import `sms_spam` dataset and call `AutoTokenizer` specified for `FacebookAI/roberta-base`

In [9]:
def Preprocess():
    from datasets import load_dataset

    # The sms_spam dataset only has a train split, so we use the train_test_split method to split it into train and test
    dataset = load_dataset("sms_spam", split="train").train_test_split(
        test_size=0.2, shuffle=True, seed=23
    )
    splits = ["train", "test"]

    tokenizer = AutoTokenizer.from_pretrained("FacebookAI/roberta-base")

    tokenized_dataset = {}
    for split in splits:
        tokenized_dataset[split] = dataset[split].map(
            lambda x: tokenizer(x['sms'], truncation=True), batched=True
            )
    return tokenized_dataset[splits[0]], tokenized_dataset[splits[1]], tokenizer

In [10]:
###### We define Evaluator helper function to assess the model performance (accuracy)
def Evaluator(model, tokenized_dataset):
    total_correct = 0
    for idx, data in enumerate(tqdm(tokenized_dataset)):

        outputs = model(torch.tensor(data['input_ids']).view(1,-1))
        predictions = np.argmax(outputs.logits.detach().numpy(), axis=1)
        label = data['label']

        total_correct += (predictions == label).sum().item()
    accuracy = 100 * total_correct / idx
    return accuracy

## Evaluation on Foundation Model

The evaluation result on RoBERTa-base` using accuracy metric shows that the performance is very low without fine-tuning. accuracy is 12.926391382405745`

In [11]:
model_NotTuned = RobertaForSequenceClassification.from_pretrained("FacebookAI/roberta-base")
model_NotTuned.eval()
#### get preprocessed dataset and tokenizer instance.
tokenized_dataset_train, tokenized_dataset_test, tokenizer= Preprocess()
print(f'model_NotTuned accuracy is {Evaluator(model_NotTuned, tokenized_dataset_test)}')

Map:   0%|          | 0/1115 [00:00<?, ? examples/s]

Some weights of RobertaForSequenceClassification were not initialized from the model checkpoint at FacebookAI/roberta-base and are newly initialized: ['classifier.dense.bias', 'classifier.dense.weight', 'classifier.out_proj.bias', 'classifier.out_proj.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
100%|██████████| 1115/1115 [01:02<00:00, 17.83it/s]

RobertaForSequenceClassification(
  (roberta): RobertaModel(
    (embeddings): RobertaEmbeddings(
      (word_embeddings): Embedding(50265, 768, padding_idx=1)
      (position_embeddings): Embedding(514, 768, padding_idx=1)
      (token_type_embeddings): Embedding(1, 768)
      (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (encoder): RobertaEncoder(
      (layer): ModuleList(
        (0-11): 12 x RobertaLayer(
          (attention): RobertaAttention(
            (self): RobertaSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): RobertaSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
             




# Performing Parameter-Efficient Fine-Tuning (LoRA)

In the cells below, we create a PEFT model from the loaded base-model, run a training loop, and save the PEFT model weights.

### Creating a PEFT Config
The PEFT config specifies the adapter configuration for your parameter-efficient fine-tuning process. The base class for this is a `PeftConfig`, but this example will use a LoraConfig, the subclass used for low rank adaptation (LoRA).

A LoRA config can be instantiated like this:

In [12]:
peft_config = LoraConfig(task_type=TaskType.SEQ_CLS, inference_mode=False, r=8, lora_alpha=32, lora_dropout=0.1)

### Converting a Transformers Model into a PEFT Model
Once you have a PEFT config object, you can load a 🤗 transformers model as a PEFT model by first loading the pre-trained model as usual

In [13]:
model_base = RobertaForSequenceClassification.from_pretrained("FacebookAI/roberta-base")

Some weights of RobertaForSequenceClassification were not initialized from the model checkpoint at FacebookAI/roberta-base and are newly initialized: ['classifier.dense.bias', 'classifier.dense.weight', 'classifier.out_proj.bias', 'classifier.out_proj.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Then using `get_peft_model()` to get a trainable PEFT model (using the LoRA config instantiated previously):

In [14]:
from peft import get_peft_model
lora_model = get_peft_model(model_base, peft_config)

## Fine-tuning with a PEFT Model
After calling `get_peft_model()`, you can then use the resulting `lora_model` in a training process of your choice (PyTorch training loop or Hugging Face Trainer).

### Checking Trainable Parameters of a PEFT Model
A helpful way to check the number of trainable parameters with the current config is the `print_trainable_parameters()` method:

In [15]:
lora_model.print_trainable_parameters()

trainable params: 887,042 || all params: 125,534,212 || trainable%: 0.7066


## Training
We can use `Trainer` from 🤗 libraries to train and evaluate the fine-tuned model.

In [16]:
def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    predictions = np.argmax(predictions, axis=1)
    return {"accuracy": (predictions == labels).mean()}

trainer = Trainer(
    model=lora_model,
    args=TrainingArguments(
        output_dir="./data/spam_not_spam",
        # Set the learning rate
        learning_rate = 2e-5,
        # Set the per device train batch size and eval batch size
        per_device_train_batch_size=16,
        per_device_eval_batch_size=64,
        # Evaluate and save the model after each epoch
        evaluation_strategy = "epoch",
        save_strategy = "epoch",
        num_train_epochs=5,
        weight_decay=0.01,
        load_best_model_at_end=True,
    ),
    train_dataset= tokenized_dataset_train,
    eval_dataset= tokenized_dataset_test,
    tokenizer=tokenizer,
    data_collator=DataCollatorWithPadding(tokenizer=tokenizer),
    compute_metrics=compute_metrics,
)



In [17]:
trainer.train()

Epoch,Training Loss,Validation Loss,Accuracy
1,No log,0.041363,0.983857
2,0.198200,0.043331,0.986547
3,0.198200,0.046471,0.989238
4,0.041500,0.0426,0.987444
5,0.041500,0.041538,0.988341


TrainOutput(global_step=1395, training_loss=0.09636847246505026, metrics={'train_runtime': 107.1515, 'train_samples_per_second': 208.07, 'train_steps_per_second': 13.019, 'total_flos': 730531998868128.0, 'train_loss': 0.09636847246505026, 'epoch': 5.0})

In [18]:
trainer.evaluate()

{'eval_loss': 0.04136340320110321,
 'eval_accuracy': 0.9838565022421525,
 'eval_runtime': 3.241,
 'eval_samples_per_second': 344.031,
 'eval_steps_per_second': 5.554,
 'epoch': 5.0}

## Performing Inference with a PEFT Model

We load the saved PEFT model weights and evaluate the performance of the trained PEFT model.

### Saving a Trained PEFT Model
Once a PEFT model has been trained, the standard Hugging Face `save_pretrained()` method can be used to save the weights locally.

In [19]:
lora_model.save_pretrained("roBERTa-lora")

### Inference with PEFT
Loading a Saved PEFT Model

In [20]:
model_finetuned = AutoPeftModelForSequenceClassification.from_pretrained("roBERTa-lora")

Some weights of RobertaForSequenceClassification were not initialized from the model checkpoint at FacebookAI/roberta-base and are newly initialized: ['classifier.dense.bias', 'classifier.dense.weight', 'classifier.out_proj.bias', 'classifier.out_proj.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [21]:
model_finetuned

PeftModelForSequenceClassification(
  (base_model): LoraModel(
    (model): RobertaForSequenceClassification(
      (roberta): RobertaModel(
        (embeddings): RobertaEmbeddings(
          (word_embeddings): Embedding(50265, 768, padding_idx=1)
          (position_embeddings): Embedding(514, 768, padding_idx=1)
          (token_type_embeddings): Embedding(1, 768)
          (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
          (dropout): Dropout(p=0.1, inplace=False)
        )
        (encoder): RobertaEncoder(
          (layer): ModuleList(
            (0-11): 12 x RobertaLayer(
              (attention): RobertaAttention(
                (self): RobertaSelfAttention(
                  (query): lora.Linear(
                    (base_layer): Linear(in_features=768, out_features=768, bias=True)
                    (lora_dropout): ModuleDict(
                      (default): Dropout(p=0.1, inplace=False)
                    )
                    (lora_A): ModuleD

## Evaluation of fine-tuned model using LoRA

As you can see in the below cell, the accuracy of the base-model increased from about **13%** all the way up to **98.5%** after a lightway fine-tuning using **LoRA** technique.

In [25]:
print(f'model_LoRA accuracy is {Evaluator(model_finetuned, tokenized_dataset_test)}')

100%|██████████| 1115/1115 [01:00<00:00, 18.44it/s]

model_NotTuned accuracy is 98.47396768402155





### Visualization


In [24]:
# Make a dataframe with the predictions and the text and the labels

items_for_manual_review = tokenized_dataset_test.select(
    [0, 1, 22, 31, 43, 292, 448, 487]
)

results = trainer.predict(items_for_manual_review)
df = pd.DataFrame(
    {
        "sms": [item["sms"] for item in items_for_manual_review],
        "predictions": results.predictions.argmax(axis=1),
        "labels": results.label_ids,
    }
)
# Show all the cell
pd.set_option("display.max_colwidth", None)
df

Unnamed: 0,sms,predictions,labels
0,Yup... Hey then one day on fri we can ask miwa and jiayin take leave go karaoke \n,0,0
1,Happy new years melody!\n,0,0
2,PRIVATE! Your 2003 Account Statement for shows 800 un-redeemed S. I. M. points. Call 08715203652 Identifier Code: 42810 Expires 29/10/0\n,1,1
3,URGENT! We are trying to contact U. Todays draw shows that you have won a £800 prize GUARANTEED. Call 09050003091 from land line. Claim C52. Valid 12hrs only\n,1,1
4,I had askd u a question some hours before. Its answer\n,0,0
5,"SMS. ac JSco: Energy is high, but u may not know where 2channel it. 2day ur leadership skills r strong. Psychic? Reply ANS w/question. End? Reply END JSCO\n",0,1
6,"Yun ah.the ubi one say if ü wan call by tomorrow.call 67441233 look for irene.ere only got bus8,22,65,61,66,382. Ubi cres,ubi tech park.6ph for 1st 5wkg days.èn\n",0,0
7,Burger King - Wanna play footy at a top stadium? Get 2 Burger King before 1st Sept and go Large or Super with Coca-Cola and walk out a winner\n,1,1


In [30]:
## Pushed to HF hub
trainer.push_to_hub()

events.out.tfevents.1724707021.004237d149d5.492.1:   0%|          | 0.00/411 [00:00<?, ?B/s]

Upload 4 LFS files:   0%|          | 0/4 [00:00<?, ?it/s]

events.out.tfevents.1724706817.004237d149d5.492.0:   0%|          | 0.00/7.25k [00:00<?, ?B/s]

adapter_model.safetensors:   0%|          | 0.00/3.56M [00:00<?, ?B/s]

training_args.bin:   0%|          | 0.00/5.11k [00:00<?, ?B/s]

CommitInfo(commit_url='https://huggingface.co/amirhossein20n/spam_not_spam/commit/ebe7a2d4220b04d57d0a6763ea5dba0abc0b497d', commit_message='End of training', commit_description='', oid='ebe7a2d4220b04d57d0a6763ea5dba0abc0b497d', pr_url=None, pr_revision=None, pr_num=None)

### Fine-tuned Model on 🤗:
The script below gives you my LoRA fine-tuned RoBERTa model ready to use for Sequence Classification! Enjoy! 🍾
```
from peft import PeftModel, PeftConfig
from transformers import AutoModelForSequenceClassification

config = PeftConfig.from_pretrained("amirhossein20n/spam_not_spam")
base_model = AutoModelForSequenceClassification.from_pretrained("FacebookAI/roberta-base")
model = PeftModel.from_pretrained(base_model, "amirhossein20n/spam_not_spam")

```

