# Lightweight Fine-Tuning Project

This project makes the following choices:

* PEFT technique: LoRA
* Model: BERT base uncased
* Evaluation approach: Hugging Face `Trainer.evaluate()` with accuracy as the metric
* Fine-tuning dataset: SetFit/bbc-news

## Loading and Evaluating a Foundation Model

The cells below load the pre-trained Hugging Face model together with its tokenizer and the dataset, then evaluate the model's performance prior to fine-tuning.

In [1]:
from peft import LoraConfig, get_peft_model, AutoPeftModelForSequenceClassification
from transformers import AutoModelForSequenceClassification, AutoTokenizer, Trainer, TrainingArguments, DataCollatorWithPadding
from datasets import load_dataset
import evaluate
import numpy as np

In [2]:
dataset = load_dataset("SetFit/bbc-news")
dataset


DatasetDict({
    train: Dataset({
        features: ['text', 'label', 'label_text'],
        num_rows: 1225
    })
    test: Dataset({
        features: ['text', 'label', 'label_text'],
        num_rows: 1000
    })
})

In [3]:
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=5)
model

Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


BertForSequenceClassification(
  (bert): BertModel(
    (embeddings): BertEmbeddings(
      (word_embeddings): Embedding(30522, 768, padding_idx=0)
      (position_embeddings): Embedding(512, 768)
      (token_type_embeddings): Embedding(2, 768)
      (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (encoder): BertEncoder(
      (layer): ModuleList(
        (0-11): 12 x BertLayer(
          (attention): BertAttention(
            (self): BertSdpaSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e

In [4]:
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize_function(examples):
    # Pad or truncate every text to the model's maximum length (512 tokens for BERT)
    return tokenizer(examples["text"], padding="max_length", truncation=True)

# Select the 100-example subsets first so that only those rows are tokenized
tokenized_datasets = {
    "test": dataset["test"].select(range(100)).map(tokenize_function, batched=True),
    "train": dataset["train"].select(range(100)).map(tokenize_function, batched=True),
}

tokenized_datasets["test"][0]

{'text': 'carry on star patsy rowlands dies actress patsy rowlands  known to millions for her roles in the carry on films  has died at the age of 71.  rowlands starred in nine of the popular carry on films  alongside fellow regulars sid james  kenneth williams and barbara windsor. she also carved out a successful television career  appearing for many years in itv s well-loved comedy bless this house. rowlands died in hove on saturday morning  her agent said.  born in january 1934  rowlands won a scholarship to the guildhall school of speech and drama scholarship when she was just 15.  after spending several years at the players theatre in london  she made her film debut in 1963 in tom jones  directed by tony richardson. she made her first carry on film in 1969 where she appeared in carry on again doctor. rowlands played the hard-done-by wife or the put-upon employee as a regular carry on star. she also appeared in carry on at your convenience  carry on matron and carry on loving  as we
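
Padding strategy affects how much computation each batch needs: `padding="max_length"` pads every example to the model limit, whereas a collator such as `DataCollatorWithPadding` pads only to the longest example in a batch. The following is a toy, pure-Python sketch of that difference. The token ids and the `pad_to` helper are hypothetical stand-ins, not tokenizer output or a library function.

```python
# Toy token-id sequences standing in for tokenizer output (hypothetical values).
batch = [[101, 7592, 102], [101, 7592, 2088, 999, 102]]
PAD_ID = 0        # matches padding_idx=0 in the embedding layer printed above
MAX_LENGTH = 512  # matches the 512 position embeddings printed above


def pad_to(sequences, length, pad_id=PAD_ID):
    """Right-pad every sequence with pad_id up to `length` (illustrative helper)."""
    return [seq + [pad_id] * (length - len(seq)) for seq in sequences]


fixed = pad_to(batch, MAX_LENGTH)                    # like padding="max_length"
dynamic = pad_to(batch, max(len(s) for s in batch))  # like per-batch padding

print(len(fixed[0]), len(dynamic[0]))  # 512 5
```

For long news articles that are truncated at 512 tokens anyway, the two strategies end up close; the gap matters mostly for short texts.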

In [5]:

# Load evaluation metrics
accuracy_metric = evaluate.load("accuracy")

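def compute_metrics(eval_pred):
    # eval_pred holds the raw logits and the reference labels
    logits, labels = eval_pred
    # The predicted class is the index of the largest logit
    predictions = np.argmax(logits, axis=-1)
    # compute() already returns a dict of the form {"accuracy": value}
    return accuracy_metric.compute(predictions=predictions, references=labels)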

In [6]:
# Evaluation-only Trainer: no training schedule is needed to call evaluate()
eval_trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="evaluation_results",
        per_device_eval_batch_size=32,
    ),
    eval_dataset=tokenized_datasets["test"],
    compute_metrics=compute_metrics,
    data_collator=DataCollatorWithPadding(tokenizer=tokenizer),
)

eval_trainer.evaluate()



{'eval_loss': 1.6189169883728027,
 'eval_model_preparation_time': 0.0008,
 'eval_accuracy': 0.17,
 'eval_runtime': 3.8295,
 'eval_samples_per_second': 26.113,
 'eval_steps_per_second': 1.045}
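
With five classes, uniform random guessing would score 0.20, so a baseline accuracy of 0.17 on 100 examples is roughly chance level, which is what one would expect from a freshly initialized classification head. The sketch below reproduces the argmax-then-compare logic of `compute_metrics` on made-up logits; the numbers are hypothetical and only illustrate the calculation.

```python
import numpy as np

# Hypothetical logits for four examples over five classes (not model output).
logits = np.array([
    [0.1, 0.2, 0.3, 2.0, 0.0],
    [0.5, 1.5, 0.2, 0.1, 0.0],
    [0.0, 0.1, 3.0, 0.2, 0.1],
    [1.2, 0.3, 0.1, 0.0, 0.4],
])
labels = np.array([3, 1, 2, 4])

# Predicted class = index of the largest logit in each row.
predictions = np.argmax(logits, axis=-1)
accuracy = float((predictions == labels).mean())

# Expected accuracy of uniform random guessing over the classes.
chance = 1.0 / logits.shape[-1]

print(accuracy, chance)  # 0.75 0.2
```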

In [7]:
import pandas as pd

# Show true and predicted labels side by side for the evaluation subset
df = tokenized_datasets["test"].to_pandas()[["text", "label"]]
predictions = eval_trainer.predict(tokenized_datasets["test"])
df["predicted_label"] = np.argmax(predictions.predictions, axis=1)
df.head(20)

| | text | label | predicted_label |
|---|---|---|---|
| 0 | carry on star patsy rowlands dies actress pats... | 3 | 3 |
| 1 | sydney to host north v south game sydney will ... | 2 | 1 |
| 2 | uk coal plunges into deeper loss shares in uk ... | 1 | 1 |
| 3 | blair joins school sailing trip the prime mini... | 4 | 1 |
| 4 | bath faced with tindall ultimatum mike tindall... | 2 | 1 |
| 5 | banker loses sexism claim a former executive a... | 1 | 1 |
| 6 | hewitt survives nalbandian epic home favourite... | 2 | 3 |
| 7 | saab to build cadillacs in sweden general moto... | 1 | 3 |
| 8 | blair pledges unity to labour mps tony blair h... | 4 | 1 |
| 9 | minimum rate for foster parents foster carers ... | 4 | 1 |


## Performing Parameter-Efficient Fine-Tuning

The cells below create a PEFT model from the loaded model, run a training loop, and save the PEFT model weights.
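
As background for the configuration in the next cell, here is a minimal NumPy sketch of the standard LoRA update, using the same `r=4` and `lora_alpha=32`. It assumes the usual formulation in which the frozen weight `W` is augmented by a low-rank product `B @ A` scaled by `lora_alpha / r`, with `B` starting at zero. The matrices are random stand-ins, and details such as bias and dropout are left out, so this is not PEFT's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, lora_alpha = 768, 4, 32   # hidden size, rank, and alpha used in this notebook
scaling = lora_alpha / r        # 8.0

W = rng.normal(size=(d, d))     # stand-in for a frozen query projection weight
A = rng.normal(size=(r, d))     # trainable, randomly initialized
B = np.zeros((d, r))            # trainable, starts at zero

x = rng.normal(size=(d,))
base_out = W @ x
# LoRA adds a scaled low-rank correction on top of the frozen projection.
lora_out = base_out + scaling * (B @ (A @ x))

# While B is still zero, the adapted layer matches the frozen layer.
print(np.allclose(base_out, lora_out))  # True
print(A.size + B.size)                  # 6144
```

Only `A` and `B` receive gradients, which is why the adapter adds just 6,144 trainable values per targeted layer instead of the 589,824 weights in a full 768 x 768 matrix.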

In [8]:
lora_config = LoraConfig(
    task_type="SEQ_CLS",
    r=4,
    lora_alpha=32,
    lora_dropout=0.01,
    target_modules=["query"]
)
lora_model = get_peft_model(model, lora_config)

lora_model.print_trainable_parameters()

trainer = Trainer(
    model=lora_model,
    args=TrainingArguments(
        output_dir="./results",
        learning_rate=1e-2,
        per_device_train_batch_size=16,
        per_device_eval_batch_size=16,
        num_train_epochs=5,
        weight_decay=0.01,
        # `eval_strategy` is the current name of the deprecated `evaluation_strategy`
        eval_strategy="epoch",
        save_strategy="epoch",
        load_best_model_at_end=True,
    ),
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["test"],
    # `processing_class` replaces the deprecated `tokenizer` argument of Trainer
    processing_class=tokenizer,
    compute_metrics=compute_metrics,
    data_collator=DataCollatorWithPadding(tokenizer),
)

trainer.train()

No label_names provided for model class `PeftModelForSequenceClassification`. Since `PeftModel` hides base models input arguments, if label_names is not given, label_names can't be set automatically within `Trainer`. Note that empty label_names list will be used instead.


trainable params: 77,573 || all params: 109,563,658 || trainable%: 0.0708


| Epoch | Training Loss | Validation Loss | Accuracy |
|---|---|---|---|
| 1 | No log | 1.854709 | 0.23 |
| 2 | No log | 1.842651 | 0.23 |
| 3 | No log | 1.183451 | 0.52 |
| 4 | No log | 0.942878 | 0.66 |
| 5 | No log | 0.848748 | 0.71 |


TrainOutput(global_step=35, training_loss=1.6449669974190848, metrics={'train_runtime': 952.6286, 'train_samples_per_second': 0.525, 'train_steps_per_second': 0.037, 'total_flos': 131678223360000.0, 'train_loss': 1.6449669974190848, 'epoch': 5.0})
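
The trainable-parameter count printed above (77,573) can be reproduced by hand. The sketch below assumes that LoRA adds one `r x 768` and one `768 x r` matrix to the `query` projection in each of the 12 encoder layers, and that the classification head (weight plus bias) is also trained, which is consistent with the reported figure.

```python
hidden, r, num_layers, num_labels = 768, 4, 12, 5

# LoRA adds A (r x hidden) and B (hidden x r) to each targeted Linear layer.
lora_per_layer = r * hidden + hidden * r       # 6,144
lora_total = lora_per_layer * num_layers       # one query projection per layer

# Classification head: weight (num_labels x hidden) plus bias (num_labels).
classifier = num_labels * hidden + num_labels  # 3,845

trainable = lora_total + classifier
print(f"{trainable:,}")  # 77,573
```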

In [9]:
# Writes the adapter weights and adapter config to the "bert-bbc-lora" directory
lora_model.save_pretrained("bert-bbc-lora")

## Performing Inference with a PEFT Model

The cells below load the saved PEFT model weights and evaluate the fine-tuned model on the same 100 test examples used for the baseline. Accuracy rises from 0.17 before fine-tuning to 0.71 after.

In [10]:
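# The "newly initialized" warning below refers to the base checkpoint, which is loaded first;
# the saved adapter is then applied on top and restores the trained classifier head.
loaded_lora_model = AutoPeftModelForSequenceClassification.from_pretrained("bert-bbc-lora", num_labels=5)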

Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [11]:
# Evaluation-only Trainer: no training schedule is needed to call evaluate()
eval_lora_trainer = Trainer(
    model=loaded_lora_model,
    args=TrainingArguments(
        output_dir="evaluation_results",
        per_device_eval_batch_size=32,
    ),
    eval_dataset=tokenized_datasets["test"],
    compute_metrics=compute_metrics,
    data_collator=DataCollatorWithPadding(tokenizer=tokenizer),
)

eval_lora_trainer.evaluate()

No label_names provided for model class `PeftModelForSequenceClassification`. Since `PeftModel` hides base models input arguments, if label_names is not given, label_names can't be set automatically within `Trainer`. Note that empty label_names list will be used instead.


{'eval_loss': 0.8487485647201538,
 'eval_model_preparation_time': 0.0062,
 'eval_accuracy': 0.71,
 'eval_runtime': 117.194,
 'eval_samples_per_second': 0.853,
 'eval_steps_per_second': 0.034}

In [12]:
import pandas as pd

# Show true and predicted labels side by side for the fine-tuned model
df = tokenized_datasets["test"].to_pandas()[["text", "label"]]
predictions = eval_lora_trainer.predict(tokenized_datasets["test"])
df["predicted_label"] = np.argmax(predictions.predictions, axis=1)
df.head(20)

| | text | label | predicted_label |
|---|---|---|---|
| 0 | carry on star patsy rowlands dies actress pats... | 3 | 3 |
| 1 | sydney to host north v south game sydney will ... | 2 | 2 |
| 2 | uk coal plunges into deeper loss shares in uk ... | 1 | 1 |
| 3 | blair joins school sailing trip the prime mini... | 4 | 4 |
| 4 | bath faced with tindall ultimatum mike tindall... | 2 | 2 |
| 5 | banker loses sexism claim a former executive a... | 1 | 0 |
| 6 | hewitt survives nalbandian epic home favourite... | 2 | 2 |
| 7 | saab to build cadillacs in sweden general moto... | 1 | 3 |
| 8 | blair pledges unity to labour mps tony blair h... | 4 | 4 |
| 9 | minimum rate for foster parents foster carers ... | 4 | 3 |
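
To make the before/after comparison concrete, the following sketch tallies the ten rows displayed in the two prediction tables. The lists are copied from those tables, and the `n_correct` helper is illustrative. Ten rows are far too few to stand in for the full 100-example evaluation, so this only shows how individual predictions changed.

```python
# Labels and predictions copied from the ten rows displayed in the two tables.
labels   = [3, 2, 1, 4, 2, 1, 2, 1, 4, 4]
baseline = [3, 1, 1, 1, 1, 1, 3, 3, 1, 1]  # before fine-tuning
tuned    = [3, 2, 1, 4, 2, 0, 2, 3, 4, 3]  # after LoRA fine-tuning


def n_correct(preds, refs):
    """Count positions where the prediction equals the reference label."""
    return sum(p == r for p, r in zip(preds, refs))


# Rows the fine-tuned model gets right that the baseline got wrong, and vice versa.
fixed = [i for i, (b, t, y) in enumerate(zip(baseline, tuned, labels)) if b != y and t == y]
regressed = [i for i, (b, t, y) in enumerate(zip(baseline, tuned, labels)) if b == y and t != y]

print(n_correct(baseline, labels), n_correct(tuned, labels))  # 3 7
print(fixed, regressed)                                       # [1, 3, 4, 6, 8] [5]
```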
