# Lightweight Fine-Tuning Project

TODO: In this cell, describe your choices for each of the following

* PEFT technique: LoRA (Low-Rank Adaptation). LoRA freezes the pre-trained weights and trains only small low-rank update matrices, so it keeps the base model's knowledge intact while cutting the number of trainable parameters to a fraction of a percent of the full model.
* Model: bert-base-uncased. A widely used transformer encoder with roughly 110M parameters. It is small enough to fine-tune on a single GPU and performs well on text classification tasks.
* Evaluation approach: accuracy and loss on the validation split. Accuracy is the fraction of sentence pairs classified correctly; loss (cross-entropy) shows how closely the predicted probabilities match the true labels, so it can improve even when accuracy does not change.
* Fine-tuning dataset: GLUE MRPC (Microsoft Research Paraphrase Corpus). A binary sentence-pair task (paraphrase or not) with a few thousand training pairs, which makes it large enough to show the effect of PEFT and small enough to train quickly on modest hardware.
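
To make the LoRA choice above concrete, here is a minimal pure-Python sketch of a LoRA-adapted linear layer. The sizes (`d`, `r`, `alpha`) and helper names are illustrative assumptions, not BERT's real dimensions or the PEFT library's implementation. It follows the usual LoRA initialization, where `B` starts at zero so the adapted layer initially behaves exactly like the frozen one.

```python
import random

def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def lora_forward(W, A, B, x, alpha, r):
    """Return W x + (alpha / r) * B (A x). W is frozen; only A and B would train."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]

random.seed(0)
d, r, alpha = 16, 2, 8  # toy hidden size, rank, and scaling numerator (assumed values)
W = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]     # frozen weight, d x d
A = [[random.gauss(0, 0.01) for _ in range(d)] for _ in range(r)]  # trainable, r x d
B = [[0.0] * r for _ in range(d)]                                  # trainable, d x r, zero init
x = [1.0] * d

# With B at zero the adapter contributes nothing yet, so both outputs agree.
starts_identical = lora_forward(W, A, B, x, alpha, r) == matvec(W, x)

full_params = d * d          # parameters updated by full fine-tuning of this layer
lora_params = r * d + d * r  # parameters updated by LoRA for the same layer

print(starts_identical, full_params, lora_params)
```

The saving grows with the layer size: the LoRA parameter count scales with `r * d`, while full fine-tuning scales with `d * d`.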

In [1]:
! pip install scikit-learn peft datasets



## Loading and Evaluating a Foundation Model

TODO: In the cells below, load your chosen pre-trained Hugging Face model and evaluate its performance prior to fine-tuning. This step includes loading an appropriate tokenizer and dataset.

In [2]:
from transformers import AutoModelForSequenceClassification, AutoTokenizer, Trainer, TrainingArguments, EvalPrediction
from datasets import load_dataset
import numpy as np
from sklearn.metrics import accuracy_score

dataset_name = "glue"
task = "mrpc"
dataset = load_dataset(dataset_name, task)

model_name = "bert-base-uncased"
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

tokenizer = AutoTokenizer.from_pretrained(model_name)

Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [3]:
def evaluate_model(trainer, dataset):
    print("Evaluating...")
    results = trainer.evaluate(eval_dataset=dataset)
    print(results)

In [4]:
def compute_metrics(p: EvalPrediction):
    # p.predictions holds one row of logits per example; pick the highest-scoring class.
    preds = np.argmax(p.predictions, axis=1)
    return {"accuracy": accuracy_score(p.label_ids, preds)}

In [5]:
def preprocess_function(examples):
    # Encode each sentence pair as a single sequence, padded or truncated to 128 tokens.
    return tokenizer(examples["sentence1"], examples["sentence2"], padding="max_length", truncation=True, max_length=128)

tokenized_datasets = dataset.map(preprocess_function, batched=True)

Map:   0%|          | 0/1725 [00:00<?, ? examples/s]

In [6]:
training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    num_train_epochs=6,
    weight_decay=0.01,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["validation"],
    compute_metrics=compute_metrics,
)



In [7]:
evaluate_model(trainer, tokenized_datasets["validation"])

Evaluating...


{'eval_loss': 0.6499798893928528, 'eval_accuracy': 0.6838235294117647, 'eval_runtime': 1.9621, 'eval_samples_per_second': 207.94, 'eval_steps_per_second': 25.993}
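
The baseline accuracy above is worth a sanity check against a trivial classifier. The sketch below assumes the MRPC validation split contains 408 pairs, 279 of them labeled as paraphrases. Those counts are not stated in this notebook and should be verified against the loaded dataset. Under that assumption, a model that always predicts "paraphrase" reaches the same accuracy as the untrained classification head, which suggests that accuracy alone can hide a degenerate predictor and that F1 is a useful companion metric.

```python
# Assumed label counts for the MRPC validation split (verify with the loaded dataset).
n_paraphrase = 279      # label 1
n_not_paraphrase = 129  # label 0
n_total = n_paraphrase + n_not_paraphrase

# A classifier that always predicts label 1 gets every paraphrase right.
majority_accuracy = n_paraphrase / n_total

# F1 for the positive class: recall is 1.0, precision equals the positive rate.
precision = n_paraphrase / n_total
recall = 1.0
majority_f1 = 2 * precision * recall / (precision + recall)

print(f"majority-class accuracy: {majority_accuracy:.6f}")
print(f"majority-class F1:       {majority_f1:.6f}")
```

If the counts hold, the printed accuracy can be compared directly with `eval_accuracy` in the output above.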


## Performing Parameter-Efficient Fine-Tuning

TODO: In the cells below, create a PEFT model from your loaded model, run a training loop, and save the PEFT model weights.

In [8]:
from peft import LoraConfig, TaskType, get_peft_model

peft_config = LoraConfig(task_type=TaskType.SEQ_CLS,
                         inference_mode=False,
                         r=8,
                         lora_alpha=32,
                         lora_dropout=0.1)

peft_model = get_peft_model(model, peft_config)
peft_model.print_trainable_parameters()

trainable params: 296,450 || all params: 109,780,228 || trainable%: 0.2700395193203643
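
The trainable parameter count printed above can be reproduced by hand. This sketch assumes that PEFT's default LoRA targets for BERT are the query and value projections in each of the 12 layers, and that the 2-label classification head is also trained for `TaskType.SEQ_CLS`. Both are assumptions about library defaults rather than something this notebook sets explicitly. The total parameter count is copied from the printed output, not derived.

```python
hidden = 768           # BERT-base hidden size
layers = 12            # BERT-base encoder layers
r = 8                  # LoRA rank from the LoraConfig above
targets_per_layer = 2  # assumption: LoRA applied to query and value projections
num_labels = 2

per_module = r * hidden + hidden * r                  # A (r x hidden) plus B (hidden x r)
lora_params = per_module * targets_per_layer * layers
classifier_params = hidden * num_labels + num_labels  # weight plus bias of the head
trainable = lora_params + classifier_params

all_params = 109_780_228                              # copied from the printed output
trainable_pct = 100 * trainable / all_params

print(f"LoRA: {lora_params:,}  classifier: {classifier_params:,}  trainable: {trainable:,}")
print(f"trainable%: {trainable_pct:.4f}")
```

Under these assumptions almost all trainable parameters sit in the LoRA matrices, and only a small remainder belongs to the classification head.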


In [9]:
from transformers import DataCollatorWithPadding

training_args = TrainingArguments(
    output_dir="./peft_results",
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    num_train_epochs=6,
    weight_decay=0.01,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
)

data_collator = DataCollatorWithPadding(tokenizer=tokenizer)

trainer = Trainer(
    model=peft_model,
    args=training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["validation"],
    compute_metrics=compute_metrics,
    data_collator=data_collator
)



In [None]:
trainer.train()

| Epoch | Training Loss | Validation Loss | Accuracy |
|---|---|---|---|
| 1 | No log | 0.623004 | 0.683824 |
| 2 | No log | 0.618727 | 0.683824 |
| 3 | No log | 0.605378 | 0.683824 |
| 4 | No log | 0.591128 | 0.696078 |
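
The "No log" entries in the Training Loss column can be explained with a little step arithmetic. The sketch assumes the MRPC training split has 3,668 pairs, that training runs on a single device without gradient accumulation, and that `logging_steps` is left at the `Trainer` default of 500. None of these are set explicitly in this notebook, so treat them as assumptions.

```python
import math

train_examples = 3668  # assumption: size of the MRPC training split
batch_size = 32        # per_device_train_batch_size from the cell above
epochs = 6             # num_train_epochs from the cell above
logging_steps = 500    # assumption: Trainer default, not set in this notebook

steps_per_epoch = math.ceil(train_examples / batch_size)
total_steps = steps_per_epoch * epochs

# The training loss is first reported once the step counter reaches logging_steps.
first_epoch_with_log = math.ceil(logging_steps / steps_per_epoch)

print(steps_per_epoch, total_steps, first_epoch_with_log)
```

Passing a smaller `logging_steps` value to `TrainingArguments` would make the training loss appear from the first epoch.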


In [None]:
model_path = "./trained_model"
peft_model.save_pretrained(model_path)
tokenizer.save_pretrained(model_path)

## Performing Inference with a PEFT Model

TODO: In the cells below, load the saved PEFT model weights and evaluate the performance of the trained PEFT model. Be sure to compare the results to the results from prior to fine-tuning.

In [None]:
from transformers import AutoTokenizer
from peft import AutoPeftModelForSequenceClassification

loaded_model = AutoPeftModelForSequenceClassification.from_pretrained(model_path)
loaded_tokenizer = AutoTokenizer.from_pretrained(model_path)

In [None]:
from transformers import Trainer, TrainingArguments

peft_trainer = Trainer(
    model=loaded_model,
    args=TrainingArguments(output_dir="./peft_results", per_device_eval_batch_size=16),
    compute_metrics=compute_metrics,
)

evaluate_model(peft_trainer, tokenized_datasets["validation"])
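
To compare against the pre-fine-tuning results, the two metric dictionaries can be placed side by side. The helper `compare_results` below is a hypothetical name introduced for illustration. Because the final evaluation output is not shown here, the "after" values are a placeholder taken from the epoch 4 row of the training log, not from the saved model's evaluation. Replace them with the dictionary printed by the cell above.

```python
def compare_results(before, after, keys=("eval_loss", "eval_accuracy")):
    """Return {metric: (before, after, after - before)} for metrics present in both."""
    return {
        k: (before[k], after[k], after[k] - before[k])
        for k in keys
        if k in before and k in after
    }

# Baseline metrics copied from the evaluation before fine-tuning.
baseline = {"eval_loss": 0.6499798893928528, "eval_accuracy": 0.6838235294117647}

# Placeholder: epoch 4 validation numbers from the training log, NOT a final evaluation.
peft_epoch4 = {"eval_loss": 0.591128, "eval_accuracy": 0.696078}

comparison = compare_results(baseline, peft_epoch4)
for metric, (b, a, delta) in comparison.items():
    print(f"{metric}: {b:.4f} -> {a:.4f} ({delta:+.4f})")
```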