# Lightweight Fine-Tuning Project

This project makes the following choices:

* PEFT technique: LoRA
* Model: gpt2
* Evaluation approach: `Trainer.evaluate()` from Hugging Face, comparing the accuracy of the original foundation model with the accuracy of the fine-tuned model on the IMDb test split.
* Fine-tuning dataset: imdb

## Loading and Evaluating a Foundation Model

The cells below load the pre-trained GPT-2 model, its tokenizer, and the IMDb dataset, then evaluate the model's accuracy before any fine-tuning.

In [1]:
!pip install transformers datasets peft torch evaluate scikit-learn

Defaulting to user installation because normal site-packages is not writeable


In [2]:
import logging
import torch
import random
import numpy as np
from transformers import (
    GPT2ForSequenceClassification, 
    GPT2Tokenizer, 
    Trainer, 
    TrainingArguments, 
)
from datasets import load_dataset
from peft import (
    LoraConfig,
    PeftModel,
    get_peft_model, 
    TaskType
)
from evaluate import load as load_metric

In [3]:
# -----------------------------
# Configure Logging and Seeds
# -----------------------------
logging.basicConfig(level=logging.INFO)
torch.manual_seed(42)
random.seed(42)
np.random.seed(42)

In [4]:
# -----------------------------
# Step 1: Load the Model and Tokenizer
# -----------------------------

model_name = "gpt2"
model = GPT2ForSequenceClassification.from_pretrained(model_name, num_labels=2)
tokenizer = GPT2Tokenizer.from_pretrained(model_name)

# Fix the padding token issue (GPT-2 doesn't have a pad_token by default)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.pad_token_id = tokenizer.eos_token_id
model.config.pad_token_id = tokenizer.pad_token_id

Some weights of GPT2ForSequenceClassification were not initialized from the model checkpoint at gpt2 and are newly initialized: ['score.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
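
GPT-2's sequence classification head reads the hidden state of the last non-padding token in each row, which is why `model.config.pad_token_id` has to be set before batched inputs are used. The sketch below illustrates that lookup in plain Python for a right-padded row. The helper name `last_token_index` and the toy token ids are illustrative assumptions, and this is a simplified picture rather than the library's implementation.

```python
def last_token_index(input_ids, pad_token_id):
    """Return the index of the last non-padding token in a right-padded row."""
    for i in range(len(input_ids) - 1, -1, -1):
        if input_ids[i] != pad_token_id:
            return i
    return -1  # the row consists only of padding


PAD_ID = 50256  # GPT-2's EOS id, reused as the pad id in the cell above
toy_row = [464, 3807, 373, 1049, PAD_ID, PAD_ID]  # arbitrary ids plus padding

print(last_token_index(toy_row, PAD_ID))
```

Because the pad token is the EOS token here, a genuine EOS at the very end of a sequence would be indistinguishable from padding under this lookup.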


In [5]:
# -----------------------------
# Step 2: Load the Dataset
# -----------------------------
dataset = load_dataset("imdb")

In [6]:
# -----------------------------
# Step 3: Tokenize the Datasets
# -----------------------------
def tokenize_data(example):
    return tokenizer(example['text'], padding=True, truncation=True, max_length=128)

tokenized_dataset = dataset.map(tokenize_data, batched=True)
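
To make the effect of `truncation=True, max_length=128` and padding concrete, here is a minimal plain-Python sketch of truncating and right-padding a list of token ids while building the matching attention mask. It pads to a fixed length for simplicity, whereas `padding=True` in the cell above pads to the longest sequence in each mapped batch. The function name `pad_or_truncate` is hypothetical and not part of the tokenizer API.

```python
def pad_or_truncate(token_ids, max_length, pad_id):
    """Cut a sequence to max_length, then right-pad it and build an attention mask."""
    ids = list(token_ids[:max_length])   # truncation
    mask = [1] * len(ids)                # 1 marks real tokens
    n_pad = max_length - len(ids)
    return ids + [pad_id] * n_pad, mask + [0] * n_pad  # 0 marks padding


ids, mask = pad_or_truncate([11, 22, 33], max_length=5, pad_id=50256)
print(ids)   # [11, 22, 33, 50256, 50256]
print(mask)  # [1, 1, 1, 0, 0]
```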

In [7]:
# -----------------------------
# Step 4: Split datasets
# -----------------------------
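train_dataset = tokenized_dataset['train'].shuffle(seed=42) #.select(range(2000))
# NOTE: val_dataset is the same shuffled training split as train_dataset (the
# .select(...) subsets are commented out), so it is not a held-out set. It is
# not used below; all evaluation runs on test_dataset.
val_dataset = tokenized_dataset['train'].shuffle(seed=42) #.select(range(2000, 2500))
test_dataset = tokenized_dataset['test']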
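
If a genuinely held-out validation set is wanted, the training examples need to be divided into two disjoint groups. The following is a minimal sketch of such a split over example indices using only the standard library. The 10% validation fraction and the helper name `split_indices` are assumptions for illustration and are not what the notebook ran.

```python
import random


def split_indices(n_examples, val_fraction, seed):
    """Shuffle example indices and split them into disjoint train/validation lists."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)
    n_val = round(n_examples * val_fraction)
    return indices[n_val:], indices[:n_val]


# IMDb's train split has 25,000 reviews.
train_idx, val_idx = split_indices(25_000, val_fraction=0.1, seed=42)
print(len(train_idx), len(val_idx))
```

With the `datasets` library, index lists like these can be passed to `Dataset.select(...)`, and `Dataset.train_test_split(...)` offers a similar split in one call.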

In [8]:
# -----------------------------
# Step 5: Define the evaluation metric
# -----------------------------
accuracy_metric = load_metric("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy_metric.compute(predictions=predictions, references=labels)
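
As a rough illustration of what `compute_metrics` does, the sketch below takes the argmax over each row of logits and reports the fraction of predictions that match the labels. It uses plain Python and made-up toy logits; the helper names `argmax` and `accuracy` are illustrative and stand in for `np.argmax` and the `evaluate` accuracy metric.

```python
def argmax(row):
    """Index of the largest value in a list (first index on ties)."""
    return max(range(len(row)), key=row.__getitem__)


def accuracy(logits, labels):
    """Fraction of rows whose highest-scoring class equals the label."""
    predictions = [argmax(row) for row in logits]
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)


toy_logits = [[2.0, -1.0], [0.1, 0.3], [-0.5, 0.2], [1.2, 1.1]]
toy_labels = [0, 1, 0, 0]
print(accuracy(toy_logits, toy_labels))  # 0.75 (3 of 4 correct)
```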

In [9]:
# -----------------------------
# Step 6: Evaluate Pre-Trained Model
# -----------------------------
pretrained_args = TrainingArguments(
    output_dir="./results",
    per_device_eval_batch_size=8,
    do_train=False,
    do_eval=True,
    logging_steps=10,
)

pretrained_trainer = Trainer(
    model=model,
    args=pretrained_args,
    eval_dataset=test_dataset,
    compute_metrics=compute_metrics,
    tokenizer=tokenizer,
)

In [10]:
# -----------------------------
# Step 7: Evaluating the pre-trained model
# -----------------------------
print("Evaluating the pre-trained model:")
pretrained_eval = pretrained_trainer.evaluate()
print(f"Pre-trained model accuracy: {pretrained_eval['eval_accuracy']:.4f}")

Evaluating the pre-trained model:


Pre-trained model accuracy: 0.4991
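
An accuracy of about 0.50 is what chance-level guessing yields on a balanced binary task, which fits the warning above that `score.weight` was newly initialized. As a small sanity check, the sketch below computes the accuracy of a constant predictor on a balanced label set the size of the IMDb test split (12,500 reviews per class). The helper name `constant_accuracy` is illustrative.

```python
def constant_accuracy(labels, constant_prediction):
    """Accuracy of a classifier that always predicts the same class."""
    return sum(y == constant_prediction for y in labels) / len(labels)


balanced_labels = [0] * 12_500 + [1] * 12_500  # balanced binary test set

for cls in (0, 1):
    print(cls, constant_accuracy(balanced_labels, cls))
```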


## Performing Parameter-Efficient Fine-Tuning

The cell below wraps the loaded model in a LoRA adapter using PEFT and reports the number of trainable parameters.

In [11]:
# -----------------------------
# Step 8: Configure LoRA (Low-Rank Adaptation)
# -----------------------------
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    inference_mode=False,
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
)
peft_model = get_peft_model(model, lora_config)  # Apply LoRA to the GPT-2 model
peft_model.print_trainable_parameters()          # Print trainable parameters to verify LoRA setup



trainable params: 297,984 || all params: 124,737,792 || trainable%: 0.23888830740245906
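
The reported count can be roughly checked by hand. LoRA adds two low-rank matrices to each adapted weight, `A` with shape `r x d_in` and `B` with shape `d_out x r`. The sketch below assumes the adapter targets GPT-2's fused attention projection (768 to 2304) in each of the 12 transformer blocks, which is believed to be PEFT's default for GPT-2 but is not confirmed by the output above. The helper name `lora_params` is illustrative.

```python
def lora_params(d_in, d_out, r):
    """Parameters added by one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * d_in + d_out * r


# Assumption: one adapter on the 768 -> 2304 attention projection per block.
per_layer = lora_params(d_in=768, d_out=2304, r=8)
adapter_total = 12 * per_layer  # 12 transformer blocks in GPT-2 small

# Figures copied from the print_trainable_parameters() output above.
trainable, total = 297_984, 124_737_792
pct = 100 * trainable / total

print(per_layer)                  # 24576
print(adapter_total)              # 294912
print(trainable - adapter_total)  # 3072 parameters outside this estimate
print(round(pct, 4))              # 0.2389
```

The remaining 3,072 parameters fall outside this estimate. For `SEQ_CLS` tasks PEFT also keeps the classification head trainable, which likely accounts for them.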


## Performing Inference with a PEFT Model

The cells below fine-tune the LoRA model, save the adapter weights, reload them onto a fresh GPT-2 base model, and evaluate the result against the pre-fine-tuning baseline.

In [12]:
# -----------------------------
# Step 8: Fine-Tuning LoRA Model
# -----------------------------
fine_tuning_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    logging_steps=10,
    save_strategy="epoch",
    save_total_limit=2,
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    load_best_model_at_end=True,
    metric_for_best_model="accuracy",
    seed=42,    
)

trainer = Trainer(
    model=peft_model,
    args=fine_tuning_args,
    train_dataset=train_dataset,
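    # NOTE: the test split is also used for per-epoch evaluation, so the
    # checkpoint chosen by load_best_model_at_end is selected on test data.
    eval_dataset=test_dataset,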
    compute_metrics=compute_metrics,
    tokenizer=tokenizer,
)
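
To show what `lr_scheduler_type="cosine"` does to `learning_rate=5e-5`, here is a minimal sketch of a cosine decay with optional linear warmup, written to mirror the usual half-cosine formula. It assumes no warmup (the `TrainingArguments` default) and 9,375 total optimizer steps, which is 3 epochs of 25,000 examples at batch size 8 on one device. The function name `cosine_lr` is illustrative, not a library call.

```python
import math


def cosine_lr(step, total_steps, base_lr, warmup_steps=0):
    """Learning rate at a given step under linear warmup then half-cosine decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


TOTAL_STEPS = 9_375  # assumed: 3 epochs * 25,000 examples / batch size 8
for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, cosine_lr(step, TOTAL_STEPS, base_lr=5e-5))
```

The rate starts at the full 5e-5, passes roughly half of it at the midpoint, and decays to about zero by the final step.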


In [13]:
# -----------------------------
# Step 8: Fine Tuning the Model with LoRA
# -----------------------------
trainer.train()

| Epoch | Training Loss | Validation Loss | Accuracy |
|-------|---------------|-----------------|----------|
| 1 | 0.5574 | 0.366842 | 0.85752 |
| 2 | 0.207 | 0.322548 | 0.87056 |
| 3 | 0.4633 | 0.327165 | 0.87084 |


TrainOutput(global_step=9375, training_loss=0.4249788884480794, metrics={'train_runtime': 2100.0283, 'train_samples_per_second': 35.714, 'train_steps_per_second': 4.464, 'total_flos': 4916389478400000.0, 'train_loss': 0.4249788884480794, 'epoch': 3.0})
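
The numbers in the training output can be cross-checked with a little arithmetic. The sketch below recomputes the step count and throughput from the dataset size, batch size, and runtime, and picks the epoch that `metric_for_best_model="accuracy"` would favor. It assumes a single device with no gradient accumulation; the variable names are illustrative.

```python
import math

n_train, batch_size, epochs = 25_000, 8, 3  # assumes one device, no accumulation
steps_per_epoch = math.ceil(n_train / batch_size)
total_steps = steps_per_epoch * epochs

train_runtime = 2100.0283  # seconds, from the TrainOutput above
samples_per_second = n_train * epochs / train_runtime
steps_per_second = total_steps / train_runtime

# (epoch, accuracy) pairs from the per-epoch evaluation log above.
history = [(1, 0.85752), (2, 0.87056), (3, 0.87084)]
best_epoch, best_accuracy = max(history, key=lambda entry: entry[1])

print(total_steps)                   # 9375
print(round(samples_per_second, 3))  # 35.714
print(round(steps_per_second, 3))    # 4.464
print(best_epoch, best_accuracy)     # 3 0.87084
```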

In [14]:
# -----------------------------
# Step 9: Save the fine-tuned LoRA model
# -----------------------------
peft_model.save_pretrained("./lora_gpt2_imdb")

In [15]:
# -----------------------------
# Step 10: Load the fine-tuned LoRA model
# -----------------------------

loaded_model = GPT2ForSequenceClassification.from_pretrained(model_name, num_labels=2)
loaded_peft_model = PeftModel.from_pretrained(loaded_model, "./lora_gpt2_imdb")
loaded_peft_model.config.pad_token_id = tokenizer.pad_token_id

Some weights of GPT2ForSequenceClassification were not initialized from the model checkpoint at gpt2 and are newly initialized: ['score.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [16]:
# -----------------------------
# Step 11: Initialize the Trainer for evaluation
# -----------------------------
trainer = Trainer(
    model=loaded_peft_model,
    args=fine_tuning_args,
    eval_dataset=test_dataset,
    compute_metrics=compute_metrics,
    tokenizer=tokenizer,
)

In [17]:
# -----------------------------
# Step 12: Evaluate the fine-tuned model
# -----------------------------
print("\nEvaluating the fine-tuned model:")
fine_tuned_eval = trainer.evaluate()
print(f"Fine-tuned model accuracy: {fine_tuned_eval['eval_accuracy']:.4f}")


Evaluating the fine-tuned model:


Fine-tuned model accuracy: 0.8708


In [18]:
# -----------------------------
# Step 13: Comparison
# -----------------------------
print("\nComparison of performance:")
print(f"Pre-trained model accuracy: {pretrained_eval['eval_accuracy']:.4f}")
print(f"Fine-tuned model accuracy: {fine_tuned_eval['eval_accuracy']:.4f}")


Comparison of performance:
Pre-trained model accuracy: 0.4991
Fine-tuned model accuracy: 0.8708
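
To put the two accuracies side by side, the short sketch below computes the gain in percentage points and the relative reduction in error rate. The inputs are the rounded accuracies printed above; the variable names are illustrative.

```python
baseline_acc = 0.4991   # pre-trained model, from the output above
finetuned_acc = 0.8708  # LoRA fine-tuned model, from the output above

gain_points = (finetuned_acc - baseline_acc) * 100
error_reduction = 1 - (1 - finetuned_acc) / (1 - baseline_acc)

print(f"Absolute gain: {gain_points:.2f} percentage points")
print(f"Relative error reduction: {error_reduction:.1%}")
```

As noted in the trainer setup, the best checkpoint was selected on the same test split, so the fine-tuned figure may be slightly optimistic.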
