# Lightweight Fine-Tuning Project

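This project fine-tunes a small pre-trained model for news-topic classification. The choices are:

* PEFT technique: LoRA (rank `r=4`, `lora_alpha=16`, `lora_dropout=0.1`) applied to the attention and feed-forward layers
* Model: `albert-base-v2` with a 4-class sequence-classification head
* Evaluation approach: accuracy on a held-out test split, computed with the Hugging Face `Trainer.evaluate()` before and after fine-tuning
* Fine-tuning dataset: AG News (`fancyzhx/ag_news`), with the 120,000-example train split divided 80/20 into 96,000 training and 24,000 test examples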

## Loading and Evaluating a Foundation Model

TODO: In the cells below, load your chosen pre-trained Hugging Face model and evaluate its performance prior to fine-tuning. This step includes loading an appropriate tokenizer and dataset.

In [30]:
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification, DataCollatorWithPadding, Trainer, TrainingArguments
from peft import LoraConfig, TaskType, PeftModelForSequenceClassification
import torch
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score

In [2]:
# Load the AG News dataset
# See: https://huggingface.co/datasets/fancyzhx/ag_news
dataset = load_dataset("fancyzhx/ag_news", split="train").train_test_split(test_size=0.2, shuffle=True, seed=23)
splits = ["train", "test"]

# View the dataset characteristics
dataset["train"]

Dataset({
    features: ['text', 'label'],
    num_rows: 96000
})

In [3]:
# Tokenize the dataset using the ALBERT tokenizer
tokenizer = AutoTokenizer.from_pretrained("albert-base-v2")
tokenizer.pad_token = tokenizer.eos_token

tokenized_dataset = {}
for split in splits:
    tokenized_dataset[split] = dataset[split].map(
        lambda x: tokenizer(x["text"], truncation=True), batched=True
    )

# Inspect the available columns in the dataset
tokenized_dataset["train"]

Dataset({
    features: ['text', 'label', 'input_ids', 'token_type_ids', 'attention_mask'],
    num_rows: 96000
})
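
Because the tokenizer call above sets only `truncation=True`, the tokenized examples keep their individual lengths, and padding is deferred to `DataCollatorWithPadding`, which pads each batch to its longest member. The following is a minimal pure-Python sketch of that idea; the helper `pad_batch`, the default `pad_id=0`, and the toy token ids are hypothetical and are not the library's implementation.

```python
def pad_batch(sequences, pad_id=0):
    """Pad every sequence to the length of the longest one in the batch."""
    max_len = max(len(seq) for seq in sequences)
    # Right-pad the token ids with pad_id
    input_ids = [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]
    # 1 marks a real token, 0 marks padding the model should ignore
    attention_mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in sequences]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Two toy "tokenized" examples of different lengths (made-up ids)
batch = pad_batch([[2, 45, 67, 3], [2, 45, 3]])
print(batch["input_ids"])
print(batch["attention_mask"])
```

Padding per batch rather than to a global maximum keeps short batches small, which matters when most news snippets are far shorter than the model's length limit.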

In [4]:
# Define the foundation model
foundation_model = AutoModelForSequenceClassification.from_pretrained(
    "albert-base-v2",
    num_labels=4,
    id2label={0: "World", 1: "Sports", 2: "Business", 3: "Sci/Tech"},
    label2id={"World": 0, "Sports": 1, "Business": 2, "Sci/Tech": 3},
)

foundation_model.config.pad_token_id = tokenizer.pad_token_id

for param in foundation_model.parameters():
    param.requires_grad = True

Some weights of AlbertForSequenceClassification were not initialized from the model checkpoint at albert-base-v2 and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [5]:
# Define the compute metrics function
def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    predictions = np.argmax(predictions, axis=1)
    return {"accuracy": (predictions == labels).mean()}

# Prepare the training arguments
training_args = TrainingArguments(
    output_dir="./results",
    per_device_eval_batch_size=16,
    do_train=False,
    disable_tqdm=True,  # Disable progress bar
)

# Initialize the Trainer
trainer = Trainer(
    model=foundation_model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["test"],
    tokenizer=tokenizer,
    data_collator=DataCollatorWithPadding(tokenizer=tokenizer),
    compute_metrics=compute_metrics,
)
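
To see what `compute_metrics` returns, it can be run on a small hand-made batch. This sketch repeats the function from the cell above and feeds it toy logits and labels; the arrays `toy_logits` and `toy_labels` are invented for illustration and are not taken from the dataset.

```python
import numpy as np


def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    # Pick the highest-scoring class for each example
    predictions = np.argmax(predictions, axis=1)
    return {"accuracy": (predictions == labels).mean()}


# Four examples, four classes; each row holds one example's logits
toy_logits = np.array([
    [2.0, 0.1, 0.3, 0.1],  # predicted class 0
    [0.2, 1.5, 0.1, 0.4],  # predicted class 1
    [0.1, 0.2, 0.3, 3.0],  # predicted class 3
    [1.0, 0.9, 0.2, 0.1],  # predicted class 0
])
toy_labels = np.array([0, 1, 2, 0])

result = compute_metrics((toy_logits, toy_labels))
print(result)  # three of the four predictions match their labels
```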

In [6]:
# Evaluate the model
evaluation_results = trainer.evaluate()
print(evaluation_results)

{'eval_loss': 1.4078623056411743,
 'eval_model_preparation_time': 0.001,
 'eval_accuracy': 0.2693333333333333,
 'eval_runtime': 79.5406,
 'eval_samples_per_second': 301.733,
 'eval_steps_per_second': 18.858}
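
These baseline numbers are close to what random guessing would give, which is expected because the classification head is newly initialized. As a rough check, assuming the four classes are approximately balanced in the test split, chance accuracy is 1/4 and the cross-entropy of a uniform prediction is ln 4:

```python
import math

num_classes = 4

# Accuracy of guessing uniformly at random over balanced classes
chance_accuracy = 1 / num_classes

# Cross-entropy loss when every class is assigned probability 1/4
uniform_loss = math.log(num_classes)

print(f"chance accuracy:    {chance_accuracy:.4f}")
print(f"uniform-guess loss: {uniform_loss:.4f}")
```

The measured accuracy of about 0.269 and loss of about 1.408 sit near these reference values, so the untrained head is effectively guessing.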

## Performing Parameter-Efficient Fine-Tuning

TODO: In the cells below, create a PEFT model from your loaded model, run a training loop, and save the PEFT model weights.

In [20]:
# Configure the PEFT model
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    inference_mode=False,
    r=4,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=[
        "albert.encoder.albert_layer_groups.0.albert_layers.0.attention.query",
        "albert.encoder.albert_layer_groups.0.albert_layers.0.attention.key",
        "albert.encoder.albert_layer_groups.0.albert_layers.0.attention.value",
        "albert.encoder.albert_layer_groups.0.albert_layers.0.attention.dense",
        "albert.encoder.albert_layer_groups.0.albert_layers.0.ffn",
        "albert.encoder.albert_layer_groups.0.albert_layers.0.ffn_output",
    ],
    bias="none",
)

model = AutoModelForSequenceClassification.from_pretrained("albert-base-v2", num_labels=4,
    id2label={0: "World", 1: "Sports", 2: "Business", 3: "Sci/Tech"},
    label2id={"World": 0, "Sports": 1, "Business": 2, "Sci/Tech": 3},
)


model.config.pad_token_id = model.config.eos_token_id

# Initialize the PEFT model
peft_model = PeftModelForSequenceClassification(model, lora_config)
peft_model.print_trainable_parameters()

Some weights of AlbertForSequenceClassification were not initialized from the model checkpoint at albert-base-v2 and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


trainable params: 58,372 || all params: 11,745,032 || trainable%: 0.4970
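
The trainable-parameter count can be reproduced by hand. The sketch below assumes the standard `albert-base-v2` dimensions (hidden size 768, feed-forward size 3072) and assumes that, for a sequence-classification task, the classification head stays trainable alongside the LoRA matrices; the helper `lora_params` is hypothetical.

```python
# Assumed albert-base-v2 dimensions; r and num_labels come from the cells above
hidden, intermediate, r, num_labels = 768, 3072, 4, 4


def lora_params(in_features, out_features, rank):
    """LoRA adds A (rank x in_features) and B (out_features x rank) per layer."""
    return rank * (in_features + out_features)


# (in_features, out_features) of each targeted linear layer
targets = {
    "query": (hidden, hidden),
    "key": (hidden, hidden),
    "value": (hidden, hidden),
    "dense": (hidden, hidden),
    "ffn": (hidden, intermediate),
    "ffn_output": (intermediate, hidden),
}

lora_total = sum(lora_params(i, o, r) for i, o in targets.values())
classifier = hidden * num_labels + num_labels  # weight matrix plus bias
trainable = lora_total + classifier

print(lora_total, classifier, trainable)
print(f"trainable%: {100 * trainable / 11_745_032:.4f}")
```

Under these assumptions the total matches the 58,372 trainable parameters reported above. ALBERT shares one set of layer weights across all of its layers, so adapting a single shared layer affects every layer of the encoder.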


In [21]:
# Define the compute_metrics function
def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    predictions = np.argmax(predictions, axis=1)
    return {"accuracy": (predictions == labels).mean()}

# Define the training arguments
peft_training_args = TrainingArguments(
    output_dir="./data/results/peft",
    # Set the learning rate
    learning_rate=2e-5,
    # Set the per device train batch size and eval batch size
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    # Evaluate and save the model after each epoch
    eval_strategy="epoch",
    save_strategy="epoch",
    num_train_epochs=1,
    weight_decay=0.01,
    warmup_ratio=0.1,
    fp16=True,
    load_best_model_at_end=True,
    gradient_accumulation_steps=8,
    logging_steps=500, 
)

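# Optional: build an optimizer and scheduler by hand.
# NOTE: these are NOT passed to the Trainer below, which creates its own AdamW optimizer
# and linear schedule from peft_training_args. To use them instead, pass
# optimizers=(optimizer, scheduler) to the Trainer.
from torch.optim import AdamW
from transformers import get_linear_schedule_with_warmup

optimizer = AdamW(peft_model.parameters(), lr=2e-5)
# One optimizer step consumes batch_size * gradient_accumulation_steps samples (single device assumed)
samples_per_step = peft_training_args.per_device_train_batch_size * peft_training_args.gradient_accumulation_steps
total_steps = (len(tokenized_dataset["train"]) // samples_per_step) * int(peft_training_args.num_train_epochs)
scheduler = get_linear_schedule_with_warmup(optimizer, num_warmup_steps=0, num_training_steps=total_steps)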


# The HuggingFace Trainer class handles the training and eval loop for PyTorch for us.
# Read more about it here https://huggingface.co/docs/transformers/main_classes/trainer
peft_trainer = Trainer(
    model=peft_model,
    args=peft_training_args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["test"],
    tokenizer=tokenizer,
    data_collator=DataCollatorWithPadding(tokenizer=tokenizer),
    compute_metrics=compute_metrics,
)

peft_trainer.train()

{'loss': 0.9998, 'grad_norm': 13.405620574951172, 'learning_rate': 1.4844444444444445e-05, 'epoch': 0.33}
{'loss': 0.4289, 'grad_norm': 16.54842185974121, 'learning_rate': 7.451851851851852e-06, 'epoch': 0.67}
{'loss': 0.377, 'grad_norm': 22.814571380615234, 'learning_rate': 4.444444444444445e-08, 'epoch': 1.0}
{'eval_loss': 0.371499240398407, 'eval_accuracy': 0.8874583333333333, 'eval_runtime': 106.3049, 'eval_samples_per_second': 225.766, 'eval_steps_per_second': 14.11, 'epoch': 1.0}
{'train_runtime': 1438.6459, 'train_samples_per_second': 66.729, 'train_steps_per_second': 1.043, 'train_loss': 0.6018951110839844, 'epoch': 1.0}

TrainOutput(global_step=1500, training_loss=0.6018951110839844, metrics={'train_runtime': 1438.6459, 'train_samples_per_second': 66.729, 'train_steps_per_second': 1.043, 'total_flos': 467276400806400.0, 'train_loss': 0.6018951110839844, 'epoch': 1.0})
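
The logged learning rates follow the shape expected from `warmup_ratio=0.1` with a linear decay. Here is an idealized sketch of that schedule, using the 1,500 total steps reported in the log; the function `linear_schedule_lr` is a hypothetical helper, not the `Trainer`'s own code.

```python
def linear_schedule_lr(step, base_lr=2e-5, total_steps=1500, warmup_ratio=0.1):
    """Idealized linear warmup to base_lr, then linear decay to zero."""
    warmup_steps = round(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: ramp up linearly from 0 to base_lr
        return base_lr * step / warmup_steps
    # Decay phase: ramp down linearly from base_lr to 0 at the final step
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)


for step in (150, 500, 1000, 1500):
    print(step, f"{linear_schedule_lr(step):.3e}")
```

The logged values (about 1.48e-05 at step 500 and 7.45e-06 at step 1000) are close to this curve but not identical. The small offsets may come from when the logger samples the rate, so treat the sketch as an approximation.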

In [22]:
peft_model.save_pretrained("/models/albert-base-v2-peft-ag-news")

## Performing Inference with a PEFT Model

TODO: In the cells below, load the saved PEFT model weights and evaluate the performance of the trained PEFT model. Be sure to compare the results to the results from prior to fine-tuning.

In [23]:
from peft import AutoPeftModelForSequenceClassification
ag_news_model = AutoPeftModelForSequenceClassification.from_pretrained(
    "/models/albert-base-v2-peft-ag-news",
    num_labels=4,
    id2label={0: "World", 1: "Sports", 2: "Business", 3: "Sci/Tech"},
    label2id={"World": 0, "Sports": 1, "Business": 2, "Sci/Tech": 3},
)

ag_news_model.config.pad_token_id = ag_news_model.config.eos_token_id

Some weights of AlbertForSequenceClassification were not initialized from the model checkpoint at albert-base-v2 and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [24]:
# Evaluate the PEFT model
peft_trainer = Trainer(
    model=ag_news_model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["test"],
    tokenizer=tokenizer,
    data_collator=DataCollatorWithPadding(tokenizer=tokenizer),
    compute_metrics=compute_metrics,
)

peft_evaluation_results = peft_trainer.evaluate()
display(peft_evaluation_results)

{'eval_loss': 0.371499240398407, 'eval_model_preparation_time': 0.001, 'eval_accuracy': 0.8874583333333333, 'eval_runtime': 106.6825, 'eval_samples_per_second': 224.967, 'eval_steps_per_second': 14.06}
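
The reloaded PEFT model reproduces the end-of-training evaluation exactly, and it can be compared directly with the baseline. A small sketch of that comparison follows, with the metric values copied from the two evaluation outputs in this notebook and the test-set size taken from the 80/20 split:

```python
# Values copied from the evaluation outputs before and after fine-tuning
baseline = {"eval_loss": 1.4078623056411743, "eval_accuracy": 0.2693333333333333}
peft = {"eval_loss": 0.371499240398407, "eval_accuracy": 0.8874583333333333}
num_test_examples = 24000  # 20% of the 120,000-example AG News train split

accuracy_gain = peft["eval_accuracy"] - baseline["eval_accuracy"]
baseline_correct = round(baseline["eval_accuracy"] * num_test_examples)
peft_correct = round(peft["eval_accuracy"] * num_test_examples)

print(f"accuracy: {baseline['eval_accuracy']:.4f} -> {peft['eval_accuracy']:.4f} (+{accuracy_gain:.4f})")
print(f"correct:  {baseline_correct} -> {peft_correct} of {num_test_examples}")
print(f"loss:     {baseline['eval_loss']:.4f} -> {peft['eval_loss']:.4f}")
```

One epoch of LoRA training lifts accuracy from roughly chance level to about 88.7% while updating under 0.5% of the model's parameters.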


In [27]:
# Select 10 items from the test dataset
test_samples = tokenized_dataset["test"].select(range(10))

In [33]:
from IPython.display import display, HTML

# test_samples (the first 10 test items) was selected in the previous cell

# Function to make predictions
def make_predictions(model, tokenizer, samples, device):
    """Return the predicted class id for each sample's text."""
    model.to(device)
    model.eval()  # disable dropout (including LoRA dropout) for deterministic inference
    # Re-tokenize the raw text so the whole batch is padded to a common length
    inputs = tokenizer(samples["text"], padding=True, truncation=True, return_tensors="pt").to(device)
    with torch.no_grad():
        outputs = model(**inputs)
    predictions = torch.argmax(outputs.logits, dim=-1)
    return predictions.cpu().numpy()

# Set device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Make predictions using foundation_model
foundation_predictions = make_predictions(foundation_model, tokenizer, test_samples, device)

# Make predictions using the PEFT model reloaded from disk
peft_predictions = make_predictions(ag_news_model, tokenizer, test_samples, device)

# Create a DataFrame to display the results
df = pd.DataFrame({
    "Text": test_samples["text"],
    "True Label": test_samples["label"],
    "Foundation Model Prediction": foundation_predictions,
    "PEFT Model Prediction": peft_predictions
})

# Display the DataFrame as a scrollable element
html = df.to_html(classes='table table-striped', index=False)
display(HTML(f'<div style="max-height: 400px; overflow-y: scroll;">{html}</div>'))

| Text | True Label | Foundation Model Prediction | PEFT Model Prediction |
|---|---|---|---|
| Foulke, Wakefield are happy to pitch in As much as the Red Sox relished thumping the Yankees, they took special satisfaction from the outings of two key pitchers who recently have struggled, Keith Foulke and Tim Wakefield . | 1 | 2 | 1 |
| Nfld. oil spill more serious than originally thought ST. JOHN #39;S, Nfld. -- An oil spill on Newfoundland #39;s Grand Banks is larger than first reported. The Canada-Newfoundland Offshore Petroleum Board now says that up to 1,000 barrels of oil may have spilled into | 2 | 3 | 2 |
| Shell to scrap twin board structure Oil giant Royal Dutch/Shell today said it would to scrap its twin board structure as it battles to restore confidence in the wake of its reserves crisis. | 2 | 2 | 2 |
| Death toll in Asian quake disaster passes 144,000 (AFP) AFP - The number of people killed in the massive earthquake and tidal waves that hit Indian Ocean shorelines a week ago passed 144,000. | 0 | 2 | 0 |
| NASA and Russia OK Next Space Station Crew In a separate development, the head of the RKK Energia company, which builds the Soyuz spacecraft, said his company was planning to send a new space shuttle to the space station between 2010 and 2012, depending on funding. | 3 | 2 | 3 |
| Sony To Sell MP3 Music Players in Japan Sony #39;s two MP3 players, the NW-HD3 with a 20-gigabyte hard-disk drive, and the NW-E99, with a built-in 1-gigabyte flash-memory chip, go on sale December 10 in Japan. | 3 | 2 | 3 |
| Dolphins Top Rams for 1st Win of Season (AP) AP - That noise the crowd made Sunday during the Miami Dolphins' victory at Pro Player Stadium was faintly familiar from seasons past. It's called cheering. | 1 | 3 | 1 |
| Second Sunni Cleric Gunned Down in Iraq (Reuters) Reuters - Gunmen killed a Sunni Muslim\cleric in the city of Miqdadiya Tuesday, the second such\killing in Iraq in as many days, witnesses and hospital\officials said. | 0 | 2 | 0 |
| Trump #39;s Casino Operation Files for Bankruptcy Donald J. Trump #39;s struggling casino operation has filed for bankruptcy reorganization, according to court documents, effectively commencing a recapitalization plan that was announced last month. | 2 | 3 | 2 |
| Lampard strikes as England cruise past Wales MANCHESTER: A stunning goal by captain David Beckham guided England to a convincing 2-0 victory over Wales in their World Cup qualifier at his former Old Trafford home yesterday. | 1 | 2 | 1 |
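
The ten rows above can be summarized numerically. This sketch copies the three label columns from the table and reuses the `id2label` mapping from the model configuration; the helper `accuracy` is hypothetical.

```python
id2label = {0: "World", 1: "Sports", 2: "Business", 3: "Sci/Tech"}

# Columns copied from the ten sample rows above
true_labels = [1, 2, 2, 0, 3, 3, 1, 0, 2, 1]
foundation_preds = [2, 3, 2, 2, 2, 2, 3, 2, 3, 2]
peft_preds = [1, 2, 2, 0, 3, 3, 1, 0, 2, 1]


def accuracy(preds, labels):
    """Fraction of predictions that equal their true label."""
    return sum(p == t for p, t in zip(preds, labels)) / len(labels)


foundation_acc = accuracy(foundation_preds, true_labels)
peft_acc = accuracy(peft_preds, true_labels)

print(f"foundation model: {foundation_acc:.1f}")
print(f"PEFT model:       {peft_acc:.1f}")
print([id2label[p] for p in peft_preds])
```

On this small sample the foundation model predicts mostly "Business" and "Sci/Tech" regardless of content, while the PEFT model matches every true label. Ten examples are too few to estimate accuracy, so the full test-set figures above remain the meaningful comparison.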
