## Goal: Lightweight Fine-Tuning Project

# AI Workflow Decisions

PEFT technique: LoRA (Low-Rank Adaptation), as it is lightweight and compatible with most Hugging Face models  
Model: distilbert-base-uncased (a compact, fast, and widely supported transformer model for classification tasks)  
Evaluation approach: Accuracy using the Hugging Face Trainer API  
Fine-tuning dataset: sms_spam (binary classification of SMS messages as spam or not spam)  
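
To make the LoRA choice above concrete, here is a minimal NumPy sketch of the idea: the pretrained weight `W` stays frozen and only a low-rank pair `A`, `B` is trained, with the update scaled by `alpha / r`. The shapes, rank, scaling value, and the helper name `lora_forward` are illustrative assumptions. This is not the PEFT library's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative shapes: one 768x768 projection (768 is DistilBERT's hidden size).
d_out, d_in = 768, 768
r, alpha = 8, 16  # rank and scaling factor, chosen for illustration

W = rng.normal(size=(d_out, d_in))          # frozen pretrained weight
A = rng.normal(scale=0.01, size=(r, d_in))  # trainable, small random init
B = np.zeros((d_out, r))                    # trainable, zero init so the update starts at 0


def lora_forward(x):
    """Frozen projection plus the scaled low-rank update (hypothetical helper)."""
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)


x = rng.normal(size=(4, d_in))

# With B initialised to zero, the adapted layer reproduces the frozen layer exactly.
assert np.allclose(lora_forward(x), x @ W.T)

full_params = W.size
lora_params = A.size + B.size
print(f"full: {full_params:,}  lora: {lora_params:,}  ratio: {lora_params / full_params:.2%}")
```

Only `A` and `B` would receive gradients during training, which is why the trainable parameter count stays small compared with updating `W` directly.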


In [None]:
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification, Trainer, TrainingArguments
import numpy as np
from transformers import DataCollatorWithPadding

# Loading the dataset and creating an 80/20 train/test split
dataset = load_dataset("sms_spam", split="train").train_test_split(test_size=0.2, seed=42)

# Labels: 0 = not spam, 1 = spam
id2label = {0: "not spam", 1: "spam"}
label2id = {"not spam": 0, "spam": 1}

# Loading tokenizer
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

# Tokenization function (padding is left to DataCollatorWithPadding, which pads each batch dynamically)
def tokenize_function(examples):
    return tokenizer(examples["sms"], truncation=True)

tokenized_dataset = dataset.map(tokenize_function, batched=True)

# Loading base model
base_model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased",
    num_labels=2,
    id2label=id2label,
    label2id=label2id
)



Dataset: 5,574 examples, split into 4,459 training and 1,115 test examples.

Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'pre_classifier.weight', 'pre_classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


### Creating a Copy of the Baseline Model Before Applying LoRA

In [None]:
import copy

# Save a separate copy of the base model for clean baseline evaluation
base_model_copy = copy.deepcopy(base_model)



### Baseline Model Evals

In [None]:

# Computing metrics
def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    preds = np.argmax(predictions, axis=1)
    return {"accuracy": (preds == labels).mean()}



# Baseline evaluation trainer
baseline_trainer = Trainer(
    model=base_model_copy,
    args=TrainingArguments(
        output_dir="./baseline_eval",
        per_device_eval_batch_size=16,
        report_to="none"
    ),
    eval_dataset=tokenized_dataset["test"],
    tokenizer=tokenizer,
    data_collator=DataCollatorWithPadding(tokenizer),
    compute_metrics=compute_metrics
)

# Evaluate baseline model before fine-tuning
base_results = baseline_trainer.evaluate()


You're using a DistilBertTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
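
As a quick sanity check of the accuracy metric used above, the sketch below calls the same `compute_metrics` logic on a hand-made batch. The logits and labels are made-up toy values, and the plain tuple stands in for the predictions and labels that the Trainer passes in.

```python
import numpy as np


def compute_metrics(eval_pred):
    # Same logic as in the notebook cell: argmax over class logits, then mean agreement.
    predictions, labels = eval_pred
    preds = np.argmax(predictions, axis=1)
    return {"accuracy": (preds == labels).mean()}


# Toy values (assumed for illustration): four examples, two classes.
toy_logits = np.array([[2.0, -1.0], [0.1, 0.3], [-0.5, 1.5], [1.2, 0.4]])
toy_labels = np.array([0, 1, 0, 0])

result = compute_metrics((toy_logits, toy_labels))
print(result)  # three of the four predictions match, so accuracy is 0.75
```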


## Performing Parameter-Efficient Fine-Tuning


In [None]:
from peft import LoraConfig, get_peft_model

# Creating LoRA config
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_lin", "v_lin"],  # DistilBERT's attention query and value projections
    lora_dropout=0.1,
    bias="none",
    task_type="SEQ_CLS"
)


# Applying LoRA to base model
peft_model = get_peft_model(base_model, lora_config)

# Showing trainable parameters
peft_model.print_trainable_parameters()

# Training PEFT model
peft_trainer = Trainer(
    model=peft_model,
    args=TrainingArguments(
        output_dir="/tmp/peft_model",
        per_device_train_batch_size=16,
        per_device_eval_batch_size=16,
        learning_rate=2e-5,
        num_train_epochs=2,
        evaluation_strategy="epoch",
        save_strategy="epoch",
        load_best_model_at_end=True,
        report_to="none"
    ),
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["test"],
    tokenizer=tokenizer,
    data_collator=DataCollatorWithPadding(tokenizer),
    compute_metrics=compute_metrics
)

# Starting training
peft_trainer.train()

# Saving the LoRA adapter and tokenizer to the local "lora_model" directory
peft_model.save_pretrained("lora_model")
tokenizer.save_pretrained("lora_model")



trainable params: 1,331,716 || all params: 67,694,596 || trainable%: 1.967241225577297


| Epoch | Training Loss | Validation Loss | Accuracy |
|---|---|---|---|
| 1 | No log | 0.106433 | 0.973991 |
| 2 | 0.192300 | 0.067775 | 0.980269 |


('lora_model/tokenizer_config.json',
 'lora_model/special_tokens_map.json',
 'lora_model/vocab.txt',
 'lora_model/added_tokens.json',
 'lora_model/tokenizer.json')
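
The reported trainable-parameter count can be partly reconstructed by hand. The sketch below assumes DistilBERT-base's published dimensions (hidden size 768, 6 transformer layers) and counts the LoRA matrices added to `q_lin` and `v_lin`. The reading of the remainder is an inference from the arithmetic, not something the notebook output confirms.

```python
# Assumed DistilBERT-base dimensions (from the published architecture).
hidden_size = 768
num_layers = 6

r = 8                   # LoRA rank used in the config above
targets_per_layer = 2   # q_lin and v_lin


def lora_params_per_module(d_in, d_out, rank):
    """A is (rank x d_in) and B is (d_out x rank); hypothetical helper for illustration."""
    return rank * d_in + d_out * rank


adapter_params = num_layers * targets_per_layer * lora_params_per_module(hidden_size, hidden_size, r)

# Classification head: pre_classifier (768 -> 768) and classifier (768 -> 2), with biases.
head_params = (hidden_size * hidden_size + hidden_size) + (hidden_size * 2 + 2)

reported_trainable = 1_331_716  # from the print_trainable_parameters() output
remainder = reported_trainable - adapter_params

print(f"LoRA adapter params: {adapter_params:,}")
print(f"Head params:         {head_params:,}")
print(f"Remainder:           {remainder:,} (= {remainder / head_params:.0f} x head)")
```

The LoRA matrices alone account for 147,456 parameters. The remainder equals exactly twice the head size, which would be consistent with the classification head being trained alongside the adapter and counted together with a saved copy, though that interpretation is an assumption here.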

## Performing Inference with a PEFT Model



In [None]:
from peft import AutoPeftModelForSequenceClassification

# Loading the saved PEFT model

reloaded_model = AutoPeftModelForSequenceClassification.from_pretrained("lora_model")


# New trainer with reloaded model
peft_reloaded_trainer = Trainer(
    model=reloaded_model,
    args=TrainingArguments(
        output_dir="/tmp/peft_model_eval",
        per_device_eval_batch_size=16,
        report_to="none"
    ),
    eval_dataset=tokenized_dataset["test"],
    tokenizer=tokenizer,
    data_collator=DataCollatorWithPadding(tokenizer),
    compute_metrics=compute_metrics
)

# Evaluating fine-tuned PEFT model
print("📊 PEFT model performance:")
peft_reloaded_trainer.evaluate()


Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'pre_classifier.weight', 'pre_classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


📊 PEFT model performance:


{'eval_loss': 0.06777454912662506,
 'eval_accuracy': 0.9802690582959641,
 'eval_runtime': 5.9472,
 'eval_samples_per_second': 187.483,
 'eval_steps_per_second': 11.77}
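
For single-message inference, the model's output logits still need to be turned into a label. The following is a minimal sketch of that last step only, using NumPy and the notebook's `id2label` mapping. The logits are made-up values standing in for model output, and `softmax` and `decode` are hypothetical helpers, not part of Transformers or PEFT.

```python
import numpy as np

id2label = {0: "not spam", 1: "spam"}


def softmax(logits):
    # Subtract the row maximum for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def decode(logits):
    """Map each row of logits to (label, probability of that label)."""
    probs = softmax(np.asarray(logits, dtype=float))
    ids = probs.argmax(axis=-1)
    return [(id2label[int(i)], float(p[i])) for i, p in zip(ids, probs)]


# Made-up logits for two messages (assumed for illustration).
example_logits = [[3.1, -2.4], [-1.0, 2.0]]
decoded = decode(example_logits)
for label, confidence in decoded:
    print(f"{label}: {confidence:.3f}")
```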

### Baseline Model vs. PEFT Model Evaluation Comparison

In [None]:
print("🔍 Comparing base vs. fine-tuned model:")
base_results = baseline_trainer.evaluate()   # untouched deep copy of the base model
peft_results = peft_reloaded_trainer.evaluate()

print("Base Model Accuracy:", base_results["eval_accuracy"])
print("PEFT Model Accuracy:", peft_results["eval_accuracy"])


🔍 Comparing base vs. fine-tuned model:
Base Model Accuracy: 0.8080717488789237
PEFT Model Accuracy: 0.9802690582959641


# What The Results Mean

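The baseline's ~81% accuracy should be read with caution. Its classification head was randomly initialized (see the warning emitted when the model was loaded), so it had no task-specific training, and because sms_spam is dominated by non-spam messages, a score in this range can arise without any real ability to separate the two classes.

After parameter-efficient fine-tuning with LoRA for two epochs, accuracy on the held-out test split rose from ~81% to ~98%.

This indicates that LoRA can substantially improve model performance while training only about 2% of the parameters, a major advantage when working with limited compute.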
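
To put the two accuracy figures in context, the sketch below compares them against a trivial "always predict not spam" classifier and computes the relative reduction in error rate. The class counts are an assumption taken from the commonly cited statistics for the full SMS Spam Collection (4,827 ham, 747 spam). The actual balance of the 1,115-example test split is not shown in the notebook.

```python
# Assumed class counts for the full dataset (not taken from the notebook output).
ham_count, spam_count = 4827, 747
total = ham_count + spam_count

# Accuracy of a classifier that always predicts the majority class.
majority_accuracy = max(ham_count, spam_count) / total

# Accuracies reported in the comparison cell above.
base_accuracy = 0.8080717488789237
peft_accuracy = 0.9802690582959641

# Relative reduction in error rate from the baseline to the fine-tuned model.
error_reduction = 1 - (1 - peft_accuracy) / (1 - base_accuracy)

print(f"Majority-class accuracy: {majority_accuracy:.3f}")
print(f"Base model accuracy:     {base_accuracy:.3f}")
print(f"PEFT model accuracy:     {peft_accuracy:.3f}")
print(f"Relative error reduction: {error_reduction:.1%}")
```

Under that assumption, the untuned baseline sits below the majority-class rate, while the fine-tuned model is well above it.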



