# Lightweight Fine-Tuning Project

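This project applies parameter-efficient fine-tuning to a small BERT model for sentiment classification. The choices are:

* PEFT technique: LoRA (rank 32) applied to the query, key, and value projections of the self-attention layers
* Model: `prajjwal1/bert-tiny` from Hugging Face, loaded with a two-label sequence-classification head
* Evaluation approach: accuracy on a 500-example subset of the IMDB test split, computed with `Trainer.evaluate()` before and after fine-tuning
* Fine-tuning dataset: a 10,000-example shuffled subset of the IMDB training split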

In [1]:
# This notebook contains the code together with the output it produced.

In [2]:
# Disabling logging to wandb so that we can avoid API key requests
import os
os.environ["WANDB_DISABLED"]="true"

#Installing necessary libraries
!pip install transformers datasets peft accelerate bitsandbytes evaluate scikit-learn

Defaulting to user installation because normal site-packages is not writeable


In [3]:
# Importing necessary libraries
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer, TrainingArguments, Trainer
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, TaskType

In [4]:
# Loading a dataset (IMDB, a lightweight dataset for sentiment analysis)
dataset=load_dataset("imdb")

model_name="prajjwal1/bert-tiny" #from hugging face
model=AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)
tokenizer=AutoTokenizer.from_pretrained(model_name)

Some weights of BertForSequenceClassification were not initialized from the model checkpoint at prajjwal1/bert-tiny and are newly initialized: ['classifier.weight', 'classifier.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [5]:
# Defining a tokenizer function
def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True, max_length=128)

# Tokenizing and preprocessing dataset and removing unnecessary columns
tokenized_datasets=dataset.map(tokenize_function, batched=True)
tokenized_datasets=tokenized_datasets.remove_columns(["text"])
tokenized_datasets.set_format("torch")

#Using a smaller data subset for fast execution (can be increased)
train_dataset=tokenized_datasets["train"].shuffle(seed=42).select(range(10000))
test_dataset=tokenized_datasets["test"].shuffle(seed=42).select(range(500))

Map:   0%|          | 0/50000 [00:00<?, ? examples/s]
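
To make the effect of `padding="max_length"`, `truncation=True`, and `max_length=128` concrete, here is a minimal pure-Python sketch of the shape transformation applied to each review. It is an illustration only, not the tokenizer's actual implementation: the helper `pad_or_truncate` is hypothetical, the token IDs are made up, the pad ID of 0 is an assumption based on the standard BERT vocabulary, and special tokens such as `[CLS]` and `[SEP]` are ignored.

```python
def pad_or_truncate(token_ids, max_length=128, pad_id=0):
    """Return (input_ids, attention_mask), each exactly max_length long.

    Hypothetical helper for illustration; pad_id=0 is an assumption.
    """
    kept = token_ids[:max_length]            # truncation: drop tokens past max_length
    n_pad = max_length - len(kept)           # padding needed to reach max_length
    input_ids = kept + [pad_id] * n_pad
    attention_mask = [1] * len(kept) + [0] * n_pad   # 1 = real token, 0 = padding
    return input_ids, attention_mask


# A short "review" of 5 made-up token IDs and a long one of 300
short_ids, short_mask = pad_or_truncate(list(range(1, 6)))
long_ids, long_mask = pad_or_truncate(list(range(1, 301)))

print(len(short_ids), sum(short_mask))   # padded length, number of real tokens
print(len(long_ids), sum(long_mask))     # truncated length, number of real tokens
```

Every example ends up with the same length, which lets the `Trainer` batch them without a custom collator. The cost is that any review longer than `max_length` tokens loses its tail, so raising `max_length` trades speed for more context.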

In [6]:
# Evaluating the base model's performance before fine-tuning
# Importing evaluate
import evaluate
accuracy = evaluate.load("accuracy")

# Defining methods to compute metrics
def compute_metrics(eval_pred):
    logits, labels=eval_pred
    predictions=torch.argmax(torch.tensor(logits), dim=-1)
    return accuracy.compute(predictions=predictions, references=labels)

training_args=TrainingArguments(
    output_dir="/tmp/LightWeightFineTuning",  # Saving outputs to /tmp to avoid storage issues
    evaluation_strategy="epoch",
    per_device_eval_batch_size=4,
    report_to="none"  # Disabling wandb reporting (api login is avoided)
)

trainer=Trainer(
    model=model,
    args=training_args,
    eval_dataset=test_dataset,
    compute_metrics=compute_metrics,
)

# Evaluating results
print("Evaluating the base model...")
baseline_results = trainer.evaluate()
print("Baseline results:", baseline_results)

Evaluating the base model...


Baseline results: {'eval_loss': 0.6978216767311096, 'eval_accuracy': 0.502, 'eval_runtime': 1.3037, 'eval_samples_per_second': 383.52, 'eval_steps_per_second': 95.88}
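
These baseline numbers are what one would expect from a classification head that has not been trained yet. As a quick sanity check (a sketch, assuming the two classes are roughly balanced in the 500-example evaluation subset), a model that assigns probability 0.5 to each class has a cross-entropy loss of ln 2 and an expected accuracy of 0.5:

```python
import math

num_classes = 2

# Cross-entropy of a uniform prediction: -log(1 / num_classes)
chance_loss = -math.log(1.0 / num_classes)
chance_accuracy = 1.0 / num_classes

# Values copied from the baseline evaluation output above
reported_loss = 0.6978216767311096
reported_accuracy = 0.502

print(f"chance loss:     {chance_loss:.4f}  (reported {reported_loss:.4f})")
print(f"chance accuracy: {chance_accuracy:.3f}  (reported {reported_accuracy:.3f})")
```

Both reported values sit very close to the chance level, so any later improvement can be attributed to fine-tuning rather than to the pretrained checkpoint already solving the task.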


## Performing Parameter-Efficient Fine-Tuning

TODO: In the cells below, create a PEFT model from your loaded model, run a training loop, and save the PEFT model weights.

In [7]:
# Importing AutoPeftModelForSequenceClassification, which is used later to reload the saved adapter
from peft import AutoPeftModelForSequenceClassification

# Reloading a fresh base model so that LoRA is applied to the untouched pretrained weights
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Setting up PEFT configuration for LoRA fine-tuning using BERT's self-attention layers
peft_config=LoraConfig(
    task_type=TaskType.SEQ_CLS,
    inference_mode=False,
    r=32,                # LoRA rank (can be increased for better accuracy)
    lora_alpha=32,       # Scaling factor
    lora_dropout=0.1,
    target_modules=["query", "key", "value"]
)

# Applying LoRA to the model
lora_model=get_peft_model(model, peft_config)
lora_model.print_trainable_parameters()

Some weights of BertForSequenceClassification were not initialized from the model checkpoint at prajjwal1/bert-tiny and are newly initialized: ['classifier.weight', 'classifier.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


trainable params: 49,668 || all params: 4,435,588 || trainable%: 1.1197613484390345
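
The trainable-parameter count can be largely reproduced by hand. The sketch below assumes that `prajjwal1/bert-tiny` has a hidden size of 128 and 2 encoder layers (neither figure appears in this notebook), and that each adapted linear layer receives two bias-free low-rank matrices, A of shape `r x hidden` and B of shape `hidden x r`.

```python
hidden_size = 128        # assumed hidden size of bert-tiny
num_layers = 2           # assumed number of encoder layers in bert-tiny
num_labels = 2
rank = 32                # r in LoraConfig
lora_alpha = 32          # lora_alpha in LoraConfig
targets_per_layer = 3    # query, key, value

# Each adapted linear layer gets A (rank x hidden) and B (hidden x rank)
params_per_module = rank * hidden_size + hidden_size * rank
lora_params = params_per_module * targets_per_layer * num_layers

# Classification head: weight (hidden x labels) plus bias (labels)
classifier_params = hidden_size * num_labels + num_labels

reported_trainable = 49_668              # from print_trainable_parameters() above
remainder = reported_trainable - lora_params

# LoRA scales the low-rank update by lora_alpha / r
scaling = lora_alpha / rank

print("LoRA adapter parameters:", lora_params)
print("Classifier head parameters:", classifier_params)
print("Reported trainable minus LoRA:", remainder)
print("LoRA scaling factor:", scaling)
```

Under these assumptions the adapters account for 49,152 of the 49,668 trainable parameters. The remaining 516 equal twice the size of the classification head, which would be consistent with the head being kept trainable for `TaskType.SEQ_CLS` and counted in two copies; that reading is an inference from the numbers, not something the output states.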


In [8]:
# Setting up training arguments for fine-tuning
training_args=TrainingArguments(
    output_dir="Output",
    evaluation_strategy="epoch",
    save_strategy="epoch",
    learning_rate=2e-5,     #Can be adjusted for improving accuracy
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    num_train_epochs=15,  # Increase epochs if needed for better performance
    weight_decay=0.01,
    report_to="none",  # Disable wandb logging
    logging_dir="logs",
    logging_steps=10,
    load_best_model_at_end=True,
    save_total_limit=1,
)

trainer=Trainer(
    model=lora_model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    compute_metrics=compute_metrics,
)

No label_names provided for model class `PeftModelForSequenceClassification`. Since `PeftModel` hides base models input arguments, if label_names is not given, label_names can't be set automatically within `Trainer`. Note that empty label_names list will be used instead.


In [9]:
print("Starting fine-tuning with LoRA...")
trainer.train()

Starting fine-tuning with LoRA...


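| Epoch | Training Loss | Validation Loss | Accuracy |
|-------|---------------|-----------------|----------|
| 1 | 0.6723 | 0.678589 | 0.586 |
| 2 | 0.6606 | 0.660749 | 0.606 |
| 3 | 0.5933 | 0.647073 | 0.634 |
| 4 | 0.4824 | 0.621853 | 0.66 |
| 5 | 0.4723 | 0.583984 | 0.698 |
| 6 | 0.6227 | 0.567887 | 0.704 |
| 7 | 0.5366 | 0.560878 | 0.718 |
| 8 | 0.4262 | 0.555339 | 0.724 |
| 9 | 0.5502 | 0.554332 | 0.724 |
| 10 | 0.6235 | 0.554843 | 0.724 |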


TrainOutput(global_step=37500, training_loss=0.5747052592086792, metrics={'train_runtime': 2248.5699, 'train_samples_per_second': 66.709, 'train_steps_per_second': 16.677, 'total_flos': 53335296000000.0, 'train_loss': 0.5747052592086792, 'epoch': 15.0})
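
The `global_step` and throughput figures in `TrainOutput` can be cross-checked against the training configuration. This is a small arithmetic sketch, assuming a single device and no gradient accumulation (neither is set explicitly in the notebook):

```python
import math

train_examples = 10_000   # size of the training subset selected earlier
batch_size = 4            # per_device_train_batch_size
epochs = 15               # num_train_epochs

steps_per_epoch = math.ceil(train_examples / batch_size)
total_steps = steps_per_epoch * epochs

# Throughput check using values copied from TrainOutput above
train_runtime = 2248.5699        # seconds
samples_per_second = 66.709
samples_processed = train_runtime * samples_per_second

print("steps per epoch:", steps_per_epoch)
print("total optimizer steps:", total_steps)
print("samples processed (approx.):", round(samples_processed))
```

The computed step count matches the reported `global_step=37500`, and the throughput implies about 150,000 processed samples, that is, 15 passes over 10,000 examples. The per-epoch log above lists epochs 1 through 10, while `TrainOutput` reports `'epoch': 15.0`.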

In [10]:
# Saving the fine-tuned model and tokenizer
save_directory="LoraFineTunedModel"
lora_model.save_pretrained(save_directory)
tokenizer.save_pretrained(save_directory)
print(f"Fine-tuned model and tokenizer saved successfully in {save_directory}!")

Fine-tuned model and tokenizer saved successfully in LoraFineTunedModel!


### ⚠️ IMPORTANT ⚠️

Because workspace storage is limited, do not store the model weights in the working directory. Always save them under `/tmp` instead; filling the workspace can cause crashes that are irrecoverable.

In [11]:
# Loading the saved model for inference
# AutoPeftModelForSequenceClassification restores the base model and attaches the saved LoRA adapter

from peft import AutoPeftModelForSequenceClassification

if os.path.exists(save_directory) and os.path.isdir(save_directory):
    try:
        # Loading the fine-tuned model using AutoPeftModelForSequenceClassification
        loaded_model=AutoPeftModelForSequenceClassification.from_pretrained(save_directory)
        print("Loaded fine-tuned model for inference.")
    except Exception as e:
        print("Error loading the model:", e)
else:
    print(f"Directory '{save_directory}' does not exist.")

Some weights of BertForSequenceClassification were not initialized from the model checkpoint at prajjwal1/bert-tiny and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Loaded fine-tuned model for inference.


In [12]:
# Swapping the reloaded model into the existing trainer so that evaluate() scores the saved adapter
trainer.model=loaded_model

In [13]:
fine_tuned_results=trainer.evaluate()
print("Fine-tuned results:", fine_tuned_results)

Fine-tuned results: {'eval_loss': 0.5505184531211853, 'eval_accuracy': 0.732, 'eval_runtime': 3.0106, 'eval_samples_per_second': 166.08, 'eval_steps_per_second': 41.52, 'epoch': 15.0}
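
To put the improvement in context, the sketch below compares the two accuracy figures and estimates the sampling uncertainty that comes from evaluating on only 500 examples. The normal-approximation interval used here is a choice made for illustration, not part of the original evaluation.

```python
import math

n_eval = 500              # size of the evaluation subset
baseline_acc = 0.502      # eval_accuracy before fine-tuning
fine_tuned_acc = 0.732    # eval_accuracy after fine-tuning

gain = fine_tuned_acc - baseline_acc

# Normal-approximation standard error of a proportion: sqrt(p * (1 - p) / n)
std_err = math.sqrt(fine_tuned_acc * (1 - fine_tuned_acc) / n_eval)
margin_95 = 1.96 * std_err   # half-width of an approximate 95% interval

print(f"absolute gain: {gain:.3f}")
print(f"standard error: {std_err:.4f}")
print(f"approx. 95% interval: {fine_tuned_acc - margin_95:.3f} to {fine_tuned_acc + margin_95:.3f}")
```

The gain of 23 percentage points is several times larger than the roughly 4-point margin, so the improvement is unlikely to be an artifact of the small evaluation subset. Evaluating on more of the 25,000-example IMDB test split would narrow the interval further.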
