# Lightweight Fine-Tuning Project

TODO: In this cell, describe your choices for each of the following

* PEFT technique: Training proceeds in two stages. First, the DistilBERT encoder (`base_model.base_model`) is frozen so that only the classification head is trained, for two epochs. All parameters are then unfrozen and the model is fine-tuned for three more epochs (the `Trainer` default when `num_train_epochs` is not set). Because the second stage updates every weight, it is full fine-tuning rather than an adapter-based PEFT method such as LoRA.
* Model: The base model used for sequence classification is "distilbert-base-uncased." This same base model is utilized for both the initial training and the PEFT process.
* Evaluation approach: The evaluation is conducted using the Trainer class from the Hugging Face transformers library. The evaluation strategy is set to "epoch," meaning evaluation occurs after each training epoch. The evaluation metrics include loss, accuracy, runtime, samples per second, steps per second, and epoch.
* Fine-tuning dataset: The fine-tuning dataset is derived from the Rotten Tomatoes dataset (`cornell-movie-review-data/rotten_tomatoes`), using its train and test splits. Each split is shuffled with a fixed seed and reduced to a 1,000-sample subset, then tokenized with the distilbert-base-uncased tokenizer.

## Loading and Evaluating a Foundation Model

TODO: In the cells below, load your chosen pre-trained Hugging Face model and evaluate its performance prior to fine-tuning. This step includes loading an appropriate tokenizer and dataset.

In [1]:
!pip install datasets #for colab



In [2]:
# importing dependencies.

import torch
from transformers import AutoTokenizer, DataCollatorWithPadding, TrainingArguments, Trainer, AutoModelForSequenceClassification
from datasets import load_dataset
import numpy as np
import pandas as pd

In [3]:
# Loading dataset (train and test splits)
splits = ["train", "test"]
dataset = {split: load_dataset("cornell-movie-review-data/rotten_tomatoes", split=split) for split in splits}

for split in splits:
    dataset[split] = dataset[split].shuffle(seed=50).select(range(1000))


Generating train split:   0%|          | 0/8530 [00:00<?, ? examples/s]

Generating validation split:   0%|          | 0/1066 [00:00<?, ? examples/s]

Generating test split:   0%|          | 0/1066 [00:00<?, ? examples/s]

In [4]:
dataset

{'train': Dataset({
     features: ['text', 'label'],
     num_rows: 1000
 }),
 'test': Dataset({
     features: ['text', 'label'],
     num_rows: 1000
 })}

In [5]:
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")


In [6]:
def preprocess(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True)

tokenized_dataset = {}
for split in splits:
    tokenized_dataset[split] = dataset[split].map(preprocess, batched=True)
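
Since `padding="max_length"` is passed without an explicit `max_length`, the tokenizer pads every review to its model maximum (512 tokens for distilbert-base-uncased), so the `DataCollatorWithPadding` used later has nothing left to pad. The sketch below uses plain Python and hypothetical token counts (not measured from the dataset) to illustrate how much padding that adds compared with padding each batch only to its longest member.

```python
# Hypothetical token counts for six short reviews (illustrative, not measured).
token_lengths = [18, 25, 31, 22, 40, 27]
MODEL_MAX_LENGTH = 512  # maximum sequence length of distilbert-base-uncased
BATCH_SIZE = 6          # matches per_device_train_batch_size in the first run


def padded_tokens_static(lengths, max_length):
    """Tokens processed when every sequence is padded to max_length."""
    return len(lengths) * max_length


def padded_tokens_dynamic(lengths, batch_size):
    """Tokens processed when each batch is padded to its longest sequence."""
    total = 0
    for start in range(0, len(lengths), batch_size):
        batch = lengths[start:start + batch_size]
        total += len(batch) * max(batch)
    return total


static_total = padded_tokens_static(token_lengths, MODEL_MAX_LENGTH)
dynamic_total = padded_tokens_dynamic(token_lengths, BATCH_SIZE)
print(f"static padding:  {static_total} tokens")
print(f"dynamic padding: {dynamic_total} tokens")
```

Under these assumed lengths, static padding processes more than ten times as many tokens. Dropping `padding="max_length"` from `preprocess` and letting the collator pad each batch would be one way to get the dynamic behaviour, which may shorten the long runtimes reported below.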


In [7]:
base_model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased",
    num_labels=2,
    id2label={0: "NEGATIVE", 1: "POSITIVE"},
    label2id={"NEGATIVE": 0, "POSITIVE": 1},
)

# Freeze the DistilBERT encoder; the newly initialized
# pre_classifier and classifier layers stay trainable.
for param in base_model.base_model.parameters():
    param.requires_grad = False

base_model

Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


DistilBertForSequenceClassification(
  (distilbert): DistilBertModel(
    (embeddings): Embeddings(
      (word_embeddings): Embedding(30522, 768, padding_idx=0)
      (position_embeddings): Embedding(512, 768)
      (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (transformer): Transformer(
      (layer): ModuleList(
        (0-5): 6 x TransformerBlock(
          (attention): MultiHeadSelfAttention(
            (dropout): Dropout(p=0.1, inplace=False)
            (q_lin): Linear(in_features=768, out_features=768, bias=True)
            (k_lin): Linear(in_features=768, out_features=768, bias=True)
            (v_lin): Linear(in_features=768, out_features=768, bias=True)
            (out_lin): Linear(in_features=768, out_features=768, bias=True)
          )
          (sa_layer_norm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
          (ffn): FFN(
            (dropout): Dropout(p=0.1, inplace=False)
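
To see what freezing `base_model.base_model` leaves trainable, the parameter counts can be worked out by hand from the layer shapes. This is a back-of-the-envelope sketch: the 768-wide layers, the 30,522-token vocabulary, the 512 positions, and the six blocks come from the printout above, while the feed-forward width of 3,072 and the `Linear(768, 768)` and `Linear(768, 2)` head shapes are assumptions taken from the standard distilbert-base-uncased configuration, since the printout is cut off before those layers.

```python
def linear(n_in, n_out):
    """Parameter count of a Linear layer with bias."""
    return n_in * n_out + n_out


def layer_norm(dim):
    """Parameter count of a LayerNorm with elementwise affine (weight + bias)."""
    return 2 * dim


DIM, VOCAB, POSITIONS, LAYERS, LABELS = 768, 30522, 512, 6, 2
FFN_DIM = 3072  # assumption: not visible in the truncated printout

embeddings = VOCAB * DIM + POSITIONS * DIM + layer_norm(DIM)
block = (
    4 * linear(DIM, DIM)        # q_lin, k_lin, v_lin, out_lin
    + layer_norm(DIM)           # sa_layer_norm
    + linear(DIM, FFN_DIM)      # first feed-forward layer (assumed shape)
    + linear(FFN_DIM, DIM)      # second feed-forward layer (assumed shape)
    + layer_norm(DIM)           # output layer norm (assumed)
)
encoder = embeddings + LAYERS * block               # frozen in the first stage
head = linear(DIM, DIM) + linear(DIM, LABELS)       # pre_classifier + classifier
total = encoder + head

print(f"trainable while frozen: {head:,} of {total:,} ({head / total:.2%})")
```

Under these assumptions, fewer than 1% of the weights are updated in the first stage, which is consistent with the modest accuracy it reaches.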
 

In [8]:
def compute_metrics(eval_prediction):
    predictions, labels = eval_prediction
    predictions = np.argmax(predictions, axis=1)
    return {"accuracy": (predictions == labels).mean()}
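
`compute_metrics` takes the class with the largest logit as the prediction and reports the fraction of predictions that match the labels. The following dependency-free sketch mirrors that logic on four made-up logit rows (the values are illustrative only) so the arithmetic can be checked by hand.

```python
# Illustrative logits for four examples: one row per example, one column per class.
logits = [
    [2.1, -0.3],   # class 0 has the larger logit
    [-1.0, 0.7],   # class 1
    [0.2, 0.9],    # class 1
    [1.5, 1.1],    # class 0
]
labels = [0, 1, 0, 0]


def argmax(row):
    """Index of the largest value in a row (plain-Python stand-in for np.argmax)."""
    return max(range(len(row)), key=row.__getitem__)


predictions = [argmax(row) for row in logits]
correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)

print(predictions)  # [0, 1, 1, 0]
print(accuracy)     # 0.75
```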

In [9]:
trainer_base = Trainer(
    model=base_model,
    args=TrainingArguments(
        output_dir="./data/sentiment_analysis_base",
        learning_rate=2e-5,
        per_device_train_batch_size=6,
        per_device_eval_batch_size=6,
        num_train_epochs=2,
        weight_decay=0.01,
        evaluation_strategy="epoch",
        save_strategy="epoch",
        load_best_model_at_end=True,
    ),
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["test"],
    tokenizer=tokenizer,
    data_collator=DataCollatorWithPadding(tokenizer=tokenizer),
    compute_metrics=compute_metrics,
)

trainer_base.train()
base_model_evaluation = trainer_base.evaluate()



| Epoch | Training Loss | Validation Loss | Accuracy |
|---|---|---|---|
| 1 | No log | 0.675132 | 0.652 |
| 2 | No log | 0.668012 | 0.713 |


## Performing Parameter-Efficient Fine-Tuning

TODO: In the cells below, create a PEFT model from your loaded model, run a training loop, and save the PEFT model weights.

In [10]:
for param in base_model.parameters():
    param.requires_grad = True

In [12]:
trainer_peft = Trainer(
    model=base_model,
    args=TrainingArguments(
        output_dir="./data/sentiment_analysis_peft",
        learning_rate=2e-5,
        per_device_train_batch_size=12,
        per_device_eval_batch_size=12,
        # num_train_epochs is not set, so the Trainer default of 3 epochs applies.
        weight_decay=0.01,
        evaluation_strategy="epoch",
        save_strategy="epoch",
        load_best_model_at_end=True,
    ),
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["test"],
    tokenizer=tokenizer,
    data_collator=DataCollatorWithPadding(tokenizer=tokenizer),
    compute_metrics=compute_metrics,
)

trainer_peft.train()



| Epoch | Training Loss | Validation Loss | Accuracy |
|---|---|---|---|
| 1 | No log | 0.427136 | 0.815 |
| 2 | No log | 0.49031 | 0.801 |
| 3 | No log | 0.527591 | 0.813 |


TrainOutput(global_step=252, training_loss=0.33285504295712426, metrics={'train_runtime': 13899.4869, 'train_samples_per_second': 0.216, 'train_steps_per_second': 0.018, 'total_flos': 397402195968000.0, 'train_loss': 0.33285504295712426, 'epoch': 3.0})
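
The step count and throughput in `TrainOutput` can be cross-checked with a little arithmetic, assuming a single device and no gradient accumulation (neither is configured in the cells above). The same calculation also suggests why the Training Loss column reads "No log": the `Trainer` logs the training loss every 500 steps by default, and neither run reaches 500 steps.

```python
import math


def total_steps(num_samples, batch_size, epochs):
    """Optimizer steps for a run, assuming one device and no gradient accumulation."""
    return math.ceil(num_samples / batch_size) * epochs


NUM_SAMPLES = 1000
DEFAULT_LOGGING_STEPS = 500  # Trainer default when logging_steps is not set

base_steps = total_steps(NUM_SAMPLES, batch_size=6, epochs=2)    # 167 * 2 = 334
peft_steps = total_steps(NUM_SAMPLES, batch_size=12, epochs=3)   # 84 * 3 = 252

# Throughput of the second run, from the reported train_runtime in seconds.
train_runtime = 13899.4869
samples_per_second = round(NUM_SAMPLES * 3 / train_runtime, 3)
steps_per_second = round(peft_steps / train_runtime, 3)

print(base_steps, peft_steps)
print(samples_per_second, steps_per_second)
print("loss ever logged:", max(base_steps, peft_steps) >= DEFAULT_LOGGING_STEPS)
```

Passing a smaller `logging_steps` value to `TrainingArguments` would be one way to get training-loss values into the per-epoch table.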

## Performing Inference with a PEFT Model

TODO: In the cells below, load the saved PEFT model weights and evaluate the performance of the trained PEFT model. Be sure to compare the results to the results from prior to fine-tuning.

In [13]:
peft_model_evaluation = trainer_peft.evaluate()

print(f"Base Model Evaluation: {base_model_evaluation}")
print(f"PEFT Model Evaluation: {peft_model_evaluation}")

Base Model Evaluation: {'eval_loss': 0.6680116057395935, 'eval_accuracy': 0.713, 'eval_runtime': 1192.153, 'eval_samples_per_second': 0.839, 'eval_steps_per_second': 0.14, 'epoch': 2.0}
PEFT Model Evaluation: {'eval_loss': 0.4271356165409088, 'eval_accuracy': 0.815, 'eval_runtime': 985.8507, 'eval_samples_per_second': 1.014, 'eval_steps_per_second': 0.085, 'epoch': 3.0}


# Conclusion

The evaluation results show that the fine-tuned model (the "PEFT" run) outperforms the base model trained with a frozen encoder. On the 1,000-sample test subset it reaches a lower evaluation loss (0.4271 vs. 0.6680) and a higher accuracy (0.815 vs. 0.713). Because `load_best_model_at_end=True`, these figures correspond to the epoch-1 checkpoint, which had the lowest validation loss; the loss then rose in epochs 2 and 3 (0.4903 and 0.5276) while accuracy stayed between 0.801 and 0.815, which suggests the model began to overfit the small training subset. The runtime figures are not a like-for-like comparison: the second run evaluated with a batch size of 12 instead of 6, so it needs about half as many steps, which largely explains the lower steps per second (0.085 vs. 0.14) alongside the higher samples per second (1.014 vs. 0.839) and the shorter runtime (985.9 s vs. 1,192.2 s). Overall, unfreezing the encoder improved both loss and accuracy for sentiment analysis on the Rotten Tomatoes dataset.
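
The comparison above can be reproduced from the two evaluation dictionaries. The sketch below copies the reported values, computes the accuracy gain and the relative loss reduction, and multiplies steps per second by the evaluation batch size to check that the two throughput figures describe the same quantity. The variable names are illustrative.

```python
# Values copied from the printed evaluation results.
base = {
    "eval_loss": 0.6680116057395935,
    "eval_accuracy": 0.713,
    "eval_samples_per_second": 0.839,
    "eval_steps_per_second": 0.14,
}
peft = {
    "eval_loss": 0.4271356165409088,
    "eval_accuracy": 0.815,
    "eval_samples_per_second": 1.014,
    "eval_steps_per_second": 0.085,
}
eval_batch_size = {"base": 6, "peft": 12}  # per_device_eval_batch_size of each run

accuracy_gain_points = (peft["eval_accuracy"] - base["eval_accuracy"]) * 100
loss_reduction = (base["eval_loss"] - peft["eval_loss"]) / base["eval_loss"]

# Steps per second times batch size should roughly equal samples per second.
base_implied = base["eval_steps_per_second"] * eval_batch_size["base"]
peft_implied = peft["eval_steps_per_second"] * eval_batch_size["peft"]

print(f"accuracy gain: {accuracy_gain_points:.1f} percentage points")
print(f"relative loss reduction: {loss_reduction:.1%}")
print(f"implied samples/s: base {base_implied:.2f}, peft {peft_implied:.2f}")
```

The implied values land within rounding of the reported samples per second, so the lower steps-per-second figure for the second run reflects its larger batch size rather than slower inference.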