# Lightweight Fine-Tuning Project: Confirming That the Saved Model Produces the Expected Output

#### Student: Rodrigo Quezada Reyes

#### The Udacity platform kept failing with "Connection Failed" during training and evaluation, so I ran everything locally on my own GPU and will submit the work as a zip file.

***

Describing my choices for each of the following:

* PEFT technique: LoRA, because it fine-tunes the model while keeping most parameters frozen, which makes training computationally efficient.
* Model: deberta-v2-xlarge, because it is a strong model for text classification, which is the task here. (The saved checkpoint loaded in this notebook reports microsoft/deberta-v3-small as its base.)
* Evaluation approach: Evaluate the foundation model first, then run the same evaluation on the trained PEFT model. This gives a fair comparison between the model as-is and the fine-tuned model.
* Fine-tuning dataset: the Hugging Face tweet_eval dataset, a collection of tweet-based benchmark tasks designed for evaluating text classification models on social media content.
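
To make the "most parameters stay frozen" point concrete, here is a minimal arithmetic sketch. It assumes a single 768 x 768 projection with a rank-8 adapter, which are the shapes shown for `query_proj` and `lora_A` in the model printout later in this notebook; the helper name `lora_param_counts` is hypothetical and not part of any library.

```python
def lora_param_counts(d_in: int, d_out: int, rank: int, bias: bool = True):
    """Return (frozen_base_params, trainable_lora_params) for one linear layer."""
    # Frozen base layer: weight matrix plus optional bias vector.
    base = d_in * d_out + (d_out if bias else 0)
    # LoRA adds two small matrices: A (rank x d_in) and B (d_out x rank).
    lora = rank * d_in + d_out * rank
    return base, lora


# Shapes assumed from the printed model: Linear(768 -> 768) with a rank-8 adapter.
base, lora = lora_param_counts(d_in=768, d_out=768, rank=8)
print(f"frozen base parameters: {base:,}")
print(f"trainable LoRA parameters: {lora:,}")
print(f"trainable share of this layer: {lora / (base + lora):.2%}")
```

This counts one adapted layer only. The share for the whole model also depends on which modules are adapted and on any fully trained heads, which this sketch does not try to estimate.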

## Loading and Evaluating a Foundation Model

* I chose deberta-v2-xlarge as the pre-trained foundation model after assessing its capability for classification tasks.
* I will evaluate its performance before and after fine-tuning.
* I will also load a matching tokenizer and the dataset.

***

#### Import all dependencies including the Hugging Face PEFT Library

In [1]:
#!pip install torch==2.7.1 torchvision==0.22.1 torchaudio==2.7.1 --index-url https://download.pytorch.org/whl/cu126

In [2]:
#!pip install sentencepiece

In [3]:
#import os
#os._exit(00)

In [4]:
import os
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

In [5]:
# Importing the torch library

import torch
print(torch.__version__)
print(torch.version.cuda)
print(torch.cuda.is_available())

2.7.1+cu126
12.6
True


In [6]:
#!pip install --upgrade protobuf

In [7]:
# Importing the rest of the libraries

from peft import LoraConfig
from peft import TaskType
from transformers import AutoModelForCausalLM,AutoModelForSequenceClassification
from peft import get_peft_model
from peft import AutoPeftModelForCausalLM,AutoPeftModelForSequenceClassification
from peft import PeftModel, PeftConfig
from datasets import load_dataset
from transformers import AutoTokenizer
from sklearn.metrics import accuracy_score, classification_report
import numpy as np
from transformers import pipeline, DataCollatorWithPadding, Trainer, TrainingArguments
from transformers import DebertaV2ForSequenceClassification, DebertaV2Tokenizer
import evaluate
from torch.utils.data import DataLoader
import sentencepiece
#import protobuf


In [8]:
# Verify if GPU is available

import torch
print(torch.cuda.is_available())
print(torch.cuda.device_count())
print(torch.version.cuda)

True
1
12.6


In [9]:
import os
print(os.listdir("./tmp"))

['finetuned_814_2347_model', 'finetuned_814_model', 'finetuned_816_model', 'original_lora_peft_8142133_model', 'patent_class']


In [10]:
import os
print(os.listdir("./saved_models"))

['finetuned_8162335_model_v1', 'finetuned_8282025_model_v2', 'finetuned_8282025_model_v3', 'finetuned_8282025_model_v4', 'finetuned_8282025_model_v5', 'initial_lora_model_8282025_v0', 'initial_lora_model_8282025_v01']


In [11]:
#import os

#print(os.path.exists("./tmp/mod2347_model"))
#print(os.listdir("./tmp/mod2347_model"))


### Reviewer request

* You must use the AUTO PEFT Model Hugging Face class: AutoPeftModelForSequenceClassification
* This ensures you're leveraging the Parameter-Efficient Fine-Tuning (PEFT) capabilities designed for models like LoRA, Prefix Tuning, and more, making your model both efficient and scalable.

loaded_peft_model = AutoPeftModelForSequenceClassification.from_pretrained("/tmp/peft_model", num_labels=6)


In [12]:
load_dir = "./saved_models/finetuned_8282025_model_v4"

#my_816_model = AutoPeftModelForSequenceClassification.from_pretrained(load_dir)

loaded_peft_model = AutoPeftModelForSequenceClassification.from_pretrained(load_dir, num_labels=4)
new_tokenizer = AutoTokenizer.from_pretrained(load_dir)


Some weights of DebertaV2ForSequenceClassification were not initialized from the model checkpoint at microsoft/deberta-v3-small and are newly initialized: ['classifier.bias', 'classifier.weight', 'pooler.dense.bias', 'pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
  result[k] = f.get_tensor(k)


When the model is loaded as above, the classification head (and the pooler) is randomly reinitialized, so the trained head parameters are lost.

The trained model therefore needs to be saved together with its classification-head weights.

I created a new saved version, model v5, for that purpose.
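
The warning above amounts to a set difference: parameters the model expects but the checkpoint does not contain. The sketch below reproduces that check with plain Python lists standing in for state-dict keys. The key lists are abbreviated and hypothetical, and `find_uninitialized` is an illustrative helper, not a library function. With a real model, the names would come from something like `model.state_dict().keys()`.

```python
def find_uninitialized(model_keys, checkpoint_keys):
    """Return parameter names the model expects but the checkpoint lacks."""
    return sorted(set(model_keys) - set(checkpoint_keys))


# Hypothetical, abbreviated key lists for illustration only.
model_keys = [
    "deberta.embeddings.word_embeddings.weight",
    "pooler.dense.weight",
    "pooler.dense.bias",
    "classifier.weight",
    "classifier.bias",
]
checkpoint_keys = [
    "deberta.embeddings.word_embeddings.weight",
]

# Anything listed here would be freshly (randomly) initialized on load.
missing = find_uninitialized(model_keys, checkpoint_keys)
print(missing)
```

Running the same kind of comparison against a saved directory before relying on it is a cheap way to confirm that the head weights were actually persisted.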

In [13]:
load_dir = "./saved_models/finetuned_8282025_model_v5"

loaded_peft_model = AutoPeftModelForSequenceClassification.from_pretrained(
    load_dir, 
    num_labels=4
)

# Load the full state including classification head
state_dict = torch.load(f"{load_dir}/full_model_state.bin")
loaded_peft_model.load_state_dict(state_dict)
new_tokenizer = AutoTokenizer.from_pretrained(load_dir)



Some weights of DebertaV2ForSequenceClassification were not initialized from the model checkpoint at microsoft/deberta-v3-small and are newly initialized: ['classifier.bias', 'classifier.weight', 'pooler.dense.bias', 'pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [14]:
'''
I recall using the original one created issues with the last layer but will try

load_dir = "./saved_models/finetuned_8162335_model_v1"

my_816_model = AutoModelForSequenceClassification.from_pretrained(load_dir)
new_tokenizer = AutoTokenizer.from_pretrained(load_dir)
'''

In [15]:
# Sanity check

#print(my_816_model.config.num_labels)   # should show 4, not 2
#print(new_tokenizer.get_vocab_size()) ... this gets an error since Hugging Face’s DebertaV2TokenizerFast doesn’t expose get_vocab_size() (some tokenizers do, some don’t).

#print("Number of labels:", my_816_model.config.num_labels)  # should be 4
print("Number of labels:", loaded_peft_model.config.num_labels)  # should be 4
print("Vocab size:", new_tokenizer.vocab_size)           # or len(tokenizer.get_vocab())


Number of labels: 4
Vocab size: 128000


In [16]:
# Retrieval of the saved merged and fine-tuned model object

#ft_lora_model = AutoPeftModelForCausalLM.from_pretrained("./tmp/finetuned_814_2347_model")

#from transformers import AutoModelForCausalLM, AutoTokenizer

#model_path = r"C:\Users\dslab\Downloads\tmp\patent_class\checkpoint-6514"

#ft_lora_model = AutoModelForSequenceClassification.from_pretrained(
 #   model_path,
  #  num_labels=4,
   # local_files_only=True,
    #ignore_mismatched_sizes=True
#)

#ft_lora_model = DebertaV2ForSequenceClassification.from_pretrained(model_path, local_files_only=True)
#ft_tokenizer = AutoTokenizer.from_pretrained(model_path, local_files_only=True)

#ft_lora_model = DebertaV2ForSequenceClassification.from_pretrained(".tmp/patent_class/checkpoint-651/finetuned_814_2347_model", local_files_only=True)
#ft_lora_model = DebertaV2ForSequenceClassification.from_pretrained("./tmp/mod2347_model", local_files_only=True)


In [17]:
'''
import os

print(os.path.exists("./tmp/finetuned_mod2347_model"))
print(os.listdir("./tmp/finetuned_mod2347_model"))
'''

In [18]:
'''
import os
for root, dirs, files in os.walk("./tmp"):
    if "config.json" in files:
        print(root)
'''

In [19]:
'''
for root, dirs, files in os.walk("."):
    if "config.json" in files:
        print(root)
'''

In [20]:
'''
import os

print(os.path.exists("./tmp/finetuned_814_2347_model"))
print(os.listdir("./tmp/finetuned_814_2347_model"))
'''

In [21]:
'''
import json
#with open("./tmp/finetuned_mod2347_model/config.json") as f:
with open("./tmp/finetuned_814_2347_model/config.json") as f:
    config = json.load(f)
print(config["num_labels"])
'''

In [22]:
#ft_lora_model
loaded_peft_model


PeftModelForSequenceClassification(
  (base_model): LoraModel(
    (model): DebertaV2ForSequenceClassification(
      (deberta): DebertaV2Model(
        (embeddings): DebertaV2Embeddings(
          (word_embeddings): Embedding(128100, 768, padding_idx=0)
          (LayerNorm): LayerNorm((768,), eps=1e-07, elementwise_affine=True)
          (dropout): Dropout(p=0.1, inplace=False)
        )
        (encoder): DebertaV2Encoder(
          (layer): ModuleList(
            (0-5): 6 x DebertaV2Layer(
              (attention): DebertaV2Attention(
                (self): DisentangledSelfAttention(
                  (query_proj): lora.Linear(
                    (base_layer): Linear(in_features=768, out_features=768, bias=True)
                    (lora_dropout): ModuleDict(
                      (default): Dropout(p=0.1, inplace=False)
                    )
                    (lora_A): ModuleDict(
                      (default): Linear(in_features=768, out_features=8, bias=False)
          

### Function to evaluate model performance (the same function is used before and after fine-tuning)

In [23]:
# Evaluate a model on a tokenized dataset. The same function is used for the pretrained model
# and for the fine-tuned model, so the two results are directly comparable.

kpi = evaluate.load("accuracy")

def model_evaluating(model, dataset, batch_size=1):
    model.eval()
    model.to("cuda")
    dataloader = DataLoader(dataset, batch_size=batch_size)
    
    for i in dataloader:
        input_ids = i["input_ids"].to("cuda")
        attention_mask = i["attention_mask"].to("cuda")
        labels = i["label"].to("cuda")
        
        with torch.no_grad():
            outputs = model(input_ids, attention_mask=attention_mask)
            predictions = outputs.logits.argmax(dim=-1)
        
        kpi.add_batch(predictions=predictions, references=labels)
    
    return kpi.compute()
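
The evaluation loop relies on a metric object that accumulates predictions batch by batch and reports one accuracy at the end. The following is a simplified stand-in for that pattern in plain Python. `RunningAccuracy` is a hypothetical class written for illustration, not the `evaluate` library's implementation, and resetting the counters inside `compute()` is a design choice of this sketch.

```python
class RunningAccuracy:
    """Accumulate correct/total counts across batches, then report accuracy."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def add_batch(self, predictions, references):
        # Compare each predicted label with its reference label.
        for pred, ref in zip(predictions, references):
            self.correct += int(pred == ref)
            self.total += 1

    def compute(self):
        if self.total == 0:
            raise ValueError("no predictions were added")
        result = {"accuracy": self.correct / self.total}
        # Reset so that a second evaluation starts from zero.
        self.correct = 0
        self.total = 0
        return result


metric = RunningAccuracy()
metric.add_batch(predictions=[2, 3, 0], references=[2, 1, 0])  # 2 of 3 correct
metric.add_batch(predictions=[1], references=[1])              # 1 of 1 correct
result = metric.compute()
print(result)
```

Accumulating counts rather than averaging per-batch accuracies keeps the result correct even when the last batch is smaller than the others.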

In [24]:
# Practicing correct LoRA peft load
'''
# 1. Load PEFT config to find base model name
peft_model_path = "./tmp/original_lora_peft_8142133_model"  
peft_config = PeftConfig.from_pretrained(peft_model_path)

# 2. Load base model with correct label count
my_model = DebertaV2ForSequenceClassification.from_pretrained(
    'microsoft/deberta-v3-small', 
    num_labels=4,
    ignore_mismatched_sizes=True,
    use_safetensors=True 
    )

# 3. Load LoRA adapter on top of base model
original_lora_peft = PeftModel.from_pretrained(my_model, peft_model_path)
'''

In [25]:
# Now select and load a dataset

from datasets import load_dataset

#my_dataset = load_dataset("emotion")
my_dataset = load_dataset("tweet_eval", "emotion")

dataset_splits = ['train', 'validation', 'test']

print(my_dataset)

DatasetDict({
    train: Dataset({
        features: ['text', 'label'],
        num_rows: 3257
    })
    test: Dataset({
        features: ['text', 'label'],
        num_rows: 1421
    })
    validation: Dataset({
        features: ['text', 'label'],
        num_rows: 374
    })
})


In [26]:
# Review overall train dataset

my_dataset["train"]

Dataset({
    features: ['text', 'label'],
    num_rows: 3257
})

In [27]:
# Review overall test dataset

my_dataset["test"]

Dataset({
    features: ['text', 'label'],
    num_rows: 1421
})

In [28]:
# Review the first example from the train dataset

my_dataset["train"][0]

{'text': "“Worry is a down payment on a problem you may never have'. \xa0Joyce Meyer.  #motivation #leadership #worry",
 'label': 2}

In [29]:
# Review the first example from the test dataset

my_dataset["test"][0]

{'text': '#Deppression is real. Partners w/ #depressed people truly dont understand the depth in which they affect us. Add in #anxiety &amp;makes it worse',
 'label': 3}

### Key observation on the saved tokenizer

In [30]:

# Use the tokenizer that was reloaded from the saved model directory


my_tokenizer = new_tokenizer



#my_tokenizer = DebertaV2Tokenizer.from_pretrained('microsoft/deberta-v2-xlarge')

# save tokenizer alongside the model
#my_tokenizer.save_pretrained("./tmp/finetuned_814_2347_model")


In [31]:
# Improved tokenizer version

my_tokenized_dataset = {}

for split in dataset_splits:
    my_tokenized_dataset[split] = my_dataset[split].map(
        #lambda x: my_tokenizer(x["text"], truncation=True, padding="max_length"), 
        lambda x: my_tokenizer(x["text"], truncation=True, padding=True, return_tensors="pt"), 
        batched=True
    )

# Inspect the available columns in the dataset
print(my_tokenized_dataset["train"].column_names)

Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.


['text', 'label', 'input_ids', 'token_type_ids', 'attention_mask']
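
With `padding=True`, each batch passed to the tokenizer is padded to its longest sequence, which is why the tokenized examples printed in this notebook end in a run of `0` ids with matching `0`s in the attention mask. The sketch below shows that mechanism in plain Python. `pad_batch` is a hypothetical helper, and the pad id of 0 is assumed from the `padding_idx=0` shown in the model printout.

```python
def pad_batch(sequences, pad_id=0):
    """Pad token-id lists to the longest one and build matching attention masks."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(list(seq) + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding the model should ignore.
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Two short, made-up token-id sequences of different lengths.
batch = pad_batch([[1, 68, 2], [1, 1539, 99185, 13, 2]])
print(batch["input_ids"])
print(batch["attention_mask"])
```

Because padding is applied per batch, sequences from different batches can end up with different lengths, which is worth keeping in mind when choosing a `DataLoader` batch size.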


In [32]:
print(my_tokenized_dataset["test"].column_names)

['text', 'label', 'input_ids', 'token_type_ids', 'attention_mask']


In [33]:
print(my_tokenized_dataset["train"].column_names)

['text', 'label', 'input_ids', 'token_type_ids', 'attention_mask']


In [34]:
print(my_tokenized_dataset["train"][0])

{'text': "“Worry is a down payment on a problem you may never have'. \xa0Joyce Meyer.  #motivation #leadership #worry", 'label': 2, 'input_ids': [1, 68, 43422, 41870, 13, 10, 184, 1574, 21, 10, 453, 17, 111, 252, 30, 25, 4, 15282, 15583, 4, 1539, 76839, 1539, 71038, 1539, 118308, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]}


In [35]:
print(my_tokenized_dataset["test"][0])

{'text': '#Deppression is real. Partners w/ #depressed people truly dont understand the depth in which they affect us. Add in #anxiety &amp;makes it worse', 'label': 3, 'input_ids': [1, 1539, 99185, 56743, 13, 340, 4, 8583, 2564, 96, 1539, 2539, 30606, 98, 1276, 5826, 513, 5, 3291, 11, 59, 49, 2271, 120, 4, 1962, 11, 1539, 63270, 169, 10087, 93, 54082, 22, 2416, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]}


## Evaluating the reloaded model on the test split

***

In [36]:
my_tokenized_dataset["test"].set_format(type="torch", columns=["input_ids", "attention_mask", "label"])
#my_tokenized_dataset["test"].set_format(type="torch", columns=["input_ids", "attention_mask", "label"], padding=True, truncation=True, return_tensors="pt")
#tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

print(my_tokenized_dataset["test"])

Dataset({
    features: ['text', 'label', 'input_ids', 'token_type_ids', 'attention_mask'],
    num_rows: 1421
})


In [37]:
# Making sure the testing dataset is ready for evaluation

my_testing_tokenized_dataset = my_tokenized_dataset["test"].map(batched=True)

my_testing_tokenized_dataset.set_format(type="torch", columns=["input_ids", "attention_mask", "label"])


In [38]:
# Clear the CUDA cache before evaluation to free GPU memory

torch.cuda.empty_cache()

In [39]:
# Evaluate the reloaded PEFT model (loaded_peft_model) on the tokenized test split

model_retrieval_results = model_evaluating(loaded_peft_model, my_testing_tokenized_dataset)
print("Base Model Evaluation:", model_retrieval_results)

Base Model Evaluation: {'accuracy': 0.39268121041520054}
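
As a quick sanity check on the number above, the reported accuracy can be converted back into a count of correct predictions and compared with uniform random guessing over the four labels. This is only arithmetic on values printed in this notebook (accuracy, 1421 test rows, 4 labels). It does not account for class imbalance, so a majority-class baseline could differ from the uniform one.

```python
accuracy = 0.39268121041520054  # value printed by the evaluation cell
n_test = 1421                   # rows in the test split
n_classes = 4                   # num_labels reported by the model config

# Recover the number of correct predictions from the accuracy.
correct = round(accuracy * n_test)

# Expected accuracy when guessing uniformly at random among the labels.
chance = 1 / n_classes

print(f"correct predictions: {correct} of {n_test}")
print(f"uniform-guess baseline: {chance:.2%}, reported accuracy: {accuracy:.2%}")
```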


### Worked like a charm!  

#### This confirms what I was expecting.

- The end -