# PEFT Models

Testing out PEFT methods with HuggingFace models.

| Date | User | Change Type | Remarks |  
| ---- | ---- | ----------- | ------- |
| 07/10/2025   | Martin | Create  | Notebook created to test out various PEFT models with HuggingFace. Completed first cut of workflow | 

# Content

* [Introduction](#introduction)
* [Load Data](#load-data)
* [Preprocessing](#preprocessing)
* [Model Training](#model-training)

# Introduction

[HuggingFace Source](https://huggingface.co/docs/peft/index)

PEFT (Parameter-Efficient Fine-Tuning) is a library for efficiently adapting large pretrained models to various downstream applications without fine-tuning all of a model’s parameters, which is prohibitively costly. PEFT methods fine-tune only a small number of (extra) model parameters, significantly decreasing computational and storage costs while yielding performance comparable to a fully fine-tuned model. This makes it more accessible to train and store large language models (LLMs) on consumer hardware.
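
To make "a small number of (extra) model parameters" concrete, the sketch below counts what a LoRA adapter would add to a single weight matrix. It assumes the standard low-rank factorisation, where a `d_out x d_in` matrix gains two factors of rank `r`. The helper `lora_param_count` is hypothetical and the 4096 dimensions are illustrative values, not read from any model config.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
  """Parameters added by one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
  return r * d_in + d_out * r

# Illustrative sizes (assumed, not taken from a real model)
d_in, d_out, rank = 4096, 4096, 16

full = d_in * d_out                          # parameters in the frozen weight matrix
added = lora_param_count(d_in, d_out, rank)  # trainable parameters added by LoRA
ratio = added / full

print(f"full: {full:,}  lora: {added:,}  ratio: {ratio:.2%}")
```

Under these assumptions the adapter trains under 1% of the parameters of the matrix it wraps, which is where the storage and compute savings come from.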

In [2]:
import polars as pl
import numpy as np
import torch

from sklearn.preprocessing import LabelEncoder
from transformers import AutoTokenizer, AutoModelForSequenceClassification, DataCollatorWithPadding, TrainingArguments, Trainer
from datasets import Dataset



# Load Data

In [3]:
path = "data/raw"
train = pl.read_csv(f"{path}/train.csv")
test = pl.read_csv(f"{path}/test.csv")

# Preprocessing

In [4]:
# Add new labels
le = LabelEncoder()
train = train.with_columns(
  Target = pl.col('Category') + ":" + pl.col('Misconception'),
  Correct = pl.col("Category").str.split("_").list.last() == "Correct"
)
train = train.with_columns(
  labels = pl.col("Target").map_batches(le.fit_transform)
)

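# Get known answers: the most frequent correct answer per question
temp = train.filter(
  pl.col('Correct')
)
temp = temp.group_by(['QuestionId', 'MC_Answer']).len().sort('len', descending=True)
# keep='first' with maintain_order=True retains the top-ranked row per question;
# the default keep='any' gives no such guarantee
temp = temp.unique(subset=['QuestionId'], keep='first', maintain_order=True)
temp = temp.drop('len')
temp = temp.with_columns(
  Correct=1
)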

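# Evaluate correctness of test set answers
test = test.join(
  temp,
  how='left',
  on=['QuestionId', 'MC_Answer']
)
# Only the joined flag should default to 0; leave nulls in other columns untouched
test = test.with_columns(pl.col('Correct').fill_null(0))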
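
The known-answer lookup above amounts to: for each question, take the answer that most often appears with a correct label, then flag test rows that match it. A dependency-free sketch of that logic follows. The rows are made up for illustration and the variable names are hypothetical.

```python
from collections import Counter, defaultdict

# Toy rows (made up): (QuestionId, MC_Answer, Correct)
train_rows = [
  (1, "A", True), (1, "A", True), (1, "B", True), (1, "C", False),
  (2, "D", True), (2, "E", False),
]

# Count correct answers per question
counts = defaultdict(Counter)
for qid, answer, correct in train_rows:
  if correct:
    counts[qid][answer] += 1

# Most frequent correct answer per question
known = {qid: c.most_common(1)[0][0] for qid, c in counts.items()}

# Left-join semantics: rows without a match default to 0
test_rows = [(1, "A"), (1, "B"), (2, "D"), (3, "A")]
flags = [int(known.get(qid) == answer) for qid, answer in test_rows]
print(known, flags)
```

Question 3 never appears in the toy training rows, so its flag falls back to 0, mirroring the null fill after the left join.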

In [5]:
def format_input(df):
  """Add a Correctness sentence and the prompt text fed to the model."""
  df = df.with_columns(
    pl.when(pl.col("Correct") == 1)
      .then(pl.lit("This answer is correct."))
      .otherwise(pl.lit("This answer is wrong."))
      .alias("Correctness")
  )

  return df.with_columns(
    pl.format(
      "Question:\n{}\nAnswer:\n{}\nCorrect:\n{}\nExplanation:\n{}",
      pl.col('QuestionText'),
      pl.col('MC_Answer'),
      pl.col('Correctness'),
      pl.col('StudentExplanation')
    ).alias("text")
  )

In [6]:
train = format_input(train)
test = format_input(test)
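
To see the prompt that each row produces, the same template can be applied to a single record in plain Python. This is a sketch: `format_row` is a hypothetical helper and the example question is made up, not drawn from the dataset.

```python
TEMPLATE = "Question:\n{}\nAnswer:\n{}\nCorrect:\n{}\nExplanation:\n{}"

def format_row(question, answer, correct, explanation):
  """Build the model input for one record, mirroring the template in format_input."""
  correctness = "This answer is correct." if correct == 1 else "This answer is wrong."
  return TEMPLATE.format(question, answer, correctness, explanation)

# Made-up record for illustration
example = format_row("What is 1/2 + 1/4?", "3/4", 1, "Two quarters plus one quarter.")
print(example)
```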

In [7]:
train.head()

| row_id | QuestionId | QuestionText | MC_Answer | StudentExplanation | Category | Misconception | Target | Correct | labels | Correctness | text |
| ------ | ---------- | ------------ | --------- | ------------------ | -------- | ------------- | ------ | ------- | ------ | ----------- | ---- |
| i64 | i64 | str | str | str | str | str | str | bool | i64 | str | str |
| 0 | 31772 | "What fraction of the shape is …" | "\( \frac{1}{3} \)" | "0ne third is equal to tree nin…" | "True_Correct" | "NA" | "True_Correct:NA" | True | 37 | "This answer is correct." | "Question: What fraction of the…" |
| 1 | 31772 | "What fraction of the shape is …" | "\( \frac{1}{3} \)" | "1 / 3 because 6 over 9 is 2 th…" | "True_Correct" | "NA" | "True_Correct:NA" | True | 37 | "This answer is correct." | "Question: What fraction of the…" |
| 2 | 31772 | "What fraction of the shape is …" | "\( \frac{1}{3} \)" | "1 3rd is half of 3 6th, so it …" | "True_Neither" | "NA" | "True_Neither:NA" | False | 64 | "This answer is wrong." | "Question: What fraction of the…" |
| 3 | 31772 | "What fraction of the shape is …" | "\( \frac{1}{3} \)" | "1 goes into everything and 3 g…" | "True_Neither" | "NA" | "True_Neither:NA" | False | 64 | "This answer is wrong." | "Question: What fraction of the…" |
| 4 | 31772 | "What fraction of the shape is …" | "\( \frac{1}{3} \)" | "1 out of every 3 isn't coloure…" | "True_Correct" | "NA" | "True_Correct:NA" | True | 37 | "This answer is correct." | "Question: What fraction of the…" |


In [8]:
train_ds = train.select(["text", "labels"])
train_ds = Dataset.from_polars(train_ds)

# Model Training

## LoRA

In [9]:
from peft import PeftModel, LoraConfig, get_peft_model, TaskType

In [10]:
NUM_LABELS = len(train['labels'].unique())
MODEL_PATH = "deepseek-ai/deepseek-math-7b-base"

In [12]:
# Tokeniser
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
if tokenizer.pad_token is None:
  tokenizer.pad_token = tokenizer.eos_token


In [None]:
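# Base model
base_model = AutoModelForSequenceClassification.from_pretrained(
  MODEL_PATH,
  num_labels=NUM_LABELS,
  dtype=torch.bfloat16,
  device_map="auto",
)
# The classification head needs a pad token id to locate the last real token in each padded sequence
base_model.config.pad_token_id = tokenizer.pad_token_id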



In [None]:
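# LoRA configuration
lora_config = LoraConfig(
  task_type=TaskType.SEQ_CLS,
  r=16,
  lora_alpha=32, # A common practice is to set lora_alpha to twice r
  # Llama-style models such as this one name their attention projections q_proj / v_proj
  target_modules=["q_proj", "v_proj"],
  lora_dropout=0.1,
  bias="none",
  # The sequence classification head of Llama-style models is called "score"
  modules_to_save=["score"]
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()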

'<｜end▁of▁sentence｜>'
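
For intuition about `r` and `lora_alpha`, here is a minimal pure-Python sketch of the LoRA forward pass. It assumes the standard formulation `h = W x + (alpha / r) * B A x`, with `B` initialised to zeros so the adapted layer starts out identical to the frozen one. The helper names and the tiny dimensions are illustrative, not part of the PEFT API.

```python
import random

def matvec(m, v):
  """Multiply a matrix (list of rows) by a vector."""
  return [sum(a * b for a, b in zip(row, v)) for row in m]

def lora_forward(W, A, B, x, alpha, r):
  """Frozen projection plus the scaled low-rank update."""
  scaling = alpha / r
  base = matvec(W, x)
  delta = matvec(B, matvec(A, x))
  return [b + scaling * d for b, d in zip(base, delta)]

random.seed(0)
d, r, alpha = 4, 2, 4  # tiny illustrative sizes; alpha is twice r, as in the config above
W = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]  # frozen weights
A = [[random.gauss(0, 1) for _ in range(d)] for _ in range(r)]  # trainable, random init
B = [[0.0] * r for _ in range(d)]                               # trainable, zero init
x = [1.0, 2.0, 3.0, 4.0]

out_init = lora_forward(W, A, B, x, alpha, r)
print(out_init == matvec(W, x))  # the adapter is a no-op until B is updated
```

Only `A` and `B` receive gradients during training. With `r=16` and `lora_alpha=32`, the scaling factor is 2.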

In [None]:
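# Tokenise the text column; the collator pads each batch dynamically
def tokenize(batch):
  return tokenizer(batch["text"], truncation=True)

train_tokenized = train_ds.map(tokenize, batched=True, remove_columns=["text"])
# Hold out part of the data for evaluation (the 10% split and the seed are assumptions)
splits = train_tokenized.train_test_split(test_size=0.1, seed=42)

training_args = TrainingArguments(
  output_dir="models/deepseek_base_misunderstanding",
  learning_rate=5e-3,
  per_device_train_batch_size=32,
  per_device_eval_batch_size=32,
  gradient_accumulation_steps=4,
  num_train_epochs=10,
  weight_decay=0.01,
  eval_strategy="epoch",  # requires an eval_dataset
  save_strategy="epoch",
  load_best_model_at_end=True,
  bf16=True,  # match the bfloat16 weights of the base model
  label_names=["labels"]
)

trainer = Trainer(
  model=model,
  args=training_args,
  train_dataset=splits["train"],
  eval_dataset=splits["test"],
  processing_class=tokenizer,
  data_collator=DataCollatorWithPadding(tokenizer)
)

# Fine-tune the LoRA adapters and the classification head
trainer.train()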
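
The batch size and gradient accumulation settings combine into an effective batch size per optimizer step. The arithmetic below is a rough sketch: `n_examples` is a hypothetical dataset size, a single device is assumed, and the Trainer's exact step count may differ slightly because of how it rounds partial batches.

```python
import math

def effective_batch(per_device_batch, grad_accum, n_devices=1):
  """Examples consumed per optimizer step."""
  return per_device_batch * grad_accum * n_devices

# Values from the training arguments above; one device is an assumption
effective = effective_batch(per_device_batch=32, grad_accum=4)

n_examples = 10_000  # hypothetical dataset size
approx_steps_per_epoch = math.ceil(n_examples / effective)

print(effective, approx_steps_per_epoch)
```

A large effective batch paired with a high learning rate is worth monitoring in the loss curve; halving `gradient_accumulation_steps` roughly doubles the number of updates per epoch.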

In [None]:
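# Tokenise the test text before predicting
test_tokenized = Dataset.from_polars(test.select(["text"])).map(lambda batch: tokenizer(batch["text"], truncation=True), batched=True, remove_columns=["text"])
predictions = trainer.predict(test_tokenized)
logits = predictions.predictions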
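
The raw logits still need to be turned into class names. A minimal sketch of that step, written without NumPy so it runs anywhere: `softmax` and `top_k` are hypothetical helpers, the logits are made up, and `class_names` stands in for the fitted encoder's `le.classes_` (the third name is a placeholder).

```python
import math

def softmax(logits):
  """Numerically stable softmax over one row of logits."""
  m = max(logits)
  exps = [math.exp(v - m) for v in logits]
  total = sum(exps)
  return [e / total for e in exps]

def top_k(logits, class_names, k=3):
  """Return the k most probable (class name, probability) pairs."""
  probs = softmax(logits)
  ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
  return [(class_names[i], probs[i]) for i in ranked[:k]]

# Hypothetical subset of classes; in the notebook these would come from le.classes_
class_names = ["Placeholder:NA", "True_Correct:NA", "True_Neither:NA"]
row_logits = [0.5, 2.0, 1.0]  # made-up logits for a single test row

best = top_k(row_logits, class_names, k=2)
print(best)
```

Applied to every row of `logits`, this yields a ranked list of `Category:Misconception` targets per test example.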

In [3]:
%watermark

Last updated: 2025-10-07T16:08:48.982864+08:00

Python implementation: CPython
Python version       : 3.11.9
IPython version      : 9.3.0

Compiler    : Clang 13.0.0 (clang-1300.0.29.30)
OS          : Darwin
Release     : 24.6.0
Machine     : arm64
Processor   : arm
CPU cores   : 10
Architecture: 64bit

