# 📘 Discipline Classifier v3.0 – DeBERTa + LoRA

This notebook fine-tunes `microsoft/deberta-base` with Low-Rank Adaptation (LoRA) to classify research abstracts as Computer Science (CS), Information Systems (IS), or Information Technology (IT).

We use:
- Hugging Face 🤗 Transformers
- PEFT for LoRA
- Stratified train/test split on 1138 manually labeled abstracts


## Step 1: Imports and Setup

We import all required libraries for model loading, training, and evaluation.


In [1]:
import torch
import numpy as np
import pandas as pd
from datasets import Dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    TrainingArguments,
    Trainer
)
from peft import get_peft_model, LoraConfig, TaskType
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report


## Step 2: Load and Prepare Dataset

We use the manually labeled dataset of 1138 abstracts; the class of each abstract is stored in the `Discipline` column.
Labels are encoded as:
- CS → 0
- IS → 1
- IT → 2


In [2]:
# Replace with your actual file path if reading from CSV
df = pd.read_csv("Data/Discipline (1138).csv")

# Combine Title and Abstract
df["text"] = df["Title"].fillna("") + ". " + df["Abstract"].fillna("")

# Label encoding
label2id = {"CS": 0, "IS": 1, "IT": 2}
id2label = {v: k for k, v in label2id.items()}
df["label"] = df["Discipline"].map(label2id)

# Train-test split
train_df, test_df = train_test_split(df, test_size=0.2, stratify=df["label"], random_state=42)
train_ds = Dataset.from_pandas(train_df[["text", "label"]])
test_ds = Dataset.from_pandas(test_df[["text", "label"]])
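
Because `Series.map` yields `NaN` for any value missing from `label2id`, a stray spelling such as `"cs"` or an empty cell would silently corrupt the label column before the split. A minimal standard-library sketch of a pre-split check follows; `summarize_labels` and the sample values are hypothetical and not part of the original notebook.

```python
from collections import Counter

label2id = {"CS": 0, "IS": 1, "IT": 2}

def summarize_labels(raw_labels, mapping):
    """Return (class counts, values that are absent from the mapping)."""
    counts = Counter()
    unmapped = []
    for value in raw_labels:
        if value in mapping:
            counts[value] += 1
        else:
            unmapped.append(value)
    return counts, unmapped

# Hypothetical sample standing in for df["Discipline"]
sample = ["CS", "IS", "IT", "cs", "CS", None]
counts, unmapped = summarize_labels(sample, label2id)
print(counts)
print(unmapped)
```

In the notebook this could be called as `summarize_labels(df["Discipline"], label2id)` before the split; a non-empty second result means some rows need cleaning or dropping.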


## Step 3: Tokenize with DeBERTa Tokenizer

We use the `microsoft/deberta-base` tokenizer to convert text into token IDs. Texts are padded or truncated to a maximum of 256 tokens.


In [8]:
model_name = "microsoft/deberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)

def tokenize_function(example):
    return tokenizer(example["text"], truncation=True, padding="max_length", max_length=256)

train_ds = train_ds.map(tokenize_function, batched=True)
test_ds = test_ds.map(tokenize_function, batched=True)

train_ds.set_format(type="torch", columns=["input_ids", "attention_mask", "label"])
test_ds.set_format(type="torch", columns=["input_ids", "attention_mask", "label"])
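
Since every text is cut at 256 tokens, it can be useful to know how many abstracts actually lose content. The sketch below measures that share for any encoder; `fraction_truncated` is a hypothetical helper, and the whitespace split is only a stand-in so the example runs without downloading a tokenizer.

```python
def fraction_truncated(texts, encode, max_length=256):
    """Share of texts whose encoded length exceeds max_length."""
    texts = list(texts)
    if not texts:
        return 0.0
    too_long = sum(1 for text in texts if len(encode(text)) > max_length)
    return too_long / len(texts)

# Stand-in encoder (whitespace split) so the sketch runs offline.
# In the notebook, something like
#   lambda t: tokenizer(t, truncation=False)["input_ids"]
# would give the real token counts.
whitespace_encode = str.split

docs = ["short text", "word " * 300, "another short one"]
rate = fraction_truncated(docs, whitespace_encode, max_length=256)
print(f"{rate:.2%} of texts exceed the limit")
```

If a large share of abstracts turns out to be truncated, a higher `max_length` is one option to consider, at the cost of memory and training time.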



## Step 4: Load DeBERTa and Apply LoRA

We load the `microsoft/deberta-base` model and apply Low-Rank Adaptation (LoRA) to fine-tune it efficiently on our classification task. Only a small fraction of the model’s weights is updated (about 0.21%, according to the parameter summary printed by this cell).


In [9]:
from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, get_peft_model, TaskType

# Re-initialize base model here
base_model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

# Configure LoRA
peft_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
    bias="none",
    target_modules=["in_proj"]
)

# Inject LoRA adapters
model = get_peft_model(base_model, peft_config)
model.print_trainable_parameters()

Some weights of DebertaForSequenceClassification were not initialized from the model checkpoint at microsoft/deberta-base and are newly initialized: ['classifier.bias', 'classifier.weight', 'pooler.dense.bias', 'pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


trainable params: 297,219 || all params: 139,491,846 || trainable%: 0.2131
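
The trainable-parameter count can be reproduced by hand, which also shows where the trainable weights sit. The sketch assumes the usual `microsoft/deberta-base` dimensions (hidden size 768, 12 layers, and a fused query/key/value `in_proj` with output size 3 × 768); these values are assumptions, not stated in the notebook.

```python
# Assumed architecture values for microsoft/deberta-base (not stated in the notebook)
hidden_size = 768
num_layers = 12
in_proj_out = 3 * hidden_size   # fused query/key/value projection
num_labels = 3
r = 8                           # LoRA rank from the config above

# Each adapted Linear(in, out) gains two matrices: A (r x in) and B (out x r)
lora_per_layer = r * (hidden_size + in_proj_out)
lora_total = num_layers * lora_per_layer

# Classification head: weight (num_labels x hidden) plus bias
classifier = hidden_size * num_labels + num_labels

trainable = lora_total + classifier
print(lora_total, classifier, trainable)  # 294912 2307 297219
```

Under these assumptions the LoRA matrices plus the classification head account for the whole reported total. That leaves no room for the newly initialized `pooler.dense` layer (roughly 590k parameters if it is a 768 × 768 dense layer), which suggests it stays frozen at its random initialization. `LoraConfig` accepts a `modules_to_save` list that could be used to train it as well; whether that helps here is untested.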




## Step 5: Train the Model (Simplified Version)

This version skips intermediate evaluation and just fine-tunes the model for 5 epochs. We'll evaluate separately after training.


In [10]:
from transformers import TrainingArguments, Trainer
import evaluate
import numpy as np
from sklearn.metrics import classification_report

# Metrics
accuracy = evaluate.load("accuracy")
f1 = evaluate.load("f1")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=1)
    return {
        "accuracy": accuracy.compute(predictions=preds, references=labels)["accuracy"],
        "f1": f1.compute(predictions=preds, references=labels, average="macro")["f1"]
    }

# Safe config for older versions of Hugging Face
training_args = TrainingArguments(
    output_dir="./discipline_deberta_lora_v3.0",
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=5,
    learning_rate=2e-5,
    weight_decay=0.01,
    logging_steps=50
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_ds,
    eval_dataset=test_ds,
    compute_metrics=compute_metrics
)

trainer.train()


No label_names provided for model class `PeftModelForSequenceClassification`. Since `PeftModel` hides base models input arguments, if label_names is not given, label_names can't be set automatically within `Trainer`. Note that empty label_names list will be used instead.


Step,Training Loss
50,1.1056
100,1.0918
150,1.0867
200,1.0821
250,1.0656
300,1.0388
350,1.034
400,1.0112
450,1.021
500,1.0015


TrainOutput(global_step=570, training_loss=1.0497670458074202, metrics={'train_runtime': 3913.084, 'train_samples_per_second': 1.163, 'train_steps_per_second': 0.146, 'total_flos': 699592116787200.0, 'train_loss': 1.0497670458074202, 'epoch': 5.0})
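
Two quick checks help when reading this log: the step count follows from the dataset and batch sizes, and the loss can be compared with the cross-entropy of uniform guessing over three classes. A small sketch, assuming a single device and no gradient accumulation:

```python
import math

train_examples = 1138 - 228   # abstracts left after holding out the 228-item test set
batch_size = 8
epochs = 5
num_classes = 3

steps_per_epoch = math.ceil(train_examples / batch_size)
total_steps = steps_per_epoch * epochs

# Cross-entropy of a uniform guess over num_classes
chance_loss = math.log(num_classes)

print(total_steps)             # 570
print(round(chance_loss, 4))   # 1.0986
```

The first logged loss (1.1056) sits at the chance level and the last (1.0015) is only about 0.1 below it, so the model learned very little in 5 epochs. LoRA adapters are often trained with larger learning rates than the 2e-5 used here, so that setting may be worth revisiting.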

In [11]:
metrics = trainer.evaluate()
print(metrics)



{'eval_loss': 0.9938977956771851, 'eval_accuracy': 0.5394736842105263, 'eval_f1': 0.3782505910165484, 'eval_runtime': 17.4641, 'eval_samples_per_second': 13.055, 'eval_steps_per_second': 1.661, 'epoch': 5.0}


## Step 6: Evaluation & Results Analysis

After 5 epochs of LoRA fine-tuning on the `microsoft/deberta-base` model, we evaluate the classifier on the test set of 228 abstracts.


In [13]:
from sklearn.metrics import classification_report
import numpy as np

preds = trainer.predict(test_ds)
y_true = preds.label_ids
y_pred = np.argmax(preds.predictions, axis=1)

print(classification_report(y_true, y_pred, target_names=["CS", "IS", "IT"]))




              precision    recall  f1-score   support

          CS       0.51      0.95      0.67        95
          IS       0.62      0.38      0.47        88
          IT       0.00      0.00      0.00        45

    accuracy                           0.54       228
   macro avg       0.38      0.44      0.38       228
weighted avg       0.45      0.54      0.46       228





### 🔍 Metrics Summary:
- **Eval Loss**: 0.99
- **Accuracy**: 54%
- **Macro F1 Score**: 0.38

| Class | Precision | Recall | F1    | Support |
|-------|-----------|--------|-------|---------|
| CS    | 0.51      | 0.95   | 0.67  | 95      |
| IS    | 0.62      | 0.38   | 0.47  | 88      |
| IT    | 0.00      | 0.00   | 0.00  | 45      |
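
The summary figures can be recomputed from the per-class rows, which makes clear why macro F1 is so much lower than accuracy. The sketch below uses only the rounded values from the table, so results agree with the report to two decimals.

```python
# Per-class values copied from the table above: (f1, recall, support)
per_class = {
    "CS": (0.67, 0.95, 95),
    "IS": (0.47, 0.38, 88),
    "IT": (0.00, 0.00, 45),
}

total = sum(support for _, _, support in per_class.values())

# Macro F1: unweighted mean, so the zero for IT counts as much as CS
macro_f1 = sum(f1 for f1, _, _ in per_class.values()) / len(per_class)

# Weighted F1: each class weighted by its support
weighted_f1 = sum(f1 * s for f1, _, s in per_class.values()) / total

# Accuracy equals the support-weighted mean of per-class recall
accuracy = sum(rec * s for _, rec, s in per_class.values()) / total

print(round(macro_f1, 2), round(weighted_f1, 2), round(accuracy, 2))  # 0.38 0.46 0.54
```

Macro F1 averages the three classes equally, so a single class with F1 of 0.00 caps it at two thirds of the mean of the other two.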

### 🧠 Interpretation:
- The model **strongly overpredicts the CS class**: CS recall is high (0.95) but precision is low (0.51).
- **IS classification is weak** (recall 0.38), and **IT is never predicted**, so its precision, recall, and F1 are all 0.00.
- This points to class imbalance (IT makes up only 45 of the 228 test abstracts) or to a weak learning signal from the LoRA adaptation.
- Macro F1 is far below the v2.2 baseline (0.38 vs 0.89), making this model unsuitable for deployment.
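
For context on how far 0.38 is from a trivial floor, a classifier that always predicts the largest test class can be scored from the support counts alone. This is an illustrative calculation, not a result from the notebook.

```python
support = {"CS": 95, "IS": 88, "IT": 45}
total = sum(support.values())
majority = max(support, key=support.get)

def f1(precision, recall):
    """Harmonic mean of precision and recall, 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

per_class_f1 = {}
for name, n in support.items():
    if name == majority:
        # Every abstract is predicted as the majority class
        per_class_f1[name] = f1(n / total, 1.0)
    else:
        # Never predicted: precision and recall are both treated as 0
        per_class_f1[name] = 0.0

baseline_accuracy = support[majority] / total
baseline_macro_f1 = sum(per_class_f1.values()) / len(support)
print(majority, round(baseline_accuracy, 2), round(baseline_macro_f1, 2))  # CS 0.42 0.2
```

The fine-tuned model clears this floor (0.54 vs 0.42 accuracy, 0.38 vs 0.20 macro F1) but remains far from the 0.89 reported for v2.2.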

### ⚠️ Why This Happened:
- LoRA updates only ~0.2% of the model parameters, which may be **insufficient** for this 3-class task with subtle semantic boundaries.
- The model was trained **without class weighting**, so it was likely biased toward the largest class (CS).
- DeBERTa, while strong on general NLP tasks, may lack the domain-specific understanding needed for abstract classification, compared with SciBERT.
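
One way to act on the class-weighting point is to compute inverse-frequency weights, using the same formula as scikit-learn's "balanced" heuristic. The sketch below is a hypothetical helper; it uses the test-set supports as a stand-in for the training counts, which the stratified split should keep in nearly the same proportions.

```python
def balanced_class_weights(counts):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    total = sum(counts.values())
    k = len(counts)
    return {label: total / (k * n) for label, n in counts.items()}

# Test-set supports used as a stand-in for the training class counts
weights = balanced_class_weights({"CS": 95, "IS": 88, "IT": 45})
for label, w in weights.items():
    print(f"{label}: {w:.2f}")
```

These values could be converted to a tensor and passed as the `weight` argument of `torch.nn.CrossEntropyLoss` inside a `Trainer` subclass that overrides `compute_loss`. That wiring is left out here and has not been tested on this dataset.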

In [14]:
import joblib
import os

# Make sure the output directory exists before writing
os.makedirs("Artefacts", exist_ok=True)

# Save model, tokenizer, and label mappings
joblib.dump(model, "Artefacts/discipline_classifier_deberta_lora_v3.0.pkl")
joblib.dump(tokenizer, "Artefacts/tokenizer_deberta_lora_v3.0.pkl")
joblib.dump(label2id, "Artefacts/label2id_deberta_lora_v3.0.pkl")


['Artefacts/label2id_deberta_lora_v3.0.pkl']
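
Pickling the whole model object with joblib ties the file to the exact library versions used when saving. An alternative worth considering is `save_pretrained`, which for a PEFT model stores the adapter weights and config rather than a pickled object. A hedged sketch follows; `save_adapter_bundle` and the directory name are hypothetical.

```python
import json
from pathlib import Path

def save_adapter_bundle(model, tokenizer, label2id, out_dir):
    """Write adapter weights, tokenizer files, and the label map to out_dir.

    Relies on the `save_pretrained` methods that PEFT models and
    Hugging Face tokenizers expose.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    model.save_pretrained(str(out))
    tokenizer.save_pretrained(str(out))
    # Plain JSON keeps the label mapping readable without Python
    (out / "label2id.json").write_text(json.dumps(label2id, indent=2))
    return out

# In the notebook (directory name is hypothetical):
# save_adapter_bundle(model, tokenizer, label2id, "Artefacts/deberta_lora_v3.0")
```

Reloading would then require the base checkpoint plus the saved adapter directory, so the base model name should be recorded alongside the bundle.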