# LLM Fine-tuning Methods Comparison

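This notebook demonstrates various parameter-efficient fine-tuning (PEFT) methods for large language models, alongside a full fine-tuning baseline. As the cell outputs show, each method fine-tunes `distilbert-base-uncased` for binary sentiment classification on the IMDB dataset for 3 epochs.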

**Note:** All implementations share a `BaseFineTuner` class that handles the common training, validation, and evaluation logic. Each method implements only its own configuration and setup, which keeps the codebase clean and maintainable.
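
The shared-base-class design described in the note can be pictured with a small template-method sketch. Every name below (`BaseFineTunerSketch`, `setup_model`, `train`, `evaluate`) is hypothetical and chosen for illustration; the real `BaseFineTuner` in `src.methods` may be organised differently.

```python
from abc import ABC, abstractmethod


class BaseFineTunerSketch(ABC):
    """Hypothetical stand-in for the notebook's BaseFineTuner."""

    method_name = "base"

    @abstractmethod
    def setup_model(self):
        """Return the model with method-specific adapters or freezing applied."""

    def train(self, model):
        # The shared training loop would live here; stubbed for illustration.
        return {"trained": True, "model": model}

    def evaluate(self, state):
        # The shared evaluation logic would live here; stubbed for illustration.
        return {"method": self.method_name, "evaluated": state["trained"]}

    def run(self, save_model=False):
        model = self.setup_model()      # method-specific step
        state = self.train(model)       # shared step
        results = self.evaluate(state)  # shared step
        results["saved"] = save_model   # a real class would write to disk here
        return results


class LoRASketch(BaseFineTunerSketch):
    method_name = "lora"

    def setup_model(self):
        # Placeholder string instead of a real model with adapters attached.
        return "base model + LoRA adapters (placeholder)"


results = LoRASketch().run(save_model=True)
print(results)
```

With this layout, adding a new method means overriding only the setup step, which matches how each cell below calls the same `run(save_model=True)` entry point.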

**Methods covered:**
1. **Full Fine-tuning** - Updates all parameters (baseline)
2. **LoRA** - Low-rank adaptation with trainable A and B matrices
3. **LoRA-FA** - LoRA with frozen A matrix (only B trainable)
4. **LoRA+** - LoRA with different learning rates for A and B
5. **Delta-LoRA** - LoRA with base weight updates via approximation
6. **AdaLoRA** - Adaptive LoRA with dynamic rank allocation
7. **QLoRA** - Quantized LoRA with 4-bit base weights
8. **VeRA** - Vector-based random matrix adaptation
9. **Prompt Tuning** - Simple learnable prompt tokens
10. **P-Tuning** - Prompt tuning with encoder



In [None]:
from src.methods import (
    FullFineTuner,
    LoRAFineTuner,
    LoRAFAFineTuner,
    LoRAPlusFineTuner,
    DeltaLoRAFineTuner,
    AdaLoRAFineTuner,
    QLoRAFineTuner,
    VeRAFineTuner,
    PromptTuningFineTuner,
    PTuningFineTuner
)

## Full Fine-tuning

Updates all model parameters during training.

- Achieves the highest accuracy in this comparison but requires the most memory
- All weights are trainable
- Best suited to settings with sufficient computational resources

In [4]:
full_finetuner = FullFineTuner()
full_finetuner.run(save_model=True)

Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Full Fine-tuning

Loading IMDB dataset...

Starting training...


Epoch 1/3: 100%|██████████| 1563/1563 [04:38<00:00,  5.62it/s]


Epoch 1 - Training Loss: 0.3033
Epoch 1 - Validation Loss: 0.1808


Epoch 2/3: 100%|██████████| 1563/1563 [04:38<00:00,  5.62it/s]


Epoch 2 - Training Loss: 0.1628
Epoch 2 - Validation Loss: 0.1732


Epoch 3/3: 100%|██████████| 1563/1563 [04:38<00:00,  5.62it/s]


Epoch 3 - Training Loss: 0.0904
Epoch 3 - Validation Loss: 0.2632
Model saved to models/full_finetuning

Evaluating after fine-tuning...


Evaluating: 100%|██████████| 782/782 [01:38<00:00,  7.92it/s]


Evaluation Results
Loss: 0.2835
Accuracy: 0.9326 (93.26%)
Precision: 0.9326
Recall: 0.9326
F1 Score: 0.9326

Confusion Matrix:
[[11629   871]
 [  815 11685]]


Full fine-tuning completed!
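
The reported metrics can be cross-checked against the confusion matrix printed above. The sketch below assumes that rows are true classes and columns are predicted classes, and that precision and recall are averaged over the two classes; neither convention is confirmed by the output itself.

```python
# Confusion matrix copied from the evaluation output above
# (assumed layout: rows = true class, columns = predicted class).
confusion = [[11629, 871],
             [815, 11685]]

total = sum(sum(row) for row in confusion)
correct = confusion[0][0] + confusion[1][1]
accuracy = correct / total


def precision(cls):
    # Correct predictions of `cls` divided by all predictions of `cls`.
    predicted = confusion[0][cls] + confusion[1][cls]
    return confusion[cls][cls] / predicted


def recall(cls):
    # Correct predictions of `cls` divided by all true members of `cls`.
    return confusion[cls][cls] / sum(confusion[cls])


macro_precision = (precision(0) + precision(1)) / 2
macro_recall = (recall(0) + recall(1)) / 2

print(f"accuracy={accuracy:.4f}")
print(f"precision={macro_precision:.4f} recall={macro_recall:.4f}")
```

Because the test set is balanced (12,500 reviews per class), macro and weighted averages coincide here.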





## LoRA (Low-Rank Adaptation)

Adds trainable low-rank matrices (A and B) to specific layers while keeping base weights frozen.

- Weight update: W' = W + BA
- Only ~1-2% of parameters are trainable
- Memory efficient with competitive performance (92.50% accuracy in this run versus 93.26% for full fine-tuning)
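
The update rule W' = W + BA can be illustrated with a toy example in plain Python. The matrix sizes and values are made up for illustration, and the zero initialisation of B is the common LoRA convention, assumed here rather than taken from the notebook's code.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def add(X, Y):
    """Element-wise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]


d_out, d_in, r = 3, 4, 2  # toy sizes chosen for illustration

W = [[0.1 * (i + j) for j in range(d_in)] for i in range(d_out)]  # frozen base weight
A = [[0.01 * (i - j) for j in range(d_in)] for i in range(r)]     # r x d_in, trainable
B = [[0.0] * r for _ in range(d_out)]                             # d_out x r, zero init

# At initialisation BA is all zeros, so the adapted weight equals the base weight.
W_adapted = add(W, matmul(B, A))

# Simulate a training update to one entry of B; only row 0 of W' changes.
B[0][0] = 1.0
W_after_update = add(W, matmul(B, A))
```

The base matrix `W` is never modified; only `A` and `B` would receive gradients during training.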

In [5]:
lora = LoRAFineTuner()
lora.run(save_model=True)

Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


trainable params: 887,042 || all params: 67,842,052 || trainable%: 1.3075
LoRA Fine-tuning

Loading IMDB dataset...

Starting training...


Epoch 1/3: 100%|██████████| 1563/1563 [03:52<00:00,  6.72it/s]


Epoch 1 - Training Loss: 0.3041
Epoch 1 - Validation Loss: 0.2450


Epoch 2/3: 100%|██████████| 1563/1563 [03:53<00:00,  6.69it/s]


Epoch 2 - Training Loss: 0.2089
Epoch 2 - Validation Loss: 0.1989


Epoch 3/3: 100%|██████████| 1563/1563 [03:53<00:00,  6.69it/s]


Epoch 3 - Training Loss: 0.1775
Epoch 3 - Validation Loss: 0.1899
Model saved to models/lora

Evaluating after fine-tuning...


Evaluating: 100%|██████████| 782/782 [01:51<00:00,  7.02it/s]


Evaluation Results
Loss: 0.2054
Accuracy: 0.9250 (92.50%)
Precision: 0.9254
Recall: 0.9250
F1 Score: 0.9249

Confusion Matrix:
[[11371  1129]
 [  747 11753]]


LoRA fine-tuning completed!





## LoRA-FA (LoRA with Frozen-A)

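Trains only the B matrix while keeping A frozen, which halves the trainable adapter parameters relative to standard LoRA when A and B have the same size. In this notebook the total trainable count drops from 887,042 (LoRA) to 739,586 (LoRA-FA) rather than by half, which suggests that parameters outside the adapters also remain trainable.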
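
The effect of freezing A on the adapter parameter count can be worked out directly. The helper below is a hypothetical illustration, and the 768 x 768 shape with rank 8 is an assumed configuration, not one read from the notebook's source.

```python
def lora_trainable_params(d_in, d_out, r, freeze_a=False):
    """Trainable adapter parameters for one d_out x d_in weight matrix."""
    a_params = r * d_in   # A has shape r x d_in
    b_params = d_out * r  # B has shape d_out x r
    return b_params if freeze_a else a_params + b_params


# Assumed illustrative sizes: a square 768 x 768 projection with rank 8.
standard = lora_trainable_params(768, 768, 8)
frozen_a = lora_trainable_params(768, 768, 8, freeze_a=True)

print(standard, frozen_a)
```

The reduction is exactly one half only when `d_in == d_out`; for rectangular layers the saving depends on which dimension A spans.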

In [6]:
lora_fa = LoRAFAFineTuner()
lora_fa.run(save_model=True)

Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


trainable params: 739,586 || all params: 67,842,052 || trainable%: 1.0902
LoRA-FA Fine-tuning
Key Feature: Only matrix B is trained, matrix A is frozen

Loading IMDB dataset...

Starting LoRA-FA training...


Epoch 1/3: 100%|██████████| 1563/1563 [03:47<00:00,  6.86it/s]


Epoch 1 - Training Loss: 0.3308
Epoch 1 - Validation Loss: 0.2449


Epoch 2/3: 100%|██████████| 1563/1563 [03:48<00:00,  6.84it/s]


Epoch 2 - Training Loss: 0.2397
Epoch 2 - Validation Loss: 0.2154


Epoch 3/3: 100%|██████████| 1563/1563 [03:48<00:00,  6.84it/s]


Epoch 3 - Training Loss: 0.2189
Epoch 3 - Validation Loss: 0.2051
Model saved to models/lora_fa


Evaluating: 100%|██████████| 782/782 [01:51<00:00,  7.02it/s]


Evaluation Results
Loss: 0.2165
Accuracy: 0.9148 (91.48%)
Precision: 0.9148
Recall: 0.9148
F1 Score: 0.9148

Confusion Matrix:
[[11416  1084]
 [ 1045 11455]]


LoRA-FA fine-tuning completed!





## LoRA+

- Different learning rates for A and B matrices
- The B matrix uses a much higher learning rate (16-32x that of A)
- Aims to improve training stability and performance
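
One way to give A and B different learning rates is to split the parameters into optimizer groups. The helper below is a hypothetical sketch: the function name, the default ratio of 16, and the base learning rate are assumptions, and it relies on adapter parameters carrying `lora_A` or `lora_B` in their names.

```python
def lora_plus_param_groups(named_params, base_lr=1e-4, lr_ratio=16.0):
    """Build optimizer groups in which B matrices get base_lr * lr_ratio."""
    group_a, group_b, group_other = [], [], []
    for name, param in named_params:
        if "lora_B" in name:
            group_b.append(param)
        elif "lora_A" in name:
            group_a.append(param)
        else:
            group_other.append(param)
    groups = [
        {"params": group_a, "lr": base_lr},
        {"params": group_b, "lr": base_lr * lr_ratio},
        {"params": group_other, "lr": base_lr},
    ]
    # Drop empty groups so the optimizer never receives an empty parameter list.
    return [g for g in groups if g["params"]]


# Placeholder strings stand in for real tensors in this demonstration.
demo = [
    ("layer.0.q_lin.lora_A.weight", "A0"),
    ("layer.0.q_lin.lora_B.weight", "B0"),
    ("classifier.weight", "C"),
]
groups = lora_plus_param_groups(demo)
```

With a real model, the returned list of dictionaries has the per-parameter-group shape that PyTorch optimizers accept, after filtering `named_parameters()` down to the entries that require gradients.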

In [7]:
lora_plus = LoRAPlusFineTuner()
lora_plus.run(save_model=True)

Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


trainable params: 887,042 || all params: 67,842,052 || trainable%: 1.3075
LoRA+ Fine-tuning

Loading IMDB dataset...

Starting LoRA+ training...


Epoch 1/3: 100%|██████████| 1563/1563 [03:52<00:00,  6.71it/s]


Epoch 1 - Training Loss: 0.2974
Epoch 1 - Validation Loss: 0.2276


Epoch 2/3: 100%|██████████| 1563/1563 [03:53<00:00,  6.69it/s]


Epoch 2 - Training Loss: 0.2076
Epoch 2 - Validation Loss: 0.1862


Epoch 3/3: 100%|██████████| 1563/1563 [03:53<00:00,  6.69it/s]


Epoch 3 - Training Loss: 0.1629
Epoch 3 - Validation Loss: 0.1985
Model saved to models/lora_plus

Evaluating after fine-tuning...


Evaluating: 100%|██████████| 782/782 [01:51<00:00,  7.01it/s]


Evaluation Results
Loss: 0.2184
Accuracy: 0.9284 (92.84%)
Precision: 0.9286
Recall: 0.9284
F1 Score: 0.9284

Confusion Matrix:
[[11497  1003]
 [  786 11714]]


LoRA+ fine-tuning completed!





## Delta-LoRA - Approximation

Updates both the LoRA adapters and the base weights. In Delta-LoRA, the difference between the products of the low-rank matrices A and B at two consecutive steps is added to W. This implementation instead approximates that mechanism by training the base weights with a separate, smaller learning rate rather than computing the delta directly.

- Combines LoRA efficiency with direct base weight updates
- Base weights are trainable
- Approximates delta mechanism using a smaller learning rate for base weights
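
For reference, the exact delta mechanism described above (which this notebook approximates rather than implements) can be sketched on toy matrices. The scaling factor `lam` and all values are assumptions for illustration, and the product is written as BA to match the LoRA notation used earlier.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def delta_lora_step(W, A_old, B_old, A_new, B_new, lam=1.0):
    """Return W + lam * (B_new A_new - B_old A_old)."""
    old = matmul(B_old, A_old)
    new = matmul(B_new, A_new)
    return [
        [w + lam * (n - o) for w, n, o in zip(row_w, row_n, row_o)]
        for row_w, row_n, row_o in zip(W, new, old)
    ]


W = [[1.0, 2.0], [3.0, 4.0]]  # base weight (2 x 2)
A_old = [[1.0, 0.0]]          # 1 x 2, rank r = 1
B_old = [[0.0], [0.0]]        # 2 x 1, zero before the optimizer step
A_new = [[1.0, 0.0]]          # unchanged by this toy step
B_new = [[0.5], [0.0]]        # B after one optimizer step

W_next = delta_lora_step(W, A_old, B_old, A_new, B_new, lam=0.5)
print(W_next)
```

No gradient is computed for `W` in this scheme; the base weight simply follows the change in the low-rank product.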

In [8]:
delta_lora = DeltaLoRAFineTuner()
delta_lora.run(save_model=True)

Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


trainable params: 67,842,052 || all params: 67,842,052 || trainable%: 100.0000
Delta-LoRA Fine-tuning

Loading IMDB dataset...

Starting Delta-LoRA training...


Epoch 1/3: 100%|██████████| 1563/1563 [05:05<00:00,  5.11it/s]


Epoch 1 - Training Loss: 0.3380
Epoch 1 - Validation Loss: 0.2161


Epoch 2/3: 100%|██████████| 1563/1563 [05:06<00:00,  5.10it/s]


Epoch 2 - Training Loss: 0.2176
Epoch 2 - Validation Loss: 0.1801


Epoch 3/3: 100%|██████████| 1563/1563 [05:06<00:00,  5.10it/s]


Epoch 3 - Training Loss: 0.1877
Epoch 3 - Validation Loss: 0.1857
Model saved to models/delta_lora

Evaluating after fine-tuning...


Evaluating: 100%|██████████| 782/782 [01:51<00:00,  7.02it/s]


Evaluation Results
Loss: 0.2185
Accuracy: 0.9255 (92.55%)
Precision: 0.9257
Recall: 0.9255
F1 Score: 0.9255

Confusion Matrix:
[[11441  1059]
 [  804 11696]]


Delta-LoRA fine-tuning completed!





## AdaLoRA (Adaptive LoRA)

Dynamically adjusts the rank of LoRA adapters during training.

- Adaptive rank allocation based on importance
- Prunes the less important rank components of each adapter during training
- More efficient parameter usage than standard LoRA
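
The budget-based allocation idea can be sketched as a global top-k selection over per-component importance scores. This is a simplified illustration: the scores below are made up, the module names are placeholders, and the actual AdaLoRA method computes importance from training signals and follows a budget schedule, both of which are omitted here.

```python
def prune_to_budget(importance, budget):
    """Keep the `budget` highest-scoring rank components across all modules.

    importance: dict mapping module name -> list of scores, one per component.
    Returns a dict of 0/1 masks with the same shape as the input.
    """
    scored = [
        (score, name, idx)
        for name, scores in importance.items()
        for idx, score in enumerate(scores)
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    keep = {(name, idx) for _, name, idx in scored[:budget]}
    return {
        name: [1 if (name, idx) in keep else 0 for idx in range(len(scores))]
        for name, scores in importance.items()
    }


# Made-up importance scores for two adapted modules with three components each.
importance = {"q_lin": [0.9, 0.1, 0.4], "v_lin": [0.8, 0.7, 0.05]}
masks = prune_to_budget(importance, budget=4)
effective_rank = {name: sum(mask) for name, mask in masks.items()}
print(masks, effective_rank)
```

Modules whose components score higher end up with a larger effective rank, which is the sense in which the rank is allocated adaptively.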

In [9]:
ada_lora = AdaLoRAFineTuner()
ada_lora.run(save_model=True)

Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


trainable params: 1,034,786 || all params: 67,989,820 || trainable%: 1.5220
AdaLoRA Fine-tuning

Loading IMDB dataset...

Starting AdaLoRA training...


Epoch 1/3: 100%|██████████| 1563/1563 [04:08<00:00,  6.29it/s]


Epoch 1 - Training Loss: 0.5267
Epoch 1 - Validation Loss: 0.3245


Epoch 2/3: 100%|██████████| 1563/1563 [04:09<00:00,  6.27it/s]


Epoch 2 - Training Loss: 0.2559
Epoch 2 - Validation Loss: 0.2293


Epoch 3/3: 100%|██████████| 1563/1563 [04:09<00:00,  6.27it/s]


Epoch 3 - Training Loss: 0.2365
Epoch 3 - Validation Loss: 0.2197
Model saved to models/adalora

Evaluating after fine-tuning...


Evaluating: 100%|██████████| 782/782 [01:56<00:00,  6.71it/s]



Evaluation Results
Loss: 0.2221
Accuracy: 0.9108 (91.08%)
Precision: 0.9109
Recall: 0.9108
F1 Score: 0.9108

Confusion Matrix:
[[11292  1208]
 [ 1021 11479]]


AdaLoRA fine-tuning completed!


## QLoRA (Quantized LoRA)

Combines 4-bit quantization with LoRA for memory-efficient fine-tuning of large models.

- Base model weights quantized to 4-bit (NF4)
- LoRA adapters are not quantized and remain in higher precision
- Significantly reduces memory requirements

**Note:** Requires x86_64 Linux with CUDA (NVIDIA GPU)
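
A rough estimate shows where the memory saving comes from. The sketch below counts weight storage only and deliberately ignores quantization constants, layers that stay unquantized, activations, and optimizer state; the parameter count is a round figure close to the totals printed in this notebook, used as an assumption.

```python
def weight_memory_mib(n_params, bits_per_param):
    """Storage needed for n_params weights at the given bit width, in MiB."""
    return n_params * bits_per_param / 8 / 2**20


n_params = 67_000_000  # assumed round figure, close to the logged totals

fp32_mib = weight_memory_mib(n_params, 32)
nf4_mib = weight_memory_mib(n_params, 4)

print(f"fp32: {fp32_mib:.1f} MiB, 4-bit: {nf4_mib:.1f} MiB")
```

For a model as small as DistilBERT the absolute saving is modest; the same 8x ratio matters far more for models with billions of parameters.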

In [11]:
qlora = QLoRAFineTuner()
qlora.run(save_model=True)

Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


trainable params: 296,450 || all params: 67,249,922 || trainable%: 0.4408
QLoRA Fine-tuning

Loading IMDB dataset...

Starting QLoRA training...


Epoch 1/3: 100%|██████████| 1563/1563 [04:18<00:00,  6.05it/s]


Epoch 1 - Training Loss: 0.3331
Epoch 1 - Validation Loss: 0.2037


Epoch 2/3: 100%|██████████| 1563/1563 [04:18<00:00,  6.04it/s]


Epoch 2 - Training Loss: 0.2116
Epoch 2 - Validation Loss: 0.1642


Epoch 3/3: 100%|██████████| 1563/1563 [04:18<00:00,  6.04it/s]


Epoch 3 - Training Loss: 0.1857
Epoch 3 - Validation Loss: 0.1833
Model saved to models/qlora


Evaluating: 100%|██████████| 782/782 [01:28<00:00,  8.84it/s]


Evaluation Results
Loss: 0.2028
Accuracy: 0.9258 (92.58%)
Precision: 0.9260
Recall: 0.9258
F1 Score: 0.9258

Confusion Matrix:
[[11468  1032]
 [  822 11678]]


QLoRA fine-tuning completed!





## VeRA (Vector-based Random Matrix Adaptation)

Uses shared frozen random matrices (A and B) with trainable scaling vectors (b and d).

- Only two trainable vectors per module (b and d)
- A and B matrices are frozen shared random matrices
- Extremely parameter efficient
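
The role of the two scaling vectors can be shown with a toy computation of the adapter output, diag(b) B diag(d) A x. The matrices and vectors below are made-up values, and the zero initialisation of b with a constant d follows the usual VeRA convention, assumed here rather than read from the notebook's code.

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def vera_delta(x, A, B, d, b):
    """Compute diag(b) @ B @ diag(d) @ A @ x with frozen A and B."""
    low = matvec(A, x)                          # project down to r values
    low = [di * li for di, li in zip(d, low)]   # scale by trainable d
    out = matvec(B, low)                        # project up to d_out values
    return [bi * oi for bi, oi in zip(b, out)]  # scale by trainable b


A = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]  # frozen, r x d_in = 2 x 3
B = [[1.0, 2.0], [0.0, 1.0]]             # frozen, d_out x r = 2 x 2
x = [1.0, 2.0, 3.0]

# With b initialised to zero the adapter contributes nothing at the start.
delta_init = vera_delta(x, A, B, d=[0.1, 0.1], b=[0.0, 0.0])

# After training, b and d hold learned values (arbitrary ones shown here).
delta_trained = vera_delta(x, A, B, d=[1.0, 1.0], b=[1.0, 1.0])

trainable_per_module = len(B) + len(A)  # len(b) + len(d) = d_out + r
```

For this toy module the trainable count is 4, compared with r * (d_in + d_out) = 10 for a LoRA adapter of the same rank.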

In [12]:
vera = VeRAFineTuner()
vera.run(save_model=True)

Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


trainable params: 616,706 || all params: 67,571,716 || trainable%: 0.9127
VeRA Fine-tuning

Loading IMDB dataset...

Starting VeRA training...


Epoch 1/3: 100%|██████████| 1563/1563 [04:16<00:00,  6.11it/s]


Epoch 1 - Training Loss: 0.3521
Epoch 1 - Validation Loss: 0.2781


Epoch 2/3: 100%|██████████| 1563/1563 [04:16<00:00,  6.09it/s]


Epoch 2 - Training Loss: 0.2537
Epoch 2 - Validation Loss: 0.2246


Epoch 3/3: 100%|██████████| 1563/1563 [04:16<00:00,  6.09it/s]


Epoch 3 - Training Loss: 0.2349
Epoch 3 - Validation Loss: 0.2150
Model saved to models/vera

Evaluating after fine-tuning...


Evaluating: 100%|██████████| 782/782 [02:02<00:00,  6.36it/s]



Evaluation Results
Loss: 0.2230
Accuracy: 0.9102 (91.02%)
Precision: 0.9104
Recall: 0.9102
F1 Score: 0.9102

Confusion Matrix:
[[11227  1273]
 [  972 11528]]


VeRA fine-tuning completed!


## Prompt Tuning

Adds learnable prompt tokens to input sequences that guide model behavior without modifying base weights.

- Base model weights remain completely frozen
- Learnable prompt tokens prepended to inputs
- Highly parameter-efficient (only prompt embeddings are trainable)
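
Prepending learnable prompt tokens amounts to concatenating extra embedding vectors in front of the token embeddings and extending the attention mask. The sketch below uses plain lists with made-up sizes; the helper name is hypothetical.

```python
def prepend_prompt(prompt_embeddings, input_embeddings, attention_mask):
    """Prepend virtual-token embeddings and extend the attention mask to match."""
    extended_inputs = prompt_embeddings + input_embeddings
    extended_mask = [1] * len(prompt_embeddings) + attention_mask
    return extended_inputs, extended_mask


hidden = 4     # toy embedding width
n_virtual = 2  # number of learnable prompt tokens

prompt = [[0.01 * (i + 1)] * hidden for i in range(n_virtual)]  # trainable
tokens = [[1.0] * hidden, [2.0] * hidden, [0.0] * hidden]       # frozen embedding output
mask = [1, 1, 0]                                                # last position is padding

inputs, new_mask = prepend_prompt(prompt, tokens, mask)
trainable_values = n_virtual * hidden
print(len(inputs), new_mask, trainable_values)
```

Only the values in `prompt` would be updated during training, so the trainable prompt size is the number of virtual tokens times the embedding width.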

In [14]:
prompt_tuning = PromptTuningFineTuner()
prompt_tuning.run(save_model=True)

Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


trainable params: 607,490 || all params: 67,562,500 || trainable%: 0.8992
Prompt Tuning Fine-tuning

Loading IMDB dataset...

Starting Prompt Tuning training...


Epoch 1/3: 100%|██████████| 1563/1563 [03:27<00:00,  7.52it/s]


Epoch 1 - Training Loss: 0.5103
Epoch 1 - Validation Loss: 0.4181


Epoch 2/3: 100%|██████████| 1563/1563 [03:29<00:00,  7.47it/s]


Epoch 2 - Training Loss: 0.3846
Epoch 2 - Validation Loss: 0.3761


Epoch 3/3: 100%|██████████| 1563/1563 [03:29<00:00,  7.47it/s]


Epoch 3 - Training Loss: 0.3613
Epoch 3 - Validation Loss: 0.3655
Model saved to models/prompt_tuning

Evaluating after fine-tuning...


Evaluating: 100%|██████████| 782/782 [01:38<00:00,  7.90it/s]


Evaluation Results
Loss: 0.3316
Accuracy: 0.8588 (85.88%)
Precision: 0.8591
Recall: 0.8588
F1 Score: 0.8588

Confusion Matrix:
[[10552  1948]
 [ 1581 10919]]


Prompt Tuning fine-tuning completed!





## P-Tuning

Uses a prompt encoder (MLP/LSTM) to generate prompt representations that are optimized across transformer layers.

- Base model weights remain frozen
- Prompt encoder (MLP/LSTM) generates prompt representations
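- More complex than simple prompt tuning, with more trainable parameters (2,379,266 versus 607,490); in this run it reaches 88.94% accuracy versus 85.88% for prompt tuning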
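
The prompt-encoder idea can be sketched as a small MLP that transforms raw prompt embeddings before they are fed to the model. The two-layer structure, the tanh activation, and all weights below are assumptions for illustration; the encoder used by `PTuningFineTuner` may differ.

```python
import math


def mlp_prompt_encoder(raw_prompts, W1, b1, W2, b2):
    """Reparameterize raw prompt embeddings with a two-layer MLP (tanh hidden)."""
    encoded = []
    for p in raw_prompts:
        hidden = [
            math.tanh(sum(w * x for w, x in zip(row, p)) + bias)
            for row, bias in zip(W1, b1)
        ]
        out = [sum(w * h for w, h in zip(row, hidden)) + bias for row, bias in zip(W2, b2)]
        encoded.append(out)
    return encoded


# Toy sizes: embedding width 2, hidden width 3, two virtual tokens.
W1 = [[0.5, -0.5], [1.0, 0.0], [0.0, 1.0]]  # 3 x 2
b1 = [0.0, 0.0, 0.0]
W2 = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]     # 2 x 3
b2 = [0.5, -0.5]

raw = [[0.0, 0.0], [1.0, -1.0]]
encoded = mlp_prompt_encoder(raw, W1, b1, W2, b2)
```

Both the raw prompt embeddings and the encoder weights are trained, which is one reason the trainable parameter count is higher than for plain prompt tuning.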

In [None]:
p_tuning = PTuningFineTuner()
p_tuning.run(save_model=True)

Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


trainable params: 2,379,266 || all params: 69,334,276 || trainable%: 3.4316
P-Tuning v2 Fine-tuning

Loading IMDB dataset...

Starting P-Tuning v2 training...


Epoch 1/3: 100%|██████████| 1563/1563 [03:28<00:00,  7.51it/s]


Epoch 1 - Training Loss: 0.3967
Epoch 1 - Validation Loss: 0.3889


Epoch 2/3: 100%|██████████| 1563/1563 [03:29<00:00,  7.47it/s]


Epoch 2 - Training Loss: 0.3105
Epoch 2 - Validation Loss: 0.2878


Epoch 3/3: 100%|██████████| 1563/1563 [03:29<00:00,  7.47it/s]


Epoch 3 - Training Loss: 0.2842
Epoch 3 - Validation Loss: 0.2849
Model saved to models/p_tuning_v2

Evaluating after fine-tuning...


Evaluating: 100%|██████████| 782/782 [01:38<00:00,  7.91it/s]


Evaluation Results
Loss: 0.2702
Accuracy: 0.8894 (88.94%)
Precision: 0.8895
Recall: 0.8894
F1 Score: 0.8894

Confusion Matrix:
[[11036  1464]
 [ 1301 11199]]


P-Tuning v2 fine-tuning completed!



