# Day 3b
In this notebook, we continue with tweet bias classification, but now use feature extraction instead of zero-/few-shot classification. We use the `sentence-transformers` library to extract features from the text and then a `RidgeClassifierCV` to classify the tweets. The classifier is trained on additional training data (`media_bias_train`) and evaluated on the same data as in day_3a (`media_bias_test`).

## Environment Setup
Make sure to set your runtime back to using CPU by going to `Runtime` -> `Change runtime type` -> `Hardware accelerator` -> `CPU`. This will save you some GPU hours.

In [None]:
import sys
if 'google.colab' in sys.modules:  # If in Google Colab environment
    # Mount google drive to enable access to data files
    from google.colab import drive
    drive.mount('/content/drive')

    # Install requisite packages
    !pip install sentence_transformers &> /dev/null

    # Change working directory to day_3
    %cd /content/drive/MyDrive/LLM4BeSci_StGallen2025/day_3


In [None]:
import pandas as pd
from sentence_transformers import SentenceTransformer
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import RidgeClassifierCV
import seaborn as sns

## Feature Extraction
The code begins by loading the data as `pandas.DataFrame` objects.

In [None]:
# Reload test data from the last notebook (day_3a.ipynb)
media_bias_test = pd.read_csv('media_bias_test.csv')

# Load training data
media_bias_train = pd.read_csv('media_bias_train.csv')
media_bias_train

Note the considerable increase in the number of training samples. The code next initializes the `SentenceTransformer` model `'all-mpnet-base-v2'` and extracts features from the training data using the `encode` method.

In [None]:
# Initialize feature extraction pipeline
model = SentenceTransformer('all-mpnet-base-v2')

# Extract features
train_features = model.encode(media_bias_train['text'])
train_features


The features are then standardised before being fed into `RidgeClassifierCV`. This is crucial because `RidgeClassifierCV` uses L2 (ridge) regularisation to prevent over-fitting, and the penalty treats all coefficients alike, which only makes sense if the features are on the same scale. The classifier is then trained on the training data using the `fit` method. Note that `RidgeClassifierCV` automatically performs cross-validation on the training data to select the best value from the list of `alphas` provided. Performance on the training data is then evaluated using the `score` method.
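
To make the standardisation step concrete, here is a minimal NumPy sketch of what fitting a scaler on the training features and reusing it on other data amounts to. The toy arrays and the helper names `fit_scaler` and `apply_scaler` are made up for illustration; the notebook itself relies on `StandardScaler`.

```python
import numpy as np

def fit_scaler(X):
    """Return per-feature mean and standard deviation estimated from X."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant features
    return mean, std

def apply_scaler(X, mean, std):
    """Standardise X with previously estimated statistics."""
    return (X - mean) / std

# Toy "embeddings": two features on very different scales
train = np.array([[1.0, 100.0], [2.0, 300.0], [3.0, 500.0]])
test = np.array([[2.0, 300.0], [4.0, 700.0]])

mean, std = fit_scaler(train)                 # statistics come from the training data only
train_scaled = apply_scaler(train, mean, std)
test_scaled = apply_scaler(test, mean, std)   # the same statistics are reused for the test data

print(train_scaled.mean(axis=0))  # approximately 0 for every feature
print(train_scaled.std(axis=0))   # approximately 1 for every feature
print(test_scaled)
```

Estimating the mean and standard deviation from the training data alone means that no information about the test set enters the preprocessing.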

In [None]:
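# Standardize features (the scaler is fitted on the training data only)
scaler = StandardScaler()
scaler.fit(train_features)
train_features = scaler.transform(train_features)  # overwrite so the classifier is trained on the scaled features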

# Initialize classifier
ridge = RidgeClassifierCV(alphas=[1e-3, 1e-2, 1e-1, 1, 10, 100])

# Train classifier
ridge.fit(train_features, media_bias_train['bias'])
f"Train accuracy: {ridge.score(train_features, media_bias_train['bias'])}"

Features are next extracted for the test set and standardised using the same `StandardScaler` object that was fitted on the training data to prevent data leakage. The classifier is then evaluated on the test data using the `score` method.

In [None]:
# Extract features for test set
test_features = model.encode(media_bias_test['text'])

# Standardising features
test_features = scaler.transform(test_features)

# Test classifier
f"Test accuracy: {ridge.score(test_features, media_bias_test['bias'])}"

As you can see, feature extraction outperforms zero-shot and few-shot classification from the last notebook. Why do you think this is?

We can also visualize the confusion matrix:

In [None]:
# Confusion matrix
confusion = pd.crosstab(media_bias_test['bias'], ridge.predict(test_features))
sns.heatmap(confusion, annot=True)

As with the few-shot approach, the feature extraction approach identifies more neutral tweets as partisan (false positives) than it does partisan tweets as neutral (false negatives).
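
The two error types can also be counted directly. The sketch below uses short, made-up label lists in place of `media_bias_test['bias']` and `ridge.predict(test_features)`, and treats `partisan` as the positive class; it is only meant to show how the cells of the confusion matrix map onto false positives and false negatives.

```python
from collections import Counter

# Hypothetical true and predicted labels (stand-ins for the notebook's data)
y_true = ["neutral", "neutral", "partisan", "partisan", "neutral", "partisan"]
y_pred = ["partisan", "neutral", "partisan", "neutral", "partisan", "partisan"]

# Count every (true, predicted) combination, i.e. the confusion-matrix cells
cells = Counter(zip(y_true, y_pred))

# "partisan" is treated as the positive class
false_positives = cells[("neutral", "partisan")]  # neutral tweets flagged as partisan
false_negatives = cells[("partisan", "neutral")]  # partisan tweets classified as neutral
n_correct = cells[("neutral", "neutral")] + cells[("partisan", "partisan")]
accuracy = n_correct / len(y_true)

print(f"False positives: {false_positives}, false negatives: {false_negatives}")
print(f"Accuracy: {accuracy:.2f}")
```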

**TASK 1:** Go to the [MTEB leaderboard](https://huggingface.co/spaces/mteb/leaderboard) and find a well-performing small model (i.e., high on the leaderboard, <1 billion parameters). Open the model card by clicking on the model and check whether the model can be run with `sentence-transformers` (look at the tags under the model name: there should be a tag called `sentence-transformers`). Replace `"all-mpnet-base-v2"` in the code above with the name of the new model and re-run the analysis. Does the performance improve?

**TASK 2:** The test performance measure we have been using so far is a single point estimate. Can you think of a way to get an uncertainty estimate on the test performance (e.g., a confidence interval)? Hint: Think along the lines of bootstrapping.
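
One possible starting point is a percentile bootstrap over the per-tweet outcomes. The sketch below is not the only valid approach; the function name `bootstrap_accuracy_ci`, the number of resamples, and the example outcome vector are assumptions made for illustration. In the notebook, the 0/1 flags could be obtained by comparing the classifier's predictions with `media_bias_test['bias']`.

```python
import random

def bootstrap_accuracy_ci(correct, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for accuracy.

    `correct` is a list of 0/1 flags, one per test item (1 = prediction was right).
    """
    rng = random.Random(seed)
    n = len(correct)
    accuracies = []
    for _ in range(n_resamples):
        resample = rng.choices(correct, k=n)  # draw n items with replacement
        accuracies.append(sum(resample) / n)
    accuracies.sort()
    lower_idx = int((1 - level) / 2 * n_resamples)
    upper_idx = int((1 + level) / 2 * n_resamples) - 1
    return accuracies[lower_idx], accuracies[upper_idx]

# Hypothetical outcome: 75 of 100 test tweets classified correctly
correct = [1] * 75 + [0] * 25
low, high = bootstrap_accuracy_ci(correct)
print(f"Accuracy: {sum(correct) / len(correct):.2f}, 95% CI: [{low:.2f}, {high:.2f}]")
```

Resampling only the test items captures uncertainty due to the limited test set; it does not reflect variability from retraining the classifier.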

## **BONUS - LoRA Fine-Tuning (Small BERT)**

So far, we have treated the language model as a fixed feature extractor:
the model produces embeddings, and all task learning happens in a separate, linear classifier.

Now we allow the language model itself to **adapt to the task**.

A straightforward way to do this would be full fine-tuning, where all model parameters are updated.

However, full fine-tuning (1) updates millions of parameters, (2) requires substantial GPU memory and time, and (3) is often unnecessary for relatively small classification tasks.

Instead, we use **LoRA (Low-Rank Adaptation)**, a parameter-efficient fine-tuning method.

LoRA inserts small, trainable low-rank matrices into the model’s attention layers, keeps the original pretrained weights frozen, and updates less than 1% of the total parameters.

This keeps training fast and lightweight, while still allowing task-specific learning.
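
The core idea can be sketched in a few lines of NumPy. This is a simplified illustration, not the PEFT implementation: the layer width `d`, the random matrices, and the helper `lora_forward` are assumptions, and the zero initialisation of `B` mirrors common practice so that training starts from the unchanged pretrained layer.

```python
import numpy as np

rng = np.random.default_rng(0)

d, r, alpha = 768, 8, 16              # hypothetical layer width, LoRA rank, scaling factor

W = rng.normal(size=(d, d))           # pretrained weight matrix: frozen, never updated
A = rng.normal(size=(r, d)) * 0.01    # trainable, small random initialisation
B = np.zeros((d, r))                  # trainable, initialised to zero

def lora_forward(x, W, A, B, alpha, r):
    """Output of the frozen layer plus the scaled low-rank correction."""
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

x = rng.normal(size=(4, d))           # a batch of 4 input vectors

# Because B starts at zero, the adapted layer initially equals the original one
print(np.allclose(lora_forward(x, W, A, B, alpha, r), x @ W.T))

# Only A and B would receive gradient updates during fine-tuning
full_params = W.size
lora_params = A.size + B.size
print(f"full: {full_params}, LoRA: {lora_params} ({lora_params / full_params:.1%})")
```

The printed share refers to a single adapted matrix. The share of the whole model is smaller still, because most of the model's weights receive no adapter at all.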


## Environment Setup (LoRA fine-tuning)

For fine-tuning, a GPU runtime is recommended.

In [None]:
import sys
if 'google.colab' in sys.modules:
    !pip -q install transformers datasets evaluate accelerate peft

import numpy as np
import pandas as pd
import torch
import transformers

print("torch:", torch.__version__)
print("transformers:", transformers.__version__)
print("cuda available:", torch.cuda.is_available())

torch: 2.9.0+cu126
transformers: 4.57.3
cuda available: True


## Load the same data and prepare labels


We reuse:
- `media_bias_train.csv`
- `media_bias_test.csv`

Columns:
- `text`: tweet text
- `bias`: label


Transformers expect integer labels: 0..K-1.
We create a mapping from label name → id using the training set.

In [None]:
# 1. Load Data
train_df = pd.read_csv("media_bias_train.csv")
test_df  = pd.read_csv("media_bias_test.csv")

# 2. Prepare Label Mappings
# We create a dictionary to map string labels to integers and vice versa.
label_names = sorted(train_df["bias"].unique())
label2id = {name: i for i, name in enumerate(label_names)}
id2label = {i: name for name, i in label2id.items()}

# 3. Apply mapping to DataFrames
# We create a new column 'label' which the Trainer specifically looks for by default
train_df["label"] = train_df["bias"].map(label2id)
test_df["label"]  = test_df["bias"].map(label2id)

# Quick check
print(label2id)
train_df.head(3)

{'neutral': 0, 'partisan': 1}


Unnamed: 0,author,text,bias,type,audience,label
0,Mark Pocan (Representative from Wisconsin),Excited to join @fairvote today @NYUWashington...,neutral,policy,national,0
1,Ileana Ros-Lehtinen (Representative from Florida),Placer reunirme c la directora de @NTN24 @CGur...,neutral,media,national,0
2,George Miller (Representative from California),DID YOU KNOW: 73% of Americans want to #RaiseT...,neutral,policy,national,0


## Tokenization

Transformer models cannot read raw text directly.
Instead, they rely on a built-in tokenizer that converts text into a numerical representation.

Here, we simply:
- pass each text through the tokenizer,
- let it take care of formatting details internally,
- and ensure texts are not too long for the model.

No manual feature engineering is needed — the model handles this step for us.
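
To give a feel for what comes out of this step, here is a deliberately simplified toy tokenizer. It splits on whitespace and uses a made-up vocabulary, whereas real tokenizers split text into subword units and add special tokens; only the overall shape of the output (`input_ids` and `attention_mask`, with truncation and padding) is meant to carry over.

```python
def toy_tokenize(text, vocab, max_length=8, pad_id=0, unk_id=1):
    """Toy whitespace tokenizer illustrating truncation and padding."""
    ids = [vocab.get(word, unk_id) for word in text.lower().split()]
    ids = ids[:max_length]               # truncation: drop tokens beyond max_length
    attention_mask = [1] * len(ids)      # 1 = real token
    n_pad = max_length - len(ids)
    ids += [pad_id] * n_pad              # padding: fill up to max_length
    attention_mask += [0] * n_pad        # 0 = padding, to be ignored by the model
    return {"input_ids": ids, "attention_mask": attention_mask}

# Hypothetical miniature vocabulary
vocab = {"[PAD]": 0, "[UNK]": 1, "we": 2, "must": 3, "raise": 4, "the": 5, "wage": 6}

encoded = toy_tokenize("We must raise the minimum wage", vocab)
print(encoded["input_ids"])       # [2, 3, 4, 5, 1, 6, 0, 0]
print(encoded["attention_mask"])  # [1, 1, 1, 1, 1, 1, 0, 0]
```

The word "minimum" is not in the toy vocabulary and is therefore mapped to the unknown-token id; subword tokenizers largely avoid this by breaking rare words into smaller pieces.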


In [None]:
from datasets import Dataset
from transformers import AutoTokenizer

# Define the base model - a small version of BERT to fit in memory/compute constraints
base_model_name = "sentence-transformers/all-MiniLM-L6-v2"

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_name)

# Tokenize the text. truncation=True cuts off texts longer than max_length;
# padding="max_length" pads shorter texts up to max_length.
def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256, padding="max_length")

# Convert Pandas DataFrames to Hugging Face Datasets
train_ds = Dataset.from_pandas(train_df[["text", "label"]])
test_ds  = Dataset.from_pandas(test_df[["text", "label"]])

# Apply tokenization
# We remove the 'text' column because the model only needs the numerical 'input_ids'
train_tok = train_ds.map(tokenize, batched=True).remove_columns(["text"])
test_tok  = test_ds.map(tokenize, batched=True).remove_columns(["text"])


Map:   0%|          | 0/1200 [00:00<?, ? examples/s]

Map:   0%|          | 0/100 [00:00<?, ? examples/s]

## LoRA setup (PEFT)

We load a BERT sequence classifier and apply LoRA adapters to attention projections.

For BERT, common LoRA targets are `query` and `value`.
This updates only a small number of parameters while leaving the base model frozen.



**LoRA parameters**

- `r`: the rank of the low-rank adapters.  
  Higher values give the model more capacity to adapt, but add more trainable parameters.

- `lora_alpha`: a scaling factor for the LoRA updates.  
  It controls how strongly the adapters influence the original model weights.

- `lora_dropout`: dropout applied inside the LoRA adapters during training.  
  This helps regularization and can reduce overfitting.
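
A little arithmetic shows how `r` and `lora_alpha` interact. The sketch assumes the usual convention that the low-rank update is scaled by `lora_alpha / r`, and it uses a hypothetical projection width of 384; the helper `lora_adapter_params` is made up for illustration.

```python
def lora_adapter_params(d_in, d_out, r):
    """Trainable parameters added by one adapter: A is (r x d_in), B is (d_out x r)."""
    return r * d_in + d_out * r

hidden_size = 384   # hypothetical width of one attention projection
lora_alpha = 16

for r in (4, 8, 16):
    scaling = lora_alpha / r  # factor applied to the low-rank update
    n_params = lora_adapter_params(hidden_size, hidden_size, r)
    print(f"r={r:>2}: {n_params:>6} params per adapted matrix, scaling={scaling}")
```

Doubling `r` doubles the number of adapter parameters, while the scaling factor halves unless `lora_alpha` is increased along with it.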

In [None]:
from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, get_peft_model, TaskType

# Load the base model
model = AutoModelForSequenceClassification.from_pretrained(
    base_model_name,
    num_labels=len(label_names),
    id2label=id2label,
    label2id=label2id,
)

# LoRA Configuration
# LoRA works by adding pairs of rank-decomposition matrices to existing weights
# and only training those newly added weights.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS, # Sequence Classification
    r=8,                        # Rank: The dimension of the low-rank matrices. Higher = more parameters.
    lora_alpha=16,              # Alpha: Scaling factor. Usually set to 2x rank. Controls weight of adapter.
    lora_dropout=0.05,          # Dropout probability for LoRA layers
    target_modules=["query", "value"], # Modules to apply LoRA to. For BERT, query/value is standard.
)

# Wrap the base model with the LoRA configuration
model = get_peft_model(model, lora_config)

# Verify trainable parameters
# You should see a very low percentage (usually <1%)
model.print_trainable_parameters()


Some weights of BertForSequenceClassification were not initialized from the model checkpoint at sentence-transformers/all-MiniLM-L6-v2 and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


trainable params: 33,282 || all params: 11,204,356 || trainable%: 0.2970


## Training

We train with HuggingFace `Trainer`.

**Key training parameters**

- `num_train_epochs`: how many times the model sees the full training dataset.  More epochs allow better learning but may lead to overfitting.

- `learning_rate`: size of each update step during training.  Smaller values are more stable; larger values learn faster but can be unstable.

- `logging_steps`: how often training progress is printed.

- `save_strategy`: controls whether model checkpoints are saved during training.  Here we disable saving.

- `report_to`: disables external logging tools (e.g., Weights & Biases).
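
These settings also determine how many update steps training will take. The sketch below assumes a single device, no gradient accumulation, and that the last, smaller batch of each epoch is kept; the training-set size of 1,200 is an assumption here and should be replaced with `len(train_df)`.

```python
import math

def total_optimizer_steps(n_examples, batch_size, num_epochs):
    """Number of update steps, assuming one device and no gradient accumulation."""
    steps_per_epoch = math.ceil(n_examples / batch_size)  # the last batch may be smaller
    return steps_per_epoch * num_epochs

# Assumed values: 1,200 training examples, batch size 64, 15 epochs
steps = total_optimizer_steps(n_examples=1200, batch_size=64, num_epochs=15)
n_log_lines = steps // 50  # how often the loss is reported with logging_steps=50

print(steps, n_log_lines)
```

With these values the result is 285 steps, so the training loss is reported five times.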

In [None]:
from transformers import TrainingArguments, Trainer, DataCollatorWithPadding

# Training Arguments
training_args = TrainingArguments(
    output_dir="./lora_small_bert",
    per_device_train_batch_size=64,
    logging_steps = 50,
    num_train_epochs=15, # will take a bit of time
    learning_rate=2e-4, # LoRA usually requires a higher LR
    save_strategy="no",
    report_to="none"
)

# Initialize Trainer
trainer = Trainer(
    model=model,  # LoRA-augmented BERT model whose trainable parameters will be optimized
    args=training_args,  # Training hyperparameters (learning rate, epochs, batch size, logging, etc.)
    train_dataset=train_tok,  # Tokenized training data providing inputs and labels
    data_collator=DataCollatorWithPadding(tokenizer),  # Collates examples into batches; inputs are already padded to max_length by tokenize()
)

# Start Training
trainer.train()


Step,Training Loss
50,0.6922
100,0.6613
150,0.6154
200,0.5806
250,0.563


TrainOutput(global_step=285, training_loss=0.6153738523784437, metrics={'train_runtime': 72.6933, 'train_samples_per_second': 247.616, 'train_steps_per_second': 3.921, 'total_flos': 90108702720000.0, 'train_loss': 0.6153738523784437, 'epoch': 15.0})

## Evaluation

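To fairly compare the base model and the LoRA-fine-tuned model, we must evaluate them using exactly the same procedure. Note that the base model's classification head is newly initialized (see the warning printed when the model is loaded), so its accuracy reflects an untrained classifier rather than any knowledge of the task.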

Instead of writing our own evaluation loop, we use HuggingFace’s built-in **Trainer.evaluate()** method.

The idea is simple: we keep the evaluation setup fixed and only change which model is being evaluated.

To do this, we:
- define how accuracy should be computed,
- use a `Trainer` object to evaluate the base model,
- use the same setup to evaluate the LoRA-fine-tuned model.

In [None]:
import evaluate

# Load a standard accuracy metric
accuracy_metric = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    """
    This function tells the Trainer how to measure performance.

    It receives:
    - model outputs (logits)
    - the correct labels

    It returns:
    - classification accuracy
    """
    logits, labels = eval_pred

    # Convert model scores into predicted class labels
    preds = logits.argmax(axis=-1)

    # Compare predictions to true labels and compute accuracy
    return accuracy_metric.compute(
        predictions=preds,
        references=labels
    )
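
As a quick sanity check of the logic inside `compute_metrics`, the same argmax-then-compare computation can be run by hand on a few made-up logits. The numbers below are purely illustrative; the class ids follow the mapping `{'neutral': 0, 'partisan': 1}` printed earlier.

```python
import numpy as np

# Hypothetical logits for four tweets and two classes (0 = neutral, 1 = partisan)
logits = np.array([
    [2.1, -0.3],
    [0.2, 1.5],
    [-1.0, 0.4],
    [0.9, 0.8],
])
labels = np.array([0, 1, 0, 0])

preds = logits.argmax(axis=-1)        # index of the highest score in each row
accuracy = (preds == labels).mean()   # share of predictions that match the labels

print(preds)     # [0 1 1 0]
print(accuracy)  # 0.75
```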



In [None]:
from transformers import Trainer

# Load the base model
base_model = AutoModelForSequenceClassification.from_pretrained(
    base_model_name,
    num_labels=len(label_names),
    id2label=id2label,
    label2id=label2id,
)

# Create a Trainer for the base (not fine-tuned) model
base_trainer = Trainer(
    model=base_model,        # pretrained model with no task-specific adaptation
    args=training_args,      # evaluation settings (batch size, device, etc.)
    eval_dataset=test_tok,   # test data (never seen during training)
    processing_class=tokenizer,  # tokenizer used to prepare inputs (`tokenizer=` is deprecated in recent transformers versions)
    compute_metrics=compute_metrics,  # how to compute accuracy
)

# Run evaluation
base_metrics = base_trainer.evaluate()

# Print baseline accuracy
print("Accuracy before fine-tuning:", base_metrics["eval_accuracy"])


Some weights of BertForSequenceClassification were not initialized from the model checkpoint at sentence-transformers/all-MiniLM-L6-v2 and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Accuracy before fine-tuning: 0.5


In [None]:
# Create a Trainer for the LoRA-fine-tuned model
lora_trainer = Trainer(
    model=trainer.model,     # model after LoRA fine-tuning
    args=training_args,      # same evaluation settings
    eval_dataset=test_tok,   # same test data
    processing_class=tokenizer,  # same tokenizer (`tokenizer=` is deprecated in recent transformers versions)
    compute_metrics=compute_metrics,  # same accuracy computation
)

# Run evaluation
lora_metrics = lora_trainer.evaluate()

# Print accuracy after fine-tuning
print("Accuracy after fine-tuning:", lora_metrics["eval_accuracy"])




Accuracy after fine-tuning: 0.69


**LoRA allows us to fine-tune a language model efficiently by training only a small number of additional parameters, while keeping the original model mostly fixed.**


**TASK:** LoRA sweep (capacity vs performance)
Try:
- `r ∈ {4, 8, 16}`
- `lora_alpha ∈ {8, 16, 32}`

How does the performance change? Why do you think that is?
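
A small harness can help keep such a sweep organised. In the sketch below, `run_sweep` and `dummy_run` are hypothetical names, and `dummy_run` returns arbitrary numbers purely so that the example runs on its own. For a real sweep it would be replaced by a function that loads a fresh copy of the base model, applies `LoraConfig` with the given `r` and `lora_alpha`, trains, and returns the test accuracy.

```python
from itertools import product

def run_sweep(ranks, alphas, run_fn):
    """Call run_fn(r, lora_alpha) for every combination and sort results by accuracy."""
    results = []
    for r, lora_alpha in product(ranks, alphas):
        accuracy = run_fn(r, lora_alpha)
        results.append({"r": r, "lora_alpha": lora_alpha, "accuracy": accuracy})
    return sorted(results, key=lambda row: row["accuracy"], reverse=True)

def dummy_run(r, lora_alpha):
    """Placeholder returning an arbitrary, meaningless number (NOT a real result)."""
    return ((r * 31 + lora_alpha * 17) % 100) / 100

results = run_sweep(ranks=[4, 8, 16], alphas=[8, 16, 32], run_fn=dummy_run)
for row in results:
    print(row)
```

Reloading the base model for every configuration matters: reusing an already fine-tuned model would let earlier runs leak into later ones.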