# Fine-tuning RoBERTa for Political Media Bias Detection

## Step 1: Data pre-processing

### Import packages & libraries

In [None]:
pip install transformers datasets torch scikit-learn pandas



In [None]:
pip install --upgrade wandb transformers

Collecting transformers
  Downloading transformers-4.57.1-py3-none-any.whl.metadata (43 kB)
Downloading transformers-4.57.1-py3-none-any.whl (12.0 MB)
Installing collected packages: transformers
  Attempting uninstall: transformers
    Found existing installation: transformers 4.57.0
    Uninstalling transformers-4.57.0:
      Successfully uninstalled transformers-4.57.0
Successfully installed transformers-4.57.1


In [None]:
import numpy as np
import pandas as pd
import torch
from datasets import Dataset
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.utils.class_weight import compute_class_weight
from transformers import (
    EarlyStoppingCallback,
    RobertaForSequenceClassification,
    RobertaTokenizer,
    Trainer,
    TrainingArguments,
)
from google.colab import drive

# mount Google Drive so the CSV files and the saved model are accessible
drive.mount('/content/drive')

Mounted at /content/drive


### Load & inspect dataset

In [None]:
# load labeled data (for training and the train/validation split)
labeled_data_path = '/content/drive/My Drive/dsa3101/labelled_data_clean.csv'
df_labelled = pd.read_csv(labeled_data_path)

In [None]:
# check the first few rows of data
df_labelled.head(10)

Unnamed: 0,title,author,permalink,body,bias,bias_text
0,"Bomb Suspect Changed After Trip Abroad, Friend...",N. R. Kleinfield,http://www.nytimes.com/2016/09/20/nyregion/ahm...,"Besides his most recent trip to Quetta , Mr. R...",0,left
1,Why Susan Collins claims she’s being bribed ov...,"Emily Stewart, Terry Nguyen, Rebecca Jennings,...",https://www.vox.com/policy-and-politics/2018/9...,Is Maine Republican Sen. Susan Collins being b...,0,left
2,Poll: Prestigious Colleges Won't Make You Happ...,Anya Kamenetz,http://www.npr.org/blogs/thetwo-way/2014/05/06...,Poll : Prestigious Colleges Wo n't Make You Ha...,0,left
3,Paul Ryan Reportedly Says No Chance for Border...,Ian Mason,http://www.breitbart.com/big-government/2017/0...,"House Speaker Paul Ryan , at a private dinner ...",2,right
4,OPINION: Trump seeking change of legal fortune...,Analysis Stephen Collinson,https://www.cnn.com/2019/07/11/politics/donald...,( CNN ) President Donald Trump has reason to h...,0,left
5,PAUL: Blocking the pathway to a national ID,Sen. Rand Paul,http://www.washingtontimes.com/news/2013/may/2...,The controversial immigration-reform bill that...,2,right
6,Dick Morris Says He Is Working On An RNC Ad Ai...,,http://mediamatters.org/blog/2013/03/28/dick-m...,Dick Morris is working with Republican Nationa...,0,left
7,WSJ Economist Moore: No Grounds Logic for Obam...,"Jim Meyers, John Bachman",http://www.newsmax.com/Newsfront/moore-obama-t...,Wall Street Journal economics expert Stephen M...,2,right
8,Bernie Surges,,https://www.theflipside.io/archives/bernie-surges,The left believes Sanders ’ s chances have imp...,1,center
9,AOC for president? The buzz has begun,,https://www.politico.com/news/2019/12/27/aoc-p...,Sanders and Ocasio-Cortez ’ s fans have also b...,0,left


In [None]:
# check the unique values in the 'bias_text' column and their counts
label_counts = df_labelled['bias_text'].value_counts()

# display the unique labels and their counts
print(label_counts)

# clean the 'body' column to ensure all entries are strings
df_labelled['body'] = df_labelled['body'].fillna('').astype(str)

bias_text
right     13719
left      12930
center    10791
Name: count, dtype: int64


In [None]:
# load unlabelled data (for prediction)
unlabelled_data_path = '/content/drive/My Drive/dsa3101/unlabelled_data_clean.csv'
df_unlabelled = pd.read_csv(unlabelled_data_path)

In [None]:
# check the first few rows of data
df_unlabelled.head(10)

Unnamed: 0,title,author,permalink,body
0,Nancy Pelosi Has Amassed ~$200 Million Since F...,Own_Palpitation_8477,/r/Askpolitics/comments/1hcmrgi/nancy_pelosi_h...,"As the title says, how do folks who see their ..."
1,"Elon Musk is $70,000,000,000 richer since supp...",hotdogman200,/r/Askpolitics/comments/1hcssgg/elon_musk_is_7...,"Keep in mind he is not just a donor, he is now..."
2,For all of the people who claim California is ...,Advanced_Aspect_7601,/r/Askpolitics/comments/1hn3z7a/for_all_of_the...,I've noticed California has kind of become the...
3,"Trump voters, did you believe Trump when he sa...",Snarkasm71,/r/Askpolitics/comments/1gxon1k/trump_voters_d...,Now that Donald Trump has nominated the archit...
4,Trump Supporters - How Are You Feeling About T...,chewbaccasaux,/r/Askpolitics/comments/1grgs4c/trump_supporte...,As an (apparently out of touch) liberal democr...
5,"Conservatives, how do you feel about trump adm...",themontajew,/r/Askpolitics/comments/1h9tu5l/conservatives_...,Now that everyone seems on the same page of ho...
6,Do people actually believe that racism and mis...,Feeling-Currency6212,/r/Askpolitics/comments/1h1kc07/do_people_actu...,For the liberals or anyone who voted for Kamal...
7,How did Joe Biden go from 81 million votes to ...,[deleted],/r/Askpolitics/comments/1hqkm3a/how_did_joe_bi...,The last 4 years have been something. We saw B...
8,Jimmy Carter has died. Let’s take a moment and...,[deleted],/r/Askpolitics/comments/1hp6gu8/jimmy_carter_h...,"As the title suggests, can we even briefly say..."
9,"MAGA supporters and Republicans, what do you t...",DreamLunatik,/r/Askpolitics/comments/1hco8s7/maga_supporter...,Today it was reported https://www.reddit.com/r...


## Step 2: Tokenize the text and prepare the data for RoBERTa

In [None]:
# load the RoBERTa tokenizer
tokenizer = RobertaTokenizer.from_pretrained('roberta-base')

# preprocess function to tokenize text (using 'body' column)
def preprocess_function(examples):
    return tokenizer(examples['body'], truncation=True, padding=True, max_length=256)

# create label mapping
print("\nActual labels in dataset:")
print(df_labelled['bias_text'].unique())

label_mapping = {'center': 0, 'left': 1, 'right': 2}

# encode labels to integers
df_labelled['labels'] = df_labelled['bias_text'].map(label_mapping)

# convert to int
df_labelled['labels'] = df_labelled['labels'].astype(int)

print("\nMapped label distribution:")
print(df_labelled['labels'].value_counts().sort_index())


Actual labels in dataset:
['left' 'right' 'center']

Mapped label distribution:
labels
0    10791
1    12930
2    13719
Name: count, dtype: int64
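
The integer ids assigned here differ from the dataset's own `bias` column, where the preview above shows 0 = left, 1 = center, 2 = right. Any later code that turns model outputs back into names therefore has to follow `label_mapping`, not the raw column. Below is a minimal sketch of a round-trip check; `id_to_label`, `labels_in_data` and `class_names` are illustrative names that do not appear elsewhere in the notebook.

```python
# label -> id mapping used for training (copied from the cell above)
label_mapping = {'center': 0, 'left': 1, 'right': 2}

# inverse mapping, for decoding model outputs back into label names
id_to_label = {idx: name for name, idx in label_mapping.items()}

# labels observed in the dataset (taken from the printed output above)
labels_in_data = ['left', 'right', 'center']

# every observed label must be mapped; otherwise .map() yields NaN
# and the subsequent astype(int) raises an error
unmapped = set(labels_in_data) - set(label_mapping)
assert not unmapped, f"unmapped labels: {unmapped}"

# class names in id order, i.e. the order of the model's output logits
class_names = [id_to_label[i] for i in range(len(id_to_label))]
print(class_names)
```

Deriving the class order from the mapping, instead of typing it out by hand, keeps the probability columns created in later steps aligned with the model's logits.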


## Step 3: Train/validation split

In [None]:
# split the labeled data into train and validation sets
train_df, val_df = train_test_split(
    df_labelled,
    test_size=0.15,
    random_state=42,
    stratify=df_labelled['labels']
)

print(f"\nTrain set size: {len(train_df)}")
print("Train label distribution:")
print(train_df['labels'].value_counts().sort_index())

print(f"\nValidation set size: {len(val_df)}")
print("Validation label distribution:")
print(val_df['labels'].value_counts().sort_index())

# convert pandas DataFrame to Hugging Face Dataset format
train_dataset = Dataset.from_pandas(train_df[['body', 'labels']])
val_dataset = Dataset.from_pandas(val_df[['body', 'labels']])

# tokenize the datasets
train_dataset = train_dataset.map(preprocess_function, batched=True)
val_dataset = val_dataset.map(preprocess_function, batched=True)

# set format for PyTorch (IMPORTANT: includes 'labels')
train_dataset.set_format(type='torch', columns=['input_ids', 'attention_mask', 'labels'])
val_dataset.set_format(type='torch', columns=['input_ids', 'attention_mask', 'labels'])

# for the unlabeled dataset, tokenize the 'body' column
df_unlabelled['body'] = df_unlabelled['body'].apply(str)
unlabeled_dataset = Dataset.from_pandas(df_unlabelled[['body']])
unlabeled_dataset = unlabeled_dataset.map(preprocess_function, batched=True)
unlabeled_dataset.set_format(type='torch', columns=['input_ids', 'attention_mask'])

# check tokenized output (optional)
print("\nTokenized Train Dataset Example:")
print(train_dataset[0])

decoded_text = tokenizer.decode(train_dataset[0]['input_ids'])
print(f"\nDecoded text preview: {decoded_text[:200]}...")


Train set size: 31824
Train label distribution:
labels
0     9172
1    10991
2    11661
Name: count, dtype: int64

Validation set size: 5616
Validation label distribution:
labels
0    1619
1    1939
2    2058
Name: count, dtype: int64
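
Because the split uses `stratify=df_labelled['labels']`, the class proportions in the two sets should be nearly identical. The sketch below checks this using only the counts printed above; the helper `proportions` is an illustrative name.

```python
# class counts copied from the output above
train_counts = {0: 9172, 1: 10991, 2: 11661}
val_counts = {0: 1619, 1: 1939, 2: 2058}


def proportions(counts):
    """Return each class's share of the total."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


train_props = proportions(train_counts)
val_props = proportions(val_counts)

# largest absolute difference in class share between the two sets
max_gap = max(abs(train_props[k] - val_props[k]) for k in train_counts)

for k in train_counts:
    print(f"Class {k}: train {train_props[k]:.4f}, val {val_props[k]:.4f}")
print(f"Largest gap: {max_gap:.5f}")
```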



Tokenized Train Dataset Example:
{'labels': tensor(0), 'input_ids': tensor([    0,   846,  9211, 13851,   961, 11687,    14,  2455,     5,  1226,
            7,  1136,   160,     5,  2358, 15344,    74,    28,    10,  1099,
          631,   479, 50118, 28747,  1767,    74,    28,   847,  2156,  2556,
           74,  1430,  3625,    15,    10,  1647,     9,  1791,  2156,     8,
          309,     7,     5,  9588,  8587,  1387,  2156,     5,   866,    74,
         1136,   124,    88,  7306,   479, 50118,  1708,   120,    42,  4832,
         1648,   114,    70,     9,   167,   383,  1369,  2156,    89,    74,
          202,    28,    10,  1229,  3781,   479, 50118,  1779,    24,   606,
            7,  9072,     5,  2358, 15344,    93,    14,  4069,     9,   629,
         3488,     8,  1408,  2599,    14,    74,  6885,  1642,    11,   644,
         3867,  1148,     8,     5,   394,  1149,    11,    93,   752,  1229,
        25259,  8995,  6888,  7232,  4072,     7,  2422,   462, 11649,   

## Step 4: Compute class weights and use custom trainer

In [None]:
# compute class weights since the classes are mildly imbalanced
print("COMPUTING CLASS WEIGHTS")

class_weights = compute_class_weight(
    'balanced',
    classes=np.unique(train_df['labels']),
    y=train_df['labels']
)
class_weights_tensor = torch.tensor(class_weights, dtype=torch.float)

print(f"\nClass weights (to handle imbalance):")
for i, weight in enumerate(class_weights):
    print(f"  Class {i}: {weight:.4f}")

# custom Trainer that applies class weights and label smoothing to the loss
class WeightedTrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False, num_items_in_batch=None):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        logits = outputs.logits

        # apply class weights to loss
        loss_fct = torch.nn.CrossEntropyLoss(weight=class_weights_tensor.to(logits.device), label_smoothing=0.1)
        loss = loss_fct(logits, labels)

        return (loss, outputs) if return_outputs else loss

# eval metrics function
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    preds = predictions.argmax(axis=1)

    # overall metrics only
    acc = accuracy_score(labels, preds)
    f1 = f1_score(labels, preds, average='weighted')
    precision = precision_score(labels, preds, average='weighted')
    recall = recall_score(labels, preds, average='weighted')

    # optional: prediction distribution
    unique, counts = np.unique(preds, return_counts=True)
    print(f"\n📊 Prediction distribution: {dict(zip(unique, counts))}")

    return {
        'accuracy': acc,
        'f1': f1,
        'precision': precision,
        'recall': recall
    }
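
With `average='weighted'`, each per-class score is weighted by that class's support (its number of true examples). One consequence is that weighted recall always equals accuracy, which is why the Recall and Accuracy columns of the training log in Step 5 are identical. The sketch below demonstrates this on made-up toy labels, not on the notebook's data.

```python
from collections import Counter

# toy labels, invented for illustration only
y_true = [0, 0, 0, 1, 1, 2, 2, 2, 2, 2]
y_pred = [0, 1, 0, 1, 1, 2, 0, 2, 2, 1]

n = len(y_true)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n

# support = number of true examples per class
support = Counter(y_true)
# true positives per class
true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)

# per-class recall, then the support-weighted average
recall = {c: true_pos[c] / support[c] for c in support}
weighted_recall = sum(support[c] * recall[c] for c in support) / n

print(f"accuracy:        {accuracy:.4f}")
print(f"weighted recall: {weighted_recall:.4f}")
```

The supports cancel in the weighted sum, leaving total true positives divided by total examples. Macro averaging (`average='macro'`) does not have this property and would add information that accuracy does not already carry.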


COMPUTING CLASS WEIGHTS

Class weights (to handle imbalance):
  Class 0: 1.1566
  Class 1: 0.9652
  Class 2: 0.9097
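
The `'balanced'` mode of `compute_class_weight` assigns each class the weight `n_samples / (n_classes * class_count)`. The sketch below reproduces the printed weights from the training-set counts shown in Step 3, without calling scikit-learn.

```python
# training-set class counts, copied from the Step 3 output
train_counts = [9172, 10991, 11661]

n_samples = sum(train_counts)
n_classes = len(train_counts)

# balanced weight: rarer classes receive proportionally larger weights
weights = [n_samples / (n_classes * count) for count in train_counts]

for i, w in enumerate(weights):
    print(f"Class {i}: {w:.4f}")
```

The weights stay close to 1 because the imbalance is mild: the smallest class (center) is up-weighted by roughly 16%, and the largest (right) is down-weighted by roughly 9%.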


## Step 5: Full fine-tuning of RoBERTa

In [None]:
# load RoBERTa model
num_labels = len(df_labelled['labels'].unique())
print(f"\nNumber of classes: {num_labels}")

model = RobertaForSequenceClassification.from_pretrained(
    'roberta-base',
    num_labels=num_labels,
    problem_type="single_label_classification",
    hidden_dropout_prob=0.3,           # raised from the 0.1 default for stronger regularisation
    attention_probs_dropout_prob=0.3,  # raised from the 0.1 default for stronger regularisation
)

# fine-tuning parameters
training_args = TrainingArguments(
    output_dir='./results',
    num_train_epochs=10,                  # upper bound; early stopping may end training sooner
    per_device_train_batch_size=16,       # batch size per device for training
    per_device_eval_batch_size=32,        # larger batch is fine without gradients
    learning_rate=1e-5,                   # low LR for stable full fine-tuning
    warmup_ratio=0.1,                     # warm up the LR over the first 10% of steps
    weight_decay=0.1,                     # fairly strong weight decay for regularisation
    logging_strategy="epoch",             # only log per epoch
    eval_strategy="epoch",                # evaluate per epoch
    save_strategy="epoch",                # save per epoch
    save_total_limit=3,                   # cap the number of checkpoints kept on disk
    load_best_model_at_end=True,          # automatically load best model
    metric_for_best_model="f1",           # select the best checkpoint by F1 score
    greater_is_better=True,
    report_to="none",
    seed=42,
    fp16=torch.cuda.is_available(),       # use mixed precision when a GPU is available
    gradient_accumulation_steps=2,        # effective batch size of 32 per device
    dataloader_num_workers=2,             # speed up data loading
    remove_unused_columns=False,          # keep dataset columns as-is (set_format already limits them)
)

# initialise weighted trainer
trainer = WeightedTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
    compute_metrics=compute_metrics,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)]
)

# start training
trainer.train()

# save model to gdrive
model.save_pretrained('/content/drive/My Drive/dsa3101/bias_detection_model_roberta')
tokenizer.save_pretrained('/content/drive/My Drive/dsa3101/bias_detection_model_roberta')

print("\n✓ Model and tokenizer saved to Google Drive!")


Number of classes: 3


Some weights of RobertaForSequenceClassification were not initialized from the model checkpoint at roberta-base and are newly initialized: ['classifier.dense.bias', 'classifier.dense.weight', 'classifier.out_proj.bias', 'classifier.out_proj.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
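
The batch settings in `training_args` interact: gradients from two batches of 16 are accumulated before each optimizer step. The arithmetic is sketched below, assuming a single GPU (`num_devices = 1` is an assumption about the Colab runtime). The per-epoch update count is approximate, because the way the final partial accumulation is rounded depends on the installed `transformers` version.

```python
import math

train_size = 31824          # from the Step 3 output
per_device_batch = 16
grad_accum = 2
num_devices = 1             # assumption: one Colab GPU
epochs = 10
warmup_ratio = 0.1

# number of examples contributing to each optimizer step
effective_batch = per_device_batch * grad_accum * num_devices

# forward/backward passes per epoch
batches_per_epoch = math.ceil(train_size / (per_device_batch * num_devices))

# approximate optimizer steps (exact rounding is version-dependent)
approx_updates_per_epoch = math.ceil(batches_per_epoch / grad_accum)
approx_total_updates = approx_updates_per_epoch * epochs
approx_warmup_updates = round(warmup_ratio * approx_total_updates)

print(f"effective batch size: {effective_batch}")
print(f"batches per epoch:    {batches_per_epoch}")
print(f"approx. updates:      {approx_total_updates} "
      f"(about {approx_warmup_updates} of them warmup)")
```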


Epoch,Training Loss,Validation Loss,Accuracy,F1,Precision,Recall
1,1.0301,0.815416,0.665776,0.663716,0.685234,0.665776
2,0.7663,0.758268,0.749644,0.748384,0.762122,0.749644
3,0.6839,0.813017,0.722044,0.714509,0.770584,0.722044
4,0.6449,0.760732,0.757301,0.754648,0.786012,0.757301
5,0.6158,0.765742,0.763177,0.760523,0.786825,0.763177
6,0.5962,0.753264,0.769587,0.765127,0.793077,0.769587
7,0.5812,0.73493,0.779024,0.775771,0.797977,0.779024
8,0.5661,0.785035,0.757123,0.752556,0.792127,0.757123
9,0.5543,0.754058,0.774395,0.770864,0.798462,0.774395
10,0.544,0.76943,0.772436,0.768613,0.799284,0.772436
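
With `metric_for_best_model="f1"` and `load_best_model_at_end=True`, the checkpoint kept at the end is the one with the highest validation F1. The sketch below replays the F1 column of the table through a simplified version of the patience logic. It approximates what `EarlyStoppingCallback` does (with the default improvement threshold of zero) and is not the library's code.

```python
# validation F1 per epoch, copied from the table above
f1_by_epoch = [0.663716, 0.748384, 0.714509, 0.754648, 0.760523,
               0.765127, 0.775771, 0.752556, 0.770864, 0.768613]
patience = 3  # early_stopping_patience used in the training cell

best_f1 = float("-inf")
best_epoch = None
epochs_without_improvement = 0
stop_epoch = None

for epoch, f1 in enumerate(f1_by_epoch, start=1):
    if f1 > best_f1:
        # new best: remember it and reset the patience counter
        best_f1, best_epoch = f1, epoch
        epochs_without_improvement = 0
    else:
        epochs_without_improvement += 1
    if epochs_without_improvement >= patience:
        stop_epoch = epoch
        break

print(f"best epoch: {best_epoch} (F1 = {best_f1:.6f})")
print(f"patience exhausted after epoch: {stop_epoch}")
```

Under this simplified logic the best checkpoint comes from epoch 7, and patience runs out only after epoch 10, which coincides with the configured final epoch. This matches the accuracy and F1 reported for the reloaded model in Step 6.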



📊 Prediction distribution: {np.int64(0): np.int64(2049), np.int64(1): np.int64(2155), np.int64(2): np.int64(1412)}

📊 Prediction distribution: {np.int64(0): np.int64(1414), np.int64(1): np.int64(2493), np.int64(2): np.int64(1709)}

📊 Prediction distribution: {np.int64(0): np.int64(1550), np.int64(1): np.int64(2921), np.int64(2): np.int64(1145)}

📊 Prediction distribution: {np.int64(0): np.int64(1491), np.int64(1): np.int64(2707), np.int64(2): np.int64(1418)}

📊 Prediction distribution: {np.int64(0): np.int64(1513), np.int64(1): np.int64(2640), np.int64(2): np.int64(1463)}

📊 Prediction distribution: {np.int64(0): np.int64(1736), np.int64(1): np.int64(2516), np.int64(2): np.int64(1364)}

📊 Prediction distribution: {np.int64(0): np.int64(1710), np.int64(1): np.int64(2463), np.int64(2): np.int64(1443)}

📊 Prediction distribution: {np.int64(0): np.int64(1578), np.int64(1): np.int64(2743), np.int64(2): np.int64(1295)}

📊 Prediction distribution: {np.int64(0): np.int64(1629), np.int64(1): n

## Step 6: Evaluate model performance

In [None]:
# load saved model and tokenizer from the given path
model_path = '/content/drive/My Drive/dsa3101/bias_detection_model_roberta'

model = RobertaForSequenceClassification.from_pretrained(model_path)
tokenizer = RobertaTokenizer.from_pretrained(model_path)

# evaluation-only arguments: the training hyperparameters are not needed here
training_args = TrainingArguments(
    output_dir='./results',
    per_device_eval_batch_size=32,        # batch size for evaluation and prediction
    report_to="none",
    seed=42,
    fp16=torch.cuda.is_available(),       # use mixed precision when a GPU is available
    dataloader_num_workers=2,             # speed up data loading
    remove_unused_columns=False,          # keep dataset columns as-is
)

# prepare a plain Trainer for evaluation; it reports the model's built-in
# unweighted cross-entropy, so eval_loss is not comparable with the weighted,
# label-smoothed validation loss logged during training
trainer = Trainer(
    model=model,
    args=training_args,
    eval_dataset=val_dataset,
    compute_metrics=compute_metrics,
)

# final evaluation on the validation set
print("FINAL EVALUATION")
results = trainer.evaluate()
print("\nEvaluation Results:", results)


# get predictions on the validation set
print("PREDICTION SANITY CHECK")
predictions = trainer.predict(val_dataset)
preds = predictions.predictions.argmax(axis=1)

# validation set prediction distribution
print("\nValidation set PREDICTION distribution:")
unique, counts = np.unique(preds, return_counts=True)
for label, count in zip(unique, counts):
    percentage = count / len(preds) * 100
    print(f"  Class {label}: {count} ({percentage:.1f}%)")

# validation set true label distribution
print("\nValidation set TRUE label distribution:")
true_labels = val_df['labels'].values
unique, counts = np.unique(true_labels, return_counts=True)
for label, count in zip(unique, counts):
    percentage = count / len(true_labels) * 100
    print(f"  Class {label}: {count} ({percentage:.1f}%)")

FINAL EVALUATION



📊 Prediction distribution: {np.int64(0): np.int64(1710), np.int64(1): np.int64(2463), np.int64(2): np.int64(1443)}

Evaluation Results: {'eval_loss': 0.5870651006698608, 'eval_model_preparation_time': 0.0054, 'eval_accuracy': 0.7790242165242165, 'eval_f1': 0.7757705192794976, 'eval_precision': 0.7979770057337319, 'eval_recall': 0.7790242165242165, 'eval_runtime': 19.2602, 'eval_samples_per_second': 291.586, 'eval_steps_per_second': 9.138}
PREDICTION SANITY CHECK

📊 Prediction distribution: {np.int64(0): np.int64(1710), np.int64(1): np.int64(2463), np.int64(2): np.int64(1443)}

Validation set PREDICTION distribution:
  Class 0: 1710 (30.4%)
  Class 1: 2463 (43.9%)
  Class 2: 1443 (25.7%)

Validation set TRUE label distribution:
  Class 0: 1619 (28.8%)
  Class 1: 1939 (34.5%)
  Class 2: 2058 (36.6%)
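
Comparing the two distributions shows which classes the model favours. The sketch below computes the ratio of predicted to true counts per class from the numbers printed above; the class names follow the id order of `label_mapping`. A ratio alone does not show which classes are confused with each other, so this is only a first check, and a confusion matrix would be needed for that.

```python
# class names in id order, following label_mapping from Step 2
class_names = ['center', 'left', 'right']

# counts copied from the output above
pred_counts = [1710, 2463, 1443]
true_counts = [1619, 1939, 2058]

# ratio > 1: class is predicted more often than it occurs; < 1: less often
ratios = [p / t for p, t in zip(pred_counts, true_counts)]

for name, ratio in zip(class_names, ratios):
    print(f"{name:>6}: predicted/true = {ratio:.2f}")
```

On these counts the model predicts left about 27% more often than it occurs and right about 30% less often, while center is close to balanced.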


In [None]:
# get predictions for the unlabeled dataset using the trained model
predictions_unlabeled = trainer.predict(unlabeled_dataset)
predicted_labels = predictions_unlabeled.predictions.argmax(axis=1)
predicted_probs = torch.softmax(torch.tensor(predictions_unlabeled.predictions), dim=1).numpy()

# map class ids back to label names (inverse of label_mapping)
reverse_label_mapping = {v: k for k, v in label_mapping.items()}
df_unlabelled['predicted_label'] = [reverse_label_mapping[label] for label in predicted_labels]

# add confidence scores (the max probability for each prediction)
df_unlabelled['confidence'] = predicted_probs.max(axis=1)

# add one probability column per class, in class-id order (center, left, right)
for i, label_name in sorted(reverse_label_mapping.items()):
    df_unlabelled[f'prob_{label_name}'] = predicted_probs[:, i]

# display the dataframe with the predictions and additional information
print("\nDataFrame with Predictions:")
print(df_unlabelled[['body', 'predicted_label', 'confidence', 'prob_left', 'prob_center', 'prob_right']].head(10))  # print first 10 rows

# save predictions to a CSV file
df_unlabelled.to_csv('/content/drive/My Drive/dsa3101/fullfinetuned_valtest_predictions.csv', index=False)
print("\n✓ Predictions saved to 'predictions.csv'!")



DataFrame with Predictions:
                                                body predicted_label  \
0  As the title says, how do folks who see their ...            left   
1  Keep in mind he is not just a donor, he is now...           right   
2  I've noticed California has kind of become the...            left   
3  Now that Donald Trump has nominated the archit...          center   
4  As an (apparently out of touch) liberal democr...          center   
5  Now that everyone seems on the same page of ho...            left   
6  For the liberals or anyone who voted for Kamal...           right   
7  The last 4 years have been something. We saw B...           right   
8  As the title suggests, can we even briefly say...            left   
9  Today it was reported https://www.reddit.com/r...            left   

   confidence  prob_left  prob_center  prob_right  
0    0.596295   0.596295     0.270978    0.132727  
1    0.804331   0.052896     0.142772    0.804331  
2    0.929926   0.9299

## Step 7: Make predictions on unlabelled data

In [None]:
# load the saved model and tokenizer
model_path = '/content/drive/My Drive/dsa3101/bias_detection_model_roberta'
model = RobertaForSequenceClassification.from_pretrained(model_path)
tokenizer = RobertaTokenizer.from_pretrained(model_path)

# prepare a Trainer for inference only (no training data or callbacks needed)
trainer = Trainer(
    model=model,
    args=training_args,  # reuse the arguments defined earlier
)

# make predictions on the unlabeled dataset
predictions_unlabeled = trainer.predict(unlabeled_dataset)
predicted_labels = predictions_unlabeled.predictions.argmax(axis=1)
predicted_probs = torch.softmax(torch.tensor(predictions_unlabeled.predictions), dim=1).numpy()

# convert back to label names (assuming label_mapping is defined)
reverse_label_mapping = {v: k for k, v in label_mapping.items()}
df_unlabelled['predicted_label'] = [reverse_label_mapping[label] for label in predicted_labels]

# add confidence scores (the max probability for each prediction)
df_unlabelled['confidence'] = predicted_probs.max(axis=1)

# add probabilities for each class, in class-id order (center, left, right)
for i, label_name in enumerate(['center', 'left', 'right']):
    df_unlabelled[f'prob_{label_name}'] = predicted_probs[:, i]

# show prediction distribution
print("\nPrediction distribution on unlabeled data:")
print(df_unlabelled['predicted_label'].value_counts())

# show the average confidence
print(f"\nAverage confidence: {df_unlabelled['confidence'].mean():.4f}")

# Save predictions to CSV
df_unlabelled.to_csv('/content/drive/My Drive/dsa3101/fullfinetuned_unlabelled_predictions.csv', index=False)
print("\n✓ Predictions saved to 'predictions.csv'!")

# show some sample predictions
print("\nSample predictions:")
print(df_unlabelled[['body', 'predicted_label', 'confidence']].head(10))
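
The `confidence` column is the largest softmax probability for each row. The sketch below shows that computation in plain Python for a single example; the logits are made up for illustration and the helper `softmax` is not part of the notebook.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable form)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# class names in id order, following label_mapping from Step 2
class_names = ['center', 'left', 'right']

# made-up logits for one text, in the same class order
logits = [2.0, 0.5, -1.0]

probs = softmax(logits)
pred_id = max(range(len(probs)), key=probs.__getitem__)
confidence = probs[pred_id]

print(f"predicted: {class_names[pred_id]}, confidence: {confidence:.4f}")
```

A softmax confidence is not a calibrated probability, particularly on text from a different domain than the training data (news articles for training, Reddit posts for prediction), so the average confidence reported for the unlabelled set is best treated as a rough indicator.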


Prediction distribution on unlabeled data:
predicted_label
center    52
right     23
left      18
Name: count, dtype: int64

Average confidence: 0.7120

✓ Predictions saved to 'predictions.csv'!

Sample predictions:
                                                body predicted_label  \
0  As the title says, how do folks who see their ...          center   
1  Keep in mind he is not just a donor, he is now...           right   
2  I've noticed California has kind of become the...          center   
3  Now that Donald Trump has nominated the archit...            left   
4  As an (apparently out of touch) liberal democr...            left   
5  Now that everyone seems on the same page of ho...          center   
6  For the liberals or anyone who voted for Kamal...           right   
7  The last 4 years have been something. We saw B...           right   
8  As the title suggests, can we even briefly say...          center   
9  Today it was reported https://www.reddit.com/r...          c