Idea: We can leverage automated fact-checking frameworks for bias detection.

https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00454/109469/A-Survey-on-Automated-Fact-Checking

This article highlights the following pipeline:

![image.png](attachment:image.png)

For our purposes, we will use a slightly modified version of this:
1) Claim detection: Transformer classifier
2) Source checking: Which sources are saying similar things? If we know the source, we can also look up its rating on the MediaBias chart.
3) Bias detection: Transformer classifier
4) Prediction & Check with LLM
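
As a rough sketch of how the first three stages could be chained, the snippet below wires them together with placeholder functions. Every name here (`SentenceReport`, `run_pipeline`, and the three stage callables) is hypothetical, the stage bodies are stubs standing in for real models, and the LLM check in stage 4 is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SentenceReport:
    """Hypothetical container for what each stage learns about one sentence."""
    text: str
    is_claim: bool = False
    similar_sources: List[str] = field(default_factory=list)
    bias_score: float = 0.0


def run_pipeline(
    sentences: List[str],
    detect_claim: Callable[[str], bool],
    find_sources: Callable[[str], List[str]],
    score_bias: Callable[[str], float],
) -> List[SentenceReport]:
    """Run stages 1-3 in order; only sentences flagged as claims go further."""
    reports = []
    for text in sentences:
        report = SentenceReport(text=text, is_claim=detect_claim(text))
        if report.is_claim:
            report.similar_sources = find_sources(text)
            report.bias_score = score_bias(text)
        reports.append(report)
    return reports


# Stub stages (placeholders, not real models).
reports = run_pipeline(
    ["Taxes rose last year.", "Written by: A. Author"],
    detect_claim=lambda s: s.endswith("."),
    find_sources=lambda s: ["example-outlet"],
    score_bias=lambda s: 0.0,
)
for r in reports:
    print(r.is_claim, r.text)
```

Passing the stages in as callables keeps the skeleton independent of which classifier ends up behind each step.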

# Step 1: Transformer classifier for claim detection.

There are three methods I want to try (ranked from easiest to hardest):
- Feature extraction classifier
- Fine-tuning output layers
- Fine-tuning all layers

But before we do any of that, let's define a dataset for claim detection.

In [2]:
!pip install -U datasets huggingface_hub fsspec lightning

Collecting datasets
  Downloading datasets-4.0.0-py3-none-any.whl.metadata (19 kB)
Collecting lightning
  Downloading lightning-2.5.2-py3-none-any.whl.metadata (38 kB)
Collecting fsspec
  Downloading fsspec-2025.3.0-py3-none-any.whl.metadata (11 kB)
Collecting lightning-utilities<2.0,>=0.10.0 (from lightning)
  Downloading lightning_utilities-0.14.3-py3-none-any.whl.metadata (5.6 kB)
Collecting torchmetrics<3.0,>=0.7.0 (from lightning)
  Downloading torchmetrics-1.8.0-py3-none-any.whl.metadata (21 kB)
Collecting pytorch-lightning (from lightning)
  Downloading pytorch_lightning-2.5.2-py3-none-any.whl.metadata (21 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch<4.0,>=2.1.0->lightning)
  Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.4.127 (from torch<4.0,>=2.1.0->lightning)
  Downloading nvidia_cuda_runtime_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvid

In [None]:
# We will be using the LIAR2 dataset (an extension of LIAR, https://aclanthology.org/P17-2067/) because its language is relevant to our purpose (article bias detection)

import pandas as pd
import datasets

dataset = datasets.load_dataset("chengxuphd/liar2")

statement_train, y_train = dataset["train"]["statement"], dataset["train"]["label"]
statement_val, y_val = dataset["validation"]["statement"], dataset["validation"]["label"]
statement_test, y_test = dataset["test"]["statement"], dataset["test"]["label"]



README.md: 0.00B [00:00, ?B/s]

train.csv:   0%|          | 0.00/19.0M [00:00<?, ?B/s]

valid.csv: 0.00B [00:00, ?B/s]

test.csv: 0.00B [00:00, ?B/s]

Generating train split:   0%|          | 0/18369 [00:00<?, ? examples/s]

Generating validation split:   0%|          | 0/2297 [00:00<?, ? examples/s]

Generating test split:   0%|          | 0/2296 [00:00<?, ? examples/s]

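The "label" column rates how true each statement is, but it proved hard to predict when I ran it: classifiers trained on the embeddings reached only about 31-32% test accuracy. Since every statement in the dataset can be treated as a claim regardless of its label, we're instead going to train a one-class classifier for claim detection.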

Now, let's work with the transformer: https://github.com/rasbt/MachineLearning-QandAI-book/blob/main/supplementary/q18-using-llms/

1. Tokenization & Numericalization

In [None]:
# Make sure all these are installed.
import os.path as op

import lightning as L
from lightning.pytorch.loggers import CSVLogger
from lightning.pytorch.callbacks import ModelCheckpoint

import numpy as np
import pandas as pd
import torch

from sklearn.feature_extraction.text import CountVectorizer

In [None]:
import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)

cpu


In [None]:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
print("Tokenizer input max length:", tokenizer.model_max_length)
print("Tokenizer vocabulary size:", tokenizer.vocab_size)

tokenizer_config.json:   0%|          | 0.00/48.0 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/483 [00:00<?, ?B/s]

vocab.txt:   0%|          | 0.00/232k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/466k [00:00<?, ?B/s]

Tokenizer input max length: 512
Tokenizer vocabulary size: 30522


In [None]:
tokenized_train = tokenizer(list(statement_train), padding=True, truncation=True, return_tensors="pt")
tokenized_val = tokenizer(list(statement_val), padding=True, truncation=True, return_tensors="pt")
tokenized_test = tokenizer(list(statement_test), padding=True, truncation=True, return_tensors="pt")
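
To make `padding=True` and `truncation=True` concrete, here is a toy sketch in plain Python that mimics what they do to a batch of token-id lists. The helper `pad_and_truncate` and the ids are made up for illustration; the real tokenizer handles this internally and also takes care of special tokens when truncating, which this sketch does not.

```python
from typing import Dict, List


def pad_and_truncate(batch: List[List[int]], max_length: int, pad_id: int = 0) -> Dict[str, List[List[int]]]:
    """Truncate each sequence to max_length, then pad to the longest one left."""
    truncated = [ids[:max_length] for ids in batch]
    longest = max(len(ids) for ids in truncated)
    input_ids, attention_mask = [], []
    for ids in truncated:
        n_pad = longest - len(ids)
        input_ids.append(ids + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding the model should ignore.
        attention_mask.append([1] * len(ids) + [0] * n_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}


toy_batch = [[101, 7592, 102], [101, 7592, 2088, 999, 102], [101, 102]]
encoded = pad_and_truncate(toy_batch, max_length=4)
print(encoded["input_ids"])
print(encoded["attention_mask"])
```

Padding goes to the longest sequence in the batch that is passed in, so tokenizing in separate batches can produce different widths per batch.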

2. DistilBERT as a Feature Extractor

In [None]:
from transformers import AutoModel

model = AutoModel.from_pretrained("distilbert-base-uncased")
model.to(device);

model.safetensors:   0%|          | 0.00/268M [00:00<?, ?B/s]

In [None]:
from datasets import Dataset

# Step 1 — create a raw Hugging Face dataset from your input text
raw_train = Dataset.from_list([{"statement": s} for s in statement_train])
raw_val = Dataset.from_list([{"statement": s} for s in statement_val])
raw_test = Dataset.from_list([{"statement": s} for s in statement_test])

# Step 2 — tokenize using map so the result stays a Dataset object
tokenized_train = raw_train.map(lambda x: tokenizer(x["statement"], padding=True, truncation=True), batched=True)
tokenized_val = raw_val.map(lambda x: tokenizer(x["statement"], padding=True, truncation=True), batched=True)
tokenized_test = raw_test.map(lambda x: tokenizer(x["statement"], padding=True, truncation=True), batched=True)

Map:   0%|          | 0/18369 [00:00<?, ? examples/s]

Map:   0%|          | 0/2297 [00:00<?, ? examples/s]

Map:   0%|          | 0/2296 [00:00<?, ? examples/s]

In [None]:
import torch

@torch.inference_mode()
def get_output_embeddings(batch):
    input_ids = torch.tensor(batch["input_ids"]).to(device)
    attention_mask = torch.tensor(batch["attention_mask"]).to(device)

    output = model(input_ids, attention_mask=attention_mask).last_hidden_state[:, 0]
    return {"features": output.cpu().numpy()}
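
`last_hidden_state[:, 0]` keeps only the hidden state at the first position, which for DistilBERT is the `[CLS]` token. The NumPy sketch below illustrates that indexing on a made-up array, next to mask-aware mean pooling, a common alternative that this notebook does not use. Shapes and values are invented for illustration.

```python
import numpy as np

# Fake "last_hidden_state": batch of 2 sequences, 4 tokens each, hidden size 3.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(2, 4, 3))
attention_mask = np.array([[1, 1, 1, 0],   # last position is padding
                           [1, 1, 0, 0]])  # last two positions are padding

# First-token pooling: what `[:, 0]` does in get_output_embeddings.
cls_features = hidden[:, 0]

# Mean pooling over real tokens only (alternative, shown for comparison).
mask = attention_mask[:, :, None]  # shape (batch, tokens, 1)
mean_features = (hidden * mask).sum(axis=1) / mask.sum(axis=1)

print(cls_features.shape, mean_features.shape)
```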

In [None]:
import time
start = time.time()

train_features = tokenized_train.map(get_output_embeddings, batched=True, batch_size=10)

Map:   0%|          | 0/18369 [00:00<?, ? examples/s]

In [None]:
import time
start = time.time()

val_features = tokenized_val.map(get_output_embeddings, batched=True, batch_size=10)

Map:   0%|          | 0/2297 [00:00<?, ? examples/s]

In [None]:
import time
start = time.time()

test_features = tokenized_test.map(get_output_embeddings, batched=True, batch_size=10)

Map:   0%|          | 0/2296 [00:00<?, ? examples/s]

In [None]:
import numpy as np

# Saving
# np.save("X_train.npy", np.stack(train_features["features"]))
# np.save("X_val.npy", np.stack(val_features["features"]))
# np.save("X_test.npy", np.stack(test_features["features"]))

# Loading
X_train = np.load("X_train.npy")
X_val = np.load("X_val.npy")
X_test = np.load("X_test.npy")
# y_train = np.array(y_train)
# y_val = np.array(y_val)
# y_test = np.array(y_test)

FileNotFoundError: [Errno 2] No such file or directory: 'X_train.npy'
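
The `FileNotFoundError` above is what happens when the load runs before the arrays have ever been saved. One way to avoid it is a small cache helper that computes and saves on the first run and loads afterwards. `load_or_compute` is a hypothetical helper, and the lambdas below stand in for the expensive embedding step.

```python
import os
import tempfile

import numpy as np


def load_or_compute(path, compute_fn):
    """Load an array from `path` if it exists; otherwise compute and save it."""
    if os.path.exists(path):
        return np.load(path)
    array = np.asarray(compute_fn())
    np.save(path, array)
    return array


# Demo in a temporary directory; a real run would point at a file such as "X_train.npy".
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "X_demo.npy")
    first = load_or_compute(demo_path, lambda: np.ones((2, 3)))    # computes and saves
    second = load_or_compute(demo_path, lambda: np.zeros((2, 3)))  # loads the saved file
    print(np.array_equal(first, second))
```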

In [None]:
X_train = np.array(train_features["features"])
y_train = np.array(y_train)

X_val = np.array(val_features["features"])
y_val = np.array(y_val)

X_test = np.array(test_features["features"])
y_test = np.array(y_test)

3. Train a baseline model on these embeddings

In [None]:
from sklearn.linear_model import LogisticRegression

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

print("Training accuracy", clf.score(X_train, y_train))
print("Validation accuracy", clf.score(X_val, y_val))
print("Test accuracy", clf.score(X_test, y_test))

end = time.time()
elapsed = end - start
print(f"Time elapsed {elapsed/60:.2f} min")

Training accuracy 0.4216342751374598
Validation accuracy 0.31911188506747934
Test accuracy 0.3170731707317073
Time elapsed 6.86 min


STOP: TOTAL NO. OF ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
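
The warning above suggests raising `max_iter` or scaling the inputs. Below is a minimal sketch of the scaling route, using scikit-learn's `make_pipeline` and `StandardScaler`. The data is synthetic and only stands in for the DistilBERT features, so the numbers it prints say nothing about LIAR2.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for (X_train, y_train); not the LIAR2 features.
X_demo, y_demo = make_classification(
    n_samples=300, n_features=20, n_informative=5, n_classes=3, random_state=0
)

# Standardize each feature, then fit the same kind of classifier as above.
scaled_clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scaled_clf.fit(X_demo, y_demo)

print("Iterations used:", scaled_clf[-1].n_iter_)
print("Training accuracy:", scaled_clf.score(X_demo, y_demo))
```

Keeping the scaler inside the pipeline means it is fit on the training data only and then reused unchanged for validation and test data.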


In [None]:
from sklearn.ensemble import RandomForestClassifier

clf = RandomForestClassifier()
clf.fit(X_train, y_train)

print("Training accuracy", clf.score(X_train, y_train))
print("Validation accuracy", clf.score(X_val, y_val))
print("Test accuracy", clf.score(X_test, y_test))

Training accuracy 0.9999455604551146
Validation accuracy 0.3239007400957771
Test accuracy 0.31837979094076657


In [None]:
from sklearn.svm import OneClassSVM

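# Fit only on "claim" features (every LIAR2 statement is treated as a claim).
# Named `svm` so it does not overwrite the DistilBERT `model` used by get_output_embeddings.
svm = OneClassSVM(kernel='rbf', gamma='scale', nu=0.05)
svm.fit(X_train)

# Predict on validation and test sets
val_preds = svm.predict(X_val)
test_preds = svm.predict(X_test)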

# Interpret output: +1 means in-class (claim), -1 means outlier (not claim)

In [None]:
# Every validation statement is a claim, so this is the fraction predicted in-class (+1).
correct = sum(1 for val in val_preds if val == 1)
total = len(val_preds)
print(f"Accuracy: {correct/total}")

Accuracy: 0.9442751414888986
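
This figure is the share of claims the model accepts (recall on the positive class). A value near 95% is roughly what `nu=0.05` would lead one to expect, since `nu` bounds the fraction of training points treated as outliers. Judging claim detection also needs examples that are not claims. The sketch below assumes a small hand-labelled set with +1 for claims and -1 for non-claims; the helper `one_class_report`, the labels, and the predictions are all made up.

```python
from typing import Dict, Sequence


def one_class_report(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Recall on claims (+1) and false-positive rate on non-claims (-1)."""
    claims = [p for t, p in zip(y_true, y_pred) if t == 1]
    non_claims = [p for t, p in zip(y_true, y_pred) if t == -1]
    return {
        "claim_recall": sum(p == 1 for p in claims) / len(claims),
        "non_claim_false_positive_rate": sum(p == 1 for p in non_claims) / len(non_claims),
    }


# Made-up labels and predictions, for illustration only.
y_true = [1, 1, 1, 1, -1, -1, -1, -1]
y_pred = [1, 1, 1, -1, 1, 1, -1, -1]
print(one_class_report(y_true, y_pred))
```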


Claim detection model:

In [13]:
# OneClassSVM initializer.
import numpy as np

X_train = np.load("X_train.npy")
X_val = np.load("X_val.npy")
X_test = np.load("X_test.npy")

from sklearn.svm import OneClassSVM

# Fit only on "claim" features
svm = OneClassSVM(kernel='rbf', gamma='scale', nu=0.05)
svm.fit(X_train)

# Predict on validation and test sets
val_preds = svm.predict(X_val)
test_preds = svm.predict(X_test)

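def quick_accuracy(predictions):
  """
  Print the fraction of predictions that are +1 (in-class).

  Assumptions
  - Every input is a true claim, so 1 is correct (in-class) and -1 is incorrect (out-class).
  - The result is therefore recall on claims; it does not measure rejection of non-claims.
  """
  correct = sum(1 for prediction in predictions if prediction == 1)
  total = len(predictions)
  print(f"Accuracy: {correct/total}")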

quick_accuracy(val_preds)
quick_accuracy(test_preds)

import pickle
with open('one_class_svm.pkl', 'wb') as f:
    pickle.dump(svm, f)


Accuracy: 0.9442751414888986
Accuracy: 0.9499128919860628


In [12]:
# Pipeline for new texts

# Libraries we need. (os.path, lightning, CSVLogger, ModelCheckpoint, CountVectorizer,
# and pandas are not used anywhere in this inference pipeline.)
import numpy as np

# Initialize PyTorch
import torch
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)

from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
print("Tokenizer input max length:", tokenizer.model_max_length)
print("Tokenizer vocabulary size:", tokenizer.vocab_size)

from transformers import AutoModel
model = AutoModel.from_pretrained("distilbert-base-uncased")
model.to(device);

# Split the article into sentences, and tokenize each sentence.
import nltk
nltk.download('punkt_tab')
article = "It was only when the administration felt Newsom was not restoring order in the city — and after Trump watched the situation escalate for 24 hours and White House officials saw imagery of federal law enforcement officers with lacerations and other injuries — that the president moved to deploy the Guard, according to the official, who was granted anonymity to discuss private deliberations.   The president has put hundreds of National Guard troops on the streets to quell protests over his administration’s immigration raids, a deployment that state and city officials say has only inflamed tensions. “I understand the importance of deporting criminal aliens, but what we are witnessing are arbitrary measures to hunt down people who are complying with their immigration hearings — in many cases, with credible fear of persecution claims — all driven by a Miller-like desire to satisfy a self-fabricated deportation goal,” said Garcia, referring to Stephen Miller, a White House deputy chief of staff and key architect of Trump’s immigration crackdown.   Trump’s speedy deployment in California of troops against those whom the president has alluded to as “insurrectionists” on social media is a sharp contrast to his decision to issue no order or formal request for National Guard troops during the insurrection at the U.S. Capitol on Jan. 6, 2021, despite his repeated and false assertions that he had made such an offer.   The president and his top immigration aides accused the governor of mismanaging the protests, with border czar Tom Homan asserting in a Fox News interview Monday that Newsom stoked anti-ICE sentiments and waited two days to declare unlawful assembly in the city."
sentences = nltk.sent_tokenize(article)
from datasets import Dataset
raw_sentences = Dataset.from_list([{"text": s} for s in sentences])
tokenized_sentences = raw_sentences.map(lambda x: tokenizer(x["text"], padding=True, truncation=True), batched=True)

# Get embeddings
@torch.inference_mode()
def get_output_embeddings(batch):
    input_ids = torch.tensor(batch["input_ids"]).to(device)
    attention_mask = torch.tensor(batch["attention_mask"]).to(device)

    output = model(input_ids, attention_mask=attention_mask).last_hidden_state[:, 0]
    return {"features": output.cpu().numpy()}
import time
start = time.time()
sentence_features = tokenized_sentences.map(get_output_embeddings, batched=True, batch_size=10)

# Predict
predictions = svm.predict(np.array(sentence_features["features"]))

# Format for output
for prediction, sentence in zip(predictions, sentences):
  label = "claim" if prediction == 1 else "not claim"
  print(f"{label}: {sentence}")




cpu
Tokenizer input max length: 512
Tokenizer vocabulary size: 30522


[nltk_data] Downloading package punkt_tab to /root/nltk_data...
[nltk_data]   Package punkt_tab is already up-to-date!


Map:   0%|          | 0/5 [00:00<?, ? examples/s]

Map:   0%|          | 0/5 [00:00<?, ? examples/s]

claim: It was only when the administration felt Newsom was not restoring order in the city — and after Trump watched the situation escalate for 24 hours and White House officials saw imagery of federal law enforcement officers with lacerations and other injuries — that the president moved to deploy the Guard, according to the official, who was granted anonymity to discuss private deliberations.
claim: The president has put hundreds of National Guard troops on the streets to quell protests over his administration’s immigration raids, a deployment that state and city officials say has only inflamed tensions.
claim: “I understand the importance of deporting criminal aliens, but what we are witnessing are arbitrary measures to hunt down people who are complying with their immigration hearings — in many cases, with credible fear of persecution claims — all driven by a Miller-like desire to satisfy a self-fabricated deportation goal,” said Garcia, referring to Stephen Miller, a White House d

Claim detection final product:

In [16]:
import pickle

import nltk
import numpy as np
import torch
from datasets import Dataset
from transformers import AutoModel, AutoTokenizer


class ClaimDetector():
  def __init__(self, pkl_path: str):

    # Load the fitted OneClassSVM.
    # Only unpickle files you created yourself or otherwise trust.
    with open(pkl_path, 'rb') as f:
      self.svm = pickle.load(f)

    # Initialize PyTorch
    self.device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    print(self.device)

    # Load the same DistilBERT tokenizer and encoder used to build the training features
    self.tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
    print("Tokenizer input max length:", self.tokenizer.model_max_length)
    print("Tokenizer vocabulary size:", self.tokenizer.vocab_size)
    self.model = AutoModel.from_pretrained("distilbert-base-uncased")
    self.model.to(self.device)

    # NLTK sentence tokenizer data
    nltk.download('punkt_tab')

    print("Model successfully loaded!")

  def predict(self, article: str):

    # Split the article into sentences, then tokenize the sentences.
    sentences = nltk.sent_tokenize(article)
    raw_sentences = Dataset.from_list([{"text": s} for s in sentences])
    tokenized_sentences = raw_sentences.map(lambda x: self.tokenizer(x["text"], padding=True, truncation=True), batched=True)

    # Embedding function: use the first-token ([CLS]) hidden state as the sentence embedding.
    # It reads the device and model from `self` rather than from notebook globals.
    @torch.inference_mode()
    def get_output_embeddings(batch):
      input_ids = torch.tensor(batch["input_ids"]).to(self.device)
      attention_mask = torch.tensor(batch["attention_mask"]).to(self.device)

      output = self.model(input_ids, attention_mask=attention_mask).last_hidden_state[:, 0]
      return {"features": output.cpu().numpy()}
    sentence_features = tokenized_sentences.map(get_output_embeddings, batched=True, batch_size=10)

    # Predict with the SVM loaded in __init__
    predictions = self.svm.predict(np.array(sentence_features["features"]))

    # Format for output
    output = []
    for prediction, sentence in zip(predictions, sentences):
      label = "claim" if prediction == 1 else "not claim"
      print(f"{label}: {sentence}")
      output.append((label, sentence))
    return output


Example usage:

In [25]:
# Initialize
pkl_path = "one_class_svm.pkl"
claim_detector = ClaimDetector(pkl_path)

# Works on articles and single sentences.
article = "It was only when the administration felt Newsom was not restoring order in the city — and after Trump watched the situation escalate for 24 hours and White House officials saw imagery of federal law enforcement officers with lacerations and other injuries — that the president moved to deploy the Guard, according to the official, who was granted anonymity to discuss private deliberations.   The president has put hundreds of National Guard troops on the streets to quell protests over his administration’s immigration raids, a deployment that state and city officials say has only inflamed tensions. “I understand the importance of deporting criminal aliens, but what we are witnessing are arbitrary measures to hunt down people who are complying with their immigration hearings — in many cases, with credible fear of persecution claims — all driven by a Miller-like desire to satisfy a self-fabricated deportation goal,” said Garcia, referring to Stephen Miller, a White House deputy chief of staff and key architect of Trump’s immigration crackdown.   Trump’s speedy deployment in California of troops against those whom the president has alluded to as “insurrectionists” on social media is a sharp contrast to his decision to issue no order or formal request for National Guard troops during the insurrection at the U.S. Capitol on Jan. 6, 2021, despite his repeated and false assertions that he had made such an offer.   The president and his top immigration aides accused the governor of mismanaging the protests, with border czar Tom Homan asserting in a Fox News interview Monday that Newsom stoked anti-ICE sentiments and waited two days to declare unlawful assembly in the city."
output = claim_detector.predict(article)
sentence = "Written by: Chengyi Li"
output = claim_detector.predict(sentence)
sentence = "Trump has been a good president"
output = claim_detector.predict(sentence)
sentence = "1 + 2 = 3"
output = claim_detector.predict(sentence)

cpu
Tokenizer input max length: 512
Tokenizer vocabulary size: 30522
Model successfully loaded!


[nltk_data] Downloading package punkt_tab to /root/nltk_data...
[nltk_data]   Package punkt_tab is already up-to-date!


Map:   0%|          | 0/5 [00:00<?, ? examples/s]

Map:   0%|          | 0/5 [00:00<?, ? examples/s]

claim: It was only when the administration felt Newsom was not restoring order in the city — and after Trump watched the situation escalate for 24 hours and White House officials saw imagery of federal law enforcement officers with lacerations and other injuries — that the president moved to deploy the Guard, according to the official, who was granted anonymity to discuss private deliberations.
claim: The president has put hundreds of National Guard troops on the streets to quell protests over his administration’s immigration raids, a deployment that state and city officials say has only inflamed tensions.
claim: “I understand the importance of deporting criminal aliens, but what we are witnessing are arbitrary measures to hunt down people who are complying with their immigration hearings — in many cases, with credible fear of persecution claims — all driven by a Miller-like desire to satisfy a self-fabricated deportation goal,” said Garcia, referring to Stephen Miller, a White House d

Map:   0%|          | 0/1 [00:00<?, ? examples/s]

Map:   0%|          | 0/1 [00:00<?, ? examples/s]

not claim: Written by: Chengyi Li


Map:   0%|          | 0/1 [00:00<?, ? examples/s]

Map:   0%|          | 0/1 [00:00<?, ? examples/s]

claim: Trump has been a good president


Map:   0%|          | 0/1 [00:00<?, ? examples/s]

Map:   0%|          | 0/1 [00:00<?, ? examples/s]

not claim: 1 + 2 = 3
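
Hard +1/-1 labels hide how close a sentence sits to the decision boundary. scikit-learn's `OneClassSVM` also exposes `decision_function`, which returns a signed score: positive for in-class points and negative for outliers. The sketch below uses synthetic 2-D points as a stand-in for sentence embeddings; `demo_svm` and the data are illustrative only.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
train_points = rng.normal(loc=0.0, scale=1.0, size=(200, 2))  # synthetic "in-class" data
new_points = np.array([[0.1, -0.2],   # near the training cloud
                       [6.0, 6.0]])   # far from it

demo_svm = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(train_points)

scores = demo_svm.decision_function(new_points)  # signed score per point
labels = demo_svm.predict(new_points)            # +1 in-class, -1 outlier

for point, score, label in zip(new_points, scores, labels):
    print(point, round(float(score), 3), int(label))
```

With scores available, borderline sentences could be ranked or reviewed instead of being cut off at zero.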
