Idea: We can leverage automated fact-checking frameworks for bias detection.

https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00454/109469/A-Survey-on-Automated-Fact-Checking

The following pipeline is inspired by the article above:


1.   Claim detection: OneClassSVM
2.   Source checking: Probability distribution defined by a dataset
3.   Bias detection: Transformer classifier
4.   Prediction & Justification visualizations.



# Step 1: OneClassSVM classifier on transformer embeddings for claim detection

There are 3 methods I want to try (ranked from easiest to hardest):
- Feature extraction classifier
- Fine-tuning output layers
- Fine-tuning all layers
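
The three options differ mainly in which parameters receive gradient updates. A minimal sketch, assuming a toy two-layer PyTorch network as a stand-in for the transformer (the `body`/`head` split, the layer sizes, and the strategy names are made up for illustration):

```python
import torch.nn as nn


def count_trainable(module: nn.Module) -> int:
    """Number of parameters that would receive gradient updates."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)


def build_toy_model() -> nn.Sequential:
    # Hypothetical stand-in: "body" plays the pretrained encoder, "head" the output layer.
    body = nn.Linear(8, 4)
    head = nn.Linear(4, 3)
    return nn.Sequential(body, head)


def set_strategy(model: nn.Sequential, strategy: str) -> None:
    head = model[1]
    # Start with everything frozen, then unfreeze what the strategy trains.
    for p in model.parameters():
        p.requires_grad = False
    if strategy == "feature_extraction":
        pass  # nothing in the network trains; embeddings feed an external classifier
    elif strategy == "output_layers":
        for p in head.parameters():
            p.requires_grad = True
    elif strategy == "all_layers":
        for p in model.parameters():
            p.requires_grad = True
    else:
        raise ValueError(f"Unknown strategy: {strategy}")


trainable_counts = {}
for strategy in ("feature_extraction", "output_layers", "all_layers"):
    toy_model = build_toy_model()
    set_strategy(toy_model, strategy)
    trainable_counts[strategy] = count_trainable(toy_model)
print(trainable_counts)
```

Step 1 below follows the first option: the transformer stays frozen and its embeddings are passed to a separate scikit-learn model.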

But before we do any of that, let's define a dataset for claim detection.

In [None]:
!pip install -U datasets huggingface_hub fsspec lightning

In [None]:
# We will be using the LIAR2 dataset ("chengxuphd/liar2" on the Hugging Face Hub), an extension of LIAR (https://aclanthology.org/P17-2067/), because it contains language relevant to our purpose (article bias detection)

import pandas as pd
import datasets

dataset = datasets.load_dataset("chengxuphd/liar2")
statement_train, y_train = dataset["train"]["statement"], dataset["train"]["label"]
statement_val, y_val = dataset["validation"]["statement"], dataset["validation"]["label"]
statement_test, y_test = dataset["test"]["statement"], dataset["test"]["label"]

# Make sure all these are installed.
import numpy as np
import torch
from datasets import Dataset
from transformers import AutoModel, AutoTokenizer

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
print("Tokenizer input max length:", tokenizer.model_max_length)
print("Tokenizer vocabulary size:", tokenizer.vocab_size)

model = AutoModel.from_pretrained("distilbert-base-uncased")
model.to(device);

# Step 1: create a raw Hugging Face dataset from your input text
raw_train = Dataset.from_list([{"statement": s} for s in statement_train])
raw_val = Dataset.from_list([{"statement": s} for s in statement_val])
raw_test = Dataset.from_list([{"statement": s} for s in statement_test])

# Step 2: tokenize using map so the result stays a Dataset object
# (padding=True pads each map batch to the longest statement in that batch)
tokenized_train = raw_train.map(lambda x: tokenizer(x["statement"], padding=True, truncation=True), batched=True)
tokenized_val = raw_val.map(lambda x: tokenizer(x["statement"], padding=True, truncation=True), batched=True)
tokenized_test = raw_test.map(lambda x: tokenizer(x["statement"], padding=True, truncation=True), batched=True)

@torch.inference_mode()
def get_output_embeddings(batch):
    input_ids = torch.tensor(batch["input_ids"]).to(device)
    attention_mask = torch.tensor(batch["attention_mask"]).to(device)

    output = model(input_ids, attention_mask=attention_mask).last_hidden_state[:, 0]
    return {"features": output.cpu().numpy()}

train_features = tokenized_train.map(get_output_embeddings, batched=True, batch_size=10)
val_features = tokenized_val.map(get_output_embeddings, batched=True, batch_size=10)
test_features = tokenized_test.map(get_output_embeddings, batched=True, batch_size=10)

# Uncomment as needed!
import numpy as np

# Saving
# np.save("X_train.npy", np.stack(train_features["features"]))
# np.save("X_val.npy", np.stack(val_features["features"]))
# np.save("X_test.npy", np.stack(test_features["features"]))

# Loading
# X_train = np.load("X_train.npy")
# X_val = np.load("X_val.npy")
# X_test = np.load("X_test.npy")
# y_train = np.array(y_train)
# y_val = np.array(y_val)
# y_test = np.array(y_test)

X_train = np.array(train_features["features"])
y_train = np.array(y_train)
X_val = np.array(val_features["features"])
y_val = np.array(y_val)
X_test = np.array(test_features["features"])
y_test = np.array(y_test)

Generating train split:   0%|          | 0/18369 [00:00<?, ? examples/s]

Generating validation split:   0%|          | 0/2297 [00:00<?, ? examples/s]

Generating test split:   0%|          | 0/2296 [00:00<?, ? examples/s]
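
In `get_output_embeddings`, `last_hidden_state[:, 0]` keeps only the hidden vector at the first token position (`[CLS]`) of each statement, so every statement becomes one fixed-size feature vector. A small NumPy sketch of that slicing, using a made-up array in place of real DistilBERT output (the shapes are illustrative; the real embeddings have 768 features):

```python
import numpy as np

batch_size, seq_len, hidden_size = 2, 5, 4

# Stand-in for model(...).last_hidden_state: one hidden vector per token per statement.
last_hidden_state = np.arange(batch_size * seq_len * hidden_size, dtype=float).reshape(
    batch_size, seq_len, hidden_size
)

# Keep token position 0 for every statement in the batch.
cls_features = last_hidden_state[:, 0]
print(cls_features.shape)
```

The sequence dimension disappears, which is what lets the features be stacked into a plain 2-D array for scikit-learn.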

The "label" column describes how true a statement is, but it proved hard to predict when I tried it: a classifier trained on it only reached about 31-32% test accuracy. Instead, we're going to train a one-class classifier that treats every statement as an example of a claim.

Now, let's work with the transformer using this example: https://github.com/rasbt/MachineLearning-QandAI-book/blob/main/supplementary/q18-using-llms/

3) Train a one-class model

Claim detection model:

In [None]:
# OneClassSVM initializer.
import numpy as np
from sklearn.svm import OneClassSVM

# Requires the X_*.npy files written by the (commented-out) np.save lines in the previous cell.
X_train = np.load("X_train.npy")
X_val = np.load("X_val.npy")
X_test = np.load("X_test.npy")

# Fit only on "claim" features (every LIAR2 statement is treated as a claim)
svm = OneClassSVM(kernel='rbf', gamma='scale', nu=0.05)
svm.fit(X_train)

# Predict on validation and test sets
val_preds = svm.predict(X_val)
test_preds = svm.predict(X_test)

def quick_accuracy(predictions):
  """
  Print the fraction of predictions that are in-class.

  Assumptions
  - 1 is correct (in-class), -1 is incorrect (out-class)
  - Every input really is a claim, so this only measures accuracy on positives;
    it says nothing about how non-claims are handled.
  """
  predictions = np.asarray(predictions)
  print(f"Accuracy: {float(np.mean(predictions == 1))}")

quick_accuracy(val_preds)
quick_accuracy(test_preds)

import pickle
with open('one_class_svm.pkl', 'wb') as f:
    pickle.dump(svm, f)


Accuracy: 0.9442751414888986
Accuracy: 0.9499128919860628
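
These numbers should be read with the `nu=0.05` setting in mind. In scikit-learn's `OneClassSVM`, `nu` is an upper bound on the fraction of training points treated as outliers, so a model fit on claims alone is expected to accept roughly 95% of similar data by construction. A small sketch on synthetic Gaussian data (not the LIAR2 embeddings) illustrating the effect:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_inliers = rng.normal(size=(2000, 5))  # synthetic "in-class" training data
X_heldout = rng.normal(size=(1000, 5))  # fresh samples from the same distribution

accept_rates = {}
for nu in (0.05, 0.20):
    toy_svm = OneClassSVM(kernel="rbf", gamma="scale", nu=nu).fit(X_inliers)
    # Fraction of held-out in-class points that the model labels as in-class (+1).
    accept_rates[nu] = float(np.mean(toy_svm.predict(X_heldout) == 1))
    print(f"nu={nu}: {accept_rates[nu]:.3f} of held-out in-class points accepted")
```

A high acceptance rate on held-out claims therefore does not show that the model rejects non-claims; that would need a labeled set of non-claim sentences.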


In [None]:
# Pipeline for new texts

# Libraries we actually need (os.path, lightning, and CountVectorizer are not used in this cell)
import numpy as np

# Initialize PyTorch
import torch
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)

from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
print("Tokenizer input max length:", tokenizer.model_max_length)
print("Tokenizer vocabulary size:", tokenizer.vocab_size)

from transformers import AutoModel
model = AutoModel.from_pretrained("distilbert-base-uncased")
model.to(device);

# Split the article into sentences, and tokenize each sentence.
import nltk
nltk.download('punkt_tab')
article = "[redacted insert text here]"
sentences = nltk.sent_tokenize(article)
from datasets import Dataset
raw_sentences = Dataset.from_list([{"text": s} for s in sentences])
tokenized_sentences = raw_sentences.map(lambda x: tokenizer(x["text"], padding=True, truncation=True), batched=True)

# Get embeddings
@torch.inference_mode()
def get_output_embeddings(batch):
    input_ids = torch.tensor(batch["input_ids"]).to(device)
    attention_mask = torch.tensor(batch["attention_mask"]).to(device)

    output = model(input_ids, attention_mask=attention_mask).last_hidden_state[:, 0]
    return {"features": output.cpu().numpy()}
sentence_features = tokenized_sentences.map(get_output_embeddings, batched=True, batch_size=10)

# Predict (convert the features column to a 2-D array first)
predictions = svm.predict(np.array(sentence_features["features"]))

# Format for output
for prediction, sentence in zip(predictions, sentences):
  label = "claim" if prediction == 1 else "not claim"
  print(f"{label}: {sentence}")

Claim detection final product:

In [None]:
# Imports live at module level so that every method can see them
# (imports inside __init__ are local to __init__).
import pickle

import nltk
import numpy as np
import torch
from datasets import Dataset
from scipy.special import expit
from transformers import AutoModel, AutoTokenizer


class ClaimDetector(): # Note: This model only works on sentences because that's what the embeddings are trained on.
  def __init__(self, pkl_path: str):

    # Initialize the model itself
    with open(pkl_path, 'rb') as f:
      self.svm = pickle.load(f)

    # Initialize PyTorch (use self.* everywhere so the class doesn't depend on notebook globals)
    self.device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    print(self.device)
    self.tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
    print("Tokenizer input max length:", self.tokenizer.model_max_length)
    print("Tokenizer vocabulary size:", self.tokenizer.vocab_size)
    self.model = AutoModel.from_pretrained("distilbert-base-uncased")
    self.model.to(self.device)

    # NLTK
    nltk.download('punkt_tab')

    print("Model successfully loaded!")

  def split_sentences(self, article: str):
    return nltk.sent_tokenize(article)

  def get_embeddings(self, sentences: list[str]):

    # Tokenize sentences
    raw_sentences = Dataset.from_list([{"text": s} for s in sentences])
    tokenized_sentences = raw_sentences.map(lambda x: self.tokenizer(x["text"], padding=True, truncation=True), batched=True)

    # Embedding function: [CLS] vector of the last hidden layer
    @torch.inference_mode()
    def get_output_embeddings(batch):
        input_ids = torch.tensor(batch["input_ids"]).to(self.device)
        attention_mask = torch.tensor(batch["attention_mask"]).to(self.device)

        output = self.model(input_ids, attention_mask=attention_mask).last_hidden_state[:, 0]
        return {"features": output.cpu().numpy()}
    return tokenized_sentences.map(get_output_embeddings, batched=True, batch_size=10)

  def embedding_predict(self, sentences, embeddings, print_output=True):

    # Predict & get estimated probabilities.
    # expit squashes the SVM decision score into (0, 1); treat it as a rough
    # confidence score, not a calibrated probability.
    X = embeddings
    predictions = self.svm.predict(X)
    prob_estimates = expit(self.svm.decision_function(X))

    # Format for output
    output = []
    for estimated_probability, prediction, sentence in zip(prob_estimates, predictions, sentences):
      label = "claim" if prediction == 1 else "not claim"
      if print_output:
        print(f"{estimated_probability} {label}: {sentence}")
      output.append((estimated_probability, label, sentence))
    return output

  def text_predict(self, article: str, print_output=True):

    # Split the article into sentences.
    sentences = self.split_sentences(article)

    # Tokenize those sentences and get their embeddings as a dataset.
    sentence_features = self.get_embeddings(sentences)

    # Predict & get estimated probabilities
    return self.embedding_predict(sentences, np.array(sentence_features['features']), print_output=print_output)

Example usage:

In [None]:
# Initialize
pkl_path = "one_class_svm.pkl"
claim_detector = ClaimDetector(pkl_path)

# Works on articles and single sentences.
article = "[redacted insert text here]"
output = claim_detector.text_predict(article)
sentence = "Written by: Chengyi Li"
output = claim_detector.text_predict(sentence)
sentence = "[redacted insert text here]"
output = claim_detector.text_predict(sentence)
sentence = "1 + 2 = 3"
output = claim_detector.text_predict(sentence)
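
The number printed in front of each label by `embedding_predict` is the logistic function applied to the SVM decision score. It is a monotone squashing of the score rather than a calibrated probability: a score of 0 (the decision boundary) maps to 0.5, so positive scores land above 0.5 and negative scores below it. A sketch with made-up scores (the `logistic` helper is a plain NumPy equivalent of `scipy.special.expit`):

```python
import numpy as np


def logistic(x):
    """Same mapping as scipy.special.expit: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=float)))


# Hypothetical decision_function outputs, from clearly out-of-class to clearly in-class.
toy_scores = np.array([-2.0, -0.1, 0.1, 2.0])
toy_estimates = logistic(toy_scores)

for score, estimate in zip(toy_scores, toy_estimates):
    side = "claim" if score > 0 else "not claim"
    print(f"score={score:+.1f} -> estimate={estimate:.3f} ({side})")

boundary_estimate = float(logistic(0.0))  # the decision boundary itself
```

Because one-class SVM scores are often small in magnitude, the resulting estimates can cluster near 0.5, which is worth keeping in mind when reading them as a "degree" of claim-ness.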

After testing this model on more inputs (not shown), here are my findings:

Strengths:
- The model appears to apply a consistent definition of the class; it's not just guessing at random.
- It seems to have learned the grammatical features that make up a political claim. However, some sentences outside the realm of politics also exhibit these features.
- It does slightly better than chance at identifying political texts as opposed to other kinds of text.

Weaknesses:
- Almost all of the AP News article was classified as a claim.
- Some false positives show up in texts that aren't politically charged.


Features I want this model to have:
- Ability to quantify degree of opinion (i.e., not classify everything in an article as a claim, even though most sentences in articles are indeed claims).
- Ability to distinguish politically charged articles from other texts.

# Step 2: Source checking probability distribution


Based on this annotated dataset, we can see which sources were left- or right-leaning at the time the dataset was collected.

In [None]:
!git clone https://github.com/ramybaly/Article-Bias-Prediction



In [None]:
# Load dataset
import pandas as pd
import numpy as np
import os
import json

# Initialize paths
repo_file = "Article-Bias-Prediction"
data_folder = f"{repo_file}/data"
json_folder = f"{data_folder}/jsons"
splits_folder = f"{data_folder}/splits"
media_splits_folder = f"{splits_folder}/media"
media_splits = [os.path.join(media_splits_folder, name) for name in os.listdir(media_splits_folder)]

# Read the labels (ignore_index gives every row a unique index after concatenating the splits)
articles_df = pd.concat([pd.read_csv(media_split, delimiter="\t") for media_split in media_splits], ignore_index=True)
columns = ['topic', 'source', 'bias', 'url', 'title', 'date', 'authors', 'content', 'content_original', 'source_url', 'bias_text', 'id']
for column in columns:
  articles_df[column] = ''

# Append json info
for index, row in articles_df.iterrows():
  article_id = row['ID']  # avoid shadowing the built-in id()
  filename = os.path.join(json_folder, f"{article_id}.json")
  with open(filename) as f:
    article = json.load(f)
  # Write straight into this row instead of re-scanning the whole ID column for every field
  for column in article:
    articles_df.at[index, column] = article[column]

# Save
articles_df.to_csv('article-bias-df.csv', index=False)


In [None]:
import pandas as pd

# Load
articles_df = pd.read_csv('article-bias-df.csv')
counts = articles_df.groupby('source')['bias'].value_counts().unstack(fill_value=0)
probs = counts.div(counts.sum(axis=1), axis=0)
probs.columns = ['P(left|source)', 'P(center|source)', 'P(right|source)']
probs = probs.loc[articles_df['source'].value_counts().index] # Sorted by how common a source is
print(probs.head(10))

# Save
probs.to_csv("probs.csv")

# Load
probs = pd.read_csv("probs.csv", index_col=0)

# Note: The probability distribution is going to be more or less 0 or 1 (the top sources are all one-hot).
# Thus, when using this structure, add a small smoothing constant (+C) to every entry and renormalize,
# so that nothing downstream gets multiplied by 0.

                       P(left|source)  P(center|source)  P(right|source)
source                                                                  
Washington Times                  0.0               0.0              1.0
CNN (Web News)                    1.0               0.0              0.0
Politico                          1.0               0.0              0.0
Fox Online News                   0.0               0.0              1.0
NPR Online News                   0.0               1.0              0.0
USA TODAY                         0.0               1.0              0.0
Vox                               1.0               0.0              0.0
New York Times - News             1.0               0.0              0.0
The Hill                          0.0               1.0              0.0
Fox News                          0.0               0.0              1.0
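
The "+C" note in the cell above can be made concrete with additive smoothing: add a constant to every entry and renormalize each row so it still sums to 1. A minimal sketch on a two-row toy table; the `smooth_distribution` helper and the value `c=0.01` are illustrative assumptions, not something tuned on this dataset:

```python
import pandas as pd


def smooth_distribution(prob_table: pd.DataFrame, c: float = 0.01) -> pd.DataFrame:
    """Add a constant c to every entry, then renormalize each row to sum to 1."""
    shifted = prob_table + c
    return shifted.div(shifted.sum(axis=1), axis=0)


# Toy one-hot table in the same layout as `probs`.
toy_probs = pd.DataFrame(
    [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]],
    index=["Fox News", "Vox"],
    columns=["P(left|source)", "P(center|source)", "P(right|source)"],
)

smoothed = smooth_distribution(toy_probs, c=0.01)
print(smoothed.round(4))
```

Larger values of `c` pull every source toward a uniform distribution, so the constant controls how much the source prior is trusted.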


In [None]:
# Fuzzy search
# !pip install thefuzz[speedup]
from thefuzz import process
def get_bias_distribution_fuzzy(query, prob_table, cutoff=70):
    sources = prob_table.index.tolist()
    match_result = process.extractOne(query, sources)
    if match_result is None or match_result[1] < cutoff:
        return f"No good match found for '{query}'"
    match, score = match_result
    return match, prob_table.loc[match]
query = "foxnews.com"
match_info = get_bias_distribution_fuzzy(query, probs)

if isinstance(match_info, tuple):
    match, dist = match_info
    print(f"Closest match: {match}")
    print(dist)
else:
    print(match_info)

Closest match: Fox News
P(left|source)      0.0
P(center|source)    0.0
P(right|source)     1.0
Name: Fox News, dtype: float64


Final product: creating probability distributions 'left', 'center', 'right'

In [None]:
# !pip install thefuzz[speedup]
from thefuzz import process
import pandas as pd

class SourceChecker():
  def __init__(self, dataframe: pd.DataFrame):
    bias_column = 'bias'
    source_column = 'source'
    counts = dataframe.groupby(source_column)[bias_column].value_counts().unstack(fill_value=0)
    probs = counts.div(counts.sum(axis=1), axis=0)
    probs.columns = ['P(left|source)', 'P(center|source)', 'P(right|source)'] # 0 = left, 1 = center, 2 = right
    probs = probs.loc[dataframe[source_column].value_counts().index] # Sorted by how common a source is
    self.probs = probs
    print("Source checking initialized.")

  def get_bias_distribution_fuzzy(self, query, prob_table, cutoff=70):
      sources = prob_table.index.tolist()
      match_result = process.extractOne(query, sources)
      if match_result is None or match_result[1] < cutoff:
          return f"No good match found for '{query}'"
      match, score = match_result
      return match, prob_table.loc[match]

  def search(self, query):
    match_info = self.get_bias_distribution_fuzzy(query, self.probs)
    if isinstance(match_info, tuple):
        match, dist = match_info
        print(f"Closest match: {match}")
        print(dist)
    else:
        print(match_info)

In [None]:
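# SourceChecker example usage
# Note: search() prints its result and returns None, which is why each call is followed by "None" in the output.
import pandas as pd
articles_df = pd.read_csv('article-bias-df.csv')
checker = SourceChecker(articles_df)
print(checker.search('fox news')) # Matches correctly
print(checker.search('cbs')) # Matches correctly
print(checker.search('cbsnews.com')) # Mismatch: resolves to NBCNews.com instead of CBS News
print(checker.search('sagnsalkgsa')) # Correctly finds no match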

Source checking initialized.
Closest match: Fox News
P(left|source)      0.0
P(center|source)    0.0
P(right|source)     1.0
Name: Fox News, dtype: float64
None
Closest match: CBS News
P(left|source)      1.0
P(center|source)    0.0
P(right|source)     0.0
Name: CBS News, dtype: float64
None
Closest match: NBCNews.com
P(left|source)      1.0
P(center|source)    0.0
P(right|source)     0.0
Name: NBCNews.com, dtype: float64
None
No good match found for 'sagnsalkgsa'
None
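
The `cbsnews.com` mismatch suggests that domain-style queries may need normalizing before matching. A possible sketch, using the standard library's `difflib` as a stand-in for `thefuzz` (so scores and cutoffs are not comparable to the ones above); the `normalize_source` and `resolve_source` helpers, the list of stripped suffixes, and the 0.7 cutoff are all assumptions for illustration:

```python
import re
from difflib import get_close_matches


def normalize_source(name: str) -> str:
    """Lowercase, drop a leading 'www.' and a trailing TLD, keep only letters and digits."""
    name = name.lower().strip()
    name = re.sub(r"^www\.", "", name)
    name = re.sub(r"\.(com|org|net)$", "", name)
    return re.sub(r"[^a-z0-9]", "", name)


def resolve_source(query, sources, cutoff=0.7):
    """Return the best-matching source name, or None if nothing clears the cutoff."""
    normalized = {normalize_source(s): s for s in sources}
    hits = get_close_matches(normalize_source(query), list(normalized), n=1, cutoff=cutoff)
    return normalized[hits[0]] if hits else None


toy_sources = ["Fox News", "CBS News", "NBCNews.com", "Washington Times"]
for query in ("cbsnews.com", "foxnews.com", "sagnsalkgsa"):
    print(query, "->", resolve_source(query, toy_sources))
```

With both sides normalized, "cbsnews.com" and "CBS News" reduce to the same string, so the exact match outranks "NBCNews.com".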


# Step 3: Transformer classifier for bias detection


Method 1: Trained on embeddings

In [None]:
# Initialize Pytorch and relevant libraries again.
!pip install lightning

# I'm honestly not sure if we need these libraries
import os.path as op
import lightning as L
from lightning.pytorch.loggers import CSVLogger
from lightning.pytorch.callbacks import ModelCheckpoint
from sklearn.feature_extraction.text import CountVectorizer

# Extra libraries we actually need
import numpy as np
import pandas as pd
from datasets import Dataset
from datasets.utils.logging import set_verbosity_error, disable_progress_bar


# Initialize PyTorch
import torch
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
print("Tokenizer input max length:", tokenizer.model_max_length)
print("Tokenizer vocabulary size:", tokenizer.vocab_size)
from transformers import AutoModel
model = AutoModel.from_pretrained("distilbert-base-uncased")
model.to(device);

# NLTK
import nltk
nltk.download('punkt_tab')

# Embedding function
@torch.inference_mode()
def get_output_embeddings(batch):
    input_ids = torch.tensor(batch["input_ids"]).to(device)
    attention_mask = torch.tensor(batch["attention_mask"]).to(device)
    output = model(input_ids, attention_mask=attention_mask).last_hidden_state[:, 0]
    return {"features": output.cpu().numpy()}

def get_embedding(article: str):

    # Split the article into sentences, then tokenize the sentences.
    sentences = nltk.sent_tokenize(article)
    raw_sentences = Dataset.from_list([{"text": s} for s in sentences])
    tokenized_sentences = raw_sentences.map(lambda x: tokenizer(x["text"], padding=True, truncation=True), batched=True)
    sentence_features = tokenized_sentences.map(get_output_embeddings, batched=True, batch_size=10)

    # Output features
    return sentence_features

Tokenizer input max length: 512
Tokenizer vocabulary size: 30522


[nltk_data] Downloading package punkt_tab to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt_tab.zip.


In [None]:
import pandas as pd

# Load dataset
article_bias_df = pd.read_csv('article-bias-df.csv')
from datasets import Dataset

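# Let's focus on the articles as the unit for now.
# Note: truncation=True keeps only the first 512 tokens (the tokenizer's max length) of each article.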
raw_articles = Dataset.from_list([{"article": a} for a in article_bias_df['content']])
tokenized_articles = raw_articles.map(lambda x: tokenizer(x["article"], padding=True, truncation=True), batched=True)
articles_features = tokenized_articles.map(get_output_embeddings, batched=True, batch_size=10)
np.save("X_articles.npy", np.stack(articles_features["features"]))

# Save
from google.colab import files
files.download("X_articles.npy")

# Load
X = np.load("X_articles.npy")
y = np.array(article_bias_df['bias'])

# Let's convert this dataset into a sentence-based dataset
article_bias_df.drop(columns=["content_original", "id"], inplace=True)

# Tokenize sentences
sentence_level_data = []
for _, row in article_bias_df.iterrows():
    sentences = nltk.sent_tokenize(row["content"])
    for sent in sentences:
        new_row = row.to_dict()
        new_row["sentence"] = sent  # Replace article with sentence
        sentence_level_data.append(new_row)
sentence_bias_df = pd.DataFrame(sentence_level_data)

# Drop the original article content
sentence_bias_df.drop(columns=["content"], inplace=True)

# Save
sentence_bias_df.to_csv("sentence-bias-df.csv")



In [None]:
import pandas as pd
from datasets import Dataset

# Load
sentence_bias_df = pd.read_csv("sentence-bias-df.csv")

# Now let's focus on sentences.
raw_sentences = Dataset.from_list([{"sentence": s} for s in sentence_bias_df['sentence']])
tokenized_sentences = raw_sentences.map(lambda x: tokenizer(x["sentence"], padding=True, truncation=True), batched=True)
sentences_features = tokenized_sentences.map(get_output_embeddings, batched=True, batch_size=10) # TODO: Google Colab runs out of memory with this 1,291,558 length dataset. Need to find a work around.
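
One possible workaround for the out-of-memory TODO above is to embed the sentences in fixed-size chunks and write each chunk to disk before moving on, so that only one chunk of features is held in memory at a time. A minimal sketch, assuming a placeholder `fake_embed` function stands in for the tokenizer + DistilBERT step (the helper names, chunk size, and output directory are illustrative):

```python
import os
import tempfile

import numpy as np


def fake_embed(sentences, dim=4):
    """Placeholder for the real embedding step: one row of `dim` features per sentence."""
    return np.array([[float(len(s))] * dim for s in sentences])


def embed_in_chunks(sentences, embed_fn, output_dir, chunk_size=3):
    """Embed `chunk_size` sentences at a time and save each block as chunk_<k>.npy."""
    os.makedirs(output_dir, exist_ok=True)
    paths = []
    for start in range(0, len(sentences), chunk_size):
        chunk = embed_fn(sentences[start:start + chunk_size])
        path = os.path.join(output_dir, f"chunk_{start // chunk_size}.npy")
        np.save(path, chunk)
        paths.append(path)
    return paths


toy_sentences = ["word " * (i + 1) for i in range(8)]
with tempfile.TemporaryDirectory() as tmp_dir:
    chunk_paths = embed_in_chunks(toy_sentences, fake_embed, tmp_dir)
    # Reloading in chunk order reproduces the original row order.
    restored = np.concatenate([np.load(p) for p in chunk_paths], axis=0)
print(len(chunk_paths), restored.shape)
```

The file naming mirrors the `chunk_<k>.npy` convention used in the saving cell below, so the same loading code would apply.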


In [None]:
# Saving it is a bit more complicated
import numpy as np
import os
from tqdm import tqdm

output_dir = "sentence_features_chunks"
os.makedirs(output_dir, exist_ok=True)

features = sentences_features["features"]
chunk_size = 10000  # Choose based on available RAM

for i in tqdm(range(0, len(features), chunk_size)):
    chunk = features[i:i+chunk_size]  # A list of small lists
    chunk = np.array(chunk)  # Convert just this chunk to a NumPy array
    np.save(os.path.join(output_dir, f"chunk_{i//chunk_size}.npy"), chunk)


100%|██████████| 130/130 [09:26<00:00,  4.36s/it]


In [None]:
from google.colab import files
import shutil
shutil.make_archive("sentence_features_chunks", 'zip', "sentence_features_chunks")
files.download("sentence_features_chunks.zip")

In [None]:
# Verifying order has been preserved
import numpy as np

# Grab a few reference embeddings
reference_indices = [0, 10000, 20000]  # Adjust as needed
reference_features = [np.array(sentences_features["features"][i]) for i in reference_indices]

# Save for verification later
np.save("reference_features.npy", reference_features)

import numpy as np
import os

# Load reference
reference_indices = [0, 10000, 20000]
reference = np.load("reference_features.npy", allow_pickle=True)

# Load chunks
chunk_files = sorted(os.listdir("sentence_features_chunks"), key=lambda x: int(x.split("_")[1].split(".")[0]))

# Function to map global index to chunk
def get_chunk_and_local_index(global_index, chunk_size):
    return global_index // chunk_size, global_index % chunk_size

# Parameters
chunk_size = 10000

# Load chunks only as needed and verify
for i, ref_idx in enumerate(reference_indices):
    chunk_num, local_idx = get_chunk_and_local_index(ref_idx, chunk_size)
    chunk_file = chunk_files[chunk_num]
    chunk = np.load(os.path.join("sentence_features_chunks", chunk_file))

    matches = np.allclose(chunk[local_idx], reference[i])
    print(f"Index {ref_idx} match: {matches}")


Index 0 match: True
Index 10000 match: True
Index 20000 match: True


In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
# Loading the data
import numpy as np
import os
import pandas as pd

# Load
sentence_bias_df = pd.read_csv("/content/drive/MyDrive/ML stuff I've done/Sentences/sentence-bias-df.csv")

# Path to the folder where chunks are saved
chunk_dir = "/content/drive/MyDrive/ML stuff I've done/Sentences/sentence_features_chunks" # might need to run this a couple times to load this.

# List all chunk files and sort by chunk index
chunk_files = sorted(
    [f for f in os.listdir(chunk_dir) if f.endswith(".npy")],
    key=lambda x: int(x.split("_")[1].split(".")[0])
)

# Load and concatenate
chunks = [np.load(os.path.join(chunk_dir, f)) for f in chunk_files]
X = np.concatenate(chunks, axis=0)
print("Loaded shape:", X.shape)
y = np.array(sentence_bias_df['bias'])

In [None]:
from tqdm import tqdm
import numpy as np

# Filtering for "claims"
claim_detector = ClaimDetector("one_class_svm.pkl")  # separate name so the DistilBERT `model` isn't overwritten
sentence_bias_df['claim_probability'] = np.nan
sentence_bias_df['claim_classified'] = ''

# Iterate through each sentence, classifying the sentence as a claim or not
for (index, row), embedding in tqdm(zip(sentence_bias_df.iterrows(), X), total=len(sentence_bias_df), desc="Classifying sentences"):

    outputs = claim_detector.embedding_predict(
        [row['sentence']],
        np.array([embedding]),
        print_output=False
    )  # One sentence and one embedding per call

    for probability, label, _ in outputs:  # There should only be 1 output, but just in case
        sentence_bias_df.at[index, 'claim_probability'] = probability
        sentence_bias_df.at[index, 'claim_classified'] = label


cuda:0
Tokenizer input max length: 512
Tokenizer vocabulary size: 30522


[nltk_data] Downloading package punkt_tab to /root/nltk_data...
[nltk_data]   Package punkt_tab is already up-to-date!


Model successfully loaded!


Classifying sentences: 100%|██████████| 1291558/1291558 [44:11<00:00, 487.16it/s]
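
The loop above calls the SVM once per sentence, which took about 44 minutes for the 1,291,558 rows. Since `decision_function` accepts a 2-D array, the same scores could instead be computed one block of rows at a time. A sketch on random data (a toy `OneClassSVM`, not the pickled model) checking that block-wise scoring matches row-by-row scoring; the block size is arbitrary:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
toy_svm = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(rng.normal(size=(200, 8)))
X_new = rng.normal(size=(50, 8))

# Row by row, as in the loop above: one call per embedding.
row_scores = np.array([toy_svm.decision_function(x[None, :])[0] for x in X_new])

# Block-wise: a handful of calls, each scoring many rows at once.
block_size = 16
block_scores = np.concatenate([
    toy_svm.decision_function(X_new[i:i + block_size])
    for i in range(0, len(X_new), block_size)
])

print(np.allclose(row_scores, block_scores))
```

How much time this would save on the real data has not been measured here; the point is only that per-row calls are not required for correctness.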


In [None]:
sentence_bias_df.to_csv('sentence-bias-df-labeled.csv')

In [None]:
# Updated loading in data chunk (Takes about 2 minutes)
import numpy as np
import os
import pandas as pd

# Mount drive
from google.colab import drive
drive.mount('/content/drive')

# Load
sentence_bias_df = pd.read_csv("/content/drive/MyDrive/ML stuff I've done/Sentences/sentence-bias-df-labeled.csv")

# Path to the folder where chunks are saved
chunk_dir = "/content/drive/MyDrive/ML stuff I've done/Sentences/sentence_features_chunks" # might need to run this a couple times to load this.

# List all chunk files and sort by chunk index
chunk_files = sorted(
    [f for f in os.listdir(chunk_dir) if f.endswith(".npy")],
    key=lambda x: int(x.split("_")[1].split(".")[0])
)

# Load and concatenate
chunks = [np.load(os.path.join(chunk_dir, f)) for f in chunk_files]
X = np.concatenate(chunks, axis=0)
print("Loaded shape:", X.shape)
y = np.array(sentence_bias_df['bias'])

Mounted at /content/drive
Loaded shape: (1291558, 768)


Train some baseline models

Training sentence models:
1.   Full dataset
2.   Just claims

In [None]:
sentence_bias_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1291558 entries, 0 to 1291557
Data columns (total 16 columns):
 #   Column             Non-Null Count    Dtype  
---  ------             --------------    -----  
 0   Unnamed: 0.1       1291558 non-null  int64  
 1   Unnamed: 0         1291558 non-null  int64  
 2   ID                 1291558 non-null  object 
 3   bias               1291558 non-null  int64  
 4   topic              1291558 non-null  object 
 5   source             1291558 non-null  object 
 6   url                1291558 non-null  object 
 7   title              1291558 non-null  object 
 8   date               1174185 non-null  object 
 9   authors            1021743 non-null  object 
 10  source_url         1291558 non-null  object 
 11  bias_text          1291558 non-null  object 
 12  sentence           1291558 non-null  object 
 13  claim_probability  1291558 non-null  float64
 14  claim_classified   1291558 non-null  object 
 15  sentence_features  1291558 non-n

In [None]:
import numpy as np
import pandas as pd

# Grab a fraction of the dataset
sentence_bias_df['sentence_features'] = [np.array(features) for features in X]
sampled_df = sentence_bias_df.sample(frac=0.25, random_state=42)

# Filter for claims only
claims_df = sampled_df[sampled_df['claim_classified'] == 'claim']

# Find the smallest class count
min_count = claims_df['bias'].value_counts().min()

# Sample equally from each class
balanced_df = (
    claims_df.groupby('bias', group_keys=False)
    .apply(lambda x: x.sample(n=min_count, random_state=42))
)

# Create x and y arrays
x_arr = balanced_df['sentence_features'].tolist()
y_arr = balanced_df['bias'].tolist()

print(f"Balanced size: {len(balanced_df)}")
print(f"Right count: {y_arr.count(2)}") # 74k sentences per class should be good enough.
print(f"Center count: {y_arr.count(1)}")
print(f"Left count: {y_arr.count(0)}")

Balanced size: 223335
Right count: 74445
Center count: 74445
Left count: 74445
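
The balancing step downsamples every class to the size of the smallest one. A minimal sketch of the same idea on made-up data, using DataFrameGroupBy.sample instead of groupby(...).apply(...); recent pandas versions may emit a deprecation warning for the apply pattern, and this variant is only a suggestion, not what the run above used:

```python
import pandas as pd

# Toy stand-in for claims_df: 5 left, 3 center, 8 right sentences
toy = pd.DataFrame({
    "bias": [0] * 5 + [1] * 3 + [2] * 8,
    "sentence": [f"sentence {i}" for i in range(16)],
})

# Size of the smallest class (3 here)
min_count = int(toy["bias"].value_counts().min())

# Draw min_count rows from every class without going through apply
balanced = toy.groupby("bias").sample(n=min_count, random_state=42)

print(balanced["bias"].value_counts().sort_index())
```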




In [None]:
X = np.array(x_arr)  # stack the per-sentence feature vectors into one (n_samples, 768) array
y = np.array(y_arr)

In [None]:
# clear some variables before loading
del sampled_df
del x_arr
del y_arr
del chunk_dir
del chunk_files
del chunks
del sentence_bias_df
del balanced_df
del claims_df
del min_count

In [None]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=42)

In [None]:
# Try out some baseline models
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.dummy import DummyClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
import pickle

# Define baseline models
models = {
    "Decision Tree": DecisionTreeClassifier(),
    "Random Forest": RandomForestClassifier(),
    "Support Vector Machine": SVC(),  # This one takes forever; we might not use it.
}

# Train and evaluate
results = []
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results.append({
        "Model": name,
        "Accuracy": accuracy_score(y_test, y_pred),
        "F1 Score": f1_score(y_test, y_pred, average="weighted"),
        "Precision": precision_score(y_test, y_pred, average="weighted"),
        "Recall": recall_score(y_test, y_pred, average="weighted"),
    })
    print(f"{name}: {results[-1]['Accuracy']}")
    # Save each fitted model under a three-letter prefix, e.g. 'Ran_sentences.pkl'
    with open(f'{name[:3]}_sentences.pkl', 'wb') as f:
        pickle.dump(model, f)

# Display results
import pandas as pd
results_df = pd.DataFrame(results).sort_values(by="F1 Score", ascending=False)
print(results_df)

Decision Tree: 0.361278189877763
Random Forest: 0.42609811793853825


Articles: Yikes, only 64% accuracy. That's okay, though; we still have a few things we expect to improve it:
- Training on claim-detected sentences
- Fine-tuning the last layers of the transformer
- Fine-tuning the entire transformer

Sentences (claim detected): 42% accuracy with Random Forest (36% with Decision Tree), against a chance level of about 33% for three balanced classes. TODO: We can still try the other 2 strategies.

Using the full article, we get 64% accuracy with Log Reg and SVM.
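
To put these numbers in context, here is a quick sketch of the chance-level accuracy for three balanced classes, using the DummyClassifier that is imported in the baseline cell but never used. The data below is made up; only the class balance matters:

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score

# Toy stand-in: 300 samples, perfectly balanced over left/center/right (0/1/2)
y_toy = np.array([0, 1, 2] * 100)
X_toy = np.zeros((len(y_toy), 1))  # DummyClassifier ignores the features

# Always predict a single class; on balanced data that is right one time in three
baseline = DummyClassifier(strategy="most_frequent")
baseline.fit(X_toy, y_toy)
chance_accuracy = accuracy_score(y_toy, baseline.predict(X_toy))

print(f"Chance-level accuracy: {chance_accuracy:.3f}")
```

Any model worth keeping should clear this floor by a comfortable margin on the real test split.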

Final product: Bias Detection model that works with sentences and articles

In [None]:
import pickle

# I'm honestly not sure if we need these libraries
!pip install lightning
import os.path as op
import lightning as L
from lightning.pytorch.loggers import CSVLogger
from lightning.pytorch.callbacks import ModelCheckpoint
from sklearn.feature_extraction.text import CountVectorizer

# Extra libraries we actually need
import numpy as np
import pandas as pd
from datasets import Dataset
from datasets.utils.logging import disable_progress_bar

# Extra
import torch
from transformers import AutoTokenizer
from transformers import AutoModel
import nltk
nltk.download('punkt_tab')

class ArticleBiasDetector():
  def __init__(self, pkl_path: str):

    # Initialize the model itself
    with open(pkl_path, 'rb') as f:
      self.classifier = pickle.load(f)

    # Initialize PyTorch
    self.device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    print(self.device)
    self.tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
    print("Tokenizer input max length:", self.tokenizer.model_max_length)
    print("Tokenizer vocabulary size:", self.tokenizer.vocab_size)
    self.model = AutoModel.from_pretrained("distilbert-base-uncased")
    self.model.to(self.device)
    print("Model successfully loaded!")

  def split_sentences(self, article: str):
    return nltk.sent_tokenize(article)

  def get_embeddings(self, sentences: list[str]):

    # Tokenize sentences
    raw_sentences = Dataset.from_list([{"text": s} for s in sentences])
    tokenized_sentences = raw_sentences.map(lambda x: self.tokenizer(x["text"], padding=True, truncation=True), batched=True)

    # Embedding function
    @torch.inference_mode()
    def get_output_embeddings(batch):
        input_ids = torch.tensor(batch["input_ids"]).to(self.device)
        attention_mask = torch.tensor(batch["attention_mask"]).to(self.device)
        output = self.model(input_ids, attention_mask=attention_mask).last_hidden_state[:, 0]
        return {"features": output.cpu().numpy()}
    return tokenized_sentences.map(get_output_embeddings, batched=True, batch_size=10)

  def embedding_predict(self, sentences, embeddings, print_output=True):

    # Predict & get estimated probabilities
    X = embeddings
    predictions = self.classifier.predict(X)

    # Format for output (a lookup table avoids an unbound `label` for unexpected class ids)
    id2label = {0: "left", 1: "center", 2: "right"}
    output = []
    for prediction, sentence in zip(predictions, sentences):
      label = id2label[int(prediction)]
      if print_output:
        print(f"{label}: {sentence}")
      output.append((label, sentence))
    return output

  def text_predict(self, article: str, print_output=True):

    # Formats article, tokenizes article, gets embeddings. Then, predicts off of the embedding.
    sentences = [article]
    sentence_features = self.get_embeddings(sentences)
    return self.embedding_predict(sentences, np.array(sentence_features['features']), print_output=print_output)

class SentenceBiasDetector(ArticleBiasDetector):
    def __init__(self, pkl_path: str):
        super().__init__(pkl_path)  # Inherits ArticleBiasDetector init

    # Override only text_predict
    def text_predict(self, article: str, print_output=True):

      # Splits article into sentences, tokenizes those sentences, and gets embeddings for those sentences as a dataset. Then predicts off of the embeddings.
      sentences = self.split_sentences(article)
      sentence_features = self.get_embeddings(sentences)
      return self.embedding_predict(sentences, np.array(sentence_features['features']), print_output=print_output)



[nltk_data] Downloading package punkt_tab to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt_tab.zip.
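
get_embeddings keeps only the hidden state at position 0 (the [CLS] token) of each sequence, which is why every sentence or article ends up as a single 768-dimensional vector. A sketch of that indexing, with a random array standing in for last_hidden_state (the batch size and sequence length are made up):

```python
import numpy as np

# Random stand-in for model(...).last_hidden_state: (batch, sequence length, hidden size)
rng = np.random.default_rng(0)
last_hidden_state = rng.normal(size=(4, 12, 768))

# [:, 0] selects the first token of every sequence and drops the sequence axis
cls_features = last_hidden_state[:, 0]

print(cls_features.shape)
```

Everything after the first token is discarded at this step, so the classifier only ever sees what the transformer has packed into that one position.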


In [None]:
article_bias_path = 'Log_articles.pkl'
article_bias_model = ArticleBiasDetector(article_bias_path)

cpu
Tokenizer input max length: 512
Tokenizer vocabulary size: 30522
Model successfully loaded!


In [None]:
# Recent AP News article about Newsom and Trump
article = "[redacted insert text here]"
output = article_bias_model.text_predict(article)

# Recent CBS article about vaccines
article = "[redacted insert text here]"
output = article_bias_model.text_predict(article)

# Recent Fox News article about Russia & Ukraine
article = "[redacted insert text here]"
output = article_bias_model.text_predict(article)

In [None]:
sentence_bias_path = 'Ran_sentences.pkl'
sentence_bias_model = SentenceBiasDetector(sentence_bias_path)

cpu
Tokenizer input max length: 512
Tokenizer vocabulary size: 30522
Model successfully loaded!


In [None]:
# Given these are claims, what side do they lean on?
sentence = "[redacted insert text here]"
output = sentence_bias_model.text_predict(sentence)
sentence = "[redacted insert text here]"
output = sentence_bias_model.text_predict(sentence)
sentence = "[redacted insert text here]"
output = sentence_bias_model.text_predict(sentence)

# TODO: Check if this is outputting left for everything.

Method 2: Fine-tuning transformer

In [None]:
from datasets import DatasetDict, Dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification, TrainingArguments, Trainer
!pip install evaluate
import evaluate
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from transformers import DataCollatorWithPadding
import torch
from google.colab import drive
drive.mount('/content/drive')

# Load dataset
article_bias_df = pd.read_csv("/content/drive/MyDrive/ML stuff I've done/Sentences/article-bias-df.csv")
X_train_val, X_test, y_train_val, y_test = train_test_split(article_bias_df['content'], article_bias_df['bias'], test_size=0.20, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_train_val, y_train_val, test_size=0.25, random_state=42)

# Create Dataset objects from the split data
train_dataset = Dataset.from_dict({'text': X_train.tolist(), 'label': y_train.tolist()})
val_dataset = Dataset.from_dict({'text': X_val.tolist(), 'label': y_val.tolist()})
test_dataset = Dataset.from_dict({'text': X_test.tolist(), 'label': y_test.tolist()})

# Create a DatasetDict
dataset_dict = DatasetDict({
    'train': train_dataset,
    'validation': val_dataset,
    'test': test_dataset
})
print(dataset_dict)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
print("Tokenizer input max length:", tokenizer.model_max_length)
print("Tokenizer vocabulary size:", tokenizer.vocab_size)

# Load model with a three-class classification head
id2label = {0: "left", 1: "center", 2: "right"}
label2id = {"left": 0, "center": 1, "right": 2}
model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased",
                                                           num_labels=3,
                                                           id2label=id2label,
                                                           label2id=label2id)
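# Freeze base model parameters
for name, param in model.base_model.named_parameters():
  param.requires_grad = False

# Unfreeze any base model pooling layers.
# Note: DistilBERT's base model has no parameters with "pooler" in their name, so this loop
# changes nothing here; only the pre_classifier and classifier head layers get trained.
for name, param in model.base_model.named_parameters():
  if "pooler" in name:
    param.requires_grad = True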

# Tokenize our dataset
def tokenize(batch):
   return tokenizer(batch["text"], padding=True, truncation=True)
tokenized_dataset = dataset_dict.map(tokenize, batched=True)

# Data collator
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)

# Metrics
accuracy = evaluate.load("accuracy")
auc_score = evaluate.load("roc_auc")  # loaded but not used below
def compute_metrics(eval_pred):
  predictions, labels = eval_pred
  predicted_classes = np.argmax(predictions, axis=1)
  acc = accuracy.compute(predictions=predicted_classes, references=labels)['accuracy']
  # "AUC" is a placeholder (always 0) during training; the real AUC is computed with sklearn after training
  return {"Accuracy": acc, "AUC": 0}

# Hyperparameters
lr = 2e-4
batch_size = 8
num_epochs = 10

# Training arguments
training_args = TrainingArguments(
    output_dir="distilbert-political-bias-classifier_teacher",
    learning_rate=lr,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    num_train_epochs=num_epochs,
    logging_strategy="epoch",
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    report_to="none" # Disable wandb logging
)

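# Train the model
# Note: this run passes the test split as eval_dataset, so the per-epoch "Validation Loss" and
# accuracy are measured on the test set, and load_best_model_at_end selects on it as well.
# Passing tokenized_dataset["validation"] instead would keep the test set held out.
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["test"],
    tokenizer=tokenizer,
    data_collator=data_collator,
    compute_metrics=compute_metrics
)
trainer.train()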

Collecting evaluate
  Downloading evaluate-0.4.5-py3-none-any.whl.metadata (9.5 kB)
Downloading evaluate-0.4.5-py3-none-any.whl (84 kB)
Installing collected packages: evaluate
Successfully installed evaluate-0.4.5
Mounted at /content/drive
DatasetDict({
    train: Dataset({
        features: ['text', 'label'],
        num_rows: 18147
    })
    validation: Dataset({
        features: ['text', 'label'],
        num_rows: 6049
    })
    test: Dataset({
        features: ['text', 'label'],
        num_rows: 6050
    })
})


Tokenizer input max length: 512
Tokenizer vocabulary size: 30522


Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.



Epoch  Training Loss  Validation Loss  Accuracy  Auc
1      1.022          0.970306         0.522645  0
2      0.9619         0.983836         0.527934  0
3      0.9357         0.976001         0.529091  0
4      0.9105         0.969797         0.529587  0
5      0.8924         0.909002         0.57124   0
6      0.8797         0.882556         0.592727  0
7      0.868          0.851615         0.61405   0
8      0.8573         0.874249         0.597521  0
9      0.8463         0.847262         0.618678  0
10     0.8403         0.845714         0.619174  0


TrainOutput(global_step=22690, training_loss=0.9014029209074758, metrics={'train_runtime': 3638.3537, 'train_samples_per_second': 49.877, 'train_steps_per_second': 6.236, 'total_flos': 2.403928753302528e+16, 'train_loss': 0.9014029209074758, 'epoch': 10.0})
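
Before a long run like this, it helps to check which parameters are actually trainable after the freeze/unfreeze loops. A minimal sketch with a tiny stand-in model (TinyClassifier is hypothetical; it only mimics the base_model / pre_classifier / classifier layout of the DistilBERT classification model):

```python
import torch.nn as nn


class TinyClassifier(nn.Module):
    """Hypothetical stand-in: a 'base model' followed by a two-layer classification head."""

    def __init__(self):
        super().__init__()
        self.base_model = nn.Sequential(nn.Embedding(100, 16), nn.Linear(16, 16))
        self.pre_classifier = nn.Linear(16, 16)
        self.classifier = nn.Linear(16, 3)


model = TinyClassifier()

# Freeze the whole base model
for param in model.base_model.parameters():
    param.requires_grad = False

# Try to unfreeze "pooler" parameters; no name matches here, so nothing is unfrozen
for name, param in model.base_model.named_parameters():
    if "pooler" in name:
        param.requires_grad = True

trainable = [name for name, param in model.named_parameters() if param.requires_grad]
frozen = [name for name, param in model.named_parameters() if not param.requires_grad]

print("Trainable:", trainable)
print("Frozen:", frozen)
```

Running the same two list comprehensions on the real model right after the freeze loops would show whether anything besides the classification head is being updated.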

In [None]:
from sklearn.metrics import roc_auc_score
import torch.nn.functional as F

predictions = trainer.predict(tokenized_dataset["test"])
probs = F.softmax(torch.tensor(predictions.predictions), dim=-1).numpy()
auc = roc_auc_score(predictions.label_ids, probs, multi_class="ovr")
print("Final Test AUC:", auc) # AUC is about 0.80, but the best accuracy logged during training was only about 62%. We'll have to do some extra tuning to see which model actually performs the best.

Final Test AUC: 0.8041250634951664
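
AUC and accuracy measure different things: one-vs-rest AUC scores how well each class's probability ranks the true members above the rest, while accuracy only looks at the argmax. A small sketch on made-up logits (6 samples, 3 classes) showing how the two can diverge:

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score


def softmax(logits: np.ndarray) -> np.ndarray:
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


# Made-up logits; two samples have their true class ranked second
labels = np.array([0, 0, 1, 1, 2, 2])
logits = np.array([
    [2.0, 1.0, 0.1],
    [1.0, 1.2, 0.2],  # true class 0, argmax says 1
    [0.3, 1.5, 0.4],
    [0.9, 0.8, 0.1],  # true class 1, argmax says 0
    [0.1, 0.2, 1.8],
    [0.2, 0.5, 1.0],
])

probs = softmax(logits)
toy_accuracy = accuracy_score(labels, probs.argmax(axis=1))
toy_auc = roc_auc_score(labels, probs, multi_class="ovr")

print(f"Accuracy: {toy_accuracy:.3f}")
print(f"AUC (one-vs-rest, macro): {toy_auc:.3f}")
```

The near misses cost accuracy but barely hurt AUC, which is one plausible reason for the gap between the two numbers in this run.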


In [None]:
!zip -r '/content/distilbert-political-bias-classifier_teacher.zip' '/content/distilbert-political-bias-classifier_teacher/'

  adding: content/distilbert-political-bias-classifier_teacher/ (stored 0%)
  adding: content/distilbert-political-bias-classifier_teacher/checkpoint-9076/ (stored 0%)
  adding: content/distilbert-political-bias-classifier_teacher/checkpoint-9076/tokenizer_config.json (deflated 75%)
  adding: content/distilbert-political-bias-classifier_teacher/checkpoint-9076/rng_state.pth (deflated 26%)
  adding: content/distilbert-political-bias-classifier_teacher/checkpoint-9076/special_tokens_map.json (deflated 42%)
  adding: content/distilbert-political-bias-classifier_teacher/checkpoint-9076/vocab.txt (deflated 53%)
  adding: content/distilbert-political-bias-classifier_teacher/checkpoint-9076/training_args.bin (deflated 54%)
  adding: content/distilbert-political-bias-classifier_teacher/checkpoint-9076/config.json (deflated 48%)
  adding: content/distilbert-political-bias-classifier_teacher/checkpoint-9076/optimizer.pt (deflated 28%)
  adding: content/distilbert-political-bias-classifier_teache

In [None]:
from transformers import AutoModelForSequenceClassification, AutoTokenizer, Trainer
import os
import numpy as np
import torch
import torch.nn.functional as F
from sklearn.metrics import roc_auc_score, accuracy_score

# Collect every saved checkpoint directory (empty list if the output directory is missing)
output_dir = "distilbert-political-bias-classifier_teacher/"
checkpoints = []
if os.path.isdir(output_dir):
    checkpoints = [os.path.join(output_dir, d) for d in os.listdir(output_dir) if d.startswith("checkpoint-")]

results = []
for checkpoint in checkpoints:
    print(f"Evaluating {checkpoint}...")

    # Load model
    model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

    # Create a temporary trainer just for evaluation
    trainer = Trainer(
        model=model,
        tokenizer=tokenizer,
        data_collator=data_collator
    )

    # Evaluate on test set
    predictions = trainer.predict(tokenized_dataset["test"])
    probs = F.softmax(torch.tensor(predictions.predictions), dim=-1).numpy()
    predicted_classes = np.argmax(probs, axis=1)
    auc = roc_auc_score(predictions.label_ids, probs, multi_class="ovr")
    acc = accuracy_score(predictions.label_ids, predicted_classes)
    results.append({
        "checkpoint": checkpoint,
        "Accuracy": acc,
        "AUC": auc
    })
    print(f"Acc: {acc}")
    print(f"Auc: {auc}")

# Display all results
import pandas as pd
df_results = pd.DataFrame(results)
print(df_results.sort_values("Accuracy", ascending=False).to_string())

                                                      checkpoint  Accuracy       AUC
2  distilbert-political-bias-classifier_teacher/checkpoint-22690  0.619174  0.804125
8  distilbert-political-bias-classifier_teacher/checkpoint-20421  0.618678  0.803224
1  distilbert-political-bias-classifier_teacher/checkpoint-15883  0.614050  0.796287
5  distilbert-political-bias-classifier_teacher/checkpoint-18152  0.597521  0.796855
9  distilbert-political-bias-classifier_teacher/checkpoint-13614  0.592727  0.789420
7  distilbert-political-bias-classifier_teacher/checkpoint-11345  0.571240  0.781780
0   distilbert-political-bias-classifier_teacher/checkpoint-9076  0.529587  0.771485
6   distilbert-political-bias-classifier_teacher/checkpoint-6807  0.529091  0.760570
4   distilbert-political-bias-classifier_teacher/checkpoint-4538  0.527934  0.745260
3   distilbert-political-bias-classifier_teacher/checkpoint-2269  0.522645  0.713988
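
The checkpoint numbers are optimizer steps, not epochs. A quick sketch of the arithmetic that maps one to the other, assuming a single device and no gradient accumulation (both assumptions, but they match the reported global_step of 22690):

```python
import math

train_rows = 18147   # size of the train split printed by DatasetDict
batch_size = 8       # per_device_train_batch_size
total_steps = 22690  # global_step reported by TrainOutput

# The last, partially filled batch still counts as a step
steps_per_epoch = math.ceil(train_rows / batch_size)


def checkpoint_epoch(step: int) -> int:
    """Epoch at which a checkpoint with this step number was saved."""
    return step // steps_per_epoch


for step in (2269, 9076, 22690):
    print(f"checkpoint-{step} -> epoch {checkpoint_epoch(step)}")
```

So checkpoint-9076 is the model after epoch 4 and checkpoint-22690 is the model after epoch 10.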


Our best model was checkpoint 22690 (the latest), with an accuracy of about 62% (0.619) and an AUC of about 0.80.

Final product: Fine-tuned on articles

In [None]:
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch
import torch.nn.functional as F
import numpy as np

class BiasDetector():
  def __init__(self, checkpoint_path):
    # Load the model and tokenizer
    self.model = AutoModelForSequenceClassification.from_pretrained(checkpoint_path)
    self.tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

    # Set the model to evaluation mode
    self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    self.model.to(self.device)
    self.model.eval()
    print("Model has been loaded!")

  def text_predict(self, text):
    # Tokenize the input text
    # Add batch dimension as models expect batched inputs
    tokenized_input = self.tokenizer(text, return_tensors="pt", padding=True, truncation=True)

    # Move the tokenized input to the same device as the model
    tokenized_input = {key: value.to(self.device) for key, value in tokenized_input.items()}

    # Make a prediction
    with torch.no_grad():  # Disable gradient calculation for inference
        outputs = self.model(**tokenized_input)

    # Get the logits
    logits = outputs.logits

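    # Get predicted probabilities (move to CPU first so .numpy() also works when the model is on a GPU)
    probs = F.softmax(logits, dim=-1).cpu().numpy()

    # Get the predicted class (index with the highest probability) as a plain Python int
    predicted_class_id = int(np.argmax(probs, axis=1)[0])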

    # Map the predicted class ID back to a label
    id2label = {0: "left", 1: "center", 2: "right"}
    predicted_label = id2label[predicted_class_id]

    # Return format
    return {"text": text,
            "probabilities": probs[0],
            "class_id": predicted_class_id,
            "label": predicted_label}

In [None]:
from google.colab import drive

# Mount Google Drive
drive.mount('/content/drive')

# Define the path to your saved model checkpoint
checkpoint_path = "/content/drive/MyDrive/ML stuff I've done/Sentences/checkpoint-22690/"

# Test usage (`text` is not defined in this cell; it must already hold the string to classify)
bias_detector = BiasDetector(checkpoint_path)
output = bias_detector.text_predict(text)
print(output['label'])

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
Model has been loaded!
right


# Step 4: Final product + visualizations

In [None]:
# I'm honestly not sure if we need these libraries (Transformers)
!pip install lightning
import os.path as op
import lightning as L
from lightning.pytorch.loggers import CSVLogger
from lightning.pytorch.callbacks import ModelCheckpoint
from sklearn.feature_extraction.text import CountVectorizer
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch
import torch.nn.functional as F

# Libraries we actually need (Source checker + Transformers)
import numpy as np
import pandas as pd

# Extra (Transformers); torch and AutoTokenizer are already imported above
from datasets import Dataset
from datasets.utils.logging import disable_progress_bar
from transformers import AutoModel
import nltk
import pickle
nltk.download('punkt_tab')

# Extra (Source checker)
!pip install "thefuzz[speedup]"
from thefuzz import process

# Source checking class
class SourceChecker():
  def __init__(self, df_path: str):
    dataframe = pd.read_csv(df_path)
    # bias_column = 'bias'
    # source_column = 'source'
    # counts = dataframe.groupby(source_column)[bias_column].value_counts().unstack(fill_value=0)
    # probs = counts.div(counts.sum(axis=1), axis=0)
    # probs.columns = ['P(left|source)', 'P(center|source)', 'P(right|source)'] # 0 = left, 1 = center, 2 = right
    # probs = probs.loc[dataframe[source_column].value_counts().index] # Sorted by how common a source is
    self.probs = dataframe.set_index('source')
    print("Source checking initialized.")

  def get_bias_distribution_fuzzy(self, query, prob_table, cutoff=70):
      sources = prob_table.index.tolist()
      match_result = process.extractOne(query, sources)
      if match_result is None or match_result[1] < cutoff:
          return f"No good match found for '{query}'"
      match, score = match_result
      return match, prob_table.loc[match]

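  def search(self, query):
    match_info = self.get_bias_distribution_fuzzy(query, self.probs)
    if isinstance(match_info, tuple):
        match, dist = match_info
        print(f"Closest match: {match}")
        print(dist)
        return match
    # No source scored above the cutoff: print the message and return None
    # (returning `match` here would raise UnboundLocalError)
    print(match_info)
    return None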

# Deprecated (Replaced by fine-tuned transformer)
class ArticleBiasDetector():
  def __init__(self, pkl_path: str):

    # Initialize the model itself
    with open(pkl_path, 'rb') as f:
      self.classifier = pickle.load(f)

    # Initialize PyTorch
    self.device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    print(self.device)
    self.tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
    print("Tokenizer input max length:", self.tokenizer.model_max_length)
    print("Tokenizer vocabulary size:", self.tokenizer.vocab_size)
    self.model = AutoModel.from_pretrained("distilbert-base-uncased")
    self.model.to(self.device)
    print("Model successfully loaded!")

  def split_sentences(self, article: str):
    return nltk.sent_tokenize(article)

  def get_embeddings(self, sentences: list[str]):

    # Tokenize sentences
    raw_sentences = Dataset.from_list([{"text": s} for s in sentences])
    tokenized_sentences = raw_sentences.map(lambda x: self.tokenizer(x["text"], padding=True, truncation=True), batched=True)

    # Embedding function
    @torch.inference_mode()
    def get_output_embeddings(batch):
        input_ids = torch.tensor(batch["input_ids"]).to(self.device)
        attention_mask = torch.tensor(batch["attention_mask"]).to(self.device)
        output = self.model(input_ids, attention_mask=attention_mask).last_hidden_state[:, 0]
        return {"features": output.cpu().numpy()}
    return tokenized_sentences.map(get_output_embeddings, batched=True, batch_size=10)

  def embedding_predict(self, sentences, embeddings, print_output=True):

    # Predict & get estimated probabilities
    X = embeddings
    predictions = self.classifier.predict(X)

    # Format for output
    output = []
    for prediction, sentence in zip(predictions, sentences):
      if prediction == 0:
        label = "left"
      elif prediction == 1:
        label = "center"
      elif prediction == 2:
        label = "right"
      if print_output:
        print(f"{label}: {sentence}")
      output.append((label, sentence))
    return output

  def text_predict(self, article: str, print_output=True):

    # Formats article, tokenizes article, gets embeddings. Then, predicts off of the embedding.
    sentences = [article]
    sentence_features = self.get_embeddings(sentences)
    return self.embedding_predict(sentences, np.array(sentence_features['features']), print_output=print_output)

# Deprecated (replaced by fine-tuned transformer)
class SentenceBiasDetector(ArticleBiasDetector):
    def __init__(self, pkl_path: str):
        super().__init__(pkl_path)  # Inherits ArticleBiasDetector init

    # embedding_predict is inherited unchanged from ArticleBiasDetector

    # Override only text_predict
    def text_predict(self, article: str, print_output=True):

      # Splits article into sentences, tokenizes those sentences, and gets embeddings for those sentences as a dataset. Then predicts off of the embeddings.
      sentences = self.split_sentences(article)
      sentence_features = self.get_embeddings(sentences)
      return self.embedding_predict(sentences, np.array(sentence_features['features']), print_output=print_output)

# Claim detection class
class SentenceClaimDetector(ArticleBiasDetector):
  def __init__(self, pkl_path: str):
      super().__init__(pkl_path)  # Inherits ArticleBiasDetector init

  def embedding_predict(self, sentences, embeddings, print_output=True):

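    # Predict (1 = "claim", anything else = "not claim") and squash the signed decision score into (0, 1).
    # Note: expit(decision_function) is a monotonic score, not a calibrated probability.
    X = embeddings
    predictions = self.classifier.predict(X)
    from scipy.special import expit
    prob_estimates = expit(self.classifier.decision_function(X))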

    # Format for output
    output = []
    for estimated_probability, prediction, sentence in zip(prob_estimates, predictions, sentences):
      label = "claim" if prediction == 1 else "not claim"
      if print_output:
        #print(f"{int(estimated_probability * 10**5) / 10**5:.5f} {label}: {sentence}")
        print(f"{estimated_probability} {label}: {sentence}")
      output.append((estimated_probability, label, sentence))
    return output

  def text_predict(self, article: str, print_output=True):

    # Splits article into sentences, tokenizes those sentences, and gets embeddings for those sentences as a dataset. Then predicts off of the embeddings.
    sentences = self.split_sentences(article)
    sentence_features = self.get_embeddings(sentences)
    return self.embedding_predict(sentences, np.array(sentence_features['features']), print_output=print_output)

# Bias detection class
class BiasDetector():
  def __init__(self, checkpoint_path):
    # Load the model and tokenizer
    self.model = AutoModelForSequenceClassification.from_pretrained(checkpoint_path)
    self.tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

    # Set the model to evaluation mode
    self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    self.model.to(self.device)
    self.model.eval()
    print("Model has been loaded!")

  def text_predict(self, text):
    # Tokenize the input text
    # Add batch dimension as models expect batched inputs
    tokenized_input = self.tokenizer(text, return_tensors="pt", padding=True, truncation=True)

    # Move the tokenized input to the same device as the model
    tokenized_input = {key: value.to(self.device) for key, value in tokenized_input.items()}

    # Make a prediction
    with torch.no_grad():  # Disable gradient calculation for inference
        outputs = self.model(**tokenized_input)

    # Get the logits
    logits = outputs.logits

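    # Get predicted probabilities (move to CPU first so .numpy() also works when the model is on a GPU)
    probs = F.softmax(logits, dim=-1).cpu().numpy()

    # Get the predicted class (index with the highest probability) as a plain Python int
    predicted_class_id = int(np.argmax(probs, axis=1)[0])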

    # Map the predicted class ID back to a label
    id2label = {0: "left", 1: "center", 2: "right"}
    predicted_label = id2label[predicted_class_id]

    # Return format
    return {"text": text,
            "probabilities": probs[0],
            "class_id": predicted_class_id,
            "label": predicted_label}



[nltk_data] Downloading package punkt_tab to /root/nltk_data...
[nltk_data]   Package punkt_tab is already up-to-date!
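
SourceChecker reads a precomputed table of P(bias | source). The commented-out lines in its constructor hint at how such a table can be built; here is a sketch of that computation on made-up articles, followed by a fuzzy lookup. The standard library's difflib is used purely as a stand-in for thefuzz's process.extractOne, so the cutoff is on a 0 to 1 scale rather than 0 to 100:

```python
import difflib

import pandas as pd

# Made-up articles: one row per article, bias coded 0 = left, 1 = center, 2 = right
toy = pd.DataFrame({
    "source": ["Fox News", "Fox News", "Fox News", "CBS News", "CBS News", "AP News"],
    "bias": [2, 2, 1, 0, 1, 1],
})

# Count articles per (source, bias), then normalise each row into a distribution
counts = toy.groupby("source")["bias"].value_counts().unstack(fill_value=0)
counts = counts.reindex(columns=[0, 1, 2], fill_value=0)  # make sure all three classes exist
probs = counts.div(counts.sum(axis=1), axis=0)
probs.columns = ["P(left|source)", "P(center|source)", "P(right|source)"]

# Fuzzy lookup of a misspelled source name (stand-in for process.extractOne)
matches = difflib.get_close_matches("Fox Newz", probs.index.tolist(), n=1, cutoff=0.7)
match = matches[0] if matches else None

print("Closest match:", match)
if match is not None:
    print(probs.loc[match])
```

Each row of the table sums to 1, so a lookup returns a full left/center/right distribution for the matched source rather than a single label.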




In [None]:
# Initialize models
import pandas as pd
from google.colab import drive

# Mount Google Drive
drive.mount('/content/drive')

# articles_model_path = 'Log_articles.pkl'
# sentences_model_path = 'Ran_sentences.pkl'
probs_path = "/content/drive/MyDrive/ML stuff I've done/Sentences/sourcebias_probabilities.csv"
bias_model_path = "/content/drive/MyDrive/ML stuff I've done/Sentences/checkpoint-22690/"
claim_model_path = "/content/drive/MyDrive/ML stuff I've done/Sentences/claimdetection_oneClassSVM.pkl"

checker = SourceChecker(probs_path)
bias_detection_model = BiasDetector(bias_model_path)
claim_detection_model = SentenceClaimDetector(claim_model_path)

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
Source checking initialized.


Model has been loaded!
cpu
Tokenizer input max length: 512
Tokenizer vocabulary size: 30522


Model successfully loaded!
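
The "probability" that SentenceClaimDetector prints is the classifier's decision_function value passed through a logistic sigmoid. A short sketch of what that transformation does, on made-up decision values (the sigmoid is written out with NumPy here; it is the same function as scipy.special.expit, and the sign-based labels are a simplification of calling predict):

```python
import numpy as np


def expit(x: np.ndarray) -> np.ndarray:
    """Logistic sigmoid, the same function as scipy.special.expit."""
    return 1.0 / (1.0 + np.exp(-x))


# Made-up decision_function values: negative = outside the learned region, positive = inside
decision_values = np.array([-2.0, -0.1, 0.0, 0.3, 1.5])
scores = expit(decision_values)

# Simplification: label by the sign of the decision value instead of calling predict
labels = ["claim" if value > 0 else "not claim" for value in decision_values]

for value, score, label in zip(decision_values, scores, labels):
    print(f"decision={value:+.1f}  score={score:.3f}  {label}")
```

The scores keep the ordering of the decision values and sit at 0.5 on the boundary, so they are handy for ranking sentences, but they should not be read as calibrated probabilities.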


In [None]:
# Example usage. The following article is from this link: https://time.com/7308973/danger-trump-undercutting-bls/
source = 'Time USA'
article = '''The Historic Danger of Trump Undercutting the BLS. The United States labor market shrank in May and June. The following month, the U.S. Bureau of Labor Statistics (BLS) reported as such. Then, President Donald Trump fired the commissioner of the BLS. The move isn’t just a reflection of Trump’s extraordinary exercise of executive power. It also dealt a major blow to trust in information provided by the U.S—and thus, both to America's economy and wider reputation abroad. Unfortunately, this is not an isolated incident. President Trump has also, for instance, directed the Commerce Department to begin work on a "new" census that would exclude millions of people from the population tally and would leverage "the results and information gained from the Presidential Election of 2024." The credibility of the American government's most fundamentally important data, and its word on major issues, is critical both to American prosperity and to allied confidence. At the beginning of the Cuban missile crisis, President John F. Kennedy asked former Secretary of State Dean Acheson to meet with French President Charles de Gaulle to brief him on the reasons for America's blockade of Cuba. Acheson offered to show him pictures of Soviet nuclear weapons in Cuba to justify the decision. DeGaulle's response was clear and definitive. "No.No.No.No," he responded, "The word of the President of the United States is good enough for me." This clear demonstration of trust proved to be an enormous benefit in mobilizing French and other international support for the U.S. vis-à-vis the Soviet Union during this period. In the years that followed, it wasn't always that way. Trust in President Lyndon B. Johnson faltered when foreigners questioned the validity of the information presented to justify the Tonkin Bay Resolution to support escalating the U.S. presence in the Vietnam War. 
Many countries, especially France (it turns out, quite correctly), questioned the validity of the information presented to justify President George W. Bush's decision to engage in the Second Iraq War. Trust and credibility with leaders abroad, along with many Americans, have long been important components in support (or lack thereof) for Washington's foreign and military policy. But this subject is not confined to these areas of policy. Trust and credibility are critical for confidence in economic and financial matters as well—and for the sustained prosperity of our country. One of the reasons Americans and foreigners invest in the U.S. is, of course, the size and dynamism of the U.S. economy. And President Donald Trump has worked hard, and in many cases quite successfully, to encourage more domestic and foreign investment here. But another key factor that makes investment attractive here is the history of trust in statistics provided by Washington. Distrust of Government Data would work against that goal. Investors operate under the assumption, justified by practices and experience over many decades, that this country will allow market forces to drive the economy and encourage the kind of innovation that has been so vital to creating new jobs and innovative companies. They also want to have the confidence that government sources will provide accurate data on the economy; this is vital to decisions in financial markets and to those considering a wide range of investment decisions from building new factories, starting new companies, or making new products for sale here. Having worked in senior economic jobs in the White House, U.S. 
Trade Representative and other agencies in the administrations of five presidents—three Republicans and two Democrats—I can attest to the value foreign governments and businesses attach to the credibility of the U.S., the statements of its leaders, and the information its government agencies provide about its policies and the state of the U.S. economy. Trust in the word of the U.S. government is critical to the credibility of American leadership abroad. And the reliability of economic information Washington provides to foreign governments, businesses and investors is vital to their confidence in our economy.  When I worked as Undersecretary of State for Economic Affairs, one of my jobs was to meet with leaders of companies in other countries to encourage investment here: it was a process known as "economic statecraft." And faith in the data and transparency provided by the U.S. government on such things as inflation, growth, jobs, and fair enforcement of laws and regulations were essential as an inducement. This was seen as a sharp contrast to the often-questionable information provided (or at times totally withheld) by countries such as China and Russia. U.S. government statistics were seen as apolitical data. It was assumed that data and other information provided by the government were not susceptible to political pressure on the institutions that provided them or their leadership. The U.S. government is the source of a wide range of vital data. The Federal Reserve is highly respected worldwide for the financial and economic statements it makes. The International Trade Administration in the Commerce Department is a valuable and credible source of trade information. Information provided by the Food and Drug Administration is seen as the gold standard on new drugs. The BLS falls into this category. Data on jobs is a politically charged number; it is often a signal for decisions by the Federal Reserve, as well as by financial and corporate investors. 
The president has the authority to appoint or change the Commissioner. But in the past, this has been seen as a non-political position. The information provided by the BLS Commissioner simply relies on data from a wide range of non-political and highly credible civil servants. While the president has the authority to replace the Commissioner, the bigger question is whether doing so serves the national interest—or even his own economic policy objectives.In considering the latter, it is important to recognize that removal of the Commissioner adds to already significant doubts abroad about the reliability of the U.S. and raises the question as to whether information emerging from the BLS or other agencies in Washington is based on factual data or on political pressure or partisan considerations. The next Commissioner, however qualified, will enter that job under this cloud. This is hardly reassuring to efforts or policies to convince foreign companies to invest here (a commendable goal) or to reassure many of the world’s financial institutions to buy U.S. bonds in order to help finance the rising deficit we face. To reassure buyers of our debt, attract new business investment, and support American international interests, President Trump must take steps to strengthen trust in the economic data and statements of the U.S. When our president, or another senior official, next meets with de Gaulle's distant successor President Emmanuel Macron, or any other world leader, will the response to information provided by the U.S. be similar to that so powerfully given by Charles de Gaulle? The results will speak for themselves. '''

# Check how likely the source itself is to be biased.
checker.search(source)

# Check article itself
article_output = bias_detection_model.text_predict(article)
article_prediction = article_output['label']

# Check individual sentences.
# Each claim prediction is a (probability, label, sentence) tuple, so the
# sentences are taken from it directly. This keeps the claim and bias
# predictions aligned one-to-one instead of relying on a second split.
claim_predictions = claim_detection_model.text_predict(article, print_output=False)
sentences = [sentence for _, _, sentence in claim_predictions]
sentence_predictions = []
for sentence in sentences:
  sentence_output = bias_detection_model.text_predict(sentence)
  sentence_predictions.append(sentence_output['label'])

# Print outputs
print(f"Based off of the embeddings produced by the text, the article is predicted to be {article_prediction} leaning.\n")
for claim_pred, bias_pred in zip(claim_predictions, sentence_predictions):
  probability, pred, sentence = claim_pred
  if pred == 'claim':
    print(f"{bias_pred} biased claim: {sentence}")
  else:
    print(f"not claim: {sentence}")

Based off of the embeddings produced by the text, the article is predicted to be left leaning.

left biased claim: The Historic Danger of Trump Undercutting the BLS.
not claim: The United States labor market shrank in May and June.
right biased claim: The following month, the U.S. Bureau of Labor Statistics (BLS) reported as such.
right biased claim: Then, President Donald Trump fired the commissioner of the BLS.
left biased claim: The move isn’t just a reflection of Trump’s extraordinary exercise of executive power.
left biased claim: It also dealt a major blow to trust in information provided by the U.S—and thus, both to America's economy and wider reputation abroad.
not claim: Unfortunately, this is not an isolated incident.
left biased claim: President Trump has also, for instance, directed the Commerce Department to begin work on a "new" census that would exclude millions of people from the population tally and would leverage "the results and information gained from the Presidenti

Note: A better version of this script is available in the accompanying .py file!
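
The per-sentence output above can be hard to read for a long article, so it may help to condense it into label counts over the sentences flagged as claims. The following is a minimal sketch, not part of the original pipeline: `summarize_bias` is a hypothetical helper, the toy inputs are made up for illustration, and it assumes the claim predictions are `(probability, label, sentence)` tuples with `'claim'` as the positive label, as the printing loop above suggests.

```python
from collections import Counter

def summarize_bias(claim_predictions, sentence_predictions):
  """Count bias labels over the sentences that were flagged as claims.

  claim_predictions: list of (probability, label, sentence) tuples (assumed shape)
  sentence_predictions: list of bias labels, one per sentence
  """
  if len(claim_predictions) != len(sentence_predictions):
    raise ValueError("claim and bias predictions must align one-to-one")
  # Keep only the bias labels of sentences whose claim label is 'claim'.
  counts = Counter(
    bias
    for (_, claim_label, _), bias in zip(claim_predictions, sentence_predictions)
    if claim_label == 'claim'
  )
  total = sum(counts.values())
  # Convert counts to shares; return an empty dict when no claims were found.
  shares = {label: n / total for label, n in counts.items()} if total else {}
  return counts, shares

# Hypothetical toy inputs, not real model output.
toy_claims = [
  (0.91, 'claim', 'Sentence A.'),
  (0.12, 'not claim', 'Sentence B.'),
  (0.77, 'claim', 'Sentence C.'),
  (0.83, 'claim', 'Sentence D.'),
]
toy_bias = ['left', 'right', 'right', 'left']

counts, shares = summarize_bias(toy_claims, toy_bias)
print(counts)
print(shares)
```

With the real models, the same helper could be called as `summarize_bias(claim_predictions, sentence_predictions)` after the example-usage cell. Whether a simple majority over claim sentences is a meaningful article-level signal is an open question; it is only meant as a quick sanity check against the article-level prediction.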