# Statement Classifier

This project's goal is to train a model that determines whether a statement is a claim that can be fact-checked or some other kind of statement, such as an opinion, that cannot.

## TODO:

- [ ] Before training again, set up a file structure that saves the 'latest' model and moves earlier runs into time-stamped dirs, possibly with a small metadata file alongside.
- [ ] Config at top to control what runs when you click GO.
- [ ] function-ize processes
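
One possible shape for the first TODO item, as a minimal sketch: the `latest` / `run_<timestamp>` layout, the `metadata.json` name, and the `archive_latest` helper are all assumptions for illustration, not something the notebook already does.

```python
import json
import shutil
from datetime import datetime
from pathlib import Path


def archive_latest(root: Path, metadata: dict) -> Path:
    """Move root/latest (if present) into a time-stamped directory, then
    recreate an empty root/latest that holds a metadata.json file."""
    latest = root / "latest"
    if latest.exists():
        # Same timestamp format the notebook uses for its training output_dir.
        stamp = datetime.now().strftime("%Y%m%d%H%M%S")
        shutil.move(str(latest), str(root / f"run_{stamp}"))
    latest.mkdir(parents=True)
    (latest / "metadata.json").write_text(json.dumps(metadata, indent=2))
    return latest
```

The idea would be to call this once before training and then point `trainer.save_model(...)` at the returned directory. The timestamp has one-second resolution, so two calls within the same second would need a different naming scheme.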

In [58]:
import torch
import pandas as pd
import numpy as np
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    DataCollatorWithPadding,
    TrainingArguments,
    Trainer,
    pipeline
)
from datasets import Dataset
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from pathlib import Path
from datetime import datetime

In [30]:
import sys
print(sys.executable)

/home/ksull18/code/iu-autonomous-fact-checker/.venv/bin/python


In [31]:
%lsmagic

Available line magics:
%alias  %alias_magic  %autoawait  %autocall  %automagic  %autosave  %bookmark  %cat  %cd  %clear  %code_wrap  %colors  %conda  %config  %connect_info  %cp  %debug  %dhist  %dirs  %doctest_mode  %ed  %edit  %env  %gui  %hist  %history  %killbgscripts  %ldir  %less  %lf  %lk  %ll  %load  %load_ext  %loadpy  %logoff  %logon  %logstart  %logstate  %logstop  %ls  %lsmagic  %lx  %macro  %magic  %mamba  %man  %matplotlib  %micromamba  %mkdir  %more  %mv  %notebook  %page  %pastebin  %pdb  %pdef  %pdoc  %pfile  %pinfo  %pinfo2  %pip  %popd  %pprint  %precision  %prun  %psearch  %psource  %pushd  %pwd  %pycat  %pylab  %qtconsole  %quickref  %recall  %rehashx  %reload_ext  %rep  %rerun  %reset  %reset_selective  %rm  %rmdir  %run  %save  %sc  %set_env  %store  %sx  %system  %tb  %time  %timeit  %unalias  %unload_ext  %uv  %who  %who_ls  %whos  %xdel  %xmode

Available cell magics:
%%!  %%HTML  %%SVG  %%bash  %%capture  %%code_wrap  %%debug  %%file  %%html  %%javascript  %%

In [32]:
torch.cuda.is_available()

True

## Step 2: Prepare Data

### Sample Data

In [33]:
here = Path.cwd()
cbdata_path = here / ".data_sets" / "ClaimBuster_Datasets" / "datasets" # ClaimBuster data location
all_sentences_csv_path = cbdata_path / "all_sentences.csv"
if not all_sentences_csv_path.exists():
    raise FileNotFoundError(all_sentences_csv_path)
all_sent_df = pd.read_csv(all_sentences_csv_path)
all_sent_df.head(5)

Unnamed: 0,Sentence_id,Text,Speaker,Speaker_title,Speaker_party,Speaker_role,File_id,Length,Line_number,Sentiment
0,1,"September 25, 1988",Information,,,,1988-09-25.txt,3,1,
1,2,The First Bush-Dukakis Presidential Debate,Information,,,,1988-09-25.txt,5,2,0.0
2,3,Good evening.,Jim Lehrer,,,Moderator,1988-09-25.txt,2,3,0.343189
3,4,On behalf of the Commission on Presidential De...,Jim Lehrer,,,Moderator,1988-09-25.txt,23,4,0.810043
4,5,I'm Jim Lehrer of the McNeil-Lehrer News Hour.,Jim Lehrer,,,Moderator,1988-09-25.txt,8,5,0.0


In [34]:
crowdsourced_path = cbdata_path / "crowdsourced.csv"
if not crowdsourced_path.exists():
    raise FileNotFoundError(crowdsourced_path)
crowdsourced_df = pd.read_csv(crowdsourced_path)
crowdsourced_df.head()

Unnamed: 0,Sentence_id,Text,Speaker,Speaker_title,Speaker_party,File_id,Length,Line_number,Sentiment,Verdict
0,16,I think we've seen a deterioration of values.,George Bush,Vice President,REPUBLICAN,1988-09-25.txt,8,16,0.0,-1
1,17,I think for a while as a nation we condoned th...,George Bush,Vice President,REPUBLICAN,1988-09-25.txt,16,17,-0.456018,-1
2,18,"For a while, as I recall, it even seems to me ...",George Bush,Vice President,REPUBLICAN,1988-09-25.txt,29,18,-0.805547,-1
3,19,"So we've seen a deterioration in values, and o...",George Bush,Vice President,REPUBLICAN,1988-09-25.txt,35,19,0.698942,-1
4,20,"We got away, we got into this feeling that val...",George Bush,Vice President,REPUBLICAN,1988-09-25.txt,15,20,0.0,-1


In [35]:
groundtruth_path = cbdata_path / "groundtruth.csv"
if not groundtruth_path.exists():
    raise FileNotFoundError(groundtruth_path)
groundtruth_df = pd.read_csv(groundtruth_path)
groundtruth_df.head()

Unnamed: 0,Sentence_id,Text,Speaker,Speaker_title,Speaker_party,File_id,Length,Line_number,Sentiment,Verdict
0,26,"You know, I saw a movie - ""Crocodile Dundee.""",George Bush,Vice President,REPUBLICAN,1988-09-25.txt,9,26,0.0,0
1,80,We're consuming 50 percent of the world's coca...,Michael Dukakis,Governor,DEMOCRAT,1988-09-25.txt,8,80,-0.740979,1
2,129,That answer was about as clear as Boston harbor.,George Bush,Vice President,REPUBLICAN,1988-09-25.txt,9,129,0.0,-1
3,131,Let me help the governor.,George Bush,Vice President,REPUBLICAN,1988-09-25.txt,5,131,0.212987,-1
4,172,We've run up more debt in the last eight years...,Michael Dukakis,Governor,DEMOCRAT,1988-09-25.txt,22,172,-0.268506,1


In [36]:
cg_df = pd.concat([crowdsourced_df, groundtruth_df])

In [37]:
training_data = [
    # Factual claims (label = 1)
    ("John Smith was elected mayor in 2020", 1),
    ("The company reported $2 million in revenue", 1),
    ("She graduated from Harvard University", 1),
    ("The meeting was scheduled for 3 PM", 1),
    ("COVID-19 cases increased by 15% last month", 1),
    # Opinions (label = 0)
    ("This is the best restaurant in town", 0),
    ("We should invest more in education", 0),
    ("That movie was terrible", 0),
    ("This policy is unfair to working families", 0),
    ("Climate change is the most important issue", 0),
]

In [38]:
cg_df.describe()

Unnamed: 0,Sentence_id,Length,Line_number,Sentiment,Verdict
count,23533.0,23533.0,23533.0,23530.0,23533.0
mean,16860.946841,17.876386,531.862279,-0.047797,-0.414949
std,9576.117567,12.717837,326.215351,0.462105,0.850329
min,16.0,5.0,12.0,-0.978973,-1.0
25%,8862.0,9.0,258.0,-0.420186,-1.0
50%,16969.0,14.0,495.0,0.0,-1.0
75%,24755.0,23.0,786.0,0.274032,0.0
max,34458.0,152.0,1392.0,0.988349,1.0


In [39]:
def load_ncs_part(path, name):
    """Load one ClaimBuster NCS JSON file and print a quick summary."""
    assert path.exists(), f"Missing dataset file: {path}"
    df = pd.read_json(path)
    assert isinstance(df, pd.DataFrame)
    print(f"--- {name} ---")
    print(df.head())
    print(df.describe())
    return df

part01_df = load_ncs_part(cbdata_path / "2.5xNCS.json", "part 01")
part02_df = load_ncs_part(cbdata_path / "2xNCS.json", "part 02")
part03_df = load_ncs_part(cbdata_path / "3xNCS.json", "part 03")

--- part 01 ---
   sentence_id  label                                               text
0        27247      1                We're 9 million jobs short of that.
1        10766      1  You know, last year up to this time, we've los...
2         3327      1  And in November of 1975 I was the first presid...
3        19700      1  And what we've done during the Bush administra...
4        12600      1  Do you know we don't have a single program spo...
        sentence_id        label
count   9674.000000  9674.000000
mean   16268.353628     0.285714
std     9388.575939     0.451777
min       16.000000     0.000000
25%     8344.000000     0.000000
50%    16455.500000     0.000000
75%    24086.250000     1.000000
max    34458.000000     1.000000
--- part 02 ---
   sentence_id  label                                               text
0        15083      1  When I made my decision to stop all trade with...
1        16799      1  We've got the highest inflation we've had in t...
2        32570

In [65]:
all_parts_df = pd.concat([part01_df, part02_df, part03_df])
print(all_parts_df.describe())
print(f"Dataset Size: {len(all_parts_df)}")

        sentence_id         label
count  29022.000000  29022.000000
mean   16281.469161      0.285714
std     9401.659478      0.451762
min       16.000000      0.000000
25%     8384.500000      0.000000
50%    16455.500000      0.000000
75%    24089.000000      1.000000
max    34458.000000      1.000000
Dataset Size: 29022


### Additional Data Exploring

After building the model and performing some manual testing, the statement, "Barack Obama was president from 2009 to 2017," kept being returned as an opinion when it is actually a verifiable claim.

In [69]:
if False:
    for sent in all_parts_df["text"]:
        if "obama" in sent.casefold():
            print(sent)

if True:
    obama_mask = all_parts_df["text"].str.contains("Obama", case=False, na=False)
    obama_df = all_parts_df.copy()[obama_mask] # Only Obama entries
    obama_claims_df = obama_df[obama_df["label"] == 1 ]
    obama_opinions_df = obama_df[obama_df["label"] == 0 ]

    obama_mentions_count = len(obama_df)
    obama_claims_count = len(obama_claims_df)
    obama_opinions_count = len(obama_opinions_df)
    print(f"Total Obama mentions: {obama_mentions_count}")
    print(f"Obama Claims (LABEL_1): {obama_claims_count}")
    print(f"Obama Opinions (LABEL_0): {obama_opinions_count}")
    print(f"Obama Claim Percentage: {(obama_claims_count / obama_mentions_count) * 100}%")

    print("\nSample Obama Entries as Claims")
    print("---" * 5 + " Claims " + "---" * 5)
    print(obama_claims_df.head(10))
    print("---" * 5 + " Opinions " + "---" * 5)
    print(obama_opinions_df.head(10))
    # for i, text in enumerate(obama_claims_df["text"].head(10)):
    #     print(f"{i}.) \"{text}\"")
    # print("\nSample Obama Entries as Claims")


Total Obama mentions: 271
Obama Claims (LABEL_1): 147
Obama Opinions (LABEL_0): 124
Obama Claim Percentage: 54.24354243542435%

Sample Obama Entries as Claims
--------------- Claims ---------------
     sentence_id  label                                               text
76          6216      1  Senator Obama, as a member of the Illinois Sta...
162         2427      1  Right now, the CBO says up to 20 million peopl...
217        30219      1  And my successor, John Kerry, and President Ob...
320         3926      1  Now General Petraeus has praised the successes...
328         5665      1  Senator Obama has asked for nearly $1 billion ...
331         2381      1  I just don't know how the president could have...
351        33634      1  We have, during his regime, during President O...
408         2201      1  And ironically, if you repeal Obamacare, and I...
560        10916      1  Senator Obama has approved storage and reproce...
674         5970      1  By the way, when Senator Ob

## Convert to Dataframe

### Split Data

In [74]:
# Split into train/validation sets
train_texts, val_texts, train_labels, val_labels = train_test_split(
    all_parts_df["text"].tolist(), 
    all_parts_df["label"].tolist(), 
    test_size=0.2, 
    random_state=42
)

print(f"Training samples: {len(train_texts)}")
print(f"Validation samples: {len(val_texts)}")
print(type(train_texts))
print(len([x for x in train_texts if "obama" in x.casefold()]))

Training samples: 23217
Validation samples: 5805
<class 'list'>
213
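
The combined frame has exactly three times the rows of part 01 (29,022 vs. 9,674) and the same label mean, so the three NCS files may share sentences. If they do, a random split can put copies of one sentence in both the training and validation sets, which would inflate the validation scores. Whether that actually happens here is not shown by the output above; the following is a minimal sketch of a check, with `count_leaked` as a hypothetical helper and toy lists standing in for `train_texts` and `val_texts`.

```python
def count_leaked(train_texts, val_texts):
    """Return how many validation texts also occur in the training texts."""
    def normalize(text):
        # Ignore case and whitespace differences when comparing sentences.
        return " ".join(text.casefold().split())

    seen = {normalize(t) for t in train_texts}
    return sum(1 for t in val_texts if normalize(t) in seen)


# Toy data standing in for the real split.
toy_train = ["We lost 9 million jobs.", "Good evening."]
toy_val = ["we lost 9 million jobs. ", "Taxes went up."]
print(count_leaked(toy_train, toy_val))  # prints 1
```

If the count on the real split is well above zero, dropping duplicates before splitting (for example `all_parts_df.drop_duplicates(subset="text")`) would be one way to get a cleaner validation set.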


## Step 3: Load and Setup BERT

### Initialize Tokenizer and Model

In [48]:
# Choose your BERT variant
model_name = "bert-base-uncased"  # Good starting point
# Alternatives: "roberta-base", "distilbert-base-uncased" (faster)

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Ran into a tokenization issue: all tensors in a batch must be the same length.
# Some were 100 tokens long, but one was 187.
# Fix: pad each batch to a common length with a data collator.
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

model = AutoModelForSequenceClassification.from_pretrained(
    model_name, 
    num_labels=2  # Binary classification: claim vs opinion
)
model.to(device)

print(f"Model loaded: {model_name}")
print(f"Vocabulary size: {tokenizer.vocab_size}")

Using device: cuda


Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Model loaded: bert-base-uncased
Vocabulary size: 30522


### Tokenize Data

In [49]:
def tokenize_function(examples):
    return tokenizer(
        examples['text'], 
        truncation=True, 
        padding=True, 
        max_length=256  # Adjust based on your text length
    )

# Create datasets
train_dataset = Dataset.from_dict({
    'text': train_texts,
    'labels': train_labels
})

val_dataset = Dataset.from_dict({
    'text': val_texts,
    'labels': val_labels
})

# Apply tokenization
train_dataset = train_dataset.map(tokenize_function, batched=True)
val_dataset = val_dataset.map(tokenize_function, batched=True)

print("Data tokenized successfully!")

Map:   0%|          | 0/23217 [00:00<?, ? examples/s]

Map:   0%|          | 0/5805 [00:00<?, ? examples/s]

Data tokenized successfully!


## Step 4: Fine-Tune Model

We are doing **transfer learning** with **fine-tuning**.
BERT was pre-trained to understand language (thank you!).
We are fine-tuning the model for a specific task: claim vs. opinion.
The technique is supervised learning with backpropagation.

Deep dive: `bert-base-uncased` has roughly 110 million weights that encode its understanding of language. We adjust these to suit our classification task. Only the final classification layer learns from scratch; the rest of BERT adapts rather than being retrained completely.
During pre-training, BERT learned to predict tokens hidden behind a "[MASK]" token. For classification, the head instead reads the representation of the "[CLS]" token at the start of the input.
By fine-tuning, we add a layer like: `input text -> BERT Encoder -> Classification Head -> [Opinion, Claim] probabilities` (label 0 is opinion, label 1 is claim).
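
To make the last stage of that pipeline concrete, here is a toy sketch of what a classification head does: one linear layer over the encoder's sentence vector, followed by a softmax. The three-value vector and the weights are made up for illustration (the real encoder output has 768 values and the real weights are learned), and `classification_head` is a hypothetical name, not part of `transformers`.

```python
import math


def classification_head(pooled, weights, bias):
    """Linear layer followed by softmax: one probability per class."""
    logits = [
        sum(w * x for w, x in zip(row, pooled)) + b
        for row, b in zip(weights, bias)
    ]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


pooled = [0.5, -1.0, 2.0]                        # stand-in for the encoder output
weights = [[0.1, 0.2, -0.3], [-0.1, 0.4, 0.3]]   # row 0 = opinion, row 1 = claim
bias = [0.0, 0.0]
probs = classification_head(pooled, weights, bias)
print(probs)  # two probabilities that sum to 1
```

Fine-tuning adjusts both these head weights and the encoder weights so that the larger probability lands on the correct label.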

### Define Training Arguments

[transformers.TrainingArguments](https://huggingface.co/docs/transformers/v4.52.3/en/main_classes/trainer#transformers.TrainingArguments) has a lot of parameters. 

In [None]:
timestamp = datetime.now().strftime("%Y%m%d%H%M%S")

training_args = TrainingArguments(
    output_dir=f'./bert-claim-classifier_{timestamp}', # Working directory during training for logs and checkpoints.
    num_train_epochs=3,              # Start with 3, adjust based on results
    per_device_train_batch_size=16,  # Reduce if memory issues
    per_device_eval_batch_size=16,
    warmup_steps=500, # gradually increase the learning rate over the first 500 steps | prevents huge destructive changes early on
    weight_decay=0.01, # mild regularization (0.01) to discourage memorizing the training data exactly
    logging_dir='./logs',
    logging_steps=10,
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    dataloader_pin_memory=False, # pinned memory is turned off here; True can speed up CPU-to-GPU transfer
    fp16=True, # mixed precision can speedup training if supported
    dataloader_num_workers=4, # parallel data loading
)
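
To see what `warmup_steps=500` does to the learning rate, here is a small sketch of a linear warmup followed by a linear decay, which to my knowledge is the `Trainer` default schedule. The base rate of `5e-5` is the `TrainingArguments` default as far as I know, and the total of 4,356 steps assumes a single GPU with no gradient accumulation (23,217 samples at batch size 16 is 1,452 steps per epoch, times 3 epochs). `linear_warmup_lr` is a hypothetical helper, not a library function.

```python
def linear_warmup_lr(step, base_lr=5e-5, warmup_steps=500, total_steps=4356):
    """Learning rate at an optimizer step: linear ramp up, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


for step in (0, 250, 500, 2428, 4356):
    print(step, linear_warmup_lr(step))
```

The rate starts at zero, reaches its peak at step 500, and then falls steadily, so the freshly initialized classifier head cannot push large, noisy gradients through BERT in the first few hundred steps.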

### Define Evaluation Metrics

In [51]:
def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    predictions = np.argmax(predictions, axis=1)
    
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, predictions, average='weighted'
    )
    accuracy = accuracy_score(labels, predictions)
    
    return {
        'accuracy': accuracy,
        'f1': f1,
        'precision': precision,
        'recall': recall
    }
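
One caveat with `average='weighted'`: the data is imbalanced (a label mean of about 0.286 means roughly 29% claims), and weighted averaging is dominated by the majority class. A minimal sketch of looking at the claim class on its own, using made-up labels and a hypothetical `claim_precision_recall` helper:

```python
def claim_precision_recall(labels, predictions, positive=1):
    """Precision and recall for the positive (claim) class only."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == positive and p == positive)
    fp = sum(1 for y, p in pairs if y != positive and p == positive)
    fn = sum(1 for y, p in pairs if y == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up example: 80% accuracy, yet two of the three claims are missed.
labels      = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
predictions = [0, 0, 0, 0, 0, 0, 0, 1, 0, 0]
print(claim_precision_recall(labels, predictions))
```

With scikit-learn, passing `average=None` to `precision_recall_fscore_support` returns the same kind of per-class numbers.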

### Initialize and Train

This is the fun part we all want to do :)

In [None]:
# Create trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
    data_collator=data_collator,
    compute_metrics=compute_metrics,
)

# Start training
print("Starting training...")
trainer.train()

# Save the model
trainer.save_model('./bert-claim-classifier') # Where to save model weights and config
tokenizer.save_pretrained('./bert-claim-classifier') # for tokenizer stuff
print("Model saved!")

Starting training...


huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)

Epoch,Training Loss,Validation Loss,Accuracy,F1,Precision,Recall
1,0.2547,0.112086,0.963997,0.963635,0.964037,0.963997
2,0.0209,0.058669,0.986736,0.986729,0.986725,0.986736
3,0.0004,0.047583,0.991042,0.991068,0.991167,0.991042



Model saved!
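
The `huggingface/tokenizers` fork warning in the training output most likely comes from the data-loading workers (`dataloader_num_workers=4`) forking after the tokenizer has already been used. The warning names its own fix; a sketch of applying it, assuming this runs in the first cell before any tokenizer call:

```python
import os

# Must be set before the tokenizer is first used in this process.
os.environ["TOKENIZERS_PARALLELISM"] = "false"
```

Setting it to `"false"` should only silence the message and keep the tokenizer single-threaded, which is what the library falls back to after the fork anyway.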


## Step 5: Test Model

### Load Trained Model for Testing

In [None]:
print("\n")
# Load your fine-tuned model
classifier = pipeline(
    task="text-classification",
    model="./bert-claim-classifier",
    tokenizer="./bert-claim-classifier",
    device='cuda'
)

# Test sentences
test_sentences = [
    "Barack Obama was president from 2009 to 2017",  # Should be factual
    "Pizza is the most delicious food ever",         # Should be opinion
    "The stock market closed at 4,500 points",      # Should be factual
    "This movie deserves an Oscar",                  # Should be opinion
    "Barack Obama was born in Hawaii in 1961",       # More factual claims for Obama
    "The man Barack Obama served as Senator from Illinois before becoming president.",
    "The man John Doe served as Senator from Illinois before becoming president.",
    "Barack Obama won the Nobel Peace Prize in 2009",
    "George Washington won the Nobel Peace Prize in 2009",
    "Ada Lovelace wrote the first computer program way back in the 1840s!",
    "My name is Kevin.",
]

print("=== Testing the model ===")
print("-" * 50)
for sentence in test_sentences:
    result = classifier(sentence)
    label = "Factual Claim" if result[0]['label'] == 'LABEL_1' else "Opinion"
    confidence = result[0]['score']
    print(result)
    print(f"Text: '{sentence}'")
    print(f"Prediction: {result[0]['label']} => {label} (confidence: {confidence:.3f})")
    print("-" * 50)

Device set to use cuda




=== Testing the model ===
--------------------------------------------------
[{'label': 'LABEL_0', 'score': 0.9985219836235046}]
Text: 'Barack Obama was president from 2009 to 2017'
Prediction: LABEL_0 => Opinion (confidence: 0.999)
--------------------------------------------------
[{'label': 'LABEL_0', 'score': 0.9998867511749268}]
Text: 'Pizza is the most delicious food ever'
Prediction: LABEL_0 => Opinion (confidence: 1.000)
--------------------------------------------------
[{'label': 'LABEL_1', 'score': 0.9998519420623779}]
Text: 'The stock market closed at 4,500 points'
Prediction: LABEL_1 => Factual Claim (confidence: 1.000)
--------------------------------------------------
[{'label': 'LABEL_0', 'score': 0.999909520149231}]
Text: 'This movie deserves an Oscar'
Prediction: LABEL_0 => Opinion (confidence: 1.000)
--------------------------------------------------
[{'label': 'LABEL_0', 'score': 0.9996640682220459}]
Text: 'Barack Obama was born in Hawaii in 1961'
Prediction: LABE

### Manual Evaluation Function

In [62]:
def evaluate_model(texts, true_labels):
    """Evaluate model on a list of texts with known labels"""
    predictions = []
    
    for text in texts:
        result = classifier(text)
        # Convert to binary (0 or 1)
        pred = 1 if result[0]['label'] == 'LABEL_1' else 0
        predictions.append(pred)
    
    accuracy = accuracy_score(true_labels, predictions)
    precision, recall, f1, _ = precision_recall_fscore_support(
        true_labels, predictions, average='weighted'
    )
    
    print(f"Accuracy: {accuracy:.3f}")
    print(f"Precision: {precision:.3f}")
    print(f"Recall: {recall:.3f}")
    print(f"F1-score: {f1:.3f}")
    
    return predictions

# Example usage:
test_texts = ["Company revenue increased 20%", "I think this is wrong"]
test_labels = [1, 0]  # 1 = factual, 0 = opinion
# predictions = evaluate_model(test_texts, test_labels)

# transformers warns when a pipeline is called one text at a time on GPU; passing a dataset or batches is faster.
predictions = evaluate_model(val_texts, val_labels)
# predictions = evaluate_model(val_dataset)

Accuracy: 0.991
Precision: 0.991
Recall: 0.991
F1-score: 0.991
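
`evaluate_model` calls the pipeline once per text, which is what triggers the sequential-GPU warning. A minimal sketch of the batched alternative follows; `predict_in_batches` and `fake_classifier` are hypothetical names, and the fake classifier only stands in for the real pipeline so the sketch runs without a GPU.

```python
def predict_in_batches(classify, texts, batch_size=32):
    """Call `classify` on slices of `texts` and map its labels to 0/1."""
    predictions = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        # A text-classification pipeline returns one dict per input text.
        for result in classify(batch):
            predictions.append(1 if result["label"] == "LABEL_1" else 0)
    return predictions


def fake_classifier(batch):
    # Hypothetical stand-in: anything containing '%' counts as a claim.
    return [
        {"label": "LABEL_1" if "%" in text else "LABEL_0", "score": 1.0}
        for text in batch
    ]


texts = ["Company revenue increased 20%", "I think this is wrong"]
print(predict_in_batches(fake_classifier, texts, batch_size=1))  # prints [1, 0]
```

With the real `classifier`, passing the whole list in one call together with a `batch_size` argument should have a similar effect, since pipelines accept lists of texts.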


## Step 6: Integration With Fact-Checker

In [57]:
def extract_claims_from_text(text):
    """
    Extract potential factual claims from text
    Returns list of sentences classified as factual claims
    """
    # Simple sentence splitting (you might want to use spaCy for better results)
    sentences = text.split('. ')
    print(sentences)
    
    claims = []
    for sentence in sentences:
        if len(sentence.strip()) > 10:  # Skip very short sentences
            print(sentence)
            result = classifier(sentence)
            print(result)
            if result[0]['label'] == 'LABEL_1':  # Factual claim
                claims.append({
                    'text': sentence,
                    'confidence': result[0]['score']
                })
    
    return claims

# Test with a Twitter example
twitter_text = """My opponent Denver Riggleman, running mate of Corey Stewart, was caught on camera campaigning with a white supremacist. Now he has been exposed as a devotee of Bigfoot erotica. This is not what we need on Capitol Hill."""

claims = extract_claims_from_text(twitter_text)
print(f"Extracted claims: {claims}")
for claim in claims:
    print(f"- {claim['text']} (confidence: {claim['confidence']:.3f})")

['My opponent Denver Riggleman, running mate of Corey Stewart, was caught on camera campaigning with a white supremacist', 'Now he has been exposed as a devotee of Bigfoot erotica', 'This is not what we need on Capitol Hill.']
My opponent Denver Riggleman, running mate of Corey Stewart, was caught on camera campaigning with a white supremacist
[{'label': 'LABEL_1', 'score': 0.9972979426383972}]
Now he has been exposed as a devotee of Bigfoot erotica
[{'label': 'LABEL_0', 'score': 0.9996885061264038}]
This is not what we need on Capitol Hill.
[{'label': 'LABEL_0', 'score': 0.9999184608459473}]
Extracted claims: [{'text': 'My opponent Denver Riggleman, running mate of Corey Stewart, was caught on camera campaigning with a white supremacist', 'confidence': 0.9972979426383972}]
- My opponent Denver Riggleman, running mate of Corey Stewart, was caught on camera campaigning with a white supremacist (confidence: 0.997)
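
Note that `text.split('. ')` strips the period from every sentence except the last and does not split on `?` or `!`. A slightly more careful splitter, as a regex-based sketch (`split_sentences` is a hypothetical helper; a library such as spaCy would handle more cases):

```python
import re


def split_sentences(text):
    """Split on whitespace that follows '.', '!' or '?', keeping the punctuation."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


sample = "He was caught on camera. Is that true? This is not what we need!"
print(split_sentences(sample))
# prints ['He was caught on camera.', 'Is that true?', 'This is not what we need!']
```

This still breaks on abbreviations such as "Dr. Smith", so it is a stopgap rather than a replacement for a proper sentence segmenter.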
