# Preliminaries

The notebook was run on Google Colab with a Tesla T4 GPU. The pretrained models were finetuned on each dataset with the Hugging Face Trainer API. The datasets are a local fake news dataset (Filipino) and the Kaggle fake news dataset from UTK Machine Learning Club 2017.

This experiment focuses on building an adversarial attack by removing degree adverbs from the test articles.



In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
!cp "/content/drive/My Drive/198-adversarial-ml/Kaggle-Fake-News/train.csv" "train.csv"
!cp "/content/drive/My Drive/198-adversarial-ml/Fake-News-Filipino/full.csv" "full.csv"

In [None]:
!pip install datasets
!pip install transformers

In [3]:
import torch
import numpy as np
import pandas as pd
import itertools
import string
import re
from datasets import load_dataset
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from transformers import TrainingArguments, Trainer
from transformers import EarlyStoppingCallback

The following dataset wrapper and metric function are used to train and evaluate the models in this experiment.

In [58]:
class Dataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels=None):
        self.encodings = encodings
        self.labels = labels

    def __getitem__(self, idx):
        item = {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}
        if self.labels is not None:
            item["labels"] = torch.tensor(self.labels[idx])
        return item

    def __len__(self):
        return len(self.encodings["input_ids"])

In [59]:
def compute_metrics(p):
    pred, labels = p
    pred = np.argmax(pred, axis=1)

    accuracy = accuracy_score(y_true=labels, y_pred=pred)
    recall = recall_score(y_true=labels, y_pred=pred)
    precision = precision_score(y_true=labels, y_pred=pred)
    f1 = f1_score(y_true=labels, y_pred=pred)

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
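To make the four values returned by `compute_metrics` concrete, the sketch below recomputes them by hand from confusion counts. It assumes label 1 (unreliable) is the positive class, which matches the default `pos_label=1` of the scikit-learn functions used above. The helper name `binary_metrics` and the toy labels are illustrative and not part of the notebook.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 with label 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fake news correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # reliable news flagged as fake
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fake news that was missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # reliable news correctly passed

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels: 3 true positives, 1 false positive, 1 false negative, 3 true negatives.
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 0, 1, 0, 1, 0, 1, 0]
toy_metrics = binary_metrics(toy_true, toy_pred)
print(toy_metrics)
```

With these toy labels every metric comes out to 0.75, since 6 of the 8 predictions are correct and the positive class has 3 hits against 1 false alarm and 1 miss.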

In [60]:
results = []

Preload pre-processed and raw datasets and models if needed.

In [26]:
!cp "/content/drive/My Drive/198-adversarial-ml/Kaggle-Fake-News/train_orig.csv" "train_kaggle.csv"
!cp "/content/drive/My Drive/198-adversarial-ml/Kaggle-Fake-News/test_orig.csv" "test_orig_kaggle.csv"
!cp "/content/drive/My Drive/198-adversarial-ml/Kaggle-Fake-News/test_adv.csv" "test_adv_kaggle.csv"

!cp "/content/drive/My Drive/198-adversarial-ml/Fake-News-Filipino/train_orig.csv" "train_fil.csv"
!cp "/content/drive/My Drive/198-adversarial-ml/Fake-News-Filipino/test_orig.csv" "test_orig_fil.csv"
!cp "/content/drive/My Drive/198-adversarial-ml/Fake-News-Filipino/test_adv.csv" "test_adv_fil.csv"

# Kaggle Fake News Dataset

Use the train.csv file from the [Kaggle Fake News Dataset](https://www.kaggle.com/competitions/fake-news/data), which contains over 20,000 news articles labeled 0 when reliable and 1 when unreliable. The file is split 70-30: 70% forms the new training dataset and 30% forms the test dataset.

In [None]:
df = pd.read_csv('train.csv')

In [None]:
df = df[df['text'].notnull()]

In [None]:
train, test_orig = train_test_split(df, test_size=0.3)
train.to_csv('/content/drive/My Drive/198-adversarial-ml/Kaggle-Fake-News/train_orig.csv', index=False)
test_orig.to_csv('/content/drive/My Drive/198-adversarial-ml/Kaggle-Fake-News/test_orig.csv', index=False)

!cp "/content/drive/My Drive/198-adversarial-ml/Kaggle-Fake-News/train_orig.csv" "train_kaggle.csv"
!cp "/content/drive/My Drive/198-adversarial-ml/Kaggle-Fake-News/test_orig.csv" "test_orig_kaggle.csv"

In [None]:
print("Train size:", len(train))
print("Test size:", len(test_orig))

Train size: 14532
Test size: 6229
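The split above is not seeded, so a rerun produces a different train/test partition. As a rough illustration of what a reproducible 70-30 split looks like, here is a plain-Python sketch. It is not scikit-learn's implementation; the helper `seeded_split` is hypothetical, and rounding the test size up is an assumption chosen because it reproduces the sizes printed above.

```python
import math
import random


def seeded_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of `rows` with a fixed seed, then cut it into train and test."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # same seed, same order on every run
    n_test = math.ceil(test_fraction * len(shuffled))  # assumption: round the test size up
    return shuffled[n_test:], shuffled[:n_test]


# Stand-in for the 20761 rows (14532 + 6229) that remain after dropping null text.
rows = list(range(20761))
train_rows, test_rows = seeded_split(rows, test_fraction=0.3, seed=0)
print(len(train_rows), len(test_rows))
```

In the notebook itself, the same effect can be had by passing a fixed `random_state` to `train_test_split`.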


In [None]:
df_test = pd.read_csv('test_orig_kaggle.csv')

## Pre-processing

The pre-processing step includes modifying the test dataset to create an adversarial one.

For the **first experiment**, *adv_list* contains the list of degree adverbs from (Flores et al., 2022).




In [None]:
adv_list = ['absolutely', 'amazingly', 'awfully', 'barely',
                'completely', 'considerably', 'decidedly', 'deeply', 
                'enormously', 'entirely', 'especially', 'exceptionally',
                'exclusively', 'extremely', 'fully', 'greatly', 'hardly',
                'hella', 'highly', 'hugely', 'incredibly', 'intensely',
                'majorly', 'overwhelmingly', 'really', 'remarkably',
                'substantially', 'thoroughly', 'totally', 'tremendously',
                'unbelievably', 'unusually', 'utterly', 'very']

The following code creates a new dataframe whose text no longer contains the adverbs in *adv_list*. Matching is done on lowercased, whitespace-separated tokens.

In [None]:
df_test['text_new'] = df_test['text'].apply(lambda s: ' '.join([w for w in s.split() if w.lower() not in adv_list]))

df_test_orig = df_test[['id','title','author','text','label']]
df_test_adv  = df_test[['id','title','author','text_new','label']].rename(columns={'text_new':'text'})
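Because the filter above compares whole whitespace-separated tokens, a token such as `very,` does not equal `very` and is kept. One possible alternative, shown here only as a sketch, is a word-boundary regular expression. The helper `remove_adverbs`, the two-word adverb list and the sample sentence are made up for illustration.

```python
import re


def remove_adverbs(text, adverbs):
    """Delete whole-word, case-insensitive matches of `adverbs` and tidy the spacing."""
    pattern = re.compile(
        r"\b(?:" + "|".join(map(re.escape, adverbs)) + r")\b", re.IGNORECASE
    )
    stripped = pattern.sub("", text)
    stripped = re.sub(r"\s+([,.;:!?])", r"\1", stripped)  # drop space left before punctuation
    return re.sub(r"\s+", " ", stripped).strip()  # collapse doubled spaces


sample = "It was very, very late and REALLY dark."
sample_adverbs = ["really", "very"]

# The notebook's token-based filter, applied to the same sentence.
token_filtered = " ".join(w for w in sample.split() if w.lower() not in sample_adverbs)
regex_filtered = remove_adverbs(sample, sample_adverbs)

print(token_filtered)  # It was very, late and dark.
print(regex_filtered)  # It was, late and dark.
```

Note that `\b` also treats a hyphen as a boundary, so a compound such as `really-long` would be reduced to `-long`. Whether that is desirable depends on how the adversarial set is meant to be built.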

Save the original and modified test datasets to Drive, then copy them to local storage.

In [None]:
df_test_orig.to_csv('/content/drive/My Drive/198-adversarial-ml/Kaggle-Fake-News/test_orig.csv', index=False)
df_test_adv.to_csv('/content/drive/My Drive/198-adversarial-ml/Kaggle-Fake-News/test_adv.csv', index=False)

In [None]:
!cp "/content/drive/My Drive/198-adversarial-ml/Kaggle-Fake-News/test_orig.csv" "test_orig_kaggle.csv"
!cp "/content/drive/My Drive/198-adversarial-ml/Kaggle-Fake-News/test_adv.csv" "test_adv_kaggle.csv"

In [None]:
ids_old = df_test_orig.text.str.contains('really$|really-|really ', flags = re.IGNORECASE, regex = True, na = False)
ids_new = df_test_adv.text.str.contains('really$|really-|really ', flags = re.IGNORECASE, regex = True, na = False)

The original test dataframe has 1204 rows containing the adverb "really".

In [None]:
df_test_orig[ids_old]

Unnamed: 0,id,title,author,text,label
1,1230,FBI Director Comey Asks President Putin: “Is A...,The European Union Times,\nAn absolutely astonishing Security Council (...,1
4,17365,"With Donald Trump in Charge, Republicans Have ...","Patrick Healy, Jonathan Martin and Maggie Habe...","Republican elected officials, donors and strat...",0
6,1294,One Season Ends and Another Begins: Baseball P...,Tyler Kepner,The baseball gods spend six months twisting th...,0
9,11806,Why ‘This Is Fine’ Is the Meme This Year Deser...,Katie Rogers,"During the Democratic National Convention, the...",0
21,5197,Jesus Comes Out of the Closet … Or Does He?,,"Wednesday, 16 November 2016 \nWhat a week it h...",1
...,...,...,...,...,...
6201,19029,Border Patrol Agent Tells Speaker Ryan the Wal...,Bob Price,"Border Patrol Agent Brandon Judd, speaking in ...",0
6208,3151,Maine Gets High Marks for Supporting Veterans,Arnaldo Rodgers,Maine Gets High Marks for Supporting \nBy ...,1
6216,8780,The Biggest Record-Breaking Supermoon In Nearl...,Dikran Arakelian (noreply@blogger.com),Share on Facebook If you only see one astronom...,1
6218,3749,Anti-Travel Ban Lawyer Leans on Argument that ...,Raheem Kassam,The lawyer representing the State of Hawaii in...,0


30 rows containing the adverb "really" remain in the modified dataframe. Occurrences with punctuation or other special characters attached to the adverb were not removed.

In [None]:
df_test_adv[ids_new]

Unnamed: 0,id,title,author,text,label
1,1230,FBI Director Comey Asks President Putin: “Is A...,The European Union Times,An astonishing Security Council (SC) report ci...,1
470,4861,Whoopi: Are the Trump Administration Values ’R...,Pam Key,"Tuesday on ABC’s “The View,” Whoopi Goldberg w...",0
817,12951,"Evan McMullin, Anti-Trump Republican, Mounts I...",Maggie Haberman,"Evan McMullin, a former C. I. A. official and ...",0
890,6873,Putin and Obama: the Trust Evaporates,Ray McGovern,How did the “growing trust” that Russian Presi...,1
922,8610,A Personal Trainer for Heartbreak - The New Yo...,Sophia Kercher,"After a traumatic breakup, Julia Scinto, a fas...",0
1193,20278,And Then There Was Trump - The New York Times,Thomas B. Edsall,How do you deal with an opponent immune to the...,0
1318,19895,SpaceX Says It’s Ready to Launch Rockets Again...,Kenneth Chang,After the explosion in September of one of its...,0
1526,6209,Thought-Control Technology Enables Quadriplegi...,Nate Church,At Case Western Reserve University in Clevelan...,0
1909,6749,"Now That Obamacare's Imploding, Trump Says He ...",Kyle Becker,Getty - Chip Somodevilla The Wildfire is an op...,1
1918,16819,Meet the Obama Holdovers Who Survived Trump’s ...,Mark Landler,WASHINGTON — When President Trump’s new Middle...,0


## Finetuning

In [None]:
# Load the pretrained model and its tokenizer
pretrained = 'bert-base-cased'
tokenizer = AutoTokenizer.from_pretrained(pretrained)
model = AutoModelForSequenceClassification.from_pretrained(pretrained)

In [62]:
data = pd.read_csv('train_kaggle.csv')

X = list(data["text"])
y = list(data["label"])
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3)
X_train_tokenized = tokenizer(X_train, padding=True, truncation=True, max_length=512)
X_val_tokenized = tokenizer(X_val, padding=True, truncation=True, max_length=512)

train_dataset = Dataset(X_train_tokenized, y_train)
val_dataset = Dataset(X_val_tokenized, y_val)

In [63]:
args = TrainingArguments(
    output_dir="output",
    evaluation_strategy="steps",
    eval_steps=500,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    seed=0,
    load_best_model_at_end=True,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
    compute_metrics=compute_metrics,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)

PyTorch: setting up devices
The default value for the training argument `--report_to` will change in v5 (from all installed integrations to none). In v5, you will need to use `--report_to all` to get the same behavior as now. You should start updating your code and make this info disappear :-).


In [None]:
trainer.train()

***** Running training *****
  Num examples = 10172
  Num Epochs = 3
  Instantaneous batch size per device = 8
  Total train batch size (w. parallel, distributed & accumulation) = 8
  Gradient Accumulation steps = 1
  Total optimization steps = 3816
  Number of trainable parameters = 108311810


Step,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
500,0.1587,0.022183,0.993807,0.995854,0.991743,0.993795
1000,0.0468,0.046961,0.985092,0.972731,0.998165,0.985284
1500,0.0239,0.033689,0.994954,0.993144,0.996789,0.994963
2000,0.0216,0.029949,0.995183,0.99405,0.99633,0.995189


***** Running Evaluation *****
  Num examples = 4360
  Batch size = 8
Saving model checkpoint to output/checkpoint-500
Configuration saved in output/checkpoint-500/config.json
Model weights saved in output/checkpoint-500/pytorch_model.bin
***** Running Evaluation *****
  Num examples = 4360
  Batch size = 8
Saving model checkpoint to output/checkpoint-1000
Configuration saved in output/checkpoint-1000/config.json
Model weights saved in output/checkpoint-1000/pytorch_model.bin
***** Running Evaluation *****
  Num examples = 4360
  Batch size = 8
Saving model checkpoint to output/checkpoint-1500
Configuration saved in output/checkpoint-1500/config.json
Model weights saved in output/checkpoint-1500/pytorch_model.bin
***** Running Evaluation *****
  Num examples = 4360
  Batch size = 8
Saving model checkpoint to output/checkpoint-2000
Configuration saved in output/checkpoint-2000/config.json
Model weights saved in output/checkpoint-2000/pytorch_model.bin


Training completed. Do not forget

TrainOutput(global_step=2000, training_loss=0.06274038648605347, metrics={'train_runtime': 2291.8226, 'train_samples_per_second': 13.315, 'train_steps_per_second': 1.665, 'total_flos': 4208724441538560.0, 'train_loss': 0.06274038648605347, 'epoch': 1.57})
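Training stopped at step 2000 of 3816 because of the early stopping callback. The sketch below is a simplified model of patience-based early stopping, not the `EarlyStoppingCallback` source. It assumes validation loss is the monitored metric, since no `metric_for_best_model` is set in the arguments above. The function name `early_stopping_step` is hypothetical.

```python
def early_stopping_step(eval_losses, patience=3):
    """Return (stop_index, best_index) for a metric where lower is better."""
    best_loss = float("inf")
    best_index = None
    bad_evals = 0  # consecutive evaluations without improvement
    for index, loss in enumerate(eval_losses):
        if loss < best_loss:
            best_loss, best_index, bad_evals = loss, index, 0
        else:
            bad_evals += 1
            if bad_evals >= patience:
                return index, best_index
    return None, best_index  # patience never ran out


eval_steps = [500, 1000, 1500, 2000]
val_losses = [0.022183, 0.046961, 0.033689, 0.029949]  # validation losses from the table above

stop_index, best_index = early_stopping_step(val_losses, patience=3)
print("stopped at step", eval_steps[stop_index])
print("best checkpoint at step", eval_steps[best_index])
```

Under these assumptions the loss at step 500 is never beaten, the three following evaluations use up the patience of 3, and the run ends at step 2000 with checkpoint-500 as the best model.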

Copy the finetuned model to Google Drive.

In [None]:
!cp -r "output" "/content/drive/My Drive/198-adversarial-ml/Kaggle-Fake-News/output"

## Evaluation

Load existing finetuned models if needed

In [64]:
!rm -r "output" #delete previously loaded model
!cp -r "/content/drive/My Drive/198-adversarial-ml/Kaggle-Fake-News/output" "output"

Use the best checkpoint, step 500, which has the lowest validation loss (0.022183) in the training log.

In [65]:
test_data = pd.read_csv("test_orig_kaggle.csv")
X_test = list(test_data["text"])
X_test_tokenized = tokenizer(X_test, padding=True, truncation=True, max_length=512)
y_test = list(test_data["label"])

test_dataset = Dataset(X_test_tokenized)

model_path = "output/checkpoint-500"
model = AutoModelForSequenceClassification.from_pretrained(model_path, num_labels=2)

test_trainer = Trainer(model)

raw_pred, _, _ = test_trainer.predict(test_dataset)
y_pred = np.argmax(raw_pred, axis=1)

accuracy = accuracy_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)

temp_results = []
temp_results.append(accuracy)
temp_results.append(recall)
temp_results.append(precision)
temp_results.append(f1)
results.append(temp_results)
print(accuracy, recall, precision, f1)

loading configuration file output/checkpoint-500/config.json
Model config BertConfig {
  "_name_or_path": "output/checkpoint-500",
  "architectures": [
    "BertForSequenceClassification"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "problem_type": "single_label_classification",
  "torch_dtype": "float32",
  "transformers_version": "4.25.1",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 28996
}

loading weights file output/checkpoint-500/pytorch_model.bin
All model checkpoint weights were used when initializing BertForSequenceClassification.

All the weights of BertForSequenc

0.9940600417402472 0.9951659684176604 0.9929260450160772 0.9940447448897473


In [67]:
test_data = pd.read_csv("test_adv_kaggle.csv")
test_data = test_data[test_data['text'].notnull()]

X_test = list(test_data["text"])
X_test_tokenized = tokenizer(X_test, padding=True, truncation=True, max_length=512)
y_test = list(test_data["label"])

test_dataset = Dataset(X_test_tokenized)

model_path = "output/checkpoint-500"
model = AutoModelForSequenceClassification.from_pretrained(model_path, num_labels=2)

test_trainer = Trainer(model)

raw_pred, _, _ = test_trainer.predict(test_dataset)
y_pred = np.argmax(raw_pred, axis=1)

accuracy = accuracy_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)

temp_results = []
temp_results.append(accuracy)
temp_results.append(recall)
temp_results.append(precision)
temp_results.append(f1)
results.append(temp_results)

print(accuracy, recall, precision, f1)

0.9917741935483871 0.9928034020281321 0.9905352480417755 0.9916680280999838


# Fake News Filipino Dataset

The provided dataset contains around 3,000 news articles in Filipino, evenly split between real and fake news.

In [None]:
df = pd.read_csv('full.csv')

In [None]:
df = df[df['article'].notnull()]

In [None]:
train, test_orig = train_test_split(df, test_size=0.3)
train.to_csv('/content/drive/My Drive/198-adversarial-ml/Fake-News-Filipino/train_orig.csv', index=False)
test_orig.to_csv('/content/drive/My Drive/198-adversarial-ml/Fake-News-Filipino/test_orig.csv', index=False)

!cp "/content/drive/My Drive/198-adversarial-ml/Fake-News-Filipino/train_orig.csv" "train_fil.csv"
!cp "/content/drive/My Drive/198-adversarial-ml/Fake-News-Filipino/test_orig.csv" "test_orig_fil.csv"

In [None]:
df_test = pd.read_csv('test_orig_fil.csv')

In [None]:
print(len(df_test))

962


## Pre-processing

For the **first experiment**, *adv_list* contains a list of degree adverbs commonly used in Filipino.

In [None]:
adv_list = ['masyado', 'medyo', 'tunay', 'kaagad', 'lubos', 'parang', 'bahagya', 'halos', 'lubhang', 'labis',
            'lalong', 'higit', 'talaga', 'totoo', 'pa rin', 'mabuti', 'mahirap', 'kamakailan', 'madalang', 'minsan']

The following code creates a new dataframe whose articles no longer contain the adverbs in *adv_list*. Matching is done on lowercased, whitespace-separated tokens.

In [None]:
df_test['article_new'] = df_test['article'].apply(lambda s: ' '.join([w for w in s.split() if w.lower() not in adv_list]))

df_test_orig = df_test[['article','label']]
df_test_adv  = df_test[['article_new','label']].rename(columns={'article_new':'article'})
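One entry in *adv_list*, `'pa rin'`, is two words long, so a filter that compares one whitespace-separated token at a time cannot match it. The sketch below shows one way to handle multi-word entries by comparing token windows. The helper `remove_phrases` and the sample sentence are made up for illustration and are not part of the notebook.

```python
def remove_phrases(text, phrases):
    """Drop every occurrence of the given single- or multi-word phrases from `text`."""
    # Longest phrases first, so a multi-word entry is tried before a shorter one.
    phrase_tokens = sorted(
        (tuple(p.lower().split()) for p in phrases if p.split()), key=len, reverse=True
    )
    tokens = text.split()
    kept, i = [], 0
    while i < len(tokens):
        for phrase in phrase_tokens:
            window = tuple(t.lower() for t in tokens[i:i + len(phrase)])
            if window == phrase:
                i += len(phrase)  # skip the whole matched phrase
                break
        else:
            kept.append(tokens[i])  # no phrase starts here, keep the token
            i += 1
    return " ".join(kept)


sample_list = ["pa rin", "kaagad"]
sample = "Hindi pa rin siya dumating kaagad"

# The notebook's token-based filter, applied to the same sentence.
token_filtered = " ".join(w for w in sample.split() if w.lower() not in sample_list)
phrase_filtered = remove_phrases(sample, sample_list)

print(token_filtered)   # Hindi pa rin siya dumating
print(phrase_filtered)  # Hindi siya dumating
```

Like the original filter, this sketch still compares bare tokens, so an entry followed directly by punctuation is left in place.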

Save the original and modified test datasets to Drive, then copy them to local storage.

In [None]:
df_test_orig.to_csv('/content/drive/My Drive/198-adversarial-ml/Fake-News-Filipino/test_orig.csv', index=False)
df_test_adv.to_csv('/content/drive/My Drive/198-adversarial-ml/Fake-News-Filipino/test_adv.csv', index=False)

In [None]:
!cp "/content/drive/My Drive/198-adversarial-ml/Fake-News-Filipino/test_orig.csv" "test_orig_fil.csv"
!cp "/content/drive/My Drive/198-adversarial-ml/Fake-News-Filipino/test_adv.csv" "test_adv_fil.csv"

In [None]:
ids_old = df_test_orig.article.str.contains('kaagad$|kaagad-|kaagad ', flags = re.IGNORECASE, regex = True, na = False)
ids_new = df_test_adv.article.str.contains('kaagad$|kaagad-|kaagad ', flags = re.IGNORECASE, regex = True, na = False)

In [None]:
df_test_orig[ids_old]

Unnamed: 0,article,label
68,Hindi nakatakas sa mga awtoridad ng Light Rail...,1
261,"Ayon kay SPO1 Jaycee Calma, may hawak ng kaso,...",0
517,Matapos umugong ang balita kaugnay ng naging r...,1
527,Kinumpirma ng PNP na binawi nito ang dalawang ...,1
586,"MAYNILA, Pilipinas - Pinangunahan ni Pangulong...",1
673,Huli sa isinagawang entrapment operation ng Ph...,1
899,Isang magsasaka sa Urdaneta City sa Pangasinan...,1
957,17 YEARS OLD NA HIGH SCHOOL STUDENT SA CALOOCA...,1


In [None]:
df_test_adv[ids_new]

Unnamed: 0,article,label
673,Huli sa isinagawang entrapment operation ng Ph...,1


## Finetuning

The pretrained model is finetuned on the training split and later evaluated on both the original and the modified test datasets. The pretrained model, *bert-tagalog-base-cased*, was trained using the WikiText-TL-39 dataset, a corpus of 172,815 articles in Tagalog.

In [None]:
pretrained = 'jcblaise/bert-tagalog-base-cased'
tokenizer = AutoTokenizer.from_pretrained(pretrained)
model = AutoModelForSequenceClassification.from_pretrained(pretrained)

In [72]:
data = pd.read_csv('train_fil.csv')

X = list(data["article"])
y = list(data["label"])
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3)
X_train_tokenized = tokenizer(X_train, padding=True, truncation=True, max_length=512)
X_val_tokenized = tokenizer(X_val, padding=True, truncation=True, max_length=512)

train_dataset = Dataset(X_train_tokenized, y_train)
val_dataset = Dataset(X_val_tokenized, y_val)

print(len(data))

2244


In [None]:
args = TrainingArguments(
    output_dir="output",
    evaluation_strategy="steps",
    eval_steps=500,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    seed=0,
    load_best_model_at_end=True,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
    compute_metrics=compute_metrics,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)

In [None]:
trainer.train()

***** Running training *****
  Num examples = 1570
  Num Epochs = 3
  Instantaneous batch size per device = 8
  Total train batch size (w. parallel, distributed & accumulation) = 8
  Gradient Accumulation steps = 1
  Total optimization steps = 591
  Number of trainable parameters = 109160450


Step,Training Loss,Validation Loss,Accuracy,Precision,Recall,F1
500,0.167,0.333966,0.939169,0.921569,0.961988,0.941345


***** Running Evaluation *****
  Num examples = 674
  Batch size = 8
Saving model checkpoint to output/checkpoint-500
Configuration saved in output/checkpoint-500/config.json
Model weights saved in output/checkpoint-500/pytorch_model.bin


Training completed. Do not forget to share your model on huggingface.co/models =)


Loading best model from output/checkpoint-500 (score: 0.3339659869670868).


TrainOutput(global_step=591, training_loss=0.14444741440303435, metrics={'train_runtime': 504.6572, 'train_samples_per_second': 9.333, 'train_steps_per_second': 1.171, 'total_flos': 1239253070745600.0, 'train_loss': 0.14444741440303435, 'epoch': 3.0})
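The "Total optimization steps" value in the log follows from the number of training examples, the batch size and the number of epochs. The sketch below reproduces that arithmetic, assuming a single device and no gradient accumulation, as the log reports. The helper name `total_optimization_steps` is hypothetical.

```python
import math


def total_optimization_steps(num_examples, batch_size, num_epochs):
    """Batches per epoch (the last partial batch counts) times the number of epochs."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * num_epochs


# Filipino run: 1570 training examples, batch size 8, 3 epochs.
filipino_steps = total_optimization_steps(num_examples=1570, batch_size=8, num_epochs=3)

# Kaggle run: 10172 training examples with the same settings.
kaggle_steps = total_optimization_steps(num_examples=10172, batch_size=8, num_epochs=3)

# With eval_steps=500, the number of evaluations that fit into the Filipino run.
filipino_evaluations = filipino_steps // 500

print(filipino_steps, kaggle_steps, filipino_evaluations)
```

With only 591 steps, the Filipino run is evaluated once, at step 500, so the early stopping patience of 3 cannot be exhausted and checkpoint-500 is the only candidate for the best model.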

Copy the finetuned model to Google Drive.

In [None]:
!cp -r "output" "/content/drive/My Drive/198-adversarial-ml/Fake-News-Filipino/output"

## Evaluation

Load finetuned models if needed

In [74]:
!rm -r "output" #delete previously loaded model
!cp -r "/content/drive/My Drive/198-adversarial-ml/Fake-News-Filipino/output" "output"

Use the best model, step = 500.

In [75]:
test_data = pd.read_csv("test_orig_fil.csv")

X_test = list(test_data["article"])
X_test_tokenized = tokenizer(X_test, padding=True, truncation=True, max_length=512)
y_test = list(test_data["label"])

test_dataset = Dataset(X_test_tokenized)

model_path = "output/checkpoint-500"
model = AutoModelForSequenceClassification.from_pretrained(model_path, num_labels=2)

test_trainer = Trainer(model)

raw_pred, _, _ = test_trainer.predict(test_dataset)
y_pred = np.argmax(raw_pred, axis=1)

accuracy = accuracy_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)

temp_results = []
temp_results.append(accuracy)
temp_results.append(recall)
temp_results.append(precision)
temp_results.append(f1)
results.append(temp_results)

print(accuracy, recall, precision, f1)

loading configuration file output/checkpoint-500/config.json
Model config BertConfig {
  "_name_or_path": "output/checkpoint-500",
  "architectures": [
    "BertForSequenceClassification"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "directionality": "bidi",
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "pooler_fc_size": 768,
  "pooler_num_attention_heads": 12,
  "pooler_num_fc_layers": 3,
  "pooler_size_per_head": 128,
  "pooler_type": "first_token_transform",
  "position_embedding_type": "absolute",
  "problem_type": "single_label_classification",
  "torch_dtype": "float32",
  "transformers_version": "4.25.1",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30101
}

loading weights file output

0.9397089397089398 0.9518828451882845 0.9285714285714286 0.9400826446280992


In [77]:
test_data = pd.read_csv("test_adv_fil.csv")
X_test = list(test_data["article"])
X_test_tokenized = tokenizer(X_test, padding=True, truncation=True, max_length=512)
y_test = list(test_data["label"])

test_dataset = Dataset(X_test_tokenized)

model_path = "output/checkpoint-500"
model = AutoModelForSequenceClassification.from_pretrained(model_path, num_labels=2)

test_trainer = Trainer(model)

raw_pred, _, _ = test_trainer.predict(test_dataset)

y_pred = np.argmax(raw_pred, axis=1)

accuracy = accuracy_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)

temp_results = []
temp_results.append(accuracy)
temp_results.append(recall)
temp_results.append(precision)
temp_results.append(f1)
results.append(temp_results)

print(accuracy, recall, precision, f1)

0.9355509355509356 0.9518828451882845 0.9210526315789473 0.9362139917695472
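To compare the original and adversarial runs directly, the printed scores can be subtracted metric by metric. The sketch below copies the four result lines printed in this notebook, in the order accuracy, recall, precision, F1. The names `scores` and `metric_drop` are illustrative only.

```python
metric_names = ["accuracy", "recall", "precision", "f1"]

# Values copied from the evaluation outputs printed in this notebook.
scores = {
    "kaggle_orig": [0.9940600417402472, 0.9951659684176604, 0.9929260450160772, 0.9940447448897473],
    "kaggle_adv": [0.9917741935483871, 0.9928034020281321, 0.9905352480417755, 0.9916680280999838],
    "fil_orig": [0.9397089397089398, 0.9518828451882845, 0.9285714285714286, 0.9400826446280992],
    "fil_adv": [0.9355509355509356, 0.9518828451882845, 0.9210526315789473, 0.9362139917695472],
}


def metric_drop(original, adversarial):
    """Original score minus adversarial score for each metric."""
    return {name: o - a for name, o, a in zip(metric_names, original, adversarial)}


kaggle_drop = metric_drop(scores["kaggle_orig"], scores["kaggle_adv"])
fil_drop = metric_drop(scores["fil_orig"], scores["fil_adv"])

for name in metric_names:
    print(f"{name:9s}  Kaggle: {kaggle_drop[name]:+.4f}  Filipino: {fil_drop[name]:+.4f}")
```

Every difference is below 0.01, and recall on the Filipino test set is unchanged. The Kaggle adversarial set was filtered for null text before evaluation, so the two Kaggle runs may not cover exactly the same rows.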


# Visualization of Results

In [79]:
import plotly.graph_objects as go

res_t = np.array(results).T.tolist()

fig = go.Figure(data=[go.Table(
    header=dict(values=['Test Set','Accuracy', 'Recall', 'Precision', 'F1-Score'],
                line_color='darkslategray',
                fill_color='lightskyblue',
                align='left'),
    cells=dict(values=[['Kaggle Fake News (Original)', 'Kaggle Fake News (Adversarial)', 'Fake News Filipino (Original)', 'Fake News Filipino (Adversarial)'],
                       res_t[0],
                       res_t[1],
                       res_t[2],
                       res_t[3]],
               line_color='darkslategray',
               fill_color='lightcyan',
               align='left'))
])

fig.update_layout(width=1000, height=500)
fig.show()

# Attributions


1.   [An Adversarial Benchmark for Fake News Detection Models](https://github.com/ljyflores/fake-news-adversarial-benchmark/blob/master/polarity_preprocessing.ipynb)
2.   [Fine-tuning pretrained NLP models with Huggingface’s Trainer](https://towardsdatascience.com/fine-tuning-pretrained-nlp-models-with-huggingfaces-trainer-6326a4456e7b)