In [None]:
import pandas as pd

from fake_news_classifier import FakeNewsClassifier
from pipelines.data_preparation_pipeline import DataPreparationPipeline

# Load Data


In [2]:
data_preparation_pipeline = DataPreparationPipeline(
    "configs/pipelines_config/data_preparation_config.json"
)
train_data, test_data, val_data = data_preparation_pipeline.run()

#################### Data Preparation Pipeline  ####################
===== Loading data... =====
===== Applying label mapping... =====
===== Splitting data... =====
Data preparation pipeline completed.
##################################################




# Apply Reverse Label Mapping


In [3]:
test_df = pd.DataFrame(test_data, columns=["content", "label"])

In [4]:
label_mapping = {
    "reliable": 0,
    "bias": 1,
    "conspiracy": 2,
    "fake": 3,
    "rumor": 4,
    "unreliable": 5,
    "other": 6,
}

In [5]:
reverse_label_mapping = {v: k for k, v in label_mapping.items()}

In [6]:
test_df["label"] = test_df["label"].map(reverse_label_mapping)
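
`Series.map` with a dict returns NaN for any value that is not a key of the dict, so a label id missing from `reverse_label_mapping` would silently become NaN. A minimal sketch of a guard against that, using a small hypothetical DataFrame (`demo_df`) in place of the real test split:

```python
import pandas as pd

label_mapping = {
    "reliable": 0, "bias": 1, "conspiracy": 2, "fake": 3,
    "rumor": 4, "unreliable": 5, "other": 6,
}
reverse_label_mapping = {v: k for k, v in label_mapping.items()}

# Hypothetical stand-in for the test split; the id 9 is deliberately unknown.
demo_df = pd.DataFrame({"content": ["a", "b", "c"], "label": [0, 3, 9]})

# Ids without an entry in the dict come back as NaN after map().
mapped = demo_df["label"].map(reverse_label_mapping)

# Collect the original ids that could not be mapped.
unmapped_ids = sorted(demo_df.loc[mapped.isna(), "label"].unique().tolist())
print(unmapped_ids)
```

An empty list here means every id was translated to a label name.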

# Load Model


In [7]:
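# The second argument is the number of distinct labels in the test split
# (presumably the class count), so this assumes every class in label_mapping
# appears in test_df at least once.
classifier = FakeNewsClassifier(
    "configs/classifier_config.json", test_df["label"].nunique()
)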

In [8]:
model = classifier.load_pretrained("models/bert-bilstm-v9.pth")

{'accuracy': 0.971296659262163, 'precision': np.float64(0.9713146285670903), 'recall': np.float64(0.971296659262163), 'f1_score': np.float64(0.9713005984735034)}
Model loaded successfully from models/bert-bilstm-v9.pth


# Inference


### Reliable Article


In [9]:
reliable_article = test_df[test_df["label"] == "reliable"].iloc[3]

In [10]:
print(reliable_article["content"])

Omnisport [SEP] View photos
Though he is taking a relaxed approach to the last year of his career, Usain Bolt expects to be at his best at the 2017 world championships More Usain Bolt insists he will be at the top of his game when he brings the curtain down on his glittering career at next year's World Championships.
The legendary Jamaican sprinter completed the triple-triple at Rio 2016, taking his tally of Olympic gold medals to nine, and is set to retire after the World Championships in London in August.
Bolt says he is feeling less pressure heading into the final year of his career and has changed his approach to his training programme, but that has not altered what he expects of himself.
Speaking at a media conference ahead of the premiere of the documentary film 'I Aam Bolt', he was asked if he had any fear that the likes of Andre De Grasse and Justin Gatlin could defeat him.
"I'm not worried. I am always going to be prepared, I still never want to lose, we're just trying to find

In [11]:
# Wrap the single article in a one-element Series; prediction[0] is its class id.
prediction = model.predict(pd.Series([reliable_article["content"]]))

Testing: 100%|██████████| 1/1 [00:10<00:00, 10.44s/it]


In [12]:
print(reverse_label_mapping[prediction[0]])

reliable


### Fake Article


In [13]:
fake_article = test_df[test_df["label"] == "fake"].iloc[0]

In [14]:
print(fake_article["content"])

The Real Revo [SEP] (Before It's News)

RINOs take phones off the hook.

Following the epic, 21-hour speech by Sen. Ted Cruz, R-Texas, supporting the defunding of Obamacare, either voters made so many calls to establishment Republicans that their phone lines melted, or those GOP leaders took their phones off the hook.

Even in this age of digital wizardry and limitless voicemail, callers could not get through at all to Sen. Minority Leader Mitch McConnell, R-Ky.

A message said the senator was experiencing a high volume of calls and directed members of the public to call back later or visit his website.

It was the same story with the man who was the face of the GOP in the 2008 elections, former GOP presidential candidate Sen. John McCain, R-Ariz.

His phone was off the hook, too. Callers got a message stating his voicemail box was full.


In [15]:
# Wrap the single article in a one-element Series; prediction[0] is its class id.
prediction = model.predict(pd.Series([fake_article["content"]]))

Testing: 100%|██████████| 1/1 [00:09<00:00,  9.99s/it]


In [16]:
print(reverse_label_mapping[prediction[0]])

fake
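
The two inference examples above repeat the same predict-then-translate steps. A minimal sketch of a helper that wraps them, assuming (as the cells above suggest) that `model.predict` accepts a pandas Series of strings and returns one integer class id per row. `predict_labels`, `StubModel`, and `demo_mapping` are hypothetical names introduced for illustration; the stub only stands in for the real classifier so the sketch runs without the checkpoint.

```python
import pandas as pd


def predict_labels(model, texts, reverse_label_mapping):
    """Return a readable label for each text in `texts`.

    Assumes model.predict takes a pandas Series of strings and
    returns one integer class id per row.
    """
    class_ids = model.predict(pd.Series(list(texts)))
    return [reverse_label_mapping[int(class_id)] for class_id in class_ids]


class StubModel:
    """Hypothetical stand-in for the loaded classifier."""

    def predict(self, series):
        # Toy rule, not the real model: flag texts containing "RINO".
        return [3 if "RINO" in text else 0 for text in series]


demo_mapping = {0: "reliable", 3: "fake"}
labels = predict_labels(
    StubModel(),
    ["Usain Bolt expects to be at his best", "RINOs take phones off the hook."],
    demo_mapping,
)
print(labels)
```

With the real model, `predict_labels(model, [reliable_article["content"], fake_article["content"]], reverse_label_mapping)` would send both articles in one call. Whether that is faster than two separate calls depends on how `predict` batches its input, which the notebook does not show.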
