<a href="https://colab.research.google.com/github/PeteCrowley/MachineLearning/blob/main/FakeNewsDetection.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [13]:
import pandas
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import PassiveAggressiveClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix
import tensorflow as tf


# Dataset from here: https://www.kaggle.com/c/fake-news/data?select=train.csv
df = pandas.read_csv("train.csv")
clean_df = df.dropna()

# Only using rows with id < 3000 because my RAM can't handle more data
# (the mask is built from clean_df itself so its index matches the rows being filtered)
subset = clean_df[clean_df['id'] < 3000]
X = np.array(subset.text)
Y = np.array(subset.label)

# Splitting into train and test data
train_X, test_X, train_Y, test_Y = train_test_split(X, Y, test_size=0.2, random_state=7)




In [15]:
# Converting the raw article text into TF-IDF vectors, which score how important each word is to each article
vectorizer = TfidfVectorizer(stop_words='english', max_df=0.7)

tfidf_train = vectorizer.fit_transform(train_X)
tfidf_test = vectorizer.transform(test_X)
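
To make the TF-IDF step less of a black box, here is a minimal pure-Python sketch of the weighting idea. It assumes scikit-learn's default smoothed formula, idf = ln((1 + n) / (1 + df)) + 1, followed by L2 normalisation. The `tfidf` helper and the three sample headlines are hypothetical, and the sketch splits on whitespace only, so it ignores the `stop_words` and `max_df` settings used above and will not reproduce `TfidfVectorizer` output exactly.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per document, L2-normalised."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents each term appears in
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Term count times smoothed idf: ln((1 + n) / (1 + df)) + 1
        weights = {
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        }
        # Scale each document vector to unit length
        norm = math.sqrt(sum(w * w for w in weights.values()))
        vectors.append({term: w / norm for term, w in weights.items()})
    return vectors

# Hypothetical sample headlines, not taken from the dataset
docs = [
    "senate passes budget bill",
    "shocking secret budget exposed",
    "senate vote delayed",
]
vectors = tfidf(docs)
print(vectors[1])
```

In the second headline, "budget" also appears in the first one, so it gets a lower weight than "shocking", which is unique to that document.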



In [16]:
# A more lightweight model that runs quicker and is less RAM intensive
model = PassiveAggressiveClassifier(max_iter=50)
model.fit(tfidf_train, train_Y)

y_pred = model.predict(tfidf_test)
score = accuracy_score(test_Y, y_pred)
print(f'Accuracy: {round(score*100,2)}%')



Accuracy: 91.6%
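
Accuracy alone does not show whether the mistakes are real articles flagged as fake or fake articles that slip through. The notebook imports `confusion_matrix` but never calls it, so here is a small hand-rolled sketch of the same counts. It assumes label 1 means FAKE and 0 means REAL, as in the prediction cell. The `confusion_counts` helper and the label lists are hypothetical and are not results from this model.

```python
def confusion_counts(y_true, y_pred):
    """Count the four outcomes for binary labels where 1 = FAKE and 0 = REAL."""
    counts = {"true_real": 0, "false_fake": 0, "false_real": 0, "true_fake": 0}
    for actual, predicted in zip(y_true, y_pred):
        if actual == 0 and predicted == 0:
            counts["true_real"] += 1    # real article, predicted real
        elif actual == 0 and predicted == 1:
            counts["false_fake"] += 1   # real article, wrongly flagged as fake
        elif actual == 1 and predicted == 0:
            counts["false_real"] += 1   # fake article, missed by the model
        else:
            counts["true_fake"] += 1    # fake article, predicted fake
    return counts

# Hypothetical labels for illustration only
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 0, 1, 0]

counts = confusion_counts(y_true, y_pred)
accuracy = (counts["true_real"] + counts["true_fake"]) / len(y_true)
print(counts)
print(f"Accuracy: {round(accuracy * 100, 2)}%")
```

On the real data, the same breakdown should be available from `confusion_matrix(test_Y, y_pred)`.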


In [17]:
# Prediction program using the lightweight model
text = [input("Enter Article: ")]
tfidf_text = vectorizer.transform(text)
choice = model.predict(tfidf_text)
if choice[0] == 0:
  print("REAL")
else:
  print("FAKE")

Enter Article: Dr. Anthony Fauci and Hillary Clinton appeared to reunite like two long-lost lovers during Colin Powell’s funeral on Friday.  A photo taken at the event shows the National Institute of Allergy and Infectious Diseases director and former secretary of state beaming at each other and sharing a tender embrace before the funeral service.   Embed from Getty Images One wonders what they could be talking about: perhaps Fauci’s attempts to cover up his funding of gain of function research in China, or possibly how Clinton managed to get away with destabilizing Donald Trump’s presidency with a fake Russia collusion narrative.  Embed from Getty Images In any case, they certainly appear delighted to share in each other’s company for the solemn occasion.  Powell, the former secretary of state who lied about weapons of mass destruction to fool America into invading Iraq, died Oct. 18. of COVID complications despite receiving the shot.
FAKE


In [18]:
# Converts the sparse TF-IDF matrices into dense NumPy arrays that can be fed into the neural network
tfidf_train_new = tfidf_train.toarray()
tfidf_test_new = tfidf_test.toarray()


# Setting Constants
EPOCHS = 10
NUM_CLASSES = 1  # A single sigmoid output unit covers the binary REAL/FAKE decision
INPUT_SHAPE = (tfidf_train.shape[1], )


# Creating network with specific layers
model = Sequential()

model.add(Flatten(input_shape=INPUT_SHAPE))
model.add(Dropout(rate=0.2))
model.add(Dense(units=32, activation='relu'))
model.add(Dropout(rate=0.2))
model.add(Dense(units=16, activation='relu'))
model.add(Dense(units=NUM_CLASSES, activation='sigmoid'))

# Compiling and fitting the neural network
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(tfidf_train_new, train_Y, epochs=EPOCHS, shuffle=True)

# Printing Info about the network's accuracy
scores = model.evaluate(tfidf_test_new, test_Y, verbose=0)
print('Test accuracy:', scores[1])
model.save("Fake_News_Model")

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy: 0.9007633328437805
INFO:tensorflow:Assets written to: Fake_News_Model/assets
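
The network ends in one sigmoid unit and is trained with `binary_crossentropy`, so its output can be read as an estimated probability that an article is FAKE (label 1). As a rough illustration of what that loss measures for a single example, here is a pure-Python sketch. The `sigmoid` and `binary_cross_entropy` helpers and the probabilities fed to them are hypothetical and are not taken from the trained model.

```python
import math

def sigmoid(z):
    """Squash a raw score into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(y_true, p_fake, eps=1e-7):
    """Loss for one example, where p_fake is the predicted probability of label 1."""
    # Clip the probability so log(0) never occurs
    p = min(max(p_fake, eps), 1 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

# A confident correct prediction is penalised far less than a confident wrong one
print(binary_cross_entropy(1, 0.9))
print(binary_cross_entropy(1, 0.1))
```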


In [19]:
from tensorflow.keras.models import load_model

model = load_model('Fake_News_Model')
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])


In [20]:
# Prediction for a given article
text = [input("Enter Article: ")]

# The sigmoid output is the model's estimated probability that the article is FAKE (label 1)
fake_prob = model.predict(vectorizer.transform(text).toarray())[0][0]

if fake_prob > 0.5:
  print(f'There is a {fake_prob*100:.2f}% chance this is FAKE news!')
else:
  print(f'There is a {100 - fake_prob*100:.2f}% chance this is REAL news!')

Enter Article: HOUSTON — Concert organizers and Houston city officials knew that the crowd at a music festival planned by Travis Scott, a favorite local rapper turned megastar, could be difficult to control. That’s what happened two years earlier, the last time Mr. Scott held his Astroworld Festival.  For months, they braced themselves, adding dozens more officers from the Houston Police Department and more private security hired by Live Nation, the concert organizer.  The Houston police chief, who knows Mr. Scott personally and felt the musician had been trying to do good for his hometown, said that he visited Mr. Scott in his trailer before his show on Friday and conveyed concerns about the energy in the crowd, according to a person with knowledge of the chief’s account.  His show that day included very devoted fans, the chief, Troy Finner, said he relayed to Mr. Scott.  That energy had been building for months among fans who were drawn to see Mr. Scott because of that atmosphere and