
Author  : Shashi Bhushan

Mail    : shashirwryogi@gmail.com

Project : Fake News Detection

Dataset : https://www.kaggle.com/datasets/jainpooja/fake-news-detection

# Importing the Dependencies

In [1]:
import pandas as pd 
import numpy as np
import re
import string
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.ensemble import RandomForestClassifier

# Loading Datasets

In [2]:
# Raw strings keep the Windows backslashes from being read as escape sequences.
df_fake = pd.read_csv(r"D:\Datasets\Fake.csv")
df_true = pd.read_csv(r"D:\Datasets\True.csv")

In [3]:
df_fake

Unnamed: 0,title,text,subject,date
0,Donald Trump Sends Out Embarrassing New Year’...,Donald Trump just couldn t wish all Americans ...,News,"December 31, 2017"
1,Drunk Bragging Trump Staffer Started Russian ...,House Intelligence Committee Chairman Devin Nu...,News,"December 31, 2017"
2,Sheriff David Clarke Becomes An Internet Joke...,"On Friday, it was revealed that former Milwauk...",News,"December 30, 2017"
3,Trump Is So Obsessed He Even Has Obama’s Name...,"On Christmas day, Donald Trump announced that ...",News,"December 29, 2017"
4,Pope Francis Just Called Out Donald Trump Dur...,Pope Francis used his annual Christmas Day mes...,News,"December 25, 2017"
...,...,...,...,...
23476,McPain: John McCain Furious That Iran Treated ...,21st Century Wire says As 21WIRE reported earl...,Middle-east,"January 16, 2016"
23477,JUSTICE? Yahoo Settles E-mail Privacy Class-ac...,21st Century Wire says It s a familiar theme. ...,Middle-east,"January 16, 2016"
23478,Sunnistan: US and Allied ‘Safe Zone’ Plan to T...,Patrick Henningsen 21st Century WireRemember ...,Middle-east,"January 15, 2016"
23479,How to Blow $700 Million: Al Jazeera America F...,21st Century Wire says Al Jazeera America will...,Middle-east,"January 14, 2016"


In [4]:
df_true

Unnamed: 0,title,text,subject,date
0,"As U.S. budget fight looms, Republicans flip t...",WASHINGTON (Reuters) - The head of a conservat...,politicsNews,"December 31, 2017"
1,U.S. military to accept transgender recruits o...,WASHINGTON (Reuters) - Transgender people will...,politicsNews,"December 29, 2017"
2,Senior U.S. Republican senator: 'Let Mr. Muell...,WASHINGTON (Reuters) - The special counsel inv...,politicsNews,"December 31, 2017"
3,FBI Russia probe helped by Australian diplomat...,WASHINGTON (Reuters) - Trump campaign adviser ...,politicsNews,"December 30, 2017"
4,Trump wants Postal Service to charge 'much mor...,SEATTLE/WASHINGTON (Reuters) - President Donal...,politicsNews,"December 29, 2017"
...,...,...,...,...
21412,'Fully committed' NATO backs new U.S. approach...,BRUSSELS (Reuters) - NATO allies on Tuesday we...,worldnews,"August 22, 2017"
21413,LexisNexis withdrew two products from Chinese ...,"LONDON (Reuters) - LexisNexis, a provider of l...",worldnews,"August 22, 2017"
21414,Minsk cultural hub becomes haven from authorities,MINSK (Reuters) - In the shadow of disused Sov...,worldnews,"August 22, 2017"
21415,Vatican upbeat on possibility of Pope Francis ...,MOSCOW (Reuters) - Vatican Secretary of State ...,worldnews,"August 22, 2017"


# Data Preprocessing

In [5]:
# shape of the datasets
df_fake.shape, df_true.shape

((23481, 4), (21417, 4))

In [6]:
# Create a new column named 'class' that labels each text: 0 = fake, 1 = true.
df_fake['class']=0
df_true['class']=1

In [7]:
df_fake

Unnamed: 0,title,text,subject,date,class
0,Donald Trump Sends Out Embarrassing New Year’...,Donald Trump just couldn t wish all Americans ...,News,"December 31, 2017",0
1,Drunk Bragging Trump Staffer Started Russian ...,House Intelligence Committee Chairman Devin Nu...,News,"December 31, 2017",0
2,Sheriff David Clarke Becomes An Internet Joke...,"On Friday, it was revealed that former Milwauk...",News,"December 30, 2017",0
3,Trump Is So Obsessed He Even Has Obama’s Name...,"On Christmas day, Donald Trump announced that ...",News,"December 29, 2017",0
4,Pope Francis Just Called Out Donald Trump Dur...,Pope Francis used his annual Christmas Day mes...,News,"December 25, 2017",0
...,...,...,...,...,...
23476,McPain: John McCain Furious That Iran Treated ...,21st Century Wire says As 21WIRE reported earl...,Middle-east,"January 16, 2016",0
23477,JUSTICE? Yahoo Settles E-mail Privacy Class-ac...,21st Century Wire says It s a familiar theme. ...,Middle-east,"January 16, 2016",0
23478,Sunnistan: US and Allied ‘Safe Zone’ Plan to T...,Patrick Henningsen 21st Century WireRemember ...,Middle-east,"January 15, 2016",0
23479,How to Blow $700 Million: Al Jazeera America F...,21st Century Wire says Al Jazeera America will...,Middle-east,"January 14, 2016",0


In [8]:
df_true

Unnamed: 0,title,text,subject,date,class
0,"As U.S. budget fight looms, Republicans flip t...",WASHINGTON (Reuters) - The head of a conservat...,politicsNews,"December 31, 2017",1
1,U.S. military to accept transgender recruits o...,WASHINGTON (Reuters) - Transgender people will...,politicsNews,"December 29, 2017",1
2,Senior U.S. Republican senator: 'Let Mr. Muell...,WASHINGTON (Reuters) - The special counsel inv...,politicsNews,"December 31, 2017",1
3,FBI Russia probe helped by Australian diplomat...,WASHINGTON (Reuters) - Trump campaign adviser ...,politicsNews,"December 30, 2017",1
4,Trump wants Postal Service to charge 'much mor...,SEATTLE/WASHINGTON (Reuters) - President Donal...,politicsNews,"December 29, 2017",1
...,...,...,...,...,...
21412,'Fully committed' NATO backs new U.S. approach...,BRUSSELS (Reuters) - NATO allies on Tuesday we...,worldnews,"August 22, 2017",1
21413,LexisNexis withdrew two products from Chinese ...,"LONDON (Reuters) - LexisNexis, a provider of l...",worldnews,"August 22, 2017",1
21414,Minsk cultural hub becomes haven from authorities,MINSK (Reuters) - In the shadow of disused Sov...,worldnews,"August 22, 2017",1
21415,Vatican upbeat on possibility of Pope Francis ...,MOSCOW (Reuters) - Vatican Secretary of State ...,worldnews,"August 22, 2017",1


In [9]:
# Hold out the last 10 rows of each dataset for manual testing.

df_fake_test = df_fake.tail(10)
df_fake = df_fake.iloc[:-10].copy()
df_true_test = df_true.tail(10)
df_true = df_true.iloc[:-10].copy()

In [10]:
# Save the held-out rows as a test file, then merge the remaining rows into one dataset.

df_test = pd.concat([df_fake_test,df_true_test],axis=0)
df_test.to_csv('Test.csv')
df_merge = pd.concat([df_fake,df_true],axis=0)

In [11]:
df_merge

Unnamed: 0,title,text,subject,date,class
0,Donald Trump Sends Out Embarrassing New Year’...,Donald Trump just couldn t wish all Americans ...,News,"December 31, 2017",0
1,Drunk Bragging Trump Staffer Started Russian ...,House Intelligence Committee Chairman Devin Nu...,News,"December 31, 2017",0
2,Sheriff David Clarke Becomes An Internet Joke...,"On Friday, it was revealed that former Milwauk...",News,"December 30, 2017",0
3,Trump Is So Obsessed He Even Has Obama’s Name...,"On Christmas day, Donald Trump announced that ...",News,"December 29, 2017",0
4,Pope Francis Just Called Out Donald Trump Dur...,Pope Francis used his annual Christmas Day mes...,News,"December 25, 2017",0
...,...,...,...,...,...
21402,Exclusive: Trump's Afghan decision may increas...,ON BOARD A U.S. MILITARY AIRCRAFT (Reuters) - ...,worldnews,"August 22, 2017",1
21403,U.S. puts more pressure on Pakistan to help wi...,WASHINGTON (Reuters) - The United States sugge...,worldnews,"August 21, 2017",1
21404,Exclusive: U.S. to withhold up to $290 million...,WASHINGTON (Reuters) - The United States has d...,worldnews,"August 22, 2017",1
21405,Trump talks tough on Pakistan's 'terrorist' ha...,ISLAMABAD (Reuters) - Outlining a new strategy...,worldnews,"August 22, 2017",1


In [12]:
# Remove unwanted columns.

df = df_merge.drop(columns=['title','subject','date'])
df.head()

Unnamed: 0,text,class
0,Donald Trump just couldn t wish all Americans ...,0
1,House Intelligence Committee Chairman Devin Nu...,0
2,"On Friday, it was revealed that former Milwauk...",0
3,"On Christmas day, Donald Trump announced that ...",0
4,Pope Francis used his annual Christmas Day mes...,0


In [13]:
# Check for null values.

df.isnull().sum()

text     0
class    0
dtype: int64

In [14]:
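# Build the text-cleaning function.

def word_drop(text):
    text = text.lower()
    # Remove square-bracketed spans such as "[video]".
    text = re.sub(r'\[.*?\]', '', text)
    # Replace every non-word character with a space. Because this runs first,
    # the URL, HTML-tag and newline patterns below can no longer match anything.
    text = re.sub(r'\W', ' ', text)
    text = re.sub(r'https?://\S+|www\.\S+', '', text)
    text = re.sub(r'<.*?>+', '', text)
    # After the \W step, the underscore is the only punctuation character left.
    text = re.sub('[%s]' % re.escape(string.punctuation), '', text)
    text = re.sub(r'\n', '', text)
    # Remove any token that contains a digit.
    text = re.sub(r'\w*\d\w*', '', text)
    return text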
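
The order of the substitutions matters for the cleaning step. As a minimal sketch, and not the function used for the results in this notebook, the hypothetical `clean_text` below applies the same kinds of patterns but removes URLs and tags before stripping non-word characters. The sample sentence is invented for illustration.

```python
import re
import string

def clean_text(text: str) -> str:
    """Hypothetical variant of word_drop: similar patterns, different order."""
    text = text.lower()
    text = re.sub(r"\[.*?\]", "", text)                # drop [bracketed] spans
    text = re.sub(r"https?://\S+|www\.\S+", "", text)  # drop URLs while "://" and "." still exist
    text = re.sub(r"<.*?>+", "", text)                 # drop HTML-like tags
    text = re.sub(r"\w*\d\w*", "", text)               # drop tokens that contain a digit
    text = re.sub(r"[%s]" % re.escape(string.punctuation), " ", text)  # punctuation -> space
    text = re.sub(r"\W+", " ", text)                   # collapse remaining non-word runs
    return text.strip()

# Invented sample text, not taken from the dataset.
sample = "Read MORE at https://example.com/a1 <b>now</b> [video] in 2017!"
cleaned = clean_text(sample)
print(cleaned)  # read more at now in
```

With the order used in `word_drop`, fragments such as "https", "example" and "com" would stay in the text as ordinary tokens, because the URL pattern runs only after the punctuation it depends on has been replaced by spaces.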

In [15]:
df

Unnamed: 0,text,class
0,Donald Trump just couldn t wish all Americans ...,0
1,House Intelligence Committee Chairman Devin Nu...,0
2,"On Friday, it was revealed that former Milwauk...",0
3,"On Christmas day, Donald Trump announced that ...",0
4,Pope Francis used his annual Christmas Day mes...,0
...,...,...
21402,ON BOARD A U.S. MILITARY AIRCRAFT (Reuters) - ...,1
21403,WASHINGTON (Reuters) - The United States sugge...,1
21404,WASHINGTON (Reuters) - The United States has d...,1
21405,ISLAMABAD (Reuters) - Outlining a new strategy...,1


In [16]:
df['text'] = df['text'].apply(word_drop)
df

Unnamed: 0,text,class
0,donald trump just couldn t wish all americans ...,0
1,house intelligence committee chairman devin nu...,0
2,on friday it was revealed that former milwauk...,0
3,on christmas day donald trump announced that ...,0
4,pope francis used his annual christmas day mes...,0
...,...,...
21402,on board a u s military aircraft reuters ...,1
21403,washington reuters the united states sugge...,1
21404,washington reuters the united states has d...,1
21405,islamabad reuters outlining a new strategy...,1


In [17]:
# Define the features X and the labels y.

X = df['text']
y = df['class']

In [18]:
y

0        0
1        0
2        0
3        0
4        0
        ..
21402    1
21403    1
21404    1
21405    1
21406    1
Name: class, Length: 44878, dtype: int64

# Splitting the dataset into train and test sets

In [19]:
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2,random_state=50)
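
An optional variation, shown only as a sketch: passing `stratify` to `train_test_split` keeps the share of each class roughly the same in both splits. The notebook does not do this; the toy texts and labels below are invented placeholders.

```python
from sklearn.model_selection import train_test_split

# Invented toy data: 60 "fake" (0) and 40 "true" (1) placeholder articles.
toy_texts = [f"article {i}" for i in range(100)]
toy_labels = [0] * 60 + [1] * 40

# stratify=toy_labels asks for the same class balance in train and test.
train_texts, test_texts, train_labels, test_labels = train_test_split(
    toy_texts, toy_labels, test_size=0.2, random_state=50, stratify=toy_labels)

overall_share = sum(toy_labels) / len(toy_labels)  # share of class 1 overall
test_share = sum(test_labels) / len(test_labels)   # share of class 1 in the test split
print(len(train_texts), len(test_texts))
print(overall_share, test_share)
```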

# Fit the vectorizer on the training data and transform both splits.

In [20]:
vectorization = TfidfVectorizer()
xv_train = vectorization.fit_transform(X_train)
xv_test = vectorization.transform(X_test)
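
The fit/transform split above can be illustrated on a tiny invented corpus. This is a sketch, not part of the original run: `fit_transform` learns the vocabulary from the training documents only, and `transform` maps new documents onto that same vocabulary, so words seen only at test time are ignored.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Invented toy documents, not taken from the dataset.
train_docs = ["the senate passed the bill", "the president signed the bill"]
test_docs = ["the court blocked the bill"]

vectorizer = TfidfVectorizer()
train_matrix = vectorizer.fit_transform(train_docs)  # learns vocabulary and IDF weights
test_matrix = vectorizer.transform(test_docs)        # reuses them, learns nothing new

vocabulary = sorted(vectorizer.vocabulary_)
print(vocabulary)
print(train_matrix.shape, test_matrix.shape)
```

Both matrices have one column per training-vocabulary term, which is why a model fitted on `train_matrix` can score `test_matrix` directly.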

# Build Logistic Regression model.

In [21]:
LR = LogisticRegression()
LR.fit(xv_train,y_train)
pred_LR = LR.predict(xv_test)
print(accuracy_score(y_test,pred_LR))
print(classification_report(y_test,pred_LR))

0.9858511586452763

              precision    recall  f1-score   support

           0       0.99      0.98      0.99      4641
           1       0.98      0.99      0.99      4335

    accuracy                           0.99      8976
   macro avg       0.99      0.99      0.99      8976
weighted avg       0.99      0.99      0.99      8976
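
As a sketch of an alternative layout, the vectorizer and the classifier can be bundled into a single scikit-learn pipeline, so raw text goes in and a class label comes out. The toy texts and labels are invented, and this is not how the scores above were produced.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented toy corpus: 0 = fake, 1 = true.
toy_texts = [
    "shocking secret they do not want you to know",
    "miracle cure doctors hate revealed",
    "senate passes budget bill after long debate",
    "central bank holds interest rates steady",
]
toy_labels = [0, 0, 1, 1]

# The pipeline fits the vectorizer and the classifier in one call.
text_model = make_pipeline(TfidfVectorizer(), LogisticRegression())
text_model.fit(toy_texts, toy_labels)

prediction = text_model.predict(["bank debates interest rates"])
print(prediction)
```

A pipeline of this kind applies the fitted vectorizer automatically at prediction time, which removes the risk of transforming new text with a different vocabulary.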




# Build Decision Tree Classifier model.

In [22]:
DT = DecisionTreeClassifier()
DT.fit(xv_train,y_train)
pred_DT = DT.predict(xv_test)
print(accuracy_score(y_test,pred_DT))
print(classification_report(y_test,pred_DT))

0.9956550802139037

              precision    recall  f1-score   support

           0       1.00      1.00      1.00      4641
           1       1.00      1.00      1.00      4335

    accuracy                           1.00      8976
   macro avg       1.00      1.00      1.00      8976
weighted avg       1.00      1.00      1.00      8976




# Build Gradient Boosting Classifier model.


In [23]:
GB = GradientBoostingClassifier()
GB.fit(xv_train,y_train)
pred_GB = GB.predict(xv_test)
print(accuracy_score(y_test,pred_GB))
print(classification_report(y_test,pred_GB))

0.9956550802139037

              precision    recall  f1-score   support

           0       1.00      0.99      1.00      4641
           1       0.99      1.00      1.00      4335

    accuracy                           1.00      8976
   macro avg       1.00      1.00      1.00      8976
weighted avg       1.00      1.00      1.00      8976




# Build Random Forest Classifier model.

In [24]:
RF = RandomForestClassifier()
RF.fit(xv_train,y_train)
pred_RF = RF.predict(xv_test)
print(accuracy_score(y_test,pred_RF))
print(classification_report(y_test,pred_RF))

0.9906417112299465

              precision    recall  f1-score   support

           0       0.99      0.99      0.99      4641
           1       0.99      0.99      0.99      4335

    accuracy                           0.99      8976
   macro avg       0.99      0.99      0.99      8976
weighted avg       0.99      0.99      0.99      8976
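
The four fit/predict/score cells follow the same pattern, so they could be written as one loop over a dictionary of estimators. The sketch below assumes an invented toy corpus and fixed `random_state` values that are not in the original; its scores say nothing about the real dataset.

```python
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Invented toy corpus (not from the Kaggle dataset): 0 = fake, 1 = true.
toy_texts = [
    "shocking secret they do not want you to know",
    "you will not believe what happened next",
    "miracle cure doctors hate revealed",
    "celebrity clone spotted in secret bunker",
    "senate passes budget bill after long debate",
    "central bank holds interest rates steady",
    "court rules on immigration policy appeal",
    "ministry publishes annual trade figures",
] * 3
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1] * 3

train_texts, test_texts, train_labels, test_labels = train_test_split(
    toy_texts, toy_labels, test_size=0.25, random_state=0, stratify=toy_labels)

toy_vectorizer = TfidfVectorizer()
train_features = toy_vectorizer.fit_transform(train_texts)
test_features = toy_vectorizer.transform(test_texts)

# Same four model families as in the notebook, keyed by a short name.
candidates = {
    "LR": LogisticRegression(),
    "DT": DecisionTreeClassifier(random_state=0),
    "GB": GradientBoostingClassifier(random_state=0),
    "RF": RandomForestClassifier(random_state=0),
}

scores = {}
for name, estimator in candidates.items():
    estimator.fit(train_features, train_labels)
    scores[name] = accuracy_score(test_labels, estimator.predict(test_features))

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```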




# Manual Testing

In [25]:

def output_label(n):
    if n == 0:
        return "Fake News"
    elif n == 1:
        return "True News"

def manual_testing(news):
    # Wrap the single article in a DataFrame and clean it like the training text.
    testing_news = {"text": [news]}
    new_def_test = pd.DataFrame(testing_news)
    new_def_test["text"] = new_def_test["text"].apply(word_drop)
    new_x_test = new_def_test["text"]
    # Reuse the already fitted vectorizer; do not refit it here.
    new_xv_test = vectorization.transform(new_x_test)
    pred_LR = LR.predict(new_xv_test)
    pred_DT = DT.predict(new_xv_test)
    pred_GB = GB.predict(new_xv_test)
    pred_RF = RF.predict(new_xv_test)
    # Each predict() call returns a one-element array, so take element 0.
    print("\n\nLR Prediction: {} \nDT Prediction: {} \nGBC Prediction: {} \nRFC Prediction: {}".format(
        output_label(pred_LR[0]),
        output_label(pred_DT[0]),
        output_label(pred_GB[0]),
        output_label(pred_RF[0])))
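
A possible extension, sketched under assumptions: instead of printing only a label, a helper can also report how confident the model is by using `predict_proba`. The helper name `predict_with_confidence`, the `LABELS` mapping and the toy corpus are all hypothetical, and the cleaning step (`word_drop`) is left out to keep the sketch short.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

LABELS = {0: "Fake News", 1: "True News"}

def predict_with_confidence(model, vectorizer, news):
    """Return (label, probability) for one article. Hypothetical helper."""
    features = vectorizer.transform([news])
    probabilities = model.predict_proba(features)[0]  # one probability per class
    best = probabilities.argmax()                     # position of the most likely class
    return LABELS[int(model.classes_[best])], float(probabilities[best])

# Invented toy corpus: 0 = fake, 1 = true.
toy_texts = [
    "shocking secret they do not want you to know",
    "miracle cure doctors hate revealed",
    "senate passes budget bill after long debate",
    "central bank holds interest rates steady",
]
toy_labels = [0, 0, 1, 1]

toy_vectorizer = TfidfVectorizer()
toy_model = LogisticRegression().fit(toy_vectorizer.fit_transform(toy_texts), toy_labels)

label, confidence = predict_with_confidence(toy_model, toy_vectorizer, "senate debates budget bill")
print(label, round(confidence, 3))
```

Returning the values rather than printing them makes the helper easier to reuse, for example to flag articles on which the model is close to undecided.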

In [26]:
news = str(input('Enter your news:'))
manual_testing(news)

Enter your news:Donald Trump just couldn t wish all Americans a Happy New Year and leave it at that. Instead, he had to give a shout out to his enemies, haters and  the very dishonest fake news media.  The former reality show star had just one job to do and he couldn t do it. As our Country rapidly grows stronger and smarter, I want to wish all of my friends, supporters, enemies, haters, and even the very dishonest Fake News Media, a Happy and Healthy New Year,  President Angry Pants tweeted.  2018 will be a great year for America! As our Country rapidly grows stronger and smarter, I want to wish all of my friends, supporters, enemies, haters, and even the very dishonest Fake News Media, a Happy and Healthy New Year. 2018 will be a great year for America!  Donald J. Trump (@realDonaldTrump) December 31, 2017Trump s tweet went down about as welll as you d expect.What kind of president sends a New Year s greeting like this despicable, petty, infantile gibberish? Only Trump! His lack of d

In [27]:
news = str(input('Enter your news:'))
manual_testing(news)

Enter your news:NEW YORK (Reuters) - The U.S. Justice Department has issued new guidelines for immigration judges that remove some instructions for how to protect unaccompanied juveniles appearing in their courtrooms. A Dec. 20 memo, issued by the Executive Office for Immigration Review (EOIR) replaces 2007 guidelines, spelling out policies and procedures judges should follow in dealing with children who crossed the border illegally alone and face possible deportation.  The new memo removes suggestions contained in the 2007 memo for how to conduct “child-sensitive questioning” and adds reminders to judges to maintain “impartiality” even though “juvenile cases may present sympathetic allegations.” The new document also changes the word “child” to “unmarried individual under the age of 18” in many instances. (Link to comparison: tmsnrt.rs/2BlT0VK May 2007 document: tmsnrt.rs/2BBR8wj December 2017 document: tmsnrt.rs/2C2sWCs)  An EOIR official said the new memo contained “clarifications a
