In [1]:
import numpy as np
import pandas as pd
import re
import string

In [2]:
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

In [3]:
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report

In [8]:
real = pd.read_csv("Dataset/True.csv/True.csv")

In [9]:
fake = pd.read_csv("Dataset/Fake.csv/Fake.csv")

In [10]:
real.head()

Unnamed: 0,title,text,subject,date
0,"As U.S. budget fight looms, Republicans flip t...",WASHINGTON (Reuters) - The head of a conservat...,politicsNews,"December 31, 2017"
1,U.S. military to accept transgender recruits o...,WASHINGTON (Reuters) - Transgender people will...,politicsNews,"December 29, 2017"
2,Senior U.S. Republican senator: 'Let Mr. Muell...,WASHINGTON (Reuters) - The special counsel inv...,politicsNews,"December 31, 2017"
3,FBI Russia probe helped by Australian diplomat...,WASHINGTON (Reuters) - Trump campaign adviser ...,politicsNews,"December 30, 2017"
4,Trump wants Postal Service to charge 'much mor...,SEATTLE/WASHINGTON (Reuters) - President Donal...,politicsNews,"December 29, 2017"


In [12]:
fake.head()

Unnamed: 0,title,text,subject,date
0,Donald Trump Sends Out Embarrassing New Year’...,Donald Trump just couldn t wish all Americans ...,News,"December 31, 2017"
1,Drunk Bragging Trump Staffer Started Russian ...,House Intelligence Committee Chairman Devin Nu...,News,"December 31, 2017"
2,Sheriff David Clarke Becomes An Internet Joke...,"On Friday, it was revealed that former Milwauk...",News,"December 30, 2017"
3,Trump Is So Obsessed He Even Has Obama’s Name...,"On Christmas day, Donald Trump announced that ...",News,"December 29, 2017"
4,Pope Francis Just Called Out Donald Trump Dur...,Pope Francis used his annual Christmas Day mes...,News,"December 25, 2017"


In [13]:
real["class"] = 1
fake["class"] = 0

In [15]:
real.head()

Unnamed: 0,title,text,subject,date,class
0,"As U.S. budget fight looms, Republicans flip t...",WASHINGTON (Reuters) - The head of a conservat...,politicsNews,"December 31, 2017",1
1,U.S. military to accept transgender recruits o...,WASHINGTON (Reuters) - Transgender people will...,politicsNews,"December 29, 2017",1
2,Senior U.S. Republican senator: 'Let Mr. Muell...,WASHINGTON (Reuters) - The special counsel inv...,politicsNews,"December 31, 2017",1
3,FBI Russia probe helped by Australian diplomat...,WASHINGTON (Reuters) - Trump campaign adviser ...,politicsNews,"December 30, 2017",1
4,Trump wants Postal Service to charge 'much mor...,SEATTLE/WASHINGTON (Reuters) - President Donal...,politicsNews,"December 29, 2017",1


In [16]:
fake.head()

Unnamed: 0,title,text,subject,date,class
0,Donald Trump Sends Out Embarrassing New Year’...,Donald Trump just couldn t wish all Americans ...,News,"December 31, 2017",0
1,Drunk Bragging Trump Staffer Started Russian ...,House Intelligence Committee Chairman Devin Nu...,News,"December 31, 2017",0
2,Sheriff David Clarke Becomes An Internet Joke...,"On Friday, it was revealed that former Milwauk...",News,"December 30, 2017",0
3,Trump Is So Obsessed He Even Has Obama’s Name...,"On Christmas day, Donald Trump announced that ...",News,"December 29, 2017",0
4,Pope Francis Just Called Out Donald Trump Dur...,Pope Francis used his annual Christmas Day mes...,News,"December 25, 2017",0


In [18]:
real.shape

(21417, 5)

In [19]:
fake.shape

(23481, 5)

In [21]:
df_data = pd.concat([real, fake], axis=0)

In [22]:
df_data.head(10)

Unnamed: 0,title,text,subject,date,class
0,"As U.S. budget fight looms, Republicans flip t...",WASHINGTON (Reuters) - The head of a conservat...,politicsNews,"December 31, 2017",1
1,U.S. military to accept transgender recruits o...,WASHINGTON (Reuters) - Transgender people will...,politicsNews,"December 29, 2017",1
2,Senior U.S. Republican senator: 'Let Mr. Muell...,WASHINGTON (Reuters) - The special counsel inv...,politicsNews,"December 31, 2017",1
3,FBI Russia probe helped by Australian diplomat...,WASHINGTON (Reuters) - Trump campaign adviser ...,politicsNews,"December 30, 2017",1
4,Trump wants Postal Service to charge 'much mor...,SEATTLE/WASHINGTON (Reuters) - President Donal...,politicsNews,"December 29, 2017",1
5,"White House, Congress prepare for talks on spe...","WEST PALM BEACH, Fla./WASHINGTON (Reuters) - T...",politicsNews,"December 29, 2017",1
6,"Trump says Russia probe will be fair, but time...","WEST PALM BEACH, Fla (Reuters) - President Don...",politicsNews,"December 29, 2017",1
7,Factbox: Trump on Twitter (Dec 29) - Approval ...,The following statements were posted to the ve...,politicsNews,"December 29, 2017",1
8,Trump on Twitter (Dec 28) - Global Warming,The following statements were posted to the ve...,politicsNews,"December 29, 2017",1
9,Alabama official to certify Senator-elect Jone...,WASHINGTON (Reuters) - Alabama Secretary of St...,politicsNews,"December 28, 2017",1


In [23]:
df_data.shape

(44898, 5)
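
The concatenated row count is the sum of the two sources, and the two classes are roughly balanced. The sketch below reproduces that arithmetic using only the shapes printed above. The majority-class share is a useful baseline, since a classifier that always answered "fake" would already reach that accuracy.

```python
n_real = 21417  # rows in real (class 1), taken from real.shape
n_fake = 23481  # rows in fake (class 0), taken from fake.shape
n_total = n_real + n_fake

real_share = n_real / n_total
fake_share = n_fake / n_total

print(n_total)              # 44898, matching df_data.shape[0]
print(f"{real_share:.3f}")  # 0.477
print(f"{fake_share:.3f}")  # 0.523
```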

In [28]:
df_data.info()

<class 'pandas.core.frame.DataFrame'>
Index: 44898 entries, 0 to 23480
Data columns (total 5 columns):
 #   Column   Non-Null Count  Dtype 
---  ------   --------------  ----- 
 0   title    44898 non-null  object
 1   text     44898 non-null  object
 2   subject  44898 non-null  object
 3   date     44898 non-null  object
 4   class    44898 non-null  int64 
dtypes: int64(1), object(4)
memory usage: 2.1+ MB


In [29]:
df_data.columns

Index(['title', 'text', 'subject', 'date', 'class'], dtype='object')

In [31]:
df_data.drop(["title", "subject", "date"], axis=1, inplace=True)

In [32]:
df_data.head(10)

Unnamed: 0,text,class
0,WASHINGTON (Reuters) - The head of a conservat...,1
1,WASHINGTON (Reuters) - Transgender people will...,1
2,WASHINGTON (Reuters) - The special counsel inv...,1
3,WASHINGTON (Reuters) - Trump campaign adviser ...,1
4,SEATTLE/WASHINGTON (Reuters) - President Donal...,1
5,"WEST PALM BEACH, Fla./WASHINGTON (Reuters) - T...",1
6,"WEST PALM BEACH, Fla (Reuters) - President Don...",1
7,The following statements were posted to the ve...,1
8,The following statements were posted to the ve...,1
9,WASHINGTON (Reuters) - Alabama Secretary of St...,1


In [33]:
df_data.isnull().sum()

text     0
class    0
dtype: int64

In [35]:
df_data = df_data.sample(frac=1)

In [36]:
df_data

Unnamed: 0,text,class
15280,Bernie Sanders can t effectively handle two lo...,0
10200,LONDON (Reuters) - Britain’s interior minister...,1
9390,This is fantastic! President Trump met with al...,0
3457,LONDON (Reuters) - British Prime Minister Ther...,1
1577,CARACAS (Reuters) - Venezuela on Monday accuse...,1
...,...,...
11610,ANKARA (Reuters) - Turkish President Tayyip Er...,1
20508,BEIRUT (Reuters) - Israeli jets flew low over ...,1
17398,It s worth noting that the victims of this hor...,0
3521,A major bipartisan infrastructure bill has gon...,0


In [37]:
df_data.reset_index(inplace=True)

In [38]:
df_data.head(10)

Unnamed: 0,index,text,class
0,15280,Bernie Sanders can t effectively handle two lo...,0
1,10200,LONDON (Reuters) - Britain’s interior minister...,1
2,9390,This is fantastic! President Trump met with al...,0
3,3457,LONDON (Reuters) - British Prime Minister Ther...,1
4,1577,CARACAS (Reuters) - Venezuela on Monday accuse...,1
5,10003,This woman has no shame1 Muslim activist Linda...,0
6,9789,Tennessee Titans Delanie Walker just turned o...,0
7,1046,Canadian Prime Minister Justin Trudeau partici...,0
8,8741,Wayne LaPierre has a way about him that define...,0
9,4450,There were a lot of memorable moments where Do...,0


In [39]:
df_data.drop(["index"], axis=1, inplace=True)

In [40]:
df_data.head(10)

Unnamed: 0,text,class
0,Bernie Sanders can t effectively handle two lo...,0
1,LONDON (Reuters) - Britain’s interior minister...,1
2,This is fantastic! President Trump met with al...,0
3,LONDON (Reuters) - British Prime Minister Ther...,1
4,CARACAS (Reuters) - Venezuela on Monday accuse...,1
5,This woman has no shame1 Muslim activist Linda...,0
6,Tennessee Titans Delanie Walker just turned o...,0
7,Canadian Prime Minister Justin Trudeau partici...,0
8,Wayne LaPierre has a way about him that define...,0
9,There were a lot of memorable moments where Do...,0


In [3]:
def processing(text):
    text = text.lower()  # Convert to lowercase
    text = re.sub(r'\[.*?\]', '', text)  # Remove text inside brackets
    text = re.sub(r'https?://\S+|www\.\S+', '', text)  # Remove URLs
    text = re.sub(r'<.*?>+', '', text)  # Remove HTML tags
    text = re.sub(r'[%s]' % re.escape(string.punctuation), '', text)  # Remove punctuation
    text = re.sub(r'\n', ' ', text)  # Replace newlines with space
    text = re.sub(r'\w*\d\w*', '', text)  # Remove words with numbers
    text = re.sub(r'\s+', ' ', text).strip()  # Remove extra spaces
    return text
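
To see what the cleaning steps do in practice, the sketch below runs the same function on one made-up sentence. The function body is repeated only so the snippet runs on its own, and the sample string is invented for illustration rather than taken from the dataset.

```python
import re
import string

def processing(text):
    text = text.lower()  # Convert to lowercase
    text = re.sub(r'\[.*?\]', '', text)  # Remove text inside brackets
    text = re.sub(r'https?://\S+|www\.\S+', '', text)  # Remove URLs
    text = re.sub(r'<.*?>+', '', text)  # Remove HTML tags
    text = re.sub(r'[%s]' % re.escape(string.punctuation), '', text)  # Remove punctuation
    text = re.sub(r'\n', ' ', text)  # Replace newlines with space
    text = re.sub(r'\w*\d\w*', '', text)  # Remove words with numbers
    text = re.sub(r'\s+', ' ', text).strip()  # Remove extra spaces
    return text

# Hypothetical input that exercises every step of the function.
sample = "Check [update] https://example.com <b>Breaking</b>: 3 senators, won't vote!\nMore at 10pm."
cleaned = processing(sample)
print(cleaned)  # check breaking senators wont vote more at
```

Note that tokens containing digits ("3", "10pm") disappear entirely, and apostrophes are dropped without inserting a space ("won't" becomes "wont").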

In [44]:
df_data["text"] = df_data["text"].apply(processing)

In [46]:
X = df_data['text']
Y = df_data['class']

In [47]:
X

0        bernie sanders can t effectively handle two lo...
1        london reuters britain’s interior minister sai...
2        this is fantastic president trump met with all...
3        london reuters british prime minister theresa ...
4        caracas reuters venezuela on monday accused us...
                               ...                        
44893    ankara reuters turkish president tayyip erdoga...
44894    beirut reuters israeli jets flew low over the ...
44895    it s worth noting that the victims of this hor...
44896    a major bipartisan infrastructure bill has gon...
44897    moscow reuters former militants from bandit un...
Name: text, Length: 44898, dtype: object

In [48]:
Y

0        0
1        1
2        0
3        1
4        1
        ..
44893    1
44894    1
44895    0
44896    0
44897    1
Name: class, Length: 44898, dtype: int64

In [49]:
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.2)

In [50]:
vectorization = TfidfVectorizer()
xtfid_train = vectorization.fit_transform(x_train)
xtfid_test = vectorization.transform(x_test)
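
TfidfVectorizer turns each article into a sparse vector of term weights. As a rough illustration of the weighting it applies with default settings (raw term counts, a smoothed inverse document frequency of ln((1 + n) / (1 + df)) + 1, then L2 normalisation per document), here is a plain-Python sketch. It is a simplification and not the library's implementation: it splits on whitespace, whereas the real vectorizer uses its own tokenizer, and the two toy documents are invented.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per document, L2-normalised."""
    tokenised = [doc.split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    # Smoothed IDF, as described in the lead-in.
    idf = {term: math.log((1 + n_docs) / (1 + count)) + 1 for term, count in df.items()}
    rows = []
    for tokens in tokenised:
        tf = Counter(tokens)
        weights = {term: freq * idf[term] for term, freq in tf.items()}
        norm = math.sqrt(sum(w * w for w in weights.values()))
        rows.append({term: w / norm for term, w in weights.items()} if norm else {})
    return rows

# Toy documents (illustrative only).
docs = ["reuters said officials", "shocking video officials"]
rows = tfidf(docs)

# "officials" appears in both documents, so it is weighted lower than "reuters".
print(rows[0]["reuters"] > rows[0]["officials"])
```

The notebook fits the vectorizer on `x_train` only and reuses the learned vocabulary and IDF values for `x_test`, which keeps test-set statistics out of training.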

In [51]:
model = LogisticRegression()

model.fit(xtfid_train, y_train)

In [53]:
x_train_pred = model.predict(xtfid_train)
train_score = accuracy_score(y_train, x_train_pred)  # scikit-learn metrics take (y_true, y_pred)

In [54]:
print(train_score)

0.9914249123002394


In [56]:
x_test_pred = model.predict(xtfid_test)

In [57]:
test_score = accuracy_score(y_test, x_test_pred)  # scikit-learn metrics take (y_true, y_pred)

In [58]:
print(test_score)

0.9855233853006682


In [59]:
# scikit-learn metrics take (y_true, y_pred); support counts the true labels per class
class_report = classification_report(y_test, x_test_pred)
print(class_report)

              precision    recall  f1-score   support

           0       0.99      0.99      0.99      4746
           1       0.98      0.99      0.98      4234

    accuracy                           0.99      8980
   macro avg       0.99      0.99      0.99      8980
weighted avg       0.99      0.99      0.99      8980



In [60]:
# Rows are true classes, columns are predicted classes
conf_matrix = confusion_matrix(y_test, x_test_pred)
print(conf_matrix)

[[4677   69]
 [  61 4173]]
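
As a cross-check, the overall accuracy can be recovered by hand from the four counts in the matrix. The diagonal entries (4677 and 4173) are the correctly classified articles, and the off-diagonal entries (61 and 69) are the errors. A minimal sketch in plain Python, using only the numbers printed above:

```python
# Counts copied from the confusion matrix printed above.
correct_fake = 4677  # class 0 articles classified as class 0
correct_real = 4173  # class 1 articles classified as class 1
errors = 61 + 69     # the two off-diagonal cells

total = correct_fake + correct_real + errors
accuracy = (correct_fake + correct_real) / total

print(total)               # 8980, the size of the test split
print(round(accuracy, 4))  # 0.9855, in line with test_score
```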


In [8]:
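def output(n):
    """Print a human-readable verdict for a predicted class (1 = real, 0 = fake)."""
    if n == 1:
        print("News is Real")
    elif n == 0:
        print("News is Fake!!!")

def testing(news):
    """Clean one article, vectorize it with the fitted TF-IDF vocabulary, and print the verdict."""
    new_x_test = pd.DataFrame({"text": [news]})
    new_x_test["text"] = new_x_test["text"].apply(processing)
    # transform() only: the vectorizer must not be refitted on new input
    new_xtfid_test = vectorization.transform(new_x_test["text"])
    pred_lr = model.predict(new_xtfid_test)

    print("\n\n")
    output(pred_lr[0])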

In [86]:
news = str(input())
testing(news)

 WASHINGTON  —   The Justice Department told a federal appeals court on Thursday that it would not seek a rehearing of a decision that shut down President Trump’s targeted travel ban. Instead, the administration will start from scratch, issuing a new executive order, the department said. Last Thursday, a unanimous   panel of the United States Court of Appeals for the Ninth Circuit, in San Francisco, blocked the key parts of the original executive order, which suspended the nation’s refugee program as well as travel from seven predominantly Muslim countries. The panel said the original ban was unlikely to survive constitutional scrutiny. The Justice Department said that the panel’s decision was riddled with errors but that the flaws it noted would be addressed in the new executive order. “Rather than continuing this litigation,” the Justice Department’s brief said, “the president intends in the near future to rescind the order and replace it with a new, substantially rev




News is Real


In [4]:
import joblib

In [6]:
model = joblib.load("fakenews_detection.joblib")

In [7]:
vectorization = joblib.load("tfid_vectorizer.joblib")
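
The two `.joblib` files loaded here are not written by any cell shown in this notebook; presumably they were saved after training. A minimal sketch of how that round trip could look with `joblib.dump` follows. The toy texts, the temporary directory, and the variable names `loaded_model` and `loaded_vec` are assumptions made for illustration; only the two file names come from the cells above.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy stand-ins for the fitted objects from the training cells (invented data).
texts = [
    "reuters officials said on monday",
    "you won t believe this shocking video",
    "the ministry said in a statement",
    "this is fantastic watch what happened",
]
labels = [1, 0, 1, 0]

vectorization = TfidfVectorizer()
model = LogisticRegression()
model.fit(vectorization.fit_transform(texts), labels)

# Save the model and the vectorizer together: the model is only usable
# with the exact vocabulary and IDF values it was trained on.
out_dir = tempfile.mkdtemp()
model_path = os.path.join(out_dir, "fakenews_detection.joblib")
vec_path = os.path.join(out_dir, "tfid_vectorizer.joblib")
joblib.dump(model, model_path)
joblib.dump(vectorization, vec_path)

# Reload and confirm the restored pair predicts the same as the originals.
loaded_model = joblib.load(model_path)
loaded_vec = joblib.load(vec_path)
sample = ["officials said in a statement on monday"]
same = bool(
    (loaded_model.predict(loaded_vec.transform(sample))
     == model.predict(vectorization.transform(sample))).all()
)
print(same)
```

Because `testing` reads `model`, `vectorization`, and `processing` from the notebook's global scope, all three need to be defined in the session before it is called on reloaded objects.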

In [9]:
news = str(input())
testing(news)

 BRUSSELS (Reuters) - NATO allies on Tuesday welcomed President Donald Trump s decision to commit more forces to Afghanistan, as part of a new U.S. strategy he said would require more troops and funding from America s partners. Having run for the White House last year on a pledge to withdraw swiftly from Afghanistan, Trump reversed course on Monday and promised a stepped-up military campaign against  Taliban insurgents, saying:  Our troops will fight to win .  U.S. officials said he had signed off on plans to send about 4,000 more U.S. troops to add to the roughly 8,400 now deployed in Afghanistan. But his speech did not define benchmarks for successfully ending the war that began with the U.S.-led invasion of Afghanistan in 2001, and which he acknowledged had required an   extraordinary sacrifice of blood and treasure .  We will ask our NATO allies and global partners to support our new strategy, with additional troops and funding increases in line with our own. We are confident they 




News is Real


In [10]:
news = str(input())
testing(news)

 SAO PAULO (Reuters) - Cesar Mata Pires, the owner and co-founder of Brazilian engineering conglomerate OAS SA, one of the largest companies involved in Brazil s corruption scandal, died on Tuesday. He was 68. Mata Pires died of a heart attack while taking a morning walk in an upscale district of S o Paulo, where OAS is based, a person with direct knowledge of the matter said. Efforts to contact his family were unsuccessful. OAS declined to comment. The son of a wealthy cattle rancher in the northeastern state of Bahia, Mata Pires  links to politicians were central to the expansion of OAS, which became Brazil s No. 4 builder earlier this decade, people familiar with his career told Reuters last year. His big break came when he befriended Antonio Carlos Magalh es, a popular politician who was Bahia governor several times, and eventually married his daughter Tereza. Brazilians joked that OAS stood for  Obras Arranjadas pelo Sogro  - or  Work Arranged by the Father-In-Law.   After years o




News is Real


In [11]:
news = str(input())
testing(news)

 Vic Bishop Waking TimesOur reality is carefully constructed by powerful corporate, political and special interest sources in order to covertly sway public opinion. Blatant lies are often televised regarding terrorism, food, war, health, etc. They are fashioned to sway public opinion and condition viewers to accept what have become destructive societal norms.The practice of manipulating and controlling public opinion with distorted media messages has become so common that there is a whole industry formed around this. The entire role of this brainwashing industry is to figure out how to spin information to journalists, similar to the lobbying of government. It is never really clear just how much truth the journalists receive because the news industry has become complacent. The messages that it presents are shaped by corporate powers who often spend millions on advertising with the six conglomerates that own 90% of the media:General Electric (GE), News-Corp, Disney, Viacom, Time Warner, 




News is Fake!!!
