# **FAKE NEWS DETECTION**

### **Description**  

The dataset contains two types of articles: fake news and real news. It was collected from real-world sources. The truthful articles were obtained by crawling Reuters.com (a news website). The fake news articles were collected from unreliable websites flagged by Politifact (a fact-checking organization in the USA) and Wikipedia. The articles cover a range of topics, but most focus on politics and world news.

![picture](https://drive.google.com/uc?export=view&id=1ggzK6EayVTjay5inC8fR-e86Rh1f23H8)

The dataset consists of two CSV files. The first file, “True.csv”, contains more than 12,600 articles from Reuters.com. The second file, “Fake.csv”, contains more than 12,600 articles from various fake news outlets. Each article has the following fields: title, text, type, and publication date. To match the fake news data collected for kaggle.com, we focused mostly on collecting articles from 2016 to 2017. The data were cleaned and processed, but the punctuation and mistakes present in the fake news articles were kept in the text.
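The notebook below loads a single file, `News.csv`, with a `Label` column, while the description mentions two separate CSV files. The original merge step is not shown, so the following is only a minimal sketch of how such a file could be produced, assuming the two files share the same columns. The helper name `combine_news` and the toy rows are hypothetical, chosen for illustration.

```python
import pandas as pd


def combine_news(fake_df, real_df):
    """Label each frame and stack them (hypothetical helper, not from the notebook)."""
    fake = fake_df.assign(Label='FAKE')   # every row from the fake file gets FAKE
    real = real_df.assign(Label='REAL')   # every row from the true file gets REAL
    # Fake rows first is an assumption; ignore_index gives one continuous index.
    return pd.concat([fake, real], ignore_index=True)


# Toy stand-ins for Fake.csv and True.csv (invented rows, not real data).
toy_fake = pd.DataFrame({
    'title': ['Made-up headline'],
    'text': ['Made-up body text'],
    'subject': ['News'],
    'date': ['December 31, 2017'],
})
toy_real = pd.DataFrame({
    'title': ['Wire headline'],
    'text': ['Wire body text'],
    'subject': ['worldnews'],
    'date': ['December 30, 2017'],
})

news = combine_news(toy_fake, toy_real)
print(news[['title', 'Label']])
```

With the real files, the two toy frames would be replaced by `pd.read_csv` calls on “Fake.csv” and “True.csv”, and the result could be written out with `news.to_csv(...)`.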

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [2]:
path='drive/My Drive/Dataset'

### **Importing the Essential Libraries**

In [3]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import os
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import PassiveAggressiveClassifier
from sklearn.metrics import accuracy_score,confusion_matrix
import pickle

### **Creating DataFrame**

In [4]:
df=pd.read_csv(path+"/News.csv")
df.head(5)

| | title | text | subject | date | Label |
|---|---|---|---|---|---|
| 0 | Donald Trump Sends Out Embarrassing New Year’... | Donald Trump just couldn t wish all Americans ... | News | December 31, 2017 | FAKE |
| 1 | Drunk Bragging Trump Staffer Started Russian ... | House Intelligence Committee Chairman Devin Nu... | News | December 31, 2017 | FAKE |
| 2 | Sheriff David Clarke Becomes An Internet Joke... | On Friday, it was revealed that former Milwauk... | News | December 30, 2017 | FAKE |
| 3 | Trump Is So Obsessed He Even Has Obama’s Name... | On Christmas day, Donald Trump announced that ... | News | December 29, 2017 | FAKE |
| 4 | Pope Francis Just Called Out Donald Trump Dur... | Pope Francis used his annual Christmas Day mes... | News | December 25, 2017 | FAKE |


### **Dependent and Independent Data**

In [5]:
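# Independent variable: the article body (same column that iloc[:,1] selected)
x=df['text']
# Dependent variable: the FAKE/REAL label (the last column)
y=df['Label']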

In [6]:
x.head(5)

0    Donald Trump just couldn t wish all Americans ...
1    House Intelligence Committee Chairman Devin Nu...
2    On Friday, it was revealed that former Milwauk...
3    On Christmas day, Donald Trump announced that ...
4    Pope Francis used his annual Christmas Day mes...
Name: text, dtype: object

In [7]:
y.head(5)

0    FAKE
1    FAKE
2    FAKE
3    FAKE
4    FAKE
Name: Label, dtype: object

### **Splitting The DataSet**

In [8]:
x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.25,random_state=3)

In [9]:
x_train.head(5)

20875                                       Lefty losers  
17811    Hillary Clinton today falsely called Donald Tr...
19749                                                     
2925     Donald Trump hasn t even been in office for a ...
20235    If we had a real President who wasn t spending...
Name: text, dtype: object

In [10]:
x_test.head(5)

1702     The biggest threat to the oligarchy is for the...
40838    MOGADISHU (Reuters) - More than 200 people wer...
1970     Senator Lindsey Graham is on the attack after ...
36126    CARACAS (Reuters) - Venezuelans vote on Sunday...
16807    So can anyone please explain what the U.S. got...
Name: text, dtype: object

In [11]:
y_train.head(5)

20875    FAKE
17811    FAKE
19749    FAKE
2925     FAKE
20235    FAKE
Name: Label, dtype: object

In [12]:
y_test.head(5)

1702     FAKE
40838    REAL
1970     FAKE
36126    REAL
16807    FAKE
Name: Label, dtype: object

### **Vectorisation**

In [13]:
tfvector=TfidfVectorizer(stop_words='english',max_df=0.7)
tfid_xtrain=tfvector.fit_transform(x_train)
tfid_xtest=tfvector.transform(x_test)
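To make the two vectoriser settings above more concrete, here is a small sketch on an invented four-sentence corpus (the sentences and variable names are assumptions for illustration, not part of the dataset). `stop_words='english'` drops common English words such as "the", and `max_df=0.7` drops any term that appears in more than 70% of the documents.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Invented toy corpus: "president" appears in 3 of 4 documents (75%).
toy_docs = [
    "the president wins the election",
    "the president loses the election",
    "the president tweets",
    "the markets rally",
]

toy_vectorizer = TfidfVectorizer(stop_words='english', max_df=0.7)
toy_matrix = toy_vectorizer.fit_transform(toy_docs)

# vocabulary_ maps each kept term to its column index in the TF-IDF matrix.
kept_terms = sorted(toy_vectorizer.vocabulary_)
print(kept_terms)
print(toy_matrix.shape)  # (number of documents, number of kept terms)
```

In this toy run "the" is removed as a stop word and "president" is removed by `max_df`, while a term such as "election" (2 of 4 documents) is kept. Note that the vectoriser is fitted on the training split only and then reused on the test split, which is what the cell above does.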

### **Passive Aggressive Classifier**

In [14]:
classifier=PassiveAggressiveClassifier(max_iter=50)
classifier.fit(tfid_xtrain,y_train)
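`pickle` is imported at the top of the notebook but not used in the cells shown. One plausible use is saving the fitted vectoriser and classifier so they can be reloaded without retraining. The sketch below illustrates that idea on invented toy data; the variable names, the dictionary layout, and the `random_state` value are assumptions, not the author's code.

```python
import pickle

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import PassiveAggressiveClassifier

# Invented toy training data, only to have something to fit.
toy_texts = [
    "shocking secret cure doctors hate",
    "aliens endorse candidate in secret meeting",
    "central bank raises interest rates",
    "parliament passes annual budget bill",
]
toy_labels = ['FAKE', 'FAKE', 'REAL', 'REAL']

toy_vectorizer = TfidfVectorizer(stop_words='english')
toy_features = toy_vectorizer.fit_transform(toy_texts)

toy_classifier = PassiveAggressiveClassifier(max_iter=50, random_state=3)
toy_classifier.fit(toy_features, toy_labels)

# Serialise both objects together: the classifier is only usable with
# the exact vectoriser (vocabulary and IDF weights) it was trained on.
blob = pickle.dumps({'vectorizer': toy_vectorizer, 'classifier': toy_classifier})

# Restore and check that the restored pair predicts the same labels.
restored = pickle.loads(blob)
restored_pred = restored['classifier'].predict(
    restored['vectorizer'].transform(toy_texts)
)
original_pred = toy_classifier.predict(toy_features)
print(list(restored_pred) == list(original_pred))
```

To persist to disk instead of memory, `pickle.dump` and `pickle.load` with a file opened in binary mode would be used in the same way. Pickle files should only be loaded from trusted sources.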

### **Prediction**

In [15]:
y_pred=classifier.predict(tfid_xtest)

### **Accuracy Score**

In [16]:
score=accuracy_score(y_test,y_pred)
print('Accuracy Score: ',score)

Accuracy Score:  0.9941118743866536
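Accuracy is a single number and does not show which class the errors fall in. `confusion_matrix` is imported above but not called in the cells shown, so here is a small sketch of how it could complement the score. The labels below are invented toy values, not the notebook's predictions.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Invented toy labels: one FAKE article is misclassified as REAL.
toy_true = ['FAKE', 'FAKE', 'REAL', 'REAL', 'REAL']
toy_pred = ['FAKE', 'REAL', 'REAL', 'REAL', 'REAL']

# Fixing the label order makes the rows/columns unambiguous:
# rows are true classes, columns are predicted classes.
label_order = ['FAKE', 'REAL']
toy_matrix = confusion_matrix(toy_true, toy_pred, labels=label_order)
toy_accuracy = accuracy_score(toy_true, toy_pred)

print(toy_matrix)
print('Accuracy Score: ', toy_accuracy)
```

On the real data, the same call would be `confusion_matrix(y_test, y_pred, labels=['FAKE', 'REAL'])`. The top-right cell then counts fake articles predicted as real, and the bottom-left cell counts real articles predicted as fake.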


### **Implementing as Web App**

In [17]:
!pip install anvil-uplink

Collecting argparse (from anvil-uplink)
  Using cached argparse-1.4.0-py2.py3-none-any.whl (23 kB)
Installing collected packages: argparse
Successfully installed argparse-1.4.0


In [18]:
import anvil.server

In [19]:
anvil.server.connect('server_ZOCZU4G4YVLOGCFNQUYTJYZ6-65IRZY5RADKHQF27')

Connecting to wss://anvil.works/uplink
Anvil websocket open
Connected to "Default Environment" as SERVER


In [20]:
@anvil.server.callable
def fake_news_deter(news):
    data=[news]
    vectorised_data=tfvector.transform(data)
    prediction=classifier.predict(vectorised_data)
    return prediction[0]

In [21]:
anvil.server.wait_forever()

KeyboardInterrupt: ignored

### **Web App Link**

Visit the web app: https://fantastic-scrawny-foundation.anvil.app