Fake News Identification

The goal is to build a machine learning model that classifies a given news headline as real or fake. To do this, we use several popular news datasets and, if the need arises, scrape additional fake news from websites. The first step is to create a dataset of headlines and their respective class labels.

Datasets

Fake and real news dataset: A collection of both fake and real news articles with features such as title, text, subject and date.

Files: ./data/sources/Fake (2).csv and ./data/sources/True.csv
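As a rough sketch of how these two files could be turned into labelled headlines, the snippet below reads the `title` column from each CSV and attaches a class label. The `title` column name comes from the dataset description above; the function names `read_headlines` and `build_dataset` and the label encoding (1 for fake, 0 for real) are assumptions made for illustration, not the project's actual pipeline.

```python
import csv
from typing import Iterable, List, Tuple

# Assumed label encoding for illustration: 1 = fake, 0 = real.
FAKE, REAL = 1, 0


def read_headlines(lines: Iterable[str], label: int) -> List[Tuple[str, int]]:
    """Return (headline, label) pairs from CSV text that has a `title` column."""
    reader = csv.DictReader(lines)
    pairs = []
    for row in reader:
        title = (row.get("title") or "").strip()
        if title:  # skip rows with a missing or empty headline
            pairs.append((title, label))
    return pairs


def build_dataset(fake_path: str, true_path: str) -> List[Tuple[str, int]]:
    """Combine the fake and real files into one labelled list."""
    dataset = []
    for path, label in ((fake_path, FAKE), (true_path, REAL)):
        # newline="" is what the csv module expects for file objects
        with open(path, newline="", encoding="utf-8") as handle:
            dataset.extend(read_headlines(handle, label))
    return dataset
```

With the files listed above, a call might look like `build_dataset("./data/sources/Fake (2).csv", "./data/sources/True.csv")`.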

Getting Real about Fake News: This dataset is only a first step in understanding and tackling this problem. It contains text and metadata scraped from 244 websites tagged as "bullshit" by the BS Detector Chrome Extension by Daniel Sieradski. This is a combination of fake news and conspiracy theories (which by default are still fake).

Files: ./data/sources/fake.csv

Fake News: A binary classification dataset for both fake and real news articles.

Files: ./data/sources/fake_or_real_news.csv

Source based Fake News Classification: A binary classification dataset for both fake and real news posts from social media. In an era where fake WhatsApp forwards and Tweets are capable of influencing naive minds, tools and knowledge have to be put to practical use not only in mitigating the spread of misinformation but also in informing people about the type of news they consume.

Files: ./data/sources/news_articles.csv

AG News Classification Dataset: AG is a collection of more than 1 million news articles. The articles have been gathered from more than 2000 news sources by ComeToMyHead in more than 1 year of activity. ComeToMyHead is an academic news search engine which has been running since July 2004. The dataset is provided by the academic community for research purposes in data mining. The AG's news topic classification dataset is constructed by choosing the 4 largest classes from the original corpus.

Files: ./data/sources/News Classification train.csv and ./data/sources/News Classification test.csv
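The construction described above, keeping only the 4 largest classes, can be sketched as follows. This illustrates the idea rather than the procedure actually used to build AG News; the function name `keep_largest_classes` and the `(text, label)` record layout are assumptions.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def keep_largest_classes(
    records: Sequence[Tuple[str, str]], n_classes: int = 4
) -> List[Tuple[str, str]]:
    """Keep only records whose label is among the n_classes most frequent labels."""
    # Count how many records carry each label.
    counts = Counter(label for _, label in records)
    # most_common returns (label, count) pairs sorted by descending count.
    largest = {label for label, _ in counts.most_common(n_classes)}
    return [(text, label) for text, label in records if label in largest]
```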

The final dataset created for our purpose of news classification is saved in ./data/TARP_Project_Final_Dataset.zip. The resulting dataset is approximately balanced and has very few null values.
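One way to check class balance and missing values is sketched below. It assumes the dataset has already been loaded as `(headline, label)` pairs; the function name `summarize` and that row layout are hypothetical and not taken from the project code.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Tuple


def summarize(rows: Iterable[Tuple[Optional[str], int]]) -> Dict[str, object]:
    """Count rows per label and rows whose headline is missing or blank."""
    label_counts = Counter()
    missing = 0
    for headline, label in rows:
        label_counts[label] += 1
        if headline is None or not headline.strip():
            missing += 1
    total = sum(label_counts.values())
    # Share of each class; values near 0.5 indicate a balanced binary dataset.
    shares = (
        {label: count / total for label, count in label_counts.items()}
        if total
        else {}
    )
    return {"counts": dict(label_counts), "shares": shares, "missing": missing}
```

A share close to 0.5 for each of the two labels, together with a small `missing` count, would match the description above.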

Running The Application

As of now we do not have a full application, only an example Flask application.

It follows a modular structure and serves as an example of that structure.

To launch it, first activate the virtual environment and then run run.py from the home folder.

For Linux/macOS

source ./env/bin/activate

For Windows with Python 3.7

pip install -r requirements.txt
set FLASK_APP=run.py
set FLASK_DEBUG=1
flask run

The App folder contains all the information regarding the server

ashwiniyer176/Fake-News-Checker
