# Fake news detection using machine learning
Fake news is one of the biggest problems facing online social media and even some news sites. Much of it concerns politics. Detecting it with machine learning is a difficult task.

Fake news is a serious problem because it spreads misinformation within a community. Fake news about a community's political or religious beliefs can even lead to riots and violence. One way to detect it is to learn patterns in the wording of fake and real news headlines, and then train a machine learning model that labels a piece of news as fake or real from its headline alone. In the section below, I will present a machine learning project on detecting fake news using the Python programming language.
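
To make "observing the headline" concrete, here is a minimal sketch of how a bag-of-words vectorizer turns headlines into word counts that a model can learn from. The two headlines are invented for illustration and are not taken from the dataset.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical headlines, made up for this illustration only.
headlines = [
    "Senate passes budget bill",
    "Budget bill shocks senate insiders",
]

vectorizer = CountVectorizer()          # lowercases and splits text into word tokens
counts = vectorizer.fit_transform(headlines)

# vocabulary_ maps each word to its column index; sort by that index
# to list the words in column order.
columns = sorted(vectorizer.vocabulary_, key=vectorizer.vocabulary_.get)
matrix = counts.toarray()               # one row per headline, one column per word

print(columns)
print(matrix)
```

Each row is a headline and each column counts how often one vocabulary word appears in it. Word order is discarded, so the model only sees which words occur and how often.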

### Importing libraries and loading the dataset

In [1]:
import pandas as pd
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

In [3]:
df = pd.read_csv('fake_or_real_news.csv')
df.drop(columns='Unnamed: 0', inplace=True)
df.head()

| | title | text | label |
|---|---|---|---|
| 0 | You Can Smell Hillary’s Fear | Daniel Greenfield, a Shillman Journalism Fello... | FAKE |
| 1 | Watch The Exact Moment Paul Ryan Committed Pol... | Google Pinterest Digg Linkedin Reddit Stumbleu... | FAKE |
| 2 | Kerry to go to Paris in gesture of sympathy | U.S. Secretary of State John F. Kerry said Mon... | REAL |
| 3 | Bernie supporters on Twitter erupt in anger ag... | — Kaydee King (@KaydeeKing) November 9, 2016 T... | FAKE |
| 4 | The Battle of New York: Why This Primary Matters | It's primary day in New York and front-runners... | REAL |


In [4]:
df.shape

(6335, 3)

In [5]:
df.isnull().sum()

title    0
text     0
label    0
dtype: int64
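
Besides missing values, it can help to check how the two classes are balanced, because a strongly skewed label column would make the accuracy score harder to interpret. The sketch below uses a small hypothetical DataFrame with the same column names as the dataset. The helper `label_distribution` is an illustrative name and not part of the original notebook.

```python
import pandas as pd

# Hypothetical stand-in for the real data; only the column names match.
toy_df = pd.DataFrame({
    "title": ["headline a", "headline b", "headline c", "headline d"],
    "label": ["FAKE", "REAL", "FAKE", "REAL"],
})

def label_distribution(frame: pd.DataFrame, column: str = "label") -> pd.Series:
    """Return the share of rows belonging to each class."""
    return frame[column].value_counts(normalize=True)

shares = label_distribution(toy_df)
print(shares)
```

On the real data, the same call would be `label_distribution(df)`.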

### Building the model, training it, and making predictions

In [6]:
x = np.array(df['title'])
y = np.array(df['label'])

cv = CountVectorizer()

X = cv.fit_transform(x)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = MultinomialNB()
model.fit(X_train, y_train)
print(model.score(X_test, y_test))

0.8074191002367798
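
The cell above fits the vectorizer on all headlines before splitting, so the vocabulary also includes words that only occur in the test set. One possible alternative, sketched here under the assumption that a scikit-learn pipeline is acceptable, is to split the raw headlines first and let the pipeline learn the vocabulary from the training part only. The headlines are hypothetical toy data, so the printed report says nothing about the real dataset.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy headlines, invented for illustration only.
titles = [
    "Miracle pill cures every disease overnight",
    "Secret plot to replace voters exposed",
    "Aliens endorse candidate in shocking video",
    "Shocking secret cure doctors hide from you",
    "Senate passes annual budget bill",
    "Governor signs education funding bill",
    "Court hears arguments on voting law",
    "Senate committee schedules budget hearing",
]
labels = ["FAKE"] * 4 + ["REAL"] * 4

# Split the raw text first, keeping both classes in each part.
train_titles, test_titles, train_labels, test_labels = train_test_split(
    titles, labels, test_size=0.25, random_state=42, stratify=labels
)

# The pipeline fits CountVectorizer on the training headlines only.
pipeline = make_pipeline(CountVectorizer(), MultinomialNB())
pipeline.fit(train_titles, train_labels)

predictions = pipeline.predict(test_titles)
# Per-class precision and recall give more detail than a single accuracy score.
print(classification_report(test_labels, predictions, zero_division=0))
```

A side benefit of the pipeline is that `pipeline.predict(["some headline"])` accepts raw text directly, so the vectorizer and the model cannot get out of sync.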


In [7]:
news_headline = "CA Exams 2021: Supreme Court asks ICAI to extend optout option for July exams, final order tomorrow"
data = cv.transform([news_headline]).toarray()
print(model.predict(data))

['REAL']


In [8]:
news_headline = "Cow dung can cure Corona Virus"
data = cv.transform([news_headline]).toarray()
print(model.predict(data))

['FAKE']
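
A single label hides how sure the model is. The sketch below shows one way to look at class probabilities instead, using `predict_proba` from `MultinomialNB`. It trains on a few hypothetical headlines so that it runs on its own. The helper `headline_probabilities` is an illustrative name and not part of the original notebook.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy training data, invented for illustration only.
toy_titles = [
    "Miracle pill cures every disease overnight",
    "Shocking secret cure doctors hide from you",
    "Senate passes annual budget bill",
    "Court hears arguments on voting law",
]
toy_labels = ["FAKE", "FAKE", "REAL", "REAL"]

toy_cv = CountVectorizer()
toy_model = MultinomialNB().fit(toy_cv.fit_transform(toy_titles), toy_labels)

def headline_probabilities(headline, vectorizer, classifier):
    """Return a {label: probability} dict for one headline."""
    # transform() returns a sparse matrix, which the classifier accepts directly.
    features = vectorizer.transform([headline])
    probabilities = classifier.predict_proba(features)[0]
    return dict(zip(classifier.classes_, probabilities))

result = headline_probabilities("Miracle cure shocks doctors", toy_cv, toy_model)
print(result)
```

With the notebook's own objects, the call would be `headline_probabilities(news_headline, cv, model)`. Words that the vectorizer did not see during fitting are ignored, so a headline on a topic far from the training data is judged on very few words and its label should be read with caution.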
