In [1]:
import pandas as pd
import numpy as np
import spacy

In [None]:
!python -m spacy download en_core_web_lg

In [5]:
nlp = spacy.load("en_core_web_lg")

In [6]:
news_df = pd.read_csv("fake_and_real_news.csv")
news_df.head()

Unnamed: 0,Text,label
0,Top Trump Surrogate BRUTALLY Stabs Him In The...,Fake
1,U.S. conservative leader optimistic of common ...,Real
2,"Trump proposes U.S. tax overhaul, stirs concer...",Real
3,Court Forces Ohio To Allow Millions Of Illega...,Fake
4,Democrats say Trump agrees to work on immigrat...,Real


In [7]:
news_df.shape

(9900, 2)

In [8]:
# Checking for class imbalance
news_df['label'].value_counts()

label
Fake    5000
Real    4900
Name: count, dtype: int64

In [12]:
samp_news = news_df.iloc[0,0]
samp_news

' Top Trump Surrogate BRUTALLY Stabs Him In The Back: ‘He’s Pathetic’ (VIDEO) It s looking as though Republican presidential candidate Donald Trump is losing support even from within his own ranks. You know things are getting bad when even your top surrogates start turning against you, which is exactly what just happened on Fox News when Newt Gingrich called Trump  pathetic. Gingrich knows that Trump needs to keep his focus on Hillary Clinton if he even remotely wants to have a chance at defeating her. However, Trump has hurt feelings because many Republicans don t support his sexual assault against women have turned against him, including House Speaker Paul Ryan (R-WI). So, that has made Trump lash out as his own party.Gingrich said on Fox News: Look, first of all, let me just say about Trump, who I admire and I ve tried to help as much as I can. There s a big Trump and a little Trump. The little Trump is frankly pathetic. I mean, he s mad over not getting a phone call? Trump s referr

In [14]:
# doc.vector is the average of the token vectors; en_core_web_lg vectors have 300 dimensions
nlp(samp_news).vector.shape

(300,)
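
By default, spaCy's `Doc.vector` is the average of the vectors of the tokens in the document, which is why a long article collapses to a single 300-dimensional array. The following is a minimal sketch of that averaging using hypothetical 3-dimensional toy embeddings (the `toy_vectors` dictionary and `mean_pool` helper are illustrative names, not part of spaCy):

```python
import numpy as np

# Hypothetical 3-dimensional embeddings standing in for the
# 300-dimensional vectors shipped with en_core_web_lg.
toy_vectors = {
    "trump": np.array([1.0, 0.0, 2.0]),
    "urges": np.array([0.0, 2.0, 0.0]),
    "modi": np.array([2.0, 1.0, 1.0]),
}

def mean_pool(tokens, vectors):
    """Average the vectors of the tokens that have an embedding."""
    found = [vectors[t] for t in tokens if t in vectors]
    if not found:
        # No known token: fall back to a zero vector of the right size.
        dim = len(next(iter(vectors.values())))
        return np.zeros(dim)
    return np.mean(found, axis=0)

doc_vector = mean_pool(["trump", "urges", "modi"], toy_vectors)
print(doc_vector)  # [1. 1. 1.]
```

Averaging discards word order, so two texts with the same words in a different sequence get the same vector.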

In [19]:
# Convert text to a document vector, without removing stop words or punctuation
def text_to_vect(text):
    doc = nlp(text)
    return doc.vector
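
Since `text_to_vect` keeps stop words and punctuation, one possible variant averages only the remaining tokens. The sketch below is an assumption about how that could look, not what the notebook does: `filtered_vector` and the `tok` stand-ins are hypothetical names, while `is_stop`, `is_punct`, `has_vector` and `vector` are attributes that spaCy tokens expose. Stand-in tokens are used so the example runs without loading a model; passing `nlp(text)` instead should work the same way.

```python
from types import SimpleNamespace

import numpy as np

def filtered_vector(tokens, dim=300):
    """Average token vectors, skipping stop words and punctuation.

    `tokens` may be a spaCy Doc or any iterable of objects exposing
    vector, has_vector, is_stop and is_punct.
    """
    kept = [t.vector for t in tokens
            if t.has_vector and not t.is_stop and not t.is_punct]
    if not kept:
        # Nothing left after filtering: return a zero vector.
        return np.zeros(dim, dtype=np.float32)
    return np.mean(kept, axis=0)

def tok(vec, is_stop=False, is_punct=False):
    """Build a hypothetical stand-in for a spaCy token."""
    return SimpleNamespace(vector=np.array(vec, dtype=np.float32),
                           has_vector=True,
                           is_stop=is_stop, is_punct=is_punct)

toy_doc = [
    tok([9.0, 9.0], is_stop=True),   # ignored: stop word
    tok([1.0, 3.0]),
    tok([3.0, 5.0]),
    tok([7.0, 7.0], is_punct=True),  # ignored: punctuation
]
print(filtered_vector(toy_doc, dim=2))  # [2. 4.]
```

Whether filtering helps on this dataset is untested here; it would need to be compared against the unfiltered vectors on the same split.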

In [40]:
# Map each label to an integer code in order of first appearance (here Fake -> 0, Real -> 1)
target_map = {cat: num for num,cat in enumerate(news_df['label'].unique())}
news_df['label_num'] = news_df['label'].map(target_map)
news_df['text_vect'] = news_df['Text'].map(text_to_vect)
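
The label mapping depends on the order in which labels first appear, so decoding predictions with a hard-coded `1 -> 'Real'` rule only holds for this particular file. A small sketch of building the forward and inverse mappings together, using the five labels shown in the `head()` output (`inverse_map` is a hypothetical name, and `dict.fromkeys` stands in for `Series.unique()`, which also preserves first-seen order):

```python
# First five labels from the head() output above.
labels = ["Fake", "Real", "Real", "Fake", "Real"]

# dict.fromkeys keeps first-seen order, like Series.unique().
unique_labels = list(dict.fromkeys(labels))

# Forward map: label -> integer code.
target_map = {cat: num for num, cat in enumerate(unique_labels)}

# Inverse map: integer code -> label, for decoding predictions.
inverse_map = {num: cat for cat, num in target_map.items()}

print(target_map)   # {'Fake': 0, 'Real': 1}
print(inverse_map)  # {0: 'Fake', 1: 'Real'}
```

With an inverse map available, a prediction can be decoded as `inverse_map[pred]` instead of repeating the `'Real' if x == 1 else 'Fake'` condition.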

In [56]:
news_df.head()

Unnamed: 0,Text,label,label_num,label_vect,text_vect
0,Top Trump Surrogate BRUTALLY Stabs Him In The...,Fake,0,"[-0.27634, -1.4136, 0.96135, 1.4931, 3.4228, 0...","[-0.6759837, 1.4263071, -2.318466, -0.451093, ..."
1,U.S. conservative leader optimistic of common ...,Real,1,"[4.6369, 1.4167, 4.419, 3.4824, 0.24608, 2.565...","[-1.8355803, 1.3101058, -2.4919677, 1.0268308,..."
2,"Trump proposes U.S. tax overhaul, stirs concer...",Real,1,"[4.6369, 1.4167, 4.419, 3.4824, 0.24608, 2.565...","[-1.9851209, 0.14389805, -2.4221718, 0.9133005..."
3,Court Forces Ohio To Allow Millions Of Illega...,Fake,0,"[-0.27634, -1.4136, 0.96135, 1.4931, 3.4228, 0...","[-2.7812982, -0.16120885, -1.609772, 1.3624227..."
4,Democrats say Trump agrees to work on immigrat...,Real,1,"[4.6369, 1.4167, 4.419, 3.4824, 0.24608, 2.565...","[-2.2010763, 0.9961637, -2.4088492, 1.128273, ..."


In [52]:
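# Keep X as a Series of 300-dimensional vectors so the original row index
# survives the split; it is stacked into a 2-D array with np.stack at fit time.
X = news_df['text_vect']
y = news_df['label_num']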
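
A column whose cells are arrays is a one-dimensional container of objects, not a feature matrix, which is why `np.stack` is needed before fitting. A minimal sketch with four hypothetical 3-dimensional rows:

```python
import numpy as np

# Four hypothetical row vectors, standing in for the text_vect column.
rows = [
    np.array([0.1, 0.2, 0.3]),
    np.array([0.4, 0.5, 0.6]),
    np.array([0.7, 0.8, 0.9]),
    np.array([1.0, 1.1, 1.2]),
]

# Mimic Series.values for a column of arrays: a 1-D object array.
as_objects = np.empty(len(rows), dtype=object)
for i, row in enumerate(rows):
    as_objects[i] = row
print(as_objects.shape)  # (4,)

# np.stack joins the rows along a new first axis: (n_samples, n_features).
matrix = np.stack(as_objects)
print(matrix.shape)  # (4, 3)
```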

In [53]:
# Split into train and test sets (80/20), stratified so both keep the Fake/Real ratio
from sklearn.model_selection import train_test_split
Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)

In [63]:
Xtrain.shape, Xtest.shape

((7920,), (1980,))
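
With `stratify=y`, each class contributes roughly `test_size` of its rows to the test set. The arithmetic below uses the class counts from `value_counts()` above; simple rounding is an assumption that happens to be exact here, and scikit-learn's allocation may differ by a sample when the proportions do not divide evenly.

```python
# Class counts taken from the value_counts() output above.
class_counts = {"Fake": 5000, "Real": 4900}
test_size = 0.2

# Expected number of test rows per class under a stratified split.
expected_test = {label: round(n * test_size)
                 for label, n in class_counts.items()}

# The remaining rows of each class go to the training set.
expected_train = {label: n - expected_test[label]
                  for label, n in class_counts.items()}

print(expected_test)                 # {'Fake': 1000, 'Real': 980}
print(expected_train)                # {'Fake': 4000, 'Real': 3920}
print(sum(expected_train.values()),
      sum(expected_test.values()))   # 7920 1980
```

The totals agree with the shapes printed above.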

In [68]:
from sklearn.neighbors import KNeighborsClassifier

In [71]:
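# k-nearest neighbours with k=5; n_jobs=-1 uses all available CPU cores
knn = KNeighborsClassifier(n_neighbors=5, n_jobs=-1)
# Stack the per-row vectors into 2-D arrays of shape (n_samples, 300)
knn.fit(np.stack(Xtrain.values), ytrain)
ypred = knn.predict(np.stack(Xtest.values))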
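
With its default settings, `KNeighborsClassifier` measures Euclidean distance and gives each of the k nearest training points an equal vote. The sketch below re-implements that idea in NumPy on hypothetical 2-D points; `knn_predict` is an illustrative helper, not scikit-learn's implementation, and tie-breaking may differ.

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=5):
    """Predict the label of x by majority vote among its k nearest neighbours."""
    # Euclidean distance from x to every training point.
    dists = np.linalg.norm(X_train - x, axis=1)
    # Indices of the k closest training points.
    nearest = np.argsort(dists)[:k]
    # Count the votes per integer label and return the most frequent one.
    votes = np.bincount(y_train[nearest])
    return int(np.argmax(votes))

# Hypothetical 2-D training data: two well-separated clusters.
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]])
y_train = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(X_train, y_train, np.array([0.05, 0.1]), k=3))  # 0
print(knn_predict(X_train, y_train, np.array([1.0, 0.95]), k=3))  # 1
```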

In [72]:
# Importing metrics for evaluation
from sklearn.metrics import classification_report, confusion_matrix

In [73]:
print(classification_report(ytest, ypred))

              precision    recall  f1-score   support

           0       1.00      0.99      0.99      1000
           1       0.99      1.00      0.99       980

    accuracy                           0.99      1980
   macro avg       0.99      0.99      0.99      1980
weighted avg       0.99      0.99      0.99      1980
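
The notebook imports `confusion_matrix` but only prints the classification report. The figures in the report derive from four counts per class, as sketched below on hypothetical labels (these are not the notebook's predictions, and `confusion_counts` is an illustrative helper):

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) treating `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn

# Hypothetical true and predicted labels (0 = Fake, 1 = Real).
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision = tp / (tp + fp)  # share of predicted positives that are correct
recall = tp / (tp + fn)     # share of actual positives that were found
f1 = 2 * precision * recall / (precision + recall)

print(tp, fp, fn, tn)         # 3 1 1 3
print(precision, recall, f1)  # 0.75 0.75 0.75
```

On the real split, `confusion_matrix(ytest, ypred)` would give the same four counts as a 2x2 array, with rows for the true class and columns for the predicted class.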



In [94]:
print(news_df.loc[Xtest.index[0],'Text'][:1000])
print("\nActual: ",news_df.loc[Xtest.index[0],'label'])
print("\nPredicted: ", 'Real' if knn.predict([Xtest.iloc[0]])[0] == 1 else 'Fake')

Trump urges India's Modi to fix deficit, but stresses strong ties WASHINGTON (Reuters) - U.S. President Donald Trump urged Indian Prime Minister Narendra Modi to do more to relax Indian trade barriers on Monday during talks in which both leaders took great pains to stress the importance of a strong U.S.-Indian relationship. At a closely watched first meeting between the two, Trump and Modi appeared to get along well. Modi pulled in Trump for a bear hug on the stage as the cameras rolled in the Rose Garden. “I deeply appreciate your strong commitment to the enhancement of our bilateral relations,” Modi told him. “I am sure that under your leadership a mutually beneficial strategic partnership will gain new strength, new positivity, and will reach new heights.” Trump was also warm but made clear he sees a need for more balance in the U.S.-India trade relationship in keeping with his campaign promise to expand American exports and create more jobs at home. Last year the U.S. trade deficit

In [93]:
print(news_df.loc[Xtest.index[42],'Text'][:1000])
print("\nActual: ",news_df.loc[Xtest.index[42],'label'])
print("\nPredicted: ", 'Real' if knn.predict([Xtest.iloc[42]])[0] == 1 else 'Fake')

 Once Again, Trump Proves How INCREDIBLY Ignorant He Is About The World (DETAILS) President Donald Trump does not know anything about foreign policy. Now, it appears, he knows just as little about trade. This is more troubling as one of his selling points was that he was a successful businessman with a global company. You d think that a person with such a vast empire (there are Trump Towers all over the planet) would know a thing or two about the subject. This idea was blown out of the water this week by comments from a German official about Chancellor Angela Merkel s recent visit to Washington, DC.The official told The Times of London, Ten times Trump asked [German chancellor Angela Merkel] if he could negotiate a trade deal with Germany. Every time she replied,  You can t do a trade deal with Germany, only the EU.  On the eleventh refusal, Trump finally got the message,  Oh, we ll do a deal with Europe then.' For a man who claimed to be able to  make the best deals  for the United St

In [107]:
mynews = """
Elizabeth Warren endorsed Bernie Sanders
"""


In [108]:
print("Prediction: ", 'Real' if knn.predict([text_to_vect(mynews)])[0] == 1 else 'Fake')

Prediction:  Fake
