In this script, we will import data with article titles, convert those titles into meaningful floating point values of TF * IDF (term frequency * inverse document frequency), then create a linear support vector classifier machine learning model to predict which article titles are fake news. 

In [109]:
import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt
from matplotlib import style
style.use("ggplot")
from sklearn import svm
from sklearn.model_selection import train_test_split
import math 
from sklearn.svm import SVC
from sklearn.metrics import classification_report, confusion_matrix

We read in two csvs containing the true articles and the fake news articles, add the target column, and merge them. These csvs were downloaded from Kaggle at https://www.kaggle.com/clmentbisaillon/fake-and-real-news-dataset

In [3]:
true_df = pd.read_csv('./True.csv')
fake_df = pd.read_csv('./Fake.csv')

In [4]:
fake_df['target'] = 'fake'
true_df['target'] = 'true'

In [112]:
fake_df

Unnamed: 0,title,text,subject,date,target
0,Donald Trump Sends Out Embarrassing New Year’...,Donald Trump just couldn t wish all Americans ...,News,"December 31, 2017",fake
1,Drunk Bragging Trump Staffer Started Russian ...,House Intelligence Committee Chairman Devin Nu...,News,"December 31, 2017",fake
2,Sheriff David Clarke Becomes An Internet Joke...,"On Friday, it was revealed that former Milwauk...",News,"December 30, 2017",fake
3,Trump Is So Obsessed He Even Has Obama’s Name...,"On Christmas day, Donald Trump announced that ...",News,"December 29, 2017",fake
4,Pope Francis Just Called Out Donald Trump Dur...,Pope Francis used his annual Christmas Day mes...,News,"December 25, 2017",fake
...,...,...,...,...,...
23476,McPain: John McCain Furious That Iran Treated ...,21st Century Wire says As 21WIRE reported earl...,Middle-east,"January 16, 2016",fake
23477,JUSTICE? Yahoo Settles E-mail Privacy Class-ac...,21st Century Wire says It s a familiar theme. ...,Middle-east,"January 16, 2016",fake
23478,Sunnistan: US and Allied ‘Safe Zone’ Plan to T...,Patrick Henningsen 21st Century WireRemember ...,Middle-east,"January 15, 2016",fake
23479,How to Blow $700 Million: Al Jazeera America F...,21st Century Wire says Al Jazeera America will...,Middle-east,"January 14, 2016",fake


In [5]:
df = true_df.append(fake_df, ignore_index=True)

In order to convert the title into meaningful floating point values that can predict the target, we must extract the most important words out of the title and add features based on those words. We can do this with nltk. We will tokenize the title, or turn it into an array of words. Then we will remove stopwords, like "the" and "a", that are common and don't contribute to the target. Finally, we use porter stemmer to remove differences in varying prefixes or forms of a word, so that "working" and "works" count as the same word. 

In [6]:
import nltk
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from nltk.stem.wordnet import WordNetLemmatizer
from nltk.stem.porter import PorterStemmer

In [7]:
nltk.download('punkt')
nltk.download('stopwords')

[nltk_data] Downloading package punkt to /Users/Seth/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to /Users/Seth/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


True

In [8]:
df['title_token'] = df['title'].apply(word_tokenize) 
df['text_token'] = df['text'].apply(word_tokenize) 

In [9]:
stop = stopwords.words('english')
df['title_stop'] = df['title_token'].apply(lambda x: [item for item in x if item not in stop])
df['text_stop'] = df['text_token'].apply(lambda x: [item for item in x if item not in stop])

In [10]:
porter = PorterStemmer()
def stem_sentences(tokens):
    stems = []
    for t in tokens:
        t = t.lower()
        if t.isupper() or t.islower():
            stems.append(porter.stem(t))
    return stems

df['title_stemmed'] = df['title_stop'].apply(stem_sentences)
df['text_stemmed'] = df['text_stop'].apply(stem_sentences)

Next, we will create two dictionaries: 
allWords is a dictionary with keys of every word that occurs in any title, and the value is the number of times it occurred.
wordFreq is a dictionary with keys of every word that occurs in any title, and the value is the number of documents it occurred in. 
We will then reduce wordFreq to the top 

In [89]:
allWords = {}

wordFreq = {}

def BagOfWords(tokens):
    for t in tokens:
        if t not in allWords:
            allWords[t] = 1
        else:
            allWords[t] += 1
    

def perDoc(tokens):
    inDoc = False
    for t in tokens:
        if t in wordFreq:
            if inDoc == False:
                inDoc == True
                wordFreq[t] += 1
        else:
            inDoc = True
            wordFreq[t] = 1
        
    
        
df['title_stemmed'].apply(BagOfWords)    
df['title_stemmed'].apply(perDoc)    

0        None
1        None
2        None
3        None
4        None
         ... 
44893    None
44894    None
44895    None
44896    None
44897    None
Name: title_stemmed, Length: 44898, dtype: object

We then sort allWords, convert it back to a dictionary under the name "test_dict" and then only take the 300 most common words. Then we reduce wordFreq to those same 300 words and save it under the new name "numDocs"

In [33]:
test_arr = sorted(allWords.items(), key=lambda x: x[1], reverse=True)
test_dict = {}
for i in test_arr:
    test_dict[i[0]] = i[1]

In [35]:
import itertools  
test_dict = dict(itertools.islice(test_dict.items(), 300))  

In [93]:
numDocs = {}
for key in test_dict:
    numDocs[key] = wordFreq[key]

Then, we create a copy of our dataframe and add new columns for each of the 300 words. we apply TF which calculates term frequency for every row, for every word column. Next, we will apply IDF and multiply our values out for each word column. Finally, we have the TF * IDF value for every word for each title. Now we're ready to create our machine learning model.

In [24]:
new_df = df

In [38]:
for key in test_dict:
    new_df[key] = 0

In [63]:
def TF(row, keyWord):
    
    doc = row['title_stemmed']
    
    freq = 0
    for word in doc:
        if word == keyWord:
            freq += 1

    numWords = len(doc)
    tf = freq/numWords
    
    return tf

In [66]:
for key in test_dict:
     new_df[key] = new_df.apply(TF, keyWord=key, axis=1)


['as', 'u.s.', 'budget', 'fight', 'loom', 'republican', 'flip', 'fiscal', 'script']
['u.s.', 'militari', 'accept', 'transgend', 'recruit', 'monday', 'pentagon']
['senior', 'u.s.', 'republican', 'senat', "'let", 'mr.', 'mueller', 'job']
['fbi', 'russia', 'probe', 'help', 'australian', 'diplomat', 'tip-off', 'nyt']
['trump', 'want', 'postal', 'servic', 'charg', "'much", 'amazon', 'shipment']
['white', 'hous', 'congress', 'prepar', 'talk', 'spend', 'immigr']
['trump', 'say', 'russia', 'probe', 'fair', 'timelin', 'unclear', 'nyt']
['factbox', 'trump', 'twitter', 'dec', 'approv', 'rate', 'amazon']
['trump', 'twitter', 'dec', 'global', 'warm']
['alabama', 'offici', 'certifi', 'senator-elect', 'jone', 'today', 'despit', 'challeng', 'cnn']
['jone', 'certifi', 'u.s.', 'senat', 'winner', 'despit', 'moor', 'challeng']
['new', 'york', 'governor', 'question', 'constitution', 'feder', 'tax', 'overhaul']
['factbox', 'trump', 'twitter', 'dec', 'vaniti', 'fair', 'hillari', 'clinton']
['trump', 'twitter

['homeland', 'secur', 'nomine', 'say', 'need', 'full', 'u.s.-mexico', 'border', 'wall']
['democrat', 'win', 'bitter', 'virginia', 'governor', "'s", 'race', 'setback', 'trump']
['republican', 'tax', 'bill', 'would', 'add', 'trillion', 'deficit', 'cbo']
['u.s.', 'senat', 'still', 'track', 'releas', 'tax', 'bill', 'thursday', 'aid']
['china', "'s", 'xi', 'fete', 'trump', 'first', 'day', 'beij', 'trip']
['u.s.', 'lawmak', 'introduc', 'bipartisan', 'bill', 'foreign', 'invest', 'amid', 'china', 'worri']
['congress', 'final', 'billion', 'defens', 'spend', 'plan']
['ryan', 'open', 'door', 'later', 'corpor', 'tax', 'rate', 'cut', 'fox', 'news']
['trump', 'warn', "'rogu", 'regim', 'north', 'korea', 'grave', 'danger']
['commerc', 'secretari', 'say', 'trump-xi', 'talk', 'address', 'trade', 'imbal']
['busi', 'group', 'push', 'u.s.', 'flood', 'insur', 'reform', 'decemb', 'deadlin', 'loom']
['white', 'hous', 'condemn', 'missil', 'attack', 'saudi', 'yemen', "'s", 'houthi']
['iran', 'call', 'houthi', '

['trump', 'say', 'wealthi', 'might', 'accept', 'higher', 'tax', 'tax', 'plan']
['trump', 'want', 'work', 'schumer', 'pelosi', 'tax', 'reform', 'white', 'hous']
['trump', 'applaud', 'senat', 'new', 'healthcar', 'reform', 'bill']
['former', 'u.', 'senat', 'pete', 'domenici', 'heavyweight', 'budget', 'energi', 'issu', 'die']
['former', 'trump', 'advis', 'flynn', "'s", 'son', 'probe', 'russia', 'nbc']
['trump', 'administr', 'su', 'phone', 'search', 'u.s.', 'border']
['republican', 'senat', 'cruz', 'push', 'percent', 'corpor', 'tax', 'rate']
['ryan', 'urg', 'daca', 'fix', 'broader', 'immigr', 'reform', 'ap']
['hous', 'speaker', 'ryan', 'leav', 'door', 'open', 'tax', 'deficit']
['factbox', 'trump', 'twitter', 'sept', 'tax', 'reform', 'florida']
['democrat', 'leader', 'schumer', 'pelosi', 'dine', 'trump', 'wednesday']
['tax', 'reform', 'framework', 'appear', 'end-sept.', 'congress', 'sourc']
['ryan', 'say', 'tax', 'reform', 'outlin', 'reflect', 'consensu', 'lawmak', 'administr']
['suprem', 'c

['anthem', 'cut', 'back', 'obamacar', 'plan', 'offer', 'california']
['wray', 'confirm', 'senat', 'lead', 'fbi', 'comey', 'fire']
['tillerson', 'tighten', 'rein', 'state', 'depart']
['democrat', 'want', 'grill', 'well', 'fargo', 'execut', 'auto', 'insur']
['not', 'hallway', 'text', 'committe', 'warn', 'u.s.', 'hous', 'campaign']
['trump', 'fire', 'commun', 'director', 'scaramucci', 'new', 'white', 'hous', 'upheav']
['factbox', 'scaramucci', 'latest', 'leav', 'trump', 'administr']
['trump', 'extol', 'corpor', 'profit', 'seek', 'corpor', 'tax', 'cut']
['senat', 'democrat', 'ron', 'wyden', 'demand', 'justic', 'dept', 'releas', 'new', 'crime', 'reduct', 'polici']
['trump', "'weigh", 'son', "'s", 'russia', 'attorney', 'statement', 'white', 'hous']
['mcconnel', 'say', 'u.s.', 'debt', 'ceil', 'debat', 'could', 'stretch', 'septemb']
['illinoi', 'governor', 'reject', 'school', 'fund', 'legisl']
['senat', 'plan', 'septemb', 'hear', 'health', 'insur', 'market']
['trump', 'administr', 'seek', 'pro

['kremlin', 'deni', 'accus', 'russia', 'tri', 'hack', 'u.s.', 'vote', 'equip']
['top', 'u.s.', 'diplomat', 'china', 'quit', 'trump', 'climat', 'polici']
['white', 'hous', 'want', 'healthcar', 'vote', 'summer', 'tax', 'reform', 'fall']
['trump', 'renew', 'critic', 'london', 'mayor', 'khan', 'attack']
['kellyann', 'conway', "'s", 'husband', 'say', 'trump', 'tweet', 'hurt', 'legal', 'case']
['trump', 'lay', 'plan', 'privat', 'air', 'traffic', 'control', 'system']
['trump', 'block', 'ex-fbi', 'chief', 'comey', "'s", 'testimoni', 'white', 'hous']
['u.s.', 'embassi', 'egypt', 'ban', 'personnel', 'visit', 'religi', 'site', 'outsid', 'cairo']
['panama', "'s", 'presid', 'visit', 'trump', 'white', 'hous', 'june']
['u.s.', 'plan', 'updat', 'self-driv', 'guidelin', 'come', 'month']
['no', 'rehear', 'ex-illinoi', 'governor', 'blagojevich', 'appeal', 'plan']
['trump', 'complic', 'travel', 'ban', 'case', 'grumbl', 'justic', 'depart']
['u.s.', 'top', 'court', 'fault', 'north', 'carolina', 'vote', 'rig

['healthcar', 'bill', 'pull', 'hous', 'republican', 'leadership', 'aid']
['tillerson', 'meet', 'nato', 'march', 'end', 'no-show', 'furor']
['investor', 'buoy', 'trump', "'s", 'readi', 'move', 'obamacar']
['hous', 'intellig', 'panel', 'leader', 'split', 'session', 'russia']
['virginia', 'court', 'rule', 'trump', 'travel', 'ban', 'disput', 'order', 'still', 'halt']
['hous', 'speaker', 'ryan', 'brief', 'trump', 'healthcar', 'bill', 'vote', 'white', 'hous']
['hous', 'speaker', 'tell', 'trump', 'healthcar', 'bill', 'lack', 'vote', 'cnn']
['hous', 'appropri', 'chairman', 'oppos', 'healthcar', 'bill']
['hous', 'speaker', 'visit', 'white', 'hous', 'ahead', 'healthcar', 'vote']
['top', 'senat', 'democrat', 'promis', 'fight', 'block', 'trump', 'high', 'court', 'pick']
['mnuchin', 'say', 'u.s.', 'growth', 'prospect', 'fulli', 'reflect', 'market']
['still', 'short', 'healthcar', 'vote', 'republican', 'whip', 'team', 'member']
['hous', 'clear', 'way', 'debat', 'trump-back', 'healthcar', 'bill']
['p

['skeptic', 'trump', "'s", "'wall", 'cost', 'simmer', 'among', 'democrat', 'border', 'republican']
['appeal', 'court', 'weigh', 'rehear', 'trump', 'travel', 'ban']
['in', 'trump', 'travel', 'ban', 'fight', 'justic', 'kennedi', "'s", 'opinion', 'loom', 'larg']
['senat', 'question', 'goldman', 'sach', 'role', 'trump', 'bank', 'polici']
['japan', "'s", 'love', 'tini', 'car', 'sore', 'spot', 'trump', 'abe', 'meet']
['eu', "'s", 'mogherini', 'u.s.', 'say', 'fulli', 'implement', 'iran', 'nuclear', 'deal']
['with', 'eye', 'obamacar', 'price', 'take', 'helm', 'u.s.', 'health', 'secretari']
['trump', 'say', 'administr', 'commit', 'japan', "'s", 'secur']
['trump', 'nix', 'abram', 'no', 'state', 'depart', 'job', 'sourc']
['lebeouf', "'s", 'anti-trump', 'new', 'york', 'exhibit', 'shutter', 'safeti', 'concern']
['ethic', 'review', 'sought', 'trump', 'advis', 'tout', 'ivanka', 'trump', 'product']
['trump', 'say', 'israel', "'reason", 'peac', 'process']
['under', 'armour-sponsor', 'athlet', 'oppos', 

['senat', 'call', 'probe', 'cyber', 'attack', 'russia']
['senat', 'call', 'panel', 'investig', 'russian', 'hack']
['under', 'threat', 'washington', 'first', 'ladi', "'s", 'food', 'legaci', 'may', 'live', 'elsewher']
['with', 'water', 'cannon', 'southern', 'bell', 'trump', 'end', "'thank", 'tour']
['trump', 'build', 'team', 'boss', 'shake', 'washington']
['trump', 'say', 'pick', 'mulvaney', 'white', 'hous', 'budget', 'director']
['obama', 'say', 'china', 'would', 'take', 'chang', 'u.s.', 'polici', 'taiwan', 'lightli']
['u.s.', 'disclos', 'estim', 'number', 'american', 'surveil']
['north', 'carolina', 'lawmak', 'pass', 'curb', 'incom', 'democrat', 'governor']
['obama', 'point', 'finger', 'putin', 'hack', 'u.s.', 'elect']
['obama', 'say', "'great", 'confid', 'russia', 'behind', 'hack']
['factbox', 'contend', 'senior', 'job', 'trump', "'s", 'administr']
['factbox', 'trump', 'fill', 'top', 'job', 'administr']
['trump', 'pick', 'rep.', 'mulvaney', 'head', 'white', 'hous', 'budget', 'offic']


['clinton', 'warn', 'u.s.', 'would', "'ring", 'china', 'missil', 'defens', 'hack', 'email']
['trump', "n't", 'su', 'newspap', 'libel', 'decad', 'record', 'show']
['trump', 'accus', 'mexico', "'s", 'carlo', 'slim', 'tri', 'help', 'clinton']
['factbox', 'mexican', 'tycoon', 'carlo', 'slim', 'u.s.', 'elect']
['new', 'hampshir', 'senat', 'race', 'tight', 'despit', 'trump', 'focu']
['u.s.', 'polici', 'chang', 'cuba', 'tough', 'undo', 'offici']
['dozen', 'u.s.', 'lawmak', 'request', 'brief', 'yahoo', 'email', 'scan']
['djokov', 'hope', 'kyrgio', 'stop', 'squander', 'gift']
['exclus', 'obama', 'aid', 'expect', 'weigh', 'syria', 'militari', 'option', 'friday']
['obama', 'plan', 'overhaul', 'child', 'support', 'payment', 'rule', 'prison']
['thai', 'king', "'s", 'death', 'add', 'uncertainti', 'obama', "'s", 'falter', 'asia', 'pivot']
['canadian', 'court', 'rule', 'trump', 'face', 'claim', 'toronto', 'tower', 'case']
['u.s.', 'conserv', 'group', 'back', 'republican', 'favor', 'clean', 'energi']
[

['as', 'racial', 'polit', 'loom', 'elect', 'obama', "'s", 'legaci', 'seen', 'mix']
['obama', "'s", 'judg', 'leav', 'liber', 'imprint', 'u.s.', 'law']
['white', 'hous', 'propos', 'rule', 'welcom', 'immigr', 'entrepreneur']
['former', 'bush', 'advis', 'wolfowitz', 'vote', 'clinton', 'spiegel']
['obama', 'expand', 'hawaii', 'marin', 'reserv', 'final', 'push', 'climat']
['hillari', 'clinton', 'say', 'famili', "'s", 'foundat', 'look', 'partner']
['factbox', 'case', 'obama', 'judg', 'appeal', 'court', 'left', 'mark']
['judg', 'order', 'search', 'new', 'clinton', 'email', 'releas', 'septemb']
['white', 'hous', 'meet', 'clinton', 'trump', 'team', 'discuss', 'transit']
['clinton', 'trump', 'clash', 'best', 'u.s.', 'minor']
['mylan', 'offer', 'discount', 'epipen', 'amid', 'wave', 'critic']
['obama', "'s", 'tpp', 'deal', 'wo', "n't", 'get', 'senat', 'vote', 'year', 'mcconnel']
['hispan', 'coalit', 'ask', 'trump', 'stop', "'attack"]
['clinton', 'ramp', 'asian', 'outreach', 'three', 'close', 'fough

['senat', 'bank', 'committe', 'approv', 'two', 'sec', 'nomine']
['aclu', 'threaten', 'cleveland', 'delay', 'convent', 'protest', 'permit']
['trump', 'tap', 'advis', 'manafort', 'campaign', 'chairman', 'cnn']
['hillari', 'clinton', 'say', 'donald', 'trump', 'unqualifi', 'presid']
['top', 'democrat', 'senat', 'probe', 'swift', 'ny', 'fed', 'bangladesh', 'heist']
['clinton', 'call', 'world', 'break', 'barrier', 'hold', 'back', 'girl']
['uaw', 'chief', 'say', 'union', 'endors', 'either', 'clinton', 'sander', "'soon"]
['senat', 'introduc', 'bill', 'block', 'expans', 'fbi', 'hack', 'author']
['hous', 'lift', 'block', 'google-host', 'app', 'yahoo', 'mail', 'remain', 'blacklist']
['guantanamo', 'right', 'issu', 'defens', 'bill', 'divid', 'congress']
['hous', 'approv', 'million', 'combat', 'zika', 'viru']
['facebook', "'s", 'zuckerberg', 'meet', 'u.s.', 'conserv', 'bia', 'controversi']
['democrat', 'chide', 'sander', 'nevada', 'mutini']
['hacker', 'target', 'presidenti', 'campaign', 'u.s.', 'sp

['hous', 'ethic', 'panel', 'investig', 'democrat', 'corrin', 'brown']
['starbuck', 'univis', 'democraci', 'work', 'aim', 'boost', 'u.s.', 'voter', 'turnout']
['oppon', 'cruz', 'white', 'hous', 'bid', 'take', 'case', 'n.y.', 'appeal', 'panel']
['pennsylvania', 'budget', 'go', 'effect', 'end', 'long', 'stalem']
['labor', 'depart', 'unveil', 'controversi', 'union', "'persuad", 'rule']
['obama', 'argentina', "'s", 'macri', 'discuss', 'brazil', "'s", 'polit', 'crisi']
['macri', 'vow', 'matur', 'relationship', 'u.']
['lawmak', 'say', 'nsa', 'plan', 'expand', 'share', 'data', 'unconstitut']
['trump', 'clinton', 'win', 'big', 'arizona', 'cruz', 'sander', 'show', 'fight']
['trump', 'say', 'muslim', 'enough', 'prevent', 'attack']
['trump', "'plain", 'wrong', 'say', 'muslim', 'help', 'extremist', 'uk', 'minist']
['jeb', 'bush', 'endors', 'ted', 'cruz', 'republican', 'nomin']
['for', "'apprentic", 'insid', 'trump', "'s", 'bid', 'echo', 'realiti', 'tv']
['republican', 'critic', 'partisan', 'divid',

['trump', 'threaten', 'cut', 'aid', 'u.n.', 'member', 'jerusalem', 'vote']
['arab', 'coalit', 'say', 'keep', 'yemen', 'port', 'open', 'air', 'raid', 'kill', 'least', 'nine']
['uk', "'s", 'may', 'visit', 'china', 'around', 'jan.', 'sky', 'news']
['south', 'africa', "'s", 'anc', 'call', 'nation', 'central', 'bank', 'land', 'expropri']
['congo', 'uganda', 'launch', 'joint', 'oper', 'rebel', 'adf']
['ugandan', 'parliament', 'pass', 'law', 'allow', 'museveni', 'seek', 're-elect']
['canada', "'s", 'trudeau', 'broke', 'ethic', 'rule', 'visit', 'aga', 'khan', 'island']
['ugandan', 'parliament', 'pass', 'law', 'allow', 'museveni', 'seek', 're-elect']
['divid', 'catalan', 'prepar', 'vote', 'close-run', 'elect']
['six', 'bodi', 'hung', 'bridg', 'near', 'mexico', "'s", 'lo', 'cabo', 'tourist', 'resort']
['burst', 'tire', 'may', 'caus', 'deadli', 'tourist', 'bu', 'crash', 'mexico', '-polic']
['exclus', 'cameroonian', 'troop', 'enter', 'nigeria', 'without', 'seek', 'author', 'sourc', 'nigeria', 'say

['germani', "'s", 'far-right', 'afd', 'choos', 'nationalist', 'co-lead']
['franc', "'s", 'macron', 'call', 'iraq', 'dismantl', 'militia']
['romanian', 'protest', 'halt', 'build', 'xma', 'fair', 'protest', 'site']
['brazil', 'environmentalist', 'marina', 'silva', 'run', 'presid']
['peru', "'s", 'busi', 'commun', 'sour', 'kuczynski', 'survey']
['egyptian', 'ex-pm', 'shafik', 'arriv', 'cairo', 'say', 'airport', 'sourc']
['middl', 'east', 'leader', 'paint', "'dark", 'pictur', 'rome', 'confer']
['yemen', "'s", 'saleh', 'say', 'readi', "'new", 'page', 'saudi-l', 'coalit']
['uae', 'say', 'egyptian', 'ex-premi', 'shafik', 'left', 'egypt', 'famili', 'still', 'uae']
['matti', 'eye', 'move', 'away', 'arm', 'syrian', 'kurdish', 'fighter']
['u.n.', 'council', 'meet', 'north', 'korea', 'right', 'abus', 'nuclear', 'program', 'decemb']
['turkey', "'s", 'erdogan', 'say', 'u.s.', 'court', 'put', 'turkey', 'trial']
['suspect', 'boko', 'haram', 'suicid', 'bomber', 'kill', 'least', 'nigeria', 'offici']
['p

['lebanon', "'s", 'presid', 'welcom', 'hariri', "'s", 'plan', 'return']
['eu', 'sign', 'defens', 'pact', 'decades-long', 'quest']
['freeport', 'indonesia', 'reopen', 'mine', 'access', 'shoot']
['franc', 'pay', 'tribut', 'pari', 'dead', 'two', 'year']
['green', 'hold', 'climat', 'german', 'coalit', 'talk']
['iran', 'say', 'interfer', 'lebanes', 'state', 'affair', 'tv']
['trump', 'vaunt', 'trade', 'progress', 'red', 'carpet', "'fruit", 'asia', 'trip']
['weaker', 'ever', 'may', 'face', 'test', 'uk', 'parliament', 'brexit', 'plan']
['uk', 'minist', 'offer', 'parliament', 'new', 'vote', 'brexit', 'deal']
['london', 'author', 'fail', 'peopl', 'displac', 'deadli', 'fire', 'lawmak']
['activist', 'appeal', 'new', 'south', 'african', 'nuclear', 'plant', 'decis']
['tanzania', 'investig', 'ex-minist', 'day', 'join', 'opposit']
['netanyahu', 'signal', 'israel', 'act', 'free', 'hand', 'syria']
['u.s.', 'launch', 'media', 'fund', 'hungari', 'aid', 'press', 'freedom']
['spain', 'see', 'russian', 'inte

['south', 'korea', 'china', 'nuclear', 'envoy', 'meet', 'beij', 'south', 'korea', 'govern']
['png', 'say', 'australia', 'respons', 'hundr', 'asylum', 'seeker', 'detent', 'camp', 'close']
['at', 'least', 'dead', 'islamist', 'attack', 'somali', 'hotel']
['u.s.', 'envoy', 'haley', "'s", 'blunt', 'diplomaci', 'target', 'south', 'sudan', 'congo']
['macedonia', "'s", 'opposit', 'reject', 'result', 'municip', 'vote']
['greek', 'govern', 'spat', 'spain', "'s", 'ambassador', 'catalonia']
['liberia', "'s", 'rule', 'parti', 'back', 'challeng', 'presidenti', 'result']
['russian', 'helicopt', 'miss', 'norway', 'found', 'rescu', 'center']
['franc', 'arrest', 'brother', 'ex-burkina', 'presid', 'compaor']
['brazil', "'s", 'lula', 'bolsonaro', 'well', 'posit', 'elect', 'poll']
['merkel', 'parti', 'leader', 'meet', 'rev', 'german', 'coalit', 'talk', 'media']
['philippin', 'dutert', 'say', 'deal', 'trump', "'most", 'righteou', 'way']
['hundr', 'ralli', 'franc', 'protest', 'sexual', 'harass', 'weinstein',

['soccer', 'star', 'vp', 'maintain', 'earli', 'lead', 'liberia', 'elect']
['pentagon', 'identifi', 'new', 'area', 'pressur', 'iran', 'review', 'plan']
['iraq', "'s", 'kurd', 'beef', 'move', 'back', 'defens', 'line', 'around', 'oil-rich', 'kirkuk']
['brazil', 'suprem', 'court', 'block', 'extradit', 'italian', 'leftist', 'ex-guerilla', 'battisti']
['despit', 'death', 'german', 'militari', 'eye', 'recruit', 'bump', 'new', 'realiti', 'show']
['canada', "'s", 'trudeau', 'call', 'treatment', 'women', 'mexico', "'unaccept"]
['who', 'say', 'attack', 'syria', 'vaccin', 'store', 'leav', 'children', 'risk']
['rouhani', 'say', 'iran', 'stay', 'nuclear', 'deal', 'serv', 'interest', 'tv']
['matti', 'say', 'u.s.', 'work', 'ensur', 'situat', 'around', 'kirkuk', 'escal']
['bid', "'fix", 'iran', 'nuclear', 'deal', 'face', 'uphil', 'climb', 'u.s.', 'congress']
['iran', 'eu', 'russia', 'defend', 'nuclear', 'deal', 'trump', 'threat']
['russia', 'question', 'futur', 'syria', 'chemic', 'weapon', 'inquiri']
[

['u.s.', 'air', 'strike', 'kill', "'sever", 'islam', 'state', 'milit', 'libya']
['turkey', 'rais', 'oil', 'threat', 'iraqi', 'kurd', 'back', 'independ']
['turkey', "'s", 'erdogan', 'call', 'iraqi', 'kurdish', 'referendum', 'illegitim']
['islam', 'state', "'s", 'baghdadi', 'undat', 'audio', 'urg', 'milit', 'keep', 'fight']
['nigeria', 'hold', 'presidenti', 'parliamentari', 'elect', 'feb.']
['senat', 'urg', 'trump', 'administr', 'act', 'myanmar', 'rohingya']
['u.s.', 'will', 'ask', 'facilit', 'talk', 'kurd', 'baghdad', 'state', 'depart']
['myanmar', 'violenc', 'could', 'spread', 'displac', 'rohingya', 'u.n.', 'chief']
['u.s.', 'say', 'countri', 'suspend', 'provid', 'weapon', 'myanmar']
['fincantieri', 'naval', 'group', 'may', 'exchang', 'stake', 'futur', 'militari', 'allianc']
['u.n.', 'offer', 'help', 'resolv', 'baghdad', 'kurdistan', 'region', 'crisi', 'iraq', 'foreign', 'ministri']
['pakistan', 'ministri', 'seek', 'ban', 'new', 'parti', 'back', 'promin', 'islamist']
['merkel', "'s", '

['hungarian', 'pm', 'orban', 'say', 'fight', 'eu', 'rule', 'migrant', 'quota']
['minor', 'new', 'zealand', 'parti', 'focu', 'hotli', 'contest', 'elect', 'get', 'tighter']
['bahrain', 'reject', 'amnesti', 'report', 'cite', 'crackdown', 'dissent']
['india', 'bar', "'unruli", 'passeng', 'fli', 'three', 'month', 'two', 'year']
['russia', 'say', 'air', 'strike', 'kill', 'sever', 'top', 'islam', 'state', 'command', 'syria']
['sweep', 'chang', 'china', "'s", 'militari', 'point', 'firepow', 'xi']
['taiwan', "'s", 'new', 'premier', 'vow', "'build", 'countri', 'scrap', 'invest', 'hurdl']
['most', 'south', 'korean', 'doubt', 'north', 'start', 'war', 'poll']
['pope', 'franci', 'bless', 'colombia', "'s", 'war', 'victim']
['singapor', 'man', 'woman', 'arrest', "'terrorism-rel", 'activ']
['rohingya', 'say', 'villag', 'lost', 'myanmar', "'s", 'spiral', 'conflict']
['hurrican', 'storm', 'surg', 'warn', 'issu', 'florida', 'ahead', 'irma', 'nhc']
['hurrican', 'irma', 'kill', 'five', 'sweep', 'island', 's

['short', 'film', 'envis', 'effect', 'global', 'warm', 'will', 'have', 'on', 'new', 'york', 'with', 'terrifi', 'accuraci', 'video']
['trump', 'boast', 'about', 'open', 'of', 'first', 'new', 'coal', 'mine', 'of', 'hi', 'presid', 'tweet']
['christian', 'fb', 'page', 'threaten', 'to', 'ban', 'peopl', 'who', 'use', 'rainbow', 'emoji', 'instantli', 'backfir']
['how', 'the', 'gop', 'becam', 'the', 'parti', 'of', 'white', 'supremacist', 'under', 'donald', 'trump']
['sean', 'spicer', 'bomb', 'press', 'brief', 'fail', 'to', 'defend', 'trump', 'over', 'comey', 'tape', 'video']
['thi', 'histor', 'lawsuit', 'may', 'final', 'expos', 'trump', 'tax', 'return']
['dem', 'senat', 'mock', 'trump', 'ridicul', 'cabinet', 'meet', 'with', 'hilari', 'video']
['watch', 'trump', 'declar', 'himself', 'one', 'of', 'the', 'best', 'presid', 'in', 'histori', 'with', 'histor', 'low', 'approv', 'rate']
['even', 'the', 'secret', 'servic', 'is', 'say', 'trump', 'is', 'full', 'of', 'sh', 'there', 'are', 'no', 'tape']
['t

['watch', 'trump', 'assum', 'black', 'report', 'can', 'set', 'up', 'a', 'meet', 'with', 'black', 'lawmak', 'for', 'him']
['trump', 'think', 'rush', 'limbaugh', 'is', 'real', 'news', 'and', 'the', 'internet', 'mercilessli', 'respond']
['watch', 'fox', 'news', 'host', 'defend', 'cnn', 'report', 'tell', 'trump', 'to', 'go', 'f', 'ck', 'himself']
['gop', 'senat', 'desper', 'worri', 'about', 'trump', 'mental', 'stabil', 'text', 'cnn', 'dure', 'press', 'confer', 'video']
['watch', 'nbc', 'report', 'call', 'trump', 'out', 'for', 'lie', 'repeatedli', 'dure', 'live', 'press', 'confer']
['tapper', 'slam', 'trump', 'for', 'bizarr', 'behavior', 'at', 'wild', 'and', 'unhing', 'presidenti', 'presser', 'video']
['lol', 'putin', 'is', 'angri', 'now', 'becaus', 'trump', 'get', 'more', 'media', 'coverag', 'in', 'russia', 'than', 'he', 'doe']
['watch', 'cnn', 'host', 'fare', 'zakaria', 'roast', 'trump', 'for', 'embarrass', 'press', 'confer', 'with', 'netanyahu']
['promin', 'psychiatrist', 'give', 'one', 

['trump', 'tri', 'to', 'throw', 'michel', 'obama', 'under', 'the', 'bu', 'fall', 'flat', 'on', 'hi', 'face', 'instead', 'video']
['salma', 'hayek', 'turn', 'donald', 'trump', 'down', 'for', 'a', 'date', 'and', 'hi', 'reveng', 'wa', 'classic', 'trump']
['the', 'daili', 'caller', 'edit', 'woman', 'tragic', 'pregnanc', 'stori', 'in', 'the', 'sickest', 'most', 'twist', 'way', 'imagin']
['hillari', 'respons', 'to', 'trump', 'tweet', 'he', 'won', 'the', 'debat', 'is', 'pure', 'comedi', 'geniu', 'tweet']
['kkk', 'poster', 'boy', 'david', 'duke', 'qualifi', 'for', 'louisiana', 'senat', 'debat', 'be', 'held', 'at', 'black', 'univers']
['trump', 'declar', 'that', 'he', 'unanim', 'won', 'the', 'third', 'debat', 'and', 'the', 'internet', 'laugh', 'at', 'him']
['former', 'rnc', 'chair', 'michael', 'steel', 'refus', 'to', 'vote', 'for', 'donald', 'trump']
['watch', 'weird', 'al', 'yankov', 'reliv', 'the', 'third', 'debat', 'and', 'it', 'beyond', 'compar']
['gop', 'lawmak', 'a', 'ladi', 'need', 'to',

['trump', 'million', 'loan', 'is', 'the', 'latest', 'way', 'he', 'fleec', 'hi', 'campaign', 'for', 'money']
['accord', 'to', 'employe', 'trump', 'often', 'spi', 'on', 'mar-a-lago', 'guest', 'phone', 'call']
['watch', 'canadian', 'parliament', 'teach', 'republican', 'how', 'presid', 'obama', 'should', 'be', 'treat']
['foreign', 'leader', 'to', 'team', 'trump', 'quit', 'beg', 'us', 'for', 'money', 'for', 'your', 'repugn', 'campaign', 'tweet']
['nativ', 'american', 'woman', 'hit', 'trump', 'make', 'america', 'great', 'again', 'right', 'where', 'it', 'realli', 'hurt']
['how', 'fox', 'news', 'broke', 'elect', 'law', 'and', 'it', 'help', 'trump']
['obama', 'expos', 'trump', 'for', 'the', 'world', 'to', 'see', 'and', 'it', 'onli', 'took', 'a', 'few', 'second', 'video']
['watch', 'the', 'nra', 'just', 'got', 'caught', 'desecr', 'a', 'nation', 'cemeteri', 'for', 'polit', 'gain']
['white', 'mom', 'furiou', 'that', 'black', 'women', 'call', 'cop', 'after', 'she', 'left', 'her', 'kid', 'in', 'a', 

['with', 'zero', 'liabil', 'a', 'gun', 'wa', 'just', 'made', 'to', 'look', 'like', 'thi', 'toy', 'imag']
['christian', 'author', 'rant', 'that', 'women', 'who', 'masturb', 'are', 'go', 'to', 'hell']
['donald', 'trump', 'shame', 'kid', 'for', 'not', 'regist', 'to', 'vote', 'for', 'him', 'they', 'feel', 'veri', 'veri', 'guilti']
['god', 'gave', 'me', 'illeg', 'stock', 'share', 'at', 'dairi', 'queen', 'fair', 'squar', 'claim', 'texa', 'ag', 'accus', 'of', 'fraud']
['meet', 'zari', 'the', 'first', 'feminist', 'muppet', 'from', 'afghanistan']
['fume', 'trump', 'call', 'hi', 'colorado', 'loss', 'the', 'biggest', 'stori', 'in', 'polit', 'after', 'cruz', 'win', 'all', 'deleg', 'tweet']
['joel', 'mchale', 'perform', 'in', 'bigot', 'north', 'carolina', 'but', 'then', 'did', 'someth', 'extraordinari', 'imag']
['rock', 'legend', 'bryan', 'adam', 'tell', 'bigot', 'mississippi', 'republican', 'to', 'go', 'f', 'ck', 'themselv']
['john', 'kasich', 'wa', 'just', 'embarrass', 'by', 'republican', 'lawmak

['watch', 'the', 'unedit', 'video', 'of', 'oregon', 'militiaman', 'lavoy', 'tarp', 'man', 'finicum', 'be', 'shot', 'by', 'law', 'enforc']
['saudi', 'princ', 'epic', 'burn', 'donald', 'trump', 'rememb', 'those', 'two', 'time', 'i', 'bail', 'you', 'out', 'financi']
['watch', 'ted', 'cruz', 'open', 'the', 'gop', 'debat', 'with', 'a', 'wick', 'burn', 'on', 'donald', 'trump']
['snl', 'rachel', 'dratch', 'expos', 'the', 'terrifi', 'lunaci', 'of', 'open', 'carri', 'law', 'in', 'less', 'than', 'minut', 'video']
['republican', 'committe', 'pass', 'bill', 'that', 'would', 'prohibit', 'almost', 'all', 'abort', 'in', 'florida']
['whoopi', 'goldberg', 'open', 'up', 'a', 'can', 'of', 'whoop-a', 'on', 'trump', 'video']
['michigan', 'silent', 'gave', 'clean', 'water', 'to', 'state', 'employe', 'for', 'month', 'befor', 'flint', 'crisi', 'broke']
['thi', 'chart', 'captur', 'everi', 'sexist', 'slur', 'trump', 'support', 'tweet', 'at', 'megyn', 'kelli', 'imag']
['desper', 'a', 'hole', 'jeb', 'bush', 'expl

['new', 'video…antifa', 'terror', 'group', 'infiltrated…transgend', 'leader', 'use', 'knive', 'to', 'stab', 'opponents…hav', 'ak-47', 'readi', 'to', 'shut', 'down', 'free', 'speech']
['jare', 'kushner', 'never', 'regist', 'to', 'vote', 'as', 'a', 'femal', '…media', 'lie', 'backfires…shin', 'bright', 'light', 'on', 'how', 'easili', 'voter', 'fraud', 'can', 'occur']
['new', 'nra', 'ad', 'featur', 'former', 'navi', 'seal', 'speak', 'out', 'on', 'nfl', 'anthem', 'protest', 'i', 'stand', 'brother', 'stand', 'anymor', 'video']
['hey', 'packer', 'bear', 'we', 'don', 'lock', 'arm', 'nation', 'anthem', 'video']
['lol', 'one', 'hilari', 'cartoon', 'perfectli', 'illustr', 'how', 'embarrassingli', 'polit', 'correct', 'the', 'nfl', 'ha', 'becom']
['nfl', 'player', 'if', 'kneel', 'bother', 'come', 'game', 'video']
['actor', 'jame', 'wood', 'destroy', 'leftist', 'time', 'for', 'articl', 'suggest', 'u.s.', 'lie', 'about', 'n.', 'korea', 'tortur', 'otto', 'warmbier', 'who', 'die', 'after', 'return', 'i

['break', 'h.r', 'mcmaster', 'explain', 'whi', 'washington', 'post', 'hit', 'piec', 'on', 'trump', 'wa', 'fake', 'news', 'video']
['confirm', 'bombshel', 'seth', 'rich', 'sent', 'over', 'dnc', 'email', 'to', 'journalist', 'best', 'friend', 'of', 'wikileak', 'founder…dc', 'polic', 'offic', 'claim', 'they', 'were', 'told', 'to', 'stand', 'down', 'on', 'case', 'video']
['tucker', 'carlson', 'confront', 'nasti', 'activist', 'i', 'say', 'illegal…it', 'aw', 'video']
['clueless', 'nba', 'coach', 's', 'rant', 'compar', 'trump', 'to', 'a', 'game', 'show', 'rais', 'eyebrow', 'video']
['liber', 'pb', 'anchor', 'get', 'brutal', 'truth', 'on', 'trump', 'vs', 'obama', 'immigr', 'polici', 'that', 'miss', 'last', 'year', 'video']
['nation', 'secur', 'advisor', 'call', 'out', 'liber', 'press', 'for', 'fake', 'news', 'the', 'story…i', 'fals', 'video']
['judg', 'napolitano', 'drop', 'a', 'bomb', 'about', 'obama', 'surveil', 'video']
['thailand', 'threaten', 'to', 'prosecut', 'facebook', 'over', 'embarras

['whi', 'ugli', 'hate', 'and', 'divis', 'in', 'america', 'is', 'actual', 'obama', 'fault', 'video']
['updat', 'on', 'monster', 'mom', 'who', 'kick', 'littl', 'boy', 'out', 'of', 'home', 'for', 'vote', 'trump', 'at', 'school…', 'we', 'donald', 'trump', 'video']
['nail', 'it', 'mike', 'row', 'on', 'whi', 'trump', 'won…hillari', 'support', 'won', 'like', 'thi', 'video']
['seattl', 'citi', 'councilwoman', 'incit', 'riot…vow', 'to', 'shut', 'down', 'trump', 'inaugur', 'video']
['boom', 'thi', 'is', 'how', 'presid', 'reagan', 'handl', 'protest', 'negoti', 'what', 'negoti', 'video']
['portland', 'polic', 'call', 'violent', 'anti-trump', 'protest', 'anarchist', '…upgrad', 'protest', 'to', 'full-blown', 'riots…on', 'person', 'hit', 'by', 'car…kil']
['dear', 'anti-trump', 'protest', 'your', 'behavior', 'is', 'whi', 'trump', 'won', 'in', 'the', 'first', 'place', 'video']
['arrog', 'obama', 'might', 'want', 'to', 'scrub', 'thi', 'video', 'from', 'the', 'internet', 'at', 'least', 'i', 'go', 'presid

['lol', 'actress', 'charli', 'theron', 'tell', 'south', 'african', 'aid', 'is', 'not', 'transmit', 'sex…', 'it', 'transmit', 'sexism', 'racism', 'poverti', 'homophobia', 'video']
['stuck', 'on', 'stupid', 'while', 'liber', 'trash', 'melania', 'trump…shouldn', 'we', 'be', 'more', 'concern', 'with', 'thi', 'bit', 'from', 'obama', 'video']
['lone', 'survivor', 'marcu', 'luttrel', 's', 'power', 'gop', 'convent', 'speech', 'video']
['hyster', 'video', 'clinton', 'ad', 'destroy', 'with', 'comment', 'inserted…shar', 'thi', 'everywher']
['brilliant', 'nigel', 'farag', 'on', 'how', 'the', 'gop', 'can', 'win', 'real', 'america', 'back', 'video']
['whoa', 'black', 'woman', 'fed', 'up', 'with', 'black', 'racist', 'nail', 'it', 'mani', 'black', 'peopl', 'vote', 'for', 'barack', 'obama', 'simpli', 'becaus', 'he', 'wa', 'black…and', 'now', 'your', 'black', 'god', 'ha', 'fail', 'you', 'video']
['univers', 'presid', 'apolog', 'to', 'traumat', 'student', 'for', 'allow', 'cop', 'to', 'sleep', 'in', 'camp

['bombshel', 'us', 'gener', 'admit', 'obama', 'willingli', 'arm', 'isi', 'video']
['whi', 'thi', 'new', 'book', 'by', 'lib', 'writer', 'and', 'radio', 'host', 'will', 'send', 'shock', 'wave', 'through', 'the', 'democrat', 'parti']
['undercov', 'nypd', 'cop', 'bust', 'women', 'build', 'bomb', 'plan', 'to', 'wage', 'jihad…mayor', 'deblasio', 'say', 'unfair']
['german', 'volunt', 'hold', 'welcom', 'ralli', 'applaud', 'as', 'muslim', 'migrant', 'sing', 'jihadist', 'song', 'video']
['jimmi', 'kimmel', 'hyster', 'take', 'on', 'the', 'republican', 'debat', 'video']
['the', 'state', 'of', 'our', 'nation', 'is', 'perfectli', 'illustr', 'in', 'these', 'hyster', 'halloween', 'meme']
['media', 'lie', 'expos', 'hundr', 'of', 'student', 'ralli', 'in', 'support', 'of', 'fire', 'sc', 'school', 'cop', 'fight', 'back', 'against', 'cop-hat', 'race', 'bait', 'media', 'video']
['cnbc', 'debat', 'hack', 'prove', 'allegi', 'to', 'democrat', 'parti', 'with', 'thi', 'tweet', 'celebr', 'mass', 'murder']
['state

['sara', 'carter', 'uncov', 'explos', 'evid', 'of', 'violat', 'of', 'american', 'civil', 'liberti', 'by', 'obama', 'video']
['he', 's', 'baaack', 'judg', 'napolitano', 'on', 'fox', 'news', 'not', 'back', 'down', 'from', 'obama', 'spi', 'claim', 'video']
['watch', 'tsa', 's', 'pat-down', 'at', 'dalla', 'airport', 'leav', 'mother', 'enrag', 'we', 'hell', 'morn', 'video']
['beauti', 'melania', 'wear', 'lbd', 'to', 'host', 'recept', 'senators…guess', 'who', 'wa', 'there', 'wear', 'a', 'smirk', 'on', 'hi', 'face', 'video']
['break', 'd.c.', 'driver', 'plow', 'into', 'capitol', 'police…shot', 'fire']
['trey', 'gowdi', 'on', 'spi', 'on', 'american', 'citizens…lik', 'presid', 'trump', 'video']
['outrag', 'intimid', 'citizen', 'oppos', 'mosqu', 'goe', 'court']
['go', 'for', 'it', 'russia', 'threaten', 'to', 'leak', 'thing', 'obama', 'want', 'to', 'keep', 'secret']
['just', 'do', 'it', 'alabama', 'congressman', 'file', 'one', 'sentenc', 'bill', 'to', 'repeal', 'largest', 'welfar', 'plan', 'by', 

['time', 'deport', 'mexican', 'laugh', 'while', 'be', 'sentenc', 'for', 'sodomi', 'kidnap', 'sexual', 'assault', 'in', 'sanctuari', 'state', 'oregon…tel', 'victim', 'rel', 'see', 'all', 'you', 'guy', 'in', 'hell', 'video']
['walmart', 'is', 'sell', 'made', 'in', 'mexico', 'apparel', 'featur', 'domest', 'terror', 'group', 'video']
['flashback', 'to', 'wapo', 'headlin', 'obama', 'should', 'fire', 'john', 'brennan', 'for', 'lying…onli', 'one', 'year', 'after', 'jame', 'clapper', 'wa', 'caught', 'lie', 'under', 'oath']
['abc', 'news', 'get', 'destroy', 'on', 'twitter', 'for', 'wait', 'sever', 'hour', 'to', 'admit', 'they', 'got', 'major', 'detail', 'in', 'flynn', 'stori', 'wrong…', 'fakenewsabc']
['whi', 'is', 'al', 'sharpton', 's', 'half-broth', 'regist', 'thousand', 'of', 'felon', 'to', 'vote', 'in', 'alabama', 'controversi', 'senat', 'race', 'video']
['matt', 'lauer', 'call', 'out', 'by', 'sandra', 'bullock', 'for', 'creepi', 'sex', 'talk', 'dure', 'interview', 'i', 'seen', 'nake', 'vid

['disgust', 'seattl', 'mayor', 'who', 'announc', 'he', 'su', 'trump', 'over', 'sanctuari', 'citi', 'exec', 'order', 'is', 'accus', 'of', 'rape', '15-yr', 'old', 'boy', 'two', 'other']
['sore', 'loser', 'war-hawk', 'john', 'mccain', 'blame', 'presid', 'trump', 'for', 'syrian', 'chemic', 'attack', 'video']
['msnbc', 'pinhead', 'host', 'threaten', 'fox', 's', 'bill', 'o', 'reilli', 'come', 'sue', 'me…i', 'dare', 'video']
['camp', 'nightmar', 'machet', 'wield', 'refuge', 'drag', '23-yr', 'old', 'woman', 'from', 'tent…forc', 'boyfriend', 'to', 'watch', 'the', 'unthink']
['how', 'gorsuch', 'will', 'have', 'immedi', 'effect', 'on', 'histor', '2nd', 'amend', 'decis', 'and', 'these', 'signific', 'controversi', 'case']
['subway', 'rider', 'attack', 'with', 'hammer', 'for', 'ask', 'passeng', 'to', 'stop', 'man-spread']
['paul', 'joseph', 'watson', 'is', 'not', 'happi', 'about', 'the', 'air', 'strike', 'on', 'syria…her', 'whi', 'video']
['watch', 'craze', 'lefti', 'protest', 'trump', 'shut', 'down

['race-bait', 'cop', 'hater', 'dealt', 'major', 'blow', 'baltimor', 'judg', 'find', 'no', 'evid', 'of', 'crime', 'commit', 'against', 'freddi', 'grey']
['break', 'us', 'suprem', 'court', 'rule', 'king', 'obama', 'overstep', 'authority…execut', 'amnesti', 'for', 'million', 'illeg', 'aliens/', 'democrat', 'voter', 'not', 'go', 'to', 'happen']
['boom', 'rep', 'louie', 'gohmert', 'r-tx', 'rip', 'into', 'obama', 'gun', 'grab', 'legisl', 'minion', 'radic', 'islam', 'kill', 'these', 'peopl', 'video']
['break', 'us', 'suprem', 'court', 'uphold', 'u', 'of', 'tx-austin', 'admiss', 'abil', 'to', 'choos', 'black', 'hispan', 'student', 'befor', 'white', 'asian', 'student']
['watch', 'indoctrin', 'colleg', 'student', 'are', 'stun', 'by', 'ugli', 'truth', 'about', 'hillari', 'which', 'candid', 'said']
['flashback', 'hillari', 'receiv', '500k', 'in', 'jewelri', 'from', 'king', 'of', 'barbar', 'nation', 'who', 'brutal', 'oppress', 'women']
['someth', 'wick', 'is', 'happen', 'with', 'refuge', 'in', 'ida

['mainstream', 'media', 'ignor', 'massiv', 'protest', 'against', 'obama', 'sweetheart', 'deal', 'for', 'corpor', 'biggest', 'protest', 'countri', 'seen', 'mani', 'mani', 'year']
['obama', 'tell', 'minut', 'he', 'could', 'win', 'a', 'third', 'term', 'but', 'say', 'he', 'won', 'run…watch', 'surprisingli', 'hard-hit', 'interview', 'here']
['colleg', 'punish', 'success', 'by', 'not', 'allow', 'yacht', 'club', 'at', 'prestigi', 'school', 'video']
['citi', 'across', 'america', 'are', 'replac', 'columbu', 'day', 'with', 'indigen', 'peopl', 'day']
['boycott', 'pro-gun', 'control', 'seth', 'racist', 'rogen', 'star', 'of', 'newli', 'releas', 'steve', 'job', 'movi', 'send', 'vulgar', 'tweet', 'f', 'ck', 'you', 'ben', 'carson']
['rabid', 'pro-amnesti', 'legisl', 'lui', 'gutiérrez', 'on', 'paul', 'ryan', 'for', 'speaker', 'he', 'would', 'work', 'democrat', 'order', 'solv', 'problem', 'america', 'video']
['gq', 'magazin', 'pen', 'repuls', 'articl', 'on', 'brilliant', 'neurosurgeon', 'f', 'ck', 'ben'

['will', 'trumponom', 'bankrupt', 'america']
['the', 'existenti', 'question', 'of', 'whom', 'to', 'trust']
['boiler', 'room', 'did', 'israel', 'attack', 'damascu', 'bill', 'nye', 'the', 'psyop', 'guy']
['the', 'cia', 'doesn', 'need', 'to', 'spi', 'on', 'free', 'thinker', 'the', 'privat', 'sector', 'doe', 'it', 'for', 'free']
['whi', 'not', 'a', 'probe', 'israel-g']
['tax', 'march', 'where', 'were', 'you', 'obama', 'wreck', 'libya']
['mass', 'integr', 'the', 'race', 'capit', 'virtual', 'futur']
['russia-g', 'wa', 'all', 'rage', 'across', 'us', 'media', 'where', 'did', 'go', 'whi']
['boiler', 'room', 'quantum', 'swamp', 'chess']
['easili', 'dupe', 'trump', 'surpass', 'bush', 'fall', 'chemic', 'weapon', 'theatric']
['professor', 'polit', 'ignor', 'go', 'to', 'have', 'consequ']
['is', 'spicer', 'flap', 'cover', 'media', 'tie', 'up', 'white', 'hous', 'global', 'affair', 'scuttl', 'trump', 'domest', 'agenda']
['trump', 'wag', 'dog', 'moment']
['tulsi', 'gabbard', 'trigger', 'the', 'war', 'ha

In [98]:
def IDF(freq):
    return math.log(44898 / freq)

In [99]:
for key in numDocs:
     new_df[key] = new_df[key].apply(lambda x: x*IDF(numDocs[key]))

In [102]:
new_df.describe()

Unnamed: 0,trump,video,to,the,for,in,u.s.,of,'s,say,...,must,anti-trump,depart,famili,germani,next,most,reveal,defens,big
count,44898.0,44898.0,44898.0,44898.0,44898.0,44898.0,44898.0,44898.0,44898.0,44898.0,...,44898.0,44898.0,44898.0,44898.0,44898.0,44898.0,44898.0,44898.0,44898.0,44898.0
mean,0.037691,0.026349,0.022378,0.020088,0.016496,0.016299,0.026978,0.015258,0.025931,0.02445,...,0.003152,0.002737,0.00359,0.002893,0.00374,0.003326,0.002513,0.002633,0.003731,0.002864
std,0.061454,0.058322,0.055224,0.063119,0.054561,0.055387,0.086536,0.052941,0.085379,0.08034,...,0.03984,0.035457,0.045645,0.037711,0.047959,0.042926,0.032091,0.033922,0.047899,0.037219
min,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
25%,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
50%,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
75%,0.085127,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
max,0.425635,0.650043,0.462027,0.637369,0.498403,0.561759,0.712697,0.629066,0.717439,0.627622,...,0.84445,0.85415,1.02724,0.864449,1.014759,1.171051,0.856033,0.851058,1.302723,1.031064


In [72]:
new_df

Unnamed: 0,title,text,subject,date,target,title_token,text_token,title_stop,text_stop,title_stemmed,...,must,anti-trump,depart,famili,germani,next,most,reveal,defens,big
0,"As U.S. budget fight looms, Republicans flip t...",WASHINGTON (Reuters) - The head of a conservat...,politicsNews,"December 31, 2017",true,"[As, U.S., budget, fight, looms, ,, Republican...","[WASHINGTON, (, Reuters, ), -, The, head, of, ...","[As, U.S., budget, fight, looms, ,, Republican...","[WASHINGTON, (, Reuters, ), -, The, head, cons...","[as, u.s., budget, fight, loom, republican, fl...",...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,U.S. military to accept transgender recruits o...,WASHINGTON (Reuters) - Transgender people will...,politicsNews,"December 29, 2017",true,"[U.S., military, to, accept, transgender, recr...","[WASHINGTON, (, Reuters, ), -, Transgender, pe...","[U.S., military, accept, transgender, recruits...","[WASHINGTON, (, Reuters, ), -, Transgender, pe...","[u.s., militari, accept, transgend, recruit, m...",...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Senior U.S. Republican senator: 'Let Mr. Muell...,WASHINGTON (Reuters) - The special counsel inv...,politicsNews,"December 31, 2017",true,"[Senior, U.S., Republican, senator, :, 'Let, M...","[WASHINGTON, (, Reuters, ), -, The, special, c...","[Senior, U.S., Republican, senator, :, 'Let, M...","[WASHINGTON, (, Reuters, ), -, The, special, c...","[senior, u.s., republican, senat, 'let, mr., m...",...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,FBI Russia probe helped by Australian diplomat...,WASHINGTON (Reuters) - Trump campaign adviser ...,politicsNews,"December 30, 2017",true,"[FBI, Russia, probe, helped, by, Australian, d...","[WASHINGTON, (, Reuters, ), -, Trump, campaign...","[FBI, Russia, probe, helped, Australian, diplo...","[WASHINGTON, (, Reuters, ), -, Trump, campaign...","[fbi, russia, probe, help, australian, diploma...",...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Trump wants Postal Service to charge 'much mor...,SEATTLE/WASHINGTON (Reuters) - President Donal...,politicsNews,"December 29, 2017",true,"[Trump, wants, Postal, Service, to, charge, 'm...","[SEATTLE/WASHINGTON, (, Reuters, ), -, Preside...","[Trump, wants, Postal, Service, charge, 'much,...","[SEATTLE/WASHINGTON, (, Reuters, ), -, Preside...","[trump, want, postal, servic, charg, 'much, am...",...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
44893,McPain: John McCain Furious That Iran Treated ...,21st Century Wire says As 21WIRE reported earl...,Middle-east,"January 16, 2016",fake,"[McPain, :, John, McCain, Furious, That, Iran,...","[21st, Century, Wire, says, As, 21WIRE, report...","[McPain, :, John, McCain, Furious, That, Iran,...","[21st, Century, Wire, says, As, 21WIRE, report...","[mcpain, john, mccain, furiou, that, iran, tre...",...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
44894,JUSTICE? Yahoo Settles E-mail Privacy Class-ac...,21st Century Wire says It s a familiar theme. ...,Middle-east,"January 16, 2016",fake,"[JUSTICE, ?, Yahoo, Settles, E-mail, Privacy, ...","[21st, Century, Wire, says, It, s, a, familiar...","[JUSTICE, ?, Yahoo, Settles, E-mail, Privacy, ...","[21st, Century, Wire, says, It, familiar, them...","[justic, yahoo, settl, e-mail, privaci, class-...",...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
44895,Sunnistan: US and Allied ‘Safe Zone’ Plan to T...,Patrick Henningsen 21st Century WireRemember ...,Middle-east,"January 15, 2016",fake,"[Sunnistan, :, US, and, Allied, ‘, Safe, Zone,...","[Patrick, Henningsen, 21st, Century, WireRemem...","[Sunnistan, :, US, Allied, ‘, Safe, Zone, ’, P...","[Patrick, Henningsen, 21st, Century, WireRemem...","[sunnistan, us, alli, safe, zone, plan, take, ...",...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
44896,How to Blow $700 Million: Al Jazeera America F...,21st Century Wire says Al Jazeera America will...,Middle-east,"January 14, 2016",fake,"[How, to, Blow, $, 700, Million, :, Al, Jazeer...","[21st, Century, Wire, says, Al, Jazeera, Ameri...","[How, Blow, $, 700, Million, :, Al, Jazeera, A...","[21st, Century, Wire, says, Al, Jazeera, Ameri...","[how, blow, million, al, jazeera, america, fin...",...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


Finally, we'll train a linear support vector classifier model to predict fake or true news if given the title of a new unseen article. We'll separate our dataframe with 80% for the test set and 20% for the training set. Then, we run the model, and print the confusion matrix to see how it did!

In [57]:
from sklearn import svm
from sklearn.model_selection import train_test_split
import math 

In [56]:
def convertTarget(x):
    if x == 'fake':
        return 1
    else:
        return 0

In [104]:
new_df['target'] = new_df['target'].apply(lambda x: convertTarget(x))

X = new_df.drop(['target', 'title', 'text', 'subject', 'date', 'title_token', 'text_token', 'title_stop', 'text_stop', 'title_stemmed', 'text_stemmed'], axis=1)
y = new_df['target']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.20)


In [107]:
svclassifier = SVC(kernel='linear')
svclassifier.fit(X_train, y_train)

SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
    decision_function_shape='ovr', degree=3, gamma='auto_deprecated',
    kernel='linear', max_iter=-1, probability=False, random_state=None,
    shrinking=True, tol=0.001, verbose=False)

In [110]:
y_pred = svclassifier.predict(X_test)

print(confusion_matrix(y_test,y_pred))
print(classification_report(y_test,y_pred))

[[4164  111]
 [ 210 4495]]
              precision    recall  f1-score   support

           0       0.95      0.97      0.96      4275
           1       0.98      0.96      0.97      4705

    accuracy                           0.96      8980
   macro avg       0.96      0.96      0.96      8980
weighted avg       0.96      0.96      0.96      8980



As can be seen above, the model had high precision when predicting that an article was fake news. 98% accurate, in fact! This is what we want. In the application of this model, you would prefer to miss a few fake news articles than incorrectly label a true news article as fake, as that would lose credibility. 

Social media giants that are dealing with public pressure to label fake news are in need of machine learning algorithms to help them identify fake news based only on the title of the article. This model could have a huge impact and make it much harder for misinformation to affect elections and public perception of varying issues. 