# US Politics - Twitter analysis

This notebook analyses tweets by, and related to, prominent US political figures. As a starting point, we look at:

- Donald Trump
- Hillary Clinton
- Barack Obama

We're also looking at the responses of media outlets to these tweets, namely:

- BBC World News
- Breitbart
- CBC News
- CBS News
- CNN
- FOX News
- The Guardian
- MSNBC
- NBC News
- Reuters



In [67]:
from IPython.display import display
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import datetime as dt
import os
from keys import WD
%matplotlib inline

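Let's define some functions to read in the JSON log files. There is one line-delimited JSON file per account per day, named `{search_type}-{name}-{YYYYMMDD}.log` and stored under `WD`.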

In [147]:
def file_date(t):
    """Format a date as YYYYMMDD, matching the log file names."""
    return t.strftime("%Y%m%d")

def read_tweets(search_type, name, start, end):
    """Read the daily JSON logs for `name` from `start` to `end` (inclusive)."""
    no_days = (end - start).days
    dtype_dict = {"location": str, "hastags": list, "mentions": list, "t": dt.datetime,
                  "screen_name": str, "log_time": dt.datetime, "id": int, "content": str,
                  "source": str, "fav_count": int, "retweet_count": int}

    frames = []  # one DataFrame per day that has a readable log file
    for i in range(no_days + 1):
        day = start + dt.timedelta(i)
        file_tmp = WD + '/{}-{}-{}.log'.format(search_type, name, file_date(day))

        try:
            frames.append(pd.read_json(file_tmp, lines=True, orient='records', dtype=dtype_dict))
        except (IOError, ValueError):
            # Missing or unreadable log file for this day
            print('No tweets on: {}'.format(day))

    if not frames:
        # No log files at all: return an empty frame instead of raising a NameError
        print('0 tweets collected on 0 days for {} {} .'.format(search_type, name))
        return pd.DataFrame()

    df_tweets = pd.concat(frames).set_index('t')
    #display(df_tweets.head(2))
    print('{} tweets collected on {} days for {} {} .'.format(len(df_tweets), len(frames), search_type, name))
    return df_tweets
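
To make the expected file layout concrete, here is a minimal sketch of what `pd.read_json(..., lines=True)` does with a line-delimited log. The field names are taken from `dtype_dict` in `read_tweets`; the two records and their values are invented for illustration, and the real logs may contain more fields. The `t` column is parsed explicitly with `pd.to_datetime`, on the assumption that timestamps are stored as strings.

```python
import io
import pandas as pd

# Two made-up records, one JSON object per line.
# Field names follow dtype_dict; the values are invented for illustration.
sample_log = io.StringIO(
    '{"t": "2018-01-02 14:05:00", "screen_name": "example_user", "id": 1, '
    '"content": "first example tweet", "fav_count": 10, "retweet_count": 2}\n'
    '{"t": "2018-01-02 15:30:00", "screen_name": "example_user", "id": 2, '
    '"content": "second example tweet", "fav_count": 5, "retweet_count": 1}\n'
)

# lines=True tells pandas to parse one JSON object per line
df_sample = pd.read_json(sample_log, lines=True, orient='records')

# Parse the timestamp column explicitly, then index by it as read_tweets does
df_sample['t'] = pd.to_datetime(df_sample['t'])
df_sample = df_sample.set_index('t')
```

Indexing by a parsed timestamp makes later time-based operations (for example resampling tweets per day) straightforward.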


In [149]:
start_date = dt.date(2018,1,1)
end_date = dt.date.today()

# Politicians
DJT_tweets = read_tweets('tweets', 'realDonaldTrump', start_date, end_date)
HRC_tweets = read_tweets('tweets', 'HillaryClinton', start_date, end_date)
BHO_tweets = read_tweets('tweets', 'BarackObama', start_date, end_date)

# News outlets
BBC_tweets = read_tweets('tweets', 'BBCWorld', start_date, end_date)
Breitbart_tweets = read_tweets('tweets', 'BreitbartNews', start_date, end_date)
CBC_tweets = read_tweets('tweets', 'cbcnews', start_date, end_date)
CBS_tweets = read_tweets('tweets', 'CBSNews', start_date, end_date)
CNN_tweets = read_tweets('tweets', 'cnn', start_date, end_date)
FOX_tweets = read_tweets('tweets', 'foxnews', start_date, end_date)
Guardian_tweets = read_tweets('tweets', 'guardian', start_date, end_date)
MSNBC_tweets = read_tweets('tweets', 'MSNBC', start_date, end_date)
NBC_tweets = read_tweets('tweets', 'NBCNews', start_date, end_date)
Reuters_tweets = read_tweets('tweets', 'Reuters', start_date, end_date)

# TODO: keep only the news-outlet tweets that relate to Trump, Clinton or Obama
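
The filtering step noted in the comment above could be sketched as a case-insensitive keyword match on the `content` column. This is only one possible approach, not the notebook's own method: the `KEYWORDS` patterns and the `filter_related` helper are hypothetical, and the small `demo` frame stands in for a real outlet frame such as `CNN_tweets`.

```python
import pandas as pd

# Hypothetical keyword patterns per politician (an assumption, tune as needed)
KEYWORDS = {
    'Trump': r'trump|realdonaldtrump',
    'Clinton': r'hillary|clinton',
    'Obama': r'obama',
}

def filter_related(df, pattern):
    """Return the rows of df whose 'content' matches pattern, ignoring case."""
    # na=False treats missing content as "no match" instead of propagating NaN
    mask = df['content'].str.contains(pattern, case=False, regex=True, na=False)
    return df[mask]

# Small made-up frame standing in for a news-outlet frame
demo = pd.DataFrame({'content': [
    'President Trump signs new order',
    'Weather update for the weekend',
    'Obama speaks at conference',
    None,
]})

trump_related = filter_related(demo, KEYWORDS['Trump'])
obama_related = filter_related(demo, KEYWORDS['Obama'])
```

Plain keyword matching is crude: a pattern like `clinton` would also match tweets about Bill Clinton, so the results may need manual checking or a stricter pattern.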

No tweets on: 2018-01-30
201 tweets collected on 30 days for tweets realDonaldTrump .
No tweets on: 2018-01-01
No tweets on: 2018-01-03
No tweets on: 2018-01-04
No tweets on: 2018-01-05
No tweets on: 2018-01-06
No tweets on: 2018-01-07
No tweets on: 2018-01-08
No tweets on: 2018-01-10
No tweets on: 2018-01-11
No tweets on: 2018-01-13
No tweets on: 2018-01-14
No tweets on: 2018-01-16
No tweets on: 2018-01-17
No tweets on: 2018-01-18
No tweets on: 2018-01-21
No tweets on: 2018-01-22
No tweets on: 2018-01-23
No tweets on: 2018-01-24
No tweets on: 2018-01-28
No tweets on: 2018-01-29
No tweets on: 2018-01-30
16 tweets collected on 10 days for tweets HillaryClinton .
No tweets on: 2018-01-01
No tweets on: 2018-01-02
No tweets on: 2018-01-03
No tweets on: 2018-01-04
No tweets on: 2018-01-05
No tweets on: 2018-01-06
No tweets on: 2018-01-07
No tweets on: 2018-01-08
No tweets on: 2018-01-09
No tweets on: 2018-01-10
No tweets on: 2018-01-11
No tweets on: 2018-01-12
No tweets on: 2018-01-13
No tw