# President tweet sentiment analysis

We are going to show you how to download, process and analyse the sentiment of real tweets.
We have chosen to download tweets from the two most recent presidents of the USA.

# Tweepy and Twitter API
First you need to set up your libraries and your API access.
We have chosen to use the Python library Tweepy for downloading tweets and managing our Twitter API connection.

In [8]:
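from tweepy import API, OAuthHandler

# twitter_credentials appears to be a local module (not part of Tweepy) holding the four API keys.
import twitter_credentials

# Authenticate with the consumer key pair, then attach the access token pair.
auth = OAuthHandler(twitter_credentials.CONSUMER_KEY, twitter_credentials.CONSUMER_SECRET)
auth.set_access_token(twitter_credentials.ACCESS_TOKEN, twitter_credentials.ACCESS_TOKEN_SECRET)

# wait_on_rate_limit makes the client wait when the rate limit is reached instead of failing.
# Note: wait_on_rate_limit_notify exists in Tweepy 3.x only; drop it on Tweepy 4 or later.
twitter_client = API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)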
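
The `twitter_credentials` module imported above is not shown in this notebook. As a minimal sketch, assuming the keys are stored in environment variables, such a module could load them as follows. The `TWITTER_` prefix and the `load_credentials` helper are hypothetical names chosen for illustration; only the four key names come from the cell above.

```python
import os

# The four values the notebook reads from twitter_credentials.
REQUIRED_KEYS = ("CONSUMER_KEY", "CONSUMER_SECRET", "ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")


def load_credentials(env=os.environ, prefix="TWITTER_"):
    """Return the four credentials from env, raising KeyError if any is missing or empty."""
    missing = [prefix + key for key in REQUIRED_KEYS if not env.get(prefix + key)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {key: env[prefix + key] for key in REQUIRED_KEYS}


# A twitter_credentials.py file could then expose module-level names with:
# globals().update(load_credentials())
```

Keeping the keys out of the notebook itself means the notebook can be shared without leaking them.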

# Downloading Tweets
Now that your API is set up, you can start downloading tweets. There are several ways to do this, but we have chosen Tweepy's Cursor class because it is easy to understand and requires the least amount of code.
We will start by downloading the tweets from Barack Obama, whose Twitter handle is @BarackObama.
Since we want to analyse the sentiment of Obama's own tweets, we will exclude both manual and regular retweets.
We only keep tweets from his time as president (20 January 2009 to 20 January 2017), and we will do the same for Donald Trump.

In [13]:
import datetime
from tweepy import Cursor

twitter_user = 'BarackObama'
obama_tweets = []

# Obama's presidency: 20 January 2009 to 20 January 2017.
startDate = datetime.datetime(2009, 1, 20, 0, 0, 0)
endDate = datetime.datetime(2017, 1, 20, 0, 0, 0)

# tweet_mode='extended' returns the untruncated text in tweet.full_text.
for tweet in Cursor(twitter_client.user_timeline, screen_name=twitter_user, tweet_mode='extended').items():
    # Skip tweets flagged as retweeted or containing 'RT'; keep only tweets inside the date window.
    if (not tweet.retweeted) and ('RT' not in tweet.full_text) and (startDate < tweet.created_at < endDate):
        obama_tweets.append(tweet)


print(len(obama_tweets))

2579
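
The filter used above has two possible weaknesses. The test `'RT' not in tweet.full_text` also drops any tweet that merely contains the capital letters "RT" (for example "SMART" or "SUPPORT"). In addition, the `retweeted` field reports whether the authenticated account has retweeted the tweet, not whether the tweet itself is a retweet. The sketch below shows a stricter check written as a plain function so that it can be tried without API access. The function name and the `is_native_retweet` parameter are hypothetical; with Tweepy, that flag could be derived with `hasattr(tweet, 'retweeted_status')`, assuming the status object follows the usual Twitter API v1.1 layout.

```python
from datetime import datetime

# Obama's presidency, matching the window used in the cell above.
START = datetime(2009, 1, 20)
END = datetime(2017, 1, 20)


def is_own_tweet(text, created_at, is_native_retweet=False, start=START, end=END):
    """Return True for a non-retweet posted strictly between start and end."""
    if is_native_retweet:
        return False
    # Manual retweets conventionally begin with "RT @user"; checking the prefix
    # avoids discarding tweets that only contain the letters "RT" somewhere.
    if text.lstrip().startswith("RT @"):
        return False
    return start < created_at < end
```

Switching to this check would change the tweet counts printed above, so the numbers in this notebook only apply to the original filter.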


We can now do the same for Donald Trump. Since he is still the sitting president at the time of writing, we set the end date for his tweets to 1 January 2021, which is still in the future. His Twitter handle is @realDonaldTrump.

In [15]:
twitter_user = 'realDonaldTrump'
trump_tweets = []

# Trump's presidency so far: from 20 January 2017; the end date lies in the future.
startDate = datetime.datetime(2017, 1, 20, 0, 0, 0)
endDate = datetime.datetime(2021, 1, 1, 0, 0, 0)

for tweet in Cursor(twitter_client.user_timeline, screen_name=twitter_user, tweet_mode='extended').items():
    # Same filter as for Obama: skip retweets and keep only tweets inside the date window.
    if (not tweet.retweeted) and ('RT' not in tweet.full_text) and (startDate < tweet.created_at < endDate):
        trump_tweets.append(tweet)


print(len(trump_tweets))

3


# Converting lists to dataframes
To use these lists of tweets as datasets we need to convert them. We will do this with the help of the pandas DataFrame() constructor, wrapped in our own function so that we can decide what information to save. In this case we choose to save the text of the tweet, the date of the tweet, the tweet ID, the number of favorites and the number of retweets.

In [18]:
import pandas as pd

def tweets_to_data_frame(tweets):
    """Build a DataFrame with one row per tweet, keeping only the fields we need."""
    return pd.DataFrame({
        'Tweets': [tweet.full_text for tweet in tweets],
        'date': [tweet.created_at for tweet in tweets],
        'id': [tweet.id_str for tweet in tweets],
        'retweets': [tweet.retweet_count for tweet in tweets],
        'favorites': [tweet.favorite_count for tweet in tweets],
    })

trump_df = tweets_to_data_frame(trump_tweets)
obama_df = tweets_to_data_frame(obama_tweets)
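
To see which fields end up in each row without calling the Twitter API, the extraction step can be tried on stand-in objects. This is a minimal sketch in plain Python: `tweet_to_row`, `tweets_to_rows` and the fake tweet are hypothetical illustrations, and only the attribute and column names are taken from the function above. A list of such dictionaries can be passed to `pd.DataFrame(rows)` to obtain a table with the same columns.

```python
from types import SimpleNamespace


def tweet_to_row(tweet):
    """Pick the five fields we keep from a tweet-like object."""
    return {
        "Tweets": tweet.full_text,
        "date": tweet.created_at,
        "id": tweet.id_str,
        "retweets": tweet.retweet_count,
        "favorites": tweet.favorite_count,
    }


def tweets_to_rows(tweets):
    """Convert a list of tweet-like objects into a list of row dictionaries."""
    return [tweet_to_row(tweet) for tweet in tweets]


# A stand-in for a Tweepy status object, carrying only the attributes we read.
fake_tweet = SimpleNamespace(
    full_text="Hello", created_at="2016-01-01", id_str="1",
    retweet_count=3, favorite_count=5,
)
rows = tweets_to_rows([fake_tweet])
```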

# Saving datasets as CSV files for later use
We don't want to have to download the tweets every time we run our program, so we save our DataFrames to CSV files that can be loaded again later.

In [None]:
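# DataFrame.to_csv does not need the csv module. sep='\t' writes tab-separated files,
# so the same separator must be passed to pd.read_csv when loading them again.
trump_df.to_csv('trump_tweets.csv', sep='\t', encoding='utf-8', index=False)
obama_df.to_csv('obama_tweets.csv', sep='\t', encoding='utf-8', index=False)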
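
Because the files are written with a tab separator, they have to be read back with the same separator. The sketch below illustrates that round trip with Python's standard `csv` module instead of pandas, so that it runs without extra dependencies. The helper names and the sample row are made up for illustration.

```python
import csv
import io

FIELDS = ["Tweets", "date", "id", "retweets", "favorites"]


def write_tsv(rows):
    """Serialise row dictionaries as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def read_tsv(text):
    """Parse tab-separated text back into row dictionaries (all values are strings)."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


sample = [{"Tweets": "Hello, world", "date": "2016-01-01 12:00:00",
           "id": "1", "retweets": "10", "favorites": "20"}]
text = write_tsv(sample)
restored = read_tsv(text)

# Reading the same text with the default comma separator splits the tweet at its comma
# and treats the whole header line as a single column name.
misread = list(csv.DictReader(io.StringIO(text)))
```

Tweets often contain commas, which is one reason a tab separator can be a safer choice for this kind of data.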

# Issues with this method
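As you can see, we only managed to find 3 tweets from Donald Trump, while we found more than 2,500 from Barack Obama. We do not know exactly why this is the case. One known constraint is that the Twitter user timeline endpoint only returns a user's most recent tweets (roughly 3,200), but we have not confirmed that this explains the difference.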

Luckily there are other methods to find tweets from Donald Trump. We have used http://www.trumptwitterarchive.com/archive to download his tweets and make a CSV file. We have also previously saved Obama's tweets to another CSV file. We will now load them both and use them as our DataFrames.

In [17]:
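import pandas as pd

# error_bad_lines=False skips malformed rows instead of raising an error.
# On pandas 1.3 or later, use on_bad_lines='skip' instead (error_bad_lines was removed in pandas 2.0).
trump_df = pd.read_csv("trump.csv", error_bad_lines=False, sep='\t')
obama_df = pd.read_csv("obama.csv", error_bad_lines=False, sep=',')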
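
To make the effect of `error_bad_lines=False` more concrete, here is a rough sketch of the idea using only the standard library. As far as the pandas documentation describes it, a "bad line" is a row with more fields than the header, and such rows are dropped rather than raising an error. The function below is a hypothetical stand-in and not how pandas implements it.

```python
import csv
import io


def read_skipping_bad_lines(text, delimiter=","):
    """Return (rows, skipped): rows with too many fields are dropped and counted."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    rows, skipped = [], 0
    for fields in reader:
        if not fields:
            continue  # ignore completely empty lines
        if len(fields) > len(header):
            skipped += 1  # too many separators: treat as a bad line
            continue
        # Pad short rows with None so every row has all columns.
        padded = fields + [None] * (len(header) - len(fields))
        rows.append(dict(zip(header, padded)))
    return rows, skipped


sample = "text,date\nhello,2020\nbad,line,extra\nshort\n"
rows, skipped = read_skipping_bad_lines(sample)
```

Counting the skipped rows is useful here, because silently dropped tweets would otherwise bias the later sentiment analysis without any warning.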