# Getting to Know the Twitter API

Twitter is a goldmine of unstructured text data. The purpose of this notebook is to familiarize myself with how to interact with the Twitter API and to practice natural language processing techniques.

In [1]:
import json
import pandas as pd
import tweepy

In [2]:
'''
OAuth Process
With keys, tokens, setting up instance of API
Adapted from https://medium.freecodecamp.org/basic-data-analysis-on-twitter-with-python-251c2a85062e
'''

def load_api():
    # Credentials
    # Mine have been scrubbed--if you want to try this out, insert your own keys :)
    consumer_key = ''
    consumer_secret = ''
    access_token = ''
    access_token_secret = ''

    # Authorization
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    return tweepy.API(auth)
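
Rather than pasting keys into the notebook, the credentials could be read from environment variables. The sketch below is one possible way to do that. The variable names (`TWITTER_CONSUMER_KEY` and so on) and the helper `read_credentials` are hypothetical; neither tweepy nor the Twitter API prescribes them.

```python
import os

# Hypothetical environment variable names; pick whatever fits your setup.
CREDENTIAL_VARS = {
    'consumer_key': 'TWITTER_CONSUMER_KEY',
    'consumer_secret': 'TWITTER_CONSUMER_SECRET',
    'access_token': 'TWITTER_ACCESS_TOKEN',
    'access_token_secret': 'TWITTER_ACCESS_TOKEN_SECRET',
}

def read_credentials(environ=os.environ):
    """Return the four OAuth values as a dict, failing loudly if any is missing."""
    missing = [var for var in CREDENTIAL_VARS.values() if not environ.get(var)]
    if missing:
        raise KeyError('Missing environment variables: ' + ', '.join(missing))
    return {name: environ[var] for name, var in CREDENTIAL_VARS.items()}
```

The returned values can then be passed to `tweepy.OAuthHandler` and `set_access_token` in the same way `load_api` does with its hard-coded strings.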

## Mining the Tweets and Making a Dataframe

I'll mine a small number of tweets from a specific user to see how I can manipulate the text and what analysis I can perform. The Twitter API's status objects are well organized: each one exposes fields such as the creation time, author, tweet ID, text, and retweet count.
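
To make that structure concrete, here is a minimal sketch of flattening one status into a dataframe row. It works on a plain dict shaped like the JSON the API returns for a tweet, so it runs without credentials. The names `sample_status` and `status_to_row` are hypothetical, the values are copied from the first row of the output shown later in this notebook (the text is truncated there and stays truncated here), and the `created_at` string format is an assumption about the raw payload.

```python
from datetime import datetime

# Hypothetical payload, trimmed to the fields this notebook uses.
sample_status = {
    'created_at': 'Fri Oct 12 18:09:17 +0000 2018',  # assumed raw timestamp format
    'id': 1050810652271529984,
    'text': 'People have no idea how hard Hurricane Michael...',
    'retweet_count': 4158,
    'user': {'name': 'Donald J. Trump'},
}

def status_to_row(payload):
    """Flatten one tweet payload into the columns used by the dataframe."""
    return {
        'created_at': datetime.strptime(payload['created_at'],
                                        '%a %b %d %H:%M:%S %z %Y'),
        'user': payload['user']['name'],      # nested under the user object
        'tweet_id': payload['id'],
        'text': payload['text'],
        'retweet_count': payload['retweet_count'],
    }

row = status_to_row(sample_status)
```

A list of such rows can be handed to `pd.DataFrame` in one call, which is usually cheaper than growing a dataframe one row at a time.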

In [3]:
'''
Get the Tweets
'''

api = load_api()

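def grab_tweets(handle, limit=100):
    # Collect one dict per tweet and build the DataFrame once at the end.
    # DataFrame.append was removed in pandas 2.0 and copied the frame on every call.
    columns = ['created_at', 'user', 'tweet_id', 'text', 'retweet_count']
    rows = []

    # screen_name is used instead of id, which newer tweepy releases no longer accept
    cursor = tweepy.Cursor(api.user_timeline, screen_name=handle,
                           include_rts=False, exclude_replies=True)
    for status in cursor.items(limit):
        rows.append({'created_at': status.created_at, 'user': status.user.name, 'tweet_id': status.id,
                     'text': status.text, 'retweet_count': status.retweet_count})
    return pd.DataFrame(rows, columns=columns)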

In [4]:
trump_tweets = grab_tweets('realDonaldTrump')

## Sentiment Analysis with TextBlob

TextBlob is a library for processing text data. It is a powerful tool that can tag parts of speech, extract noun phrases, and analyze sentiment, among other things. I want to look for a simple correlation between the sentiment and the retweet count of this user's tweets.

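The sentiment analysis provides two values: polarity in [-1.0, 1.0], where -1.0 is most negative and 1.0 is most positive, and subjectivity in [0.0, 1.0], where 0.0 is very objective and 1.0 is very subjective.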
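
As a rough illustration of how the polarity range can be read, the sketch below buckets a score into a coarse label. The helper `label_polarity` and the `neutral_band` threshold of 0.05 are assumptions made for this example; TextBlob itself only returns the raw numbers.

```python
def label_polarity(polarity, neutral_band=0.05):
    """Map a polarity score in [-1.0, 1.0] to a coarse label.

    neutral_band is a hypothetical cutoff: scores within it count as neutral.
    """
    if not -1.0 <= polarity <= 1.0:
        raise ValueError('polarity must be within [-1.0, 1.0]')
    if polarity > neutral_band:
        return 'positive'
    if polarity < -neutral_band:
        return 'negative'
    return 'neutral'
```

Once the `polarity` column exists, a label column could be added with `trump_tweets['polarity'].apply(label_polarity)`.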

In [5]:
'''
Sentiment Analysis with TextBlob
Adapted from https://www.analyticsvidhya.com/blog/2018/02/natural-language-processing-for-beginners-using-textblob/
'''
from textblob import TextBlob

# Add sentiment scores to dataframe
trump_tweets['polarity'] = trump_tweets['text'].apply(lambda x: TextBlob(x).sentiment.polarity)
trump_tweets['subjectivity'] = trump_tweets['text'].apply(lambda x: TextBlob(x).sentiment.subjectivity)

In [6]:
trump_tweets.head()

| | created_at | user | tweet_id | text | retweet_count | polarity | subjectivity |
|---|---|---|---|---|---|---|---|
| 0 | 2018-10-12 18:09:17 | Donald J. Trump | 1050810652271529984 | People have no idea how hard Hurricane Michael... | 4158 | 0.254167 | 0.645833 |
| 1 | 2018-10-12 16:43:13 | Donald J. Trump | 1050788995377049601 | PROMISES MADE, PROMISES KEPT! https://t.co/2lk... | 8477 | 0.0 | 0.0 |
| 2 | 2018-10-12 15:57:52 | Donald J. Trump | 1050777580553588738 | REGISTER TO VOTE! https://t.co/0pWiwCHGbh http... | 7173 | 0.0 | 0.0 |
| 3 | 2018-10-12 14:26:00 | Donald J. Trump | 1050754462405537798 | PASTOR BRUNSON JUST RELEASED. WILL BE HOME SOON! | 18847 | 0.0 | 0.0 |
| 4 | 2018-10-12 13:59:06 | Donald J. Trump | 1050747691062493185 | My thoughts and prayers are with Pastor Brunso... | 9127 | 0.0 | 0.0 |


In [7]:
trump_tweets.describe()

| | polarity | subjectivity |
|---|---|---|
| count | 100.0 | 100.0 |
| mean | 0.269573 | 0.403643 |
| std | 0.373378 | 0.343753 |
| min | -0.5 | 0.0 |
| 25% | 0.0 | 0.0 |
| 50% | 0.121591 | 0.472917 |
| 75% | 0.55625 | 0.650568 |
| max | 1.0 | 1.0 |
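
The correlation between sentiment and retweet count mentioned earlier could be computed along the following lines. This is a sketch under a few assumptions: `sentiment_retweet_correlation` is a hypothetical helper, the sample frame contains only the five rows printed by `head()` above, and `retweet_count` is converted with `pd.to_numeric` because the `describe()` output lists only `polarity` and `subjectivity`, which suggests the count column is not stored as a numeric dtype.

```python
import pandas as pd

def sentiment_retweet_correlation(df):
    """Pearson correlation between polarity and retweet count."""
    # Convert defensively in case retweet_count is stored as object dtype.
    retweets = pd.to_numeric(df['retweet_count'])
    return df['polarity'].corr(retweets)

# Stand-in data: the five rows shown by head(), with counts kept as objects
# to mimic the suspected dtype of the real column.
sample = pd.DataFrame({
    'retweet_count': pd.Series([4158, 8477, 7173, 18847, 9127], dtype=object),
    'polarity': [0.254167, 0.0, 0.0, 0.0, 0.0],
})

r = sentiment_retweet_correlation(sample)
print(f'Pearson r on the sample rows: {r:.3f}')
```

Five rows are far too few to conclude anything; calling the helper on the full `trump_tweets` dataframe would use all 100 tweets.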
