## Using the Twitter API: Tutorial

##### Background: What is an API?

An Application Programming Interface (API) is the means by which a piece of software exposes some of its underlying functionality. Ideally an API is well documented so that application programmers can easily interact with it. 

We will look at a specific type of API: a Web API, which is an interface exposed by a website.

The practice of publishing APIs has allowed web communities to create an open architecture for sharing content and data. In this way, content that is created in one place can be dynamically posted and updated in multiple locations on the web. For example, Amazon or eBay APIs allow developers to use the existing retail infrastructure to create specialized web stores. Other APIs allow for:

- Smartphone applications (for accessing Twitter, LinkedIn, Facebook, etc.)
- Maps with location data (like Yelp)
- Online purchases (verification of credit-card data)
- Sharing content between social networking sites

###### Twitter API 

Many APIs require you to obtain an authorization key. For the Twitter API, you must create an application here: https://apps.twitter.com/app/new<br>

I can provide you with temporary keys to my bot's account during class. 

Otherwise, fill out the application form linked above as follows:<br>
Enter a Name and Description. You can use a placeholder such as http://www.google.com for the Website.<br>
Leave the Callback URL empty.<br>
Submit the form.

On the following page go to the Keys and Access Tokens tab and make a note of the <strong>API Key</strong> and <strong>API Secret</strong>. Scroll down and create an Access Token. Make a note of the <strong>Access Token</strong> and <strong>Access Token Secret</strong>.

In [None]:
# Paste your own keys (see the instructions above), or use the ones I give you during class
api_key = ""
api_secret = ""
access_token = ""
access_secret = ""
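
If you would rather not paste secrets into the notebook, one option is to read them from environment variables. The sketch below is only an illustration: the variable names (`TWITTER_API_KEY` and so on) and the `load_twitter_keys` helper are hypothetical, and neither Twitter nor tweepy requires them.

```python
import os

# Hypothetical environment variable names; pick whatever suits your setup.
ENV_NAMES = {
    "api_key": "TWITTER_API_KEY",
    "api_secret": "TWITTER_API_SECRET",
    "access_token": "TWITTER_ACCESS_TOKEN",
    "access_secret": "TWITTER_ACCESS_SECRET",
}


def load_twitter_keys(environ=os.environ):
    """Return the four credentials as a dict, or raise if any is missing."""
    missing = [env for env in ENV_NAMES.values() if not environ.get(env)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[env] for name, env in ENV_NAMES.items()}
```

With the variables set, `keys = load_twitter_keys()` followed by `api_key = keys["api_key"]` (and likewise for the other three) would fill in the names used in the rest of this tutorial.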

###### We will be using Tweepy to access data from Twitter

If you have not installed this package yet, go to the "Anaconda Prompt" terminal on your machine and execute:
<code>pip install tweepy</code>

In [14]:
import pandas as pd
import tweepy #package we will use to interact with the API. 

In [3]:
#Authorization
auth = tweepy.OAuthHandler(api_key, api_secret)
auth.set_access_token(access_token, access_secret)

api = tweepy.API(auth)

Now let's have some fun with all the Twitter data at your disposal!

In [4]:
#Print to the console the tweets from your timeline
public_tweets = api.home_timeline()
for tweet in public_tweets:
    print(tweet.text)

Picture Books to Help Kids Weather Our Age of Anxiety https://t.co/nhXfwDFNJ9 https://t.co/NGRaF9059q
".........." by Svetlana Melik-Nubarova
Follow the photographer: https://t.co/UfJqk2id1F

#500px #photography… https://t.co/nYVOTtYKd8
"Trail Runner" by Chaun Goins
Follow the photographer: https://t.co/fItyPeQu4I

#500px #photography #dog https://t.co/4fW0WQ0wrQ
How to road trip with your significant other, from a couple that does it for a living https://t.co/GUaZAlsNBo https://t.co/ydNZADxwl5
These are the best food cities in the world right now 🍽  https://t.co/O9O24M6ok3 https://t.co/ZimaCJrRcr
The best national park adventure for every zodiac sign https://t.co/SZUEzpFokg https://t.co/R3FUQd1gnZ
Today at 2pm, curators John Carpenter and Soyoung Lee along with Professor
Robert E. Harrist, Jr., will discuss tra… https://t.co/1sxSX2j2Py
#Kiser4Sullivan 🔷⚔️🔶

@kiser_rollin is a semifinalist for the prestigious Sullivan Award. 

Fan Vote is very import… https://t.co/y1l5RV5JY2
The Coney 

In [5]:
# Get information about a user
user = api.get_user('twitter')

print(user.screen_name)
#get follower count
print(user.followers_count)

Twitter
62576785


In [23]:
user

User(_api=<tweepy.api.API object at 0x000001D12B270828>, _json={'id': 873188775165382656, 'id_str': '873188775165382656', 'name': 'Rome Vacations', 'screen_name': 'rome_suggestbot', 'location': 'Rome, Lazio', 'profile_location': {'id': '7d588036fe12e124', 'url': 'https://api.twitter.com/1.1/geo/id/7d588036fe12e124.json', 'place_type': 'unknown', 'name': 'Rome, Lazio', 'full_name': 'Rome, Lazio', 'country_code': '', 'country': '', 'contained_within': [], 'bounding_box': None, 'attributes': {}}, 'description': 'Sending out great travel options for adventurers on a budget. All from an Airbnb and yelp data informed bot.', 'url': 'https://t.co/a4Gv7rV7Im', 'entities': {'url': {'urls': [{'url': 'https://t.co/a4Gv7rV7Im', 'expanded_url': 'http://www.yelp.com', 'display_url': 'yelp.com', 'indices': [0, 23]}]}, 'description': {'urls': []}}, 'protected': False, 'followers_count': 1, 'friends_count': 31, 'listed_count': 0, 'created_at': 'Fri Jun 09 14:43:21 +0000 2017', 'favourites_count': 1, 'ut

## JSON

JSON stands for JavaScript Object Notation<br>
JSON is a lightweight data-interchange format<br>
JSON is "self-describing" and easy to understand<br>
JSON is language independent<br>
Let's see an example below:

In [11]:
public_tweets #here's what the data looks like

[Status(_api=<tweepy.api.API object at 0x000001D12B270828>, _json={'created_at': 'Sun Mar 18 15:00:33 +0000 2018', 'id': 975386480720957440, 'id_str': '975386480720957440', 'text': 'Picture Books to Help Kids Weather Our Age of Anxiety https://t.co/nhXfwDFNJ9 https://t.co/NGRaF9059q', 'truncated': False, 'entities': {'hashtags': [], 'symbols': [], 'user_mentions': [], 'urls': [{'url': 'https://t.co/nhXfwDFNJ9', 'expanded_url': 'http://nyti.ms/2FRie0u', 'display_url': 'nyti.ms/2FRie0u', 'indices': [54, 77]}], 'media': [{'id': 975386478099361792, 'id_str': '975386478099361792', 'indices': [78, 101], 'media_url': 'http://pbs.twimg.com/media/DYlE1YHU8AAmxB7.jpg', 'media_url_https': 'https://pbs.twimg.com/media/DYlE1YHU8AAmxB7.jpg', 'url': 'https://t.co/NGRaF9059q', 'display_url': 'pic.twitter.com/NGRaF9059q', 'expanded_url': 'https://twitter.com/goodreads/status/975386480720957440/photo/1', 'type': 'photo', 'sizes': {'thumb': {'w': 150, 'h': 150, 'resize': 'crop'}, 'medium': {'w': 1092, 'h

As you can see, the JSON isn't very readable: there are tons of brackets, arrays, and colons that organize the object.<br>
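
One way to make output like this easier to read is to pretty-print it with the standard `json` module. The sketch below uses a hand-trimmed copy of a few fields from the first status in the output above, not the full object, and also shows how a nested value can be reached by chaining keys and indexes.

```python
import json

# A hand-trimmed copy of a few fields from the first status printed above.
status = {
    "created_at": "Sun Mar 18 15:00:33 +0000 2018",
    "id": 975386480720957440,
    "truncated": False,
    "entities": {
        "hashtags": [],
        "urls": [
            {
                "url": "https://t.co/nhXfwDFNJ9",
                "expanded_url": "http://nyti.ms/2FRie0u",
                "display_url": "nyti.ms/2FRie0u",
                "indices": [54, 77],
            }
        ],
    },
}

# indent=2 puts each key on its own line; sort_keys makes the order predictable.
pretty = json.dumps(status, indent=2, sort_keys=True)
print(pretty)

# Nested values are reached by chaining dictionary keys and list indexes.
first_url = status["entities"]["urls"][0]["expanded_url"]
print(first_url)
```

In the notebook, the same `json.dumps` call should work on `public_tweets[0]._json`, the raw dictionary that appears inside the `Status(...)` output.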

Most web APIs return data as JSON or XML. You can work with it using packages such as Beautiful Soup (for XML and HTML) or simply your knowledge of Python data structures, since JSON is essentially a combination of lists and dictionaries. Here we use list comprehensions to restructure the data as rows and columns (a DataFrame).

In [15]:
#let's structure it so we can use it
def toDataFrame(tweets):
    """Build a DataFrame with one row per tweet and one column per field."""
    DataSet = pd.DataFrame()

    # Fields describing the tweet itself
    DataSet['tweetID'] = [tweet.id for tweet in tweets]
    DataSet['tweetText'] = [tweet.text for tweet in tweets]
    DataSet['tweetRetweetCt'] = [tweet.retweet_count for tweet in tweets]
    DataSet['tweetFavoriteCt'] = [tweet.favorite_count for tweet in tweets]
    DataSet['tweetSource'] = [tweet.source for tweet in tweets]
    DataSet['tweetCreated'] = [tweet.created_at for tweet in tweets]

    # Fields describing the user who posted each tweet
    DataSet['userID'] = [tweet.user.id for tweet in tweets]
    DataSet['userScreen'] = [tweet.user.screen_name for tweet in tweets]
    DataSet['userName'] = [tweet.user.name for tweet in tweets]
    DataSet['userCreateDt'] = [tweet.user.created_at for tweet in tweets]
    DataSet['userDesc'] = [tweet.user.description for tweet in tweets]
    DataSet['userFollowerCt'] = [tweet.user.followers_count for tweet in tweets]
    DataSet['userFriendsCt'] = [tweet.user.friends_count for tweet in tweets]
    DataSet['userLocation'] = [tweet.user.location for tweet in tweets]
    DataSet['userTimezone'] = [tweet.user.time_zone for tweet in tweets]

    return DataSet

In [17]:
df = toDataFrame(public_tweets)
df.head()

Unnamed: 0,tweetID,tweetText,tweetRetweetCt,tweetFavoriteCt,tweetSource,tweetCreated,userID,userScreen,userName,userCreateDt,userDesc,userFollowerCt,userFriendsCt,userLocation,userTimezone
0,975386480720957440,Picture Books to Help Kids Weather Our Age of ...,2,9,Sprout Social,2018-03-18 15:00:33,15898172,goodreads,goodreads,2008-08-19 00:02:44,The largest site for readers and book recommen...,3726968,9326,,Pacific Time (US & Canada)
1,975386455366356992,""".........."" by Svetlana Melik-Nubarova\nFollo...",1,2,Hootsuite,2018-03-18 15:00:27,20431922,500px,500px,2009-02-09 12:33:25,A community for passionate photographers every...,3562405,3505,Worldwide,Pacific Time (US & Canada)
2,975386452757446657,"""Trail Runner"" by Chaun Goins\nFollow the phot...",1,7,Hootsuite,2018-03-18 15:00:26,20431922,500px,500px,2009-02-09 12:33:25,A community for passionate photographers every...,3562405,3505,Worldwide,Pacific Time (US & Canada)
3,975382609646751744,"How to road trip with your significant other, ...",8,18,trueAnthem,2018-03-18 14:45:10,16211434,TravelLeisure,Travel + Leisure,2008-09-09 21:16:34,"Your connection to the world of travel, brough...",4023859,967,"New York, NY",Eastern Time (US & Canada)
4,975377559801204736,These are the best food cities in the world ri...,8,28,SocialFlow,2018-03-18 14:25:06,17219108,CNTraveler,Condé Nast Traveler,2008-11-06 20:49:41,At home in the world.,3226304,46134,"New York, NY",Eastern Time (US & Canada)


In [19]:
df.iloc[0, 1]

'Picture Books to Help Kids Weather Our Age of Anxiety https://t.co/nhXfwDFNJ9 https://t.co/NGRaF9059q'

### Getting tweets from a single user

In [30]:
#last 50 tweets of user
user_tweets_last50 = api.user_timeline('twitter', count=50)

In [34]:
df_last50 = toDataFrame(user_tweets_last50) #same function we used previously
df_last50.head()

Unnamed: 0,tweetID,tweetText,tweetRetweetCt,tweetFavoriteCt,tweetSource,tweetCreated,userID,userScreen,userName,userCreateDt,userDesc,userFollowerCt,userFriendsCt,userLocation,userTimezone
0,974750027292823553,Survivors. Students. Activists.\n\n@Emma4Chang...,1528,5124,Twitter Web Client,2018-03-16 20:51:31,783214,Twitter,Twitter,2007-02-20 14:35:54,Your official source for what’s happening. Ne...,62576610,145,"San Francisco, CA",Pacific Time (US & Canada)
1,974647538845396992,RT @TwitterDublin: #StPatricksDay is trending!...,151,0,Twitter Web Client,2018-03-16 14:04:15,783214,Twitter,Twitter,2007-02-20 14:35:54,Your official source for what’s happening. Ne...,62576610,145,"San Francisco, CA",Pacific Time (US & Canada)
2,973973955517218817,@thegodkooper We like to keep our options open.,3,11,Twitter Web Client,2018-03-14 17:27:41,783214,Twitter,Twitter,2007-02-20 14:35:54,Your official source for what’s happening. Ne...,62576610,145,"San Francisco, CA",Pacific Time (US & Canada)
3,973641637065785344,"RT @TwitterLive: Sit center stage, and watch @...",52,0,Twitter Web Client,2018-03-13 19:27:10,783214,Twitter,Twitter,2007-02-20 14:35:54,Your official source for what’s happening. Ne...,62576610,145,"San Francisco, CA",Pacific Time (US & Canada)
4,973281312709718016,@SAMGREIS 👏 @sally_field for playing Twitter m...,2,28,Twitter Web Client,2018-03-12 19:35:22,783214,Twitter,Twitter,2007-02-20 14:35:54,Your official source for what’s happening. Ne...,62576610,145,"San Francisco, CA",Pacific Time (US & Canada)


We can use this toDataFrame function whenever we use the Twitter API, but *other APIs* will organize their data <b>differently</b>. Thus, you will need to look at the data and treat it as a combination of lists and dictionaries (key-value pairs) in order to work with it.<br> 
#### Potential quick ways to get what you are looking for:

pd.read_json(json) might work<br>
https://stackoverflow.com/questions/21104592/json-to-pandas-dataframe<br>
https://stackoverflow.com/questions/41584225/how-to-convert-json-to-a-dataframe-in-python
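
When `pd.read_json` does not fit because the records are nested, one fallback is to flatten each record into a single-level dictionary first. This is a minimal sketch: `flatten` is a hypothetical helper, and the sample record is hand-assembled from values in the outputs above, with the `user` nesting following the layout of a Twitter status.

```python
def flatten(record, parent_key="", sep="_"):
    """Flatten nested dictionaries into one level, joining keys with `sep`.

    Lists are left untouched, since how to unpack them depends on the API.
    """
    flat = {}
    for key, value in record.items():
        full_key = parent_key + sep + key if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested dictionaries, carrying the key prefix along.
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


records = [
    {
        "id": 975386480720957440,
        "retweet_count": 2,
        "user": {"screen_name": "goodreads", "followers_count": 3726968},
    },
]

rows = [flatten(record) for record in records]
print(rows[0])
```

A list of flat dictionaries like `rows` can be passed straight to `pd.DataFrame(rows)` to get one column per key.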


<strong>Important Note:</strong> To prevent bots from spamming, Twitter will restrict your access or lock you out if you perform too many actions automatically. 
You can slow down how quickly your cursor requests data to stay within the rate limit.
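
One simple way to slow requests down is to wrap whatever you are iterating over in a generator that pauses between fetches. The sketch below is generic Python rather than part of tweepy: `throttled` is a hypothetical helper, and the right interval depends on the rate limits Twitter publishes for each endpoint.

```python
import time


def throttled(items, min_interval):
    """Yield from `items`, leaving at least `min_interval` seconds
    between consecutive fetches from the underlying iterator."""
    iterator = iter(items)
    last_fetch = None
    while True:
        if last_fetch is not None:
            # Sleep off whatever remains of the interval since the last fetch.
            wait = min_interval - (time.monotonic() - last_fetch)
            if wait > 0:
                time.sleep(wait)
        last_fetch = time.monotonic()
        try:
            item = next(iterator)
        except StopIteration:
            return
        yield item
```

In the notebook this could wrap a paginated request, for example `for page in throttled(tweepy.Cursor(api.user_timeline, screen_name='twitter').pages(), 5):`, so that each page is requested at least five seconds after the previous one. tweepy also accepts `wait_on_rate_limit=True` when you create `tweepy.API(auth, ...)`, which makes it wait automatically once the limit is hit.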

#### Try out more actions on your own using the tweepy documentation here:
http://docs.tweepy.org/en/v3.5.0/api.html
#### Challenges
1. Get your own Twitter data or that of someone you're interested in<br>
    -analyze the sentiment over time<br>
    -visualize data using skills learned previously<br>
    -see the sample Trump mini project I posted...<br>
2. Get different types of Twitter data (based on location or a specific hashtag...)<br>
     -use the tweepy documentation or https://galeascience.wordpress.com/2016/03/18/collecting-twitter-data-with-python/<br>
3. Work with another API (we've shown you Reddit and Twitter; I have used Yelp and can help with that)
