# Interacting with REST APIs

In this lecture we will use the public Twitter APIs to interact with Twitter.  

## Obtaining my Twitter Keys

To interact with Twitter we need to deal with OAuth authentication.  (This is complicated!!)  Fortunately, Twitter offers a simple way to obtain a secure token for single-user applications.  You will learn how to create a Twitter developer account and access token in Project 1.  

https://twitter.com/data_100

I have set up a developer account and created an application.  In the process Twitter generated a set of keys for me, which I have stored in my key file:

In [None]:
keyfile = '/Users/jegonzal/Documents/keys/data100_twitter_api.json'

Would you like to see its contents?  (Do I want students tweeting as me?)

Here is what a fake file looks like (these keys are not real...):

In [None]:
with open(keyfile + ".fake", "r") as f:
    print(f.read())

Naturally the values are all secret.  Now let's load the real deal.

### JSON loader

The following reads the JSON file into a Python dictionary:

In [None]:
import json
key_file = '/Users/jegonzal/Documents/keys/data100_twitter_api.json'
with open(key_file) as f:
    auth_keys = json.load(f)

I can examine my keys:

In [None]:
auth_keys.keys()

I won't examine the values ... (Why?)
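
One way to inspect a key file without leaking it is to print only the field names and a masked preview of each value. The sketch below assumes the four field names used when the session is created in this notebook; `mask_secret` and `summarize_keys` are hypothetical helpers, and the values are placeholders, not real keys.

```python
def mask_secret(value, visible=2):
    """Return the value with all but the last `visible` characters hidden."""
    value = str(value)
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def summarize_keys(keys):
    """Map each field name to a masked preview of its value."""
    return {name: mask_secret(value) for name, value in keys.items()}


# Placeholder values for illustration only (not real credentials).
example_keys = {
    "consumer_key": "abc123",
    "consumer_secret": "def456",
    "access_token": "ghi789",
    "access_token_secret": "jkl012",
}

for name, preview in summarize_keys(example_keys).items():
    print(name, preview)
```

Printing a masked preview is enough to confirm that the file loaded and that each field is non-empty, without putting a usable secret on screen or in a saved notebook.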

## Twitter Requests Session

I will use the OAuth support in `requests_oauthlib` to start an authenticated session:

In [None]:
from requests_oauthlib import OAuth1Session
session = OAuth1Session(auth_keys["consumer_key"],
                        client_secret=auth_keys["consumer_secret"],
                        resource_owner_key=auth_keys["access_token"],
                        resource_owner_secret=auth_keys["access_token_secret"])

# The Twitter REST APIs

We want to get all the Tweets from a user.  To do this we will use the timeline API:

https://developer.twitter.com/en/docs/tweets/timelines/api-reference/get-statuses-user_timeline.html

Skim the above web page to see how we call this API.  What do we need to know:

1. What kind of request (`GET`, `POST`, ...)?
1. What are the parameters or data that we should send?
1. What are the returned fields?
1. Rate limits ...
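
To make the first two questions concrete, the sketch below builds the URL that a `GET` request would use, without sending anything over the network. It uses only the standard library; `build_get_url` is a hypothetical helper, and `screen_name` and `count` are the parameters this notebook passes to the timeline endpoint.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_get_url(base_url, params):
    """Append URL-encoded query parameters to a base URL."""
    return base_url + "?" + urlencode(params)


base = "https://api.twitter.com/1.1/statuses/user_timeline.json"
full_url = build_get_url(base, {"screen_name": "UCBIDS", "count": 200})
print(full_url)

# Parsing the URL back recovers the parameters (as lists of strings).
print(parse_qs(urlsplit(full_url).query))
```

With a `GET` request the parameters travel in the query string of the URL, which is the encoding that `session.get(url, params=...)` performs for us.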


## Getting the Timeline

The following gets the timeline for the `UCBIDS` account:

In [None]:
url = "https://api.twitter.com/1.1/statuses/user_timeline.json"
resp = session.get(url, params={"screen_name": "UCBIDS"})
resp

### Examining the Response
1. What is its format?

In [None]:
dict(resp.headers)

In [None]:
resp.content
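
Two things worth pulling out of the headers are the content type and the rate-limit counters. The sketch below works on a plain dictionary, so it runs without a live response. The header values are made up for illustration, `summarize_headers` is a hypothetical helper, and the `x-rate-limit-*` names assume the headers that Twitter's v1.1 API documents for rate-limited endpoints.

```python
from datetime import datetime, timezone


def summarize_headers(headers):
    """Extract the content type and rate-limit details from response headers."""
    # Header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    remaining = lowered.get("x-rate-limit-remaining")
    reset = lowered.get("x-rate-limit-reset")
    return {
        "content_type": lowered.get("content-type"),
        "remaining": int(remaining) if remaining is not None else None,
        # The reset time is assumed to be seconds since the Unix epoch (UTC).
        "reset_at": (datetime.fromtimestamp(int(reset), tz=timezone.utc)
                     if reset is not None else None),
    }


# Made-up header values for illustration only.
example_headers = {
    "Content-Type": "application/json;charset=utf-8",
    "x-rate-limit-limit": "900",
    "x-rate-limit-remaining": "899",
    "x-rate-limit-reset": "1500000000",
}

summary = summarize_headers(example_headers)
print(summary)
```

With a real response, the same helper could be called as `summarize_headers(dict(resp.headers))`.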

## Parsing the JSON Content

The response is encoded in JSON (see the headers or the content).  We will use the `json` parsing library built into Python:

In [None]:
bd_tweets = json.loads(resp.content)
bd_tweets

How many tweets did we get?

In [None]:
len(bd_tweets)

### Examining a Tweet

1. What fields do we have?
1. What is the recursive structure?

In [None]:
bd_tweets[0]
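
A real tweet has many fields, so it can help to list the paths through the nested dictionaries rather than read the raw output. The following is a minimal sketch of that idea: `describe_structure` is a hypothetical helper, and `example_tweet` is a hand-written stand-in with only a handful of fields, not an actual API response.

```python
def describe_structure(obj, prefix=""):
    """Return a list of 'path: type' strings for a nested JSON value."""
    lines = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else key
            lines.extend(describe_structure(value, path))
    elif isinstance(obj, list):
        # Describe only the first element, assuming the list is homogeneous.
        if obj:
            lines.extend(describe_structure(obj[0], prefix + "[]"))
        else:
            lines.append(f"{prefix}[]: empty")
    else:
        lines.append(f"{prefix}: {type(obj).__name__}")
    return lines


# Hand-written stand-in for a tweet (illustration only).
example_tweet = {
    "id": 1,
    "created_at": "Wed Oct 10 20:19:24 +0000 2018",
    "text": "An example tweet",
    "user": {"screen_name": "example_user", "followers_count": 10},
    "entities": {"hashtags": [{"text": "datascience"}]},
}

for line in describe_structure(example_tweet):
    print(line)
```

Calling `describe_structure(bd_tweets[0])` on a real tweet would list every nested field in the same way.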

### Examining First 5 Tweets:

We can loop over the dictionaries and print fields:

In [None]:
for t in bd_tweets[:5]:
    print(t['created_at'])
    print(t['text'], "\n")
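
The `created_at` field is a string, not a date. As a rough sketch, it can be parsed with the standard library, assuming the timestamp layout shown in the sample string below (the sample value is illustrative, and `parse_created_at` is a hypothetical helper). Note that `%a` and `%b` depend on the locale, so this assumes English day and month names.

```python
from datetime import datetime

# Format of strings such as "Wed Oct 10 20:19:24 +0000 2018".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_created_at(text):
    """Parse a `created_at` string into a timezone-aware datetime."""
    return datetime.strptime(text, CREATED_AT_FORMAT)


stamp = parse_created_at("Wed Oct 10 20:19:24 +0000 2018")
print(stamp.isoformat())
```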

## Making a Dataframe

Pandas can build a DataFrame from the list of dictionaries, or even directly from the raw JSON:

In [None]:
import pandas as pd
df = pd.DataFrame(bd_tweets)
df.head()

In [None]:
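import io
import pandas as pd
# Recent versions of pandas expect a file-like object rather than a literal JSON string
df = pd.read_json(io.StringIO(resp.text))
df.head()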

# Getting Lots of Tweets

The Twitter API returns at most 200 tweets per request.  To go further back in time, you repeatedly call the API, each time asking (with the `max_id` parameter) for tweets no newer than the oldest tweet you already have.  Each call returns up to 200 more tweets that are at least as old (including the oldest tweet again ...).

This is an excellent example of a stateless protocol.  The Twitter server does not need to remember which tweets it sent us.  Instead we tell it where to start reading in each request. 
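
The paging pattern can be tried without touching the network. The sketch below simulates it: `fake_timeline` is a hypothetical stand-in for the server that keeps no memory between calls, `ALL_TWEETS` is made-up data, and the page size of 4 is chosen only to keep the example small.

```python
# A stand-in for the server: tweet ids 1..10, newest (largest id) first.
ALL_TWEETS = [{"id": i, "text": f"tweet {i}"} for i in range(10, 0, -1)]


def fake_timeline(count, max_id=None):
    """Return up to `count` tweets with id <= max_id, newest first.

    The function keeps no memory of earlier calls: every request carries
    all the information needed to answer it.
    """
    eligible = [t for t in ALL_TWEETS if max_id is None or t["id"] <= max_id]
    return eligible[:count]


def fetch_all(count=4):
    """Page backwards until a request returns nothing new."""
    collected = {}
    max_id = None
    requests_made = 0
    while True:
        page = fake_timeline(count, max_id)
        requests_made += 1
        new = [t for t in page if t["id"] not in collected]
        if not new:
            break
        collected.update({t["id"]: t for t in new})
        # max_id is inclusive, so the oldest tweet comes back again next time.
        max_id = min(collected)
    return list(collected.values()), requests_made


tweets, requests_made = fetch_all()
print(len(tweets), "tweets in", requests_made, "requests")
```

Because the client supplies `max_id` on every call, any server replica could answer any request. The cost is one extra request at the end, which returns only the tweet we already have.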


The following block of code iterates until no new Tweets are returned. 

In [None]:
pd.to_datetime(bd_tweets[0]['created_at'])

In [None]:
def get_timeline(session, screen_name):
    """
    Collects all available tweets from a given screen name.
    session: a requests session that has been authenticated
    screen_name: the screen name from which to get the timeline

    returns: a list of all tweets (one dictionary per tweet)
    """
    url = "https://api.twitter.com/1.1/statuses/user_timeline.json"
    tweets = {}
    # Make an initial request
    resp = session.get(url, params = {"screen_name": screen_name, "count": 200})
    old_tweetid_len = -1
    # Loop while the response is OK and we are still getting new tweets
    while resp.ok and old_tweetid_len < len(tweets):
        new_tweets = {t['id']: t for t in json.loads(resp.content)}
        old_tweetid_len = len(tweets)
        tweets.update(new_tweets)
        if not tweets:
            break  # the account has no tweets, so min() below would fail
        min_id = min(tweets.keys())
        resp = session.get(url, params = {"screen_name": screen_name, "count": 200, "max_id": min_id})
        print("Oldest Tweet:", tweets[min_id]['created_at'], "\"", tweets[min_id]["text"], "\"")
    return list(tweets.values())

In [None]:
all_ds_tweets = get_timeline(session, "UCBIDS")

## Construct a DataFrame from the Tweets

In [None]:
df = pd.DataFrame(all_ds_tweets)[['id', 'created_at', 'text']]
df['created_at'] = pd.to_datetime(df['created_at'])
df['len'] = df['text'].str.len()
df = df.sort_values("created_at", ascending=False)
df.head()

# Post Tweets

We can also use the Twitter API to post tweets.  Examine the following page:

https://developer.twitter.com/en/docs/tweets/post-and-engage/overview


## Posting "Hello World"

We will post to the class Twitter Account https://twitter.com/data_100

In [None]:
url = "https://api.twitter.com/1.1/statuses/update.json"
resp = session.post(url, data = {"status": "Hello World!"})
resp

### Look at the Website

https://twitter.com/data_100

In [None]:
tweet = json.loads(resp.content)

## Deleting the Tweet

https://developer.twitter.com/en/docs/tweets/post-and-engage/api-reference/post-statuses-destroy-id

In [None]:
url = "https://api.twitter.com/1.1/statuses/destroy/{old_id}.json".format(old_id = tweet['id'])
resp = session.post(url)
resp

### Look at the Website

https://twitter.com/data_100

## Liking Everything Culler

https://developer.twitter.com/en/docs/tweets/post-and-engage/api-reference/post-favorites-create

In [None]:
df[df['text'].str.contains("Culler")]

In [None]:
for tweetid in df[df['text'].str.contains("Culler")]['id']:
    url = "https://api.twitter.com/1.1/favorites/create.json"
    resp = session.post(url, data = {"id": tweetid})
    print(resp)

### Look at the Website

https://twitter.com/data_100

## Unliking Everything Culler

https://developer.twitter.com/en/docs/tweets/post-and-engage/api-reference/post-favorites-destroy

In [None]:
for tweetid in df[df['text'].str.contains("Culler")]['id']:
    url = "https://api.twitter.com/1.1/favorites/destroy.json"
    resp = session.post(url, data = {"id": tweetid})
    print(resp)

### Look at the Website

https://twitter.com/data_100

## Following a Hashtag

In [None]:
url = "https://stream.twitter.com/1.1/statuses/filter.json"
resp = session.post(url, data={"track": "datascience"}, stream=True)
resp

In [None]:
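try:
    for line in resp.iter_lines():
        if len(line) > 0:
            try:
                tweet = json.loads(line)
            except ValueError:
                # Not valid JSON: show the raw line and move on
                print(line)
                continue
            # Not every message on the stream is a tweet with a 'text' field
            if 'text' in tweet:
                print(tweet['text'])
except KeyboardInterrupt:
    # Interrupting the kernel is how we stop following the stream
    print("Interrupted")
finally:
    # It is important to close the connections since there is a limit on the number of active sessions.
    print("Closing Connection")
    resp.close()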
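
The line-by-line parsing can be tested on its own, without an open connection. Below is a minimal sketch under the assumption that the stream delivers one JSON message per line, with occasional blank lines and messages that are not tweets. `parse_stream_lines` is a hypothetical helper, and `example_lines` is made-up data standing in for what `resp.iter_lines()` might produce.

```python
import json


def parse_stream_lines(lines):
    """Yield one dictionary per JSON line, skipping blank and malformed lines."""
    for line in lines:
        if not line.strip():
            continue  # blank line, nothing to parse
        try:
            message = json.loads(line)
        except ValueError:
            continue  # not valid JSON; ignored in this sketch
        if isinstance(message, dict):
            yield message


# Made-up lines for illustration only.
example_lines = [
    b'{"text": "I love #datascience"}',
    b'',
    b'not json',
    b'{"limit": {"track": 5}}',
    b'{"text": "More datascience"}',
]

texts = [m["text"] for m in parse_stream_lines(example_lines) if "text" in m]
print(texts)
```

Writing the parser as a generator keeps it lazy, so it could be handed `resp.iter_lines()` directly and would process messages as they arrive.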
