## Print Timelines of Users

This script downloads the latest tweets from a set of specified users, saves each timeline to a .json file, and then writes the tweet IDs and dates/times to .txt files for later use. It uses Twitter's [GET statuses/user_timeline REST API](https://developer.twitter.com/en/docs/tweets/timelines/api-reference/get-statuses-user_timeline).

In this case, I'm using it to download the latest tweets of four news sources: Mother Jones, Rabble, Breitbart, and Rebel News.

#### Rate Limits

For GET statuses/user_timeline, the rate limits are as follows:
* 900 requests every 15 minutes
* 100,000 requests every 24 hours

You can download up to 3,200 of each user's most recent tweets. At the maximum of 200 tweets per request, each user costs 16 requests, so you are permitted to download the timelines of:
* 56 users every 15 minutes
* 6,250 users every 24 hours
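
The arithmetic above can be checked with a short sketch. It assumes every request returns the maximum of 200 tweets; the constant names are illustrative and not part of the original script.

```python
import math

MAX_TWEETS_PER_USER = 3200       # most recent tweets reachable per timeline
TWEETS_PER_REQUEST = 200         # assumed: every request asks for the maximum
REQUESTS_PER_15_MIN = 900
REQUESTS_PER_24_HOURS = 100_000

# Requests needed to page through one full timeline.
requests_per_user = math.ceil(MAX_TWEETS_PER_USER / TWEETS_PER_REQUEST)

# Whole timelines that fit inside each rate-limit window.
users_per_15_min = REQUESTS_PER_15_MIN // requests_per_user
users_per_24_hours = REQUESTS_PER_24_HOURS // requests_per_user

print(requests_per_user, users_per_15_min, users_per_24_hours)
```

If fewer tweets are requested per page, the cost per user rises accordingly and the number of users per window drops.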

### 1. Importing Libraries

In [None]:
import tweepy as tw
from tweepy import OAuthHandler

In [None]:
import json

### 2. Gaining Access

Next, I list the user credentials needed to access the Twitter API.

In [None]:
access_token = 'ACCESS TOKEN'
access_token_secret = 'SECRET ACCESS TOKEN'
consumer_key = 'CONSUMER KEY'
consumer_secret = 'SECRET CONSUMER KEY'
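
If you would rather not keep credentials in the notebook itself, one option is to read them from environment variables. The sketch below is only one possible approach: the environment variable names and the `load_credentials` helper are hypothetical, not part of the original script.

```python
import os

# Hypothetical environment variable names; pick whatever suits your setup.
ENV_NAMES = {
    "consumer_key": "TWITTER_CONSUMER_KEY",
    "consumer_secret": "TWITTER_CONSUMER_SECRET",
    "access_token": "TWITTER_ACCESS_TOKEN",
    "access_token_secret": "TWITTER_ACCESS_TOKEN_SECRET",
}

def load_credentials(environ=os.environ):
    """Return the four credentials as a dict, or raise if any is missing."""
    missing = [name for name in ENV_NAMES.values() if name not in environ]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {key: environ[name] for key, name in ENV_NAMES.items()}
```

Calling `load_credentials()` would then give a dictionary whose values can be assigned to `consumer_key`, `consumer_secret`, `access_token`, and `access_token_secret`.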

Then I pass the keys and tokens to Twitter as part of the authorization process and create the actual interface.

In [None]:
auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

api = tw.API(auth)

### 3. Functions

Next, I define the necessary functions.

The function below retrieves a specified user's timeline and returns it as a dictionary in which each value is itself a dictionary representing one tweet.

In [None]:
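def timeline(username):
    # Maps a 1-based position (1 = most recent tweet) to that tweet's raw JSON.
    nested_dict = {}
    # count=200 asks for the maximum number of tweets per request,
    # which is what the rate-limit arithmetic above assumes.
    results = tw.Cursor(api.user_timeline, screen_name=username, count=200,
                        tweet_mode="extended", include_rts=False).items()
    for num_dict, status in enumerate(results, start=1):
        nested_dict[num_dict] = status._json
    return nested_dict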
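
To make the shape of the returned dictionary concrete, here is a small offline sketch of the same numbering logic. It uses stand-in objects with a `_json` attribute instead of real API results; `to_nested_dict` and the sample tweets are made up for illustration.

```python
from types import SimpleNamespace

def to_nested_dict(statuses):
    """Number statuses from 1 in iteration order, keeping each one's raw JSON."""
    return {position: status._json
            for position, status in enumerate(statuses, start=1)}

# Stand-ins for the status objects the API would return (made-up data).
fake_statuses = [
    SimpleNamespace(_json={"id": 222, "created_at": "Wed Apr 15 18:30:00 +0000 2020"}),
    SimpleNamespace(_json={"id": 111, "created_at": "Mon Jan 06 09:05:00 +0000 2020"}),
]

nested = to_nested_dict(fake_statuses)
print(nested[1]["id"])   # first item yielded, i.e. the most recent tweet
print(len(nested))
```

Because the timeline is returned newest first, key 1 is the most recent tweet and the highest key is the oldest one collected.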
        

The function below builds the path of the .json file to which the timeline will be written.

In [None]:
def name_json(username):
    # Build the output path for this user's timeline file.
    document_name = "/Users/kateschneider/Documents/U of T 2019-2020/COVID-19 Research Project/News Source Timelines/" + username + "_timeline.json"
    return document_name
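
If the output folder needs to change between machines, a variant that takes the folder as an argument may be easier to reuse. This is a sketch only; `timeline_path` and its `base_dir` parameter are hypothetical and not part of the original script.

```python
from pathlib import Path

def timeline_path(username, base_dir):
    """Return base_dir/<username>_timeline.json as a Path object."""
    return Path(base_dir) / f"{username}_timeline.json"

# Example with a made-up relative folder name.
print(timeline_path("MotherJones", "timelines"))
```

The returned `Path` can be passed directly to `open()`, so the rest of the script would not need to change.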

### 4. Specifying Users

List the users whose timelines (latest tweets) will be downloaded.

In [None]:
usernames = ['MotherJones', 'rabbleca', 'BreitbartNews', 'RebelNewsOnline']

### 5. Printing to JSON File

In [None]:
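for user in usernames:
    try:
        tweets = timeline(user)
        with open(name_json(user), 'w', encoding='utf8') as f:
            json.dump(tweets, f)
    # TweepError is not imported by name, so reference it through the tw alias.
    except tw.TweepError as err:
        print('Error! Failed to complete request.')
        # The status code lives on the caught exception; response can be None.
        if err.response is not None:
            print(err.response.status_code)
        break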
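
One detail of the save step is worth spelling out: JSON object keys are always strings, so the integer keys of the timeline dictionary come back as strings once the file is reloaded. A minimal sketch with made-up tweet data:

```python
import json

# Integer keys, as in the dictionary built for each timeline (made-up data).
tweets = {1: {"id": 222}, 2: {"id": 111}}

# Round-trip through JSON text, as saving and reloading a file would do.
restored = json.loads(json.dumps(tweets))

print(list(tweets))     # [1, 2]
print(list(restored))   # ['1', '2']
```

This is why any lookup on reloaded data has to use string keys such as `'1'` or `str(num)`.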

### 6. Pulling Tweet IDs
This section pulls the IDs and dates of the latest tweets (up to 3,200) and saves them to .txt files for later.

In [None]:
handle = 'RebelNewsOnline'

In [None]:
with open(name_json(handle), 'r') as g:
    datastore = json.load(g)

The cell below collects all tweet IDs and dates into lists and writes each list to its own .txt file.

In [None]:
num = 1
id_list = []
date_list = []
id_document_name = str("/Users/kateschneider/Documents/U of T 2019-2020/COVID-19 Research Project/News Source Tweet IDs/" + handle + "_tweet_ids.txt")
date_document_name = str("/Users/kateschneider/Documents/U of T 2019-2020/COVID-19 Research Project/News Source Tweet IDs/" + handle + "_tweet_dates.txt")

while num < 3201:
    try: 
        id_list.append(datastore[str(num)]['id'])
        date_list.append(datastore[str(num)]['created_at'])
    except KeyError:
        print("Finished printing tweet IDs and dates. Total tweet IDs collected is " + str(num - 1))
        break
    num += 1

# If the loop ran to completion without a KeyError, num ends at 3201.
if num == 3201:
    print("Finished printing tweet IDs and dates. Total tweet IDs collected is " + str(num - 1))

print("Next, saving tweet IDs and tweet dates to .txt files")    
with open(id_document_name, 'w') as f:
    f.write(str(id_list))
print("Finished saving tweet IDs.")    
with open(date_document_name, 'w') as f:
    f.write(str(date_list))
print("Finished saving tweet dates.")    
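
Because each list is written with `str()`, the .txt files contain Python list literals. The sketch below shows one way to do the same extraction without a fixed upper bound and to read such text back later. It uses a made-up in-memory `sample` instead of a real file, and `extract_ids_and_dates` is a hypothetical helper.

```python
import ast

def extract_ids_and_dates(datastore):
    """Return (ids, dates) ordered by numeric key, i.e. newest tweet first."""
    keys = sorted(datastore, key=int)
    ids = [datastore[key]["id"] for key in keys]
    dates = [datastore[key]["created_at"] for key in keys]
    return ids, dates

# Made-up data shaped like a reloaded timeline file (string keys).
sample = {
    "2": {"id": 111, "created_at": "Mon Jan 06 09:05:00 +0000 2020"},
    "1": {"id": 222, "created_at": "Wed Apr 15 18:30:00 +0000 2020"},
}

ids, dates = extract_ids_and_dates(sample)

# Text as it would be written to the .txt file, then parsed back into a list.
saved_text = str(ids)
restored_ids = ast.literal_eval(saved_text)
print(restored_ids)
```

`ast.literal_eval` only evaluates literals, which makes it a safer choice than `eval` for reading these files back.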

You can use the cell below to find the dates/times of the latest and earliest tweets collected.

In [None]:
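# Key '1' is the most recent tweet; the highest key is the oldest one collected.
print(datastore['1']['created_at'])
# No timeline holds more than 3,200 tweets, so look up the last key instead of '3210'.
print(datastore[str(len(datastore))]['created_at'])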
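
To compare these dates rather than just print them, the `created_at` strings can be parsed into `datetime` objects. The sketch below assumes the strings follow the pattern `Wed Oct 10 20:19:24 +0000 2018` and that the default (English) locale is in use; the two sample values are made up.

```python
from datetime import datetime

# Assumed layout of created_at: weekday, month, day, time, UTC offset, year.
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"

def parse_created_at(value):
    """Parse a created_at string into a timezone-aware datetime."""
    return datetime.strptime(value, CREATED_AT_FORMAT)

# Made-up values standing in for the latest and earliest tweets.
latest = parse_created_at("Wed Apr 15 18:30:00 +0000 2020")
earliest = parse_created_at("Mon Jan 06 09:05:00 +0000 2020")

span = latest - earliest
print(span.days)   # whole days covered by the collected tweets
```

The same helper could be applied to every entry in the saved date list to filter tweets by time period.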