## Get Latest Retweets Using Tweet ID

This script is used to download the latest retweets of a set of specified tweets using their IDs. It uses Twitter's [GET statuses/retweets/:id REST API](https://developer.twitter.com/en/docs/tweets/post-and-engage/api-reference/get-statuses-retweets-id).

In this case, I'm using it to download the latest retweets of tweets from four news sources: Mother Jones, Rabble, Breitbart, and Rebel News.

#### Rate Limits

For GET statuses/retweets/:id, the rate limits are as follows:
* 75 requests every 15 minutes
* No 24 hour limit

As each request is 100 tweets, this means you are permitted to download 7500 tweets every 15 minutes.

#### Tweepy Documentation

For Tweepy's documentation on GET statuses/retweets/:id, see [here](http://docs.tweepy.org/en/latest/api.html#API.retweets). 

### 1. Importing Libraries

In [1]:
import tweepy as tw
from tweepy import OAuthHandler

In [2]:
import json
import ast
import time

### 2. Gaining Access

Second, I have to list my user credentials to access the Twitter API.

In [3]:
# access_token = [ACCESS TOKEN HERE]
# access_token_secret = [SECRET ACCESS TOKEN HERE]
# consumer_key = [CONSUMER KEY HERE]
# consumer_secret = [SECRET CONSUMER KEY HERE]

access_token = '1149713584349470726-Jhg4lRPDr4KhQvTptLzln95EXJbvjp'
access_token_secret = 'kPbL8n2wqZJ467ci8F3UJfkPGSN56qqyFfu9xFq16UPnI'
consumer_key = 'fgLl7cExe3Wi3ufYPtqCsmtse'
consumer_secret = 'Q0wsv6CJ12gLjq9Gw3aq6eL259HWG9dOGuPe46xM4K60WIXPgR'

Third, I have to provide the keys and tokens to Twitter as part of the authorization process and, then, create the actual interface.

In [4]:
auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

api = tw.API(auth)

### 3. Functions

Next, I define the necessary functions.

The function below returns the 100 latest retweets of a tweet (specified using its tweet id) and returns it as a dictionary where each item within it is another dictionary representing each retweet.

In [5]:
def retweets(tweet_id):
    nested_dict = {}
    num_dict = 1
    results = api.retweets(id=tweet_id, count=100)
    for status in results:
        nested_dict[num_dict] = status._json
        num_dict += 1
    return(nested_dict)

The function below pulls the list of tweet IDs from the relevant .txt document and evaluates it as a Python expression (i.e. reads it properly as a list instead of as a string).

In [6]:
def read_file(document):
    with open(document, 'r') as f:
        data = ast.literal_eval(f.read())
    f.close()
    return(data)

The function below is used to name the .json file to which the retweets will be printed.

In [7]:
def name_json(username):
    document_name = str(username + "_retweets.json")
    return(document_name)

### 4. Driver Code

I load the list of tweet IDs from the relevant .txt document as id_list.

In [8]:
username = 'RebelNewsOnline' # this can be changed between each of the four news sources
document = str(username + "_tweet_ids.txt") #no need to change this line

temp_list = read_file(document)
id_list = temp_list[0:75]
print("Completing", len(id_list), "tweets out of", len(temp_list))

Completing 75 tweets out of 1714


I first create an empty dictionary, master_tweet_dict. This will contain *all* of the retweets of *all* tweets. Therefore, this will contain 100 retweets multiplied by the number of tweets collected per news source (something under 3,200). 

In [None]:
master_tweet_dict = {}
master_num = 1
running = True
counter = 0
error = False
batch_num = 0

for tweet_id in id_list:
    counter += 1
    try:
        single_tweet_dict = retweets(tweet_id) # This collects a dictionary of 100 retweets associated with one original tweet
        retweet_num = 1 # Next, it's going to go through each retweet in the dictionary and add it to the master dictionary
        inner_running = True
        while inner_running == True:
            try:
                master_tweet_dict[master_num] = single_tweet_dict[retweet_num]
            except KeyError: # Not all tweets will have 100 retweets, so if it reaches the limit in the dictionary, it'll stop the loop.
                master_num -= 1
                inner_running = False
            if inner_running == True:
                retweet_num += 1
                master_num += 1
    except tw.TweepError: # This terminates the loop if TweepError raised and prints out the error code.
        print("Error! Encountered error while trying to collect retweets from tweet number " + str(counter))
        try:
            print("Error code is " + str(tw.TweepError.response.text))
        except:
            print("Unknown error.")
        finally:
            running = False
            error = True
    if counter == 75:
        if running == True:
            print("Just finished collecting the retweets of 75 original tweets. Going to pause for 15 minutes.")
            time.sleep(300) # pauses the program for 5 minutes after completing the 75th request to avoid exceeding the rate limits
            print("Paused for 5 minutes so far. 10 minutes remaining.")
            time.sleep(300)
            print("Paused for 10 minutes so far. 5 minutes remaining.")
            time.sleep(300)
            counter = 0 # resets the counter to 0
            batch_num +=1
            print("Finished pausing for 15 minutes.")
        else:
            print("Failed on the 75th request.")
    if running == False:
        break
            
if error == False:
    print("Finished all requests. The total number of tweets collected is", batch_num*75)

Just finished collecting the retweets of 75 original tweets. Going to pause for 15 minutes.
Paused for 5 minutes so far. 10 minutes remaining.


### 5. Printing to JSON File

In [10]:
with open(name_json(username), 'w', encoding='utf8') as f:
    json.dump(master_tweet_dict, f)

### 6. Reading the JSON File

In [11]:
with open(name_json(username), 'r') as g:
    datastore = json.load(g)

In [33]:
print(datastore['1']) # This is just to test everything went properly.

KeyError: '133598'

### 7. Saving just the Users to a .txt File as a List

In [20]:
user_num = 1
users = []

while user_num < len(id_list)*75:
    try:
        users.append(datastore[str(user_num)]['user']['screen_name'])
        user_num += 1
    except KeyError:
        break

print("Finished adding", len(users), "users to list using", len(id_list)*75, "tweets.")

Finished adding 133597 users to list using 186975 tweets.


In [34]:
with open((str(username) + "users.txt"), 'w') as h:
    print(users, file=h)