# ReThink Media Twitter API: Tutorial and Examples

This notebook is a user manual, with example use cases, for ReThink Media's Twitter API functions. The functions in this notebook can:
- Search Tweets relevant to a query, over different time periods
- Save Tweets and Tweet metadata to a .csv file for later reference and use
- Create wordclouds for frequent keywords and hashtags
- Create plots of Tweet counts over time, with adjustable titles and axes

As an example use case for these functions, this notebook will compare the discussions around the coming out of two transgender celebrities: Caitlin Jenner and Elliot Page.

## Defining Functions

The first part of this notebook is dedicated to defining and explaining the functions mentioned above, with the example use case to follow.

### Authentication & Utility Functions

These utility functions are called inside the main functions, so they must be defined before the others are used. Run the cells below before running the other functions.

**IMPORTANT NOTE:** The Twitter API requires API keys and other authentication tokens in order to function properly. A user must have a Twitter Developer account with these keys available in order to use the functions in this notebook. If you have these keys available, create a text file named `.env` in the home folder for your notebook environment with the following format:

```
API_KEY="your_api_key"
API_KEY_SECRET="your_secret_api_key"
BEARER_TOKEN="your_bearer_token"
ACCESS_TOKEN="your_access_token"
ACCESS_SECRET="your_secret_access_token"
```
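Before calling the initialization functions, it can help to confirm that all five keys were actually loaded. The following is a minimal sketch rather than part of the original notebook: `missing_keys` and `REQUIRED_KEYS` are hypothetical names, and the sketch assumes `load_dotenv()` has already been called so that the values are visible in `os.environ`.

```python
import os

# names of the variables expected in the .env file shown above
REQUIRED_KEYS = ["API_KEY", "API_KEY_SECRET", "BEARER_TOKEN", "ACCESS_TOKEN", "ACCESS_SECRET"]

def missing_keys(env=None):
    """Return the required key names that are absent or empty in env (defaults to os.environ)."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]

# example with a partial, made-up environment (the values are placeholders, not real keys)
example_env = {"API_KEY": "abc", "BEARER_TOKEN": "xyz"}
print(missing_keys(example_env))
```

An empty list means every key was found; any names that are printed still need to be added to `.env`.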

In [2]:
# function to initialize Twitter API v1.1 instance (for 30-day and full archive search)
def init_api_1():
    
    # importing necessary modules and loading .env file
    from dotenv import load_dotenv
    import os
    import tweepy
    load_dotenv()
    
    # retrieving environment variables from .env file
    consumer_key = os.getenv("API_KEY")
    consumer_secret = os.getenv("API_KEY_SECRET")
    bearer_token = os.getenv("BEARER_TOKEN")
    access_token = os.getenv("ACCESS_TOKEN")
    access_secret = os.getenv("ACCESS_SECRET")
    
    # Twitter API authentication
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_secret)
    
    # instantiating Twitter API v1.1 reference
    api_1 = tweepy.API(auth, wait_on_rate_limit=True)
    
    return api_1

In [3]:
# function to initialize Twitter API v2 instance (for 7-day search)
def init_api_2():
    # importing necessary modules and loading .env file
    from dotenv import load_dotenv
    import os
    import tweepy
    load_dotenv()
    
    # retrieving environment variables from .env file
    consumer_key = os.getenv("API_KEY")
    consumer_secret = os.getenv("API_KEY_SECRET")
    bearer_token = os.getenv("BEARER_TOKEN")
    access_token = os.getenv("ACCESS_TOKEN")
    access_secret = os.getenv("ACCESS_SECRET")
    
    # instantiating Twitter API v2 reference
    api_2 = tweepy.Client(bearer_token=bearer_token,
                         consumer_key=consumer_key,
                         consumer_secret=consumer_secret,
                         access_token=access_token,
                         access_token_secret=access_secret,
                         wait_on_rate_limit=True)
    
    return api_2

In [4]:
# function to parse Twitter API v2 response into a DataFrame of Tweet data
def tweet_df(df, response, tweet_fields):
    
    users = response.includes['users']
    user_data = {user['id']: [user['public_metrics']['followers_count'], user['verified']] for user in users}
        
    # looping through each Tweet in response, parsing data
    for i in range(len(response.data)):
        tweet = response.data[i]
        tweet_id = tweet.id
        tweet_data = {}
        for field in tweet_fields:
            if tweet[field]:
                tweet_data[field] = tweet[field]
                
                # extracting hashtags from "entities" field and adding it as its own column
                if field == "entities":
                    try:
                        hashtag_data = tweet[field]['hashtags']
                        hashtags = [hashtag['tag'] for hashtag in hashtag_data]
                        tweet_data['entities_hashtags'] = hashtags
                    except KeyError:
                        tweet_data['entities_hashtags'] = None
                
                # separating metrics from "public_metrics" field and adding them as their own column
                if field == "public_metrics":
                    metrics = list(tweet[field].keys())
                    for metric in metrics:
                        tweet_data[metric] = tweet[field][metric]
                
            else:
                tweet_data[field] = None
                if field == "entities":
                    tweet_data['entities_hashtags'] = None
        
        # adding user data to DataFrame
        user = user_data[tweet['author_id']]
        tweet_data['followers_count'] = user[0]
        tweet_data['verified'] = user[1]
        
        df.loc[tweet_id] = tweet_data
    
    return df

### Tweet Search Functions

The Twitter API has different limits on how many API requests a user can make and how many Tweets they can receive, depending on how far back the user wants to search. For this reason, there are three different Tweet search functions, and the user should choose the function that best fits their use case:

- `search_7()`: Search Tweets within the past 7 days. Unlimited API requests, 500,000 Tweets per month.
- `search_30()`: Search Tweets within the past 30 days. 250 API requests, 25,000 Tweets per month.
- `search_full()`: Search Tweets from the full archive. 50 API requests, 5,000 Tweets per month.

The Twitter API also has a limit of 100 API requests per 15-minute interval, regardless of which function is used. If the quota runs out, the functions will wait until the time limit resets, and then continue collecting Tweets.
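To estimate how quickly a search will use up the request quota, the sketch below lists the page size that would be requested on each call. It relies on the detail, noted in the `search_7` code, that the API returns between 10 and 100 Tweets per call. `page_sizes` is a hypothetical helper, and it assumes every call returns a full page, which the API does not guarantee.

```python
def page_sizes(max_results, min_page=10, max_page=100):
    """Yield the page size to request on each call until max_results is covered."""
    remaining = max_results
    while remaining > 0:
        # clamp the request between the API's minimum and maximum page size
        size = min(max_page, max(min_page, remaining))
        yield size
        remaining -= size

print(list(page_sizes(250)))        # [100, 100, 50]
print(len(list(page_sizes(2000))))  # 20
```

Under these assumptions, a search with `max_results=2000` needs at least 20 requests, so a handful of such searches in a row can reach the 15-minute limit.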

The arguments for these functions are:
- `query`: The query to search the Twitter API for
- `start_date`: The date to start the search (default `None`). If `None`, the function will default to 7 days ago.
- `end_date`: The date to end the search (default `None`). If `None`, the function will default to today.
- `max_results`: The maximum number of Tweets to return in the DataFrame (default 20).
- `write_csv`: Boolean, whether to save the DataFrame as a csv file or not. Default `False`.
- `filename`: Filename for the csv if `write_csv` is `True`. Default name is `search_7.csv`, `search_30.csv`, or `search_full.csv`, depending on the function used.
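The date arguments are passed as strings and converted before being sent to the API. As a rough sketch of the two timestamp formats involved: the v1.1 30-day and full-archive endpoints take `fromDate`/`toDate` as `yyyyMMddHHmm` (the `"%Y%m%d%H%M"` pattern used in this notebook), while API v2 documents ISO 8601 timestamps of the form `YYYY-MM-DDTHH:mm:ssZ`. The helper names below are hypothetical, the standard library's `datetime.fromisoformat` stands in for `dateutil.parser`, and treating the input as UTC is an assumption.

```python
from datetime import datetime, timezone

def premium_format(date_string):
    """Format an ISO date string as yyyyMMddHHmm, as used by the v1.1 search endpoints."""
    return datetime.fromisoformat(date_string).strftime("%Y%m%d%H%M")

def v2_format(date_string):
    """Format an ISO date string as an ISO 8601 UTC timestamp, as documented for API v2."""
    dt = datetime.fromisoformat(date_string).replace(tzinfo=timezone.utc)  # assumes the input is UTC
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")

print(premium_format("2021-11-05"))  # 202111050000
print(v2_format("2021-11-05"))       # 2021-11-05T00:00:00Z
```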

In [14]:
# function to retrieve Tweets from the past 7 days relevant to a query
def search_7(query, start_date=None, end_date=None, max_results=20, write_csv=False, filename="search_7.csv"):
    
    # initializing API v2 instance
    api_2 = init_api_2()
    
    # parsing dates passed into function
    # NOTE: API v2 takes datetime objects (or ISO 8601 strings), not the "%Y%m%d%H%M"
    # strings used by the v1.1 search endpoints, so the parsed datetimes are passed as-is
    from dateutil import parser
    if start_date:
        start_date = parser.parse(start_date)
    if end_date:
        end_date = parser.parse(end_date)
    
    # setting Tweet and user data to be included in response
    tweet_fields = ["text", "attachments", "author_id", "context_annotations", "conversation_id", "created_at",
                   "entities", "geo", "in_reply_to_user_id", "lang", "public_metrics", "referenced_tweets"]
    user_fields = ["public_metrics", "verified"]
    
    # initializing variables for API calls and DataFrame for Tweet data
    import pandas as pd
    next_token = None
    num_tweets = 0
    tweets = pd.DataFrame(columns=tweet_fields+['followers_count', 'verified']+
                          ['entities_hashtags','retweet_count','reply_count','like_count','quote_count'])
    tweets.index.name = "Tweet ID"
    
    # making my own pagination loop to further examine the rate limit
    num_loops = 0
    while num_tweets < max_results:
        
        # the API only retrieves between 10 and 100 Tweets per call
        # NOTE: number of API results isn't consistent. max_results=100 doesn't guarantee 100 Tweets in response
        if max_results - num_tweets >= 100:
            num_results = 100
        else:
            num_results = max_results - num_tweets if max_results - num_tweets > 10 else 10
        
        # calling API and searching Tweets over past 7 days
        response = api_2.search_recent_tweets(f"{query} lang:en", 
                                              start_time=start_date,
                                              end_time=end_date,
                                              max_results=num_results,
                                              next_token=next_token,
                                              tweet_fields=tweet_fields,
                                              expansions='author_id',
                                              user_fields=user_fields)
        
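        # stopping if the API returned no Tweets (response.data is None in that case)
        if not response.data:
            break
        
        # adding Tweet data to DataFrame
        tweets = tweet_df(tweets, response, tweet_fields)
        num_tweets += len(response.data)
        num_loops += 1
        
        # setting the pagination token for the next loop; stopping when there are no more pages,
        # otherwise the loop would restart the search from the first page
        next_token = response.meta.get('next_token')
        if next_token is None:
            break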
        
    # dropping "public_metrics" since all the values are unpacked, adding "total_engagements"
    tweets.drop('public_metrics', axis=1, inplace=True)
    total_engagements = tweets["retweet_count"] + tweets["reply_count"] + tweets["like_count"] + tweets["quote_count"]
    tweets["total_engagements"] = total_engagements
        
    # writing Tweet DataFrame to csv file
    if write_csv:
        tweets.to_csv(filename)
    
    return tweets

In [6]:
# function to search Tweets within the past 30 days
# utilizes both API v1.1 and v2 to be consistent with 7-day search.
def search_30(query, start_date=None, end_date=None, max_results=20, write_csv=False, filename="search_30.csv"):
    # initializing API v1.1 instance
    api_1 = init_api_1()
    
    # parsing dates passed into function
    from dateutil import parser
    from datetime import datetime
    if start_date:
        start_date = parser.parse(start_date)
        start_date = start_date.strftime("%Y%m%d%H%M")
    if end_date:
        end_date = parser.parse(end_date)
        end_date = end_date.strftime("%Y%m%d%H%M")
    
    # retrieving Tweets from the past 30 days relevant to query using tweepy's pagination function
    import tweepy
    response_1 = tweepy.Cursor(api_1.search_30_day,
                               label="30day",
                               query=f"{query} lang:en",
                               fromDate=start_date,
                               toDate=end_date
                              ).items(max_results)
    
    # gathering Tweet ID's in a list
    tweet_ids = [tweet._json['id'] for tweet in response_1]
    
    # setting Tweet data to be included in response_2
    tweet_fields = ["text", "attachments", "author_id", "context_annotations", "conversation_id", "created_at",
                   "entities", "geo", "in_reply_to_user_id", "lang", "public_metrics", "referenced_tweets"]
    user_fields = ["public_metrics", "verified"]
    
    # initializing variables for API v2 calls and DataFrame for Tweet data
    import pandas as pd
    num_tweets = 0
    tweets = pd.DataFrame(columns=tweet_fields+['followers_count', 'verified']+
                          ['entities_hashtags','retweet_count','reply_count','like_count','quote_count'])    
    tweets.index.name = "Tweet ID"
    
    # loop to retrieve Tweets from ID's through API v2, 100 at a time
    api_2 = init_api_2()
    
    for start in range(0, len(tweet_ids), 100):
        # slicing tweet_ids since API v2 get_tweets only takes max 100 ID's per request
        slice_ids = tweet_ids[start:start+100]

        # retrieving Tweet data from API v2 and adding to DataFrame
        # (batches with no available Tweets are skipped, since response_2.data is None)
        response_2 = api_2.get_tweets(slice_ids, tweet_fields=tweet_fields, 
                                      expansions='author_id', user_fields=user_fields)
        if not response_2.data:
            continue
        tweets = tweet_df(tweets, response_2, tweet_fields)
    
    # dropping "public_metrics" since all the values are unpacked, adding "total_engagements"
    tweets.drop('public_metrics', axis=1, inplace=True)
    total_engagements = tweets["retweet_count"] + tweets["reply_count"] + tweets["like_count"] + tweets["quote_count"]
    tweets["total_engagements"] = total_engagements
    
    # writing Tweet DataFrame to csv file
    if write_csv:
        tweets.to_csv(filename)
    
    return tweets

In [7]:
# function to search Tweets within the full Tweet archive
# utilizes both API v1.1 and v2 to be consistent with 7-day search.
def search_full(query, start_date=None, end_date=None, max_results=20, write_csv=False, filename="search_full.csv"):
    # initializing API v1.1 instance
    api_1 = init_api_1()
    
    # parsing dates passed into function
    from dateutil import parser
    from datetime import datetime
    if start_date:
        start_date = parser.parse(start_date)
        start_date = start_date.strftime("%Y%m%d%H%M")
    if end_date:
        end_date = parser.parse(end_date)
        end_date = end_date.strftime("%Y%m%d%H%M")
    
    # retrieving Tweets from the full tweet archive relevant to query using tweepy's pagination function
    import tweepy
    response_1 = tweepy.Cursor(api_1.search_full_archive,
                               label="full",
                               query=f"{query} lang:en",
                               fromDate=start_date,
                               toDate=end_date
                              ).items(max_results)
    
    # gathering Tweet ID's in a list
    tweet_ids = [tweet._json['id'] for tweet in response_1]
    
    # setting Tweet data to be included in response
    tweet_fields = ["text", "attachments", "author_id", "context_annotations", "conversation_id", "created_at",
                   "entities", "geo", "in_reply_to_user_id", "lang", "public_metrics", "referenced_tweets"]
    user_fields = ["public_metrics", "verified"]
    
    # initializing variables for API calls and DataFrame for Tweet data
    import pandas as pd
    tweets = pd.DataFrame(columns=tweet_fields+["followers_count", "verified"]+
                          ['entities_hashtags','retweet_count','reply_count','like_count','quote_count'])
    tweets.index.name = "Tweet ID"
    
    # loop to retrieve Tweets from ID's through API v2, 100 at a time
    api_2 = init_api_2()
    for start in range(0, len(tweet_ids), 100):
        # slicing tweet_ids since API v2 get_tweets only takes max 100 ID's per request
        slice_ids = tweet_ids[start:start+100]

        # retrieving Tweet data from API v2 and adding to DataFrame
        # (batches with no available Tweets are skipped, since response_2.data is None)
        response_2 = api_2.get_tweets(slice_ids, tweet_fields=tweet_fields,
                                     expansions='author_id', user_fields=user_fields)
        if not response_2.data:
            continue
        tweets = tweet_df(tweets, response_2, tweet_fields)
    
    # dropping "public_metrics" since all the values are unpacked, adding "total_engagements"
    tweets.drop('public_metrics', axis=1, inplace=True)
    total_engagements = tweets["retweet_count"] + tweets["reply_count"] + tweets["like_count"] + tweets["quote_count"]
    tweets["total_engagements"] = total_engagements
    
    # writing Tweets DataFrame to csv file
    if write_csv:
        tweets.to_csv(filename)
    
    return tweets

### Wordclouds

This function creates wordclouds for frequent words and hashtags in Tweet data. To avoid making any unnecessary API calls, this function takes the DataFrame created from the search functions as an input. The arguments for this function are:

- `df`: DataFrame of Tweet data, created from one of the Tweet search functions defined above.
- `query`: The query used to create `df`. If passed into the function, the words in `query` are added to the stop words so they don't appear in the word cloud.
- `save_imgs`: Boolean, whether to save the images to a file or not. The filenames will be `wordcloud.png` and `hashtags.png` in the current working directory.
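The text cleaning behind a word cloud can be sketched with the standard library alone. The example below is an illustration under stated assumptions, not the notebook's implementation: `clean_tokens` is a hypothetical helper, `collections.Counter` stands in for the `wordcloud` and `nltk` steps, and the sample Tweets are made up. It separates hashtags from ordinary words, drops usernames, links, and the retweet marker, and counts how often each remaining token occurs.

```python
from collections import Counter

def clean_tokens(texts, query=None):
    """Return (word_counts, hashtag_counts) for an iterable of Tweet texts."""
    # words from the query are excluded, with quotation marks stripped
    stop = set(query.lower().replace('"', "").split()) if query else set()
    words, hashtags = [], []
    for token in " ".join(texts).lower().split():
        if token.startswith("#"):
            hashtags.append(token)
        elif token.startswith(("@", "http")) or token == "rt":
            continue  # usernames, links, and the retweet indicator are dropped
        elif token.isalpha() and token not in stop:
            words.append(token)
    return Counter(words), Counter(hashtags)

# made-up sample Tweets, for illustration only
sample = ["RT @user: Loved the new film #movies",
          "The film was great #movies #cinema https://t.co/x"]
word_counts, hashtag_counts = clean_tokens(sample, query='"new film"')
print(word_counts.most_common(3))
print(hashtag_counts.most_common(2))
```

Keeping the tokens in lists (rather than sets) matters here: the counts are what determine which words appear largest in a word cloud.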

In [8]:
def word_cloud(df, query=None, save_imgs=False):
    # combining DataFrame text column into one long string, doing some initial pre-processing
    import pandas as pd
    tweet_text = " ".join(df["text"])
    tweet_text = tweet_text.lower()
    tweet_text = tweet_text.replace("\n", " ")
    
    # splitting string into a list of words (a list, not a set, so repeated words keep their frequency),
    # separating out hashtags and removing usernames, links, and retweet indicator
    words = tweet_text.split()
    hash_list = [word for word in words if word.startswith("#")]
    word_list = [word for word in words
                 if not word.startswith(("#", "@", "http")) and word != "rt"]
    
    # using nltk tokenizer to further pre-process text, removing non-alpha words
    from nltk.tokenize import word_tokenize
    import nltk
    nltk.download('punkt')
    tweet_text = " ".join(word_list)
    word_list = word_tokenize(tweet_text)
    word_list = [word for word in word_list if word.isalpha()]
    
    # joining list of words into final cleaned string
    tweet_text = " ".join(word_list)
    
    # generating word cloud
    from wordcloud import WordCloud, STOPWORDS
    import matplotlib.pyplot as plt

    stopwords = set(STOPWORDS)
    
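    # adding words from query to stop words so they don't show up in the word cloud
    # (lowercased and stripped of quotation marks so they match the cleaned text)
    if query:
        stopwords.update(query.lower().replace('"', "").split())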

    # word cloud for text
    words_fig = plt.figure()
    word_cloud = WordCloud(background_color="white", width=3000, height=2000, max_font_size=500,
                           max_words=100, prefer_horizontal=1.0, stopwords=stopwords)
    word_cloud.generate(tweet_text)
    plt.imshow(word_cloud)
    plt.axis("off")
    plt.title("Frequent keywords in Tweets", fontsize=15)
    plt.show()
    if save_imgs:
        word_cloud.to_file("wordcloud.png")

    # word cloud for hashtags
    hash_fig = plt.figure()
    word_cloud = WordCloud(background_color="white", width=3000, height=2000, max_font_size=500,
                           max_words=100, prefer_horizontal=1.0, stopwords=stopwords)
    word_cloud.generate(" ".join(hash_list))
    plt.imshow(word_cloud)
    plt.axis("off")
    plt.title("Frequent hashtags in Tweets", fontsize=15)
    plt.show()
    if save_imgs:
        word_cloud.to_file("hashtags.png")
    
    return words_fig, hash_fig

### Attention Over Time Plots

This function plots the volume of tweets relevant to a query over time. Similar to the wordcloud function, this function avoids additional API calls and takes the DataFrame from the Tweet search functions as an input. The user can adjust aspects of the plot to fit different use cases, such as the title, plot type, and x-axis labels. The arguments for this function are:

- `df`: DataFrame of Tweet data, created from one of the Tweet search functions defined above.
- `query`: The query used to create `df`. If passed into this function, adds a subtitle to the plot with the query.
- `title`: The title of the plot.
- `xlabel`: "month", "year", or "day" (default "month"). Granularity of ticks and labels on the x-axis.
- `plot_type`: "line" or "bar" (default "line"). Choose between line or bar plot for attention over time.
- `figsize`: Default (10,5). Size of the figure outputted by this function.
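The core of the plot is a count of Tweets per calendar day. As a minimal sketch of that step, the example below uses only the standard library in place of the pandas `groupby` in the function; `daily_counts` is a hypothetical helper, and the timestamps are copied from the `created_at` column of the example output later in this notebook.

```python
from collections import Counter
from datetime import datetime

def daily_counts(timestamps):
    """Count timestamps per calendar day; returns a list of (date, count) sorted by date."""
    counts = Counter(datetime.fromisoformat(ts).date() for ts in timestamps)
    return sorted(counts.items())

# created_at values in the format returned by the search functions
sample = ["2021-11-05 21:40:43+00:00",
          "2021-11-05 22:13:32+00:00",
          "2021-11-12 20:29:16+00:00"]
for day, count in daily_counts(sample):
    print(day, count)
```

The resulting dates and counts are what the line or bar plot draws on the x-axis and y-axis, respectively.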

In [9]:
# plot function
def attention_plots(df, query=None, title="Tweet count over time", xlabel="month", plot_type="line", figsize=(10,5)):
    
    # ensuring the correct parameters have been passed
    assert plot_type in ("line", "bar"), "Please input 'line' or 'bar' into plot_type"
    assert xlabel in ("day", "month", "year"), "Please input 'day', 'month', or 'year' into xlabel"
        
    # converting dates to datetime, getting counts of tweets per day
    import pandas as pd
    df["created_at"] = pd.to_datetime(df["created_at"])
    daily_counts = df.groupby(df["created_at"].dt.date).count()
    dates = pd.to_datetime(daily_counts.index)
    
    # creating figure for plot
    import matplotlib.pyplot as plt
    figure = plt.figure(figsize=figsize)
    
    # line or bar graph, depending on input
    if plot_type == "line":
        plt.plot(daily_counts.index, daily_counts["text"])
    else:
        plt.bar(daily_counts.index, daily_counts["text"])
    
    # setting x-axis ticks to be month, day, or year, depending on input
    if xlabel == "month":
        period = "M"
        tick_labels = dates.to_period(period).unique().strftime("%b %Y")
    elif xlabel == "day":
        period = "D"
        tick_labels = dates.to_period(period).unique().strftime("%m-%d-%Y")
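    elif xlabel == "year":
        period = "Y"
        tick_labels = dates.to_period(period).unique().strftime("%Y")
    # converting periods to timestamps so matplotlib can place them on the date axis
    tick_locs = dates.to_period(period).unique().to_timestamp()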
    plt.xticks(ticks=tick_locs, labels=tick_labels, rotation=90)
    
    # setting plot title and subtitle (if query is passed)
    plt.suptitle(title, fontsize=15)
    if query:
        plt.title(f"Query: {query}")
    plt.xlabel("Date")
    plt.ylabel("Number of Tweets")
    plt.show()
    
    return figure

## Example Use Case: Caitlin Jenner & Elliot Page

The rest of the notebook will walk through an example use case for these functions: comparing the discussions around Caitlin Jenner and Elliot Page when they came out as transgender. The example will use all of the functions defined above as a simple baseline for users to see how they work and what their outputs are.

In [11]:
# importing a module so we can time how long the functions take
import time

In [18]:
# defining some search strings for the API queries
page_search = '"elliot page"'
jenn_search = '"caitlin jenner"'

### Searching for names and deadnames

We can get an initial sense of how differently these two celebrities are viewed by looking at how often they are referred to by their "deadname." A deadname is the birth name that a transgender person drops when they transition and choose a name that fits their gender. Using the `search_7` function, we can compare how often each person was mentioned by their chosen name versus their deadname over the past 7 days.

In [19]:
# running and timing the search_7 function for Elliot Page 
start = time.time()
page_7 = search_7(page_search, max_results=2000, write_csv=True)
end = time.time()

print(f"Time taken: {(end-start)/60} min")
page_7

Time taken: 0.27062906821568805 min


Tweet ID,text,attachments,author_id,context_annotations,conversation_id,created_at,entities,geo,in_reply_to_user_id,lang,referenced_tweets,followers_count,verified,entities_hashtags,retweet_count,reply_count,like_count,quote_count,total_engagements
1459257004531359754,elliot page built like arataki itto,,1433043975586385927,"[{'domain': {'id': '10', 'name': 'Person', 'de...",1459257004531359754,2021-11-12 20:29:16+00:00,,,,en,,195,False,,0,0,3,0,3
1459255598935994368,taylor comes in my room this morning fresh out...,,714662883884457984,"[{'domain': {'id': '10', 'name': 'Person', 'de...",1459255598935994368,2021-11-12 20:23:41+00:00,"{'annotations': [{'start': 0, 'end': 5, 'proba...",,,en,,102,False,,0,0,1,0,1
1459253913253777413,child Jinx aka Ellie from The Last of Us aka E...,,36757561,"[{'domain': {'id': '10', 'name': 'Person', 'de...",1459249667049500672,2021-11-12 20:16:59+00:00,"{'annotations': [{'start': 6, 'end': 9, 'proba...",,36757561,en,"[(type, id)]",93,False,,0,0,0,0,0
1459253811424514048,I am in love with Elliot Page,,894742291612667904,"[{'domain': {'id': '10', 'name': 'Person', 'de...",1459253811424514048,2021-11-12 20:16:34+00:00,"{'annotations': [{'start': 18, 'end': 28, 'pro...",,,en,,396,False,,0,0,2,0,2
1459233939273535496,was no one going to tell me that elliot page’s...,,2382415136,"[{'domain': {'id': '10', 'name': 'Person', 'de...",1459233939273535496,2021-11-12 18:57:37+00:00,"{'annotations': [{'start': 33, 'end': 38, 'pro...",,,en,,782,False,,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1456746529838960641,https://t.co/BE97NPg1ZR Elliot Page Showcases ...,{'media_keys': ['3_1362633293540155395']},1376744694302994438,"[{'domain': {'id': '10', 'name': 'Person', 'de...",1456746529838960641,2021-11-05 22:13:32+00:00,"{'annotations': [{'start': 24, 'end': 29, 'pro...",,,en,,5,False,,0,0,0,0,0
1456741725460893703,@Vera_Lustig @bindelj It wasn’t that at all- i...,,728317754479095808,"[{'domain': {'id': '10', 'name': 'Person', 'de...",1456689342332682243,2021-11-05 21:54:26+00:00,"{'mentions': [{'start': 0, 'end': 12, 'usernam...",,2626400402,en,"[(type, id)]",42,False,,0,2,4,0,6
1456738584459980804,I just realized I've never seen a film with El...,,907303569300316160,"[{'domain': {'id': '10', 'name': 'Person', 'de...",1456735639165181953,2021-11-05 21:41:58+00:00,"{'annotations': [{'start': 44, 'end': 49, 'pro...",,907303569300316160,en,"[(type, id)]",239,False,,0,1,0,0,1
1456738270293856257,@steelydante He was in a really bad one called...,,21926414,"[{'domain': {'id': '10', 'name': 'Person', 'de...",1456724142552719361,2021-11-05 21:40:43+00:00,"{'annotations': [{'start': 55, 'end': 58, 'pro...",,1189609796623917056,en,"[(type, id)]",3558,False,,0,0,1,0,1


In [20]:
# running the search_7 function again to see how many times Elliot Page has been deadnamed in the past 7 days
start = time.time()
page_7 = search_7('"ellen page"', max_results=2000, write_csv=True)
end = time.time()

print(f"Time taken: {(end-start)/60} min")
page_7

Time taken: 0.2727751096089681 min


Tweet ID,text,attachments,author_id,context_annotations,conversation_id,created_at,entities,geo,in_reply_to_user_id,lang,referenced_tweets,followers_count,verified,entities_hashtags,retweet_count,reply_count,like_count,quote_count,total_engagements
1459237711009644547,Oh shit! Ellen Page is shook Richard Briers is...,,1126153604237287424,,1459237711009644547,2021-11-12 19:12:36+00:00,"{'annotations': [{'start': 9, 'end': 18, 'prob...",,,en,,29,False,,0,0,0,0,0
1459228179814621186,BREAKING NEWS! Ellen Page revealed to be a dog.,,824582231599550464,,1459228179814621186,2021-11-12 18:34:43+00:00,"{'annotations': [{'start': 15, 'end': 24, 'pro...",,,en,,5,False,,0,0,0,0,0
1459210390408024064,RT @celebs_fap: Kate Mara and Ellen Page in My...,{'media_keys': ['7_1426751493068279808']},1427959021189832708,"[{'domain': {'id': '10', 'name': 'Person', 'de...",1459210390408024064,2021-11-12 17:24:02+00:00,"{'annotations': [{'start': 16, 'end': 24, 'pro...",,,en,"[(type, id)]",14,False,,491,0,0,0,491
1459193093115166721,feeling like ellen page in hard candy wearing ...,,18665876,"[{'domain': {'id': '10', 'name': 'Person', 'de...",1459193093115166721,2021-11-12 16:15:18+00:00,"{'annotations': [{'start': 13, 'end': 17, 'pro...",,,en,,909,False,,0,0,1,0,1
1459190664466370562,RT @divangreedy88: @yourdadspanties @XcloudTim...,,2179893584,"[{'domain': {'id': '10', 'name': 'Person', 'de...",1459190664466370562,2021-11-12 16:05:39+00:00,"{'annotations': [{'start': 104, 'end': 108, 'p...",,,en,"[(type, id)]",166,False,,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1456784323453997056,"In a new sci-fi from the mind of Ellen Page, a...",,1105484542649851905,"[{'domain': {'id': '30', 'name': 'Entities [En...",1456784323453997056,2021-11-06 00:43:43+00:00,"{'annotations': [{'start': 33, 'end': 42, 'pro...",,,en,,35,False,,0,0,0,0,0
1456763604175360007,@Quikwest when he was still known as Ellen Pag...,,4729879519,"[{'domain': {'id': '10', 'name': 'Person', 'de...",1456761374802071553,2021-11-05 23:21:23+00:00,"{'mentions': [{'start': 0, 'end': 9, 'username...",,33628729,en,"[(type, id)]",361,False,,0,1,0,0,1
1456751967745847297,RT @Mattsa_64: @PhisterPig @BARGH3ST @WHC_MSwo...,,1074460626355929088,,1456751967745847297,2021-11-05 22:35:08+00:00,"{'mentions': [{'start': 3, 'end': 13, 'usernam...",,,en,"[(type, id)]",191,False,,1,0,0,0,1
1456750422891409412,@GourmetGhosts @haleyellingson The *only* dire...,,1416017417139064838,,1455718847697207304,2021-11-05 22:29:00+00:00,"{'mentions': [{'start': 0, 'end': 14, 'usernam...",,1416017417139064838,en,"[(type, id)]",27,False,,0,1,0,0,1


In [21]:
# running and timing the search_7 function for Caitlin Jenner
start = time.time()
jenn_7 = search_7(jenn_search, max_results=2000, write_csv=True)
end = time.time()

print(f"Time taken: {(end-start)/60} min")
jenn_7

Time taken: 0.23578269084294637 min


Tweet ID,text,attachments,author_id,context_annotations,conversation_id,created_at,entities,geo,in_reply_to_user_id,lang,referenced_tweets,followers_count,verified,entities_hashtags,retweet_count,reply_count,like_count,quote_count,total_engagements
1459244861895462916,RT @waminette: Caitlin Jenner saying 9/11,,1096853399851683843,,1459244861895462916,2021-11-12 19:41:01+00:00,"{'annotations': [{'start': 15, 'end': 28, 'pro...",,,en,"[(type, id)]",325,False,,1,0,0,0,1
1459244118551498752,Caitlin Jenner saying 9/11,,2886479974,,1459244118551498752,2021-11-12 19:38:03+00:00,"{'annotations': [{'start': 0, 'end': 13, 'prob...",,,en,,70,False,,1,0,1,0,2
1459243704619778055,@SullyTrent @Ryanyates10 @20StoriesMCR How is ...,,1430314506656354308,,1456970421945962498,2021-11-12 19:36:25+00:00,"{'annotations': [{'start': 88, 'end': 101, 'pr...",,17509814,en,"[(type, id)]",0,False,,0,1,0,0,1
1459228759937085447,DIDN'T CAITLIN JENNER LITERALLY KILL SOMEONE B...,,1315458253594165249,"[{'domain': {'id': '10', 'name': 'Person', 'de...",1459228759937085447,2021-11-12 18:37:02+00:00,"{'annotations': [{'start': 7, 'end': 20, 'prob...",,,en,"[(type, id)]",24,False,,0,0,0,0,0
1459196493986873351,@GautamGambhir @ashwinravi99 I'm 100% sure in ...,,1143030051845238784,"[{'domain': {'id': '10', 'name': 'Person', 'de...",1458865037351682051,2021-11-12 16:28:49+00:00,"{'annotations': [{'start': 68, 'end': 81, 'pro...",,99448420,en,"[(type, id)]",3,False,,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1456822708252069890,@men_are_human by the way they wouldnt write a...,,1849073334,,1456744252621340673,2021-11-06 03:16:14+00:00,"{'annotations': [{'start': 132, 'end': 145, 'p...",,1027591390342119424,en,"[(type, id)]",1582,False,,0,1,2,0,3
1456796669647867908,RT @elizamondegreen: Why slam Rachel Dolezal b...,,1450971095662673921,"[{'domain': {'id': '47', 'name': 'Brand', 'des...",1456796669647867908,2021-11-06 01:32:46+00:00,"{'annotations': [{'start': 30, 'end': 43, 'pro...",,,en,"[(type, id)]",3,False,,43,0,0,0,43
1456795103691825152,"CHIPS on Charge, featuring Bruce Jenner but im...",,45032418,"[{'domain': {'id': '47', 'name': 'Brand', 'des...",1456795103691825152,2021-11-06 01:26:33+00:00,"{'annotations': [{'start': 27, 'end': 38, 'pro...",,,en,,980,False,,0,0,0,0,0
1456760179173863427,Check out this article: Caitlin Jenner's perso...,,88467128,"[{'domain': {'id': '3', 'name': 'TV Shows', 'd...",1456760179173863427,2021-11-05 23:07:46+00:00,"{'annotations': [{'start': 24, 'end': 37, 'pro...",,,en,,684,False,,0,0,0,0,0


In [22]:
# running the search_7 function again to see how many times Caitlin Jenner has been deadnamed in the past 7 days
start = time.time()
jenn_7 = search_7('"bruce jenner"', max_results=2000, write_csv=True)
end = time.time()

print(f"Time taken: {(end-start)/60} min")
jenn_7

Rate limit exceeded. Sleeping for 142 seconds.


Time taken: 2.6235800266265867 min


Tweet ID,text,attachments,author_id,context_annotations,conversation_id,created_at,entities,geo,in_reply_to_user_id,lang,referenced_tweets,followers_count,verified,entities_hashtags,retweet_count,reply_count,like_count,quote_count,total_engagements
1459258669825146883,@CaitlinStraugh1 Are you saying big Stu is the...,,1415315592056291330,"[{'domain': {'id': '10', 'name': 'Person', 'de...",1459253953854685193,2021-11-12 20:35:53+00:00,"{'annotations': [{'start': 36, 'end': 38, 'pro...",,1288136709595639813,en,"[(type, id)]",314,False,,0,1,0,0,1
1459252839662903300,@ObeliskReborn @emperorkhanarts @KaitMarieox Y...,,1108507261637480450,"[{'domain': {'id': '10', 'name': 'Person', 'de...",1455702493145468930,2021-11-12 20:12:43+00:00,"{'annotations': [{'start': 92, 'end': 103, 'pr...",,1453713306578046977,en,"[(type, id)]",16,False,,0,0,0,0,0
1459248139093827591,@mixtapeminimus1 Will always be Bruce Jenner t...,,232951266,"[{'domain': {'id': '10', 'name': 'Person', 'de...",1458902307039547399,2021-11-12 19:54:02+00:00,"{'annotations': [{'start': 32, 'end': 43, 'pro...",,1211094273753370629,en,"[(type, id)]",76,False,,0,0,1,0,1
1459247809979371529,Is that Bruce Jenner?Was this the moment Caitl...,,964712029,"[{'domain': {'id': '10', 'name': 'Person', 'de...",1459247809979371529,2021-11-12 19:52:44+00:00,"{'annotations': [{'start': 8, 'end': 19, 'prob...",,,en,"[(type, id)]",935,False,,0,0,1,0,1
1459246986096365574,@ResisterSis20 @SinclaireU FACT: Decathlon Gol...,{'media_keys': ['3_1459246137471238144']},1216876347198324736,"[{'domain': {'id': '11', 'name': 'Sport', 'des...",1385701660299776002,2021-11-12 19:49:27+00:00,"{'hashtags': [{'start': 207, 'end': 221, 'tag'...",,328756439,en,"[(type, id)]",70,False,[SayNoToJenner],0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1456734038517977089,"i’m crying, i’m reading an article abt the bru...",,1199891005891211264,"[{'domain': {'id': '10', 'name': 'Person', 'de...",1456734038517977089,2021-11-05 21:23:54+00:00,"{'annotations': [{'start': 43, 'end': 54, 'pro...",,,en,,221,False,,0,1,1,0,2
1456733930707640320,RT @uniqueblessed: HOLD THE FUCK UP BRUCE JENN...,,4083304753,"[{'domain': {'id': '10', 'name': 'Person', 'de...",1456733930707640320,2021-11-05 21:23:28+00:00,"{'annotations': [{'start': 36, 'end': 47, 'pro...",,,en,"[(type, id)]",378,False,,12,0,0,0,12
1456732774069686276,RT @uniqueblessed: HOLD THE FUCK UP BRUCE JENN...,,785131938092818432,"[{'domain': {'id': '10', 'name': 'Person', 'de...",1456732774069686276,2021-11-05 21:18:52+00:00,"{'annotations': [{'start': 36, 'end': 47, 'pro...",,,en,"[(type, id)]",20204,False,,12,0,0,0,12
1456732026699190279,RT @uniqueblessed: HOLD THE FUCK UP BRUCE JENN...,,324176720,"[{'domain': {'id': '10', 'name': 'Person', 'de...",1456732026699190279,2021-11-05 21:15:54+00:00,"{'annotations': [{'start': 36, 'end': 47, 'pro...",,,en,"[(type, id)]",2432,False,,12,0,0,0,12


Over the past seven days, Elliot Page was referenced by his correct name 689 times and deadnamed 109 times. Caitlin Jenner's experience is the reverse: she was referenced by her correct name only 197 times but deadnamed 681 times, even though she has been out as a trans woman for longer.
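To make the comparison easier to read, the raw counts can be turned into proportions. The snippet below is a minimal sketch rather than part of the original analysis: the `counts` dictionary and the `deadname_share` helper are hypothetical names introduced here, and the figures are copied by hand from the search results above. It also assumes the correct-name and deadname searches do not overlap, which may not hold exactly, since a single Tweet can mention both names.

```python
# Counts copied from the 7-day search results reported above.
# Assumption: each Tweet is counted in only one of the two searches.
counts = {
    "Elliot Page": {"correct": 689, "deadnamed": 109},
    "Caitlin Jenner": {"correct": 197, "deadnamed": 681},
}


def deadname_share(correct, deadnamed):
    """Return the fraction of name references that used the deadname."""
    total = correct + deadnamed
    return deadnamed / total if total else 0.0


# Share of deadname references for each person.
shares = {name: deadname_share(**c) for name, c in counts.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%} of references used the deadname")
# Elliot Page: 13.7% of references used the deadname
# Caitlin Jenner: 77.6% of references used the deadname
```

Under these assumptions, roughly one in seven references to Elliot Page used his deadname, compared with more than three in four references to Caitlin Jenner.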