## Imports

In [None]:
import requests  # Communicates with Twitter server
import pandas    # Manipulates and saves tabular data
import datetime  # Helper for changing date stamps

# Python will search for a file called api_keys.py in the same directory.
from api_keys import bearer_token

## Settings
We could pass these values directly to ```requests.get()```, but they are easier to read and modify if we prepare them first.

In [None]:
url = "https://api.twitter.com/2/tweets/search/recent"
columns = ["created_at", "text", "lang"]
columns_string = ",".join(columns) # This creates a string of columns separated
                                   # by commas.  We still need the columns as a
                                   # list for later when we prepare the dataframe.
parameters = {
                "query":        "#academictwitter",
                "max_results":  100,
                "tweet.fields": columns_string
            }
headers = {"Authorization": f"Bearer {bearer_token}"}
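
To see why the columns are joined into one string, it can help to look at the query string that ends up in the URL. The sketch below uses the standard library's ```urlencode``` as an approximation of the encoding that ```requests``` performs for us. It does not send anything to Twitter, and the exact string ```requests``` produces may differ in small details.

```python
from urllib.parse import urlencode

columns = ["created_at", "text", "lang"]
parameters = {
    "query":        "#academictwitter",
    "max_results":  100,
    "tweet.fields": ",".join(columns),
}

# urlencode() turns the dict into key=value pairs joined by "&".
# Characters with a special meaning in URLs are percent-encoded:
# "#" becomes %23 and "," becomes %2C.
query_string = urlencode(parameters)
print(query_string)
# query=%23academictwitter&max_results=100&tweet.fields=created_at%2Ctext%2Clang
```

Note that the ```#``` in the hashtag has to be encoded, because an unencoded ```#``` in a URL marks the start of a fragment rather than part of the query.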

## Making the call
First, we prepare an empty list for the payloads, because a single run may collect many of them (one for each page of results).

We send the first request to the Twitter server, and when the server responds, the ```requests``` module packages it up nicely in a Response object, which we save in the variable ```response```.  This allows us to access properties of the response, such as the raw text, the URL, or the headers.

Because ```response.text``` is raw text, Python cannot navigate it as a dict or list.  We need to call the ```response.json()``` method.  This will parse the raw text and return Python data types.  We save that data in the ```response_json``` variable.

In [None]:
payloads = list()
response = requests.get(url, params=parameters, headers=headers)
response.raise_for_status()  # Stop here if the server returned an error status
response_json = response.json()
payloads.append(response_json)
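
The difference between raw text and parsed data can be tried out without contacting any server. In the sketch below, ```raw_text``` is a made-up, shortened payload (not real Twitter output), and ```json.loads()``` from the standard library stands in for ```response.json()```, which does roughly the same job on ```response.text```.

```python
import json

# Hypothetical, shortened payload written as raw text.
raw_text = (
    '{"data": [{"text": "Hello", "lang": "en"}], '
    '"meta": {"result_count": 1, "next_token": "abc123"}}'
)

# As a string, the payload cannot be navigated by key.
print(type(raw_text).__name__)  # str

# After parsing, we get ordinary Python dicts and lists.
parsed = json.loads(raw_text)
print(type(parsed).__name__)          # dict
print(parsed["meta"]["next_token"])   # abc123
print(parsed["data"][0]["lang"])      # en
```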

Twitter is likely to have more than 100 results for our query.  If so, we will need to send another request for each page of 100 results until there are no more.  To do this, we grab the ```response_json["meta"]["next_token"]``` value if there is one, and we send the same request again and include this token.

Other APIs you encounter may do this differently.  They may use a page number and page size, e.g. ```www.example.com/api?page=2&size=100```.

In [None]:
# The "meta" object only contains "next_token" when there is another page,
# so we use .get() to avoid a KeyError on the last (or only) page.
next_token = response_json.get("meta", {}).get("next_token")

# We send requests until we no longer have a next_token.
while next_token:

    # Add the next_token to the parameters we used in the last call, overwriting
    # the previous value.
    parameters["next_token"] = next_token
    response = requests.get(url, params=parameters, headers=headers)

    # Make sure that we break immediately if we hit a rate limit, before we
    # try to parse or store the response.
    if response.status_code == 429:
        raise Exception("Rate limit exceeded.")
    elif response.status_code >= 400:
        raise Exception("Client or server error.")

    response_json = response.json()
    payloads.append(response_json)

    # Look up the next token.  If there is none, .get() returns None and the
    # loop ends.
    next_token = response_json.get("meta", {}).get("next_token")
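
For comparison, here is a minimal sketch of the page-number style of pagination that some other APIs use. Everything in it is hypothetical: ```fetch_page()``` is a stand-in that slices a local list instead of calling a server, and real APIs differ in how they signal the last page. The sketch assumes that a page shorter than the requested size means there is nothing left.

```python
def fetch_page(page, size):
    """Hypothetical stand-in for an API call; returns one page of fake records."""
    records = list(range(250))  # Pretend the server holds 250 records
    start = (page - 1) * size
    return records[start:start + size]


def fetch_all(size=100):
    """Request page 1, 2, 3, ... until a page comes back short."""
    results = []
    page = 1
    while True:
        batch = fetch_page(page, size)
        results.extend(batch)
        # A page shorter than the requested size means we reached the end.
        if len(batch) < size:
            break
        page += 1
    return results


all_records = fetch_all()
print(len(all_records))  # 250
```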

## Pandas
We prepare the empty dataframe and assign it to the variable ```dataframe``` so that we can later loop through the results and insert them.

In [None]:
dataframe = pandas.DataFrame(columns=columns)

## Inserting results into the dataframe
For each tweet in our results, we prepare an empty row.  Then, within that loop, we loop over the columns and copy each column's value from the tweet into the row.  Finally, we add the row to the dataframe.

In [None]:
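for payload in payloads:
    # A payload has no "data" key when the query returned no tweets.
    for tweet in payload.get("data", []):
        row = dict()
        for column in columns:
            row[column] = tweet[column]
        # DataFrame.append() was removed in pandas 2.0.  pandas.concat() also
        # returns a new dataframe, so we must assign the result.
        dataframe = pandas.concat([dataframe, pandas.DataFrame([row])], ignore_index=True)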

In [None]:
dataframe.head()

| | created_at | text | lang |
|---|---|---|---|
| 0 | 2021-11-17T01:36:46.000Z | I'd be interested to hear from authors that us... | en |
| 1 | 2021-11-17T01:36:32.000Z | Hi guys! If you need any help with acads, essa... | en |
| 2 | 2021-11-17T01:36:14.000Z | Good morning! We're open for commissions! Rush... | en |
| 3 | 2021-11-17T01:35:51.000Z | Considering taking students in my Mass Atrocit... | en |
| 4 | 2021-11-17T01:35:42.000Z | #acwrimo Day 16: Today was a really weird day.... | en |


Note: Normally in Python, calling the ```append()``` method on a list will append in place.  That is, you do not need to assign a return value to a variable.

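Pandas does it differently.  Operations that add rows, such as ```pandas.concat()``` (or ```DataFrame.append()``` in versions of Pandas before 2.0), return a new dataframe instead of modifying the original.  If you find that your dataframe is empty at the end of the loop, it may be because you have not assigned that return value to a variable, so on each pass the new dataframe is created and then thrown away.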
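
The difference described in this note can be seen in a few lines. This is a small sketch with made-up data, using ```pandas.concat()``` as the row-adding operation. The variable names ```forgotten``` and ```kept``` are only for illustration.

```python
import pandas

# A list is modified in place, so no assignment is needed.
items = []
items.append({"lang": "en"})

# A one-row dataframe that we want to add to an empty one.
new_row = pandas.DataFrame([{"lang": "en"}])

# Without assignment, the combined dataframe is created and thrown away.
forgotten = pandas.DataFrame(columns=["lang"])
pandas.concat([forgotten, new_row], ignore_index=True)

# With assignment, the variable now points at the combined dataframe.
kept = pandas.DataFrame(columns=["lang"])
kept = pandas.concat([kept, new_row], ignore_index=True)

print(len(items), len(forgotten), len(kept))  # 1 0 1
```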

## Storage
Luckily, Pandas makes loading and saving dataframes almost illegally easy.  This will save the files in the current working directory, which is normally the directory that contains the script or notebook.  It will overwrite existing files without warning, so I recommend putting a timestamp in the filename to avoid confusion.  The following code gets the current time and formats it in a style based on ISO 8601, without milliseconds.

If you want to save the file to another location, learn more about the ```os``` package.

In [None]:
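# Colons are not allowed in Windows filenames, so we separate the time parts
# with hyphens instead.
timestamp = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
dataframe.to_csv(f"{timestamp}_results.csv")
dataframe.to_excel(f"{timestamp}_results.xlsx")  # Requires the openpyxl package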
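
As a starting point for saving to another location, here is a minimal sketch that builds a timestamped path with the ```os``` package. The folder name ```twitter_results``` and its location in the system's temporary directory are assumptions made for the example, and a fixed date is used instead of ```datetime.datetime.now()``` so that the output is reproducible.

```python
import datetime
import os
import tempfile

# A fixed moment for the example; in practice, use datetime.datetime.now().
moment = datetime.datetime(2021, 11, 17, 1, 36, 46, 123456)

# isoformat() with timespec="seconds" drops the microseconds.
iso_stamp = moment.isoformat(timespec="seconds")
print(iso_stamp)   # 2021-11-17T01:36:46

# Colons are not allowed in Windows filenames, so swap them for hyphens.
safe_stamp = iso_stamp.replace(":", "-")
print(safe_stamp)  # 2021-11-17T01-36-46

# Hypothetical output folder; create it if it does not exist yet.
output_directory = os.path.join(tempfile.gettempdir(), "twitter_results")
os.makedirs(output_directory, exist_ok=True)

# os.path.join() uses the correct path separator for the operating system.
csv_path = os.path.join(output_directory, f"{safe_stamp}_results.csv")
print(csv_path)
```

The resulting ```csv_path``` could then be passed to ```dataframe.to_csv()``` in place of the bare filename.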