# APIs 101 (oDCM)

*The focus in this tutorial lies on pagination (i.e., looping through multiple pages), and parameters (i.e., modifying the response of an API call). We know you love dad jokes, so guess what? We're back with many more jokes, and you're going to learn how to save them all! Finally, we show you how to conduct a user-level analysis of activity on Reddit!*

--- 

## Learning Objectives

* Send HTTP requests to a web API, and retrieve JSON responses
* Use parameters to modify the results of an API call
* Iterate over multiple pages of JSON responses 
* Extract and store results of an API request in lists and files

--- 

## Acknowledgements
This course draws on a variety of online resources which can be retrieved from the [course website](https://odcm.hannesdatta.com/#student-profile--prerequisites). 


--- 

## Support Needed?
For technical issues outside of scheduled classes, please check the [support section](https://odcm.hannesdatta.com/docs/course/support) on the course website.

---

## 1. Icanhazdadjoke

### 1.1 Make an API request

[Icanhazdadjoke.com](https://icanhazdadjoke.com) is a simple website whose API returns (randomized) *dad jokes*. Yes, we know that sounds silly, but we really like this API for its simplicity, which makes it ideal for explaining how APIs work.

So, the code cell below calls the joke API, and the result of the API request displays a joke. 

__Let's try it out__
Run the cell a few times to see that each call serves a new joke.

In [3]:
# request JSON output from icanhazdadjoke API
import requests
url = "https://icanhazdadjoke.com"
response = requests.get(url, headers={"Accept": "application/json"})
joke_request = response.json() 
print(joke_request)

{'id': 'trjG61Dlqzd', 'joke': "Why was Santa's little helper feeling depressed? Because he has low elf esteem.\n", 'status': 200}
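
Since `response.json()` returns an ordinary Python dictionary, individual fields can be pulled out by key. The sketch below reuses the response printed above as a hard-coded sample, so it runs without an internet connection; the variable names `sample_response` and `joke_text` are ours and not part of the API.

```python
# Sample copied from the output above; a live call would return a different joke.
sample_response = {
    "id": "trjG61Dlqzd",
    "joke": "Why was Santa's little helper feeling depressed? Because he has low elf esteem.\n",
    "status": 200,
}

# Only use the joke if the API reported success (status 200).
if sample_response["status"] == 200:
    joke_text = sample_response["joke"].strip()  # drop the trailing newline
    print(joke_text)
```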


### 1.2 Use parameters to modify the API results   

__Importance__

Probably you agree that dad jokes per se aren't that exciting. Wouldn't it be amazing to search for particular jokes instead?

APIs certainly provide the functionality to *customize* requests. Actually, that's where APIs make the biggest difference! You have probably already modified the results of an API call a dozen times without even knowing it. For example, if you Google the word `cat`, the results page may look something like this:

<img src="images/google.png" width=60% align="left"  style="border: 1px solid black"/>

Note how the link in the browser starts off with [`google.com/search?q=cat...`](https://www.google.com/search?q=cat)? What happened here is that your search query was passed to the Google Search API, and hence returned the results of the search query `cat`. That search query is even already embedded in the link itself. Cool, right?

__Let's try it out__

So, rather than filling out the search box on the website of Icanhazdadjoke.com itself, you can also tweak the search in the URL directly. Open your browser now at [https://icanhazdadjoke.com/search?term=cat](https://icanhazdadjoke.com/search?term=cat), and modify the `term` parameter to search for different jokes.

<img src="images/cat_jokes.gif" width=60% align="left"  style="border: 1px solid black"/>

With the idea of passing parameters to a website in mind, we can update the `search_url` and include the `params` argument, which contains a dictionary with parameters that further specify our request. Run the cell below to see cat jokes here in Jupyter Notebook.

In [4]:
import requests
search_url = "https://icanhazdadjoke.com/search"

response = requests.get(search_url, 
                        headers={"Accept": "application/json"}, 
                        params={"term": "cat"})
joke_request = response.json()
print(joke_request)

{'current_page': 1, 'limit': 20, 'next_page': 1, 'previous_page': 1, 'results': [{'id': '8UnrHe2T0g', 'joke': '‘Put the cat out’ … ‘I didn’t realize it was on fire'}, {'id': 'iGJeVKmWDlb', 'joke': 'My cat was just sick on the carpet, I don’t think it’s feline well.'}, {'id': 'daaUfibh', 'joke': 'Why was the big cat disqualified from the race? Because it was a cheetah.'}, {'id': '1wkqrcNCljb', 'joke': "Did you know that protons have mass? I didn't even know they were catholic."}, {'id': 'BQfaxsHBsrc', 'joke': 'What do you call a pile of cats?  A Meowtain.'}, {'id': 'O7haxA5Tfxc', 'joke': 'Where do cats write notes?\r\nScratch Paper!'}, {'id': 'TS0gFlqr4ob', 'joke': 'What do you call a group of disorganized cats? A cat-tastrophe.'}, {'id': '0wcFBQfiGBd', 'joke': 'Did you hear the joke about the wandering nun? She was a roman catholic.'}, {'id': 'AQn3wPKeqrc', 'joke': 'It was raining cats and dogs the other day. I almost stepped in a poodle.'}, {'id': '39Etc2orc', 'joke': 'Why did the man

The `joke_request` object now contains a list with all cat-related jokes (`joke_request['results']`), the search term (`cat`), and the total number of jokes (`10`).
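
To make the link between the `params` dictionary and the URL you typed into the browser explicit, here is a minimal sketch that builds the same query string by hand with Python's standard library. It only illustrates the idea and does not send a request; the variable names `full_url` and `query` are our own.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

search_url = "https://icanhazdadjoke.com/search"
params = {"term": "cat"}

# Turn the dictionary into "term=cat" and attach it to the URL after a "?"
full_url = f"{search_url}?{urlencode(params)}"
print(full_url)

# Going the other way: read the parameters back out of a URL
query = parse_qs(urlsplit(full_url).query)
print(query)
```

The first line printed is the same address as the browser link above, which is why editing the URL and editing `params` have the same effect.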

#### Exercise 1
1. Change the search term parameter to `dog` and revisit `joke_request['results']`. How many dog jokes are there? 
2. Write a function `find_jokes()` that takes a search term as an input parameter and returns the number of matching jokes from the `icanhazdadjoke` search API. 




#### Solutions

In [5]:
# Question 1 
search_url = "https://icanhazdadjoke.com/search"

response = requests.get(search_url, 
                        headers={"Accept": "application/json"}, 
                        params={"term": "dog"})
joke_request = response.json()
print(f"The number of dog jokes is: {joke_request['total_jokes']}")

The number of dog jokes is: 12


In [11]:
# Question 2
def find_jokes(term):
    search_url = "https://icanhazdadjoke.com/search"

    response = requests.get(search_url, 
                            headers={"Accept": "application/json"}, 
                            params={"term": term})
    joke_request = response.json()
    num_results = joke_request['total_jokes']
    return num_results

find_jokes("some-searchterm-you-would-like-to-try-out")

0

### 1.3 Pagination

__Importance__

Transferring data is costly, not strictly in a monetary sense, but in terms of *time*. So APIs are typically very frugal in returning data. Ideally, they return only the specific data the user needs to see at that moment. On icanhazdadjoke.com, for example, that would be a few jokes at most. This saves the website owner from paying for bandwidth, and guarantees that the site responds quickly to user input (such as navigating the site, or searching for jokes).

However, when using APIs for research purposes, we are frequently interested in obtaining *everything*. What's the use, for example, to obtain a book's most recent 10 reviews, if there are hundreds of reviews written?

We think you see where we're going with this... 

__Let's try it out__

So, let's try to grab all of the 649 jokes currently available at Icanhazdadjoke.com. The API output, unfortunately, only shows the *first 20 jokes*. To retrieve the remaining 629 jokes, you need *pagination*. That is, the API divides the data into smaller subsets that can be accessed on various pages, rather than returning all output at once. 

Let's retrieve the first batch of dad jokes. Note that here we're searching for the `term` `""` (an empty string), which brings us to the entire set of jokes available via the API. In practice, searching for `""` is often blocked by APIs, simply because the site doesn't *want* you to extract an entire copy of their data. In that case, you'd have to become creative to obtain your seeds.

In [18]:
search_url = "https://icanhazdadjoke.com/search"

response = requests.get(search_url, 
                        headers={"Accept": "application/json"}, 
                        params={"term": ""})
joke_request = response.json()
joke_request['results'] = '' # let's remove all jokes, and only look at the other attributes in the JSON response
joke_request

{'current_page': 1,
 'limit': 20,
 'next_page': 2,
 'previous_page': 1,
 'results': '',
 'search_term': '',
 'status': 200,
 'total_jokes': 649,
 'total_pages': 33}

You notice that by default, each page contains 20 jokes (see `limit` in the JSON response above), where page 1 shows jokes 1 to 20, page 2 jokes 21 to 40, ..., and page 33 jokes 641 to 649. 

You can adjust the number of results on each page (max. 30) with the `limit` parameter (e.g., `params={"limit": 10}`). In practice, almost every API on the web limits the results of an API call (`100` is also a common cap).

In the example below, we set `limit` equal to `10`, `20`, and `30`, and see how it affects the number of total pages (`total_pages`) on which jokes are listed. 

In [20]:
for limit in range(10, 31, 10):
    response = requests.get(search_url, 
                            headers={"Accept": "application/json"}, 
                            params={"term": "", 
                                   "limit": limit})
    joke_request = response.json()
    print(f"Limit {limit} gives {joke_request['total_pages']} pages")

Limit 10 gives 65 pages
Limit 20 gives 33 pages
Limit 30 gives 22 pages


As expected we find that the higher the limit, the more results fit on a single page, and thus the *lower the number of pages* to loop through.
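
The page counts above follow from simple arithmetic: the number of pages is the total number of jokes divided by the limit, rounded up. The short sketch below checks this for the 649 jokes reported earlier; the helper name `expected_pages` is hypothetical and not part of the API.

```python
import math

def expected_pages(total_items, limit):
    """Number of pages needed to show total_items with `limit` items per page."""
    return math.ceil(total_items / limit)

total_jokes = 649  # value of `total_jokes` in the response above
for limit in (10, 20, 30):
    print(f"Limit {limit} gives {expected_pages(total_jokes, limit)} pages")
```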

--- 
#### Exercise 2

In addition to the `limit` parameter, you can specify the current page number with the `page` parameter (e.g., `params={"term": "", "page": 2}`). See the example in the next cell:

In [25]:
response = requests.get(search_url,
                        headers={"Accept": "application/json"},
                        params={"term": "",
                                "limit": 5,
                                "page": 2})
response.json()

{'current_page': 2,
 'limit': 5,
 'next_page': 3,
 'previous_page': 1,
 'results': [{'id': '0LuXvkq4Muc',
   'joke': "I knew I shouldn't steal a mixer from work, but it was a whisk I was willing to take."},
  {'id': '0ga2EdN7prc',
   'joke': 'How come the stadium got hot after the game? Because all of the fans left.'},
  {'id': '0oO71TSv4Ed',
   'joke': 'Why was it called the dark ages? Because of all the knights. '},
  {'id': '0oz51ozk3ob', 'joke': 'A steak pun is a rare medium well done.'},
  {'id': '0ozAXv4Mmjb',
   'joke': 'Why did the tomato blush? Because it saw the salad dressing.'}],
 'search_term': '',
 'status': 200,
 'total_jokes': 649,
 'total_pages': 130}

Adapt the function `find_jokes()`, such that it loops over *all available pages*, and stores the ids and jokes in a list. You can leave the `limit` parameter at its default value (20). Make sure that your function also works when you pass it a search `term`. 

Tip: to determine how many pages you need to loop through, you can use the `total_pages` field (e.g., there are only ten cat jokes, so in that case, 1 page would suffice).

#### Solutions

In [33]:
def find_jokes(term):
    search_url = "https://icanhazdadjoke.com/search"
    page = 1
    jokes = []

    while True: 
        response = requests.get(search_url, 
                                headers={"Accept": "application/json"}, 
                                params={"term": term,  # optionally you can add "limit": 20 but that's already the default so it doesn't change anything
                                        "page": page})
        joke_request = response.json()
        jokes.extend(joke_request['results'])
        # keep going only while there is a next page; `<=` would request one page too many
        if joke_request['current_page'] < joke_request['total_pages']:
            page += 1
        else: 
            return jokes

output = find_jokes("cat") # try running it with "", too!

In [34]:
print(f"You've collected {len(output)} jokes")

You've collected 10 jokes
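
Once the jokes sit in a list, you will usually want to store them in a file so you don't have to call the API again. Below is a minimal sketch that writes a list of joke dictionaries to a CSV file using only the standard library. The function name `save_jokes` and the file name `jokes.csv` are our own choices, and the two sample jokes are copied from the cat search output earlier; in practice you would pass in the `output` list returned by `find_jokes()`.

```python
import csv

def save_jokes(jokes, path):
    """Write a list of {'id': ..., 'joke': ...} dictionaries to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "joke"])
        writer.writeheader()
        writer.writerows(jokes)

# Small sample so the sketch runs without calling the API
sample_jokes = [
    {"id": "daaUfibh", "joke": "Why was the big cat disqualified from the race? Because it was a cheetah."},
    {"id": "BQfaxsHBsrc", "joke": "What do you call a pile of cats?  A Meowtain."},
]
save_jokes(sample_jokes, "jokes.csv")

# Read the file back in to check that nothing got lost
with open("jokes.csv", newline="", encoding="utf-8") as f:
    rows = list(csv.DictReader(f))
print(f"Stored and re-read {len(rows)} jokes")
```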


### 1.4 Wrap-up

To sum up, we have seen how *parameters* can be a powerful tool when working with APIs. They allow you to tailor your request to be more specific, or loop through multiple pages. 

In the API documentation, you typically find more information about the available parameters and the values they can take on. For example, the `icanhazdadjoke` [documentation](https://icanhazdadjoke.com/api) includes a section on the `/search` endpoint and the accepted parameters (`page`, `limit`, `term`). These parameters, however, differ from one API to another. So it's crucial to study each web service's API documentation carefully.

--- 
## 2. Reddit

### 2.1 Subreddits

Although we already touched upon the Reddit API last time, we'll provide a more thorough description of subreddits here as this entire tutorial is devoted to getting started with the API. Users can post content in subreddits which are niche communities around a specific topic. There is a subreddit for almost everything, and they all start with `reddit.com/r/...`, for example, [askreddit](https://www.reddit.com/r/AskReddit), [aww](https://www.reddit.com/r/aww/), [gifs](https://www.reddit.com/r/gifs/), [showerthoughts](https://www.reddit.com/r/Showerthoughts), [lifehacks](https://www.reddit.com/r/lifehacks), [getmotivated](https://www.reddit.com/r/GetMotivated), [moviedetails](https://www.reddit.com/r/MovieDetails), [todayilearned](https://www.reddit.com/r/todayilearned/), or [foodporn](https://www.reddit.com/r/FoodPorn/). 

<img src="images/reddit_science.png" width=60% align="left"  style="border: 1px solid black"/>

Subreddits are hosted by moderators and come with their own set of rules (e.g., links to papers you share in [`r/science`](https://www.reddit.com/r/science/) must be less than 6 months old). Other users can join a subreddit so that they receive updates about new posts and comments.

<img src="images/reddit_moderators.png" width=60% align="left"  style="border: 1px solid black"/>

#### Exercise 3
Consult the [`marketing`](https://www.reddit.com/r/marketing/hot/) subreddit and answer the following questions: 
1. For your thesis, you need to collect a couple more survey responses. Are you allowed to share a link to your survey in this subreddit? Please elaborate on how you came to this conclusion. 
2. You're a bit stubborn and decide to do it anyway and therefore run the risk of being reported by one of the moderators. How many moderators take care of managing this subreddit? 
3. Like other social media platforms, you can navigate towards Reddit's user-profiles and learn more about these persons. Inspect the profile of one of the moderators of the marketing subreddit, [`sixwaystop313`](https://www.reddit.com/user/sixwaystop313), and describe in your own words what types of information you can gather from this user. How is the feed organized? 

#### Solutions
1. No, the subreddit rules prescribe users not to post surveys and homework assignments (right sidebar).
2. `r/marketing` is moderated by 10 users (of which 1 AutoModerator)
3. On a user page you find the bio, trophies, communities the user moderates, connected accounts, and most importantly: all of the user's posts and comments.

---

### 2.2 API headers  


**Importance**  

To request data from the Reddit API we need to include so-called `headers` in our request. HTTP headers are an important part of the API request as they include meta-data associated with the request (e.g., type of browser, language, expected data format, etc.). 

**Let's try it out**  

Below we make a request to the moderators' page of the `marketing` subreddit that includes such a header. In the upcoming exercise, we make our very first request to the Reddit API and parse the output!

In [7]:
import requests
url = 'https://www.reddit.com/r/marketing/about/moderators/.json'

headers = {'authority': 'www.reddit.com', 'cache-control': 'max-age=0', 'upgrade-insecure-requests': '1', 'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36', 'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9', 'sec-fetch-site': 'same-origin', 'sec-fetch-mode': 'navigate', 'sec-fetch-user': '?1', 'sec-fetch-dest': 'document', 'accept-language': 'en-GB,en;q=0.9'}
response = requests.get(url, headers=headers)
json_response = response.json()

#### Exercise 4
1. First, take a look at the `json_response` object. Then, leave out the `headers` parameter in your request, run the cell again, and inspect the `json_response` another time. Are there any differences? 
2. Write a for-loop that prints the moderator `name` of the `marketing` subreddit. Every subreddit includes a bot moderator (`AutoModerator`) which should not be included.
3. Convert your code from the previous exercise into a function `get_mods()` that takes a `subreddit` as input and returns a list of moderators' names. Test your function for the `science` subreddit. How many moderators does it have? 

#### Solutions
1. Without the `headers` parameter, it returns an error code (429).

In [8]:
# Question 2 
# don't forget to run the request object with headers again!
for item in json_response['data']['children']: 
    moderator_name = item['name']
    if moderator_name != 'AutoModerator': 
        print(moderator_name)

dpatrick86
v022450781
r0nin
Gustomaximus
everythingswan
sixwaystop313
shampine
JonODonovan
AptSeagull


In [9]:
# Question 3
def get_mods(subreddit):
    moderator_names = []
    response = requests.get(f'https://www.reddit.com/r/{subreddit}/about/moderators/.json', headers=headers)
    json_response = response.json()
    for item in json_response['data']['children']:
        moderator_name = item['name']
        if moderator_name != 'AutoModerator':
            moderator_names.append(moderator_name)
    return moderator_names
    
science_moderators = get_mods('science')
print(f"The science subreddit is moderated by {len(science_moderators)} users")

The science subreddit is moderated by 1544 users
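
As the first solution showed, a request can fail (e.g., with status code 429), in which case the body does not have the structure our loop expects. One defensive pattern, sketched below, is to check `status_code` before parsing. The helper name `json_or_none` is hypothetical, and the two stand-in objects merely mimic the `status_code` attribute and `json()` method of a `requests` response so the example runs offline.

```python
from types import SimpleNamespace

def json_or_none(response):
    """Return the parsed JSON body, or None if the request did not succeed."""
    if response.status_code != 200:
        print(f"Request failed with status code {response.status_code}")
        return None
    return response.json()

# Stand-ins for real responses (assumption: only these two members are needed)
blocked = SimpleNamespace(status_code=429, json=lambda: {})
ok = SimpleNamespace(status_code=200, json=lambda: {"data": {"children": []}})

print(json_or_none(blocked))
print(json_or_none(ok))
```

With a real request you would pass the object returned by `requests.get(...)` and only continue parsing when the result is not `None`.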


---
### 2.3 Pagination

**Importance**  

In addition to subreddits (`r/...`) and moderator pages (`.../about/moderators`), Reddit users have their own profile page. Let's have another look at the marketing moderator [profile](https://www.reddit.com/user/sixwaystop313) we saw before. Each of the `children` in the `data` is characterized by a type (e.g., `t1` = comment, `t3` = post), subreddit, timestamp, number of comments, upvotes, downvotes, and many others. 

In [10]:
mod = "sixwaystop313"
response = requests.get(f'https://www.reddit.com/user/{mod}.json', headers=headers)
json_response = response.json()

#### Exercise 5
1. The `json_response` object contains both comments and posts that are ordered chronologically (as they appear on the profile page). Pick a comment of the author (`kind`: `'t1'`) and store the text of the comment in a variable called `comment_text`. 
2. What happens to `comment_text` once the author publishes another post? 
3. How many objects are stored in `json_response['data']['children']`? What does that mean? 

#### Solutions
1. At the moment of creating this solutions file, the 1st item in the list is a comment, which we extract as follows:
`comment_text = json_response['data']['children'][0]['data']['body']`. In your case, the first comment may be the 2nd (or 3rd, 4th, ...) item if the items before it are all posts. For that reason, the index (here `[0]`) may differ from time to time. 
2. Since the list items are ordered chronologically, new items are inserted at the beginning of the list and thus push existing items to the "right" (i.e., index 0 becomes index 1, etc.). Suppose that the author publishes another post; then index `[0]` would no longer contain a comment. And because post items are structured differently from comment items, this could break your script once you try to parse elements that do not exist. For example, posts do not have a `['body']` element that stores the comment text. 
3. The object comprises 25 items (`len(json_response['data']['children'])`). This means that only the 25 most recent comments and posts are shown, and thus that we need to apply pagination to obtain historical records.

As you just noticed, the API only returns a subset of all records (every time you scroll to the bottom of the page, it pulls in new data - ordered chronologically). After all, it would take ages to show all data for a user that has been active on Reddit since 2009! 

Similar to `icanhazdadjoke`, we apply pagination to tell the API which part of the data it needs to return. The difference, however, is that it's not a number (like `"page": 2`) but a string of characters that can only be obtained from the previous request (i.e., we cannot derive what the next key will be from a pattern, like: page 2, 3, ..., etc.). In fact, the request we already made contains this "secret" key in the attribute `after`:

In [11]:
json_response['data']['after']

't3_k20wsy'
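
The key is less mysterious than it looks: it starts with the same type prefix we saw earlier (`t1` = comment, `t3` = post), followed by an underscore and an identifier. The sketch below splits the key into these two parts; the helper name `describe_key` and the `KIND_LABELS` lookup are our own, and the lookup only covers the two types mentioned in this tutorial.

```python
KIND_LABELS = {"t1": "comment", "t3": "post"}  # only the types used in this tutorial

def describe_key(key):
    """Split a key like 't3_k20wsy' into a readable type and the item id."""
    kind, _, item_id = key.partition("_")
    return KIND_LABELS.get(kind, "unknown"), item_id

item_type_label, item_id = describe_key("t3_k20wsy")
print(f"The key points to a {item_type_label} with id {item_id}")
```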

Next, we attach this key to our request with the `after` parameter to obtain the next subset of items and assign the responses to a variable called `json_response_after`: 

In [12]:
after = json_response['data']['after']
url = f'https://www.reddit.com/user/{mod}.json'
response = requests.get(url, 
                        headers=headers, 
                        params={"after": after})
json_response_after = response.json()

At the time of writing this tutorial, the last item in `json_response` is the following post (`Detroit's Brewing Heritage' on tap at Historical Museum`): 

<img src="images/json_response.png" width=60% align="left"/>

The first and second item in `json_response_after` are the two comments below that ("Shame on ... us back." and "Are you ... comment /u/ehchip"). In other words, where one object ends, another begins. We apply this concept to loop over the first 10 pages. Every time we store the value of the `after` attribute which we use as a parameter in the follow-up request. 

In [13]:
after = None
item_type = []

for counter in range(10): 
    url = f'https://www.reddit.com/user/{mod}.json'
    response = requests.get(url, 
                            headers=headers, 
                            params={"after": after})
    json_response = response.json()
    after = json_response['data']['after']

    # loop over all items in a request
    for item in json_response['data']['children']:
        item_type.append(item['kind'])

#### Exercise 6
1. Why do we define `after = None` at the top of the cell? Can we leave it out? 
2. Without looking at the length of the list: how many items do you expect in `item_type`? 
3. Of those items, calculate the percentage of posts (`t3`) and comments (`t1`). What does this tell you? 
4. Convert the code snippet above into a function `reddit_activity()` that takes a `username`, `attribute`, and `num_pages` as input and returns the attribute for the given user. For example, `reddit_activity("sixwaystop313", "subreddit_name_prefixed", 40)` should return a list of the subreddits in which the user has posted or commented across the 1000 most recent items. Has this moderator actively contributed to the `r/marketing` subreddit recently? 

#### Solutions 
1. In our first request we don't know the value of `after` yet. It is important, however, to include this line because otherwise the `after` value in `params={}` is undefined. 
2. We expect the list to have a size of 10 (number of requests) * 25 (number of items per request) = 250. 

In [14]:
# Question 3
def item_frequency(items, item_filter):
    total_items = len(items)
    item_filter_count = items.count(item_filter)
    return item_filter_count / total_items * 100
            
perc_posts = item_frequency(item_type, 't3')     # t3 = post
perc_comments = item_frequency(item_type, 't1')  # t1 = comment

print(f"The percentage of posts and comments is {perc_posts}% and {perc_comments}%, respectively")
# Thus, based on this subset of data, the author is more likely to comment on others' posts than to start a new post

The percentage of posts and comments is 35.6% and 64.4%, respectively


In [15]:
# Question 4
def reddit_activity(username, attribute, num_pages):
    after = None
    activity = []

    for counter in range(num_pages): 
        url = f'https://www.reddit.com/user/{username}.json'
        response = requests.get(url, 
                                headers=headers, 
                                params={"after": after})
        json_response = response.json()
        after = json_response['data']['after']

        # loop over all items in a request
        for item in json_response['data']['children']:
            activity.append(item['data'][attribute])
    return activity

reddit_data = reddit_activity("sixwaystop313", "subreddit_name_prefixed", 40)
print(f"The percentage of posts and comments in the marketing subreddit is {item_frequency(reddit_data, 'r/marketing')}%")
# Thus, the moderator has not actively contributed to the marketing subreddit recently

The percentage of posts and comments in the marketing subreddit is 0.0%
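
The loops above always send a fixed number of requests. For users with little activity, that may be more than needed. The sketch below is one possible variant that stops as soon as the `after` value is `None`, under the assumption that the API signals the end of a listing this way. To keep the example runnable offline, the request itself is replaced by a hypothetical `fetch_page` function that looks up made-up pages in a dictionary; with the real API, `fetch_page` would wrap the `requests.get(...)` call and return `response.json()`.

```python
def collect_kinds(fetch_page, max_pages=10):
    """Collect the `kind` of every item, following `after` until it runs out."""
    after = None
    kinds = []
    for _ in range(max_pages):
        page = fetch_page(after)
        kinds.extend(child["kind"] for child in page["data"]["children"])
        after = page["data"]["after"]
        if after is None:  # assumption: no key means there is no next page
            break
    return kinds

# Two made-up pages, keyed by the `after` value that leads to them
FAKE_PAGES = {
    None: {"data": {"children": [{"kind": "t1"}, {"kind": "t3"}], "after": "t3_aaa"}},
    "t3_aaa": {"data": {"children": [{"kind": "t1"}], "after": None}},
}
requested = []  # keeps track of which keys were requested

def fake_fetch(after):
    requested.append(after)
    return FAKE_PAGES[after]

kinds = collect_kinds(fake_fetch)
print(kinds, f"({len(requested)} requests)")
```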


---
### 2.4 Time Conversion

Both posts and comments contain the `created_utc` attribute, which is a timestamp that indicates the number of seconds since January 1, 1970 (UTC). With the `time` library we can easily convert it into a readable date and time:

In [16]:
import time 

time_example = 1595571434
time_converted = time.gmtime(time_example)
print(time_converted)

time.struct_time(tm_year=2020, tm_mon=7, tm_mday=24, tm_hour=6, tm_min=17, tm_sec=14, tm_wday=4, tm_yday=206, tm_isdst=0)


From `time_converted` you can extract the day, month, and year separately:

In [17]:
print(f"The day is: {time_converted.tm_mday}")
print(f"The month is: {time_converted.tm_mon}")
print(f"The year is: {time_converted.tm_year}")

The day is: 24
The month is: 7
The year is: 2020


Or together, like this (codes that start with `%` have a special meaning; the `-` characters between them are the literal dashes you see in the output): 

In [18]:
print(time.strftime("%d-%m-%Y", time_converted))  
# %d = day
# %m = month
# %Y = year (4 digits) and %y = year (2 digits)

24-07-2020
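
As an alternative to the `time` library, the same timestamp can be converted with Python's `datetime` module, which returns an object that is convenient for further date arithmetic. This is merely another route to the same result, not something the Reddit API requires; the variable name `moment` is ours.

```python
from datetime import datetime, timezone

time_example = 1595571434

# Interpret the timestamp as seconds since 1970 in UTC
moment = datetime.fromtimestamp(time_example, tz=timezone.utc)

print(moment.isoformat())  # full date and time in one string
print(moment.year, moment.month, moment.day)
print(moment.weekday())    # day of the week as a number, Monday is 0
```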


--- 
#### Exercise 7 
1. In a similar way, you can convert the UTC time into an hour (`%H`) and minute (`%M`). Transform `time_example` into a readable time. The output should be `06:17`. 
2. Suppose we want to analyze the Reddit use of `sixwaystop313` throughout the day. More specifically, we want to know during what hours he is most active on the platform. 
  * Use the function `reddit_activity()` you wrote earlier to pull in the UTC timestamps (set `num_pages` to `10`). 
  * Extract the hour from these timestamps. 
  * Determine the top 3 hours the user is most active on Reddit. You can assume that the total number of posts and comments is a reasonable proxy for time spent on the platform. 

#### Solutions

In [19]:
# Question 1 
print(time.strftime("%H:%M", time_converted))

06:17


In [20]:
# Question 2
time_data = reddit_activity("sixwaystop313", "created_utc", 10)
hours = []

for timestamp in time_data: 
    time_converted = time.gmtime(timestamp)
    hours.append(time_converted.tm_hour)
    
for hour in range(24):
    print(f"Hour {hour}: {hours.count(hour)} items")
    
# For this dataset the top 3 hours are: 1, 3, and 4.

Hour 0: 16 items
Hour 1: 22 items
Hour 2: 16 items
Hour 3: 21 items
Hour 4: 25 items
Hour 5: 6 items
Hour 6: 2 items
Hour 7: 0 items
Hour 8: 3 items
Hour 9: 0 items
Hour 10: 1 items
Hour 11: 3 items
Hour 12: 4 items
Hour 13: 8 items
Hour 14: 12 items
Hour 15: 15 items
Hour 16: 17 items
Hour 17: 10 items
Hour 18: 7 items
Hour 19: 10 items
Hour 20: 15 items
Hour 21: 7 items
Hour 22: 14 items
Hour 23: 16 items
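
Instead of reading the top 3 off the printed list, you could let Python do the ranking. The sketch below uses `collections.Counter` from the standard library on a small, made-up list of hours (not the real data above); with the real data you would pass in the `hours` list from the solution.

```python
from collections import Counter

# Toy data: hours extracted from ten imaginary timestamps
sample_hours = [4, 1, 4, 3, 1, 4, 3, 1, 4, 16]

hour_counts = Counter(sample_hours)
top_3 = hour_counts.most_common(3)  # list of (hour, number of items) pairs

for hour, count in top_3:
    print(f"Hour {hour}: {count} items")
```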


---

### 2.5 Wrap-Up

After working on this set of exercises you should be able to further explore the Reddit API on your own. Does `sixwaystop313` spend most time in subreddits in which he gets the most upvotes? Did his posting behavior change over time? Are moderators more likely to be premium Reddit users? Give it a try! 

At the same time, you should realize that we have only scratched the surface of what's possible. Headers and pagination play an important role in requests and were sufficient thus far, yet the majority of [API endpoints](https://www.reddit.com/dev/api/) require authentication (OAuth), which is a whole topic on its own. 