# APIs

## Learning objectives
- Learn what a REST API is
- Use REST APIs to obtain data

In the last notebook, we scraped the web to obtain some (housing) data. In many cases, especially when we want textual data, scraping may be our only option. However, some websites offer web APIs that we can pull information from directly. Information from APIs is returned in a structured format, such as JSON.

The word API keeps popping up... but what is it? And what is a REST API?

**API** stands for Application Programming Interface. When we write a program, we often need to interface with other people's code (e.g. a library). An API defines the rules we need to follow to talk to that code (e.g. function names).

A **REST API** allows communication over HTTP. The client sends a request, and the server returns a response. Requests usually use one of the following four methods: GET, PUT, POST, and DELETE. The one most relevant to pulling data from other services (via APIs) is **GET**. As the name implies, this is the HTTP method we use when we want to request some data.
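
To make the request/response cycle concrete, here is a minimal, self-contained sketch using only Python's standard library. It starts a tiny throwaway server on the local machine and sends it a GET request. The `/greeting` path, the `GreetingHandler` class, and the JSON payload are all made up for illustration; real APIs define their own paths and payloads.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class GreetingHandler(BaseHTTPRequestHandler):
    """Hypothetical server side: answers GET requests with a small JSON body."""

    def do_GET(self):
        body = json.dumps({"path": self.path, "message": "hello"}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the notebook output quiet


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), GreetingHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: send a GET request and read the response.
conn = HTTPConnection("127.0.0.1", server.server_port, timeout=5)
conn.request("GET", "/greeting")
response = conn.getresponse()
status = response.status
payload = json.loads(response.read())
conn.close()

# Stop the throwaway server now that we have our response.
server.shutdown()
server.server_close()

print(status, payload)
```

The client only needs to know the method (GET) and the path; everything else about how the server produces the data stays hidden behind the API.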

**So how do we request data?**
Well, we need a place to request data from, and this comes in the form of an endpoint URL. An endpoint URL usually looks something along these lines:

![](images/api_url_structure.png)
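
As a rough illustration, Python's standard library can split an endpoint URL into its parts. The URL here is the GitHub endpoint visited in the next step; the variable names (`root_url`, `endpoint_path`, `query_params`) are illustrative and may not match the labels in the diagram.

```python
from urllib.parse import parse_qsl, urlsplit

url = "https://api.github.com/users/ai-core/repos?sort=pushed&direction=desc"
parts = urlsplit(url)

root_url = f"{parts.scheme}://{parts.netloc}"  # where the API lives
endpoint_path = parts.path                      # which resource we want
query_params = dict(parse_qsl(parts.query))     # options that refine the request

print(root_url)
print(endpoint_path)
print(query_params)
```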

Let's visit the GitHub API endpoint to see what the **response** is: https://api.github.com/users/ai-core/repos?sort=pushed&direction=desc

As we can see, the response from calling the GitHub API is returned as JSON. However, this doesn't necessarily have to be the case: the developer who built the API could have chosen any format (XML, CSV, images, etc.). For gathering data through APIs, JSON is typically the easiest to work with, so where possible, we should favour it.
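
Part of what makes JSON easy to work with is that it maps directly onto Python lists and dictionaries. The sketch below parses a small, made-up JSON string shaped like a list of repositories; the names and values are placeholders, not real GitHub data.

```python
import json

# Made-up sample shaped like a list of repositories; not real GitHub data.
raw_text = """
[
  {"name": "example-repo-a", "private": false, "owner": {"login": "example-user"}},
  {"name": "example-repo-b", "private": false, "owner": {"login": "example-user"}}
]
"""

repos = json.loads(raw_text)                 # JSON array -> Python list of dicts
repo_names = [repo["name"] for repo in repos]
first_owner = repos[0]["owner"]["login"]     # nested objects become nested dicts

print(repo_names)
print(first_owner)
```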

### HTTP Codes
<img src="https://infidigit.b-cdn.net/wp-content/uploads/2019/12/20191227_012601_0000.png" style="width: 350px"/>
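
As a rough guide, the first digit of a status code tells us which class of response we received. The helper below is a small sketch built on the standard library's `http.HTTPStatus`; the function name `describe_status` and the wording of the class labels are my own choices.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short description of an HTTP status code and its class."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    phrase = HTTPStatus(code).phrase  # standard reason phrase, e.g. "Not Found"
    return f"{code} {phrase} ({classes[code // 100]})"


for code in (200, 301, 404, 500):
    print(describe_status(code))
```

When collecting data, anything other than a 2xx code usually means the response body is not the data we asked for, so it is worth checking the code before parsing.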

Read the docs! https://api.stackexchange.com/docs

Let's collect data from StackExchange's API. Here we'll be working in a slightly roundabout fashion to pull the data we want from their API. This is for teaching purposes, so we can understand the structure of JSON, and for you to get some hands-on experience with using a REST API.

We'll be collecting the body/contents of questions posted on StackOverflow. To do this, we'll first pull some posts within a date range. If the type of the post is a question, we'll make another API request to StackExchange's questions endpoint to pull the body of the question.

In [None]:
import requests
# Root URL of the API and the posts endpoint, including its query parameters
ROOT_URL = "https://api.stackexchange.com"
POSTS_ENDPOINT = "/2.2/posts?fromdate=1596240000&todate=1596585600&order=desc&sort=activity&site=stackoverflow"
r = requests.get(ROOT_URL+POSTS_ENDPOINT)
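
The `fromdate` and `todate` values in the endpoint above are Unix timestamps (seconds since 1 January 1970, UTC). As a sketch, the same query string can be built from a dictionary instead of being typed by hand; the helper name `to_unix` is hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def to_unix(year, month, day):
    """Convert a UTC calendar date to a Unix timestamp in whole seconds."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


params = {
    "fromdate": to_unix(2020, 8, 1),  # 1596240000
    "todate": to_unix(2020, 8, 5),    # 1596585600
    "order": "desc",
    "sort": "activity",
    "site": "stackoverflow",
}
posts_url = "https://api.stackexchange.com/2.2/posts?" + urlencode(params)
print(posts_url)
```

With `requests`, the same dictionary can be passed directly, as in `requests.get(ROOT_URL + "/2.2/posts", params=params)`, which leaves the encoding to the library.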

In [None]:
r.status_code

In [None]:
r.json()
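
Before writing any processing code, it helps to get a feel for how the parsed response is nested. The sample below is made up and only mimics the shape I would expect from the posts endpoint: a wrapper dictionary whose `items` key holds a list of posts. The field names (`post_type`, `post_id`, `owner`, `display_name`, `profile_image`) are assumptions to check against the real output of `r.json()`.

```python
# Made-up sample in the assumed shape of a posts response; not real data.
sample_response = {
    "items": [
        {
            "owner": {"display_name": "alice", "profile_image": "https://example.com/a.png"},
            "post_type": "question",
            "post_id": 101,
        },
        {
            "owner": {"display_name": "bob", "profile_image": "https://example.com/b.png"},
            "post_type": "answer",
            "post_id": 102,
        },
    ],
    "has_more": True,
    "quota_remaining": 299,
}

items = sample_response["items"]                     # list of post dictionaries
post_types = [item["post_type"] for item in items]   # one entry per post
first_author = items[0]["owner"]["display_name"]     # nested dict lookup

print(post_types, first_author)
```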

In [None]:
def get_questions(items_object):
    data = {"display_name": [], "profile_image_url": [], "post_id": [], "post_contents": []}

    ## Loop over the items object. For the relevant fields in the 'data' variable defined above,
    ## populate those fields IF the type of the post is a question.
    ## If the post is a question, additionally make a request to the relevant API method to obtain the question body.
    ## READ READ READ the documentation (or Google it 🙄) to find out how to do so.
    ## The question body should be populated in the 'post_contents' field.
    ## Return the data object.
    return data
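
As a hint for the second request, here is a sketch of how the URL for fetching question bodies might be built. It assumes the `/2.2/questions/{ids}` method and the built-in `withbody` filter described in the Stack Exchange documentation; verify both there before relying on them. The helper name `build_question_url` is hypothetical.

```python
from urllib.parse import urlencode


def build_question_url(question_ids):
    """Build a URL that asks for the bodies of the given question IDs."""
    # Assumption: several IDs can share one request when joined with semicolons.
    ids = ";".join(str(qid) for qid in question_ids)
    # Assumption: the "withbody" filter adds the question body to each item.
    query = urlencode({"site": "stackoverflow", "filter": "withbody"})
    return f"https://api.stackexchange.com/2.2/questions/{ids}?{query}"


print(build_question_url([101, 103]))
```

Requesting several IDs at once, rather than one request per question, keeps the number of calls down, which matters because the API limits how many requests we can make.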

In [None]:
import pprint

questions = get_questions(r.json()["items"])
pprint.pprint(questions)

In [None]:
for dn, piu, pi, pc in zip(questions["display_name"], questions["profile_image_url"], questions["post_id"], questions["post_contents"]):
    print("Display Name:", dn)
    print("Profile Image URL:", piu)
    print("Post ID:", pi)
    print("Post Body:", pc)
    print()