# Working with APIs


## Introduction

Thus far in the program, we have learned how to obtain data from files and from relational databases. However, sometimes the data we need is not readily available from either of these sources. In some cases, the data we need may be contained within an application. Application owners will often create **APIs (Application Programming Interfaces)** so that their applications can talk to other applications. An **API is a set of programmatic instructions for accessing a software application, and the data that comes from APIs is typically structured (for example, as JSON).** This structure makes working with API data preferable to crawling websites and scraping content off web pages.

In this lesson, we are going to learn how to make API calls to an application, retrieve data in JSON format, learn about API authentication, and use Python libraries to obtain data from APIs.
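To make the idea of structured data concrete, here is a minimal sketch using Python's built-in json module. The payload is made up for illustration and is not taken from any real API.

```python
import json

# A small, made-up JSON payload of the kind an API might return.
payload = (
    '[{"id": 1, "title": "write report", "completed": false},'
    ' {"id": 2, "title": "send invoice", "completed": true}]'
)

# json.loads turns the JSON text into Python lists and dictionaries.
todos = json.loads(payload)

# Each record is a dictionary, so fields can be looked up by key.
open_titles = [todo["title"] for todo in todos if not todo["completed"]]
print(open_titles)
```

Because every record has named fields, there is no need to parse HTML or guess where a value sits on a page.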

## Simple API Example Requests

There are a few libraries that can be used for working with APIs in Python, but the requests library is one of the most intuitive. It has a get method that allows you to send an HTTP GET request to an application and receive a response. Let's take a look at a basic API call using the requests library.

In [None]:
import requests

# Send a GET request to the JSONPlaceholder API
response = requests.get('https://jsonplaceholder.typicode.com/todos')

# Decode the JSON body of the response into Python objects
results = response.json()


In this example, we used the get method to send a request to the JSONPlaceholder API and received a response containing JSON-structured data. The json method decoded that response into Python lists and dictionaries. To analyze this data, we can use Pandas to convert the results into a data frame and then apply various analytical methods.
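A request can fail or hang, so it is often worth checking the response before using it. The following is a minimal sketch, not part of the original lesson: the helper names parse_json_response and fetch_json are hypothetical, while raise_for_status and the timeout argument are part of the requests library.

```python
import requests


def parse_json_response(response):
    """Return the decoded JSON body, raising if the request failed."""
    # raise_for_status raises requests.HTTPError for 4xx and 5xx status codes.
    response.raise_for_status()
    return response.json()


def fetch_json(url, timeout=10):
    """Send a GET request and return the decoded JSON body."""
    # A timeout stops the call from waiting forever on an unresponsive server.
    response = requests.get(url, timeout=timeout)
    return parse_json_response(response)
```

With these helpers, the call above could be written as results = fetch_json('https://jsonplaceholder.typicode.com/todos').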

In [None]:
import pandas as pd

data = pd.DataFrame(results)
data.head(10)

## More Complex API Requests

In the previous section, the data we received from the API was not very complex. It was all at a single level and fit neatly into a data frame. However, sometimes API responses contain data that is nested, and we must find a way to **flatten** the JSON data so that it fits nicely into a data frame. To see this, let us make a call to the Stack Overflow API.

### StackAPI

In [None]:
# Import StackAPI
from stackapi import StackAPI

In [None]:
# Connect to the Stack Overflow site and fetch its badges
stack_api = StackAPI('stackoverflow')
badges = stack_api.fetch('badges')

In [None]:
badges_data = pd.DataFrame(badges)
badges_data.head(10)
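In the data frame above, each badge is likely to appear as a dictionary inside a single column. That is the case if the response is a wrapper dictionary that keeps its records under an items key, which is treated here as an assumption about the response layout. The sketch below uses a made-up stand-in for the response to show how the records could be pulled out directly.

```python
import pandas as pd

# Hypothetical stand-in for the dictionary returned by stack_api.fetch('badges').
# The values are made up; the assumption is that the records sit under 'items'.
badges = {
    "has_more": True,
    "quota_remaining": 299,
    "items": [
        {"badge_id": 1, "name": "Teacher", "rank": "bronze"},
        {"badge_id": 2, "name": "Student", "rank": "bronze"},
    ],
}

# Building the data frame from the list of records gives one column per field.
badges_items = pd.DataFrame(badges["items"])
print(badges_items.columns.tolist())
```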

### GitHub API

Let's now make an API call to the GitHub public API, create a Pandas data frame from the results, and examine the structure of the data.

In [None]:
response = requests.get('https://api.github.com/events')

data = pd.DataFrame(response.json())
data.head(10)

When we look at the data frame, we can see that there are dictionaries nested in several fields. We need to extract the information in these fields and add it to the data frame as columns. To do this, we are going to create our own flatten function that accepts a data frame and a list of columns that contain nested dictionaries. Our function is going to iterate through the columns and, for each column, it is going to:

- Turn the nested dictionaries into a data frame with a column for each key
- Assign column names to each column in this new data frame
- Add these new columns to the original data frame
- Drop the column with the nested dictionaries

### More Complex API Requests: One Column

In [None]:
# select the data['actor'] column
data['actor']

In [None]:
# convert the data['actor'] column to a dictionary
dict(data['actor'])

In [None]:
# create a new data frame 
flatten = pd.DataFrame(dict(data['actor']))
flatten.head()

In [None]:
# transpose flatten

flatten = pd.DataFrame(dict(data['actor'])).transpose()
flatten.head()
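The reason for the transpose is easier to see on a small example. The sketch below uses made-up sample data rather than the live GitHub response, and compares the shape of the data frame before and after transposing.

```python
import pandas as pd

# Made-up sample with one nested column, mimicking the shape of the events data.
sample = pd.DataFrame({
    "id": ["101", "102", "103"],
    "actor": [
        {"id": 1, "login": "ana"},
        {"id": 2, "login": "ben"},
        {"id": 3, "login": "cho"},
    ],
})

# dict() maps each row label to its nested dictionary, so pandas builds
# one column per row label and one row per nested key.
wide = pd.DataFrame(dict(sample["actor"]))
print(wide.shape)

# Transposing swaps the axes: one row per event, one column per nested key.
tall = wide.transpose()
print(tall.shape)
```

After the transpose, the rows line up with the rows of the original data frame, which is what makes the later concatenation along axis=1 work.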

In [None]:
# save the columns as strings

columns = [str(i) for i in flatten.columns]
columns

In [None]:
# rename the columns for actor

flatten.columns = ['actor' + '_' + colname for colname in columns]
flatten.columns

In [None]:
# add flatten to data using pd.concat

data = pd.concat([data, flatten], axis=1)
data.head()

In [None]:
# drop the 'messy' column

data = data.drop('actor', axis=1)

In [None]:
data.head(10)

### More Complex API Requests: For Loop

API responses like this one usually have more than one column that contains nested dictionaries. In that case, it is useful to loop over all of those columns and flatten each one in turn.

In [None]:
# Reinitialise the data
response = requests.get('https://api.github.com/events')
data = pd.DataFrame(response.json())

In [None]:
data.head(10)

In [None]:
# select the columns that contain a dictionary
col_list = ['actor', 'org', 'payload', 'repo']

In [None]:
# Create a for-loop to loop over the columns
for column in col_list:
    flattened = pd.DataFrame(dict(data[column])).transpose()
    columns = [str(col) for col in flattened.columns]
    flattened.columns = [column + '_' + colname for colname in columns]
    data = pd.concat([data, flattened], axis=1)
    data = data.drop(column, axis=1)

In [None]:
data.head()
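The loop above assumes that every row holds a dictionary in each listed column. If some rows might hold a missing value instead (an assumption here, for instance an event with no organisation attached), one option is to flatten the records in plain Python before building the data frame. The helper flatten_record below is a hypothetical sketch of that idea, using made-up events.

```python
def flatten_record(record, nested_keys):
    """Return a copy of record with one level of nested dictionaries expanded."""
    flat = {}
    for key, value in record.items():
        if key in nested_keys and isinstance(value, dict):
            # Expand the nested dictionary into prefixed top-level fields.
            for inner_key, inner_value in value.items():
                flat[f"{key}_{inner_key}"] = inner_value
        elif key in nested_keys:
            # Missing nested value (for example None): skip it rather than fail.
            continue
        else:
            flat[key] = value
    return flat


# Made-up events; the second one has no organisation attached.
events = [
    {"id": "1", "actor": {"login": "ana"}, "org": {"login": "acme"}},
    {"id": "2", "actor": {"login": "ben"}, "org": None},
]
flat_events = [flatten_record(event, {"actor", "org"}) for event in events]
print(flat_events)
```

Passing flat_events to pd.DataFrame would then fill the fields that are absent from a record with missing values.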

### More Complex API Requests: Function

We can also write a function that does this for us. Using a function allows us to return a new data frame without modifying the original one.

In [None]:
# Reinitialise the data
response = requests.get('https://api.github.com/events')
data = pd.DataFrame(response.json())

In [None]:
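def flatten(data, col_list):
    """Return a copy of data with each column in col_list expanded into one column per nested key."""
    for column in col_list:
        # One row per record, one column per key of the nested dictionaries
        flattened = pd.DataFrame(dict(data[column])).transpose()
        # Prefix the new column names with the name of the original column
        columns = [str(col) for col in flattened.columns]
        flattened.columns = [column + '_' + colname for colname in columns]
        # Attach the new columns and drop the nested one
        data = pd.concat([data, flattened], axis=1)
        data = data.drop(column, axis=1)
    return data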

In [None]:
# Call the function flatten
nested_columns = ['actor', 'org', 'payload', 'repo']

flat = flatten(data, nested_columns)

In [None]:
# display the flattened data frame
flat.head(10)
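To check that the function leaves its input untouched, we can run it on a small data frame and compare the column names before and after. This is a sketch on made-up sample data, with the function repeated so that the block runs on its own.

```python
import pandas as pd


def flatten(data, col_list):
    for column in col_list:
        flattened = pd.DataFrame(dict(data[column])).transpose()
        flattened.columns = [column + '_' + str(col) for col in flattened.columns]
        # pd.concat and drop both return new data frames, so the caller's
        # data frame is never changed in place.
        data = pd.concat([data, flattened], axis=1)
        data = data.drop(column, axis=1)
    return data


# Made-up sample data with one nested column.
sample = pd.DataFrame({
    "id": ["101", "102"],
    "actor": [{"login": "ana"}, {"login": "ben"}],
})
flat_sample = flatten(sample, ["actor"])

print(sample.columns.tolist())       # columns of the original data frame
print(flat_sample.columns.tolist())  # columns of the flattened copy
```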

### More Complex API Requests: json_normalize Function

Alternatively, we can flatten nested data using the function json_normalize, which is part of the Pandas library. The function flattens each nested column and names every new column after the original column and the nested key, separated by a period. For example, actor.avatar_url.

Here is an example of how to use this function. Since pandas 1.0, it is available directly as pd.json_normalize, and the older import from pandas.io.json is deprecated.

In [None]:
# The JSON records from the most recent GitHub events request
results = response.json()
#results

In [None]:
flattened_data = pd.json_normalize(results)
flattened_data

This data looks much cleaner, and now we have access to the information that was enclosed within those dictionaries. Sometimes multiple rounds of flattening will be required if the JSON data returned from the API you are working with has hierarchically nested data.
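How deep the flattening goes, and how the new columns are named, can be controlled with the sep and max_level parameters of json_normalize. The sketch below uses made-up records with two levels of nesting rather than the GitHub response.

```python
import pandas as pd

# Made-up records with two levels of nesting.
records = [
    {"id": "1", "actor": {"login": "ana", "team": {"name": "core"}}},
    {"id": "2", "actor": {"login": "ben", "team": {"name": "docs"}}},
]

# By default every level is flattened and names are joined with a period.
full = pd.json_normalize(records)
print(full.columns.tolist())

# sep changes the separator; max_level=1 flattens only the first level,
# so the inner 'team' dictionary stays nested in a single column.
one_level = pd.json_normalize(records, sep="_", max_level=1)
print(one_level.columns.tolist())
```

Limiting max_level can be useful when only the outer fields are needed and the deeper structure is better handled separately.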

## Summary

In this lesson, we covered the basics of working with APIs. We began by introducing the requests library and showing how to make a simple API call using it. We then obtained some more complex JSON data from an API, where the information was nested, and learned how to flatten it.