# API Calls with Python

The APIs (application programming interfaces) we'll work with are hosted on web servers. When you type www.google.com in your browser's address bar, your computer asks the www.google.com server for a webpage, which the server then returns to your browser. APIs work much the same way, except that instead of your web browser asking for a webpage, your program asks for data. This data is usually returned in JSON format. To retrieve data, we make a request to a web server, and the server replies with our data. In Python, we'll use the `requests` library to do this.

### Python Setup

In [None]:
# interacting with websites and web-APIs
from requests import get 

# data manipulation
import pandas as pd 
import numpy as np

## How does the `requests` package work?

We'll start with a simple example of using an API to get information about the International Space Station, such as location and people currently on the ISS. Information about this API can be found here: http://open-notify.org

**Note:** There are Python code examples provided in the documentation as well. We will be using slightly different code, but their code should work too! There are multiple modules you can use to access APIs, and we just use one possibility. Feel free to look at the code that they provide and see if you can figure out what is going on. 

### Using the Open Notify ISS API

To access the API, we use the `get` function from `requests`. In order to tell Python what to access, we need to specify the URL of the API endpoint.

### Making a Request
When you ask a website or portal for information, this is called making a request. That is exactly what the `requests` library has been designed to do.

### Step 1. Specify the URL


In [None]:
url = "http://api.open-notify.org/iss-now.json"

### Step 2. Get the response

Now let's get the response using the URL defined above, using the `requests` library. The `get` function is used to get a response from the URL.  

In [None]:
# Response from the URL
# get is a function from requests

r = get(url)  

In [None]:
r.url

### Step 3. Check the Response Code

Before you do anything with the response from a website or URL in Python, it's a good idea to check its status code, which tells you whether the request succeeded.

The following are some useful response codes to keep in mind:

`200` - OK: the request succeeded and the query parameters are all valid; the results will be in the body of the response

`400` - Bad Request: the query parameters are not valid, typically either because they are not in valid JSON format, or because a specified field or value is not valid; the “status reason” in the header will contain the error message

`500` - Internal Server Error: there is an internal error with the processing of the query; the “status reason” in the header will contain the error message
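As a quick reference, the standard names for these codes can be looked up with Python's built-in `http.HTTPStatus`. The sketch below is only an illustration; the helper name `describe_status` is our own and is not part of `requests`.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short description such as '200 OK (success)'."""
    phrase = HTTPStatus(code).phrase  # standard reason phrase for the code
    if 200 <= code < 300:
        kind = "success"
    elif 400 <= code < 500:
        kind = "client error"
    elif 500 <= code < 600:
        kind = "server error"
    else:
        kind = "other"
    return f"{code} {phrase} ({kind})"


for code in (200, 400, 500):
    print(describe_status(code))
```

With a `requests` response, `r.raise_for_status()` is another option: it raises an exception when the status code is in the 4xx or 5xx range.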

Let's check the status of our response.

In [None]:
r.status_code  # Check the status code

We are good to go. Now let's get the content.

### Step 4. Get the Content

After a web server returns a response, you can collect the content you need by converting it into a usable format. JSON is a way to encode data structures like lists and dictionaries as strings, so that they are easy for machines to read and exchange. JSON is the primary format in which data is passed back and forth to APIs, and most API servers will send their responses in JSON format. The `.json()` method of a response object parses the JSON body into a Python object (here, a dictionary) so that we can use it within Python.
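To get a feel for that conversion without making a request, the sketch below parses a small JSON string with `loads` from Python's standard `json` module. The sample text is made up for illustration and is not an actual API response. `loads` is imported by name so that it does not clash with a variable called `json`.

```python
from json import loads

# A made-up JSON document (not a real API response)
sample_text = '{"name": "example", "tags": ["a", "b"], "details": {"count": 3, "ok": true}}'

parsed = loads(sample_text)

print(type(parsed))             # JSON objects become dictionaries
print(type(parsed["tags"]))     # JSON arrays become lists
print(parsed["details"]["ok"])  # JSON true/false become True/False
```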

In [None]:
json = r.json()  # parse the JSON body into a Python dictionary

In [None]:
json

Here, the API gives us a timestamp, a message indicating whether the request succeeded, and the ISS position. This isn't a very sophisticated API: it only reports the position of the ISS at the moment you send a request.

<font color ='red'>**Question 1: What is the length of `json`? What type of object is the value associated with the key `iss_position`?**</font>

Sometimes, it can be hard to see exactly what is in the response. It might be useful to look at the keys to see what data we actually want. 

In [None]:
json.keys()  # View the keys of the dictionary

Note that we have three keys: `message`, `iss_position`, and `timestamp`. The information that we really want is in the `iss_position` key. We can try taking a look at it. 

In [None]:
json['iss_position']

This gives us the latitude and longitude at the time we made the request. 
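Because the response is a dictionary nested inside a dictionary, individual values can be pulled out by chaining keys. The sketch below uses a hand-written dictionary with the same three keys. The values are invented, and it assumes the coordinates arrive as strings, which is worth checking against your own response.

```python
# Hand-written stand-in for the parsed response (values are invented)
sample = {
    "message": "success",
    "timestamp": 1700000000,
    "iss_position": {"latitude": "12.3456", "longitude": "-65.4321"},
}

position = sample["iss_position"]                        # the inner dictionary
latitude = float(position["latitude"])                   # convert text to a number
longitude = float(sample["iss_position"]["longitude"])   # same idea, chained

print(latitude, longitude)
```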

There is usually some other information provided with the response that might be of less interest to you. For example, the `message` field isn't particularly interesting. Many times, APIs will include a source or a summary of the data returned in addition to the exact information that you requested.

## Adding Queries to the API Request

The ISS API is a very simple example of an API. There is only one thing that we can get from it: the position of the ISS at the point in time that we send the request. Usually, we also add query parameters so that we can specify exactly what data we want to get. For example, if you wanted to get data about the US, there are lots of different variables that you might be interested in, over different time frames. These are things that you might need to specify to get the data you need.

Consider the Data USA API, which can be found here: https://datausa.io/about/api/. This API provides various statistics about the US, broken down by categories like State or Year. Let's look at an example of constructing the API query.

In [None]:
datausa_base_url = 'https://datausa.io/api/data'
parameters = {'drilldowns': 'State', 'measure': 'Population', 'year': 2020}

datausa_response = get(datausa_base_url, params=parameters)
datausa_response.status_code

Here, we start with the base URL and add the query parameters that we want to include. How to define the parameters for the data you want should generally be described in the API documentation (the Data USA website isn't the best about this, but it does include some examples to help you see how a query might be constructed). In our example above, we want the population of each state in the year 2020. Looking at the documentation on the Data USA site, we can see that we should specify `drilldowns` of `State`, a `measure` of `Population`, and a `year` of 2020. `requests` combines these with the base URL to construct the final URL, which retrieves the data we want.

You can try looking at that URL and actually navigating to it. You should see the JSON of the response we get from it.

In [None]:
datausa_response.url
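To see how a parameter dictionary turns into the query string at the end of a URL, here is a sketch using `urlencode` from the standard library. It does not send a request, and `requests` does its own encoding internally, so treat this as an approximation of what `datausa_response.url` shows rather than the library's exact code path. The names `example_base` and `example_params` are placeholders chosen for this sketch.

```python
from urllib.parse import urlencode

example_base = "https://datausa.io/api/data"
example_params = {"drilldowns": "State", "measure": "Population", "year": 2020}

# Each key=value pair is joined with "&" and appended after a "?"
query_string = urlencode(example_params)
full_url = example_base + "?" + query_string

print(full_url)
```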

Let's take a look at the response for the population by state in 2020.

In [None]:
pop_by_state_2020 = datausa_response.json()

<font color ='red'>**Question 2: What are the keys in `pop_by_state_2020`? What are the types of objects for the values for those keys? What is the source of the data that we pulled?**</font>

<font color ='red'>**Question 3: Assign the population of Alabama to `alabama_pop`. Do not hard code anything (that is, retrieve the information from `pop_by_state_2020` instead of just typing out the number after reading it).**</font>

### Exploring the Data

Looking through the various tools within the Data USA website, you should be able to find other drilldowns, measures, and characteristics you can request data about. For example, to get the total population in 2020 broken down by citizenship status, we can use the drilldown of `Citizenship Status` with a measure of `Total Population` and a year of `2020`.

In [None]:
parameters = {'drilldowns': 'Citizenship Status', 'measure': 'Total Population', 'Year': 2020}

response = get(datausa_base_url, params=parameters)
print(response.url)
response.json()


Note that we could also have built the URL manually instead of using the `params` argument in `get`. In a hand-written URL, the spaces in values such as `Citizenship Status` are written as `+`.

In [None]:
response_with_full_url = get('https://datausa.io/api/data?drilldowns=Citizenship+Status&measure=Total+Population&Year=2020')

In [None]:
response_with_full_url.json()
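When a URL is written by hand, it can help to split it back into its parts to check that it says what you meant. The sketch below uses `urlsplit` and `parse_qs` from the standard library on the same URL; no request is sent, and the variable names are placeholders for this illustration.

```python
from urllib.parse import urlsplit, parse_qs

manual_url = ("https://datausa.io/api/data"
              "?drilldowns=Citizenship+Status&measure=Total+Population&Year=2020")

parts = urlsplit(manual_url)
decoded = parse_qs(parts.query)  # "+" is decoded back into a space

print(parts.path)  # the endpoint path
print(decoded)     # each value comes back as a list of strings
```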

<font color ='red'>**Question 4: Pull from the Data USA API to get the breakdown of the number of people by Gender in the US in the year 2020.**</font>

*Hint:* You can use `Gender` for this.

You can also include multiple variables in your parameters by including the variable names in a list. Take a look at the URL to see what happens when you do this. You should be able to see the way that the URL is constructed, as well as the resulting data that you get back from this request.

In [None]:
parameters = {'drilldowns': ['State', 'Citizenship Status'], 'measure': 'Total Population', 'Year': 2020}

# use the base URL defined earlier (datausa_base_url)
citizenship_by_state_response = get(datausa_base_url, params=parameters)
print(citizenship_by_state_response.url)
citizenship_by_state = citizenship_by_state_response.json()['data']
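When a parameter value is a list, `requests` repeats the key once per item in the query string. The standard library's `urlencode` with `doseq=True` produces the same shape, so it can serve as an offline sketch of what the printed URL looks like. The name `example_params` is a placeholder for this illustration.

```python
from urllib.parse import urlencode

example_params = {
    "drilldowns": ["State", "Citizenship Status"],
    "measure": "Total Population",
    "Year": 2020,
}

# doseq=True expands the list into one key=value pair per item
multi_query = urlencode(example_params, doseq=True)
print(multi_query)
```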

<font color = 'red'>**Question 5: What type of object is `citizenship_by_state`? What is the length of `citizenship_by_state`? What are the types of objects that are inside `citizenship_by_state`?**</font>

### Dictionary Comprehension

Dictionary comprehension is very similar to list comprehension, except we create a dictionary instead of a list as the output. We have the same format, except it is in curly braces (`{}`) and includes an expression for how we should define the keys as well as the values.

Recall: Loop structure looks like:

    for i in <range>:
        <some expression>
        
Dictionary comprehension would look something like this:

    {<key expression>:<value expression> for i in <range>}

In [None]:
{x:x*2 for x in range(10)}

In [None]:
{'Number ' + str(x):x*2 for x in range(10) if x > 5}
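To connect the two forms, the sketch below builds the same dictionary twice, once with a loop and once with a comprehension, and then applies the idea to a list of dictionaries. The `readings` data is invented for illustration.

```python
# Build a dictionary with an ordinary loop
doubles_loop = {}
for x in range(5):
    doubles_loop[x] = x * 2

# The equivalent dictionary comprehension
doubles_comp = {x: x * 2 for x in range(5)}
print(doubles_loop == doubles_comp)

# Comprehensions also work over a list of dictionaries (invented data)
readings = [
    {"city": "Oslo", "temp_c": 4},
    {"city": "Cairo", "temp_c": 29},
    {"city": "Lima", "temp_c": 19},
]
warm = {row["city"]: row["temp_c"] for row in readings if row["temp_c"] > 10}
print(warm)
```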

<font color = 'red'>**Question 6: Create a dictionary called `noncitizens` that contains the number of non-citizens in each state. The key should contain the state name and the value should be the number of non-citizens.**</font>

*Hint:* If you're having trouble thinking of how to build it out, try doing it piece by piece. You can even work backwards, starting with the `for` part.

<font color = 'red'>**Question 7: In 2015, what was the average wage by race for male and female workers? Create two dictionaries, one called `male_wages` and one called `female_wages`, with keys representing race category and values representing the average wage for people in that group.**</font>

*Hint:* Use `Average Wage` for the average wage.

## Notes on APIs

APIs are generally useful because they are typically well-documented and come with example code. This is because the data provider wants to make the data available to others. However, there are many cases in which the documentation can be confusing or misleading. In addition, there might be times when building the URL can be a bit difficult or may not follow the exact conventions that you are used to. Feel free to try building the URL manually and navigating to it so that you can see the JSON response before using it in Python. Sometimes, the best way to check something is by trying it out in the browser!