In [1]:
import requests
import matplotlib.pyplot as plt

# APIs

An Application Programming Interface, or API, is a structured way to retrieve data from a website or service. Using an API is usually safer and easier than something like web scraping, since what you get back is already in a usable format. Many organizations offer APIs, including:
- Government organizations ([US Government](https://www.data.gov/developers/apis))
- Large companies ([Twitter API](https://developer.twitter.com/en/docs))
- News organizations ([NYT API](https://developer.nytimes.com/))
- And [many more](https://github.com/public-apis/public-apis)

If you search Google for `how to use an api in python`, you will find many articles that walk through the process. Working with APIs is well documented and a useful skill to have.

## Basic API Usage

Let's start by using the Numbers API, an API which provides interesting facts about numbers.

With any API, you should start by inspecting the documentation. For the Numbers API, the documentation is located here: http://numbersapi.com

We will be using the `requests` package to make a `GET` request to an API. As with web scraping, we need a URL, called an endpoint, to tell Python where to send the request.

When using an API, the first thing we need to know is the expected URL structure. In this case, it is `http://numbersapi.com/number/type`, where `number` is the number for which we want an interesting fact and `type` indicates which type of fact we want. Note that `type` can be omitted, in which case it defaults to `trivia`.
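The URL structure described above can be sketched as a small helper. This is only an illustration: `build_endpoint` and `BASE_URL` are names made up here, and `math` is assumed to be one of the fact types listed in the Numbers API documentation.

```python
BASE_URL = 'http://numbersapi.com'

def build_endpoint(number, fact_type=None):
    """Return the Numbers API URL for a number and an optional fact type."""
    if fact_type is None:
        # Leaving the type off lets the API fall back to its default (trivia).
        return f'{BASE_URL}/{number}'
    return f'{BASE_URL}/{number}/{fact_type}'

print(build_endpoint(8))
print(build_endpoint(8, 'math'))  # 'math' is an assumed fact type
```

The helper only builds the string; the request itself is still made with `requests.get`.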

In [2]:
endpoint = 'http://numbersapi.com/8'

response = requests.get(endpoint)

Let's check the response. If all is well, we should have a 200 response.

In [3]:
response

<Response [200]>
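To make sense of codes other than 200, Python's standard library knows the standard HTTP status phrases. The sketch below uses a hypothetical helper, `describe_status`, to turn a code into a short description; it does not make any request.

```python
from http import HTTPStatus

def describe_status(code):
    """Return a short, readable description of an HTTP status code."""
    status = HTTPStatus(code)
    if 200 <= code < 300:
        outcome = 'success'
    elif 400 <= code < 500:
        outcome = 'client error'
    elif code >= 500:
        outcome = 'server error'
    else:
        outcome = 'other'
    return f'{code} {status.phrase}: {outcome}'

for code in (200, 404, 500):
    print(describe_status(code))
```

With `requests`, the integer code is available as `response.status_code`, and `response.raise_for_status()` raises an exception for 4xx and 5xx responses.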

To access the content of the response, we can look at the `text` attribute.

In [4]:
response.text

'8 is the number of principles of Yong in Chinese calligraphy.'

What if we want to easily change the number that we retrieve? For this, we can make use of an f-string.

In [5]:
number = 95

endpoint = f'http://numbersapi.com/{number}'

response = requests.get(endpoint)

response.text

'95 is the NBA record for Most Assists in a 7-game playoff series (by Magic Johnson of the Los Angeles Lakers in 1984).'

This API also allows for batch requests: http://numbersapi.com#batching

In [6]:
number_range = "1..10"

endpoint = f'http://numbersapi.com/{number_range}'

response = requests.get(endpoint)

We can look at the response as text.

In [7]:
response.text

'{\n "1": "1 is the loneliest number.",\n "2": "2 is the first magic number in physics.",\n "3": "3 is number of performers in a trio.",\n "4": "4 is the number of bits in a nibble, equivalent to half a byte.",\n "5": "5 is the number of babies born in a quintuplet.",\n "6": "6 is the number of ponies in the main cast of My Little Pony: Friendship is Magic.",\n "7": "7 is the number of periods, or horizontal rows of elements, in the periodic table.",\n "8": "8 is the number of furlongs in a mile.",\n "9": "9 is the number of circles of Hell in Dante\'s Divine Comedy.",\n "10": "10 is the number of years in a decade."\n}'

However, it will be easier to work with as JSON. We can use the `json` method to convert the results to a dictionary.

In [8]:
res = response.json()
res

{'1': '1 is the loneliest number.',
 '2': '2 is the first magic number in physics.',
 '3': '3 is number of performers in a trio.',
 '4': '4 is the number of bits in a nibble, equivalent to half a byte.',
 '5': '5 is the number of babies born in a quintuplet.',
 '6': '6 is the number of ponies in the main cast of My Little Pony: Friendship is Magic.',
 '7': '7 is the number of periods, or horizontal rows of elements, in the periodic table.',
 '8': '8 is the number of furlongs in a mile.',
 '9': "9 is the number of circles of Hell in Dante's Divine Comedy.",
 '10': '10 is the number of years in a decade.'}

Then, we can access the individual entries by passing in the correct key.

In [9]:
res['5']

'5 is the number of babies born in a quintuplet.'
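Roughly speaking, the `json` method parses the response text the way `json.loads` would. The sketch below applies `json.loads` to a shortened copy of the batch text, and shows that the keys come back as strings rather than integers.

```python
import json

# Shortened copy of the batch response text from the Numbers API.
text = '{\n "1": "1 is the loneliest number.",\n "2": "2 is the first magic number in physics."\n}'

facts = json.loads(text)  # str -> dict
print(type(facts).__name__)
print(list(facts.keys()))
print(facts['2'])

# The keys are strings, so the integer 2 is not a key.
print(2 in facts)
```

This is why the lookup uses `res['5']` and not `res[5]`.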

Finally, notice that we can ask for a fact about a random number.

In [10]:
endpoint = 'http://numbersapi.com/random'

response = requests.get(endpoint)

response.text

'270 is the average number of days in human pregnancy.'

We can specify a minimum and maximum for these random numbers: http://numbersapi.com#min-and-max

### Parameters

Parameters are specific to each API and indicate what information you want back. They can be compared to the ways you slice a table or DataFrame to get just the subset you want. Some parameters are required; others are optional. Always check the documentation to see which parameters to include and what values each one accepts. When using parameters for an API call, you can do the following:

1. Make an empty dictionary for the `params` variable
2. Look at the documentation to know what parameters you should include, add these as **keys** to the dictionary
3. Add the appropriate values for each parameter as the **values** for the dictionary

For example, let's get a fact about a random number between 500 and 600.

In [11]:
endpoint = 'http://numbersapi.com/random'

params = {
    'min': 500,
    'max': 600
}

response = requests.get(endpoint, params = params)

response.text

'536 is the number of ways to arrange the pieces of the stomachion puzzle into a square, not counting rotation or reflection.'
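It can help to see what the `params` dictionary turns into. The sketch below uses the standard library's `urlencode` to build the query string by hand; `requests` performs an equivalent encoding when you pass `params`.

```python
from urllib.parse import urlencode

endpoint = 'http://numbersapi.com/random'
params = {'min': 500, 'max': 600}

# Each key/value pair becomes key=value, joined with '&'.
query_string = urlencode(params)
full_url = f'{endpoint}?{query_string}'
print(full_url)
```

After a real request, `response.url` shows the final URL that `requests` sent, which is a handy way to check your parameters.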

### NASA API and API Keys

Now, let's work with the NASA API: https://api.nasa.gov/

One of the main ways APIs maintain security is through some form of authentication, such as an API key. How you obtain an API key depends on the API. The key tells the application who you are and gives you secure access to the data.

To work with the NASA API, you'll need to create an API key.

1. Scroll down and enter your First Name, Last Name, and email to generate an API key
2. Copy the API key into the keys.json file.

**DO NOT SHARE YOUR API KEYS OR PUT THEM IN A PUBLIC PLACE LIKE GITHUB**

API keys should be stored securely on your computer and removed from any code or documents you share.

Now, we can safely load your key into a variable using the json library.

In [12]:
import json

In [13]:
with open('keys.json') as fi:
    credentials = json.load(fi)

In [14]:
api_key = credentials['api_key']
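One possible way to package the key loading is sketched below. It assumes `keys.json` has the form `{"api_key": "..."}`, matching the lookup above. The helper name `load_api_key` and the environment variable `NASA_API_KEY` are hypothetical choices for this example, and the demo writes a throwaway file so that it runs anywhere.

```python
import json
import os
import tempfile
from pathlib import Path

def load_api_key(path='keys.json', env_var='NASA_API_KEY'):
    """Return the key from a JSON file, falling back to an environment variable."""
    key_file = Path(path)
    if key_file.exists():
        with key_file.open() as fi:
            return json.load(fi)['api_key']
    key = os.environ.get(env_var)  # env_var is a hypothetical name
    if key is None:
        raise RuntimeError(f'No API key found in {path} or in ${env_var}')
    return key

# Demo with a temporary file and a fake key.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / 'keys.json'
    demo_path.write_text(json.dumps({'api_key': 'DEMO_KEY_NOT_REAL'}))
    demo_key = load_api_key(demo_path)

print(demo_key)
```

If the notebook lives in a git repository, adding `keys.json` to `.gitignore` helps keep the key out of version control.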

1. Look at the different available APIs in the `Browse APIs` tab
2. Click on the **Asteroids - NeoWs**
3. Under **Neo - Feed**, copy the second line into the endpoint variable below as a string and delete the last `?`

In [15]:
endpoint = ''

Fill in the parameters dictionary below to retrieve information on all near earth objects between January 1, 2022 and January 7, 2022. (Be sure to include your API key as a parameter.)

In [16]:
params = {

}

You now have all the pieces to make an API request.

In [17]:
response = requests.get(endpoint, params = params)

MissingSchema: Invalid URL '': No scheme supplied. Perhaps you meant https://?

See what was saved to `response`.

In [None]:
response

This API returns the results as JSON, so we'll access them using the `json` method.

In [None]:
res = response.json()
res

In [None]:
res.keys()

The information that we're interested in is located under `near_earth_objects`.

In [None]:
res['near_earth_objects']

In [None]:
res['near_earth_objects'].keys()

**Question:** How many near earth objects were there on January 3?

In [None]:
# Your Code Here

**Question:** Is the first returned result for January 3 potentially hazardous (as indicated by the `is_potentially_hazardous_asteroid` field)?

In [None]:
# Your Code Here

**Question:** What was the relative velocity, in miles per hour, of the first object returned for January 3?

In [None]:
# Your Code Here

The for loop below iterates over the returned data and pulls out information for each asteroid. It saves the information to lists, which are then used to make a scatter plot of the asteroids.

In [None]:
max_diam = []
hazardous = []
miss_dist = []
for day, objs in res['near_earth_objects'].items():
    for obj in objs:
        max_diam.append(float(obj['estimated_diameter']['miles']['estimated_diameter_max']))
        hazardous.append(obj['is_potentially_hazardous_asteroid'])
        miss_dist.append(float(obj['close_approach_data'][0]['miss_distance']['miles']))

plt.figure(figsize = (17, 10))
plt.scatter(max_diam, miss_dist, c = hazardous)
plt.xlabel('max diameter (miles)')
plt.ylabel('miss distance (miles)');
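To see the nesting that the loop walks through, here is the same logic applied to a small hand-made stand-in for `res['near_earth_objects']`. The field names are the ones used in the loop, but every value is invented for illustration.

```python
# Invented sample: dates map to lists of objects, as in the loop above.
sample = {
    '2022-01-01': [
        {
            'estimated_diameter': {'miles': {'estimated_diameter_max': 0.12}},
            'is_potentially_hazardous_asteroid': False,
            'close_approach_data': [{'miss_distance': {'miles': '1500000.5'}}],
        },
    ],
    '2022-01-02': [
        {
            'estimated_diameter': {'miles': {'estimated_diameter_max': 0.4}},
            'is_potentially_hazardous_asteroid': True,
            'close_approach_data': [{'miss_distance': {'miles': '900000'}}],
        },
    ],
}

rows = []
for day, objs in sample.items():      # outer level: one entry per date
    for obj in objs:                  # inner level: one dict per object
        rows.append({
            'day': day,
            'max_diam': float(obj['estimated_diameter']['miles']['estimated_diameter_max']),
            'hazardous': obj['is_potentially_hazardous_asteroid'],
            # close_approach_data is a list, so take its first element
            'miss_dist': float(obj['close_approach_data'][0]['miss_distance']['miles']),
        })

print(rows[0])
print(sum(row['hazardous'] for row in rows), 'hazardous object(s) in the sample')
```

Collecting one dictionary per object, instead of three parallel lists, is a design choice that keeps related values together.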

If you want to work with the response from an API using _pandas_, you'll want to convert it to a DataFrame. In some circumstances, you can easily convert JSON to a DataFrame, but in other cases, you have to do a little bit of work.

In [None]:
import pandas as pd

The easiest case is when you have a list of dictionaries. Here, you can simply use the `DataFrame` constructor. Let's see how this works using one of the days. If you wanted to get all of the results into a single DataFrame, you could iterate through and concatenate. 

In [None]:
pd.DataFrame(response.json()['near_earth_objects']['2022-01-07']).head(2)

You'll notice that we still have dictionaries in some of the columns. This can be remedied using the `json_normalize` function.

In [None]:
pd.json_normalize(response.json()['near_earth_objects']['2022-01-07']).head(2)

This almost does it, but the `close_approach_data` column contains a list, which `json_normalize` can't handle. To fix this, we can use the `explode` method, which unpacks each list into separate rows, one per list element, if needed.

In [None]:
response_df = pd.json_normalize(response.json()['near_earth_objects']['2022-01-07'])
response_df.explode('close_approach_data').head(2)
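A toy example may help show what `explode` does: every element of a list gets its own row, and the other columns are repeated. The DataFrame below is invented for illustration and only mimics the shape of the `close_approach_data` column.

```python
import pandas as pd

# Invented two-object frame; the second column holds lists of dictionaries.
toy = pd.DataFrame({
    'name': ['object A', 'object B'],
    'close_approach_data': [
        [{'miss_distance': {'miles': '100'}}, {'miss_distance': {'miles': '200'}}],
        [{'miss_distance': {'miles': '300'}}],
    ],
})

exploded = toy.explode('close_approach_data')
print(len(toy), len(exploded))      # the row count grows with the list lengths
print(exploded['name'].tolist())    # 'object A' is repeated, once per list element
print(exploded.index.tolist())      # the original index labels are repeated too
```

Because the index labels repeat after `explode`, it is often worth calling `reset_index(drop=True)` before combining the result with other DataFrames.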

Once exploded, you can use the `json_normalize` function again.

In [None]:
pd.json_normalize(response_df.explode('close_approach_data')['close_approach_data']).head(2)

And finally, you can concatenate the two pieces together.

In [None]:
# Reset the index so both pieces line up row by row when concatenated.
exploded = response_df.explode('close_approach_data').reset_index(drop = True)
pd.concat([
    exploded.drop(columns = ['close_approach_data']),
    pd.json_normalize(exploded['close_approach_data'])
], axis = 1).head(2)
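As mentioned earlier, all of the days can be combined into a single DataFrame by iterating and concatenating. A minimal sketch follows, using an invented stand-in for `res['near_earth_objects']`; the added `date` column is a choice made for this example, not a field of the API response.

```python
import pandas as pd

# Invented stand-in: dates map to lists of records.
near_earth_objects = {
    '2022-01-01': [{'name': 'object A', 'is_potentially_hazardous_asteroid': False}],
    '2022-01-02': [
        {'name': 'object B', 'is_potentially_hazardous_asteroid': True},
        {'name': 'object C', 'is_potentially_hazardous_asteroid': False},
    ],
}

frames = []
for day, objs in near_earth_objects.items():
    day_df = pd.json_normalize(objs)   # one row per object for this day
    day_df['date'] = day               # remember which day each row came from
    frames.append(day_df)

# Stack the per-day frames and renumber the rows from 0.
all_days = pd.concat(frames, ignore_index=True)
print(all_days)
```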

Let's try another `endpoint` from NASA. This time copy the endpoint from the **APOD** (Astronomy Picture of the Day) section.

Fill in the endpoint and parameters in order to retrieve the image for January 1, 2019.

In [None]:
endpoint = ''

params = {

}

In [None]:
response = requests.get(endpoint, params = params)

In [None]:
response

In [None]:
response.json()

Finally, let's grab the image URL so that we can retrieve the actual image.

In [None]:
image_response = requests.get(response.json()['url'])

For image responses, we don't want to look at the text or json, but instead take the content. We'll now use the `.content` attribute from the response to render an image.

In [None]:
from IPython.display import Image

In [None]:
Image(image_response.content)
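Since `.content` is raw bytes, it can also be written to a file in binary mode. The sketch below uses the 8-byte PNG signature as a stand-in for `image_response.content`; `save_bytes` and the file name are made up for this example.

```python
import tempfile
from pathlib import Path

# Stand-in for image_response.content: the PNG file signature, not a full image.
content = b'\x89PNG\r\n\x1a\n'

def save_bytes(data, path):
    """Write raw bytes to path and return the number of bytes written."""
    return Path(path).write_bytes(data)

with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / 'apod_example.png'   # hypothetical file name
    written = save_bytes(content, out_path)
    round_trip = out_path.read_bytes()          # read it back to confirm

print(type(content).__name__, written)
```

With a real response, passing `image_response.content` to the same helper would save the downloaded image to disk.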

In [18]:
endpoint = 'https://data.nashville.gov/resource/fuaa-r5cm.json'

In [21]:
response = requests.get(endpoint)

In [27]:
response.text



In [65]:
from bs4 import BeautifulSoup as BS

In [36]:
import pandas as pd

In [86]:
end = 'NY.GDP.PCAP.PP.KD'

In [87]:
endpoint = 'http://api.worldbank.org/v2/country/all/indicator/'

In [106]:
params ={
    'format' : 'json',
    'per_page' : '20000'
}

In [107]:
response  = requests.get(endpoint + end, params = params)

In [108]:
df_1 = pd.json_normalize(response.json()[1])

In [109]:
df_1

Unnamed: 0,countryiso3code,date,value,unit,obs_status,decimal,indicator.id,indicator.value,country.id,country.value
0,AFE,2023,4047.007031,,,0,NY.GDP.PCAP.PP.KD,"GDP per capita, PPP (constant 2021 internation...",ZH,Africa Eastern and Southern
1,AFE,2022,4038.638689,,,0,NY.GDP.PCAP.PP.KD,"GDP per capita, PPP (constant 2021 internation...",ZH,Africa Eastern and Southern
2,AFE,2021,3994.171654,,,0,NY.GDP.PCAP.PP.KD,"GDP per capita, PPP (constant 2021 internation...",ZH,Africa Eastern and Southern
3,AFE,2020,3919.499230,,,0,NY.GDP.PCAP.PP.KD,"GDP per capita, PPP (constant 2021 internation...",ZH,Africa Eastern and Southern
4,AFE,2019,4130.057222,,,0,NY.GDP.PCAP.PP.KD,"GDP per capita, PPP (constant 2021 internation...",ZH,Africa Eastern and Southern
...,...,...,...,...,...,...,...,...,...,...
17019,ZWE,1964,,,,0,NY.GDP.PCAP.PP.KD,"GDP per capita, PPP (constant 2021 internation...",ZW,Zimbabwe
17020,ZWE,1963,,,,0,NY.GDP.PCAP.PP.KD,"GDP per capita, PPP (constant 2021 internation...",ZW,Zimbabwe
17021,ZWE,1962,,,,0,NY.GDP.PCAP.PP.KD,"GDP per capita, PPP (constant 2021 internation...",ZW,Zimbabwe
17022,ZWE,1961,,,,0,NY.GDP.PCAP.PP.KD,"GDP per capita, PPP (constant 2021 internation...",ZW,Zimbabwe


In [116]:
ender = 'SP.DYN.LE00.IN'

params = {
    'format' : 'json',
    'per_page' : '20000'
}
    

In [117]:
response = requests.get(endpoint + ender, params = params)

In [119]:
df_2 = pd.json_normalize(response.json()[1])

In [146]:
all_data = pd.merge(df_1, df_2, on = ['date', 'country.id', 'country.value'])

In [147]:
all_data

Unnamed: 0,countryiso3code_x,date,value_x,unit_x,obs_status_x,decimal_x,indicator.id_x,indicator.value_x,country.id,country.value,countryiso3code_y,value_y,unit_y,obs_status_y,decimal_y,indicator.id_y,indicator.value_y
0,AFE,2023,4047.007031,,,0,NY.GDP.PCAP.PP.KD,"GDP per capita, PPP (constant 2021 internation...",ZH,Africa Eastern and Southern,AFE,,,,0,SP.DYN.LE00.IN,"Life expectancy at birth, total (years)"
1,AFE,2022,4038.638689,,,0,NY.GDP.PCAP.PP.KD,"GDP per capita, PPP (constant 2021 internation...",ZH,Africa Eastern and Southern,AFE,62.899031,,,0,SP.DYN.LE00.IN,"Life expectancy at birth, total (years)"
2,AFE,2021,3994.171654,,,0,NY.GDP.PCAP.PP.KD,"GDP per capita, PPP (constant 2021 internation...",ZH,Africa Eastern and Southern,AFE,62.454590,,,0,SP.DYN.LE00.IN,"Life expectancy at birth, total (years)"
3,AFE,2020,3919.499230,,,0,NY.GDP.PCAP.PP.KD,"GDP per capita, PPP (constant 2021 internation...",ZH,Africa Eastern and Southern,AFE,63.313860,,,0,SP.DYN.LE00.IN,"Life expectancy at birth, total (years)"
4,AFE,2019,4130.057222,,,0,NY.GDP.PCAP.PP.KD,"GDP per capita, PPP (constant 2021 internation...",ZH,Africa Eastern and Southern,AFE,63.755678,,,0,SP.DYN.LE00.IN,"Life expectancy at birth, total (years)"
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
17019,ZWE,1964,,,,0,NY.GDP.PCAP.PP.KD,"GDP per capita, PPP (constant 2021 internation...",ZW,Zimbabwe,ZWE,54.994000,,,0,SP.DYN.LE00.IN,"Life expectancy at birth, total (years)"
17020,ZWE,1963,,,,0,NY.GDP.PCAP.PP.KD,"GDP per capita, PPP (constant 2021 internation...",ZW,Zimbabwe,ZWE,54.549000,,,0,SP.DYN.LE00.IN,"Life expectancy at birth, total (years)"
17021,ZWE,1962,,,,0,NY.GDP.PCAP.PP.KD,"GDP per capita, PPP (constant 2021 internation...",ZW,Zimbabwe,ZWE,54.071000,,,0,SP.DYN.LE00.IN,"Life expectancy at birth, total (years)"
17022,ZWE,1961,,,,0,NY.GDP.PCAP.PP.KD,"GDP per capita, PPP (constant 2021 internation...",ZW,Zimbabwe,ZWE,53.619000,,,0,SP.DYN.LE00.IN,"Life expectancy at birth, total (years)"


In [122]:
endpoint ='http://api.worldbank.org/v2/country'

In [135]:
params = {
    'format' : 'json',
    'per_page' : '500'
}

In [136]:
response = requests.get(endpoint, params = params)

In [137]:
df_3 = pd.json_normalize(response.json()[1])

In [142]:
df_3 = df_3.loc[df_3['region.id'] != 'NA']

In [143]:
df_3

Unnamed: 0,id,iso2Code,name,capitalCity,longitude,latitude,region.id,region.iso2code,region.value,adminregion.id,adminregion.iso2code,adminregion.value,incomeLevel.id,incomeLevel.iso2code,incomeLevel.value,lendingType.id,lendingType.iso2code,lendingType.value
0,ABW,AW,Aruba,Oranjestad,-70.0167,12.5167,LCN,ZJ,Latin America & Caribbean,,,,HIC,XD,High income,LNX,XX,Not classified
2,AFG,AF,Afghanistan,Kabul,69.1761,34.5228,SAS,8S,South Asia,SAS,8S,South Asia,LIC,XM,Low income,IDX,XI,IDA
5,AGO,AO,Angola,Luanda,13.242,-8.81155,SSF,ZG,Sub-Saharan Africa,SSA,ZF,Sub-Saharan Africa (excluding high income),LMC,XN,Lower middle income,IBD,XF,IBRD
6,ALB,AL,Albania,Tirane,19.8172,41.3317,ECS,Z7,Europe & Central Asia,ECA,7E,Europe & Central Asia (excluding high income),UMC,XT,Upper middle income,IBD,XF,IBRD
7,AND,AD,Andorra,Andorra la Vella,1.5218,42.5075,ECS,Z7,Europe & Central Asia,,,,HIC,XD,High income,LNX,XX,Not classified
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
290,XKX,XK,Kosovo,Pristina,20.926,42.565,ECS,Z7,Europe & Central Asia,ECA,7E,Europe & Central Asia (excluding high income),UMC,XT,Upper middle income,IDX,XI,IDA
292,YEM,YE,"Yemen, Rep.",Sana'a,44.2075,15.352,MEA,ZQ,Middle East & North Africa,MNA,XQ,Middle East & North Africa (excluding high inc...,LIC,XM,Low income,IDX,XI,IDA
293,ZAF,ZA,South Africa,Pretoria,28.1871,-25.746,SSF,ZG,Sub-Saharan Africa,SSA,ZF,Sub-Saharan Africa (excluding high income),UMC,XT,Upper middle income,IBD,XF,IBRD
294,ZMB,ZM,Zambia,Lusaka,28.2937,-15.3982,SSF,ZG,Sub-Saharan Africa,SSA,ZF,Sub-Saharan Africa (excluding high income),LMC,XN,Lower middle income,IDX,XI,IDA


In [150]:
new_df = pd.merge(all_data, df_3, left_on = 'country.value', right_on = 'name', how = 'inner')

In [151]:
new_df['country.value'].value_counts()

country.value
Afghanistan      64
Pakistan         64
Nepal            64
Netherlands      64
New Caledonia    64
                 ..
Greece           64
Greenland        64
Grenada          64
Guam             64
Zimbabwe         64
Name: count, Length: 217, dtype: int64

In [156]:
new_df.columns

Index(['countryiso3code_x', 'date', 'value_x', 'unit_x', 'obs_status_x',
       'decimal_x', 'indicator.id_x', 'indicator.value_x', 'country.id',
       'country.value', 'countryiso3code_y', 'value_y', 'unit_y',
       'obs_status_y', 'decimal_y', 'indicator.id_y', 'indicator.value_y',
       'id', 'iso2Code', 'name', 'capitalCity', 'longitude', 'latitude',
       'region.id', 'region.iso2code', 'region.value', 'adminregion.id',
       'adminregion.iso2code', 'adminregion.value', 'incomeLevel.id',
       'incomeLevel.iso2code', 'incomeLevel.value', 'lendingType.id',
       'lendingType.iso2code', 'lendingType.value'],
      dtype='object')

In [157]:
df_3.columns

Index(['id', 'iso2Code', 'name', 'capitalCity', 'longitude', 'latitude',
       'region.id', 'region.iso2code', 'region.value', 'adminregion.id',
       'adminregion.iso2code', 'adminregion.value', 'incomeLevel.id',
       'incomeLevel.iso2code', 'incomeLevel.value', 'lendingType.id',
       'lendingType.iso2code', 'lendingType.value'],
      dtype='object')

In [158]:
pd.merge(new_df, df_3, on = 'iso2Code')

Unnamed: 0,countryiso3code_x,date,value_x,unit_x,obs_status_x,decimal_x,indicator.id_x,indicator.value_x,country.id,country.value,...,region.value_y,adminregion.id_y,adminregion.iso2code_y,adminregion.value_y,incomeLevel.id_y,incomeLevel.iso2code_y,incomeLevel.value_y,lendingType.id_y,lendingType.iso2code_y,lendingType.value_y
0,AFG,2023,,,,0,NY.GDP.PCAP.PP.KD,"GDP per capita, PPP (constant 2021 internation...",AF,Afghanistan,...,South Asia,SAS,8S,South Asia,LIC,XM,Low income,IDX,XI,IDA
1,AFG,2022,1955.212904,,,0,NY.GDP.PCAP.PP.KD,"GDP per capita, PPP (constant 2021 internation...",AF,Afghanistan,...,South Asia,SAS,8S,South Asia,LIC,XM,Low income,IDX,XI,IDA
2,AFG,2021,2138.870247,,,0,NY.GDP.PCAP.PP.KD,"GDP per capita, PPP (constant 2021 internation...",AF,Afghanistan,...,South Asia,SAS,8S,South Asia,LIC,XM,Low income,IDX,XI,IDA
3,AFG,2020,2776.561521,,,0,NY.GDP.PCAP.PP.KD,"GDP per capita, PPP (constant 2021 internation...",AF,Afghanistan,...,South Asia,SAS,8S,South Asia,LIC,XM,Low income,IDX,XI,IDA
4,AFG,2019,2933.958598,,,0,NY.GDP.PCAP.PP.KD,"GDP per capita, PPP (constant 2021 internation...",AF,Afghanistan,...,South Asia,SAS,8S,South Asia,LIC,XM,Low income,IDX,XI,IDA
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
13883,ZWE,1964,,,,0,NY.GDP.PCAP.PP.KD,"GDP per capita, PPP (constant 2021 internation...",ZW,Zimbabwe,...,Sub-Saharan Africa,SSA,ZF,Sub-Saharan Africa (excluding high income),LMC,XN,Lower middle income,IDB,XH,Blend
13884,ZWE,1963,,,,0,NY.GDP.PCAP.PP.KD,"GDP per capita, PPP (constant 2021 internation...",ZW,Zimbabwe,...,Sub-Saharan Africa,SSA,ZF,Sub-Saharan Africa (excluding high income),LMC,XN,Lower middle income,IDB,XH,Blend
13885,ZWE,1962,,,,0,NY.GDP.PCAP.PP.KD,"GDP per capita, PPP (constant 2021 internation...",ZW,Zimbabwe,...,Sub-Saharan Africa,SSA,ZF,Sub-Saharan Africa (excluding high income),LMC,XN,Lower middle income,IDB,XH,Blend
13886,ZWE,1961,,,,0,NY.GDP.PCAP.PP.KD,"GDP per capita, PPP (constant 2021 internation...",ZW,Zimbabwe,...,Sub-Saharan Africa,SSA,ZF,Sub-Saharan Africa (excluding high income),LMC,XN,Lower middle income,IDB,XH,Blend
