# Tapping APIs

A site with an API (Application Programming Interface) wants you to have the data it holds.

When we enter a URL in a browser, we typically get back a web page - a formatted document designed for people to read. APIs also use URLs, but rather than delivering visually formatted pages, API URLs deliver structured data that computer programs can easily interpret and process. These specialized API URLs are known as endpoints. Similar to how multiple web pages combine to form a website, multiple endpoints combine to create a complete API. We will learn to construct API calls.

Examples abound:

1. <a href="https://www.census.gov/data/developers/data-sets.html">U.S. Census APIs</a>
2. <a href="https://apps.fas.usda.gov/opendataweb/home">US Agriculture Commodities and Exports</a>
3. <a href="https://www.federalregister.gov/developers/documentation/api/v1">Federal Register</a>
4. <a href="https://www.eia.gov/">U.S. Energy Information Administration</a>
5. <a href="https://www.usaspending.gov/">USAspending</a>


Government sites do provide ```CSVs``` for download, but their APIs let you request more specific slices of the data. They are not trying to hide the data.

Private sites might have APIs, but often charge hefty prices for access beyond a basic number of downloads.

The hardest part of tapping APIs is that they ***ALL HAVE DIFFERENT INSTRUCTIONS*** on how to download their data. We'll <a href="https://docs.google.com/presentation/d/158fsOq5FF2qE3dONtwNeePSJyzLRKENlCxRsFPN_ZPg/edit?usp=sharing">learn the best approach to understanding API documentation</a>, including using AI to deconstruct the technical requirements.

In this module, we'll explore different APIs that each build a different skill:

1. Census health data – **building a simple API call.**
2. USDA commodities exports – **using an API key and targeting specific commodities over several years.**
3. Federal Register – **tapping search terms.**
4. Energy Information Administration – **dealing with pagination.**
5. USAspending – **working with rate limits.**

What they all have in common:

1. a base URL
2. a query string
3. query characters ```?``` and ```&``` that tie the two together
4. often, an API key (<a href="https://docs.google.com/presentation/d/17Cez665ZoDJQaQGdwbY_girCoCgF71Y77aADn88Sv6Q/edit?usp=sharing">different types of authentication</a>)

Combined, these are known as an ```API endpoint```.

You make an ```API call``` (a request) using the ```API endpoint```.
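
As a minimal sketch of how these pieces fit together, the snippet below joins a base URL and a query string by hand. The base URL and the parameter names are made up for illustration; a real API defines its own in its documentation.

```python
from urllib.parse import urlencode

# Hypothetical base URL and parameters, for illustration only.
base_url = "https://api.example.com/v1/records"
params = {"state": "NY", "year": 2021}

# urlencode joins each key=value pair with "&".
query_string = urlencode(params)

# "?" separates the base URL from the query string.
endpoint = f"{base_url}?{query_string}"

print(endpoint)
```

Pasting the printed endpoint into a browser is a quick way to check that a call is shaped correctly before writing any more code.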




In [None]:
## import libraries




## 1. Census health data – **building a simple API call.**

- <a href="https://www.census.gov/data/developers/data-sets/Health-Insurance-Statistics.html">Census health landing page</a>
- List of <a href="https://api.census.gov/data/timeseries/healthins/sahie/variables.html">possible variables</a>

We want to create a dataframe with the following info for every state in 2021:

1. Total number insured
2. Percent insured
3. Total number uninsured
4. Percent uninsured
5. by Race

## Using API key

<a href="https://api.census.gov/data/key_signup.html">Sign up for an api key</a>

It turns out that a US Census API key isn't actually required, but it's best to get one for the following reasons:

1. **Without a key:** your entire IP address is limited to 500 requests per day.
2. **With a key:** you get 500 requests per day per key (but you can register multiple keys)!

Safeguard your API keys and do not share them. It's good practice to store them in a password manager or a secure document. I keep all my API keys in a single document.



## Parts to build your API call.
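
One possible way to assemble the parts is sketched below. The variable codes are what I believe the SAHIE variables page lists as point estimates, and the `for=state:*` and `time=` predicates are likewise assumptions taken from my reading of the Census documentation, so check all of them against the variables list before relying on them.

```python
# Codes assumed from the SAHIE variables page; verify before using them.
variables = {
    "NAME": "state_name",
    "NIC_PT": "number_insured",
    "PCTIC_PT": "percent_insured",
    "NUI_PT": "number_uninsured",
    "PCTUI_PT": "percent_uninsured",
    "RACECAT": "race_category",
}

# The API expects the variable codes as one comma-separated string.
get_clause = ",".join(variables.keys())

base_url = "https://api.census.gov/data/timeseries/healthins/sahie"
year = 2021

# "for=state:*" asks for every state; "time" selects the year (both assumed).
query_string = f"?get={get_clause}&for=state:*&time={year}"

api_call = base_url + query_string
print(api_call)
```

Keeping the codes in a dictionary means the same object can later be used to rename the cryptic column headers.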

In [None]:
## create a dictionary to know what codes mean
## all variables are 90% upper bound


In [None]:
## we can see just the keys in the dictionary


In [None]:
## turn it into a list of keys



In [None]:
## pull values into a list



In [None]:
## format into api query format


In [None]:
## other targets


In [None]:
## base url + year



In [None]:
## create query string


**Set the key secretly in your notebook:**

We will use `python-dotenv`, a common way to keep API keys out of your code, used at Bloomberg and other places.

```python
import os  ## to handle environment variables
from dotenv import load_dotenv

load_dotenv()  ## reads key=value pairs from the .env file
```
You need to ```pip install python-dotenv``` first.


Create a file named ```.env``` using VSCode. **Note:** because its name starts with a dot, it is treated as a hidden file, so once you save and close it you may not see it in your file browser anymore.
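
A rough sketch of reading the key back is shown below. It assumes `load_dotenv()` has already run and that the ```.env``` file holds a single line such as `CENSUS_API_KEY=your-key-here`; the variable name `CENSUS_API_KEY` is a placeholder of my choosing, not something the Census requires.

```python
import os

# Assumes load_dotenv() has already run and .env defines CENSUS_API_KEY
# (a hypothetical variable name; use whatever name you put in your .env).
census_key = os.getenv("CENSUS_API_KEY")

# Only build the key parameter when a key was actually found.
key_param = f"&key={census_key}" if census_key else ""

# Report whether the key loaded without ever printing the key itself.
print("key loaded" if census_key else "no key found; check your .env file")
```

Printing a status message rather than the key keeps the secret out of the notebook's saved output.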


In [None]:
%pip install python-dotenv

In [None]:
## pull secret API key into your notebook


In [None]:
## build api parameter


In [None]:
## create full API call


In [None]:
## get response


In [None]:
## turn response into json


In [None]:
## create dataframe
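
The Census API returns its JSON as a list of lists in which the first row holds the column names. The sketch below uses made-up sample rows in that shape (the state names and numbers are placeholders, not real results) to show one way of pairing headers with values.

```python
# Hypothetical sample in the shape the Census API returns:
# a list of lists whose first row holds the column names.
data = [
    ["NAME", "NIC_PT", "time", "state"],
    ["Example State A", "100", "2021", "01"],
    ["Example State B", "200", "2021", "02"],
]

# Split the header row from the data rows.
header, rows = data[0], data[1:]

# Pair each value with its column name.
records = [dict(zip(header, row)) for row in rows]

print(records[0])
```

With pandas, `pd.DataFrame(rows, columns=header)` builds the same table directly. The values arrive as strings, so numeric columns will likely need converting before any math.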


In [None]:
## rename column headers with more meaningful headers


In [None]:
## iterate through several years


In [None]:
## length


In [None]:
## place in df


### 2. USDA commodities exports – **using an API key and targeting specific commodities over several years.**

- <a href="https://apps.fas.usda.gov/opendataweb/home">USDA API endpoints</a>
- Get an <a href="https://apps.fas.usda.gov/opendataweb/home">API key</a>

We want to create a dataframe with exports from 2020 through 2022 to all countries for the following commodities:

1. All Wheat
2. Oats
3. Cuts of Beef
4. Cuts of Pork
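
One way to plan the calls is to build every commodity and year combination up front. In the sketch below, the URL pattern follows the export endpoint used later in this notebook, and 601 is the only commodity code that appears here; the codes for wheat, oats, beef and pork are not filled in and would need to be looked up in the USDA documentation.

```python
from itertools import product

# 601 is the only commodity code shown in this notebook; add the codes
# for the target commodities after looking them up in the USDA docs.
commodity_codes = [601]
market_years = range(2020, 2023)  # 2020, 2021, 2022

base_url = "https://api.fas.usda.gov/api/esr/exports/commodityCode"

# One URL per (commodity, year) pair.
urls = [
    f"{base_url}/{code}/allCountries/marketYear/{year}"
    for code, year in product(commodity_codes, market_years)
]

for url in urls:
    print(url)
```

Each URL can then be passed to `requests.get(url, headers=headers)` inside a loop, appending every response to a single list.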

In [None]:
## load api key secretly


In [None]:
## find the parts to build your API call

headers = {
    'accept': 'application/json',
    'X-Api-Key': 'ADD HERE',  ## replace with your own API key
}


In [None]:
## TARGET COMMODITIES

In [None]:
#BUILD base and end url


In [None]:
## iterate through multi year


In [None]:
##SEE ALL DATA - DO NOT RUN THIS CELL
all_data

In [None]:
pd.DataFrame(all_data)

In [None]:
response.json()

In [None]:
import os

headers = {
    'accept': 'application/json',
    'X-Api-Key': os.getenv('USDA_API_KEY'),  ## read from .env; the variable name is an assumption
}

In [None]:
url = "https://api.fas.usda.gov/api/esr/exports/commodityCode/601/allCountries/marketYear/2023"

In [None]:
response = requests.get(url, headers=headers)

In [None]:
response.json()


In [None]:
countries_response = requests.get("https://api.fas.usda.gov/api/esr/countries", headers=headers)
countries_response.json()

In [None]:
## now let's put into get requests
## we check the response status code (200 means success)
response = requests.get(com_url, headers=headers)
print(response.status_code)
response.json()

### Get endpoint and test out on a single commodity


In [None]:
## your end point here


In [None]:
## now let's put into get requests
## we check the response status code


In [None]:
## let's store our response into an object called data


In [None]:
## convert that list of dicts into a dataframe called df


In [None]:
## Now iterate through all our target items



In [None]:
## endpoint templates



In [None]:
## iterate to get all the data


In [None]:
## call list


In [None]:
## concat into single df


In [None]:
## call df


In [None]:
## confirm we have all our target commodities


### 3. <a id="federal"></a>Federal Register – **tapping search terms.**

We have decades of <a href="https://docs.google.com/spreadsheets/d/130WeumbMZjcoRP4D-1uJ7bM0aKBZzt4N/edit?usp=sharing&ouid=112307892189798608417&rtpof=true&sd=true">SBA Excel files</a> that detail loans given to small businesses to recover after climate disasters. The only information we have about the type of disaster is a code in one of the columns, such as:

- CA-00279
- IL-00051
- NC-00099
- CA-00288
- LA-00079

The <a href="https://www.federalregister.gov/">Federal Register</a> allows us to search for what these codes stand for. But we can't search by hand for nearly a thousand such disaster codes. When we try to scrape the site, it warns us to use the API instead.

Federal Register <a href="https://www.federalregister.gov/developers/documentation/api/v1#/Federal%20Register%20Documents/get_documents__format_">API documentation</a>

## Find the endpoint

https://www.federalregister.gov/api/v1/documents.json?per_page=20&conditions[docket_id]=LA-00079

https://www.federalregister.gov/api/v1/documents.json?per_page=20&conditions[docket_id]=PA-00115
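
The two endpoints above differ only in the disaster code at the end, so the same pattern can be generated for any list of codes. A minimal sketch, using the five example codes listed earlier in place of the full spreadsheet column:

```python
# Example disaster codes from the list above; in practice these would
# come from the relevant column of the SBA spreadsheet.
disaster_codes = ["CA-00279", "IL-00051", "NC-00099", "CA-00288", "LA-00079"]

base_url = "https://www.federalregister.gov/api/v1/documents.json"

# Only the docket_id value changes from one endpoint to the next.
endpoints = [
    f"{base_url}?per_page=20&conditions[docket_id]={code}"
    for code in disaster_codes
]

for endpoint in endpoints:
    print(endpoint)
```

Building the endpoints first makes it easy to spot-check one or two in a browser before requesting all of them.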

#### Test on single endpoint after figuring out how to build API call

In [None]:
## endpoint
endpoint = "https://www.federalregister.gov/api/v1/documents.json?per_page=20&conditions[docket_id]=PA-00115"

In [None]:
## get data


In [None]:
## length


In [None]:
## what type is it?


In [None]:
## GET COUNT


In [None]:
## GET RESULTS


In [None]:
## type


In [None]:
## LEN OF LIST


In [None]:
## NARROW OUR FIELD


In [None]:
## targeting incidents


### Iterate through entire list of codes

In [None]:
## Normally will take from df as a list
## build disaster code list


In [None]:
## provide base url


In [None]:
## ITERATE THROUGH ALL


In [None]:
## call list


In [None]:
## DATA FRAME IT
