# **Census API**

The purpose of this notebook is to familiarize ourselves with the Census API, found [here](https://www.census.gov/data/developers/data-sets/acs-1year.html).


## **Step 1: Set up API Key**<br>
To get access to the API, you need to request an API key [here](https://api.census.gov/data/key_signup.html). A few minutes after filling out the form, you will get an email with setup instructions.

## **Step 2: Set key value**<br>
Once you have the API key, set it in the cell below. Keep this value private, and delete it before uploading the notebook anywhere.

In [None]:
# Keep this value SECRET

API_SECRET_KEY = 'PUT YOUR KEY HERE'
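
One way to avoid uploading the key by accident is to read it from an environment variable instead of typing it into the notebook. The sketch below assumes a variable named `CENSUS_API_KEY`; that name and the `load_api_key` helper are chosen here for illustration and are not prescribed by the Census Bureau.

```python
import os


def load_api_key(env=None, var_name="CENSUS_API_KEY"):
    """Return the Census API key stored in an environment variable.

    `CENSUS_API_KEY` is a hypothetical name; use whatever name you exported.
    `env` defaults to the real process environment and can be replaced
    with a plain dict for testing.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable to your Census API key."
        )
    return key


# In the notebook you could then write:
# API_SECRET_KEY = load_api_key()
```

With this approach the notebook itself never contains the key, so there is nothing to delete before sharing it.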

## **Step 3: ACS API Handbook**

Reference this documentation to gain an understanding of how to use the API.

[ACS API Handbook](https://www.census.gov/content/dam/Census/library/publications/2020/acs/acs_api_handbook_2020_ch02.pdf)

Below we make a sample request for data from DP03. In the `url` variable, you can change the year and choose between acs1, acs3, and acs5. Keep in mind that each ACS dataset is available for a different range of years. See [here](https://www.census.gov/data/developers/data-sets/acs-3year.html) for more information on ACS-3 and some sample calls to different tables.
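
To make the year and dataset choice explicit, the endpoint can be assembled from its parts. This is a minimal sketch that follows the URL pattern used in this notebook; `build_acs_profile_url` is a hypothetical helper, and it does not check whether the chosen year is actually published for the chosen dataset.

```python
ACS_DATASETS = {"acs1", "acs3", "acs5"}


def build_acs_profile_url(year, dataset="acs5"):
    """Build the Data Profile endpoint for a given year and ACS dataset.

    Only the dataset name is validated; availability of the year must be
    checked against the Census documentation.
    """
    if dataset not in ACS_DATASETS:
        raise ValueError(f"dataset must be one of {sorted(ACS_DATASETS)}")
    return f"https://api.census.gov/data/{int(year)}/acs/{dataset}/profile"


url = build_acs_profile_url(2019, "acs5")
print(url)
```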

For this tutorial, we will focus on the *Data Profile (DP)* table.

```
params = {
    'get': 'group(DP03),NAME',
    'for': 'state:11',
    'key': API_SECRET_KEY
}
```
The code above requests the data in DP03 for state 11, which is the District of Columbia. `group(DP03)` returns all the variables and values for DP03. A full list of variables is available [here](https://api.census.gov/data/2013/acs/acs3/profile/variables.html); note that this link is for the 2013 ACS 3-year data, and variable lists can differ by year and dataset.
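
When a request fails, it helps to see the full URL that the parameters turn into. The sketch below builds such a URL with the standard library and hides the key so the result is safe to paste into a bug report. `preview_request_url` is a hypothetical helper, and the exact encoding that `requests` produces may differ in detail.

```python
from urllib.parse import urlencode


def preview_request_url(base_url, params, hide=("key",)):
    """Return the request URL with secret parameters replaced by REDACTED."""
    shown = {k: ("REDACTED" if k in hide else v) for k, v in params.items()}
    return base_url + "?" + urlencode(shown)


example = preview_request_url(
    "https://api.census.gov/data/2019/acs/acs5/profile",
    {"get": "group(DP03),NAME", "for": "state:11", "key": "not-a-real-key"},
)
print(example)
```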

In [None]:
import requests

# Set the API endpoint URL and parameters
url = 'https://api.census.gov/data/2019/acs/acs5/profile'
params = {
    'get': 'group(DP03),NAME', # group(table_name), e.g. group(DP05)
    'for': 'state:11', # State 11 is District of Columbia
    'key': API_SECRET_KEY
}

# Make the API request and get the response
response = requests.get(url, params=params)

# Check if the response was successful
if response.status_code == 200:
    # Extract the JSON data from the response
    data = response.json()
    # Print the first row of the data, which contains the column names
    print(data[0])
    # Print the second row of the data, which contains the data for the District of Columbia
    print(data[1])
else:
    # If the response was not successful, print the status code
    print('Error: ' + str(response.status_code))

['NAME', 'DP03_0001E', 'DP03_0001EA', 'DP03_0001M', 'DP03_0001MA', 'DP03_0001PE', 'DP03_0001PEA', 'DP03_0001PM', 'DP03_0001PMA', 'DP03_0002E', 'DP03_0002EA', 'DP03_0002M', 'DP03_0002MA', 'DP03_0002PE', 'DP03_0002PEA', 'DP03_0002PM', 'DP03_0002PMA', 'DP03_0003E', 'DP03_0003EA', 'DP03_0003M', 'DP03_0003MA', 'DP03_0003PE', 'DP03_0003PEA', 'DP03_0003PM', 'DP03_0003PMA', 'DP03_0004E', 'DP03_0004EA', 'DP03_0004M', 'DP03_0004MA', 'DP03_0004PE', 'DP03_0004PEA', 'DP03_0004PM', 'DP03_0004PMA', 'DP03_0005E', 'DP03_0005EA', 'DP03_0005M', 'DP03_0005MA', 'DP03_0005PE', 'DP03_0005PEA', 'DP03_0005PM', 'DP03_0005PMA', 'DP03_0006E', 'DP03_0006EA', 'DP03_0006M', 'DP03_0006MA', 'DP03_0006PE', 'DP03_0006PEA', 'DP03_0006PM', 'DP03_0006PMA', 'DP03_0007E', 'DP03_0007EA', 'DP03_0007M', 'DP03_0007MA', 'DP03_0007PE', 'DP03_0007PEA', 'DP03_0007PM', 'DP03_0007PMA', 'DP03_0008E', 'DP03_0008EA', 'DP03_0008M', 'DP03_0008MA', 'DP03_0008PE', 'DP03_0008PEA', 'DP03_0008PM', 'DP03_0008PMA', 'DP03_0009E', 'DP03_0009EA', 'D
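
The column names in the output above follow a pattern: a table ID, a line number, and a suffix. In ACS data the suffix `E` marks an estimate, `M` a margin of error, `PE` and `PM` their percent counterparts, and a trailing `A` an annotation column. The sketch below splits a name into those parts; `parse_variable` is a hypothetical helper, and its regular expression only covers the Data Profile names shown in this notebook.

```python
import re

# Suffix meanings for ACS Data Profile variables
KINDS = {
    "E": "estimate",
    "M": "margin of error",
    "PE": "percent estimate",
    "PM": "percent margin of error",
    "EA": "annotation of estimate",
    "MA": "annotation of margin of error",
    "PEA": "annotation of percent estimate",
    "PMA": "annotation of percent margin of error",
}

_PATTERN = re.compile(r"^(DP\d{2})_(\d{4})(PEA|PMA|PE|PM|EA|MA|E|M)$")


def parse_variable(name):
    """Return (table, line_number, kind) or None for non-variable columns."""
    match = _PATTERN.match(name)
    if match is None:
        return None
    table, line, suffix = match.groups()
    return table, int(line), KINDS[suffix]


# Keep only the estimate columns from a header row
header = ["NAME", "DP03_0001E", "DP03_0001EA", "DP03_0001M", "DP03_0001PE"]
estimates = [c for c in header if (parse_variable(c) or (None, None, None))[2] == "estimate"]
print(estimates)
```

Filtering on the suffix like this is a convenient way to cut the several hundred columns returned by `group(DP03)` down to the ones you need.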

## **Step 4: Convert JSON to DataFrame**

The object that we got from this GET request is a [requests.Response](https://www.w3schools.com/python/ref_requests_response.asp) object. Calling `.json()` on it parses the JSON body into Python objects, here a list of rows whose first row holds the column names.

```
data = response.json()
```

Once we have the parsed data, we can convert it into a pandas DataFrame.

In [None]:
import pandas as pd

df = pd.DataFrame(data)

# Set the first row as the column names
df.columns = df.iloc[0]
# Remove the first row
df = df[1:]

# Set the index to 'NAME' column
df.set_index('NAME', inplace=True)

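# Convert numerical columns to numeric type (the API appears to return values as strings)
df = df.apply(pd.to_numeric, errors='ignore')

# Replace -999999999 values with a missing value; this runs after the numeric
# conversion so that the integer comparison can match
df.replace(-999999999, pd.NA, inplace=True)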

# Display the cleaned dataframe
df.reset_index()

Unnamed: 0,NAME,DP03_0001E,DP03_0001EA,DP03_0001M,DP03_0001MA,DP03_0001PE,DP03_0001PEA,DP03_0001PM,DP03_0001PMA,DP03_0002E,...,DP03_0137E,DP03_0137EA,DP03_0137M,DP03_0137MA,DP03_0137PE,DP03_0137PEA,DP03_0137PM,DP03_0137PMA,GEO_ID,state
0,"(District of Columbia, District of Columbia)",579127,,459,,579127,,-888888888,(X),407904,...,-888888888,(X),-888888888,(X),19.5,,0.7,,0400000US11,11
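
The output above still shows `-888888888` placeholders, and `NAME` appears as a pair, which suggests that `group()` returns the `NAME` column twice. The following is a minimal sketch of a stricter conversion, not the method used in the cell above. `rows_to_frame`, `SENTINELS`, and `TEXT_COLUMNS` are hypothetical names, the list of placeholder values is an assumption to check against the ACS documentation, and the sample row reuses values from the output above.

```python
import pandas as pd

# Assumed placeholder values; check the ACS documentation for the full list
SENTINELS = [-999999999, -888888888, -666666666]
# Identifier columns that should stay as strings
TEXT_COLUMNS = {"NAME", "GEO_ID", "state"}


def rows_to_frame(rows):
    """Turn API rows (header first) into a DataFrame with cleaned numbers."""
    header, *records = rows
    df = pd.DataFrame(records, columns=header)
    # Keep only the first copy of any repeated column name
    df = df.loc[:, ~df.columns.duplicated()].copy()
    for col in df.columns:
        if col in TEXT_COLUMNS:
            continue
        converted = pd.to_numeric(df[col], errors="coerce")
        # Convert only when every non-missing value parsed as a number
        if converted.notna().sum() == df[col].notna().sum():
            df[col] = converted.replace(SENTINELS, float("nan"))
    return df


sample = [
    ["NAME", "DP03_0001E", "DP03_0001PM", "DP03_0001PMA", "NAME", "state"],
    ["District of Columbia", "579127", "-888888888", "(X)", "District of Columbia", "11"],
]
clean = rows_to_frame(sample)
print(clean.dtypes)
```

Columns such as the annotation fields, which hold text like `(X)`, are left as strings because not every value parses as a number.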


## **Step 5: Block Data**

In the previous section, we made a request for the totals within the "state" of the District of Columbia. Now we want to zoom in and get data for each block. To do that, we switch datasets to the 2020 Decennial Census (the `dec/pl` endpoint), as DP03 doesn't appear to have block data.



In [None]:
# Set the API endpoint URL and parameters
url = 'https://api.census.gov/data/2020/dec/pl'
params = {
    'get': 'NAME',
    'for': 'block:*', # This gets all blocks
    'in' : 'state:11, county:*',
    'key': API_SECRET_KEY
}

# Make the API request and get the response
response = requests.get(url, params=params)

# Check if the response was successful
if response.status_code == 200:
    # Extract the JSON data from the response
    data = response.json()
    # Print the first row of the data, which contains the column names
    print(data[0])
    # Print the second row of the data, which contains the first block returned
    print(data[1])
else:
    # If the response was not successful, print the status code
    print('Error: ' + str(response.status_code))

['NAME', 'state', 'county', 'tract', 'block']
['Block 2006, Block Group 2, Census Tract 90, District of Columbia, District of Columbia', '11', '001', '009000', '2006']
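
The four code columns in this output can be joined into a single block identifier. A census block GEOID is 15 digits: 2 for the state, 3 for the county, 6 for the tract, and 4 for the block. The sketch below builds it from the row printed above; `block_geoid` is a hypothetical helper name.

```python
def block_geoid(state, county, tract, block):
    """Concatenate the zero-padded FIPS parts into a 15-digit block GEOID."""
    parts = (
        str(state).zfill(2),
        str(county).zfill(3),
        str(tract).zfill(6),
        str(block).zfill(4),
    )
    return "".join(parts)


header = ["NAME", "state", "county", "tract", "block"]
row = [
    "Block 2006, Block Group 2, Census Tract 90, District of Columbia, District of Columbia",
    "11", "001", "009000", "2006",
]
record = dict(zip(header, row))
geoid = block_geoid(record["state"], record["county"], record["tract"], record["block"])
print(geoid)  # 110010090002006
```

Keeping the codes as strings matters here, because converting them to numbers would drop the leading zeros.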


In [None]:
df = pd.DataFrame(data)
df.head()

Unnamed: 0,0,1,2,3,4
0,NAME,state,county,tract,block
1,"Block 2006, Block Group 2, Census Tract 90, Di...",11,001,009000,2006
2,"Block 2007, Block Group 2, Census Tract 90, Di...",11,001,009000,2007
3,"Block 2008, Block Group 2, Census Tract 90, Di...",11,001,009000,2008
4,"Block 2009, Block Group 2, Census Tract 90, Di...",11,001,009000,2009
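
In the table above, the column names are still sitting in row 0 as data. A minimal sketch of one way to promote them to the header is to pass the first row as `columns` when building the DataFrame. `block_rows_to_frame` is a hypothetical helper, and the sample reuses the single row printed earlier in this step.

```python
import pandas as pd


def block_rows_to_frame(rows):
    """Build a DataFrame whose column names come from the first API row."""
    header, *records = rows
    # Values stay as strings, which preserves leading zeros in the codes
    return pd.DataFrame(records, columns=header)


sample = [
    ["NAME", "state", "county", "tract", "block"],
    [
        "Block 2006, Block Group 2, Census Tract 90, District of Columbia, District of Columbia",
        "11", "001", "009000", "2006",
    ],
]
blocks = block_rows_to_frame(sample)
print(blocks[["state", "county", "tract", "block"]])
```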
