# US Census API Example

This use case demonstrates how to pull data from the US Census API and plot it using Folium. Details about the US Census API can be found at https://www.census.gov/data/developers/guidance/api-user-guide.html. The use case focuses on the American Community Survey (ACS) https://www.census.gov/programs-surveys/acs, a survey conducted by the US Census Bureau that details housing and population counts for the nation. This information gives communities an important tool for assessing how they are changing. When people fill out the ACS form, they help ensure that decisions about the future of their community are made using the best data available. Decision-makers need a clear picture of their population so that scarce resources can be allocated efficiently and effectively.

A US Census API key is required for this use case. Go to https://api.census.gov/data/key_signup.html and request your API key now. We'll be here when you get back.

## Requirements
folium  
geopandas  
requests  
json  
numpy  
pandas  
getpass  
US Census API key from https://api.census.gov/data/key_signup.html

## Install packages

To begin, we first install the Folium and GeoPandas Python packages, which we will use later on. We install both with "pip install". The "json" and "getpass" modules are part of the Python standard library, and "requests", "numpy", and "pandas" are typically installed as dependencies of Folium and GeoPandas, so there is usually no need to install them separately.

In [1]:
!pip install folium -q
!pip install geopandas -q

## Import packages

Next, to set up the notebook, we load the packages and modules we need using "import" statements. This lets us use them throughout the rest of the notebook. The packages include NumPy (https://numpy.org/) and Pandas (https://pandas.pydata.org/).

In [2]:
import requests 
import numpy as np
import pandas as pd
import folium
import json
from getpass import getpass

## Enter API key

Below, enter the API key that you received from the US Census website https://api.census.gov/data/key_signup.html. You need a working key to proceed through the rest of the use case. Do not share your key with anyone else. We use Python's "getpass" function here so that the key is not echoed to the screen as you type, which keeps it hidden from anyone nearby. The key is passed to the API later through the variable "CENSUS_KEY".

In [3]:
CENSUS_KEY = getpass('Enter Census key: ')

Enter Census key:  ········································
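
When the notebook is run without someone at the keyboard, an interactive prompt is not practical. One possible variation, sketched below, looks for the key in an environment variable first and falls back to "getpass" only if nothing is found. The variable name `CENSUS_API_KEY` and the helper `load_census_key` are assumptions made for this illustration; neither is defined by the Census API or by this notebook.

```python
import os
from getpass import getpass


def load_census_key(env=None, prompt=getpass):
    """Return the Census key from the environment, or prompt for it.

    CENSUS_API_KEY is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    key = env.get("CENSUS_API_KEY", "").strip()
    if not key:
        # Fall back to a hidden prompt, as in the cell above.
        key = prompt("Enter Census key: ").strip()
    if not key:
        raise ValueError("A Census API key is required.")
    return key
```

With this helper in place, the cell above could read `CENSUS_KEY = load_census_key()`.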


## Make an API call

This section walks through how to create a Census API call. Three key inputs are needed before we make our request: the variables, the year, and the API key. The variables specify the information we would like to extract with our query, and they can be changed to pull population groups that differ by age, sex, and race. A table of the available variables is found here: https://api.census.gov/data/2019/acs/acs1/variables.html. For this use case, we look at the total population and the African American population of each state, which are the variables 'B01001_001E' and 'B02001_003E' (Black or African American alone) respectively. We request data for 2020 from the ACS 5-year dataset ('acs/acs5' in the request URL). This information, along with our Census API key, allows us to extract the relevant data.

Using the Python "requests" library, we fetch the data from the URL and parse the response as JSON (JavaScript Object Notation), which makes the data easy to manipulate.
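
To make the role of each input explicit, the request URL can also be assembled by a small function. This is only a sketch: `build_acs_url` is a hypothetical helper, and its defaults (`acs/acs5` and `state:*`) are assumptions that mirror the query used in this notebook. No request is sent here.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.census.gov/data"


def build_acs_url(variables, year, key, dataset="acs/acs5", geography="state:*"):
    """Assemble a Census API query URL from its parts (hypothetical helper)."""
    params = {
        "get": ",".join(("NAME", *variables)),
        "for": geography,
        "key": key,
    }
    # Keep ',', ':' and '*' unescaped so the URL matches the hand-written form.
    query = urlencode(params, safe=",:*")
    return f"{BASE_URL}/{year}/{dataset}?{query}"


# "YOUR_KEY" is a placeholder, not a working key.
example_url = build_acs_url(("B01001_001E", "B02001_003E"), 2020, "YOUR_KEY")
```

Changing `variables` or `year` then produces a new query without editing the string by hand.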


In [4]:
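# Variables to request: total population and Black or African American alone
census_variables = ('B01001_001E', 'B02001_003E')
year = 2020
# Query the ACS 5-year dataset for every state
url = (
    f"https://api.census.gov/data/{year}/acs/acs5?get=NAME,{','.join(census_variables)}"
    f"&for=state:*&key={CENSUS_KEY}"
)
response = requests.get(url)
response.raise_for_status()  # stop early if the server reports an HTTP error
columns = response.json()[0]  # the first row of the payload holds the column names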

In [5]:
df = pd.read_json(response.text)
df

|   | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| 0 | NAME | B01001_001E | B02001_003E | state |
| 1 | Pennsylvania | 12794885 | 1419582 | 42 |
| 2 | California | 39346023 | 2250962 | 06 |
| 3 | West Virginia | 1807426 | 64285 | 54 |
| 4 | Utah | 3151239 | 38059 | 49 |
| 5 | New York | 19514849 | 3002401 | 36 |
| 6 | District of Columbia | 701974 | 318631 | 11 |
| 7 | Alaska | 736990 | 23894 | 02 |
| 8 | Florida | 21216924 | 3381061 | 12 |
| 9 | South Carolina | 5091517 | 1346560 | 45 |
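
In the output above, the column names appear as row 0 because the payload is a list of rows whose first entry is the header. A minimal sketch of separating the header from the data in plain Python follows; `payload_to_records` is a hypothetical helper, and the sample rows are copied from the output above.

```python
def payload_to_records(payload):
    """Turn a Census-style list of rows (header first) into a list of dicts."""
    if not payload or len(payload) < 2:
        raise ValueError("Expected a header row followed by at least one data row.")
    header, *rows = payload
    records = []
    for row in rows:
        if len(row) != len(header):
            raise ValueError(f"Row has {len(row)} fields, expected {len(header)}: {row}")
        records.append(dict(zip(header, row)))
    return records


# Sample rows copied from the output above; every value is a string.
sample = [
    ["NAME", "B01001_001E", "B02001_003E", "state"],
    ["Pennsylvania", "12794885", "1419582", "42"],
    ["California", "39346023", "2250962", "06"],
]
records = payload_to_records(sample)
```

Passing `records` to `pd.DataFrame` would give a data frame whose columns are already named after the header row.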


## Put JSON data into a Pandas dataframe

Once we have the data stored as JSON, we can convert it to a Pandas data frame, which is easier to read. The columns are renamed to identify the state name, the total population of each state, the African American population, and the state ID (assigned by the US Census).

We then convert the two population columns from strings to numeric values. This allows us to divide one column by the other to determine the percentage of African American population in each state.

In [6]:
df = pd.DataFrame(response.json()[1:]).rename(columns={0: 'NAME', 1: 'total_pop', 2: 'aa_pop', 3: 'state_id'})
df['total_pop'] = pd.to_numeric(df['total_pop'])
df['aa_pop'] = pd.to_numeric(df['aa_pop'])
df['aa_pct'] = (df['aa_pop'] / df['total_pop'] * 100).round()

df

|   | NAME | total_pop | aa_pop | state_id | aa_pct |
|---|---|---|---|---|---|
| 0 | Pennsylvania | 12794885 | 1419582 | 42 | 11.0 |
| 1 | California | 39346023 | 2250962 | 06 | 6.0 |
| 2 | West Virginia | 1807426 | 64285 | 54 | 4.0 |
| 3 | Utah | 3151239 | 38059 | 49 | 1.0 |
| 4 | New York | 19514849 | 3002401 | 36 | 15.0 |
| 5 | District of Columbia | 701974 | 318631 | 11 | 45.0 |
| 6 | Alaska | 736990 | 23894 | 02 | 3.0 |
| 7 | Florida | 21216924 | 3381061 | 12 | 16.0 |
| 8 | South Carolina | 5091517 | 1346560 | 45 | 26.0 |
| 9 | North Dakota | 760394 | 23959 | 38 | 3.0 |
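
The percentage column is simply the ratio of the two counts multiplied by 100 and rounded to a whole number. The sketch below repeats that arithmetic in plain Python for three rows of the table, keeping one decimal place so that small states remain distinguishable. `percent_share` is a hypothetical helper name used only for this illustration.

```python
def percent_share(part, total, ndigits=1):
    """Return part/total as a percentage rounded to ndigits decimal places."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(part / total * 100, ndigits)


# (aa_pop, total_pop) pairs copied from the table above.
examples = {
    "Pennsylvania": (1419582, 12794885),
    "District of Columbia": (318631, 701974),
    "Utah": (38059, 3151239),
}
shares = {name: percent_share(aa, total) for name, (aa, total) in examples.items()}
```

In the notebook itself, the same effect could be had by calling `.round(1)` instead of `.round()` on the computed column.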


## Map Creation- Step 1

Next, using our data frame, we will create a map of the United States that shows the total population, the African American population, and the percentage of African American population for each state.

To create our map, we need the state outlines and locations. For accuracy, we can query this data directly from the Census website, which provides shapefiles with the outlines of the states. We temporarily download the files and use them to build our map; the code deletes the temporary folder holding the shapefile afterwards. The shapefile is read into a GeoDataFrame, merged with the previously constructed data frame, and converted to GeoJSON.

In [7]:
from tempfile import TemporaryDirectory

import geopandas as gpd
import requests

# Download the zipped state boundary shapefile from the Census website
shape_zip = requests.get('https://www2.census.gov/geo/tiger/GENZ2018/shp/cb_2018_us_state_500k.zip').content

# Write the archive to a temporary folder that is removed when the block exits
with TemporaryDirectory() as temp_dir:
    with open(f"{temp_dir}/states.zip", "wb") as zip_file:
        zip_file.write(shape_zip)

    # Read the state shapes into a GeoDataFrame
    with open(f"{temp_dir}/states.zip", "rb") as zip_file:
        states_gdf = gpd.read_file(zip_file)

# Join the census columns onto the shapes by state name and export as GeoJSON
states_json = states_gdf.merge(df, on="NAME").to_json()
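
Because "merge" performs an inner join by default, any shape without a matching "NAME" in the data frame (and any data row without a matching shape) is dropped without warning. A quick way to see what would be lost is to compare the two sets of names before merging. The sketch below uses toy inputs, and `unmatched_names` is a hypothetical helper.

```python
def unmatched_names(shape_names, data_names):
    """Report names that an inner join on NAME would drop from either side."""
    shapes, data = set(shape_names), set(data_names)
    return {
        "shapes_without_data": sorted(shapes - data),
        "data_without_shapes": sorted(data - shapes),
    }


# Toy inputs; in the notebook these would be states_gdf["NAME"] and df["NAME"].
report = unmatched_names(
    ["Alaska", "Guam", "Utah"],
    ["Alaska", "Utah", "Puerto Rico"],
)
```

An empty list on both sides means every shape and every data row found a partner.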

## Map Creation- Step 2
Then, using the Choropleth class in Folium, we create a choropleth (a map in which each region is shaded according to a value) of the state populations from the census data. 
We fit the starting view of the map to the bounds of the contiguous United States. 
The choropleth shades each state by its percentage of African American population. This is set through the "columns" and "key_on" parameters. Lastly, to show details when hovering over a state, we use a Folium feature called a tooltip. 
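
To get a feel for how the percentages translate into color classes, it helps to compute the class boundaries by hand. Folium's documentation describes an integer "bins" argument (default 6) as equal-width bins between the minimum and maximum value; assuming that behavior, the sketch below reproduces the arithmetic for the ten percentages shown in the table earlier. `equal_width_edges` is a hypothetical helper and is not part of Folium.

```python
def equal_width_edges(values, bins=6):
    """Return bins + 1 equally spaced edges from min(values) to max(values)."""
    if bins < 1:
        raise ValueError("bins must be at least 1")
    lo, hi = min(values), max(values)
    step = (hi - lo) / bins
    # Set the last edge to hi exactly to avoid floating-point drift.
    return [lo + i * step for i in range(bins)] + [hi]


# aa_pct values from the ten rows displayed earlier.
sample_pct = [11, 6, 4, 1, 15, 45, 3, 16, 26, 3]
edges = equal_width_edges(sample_pct)
```

With one very high value (the District of Columbia at 45), equal-width classes push most states into the lowest bins, which is worth keeping in mind when reading the map.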

In [8]:
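# Note: newer Folium releases may no longer bundle the Stamen tiles;
# if this line raises an error, try tiles='OpenStreetMap' instead.
pop_map = folium.Map(tiles='Stamen Terrain', height=500)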

# Bounds for contiguous US - starting bounds for map
map_bounds = (
    (24.396308, -124.848974), (49.384358, -66.885444)
)
pop_map.fit_bounds(map_bounds)

cp = folium.Choropleth(
    geo_data=states_json,
    name="choropleth",
    data=df,
    columns=["NAME", "aa_pct"],
    key_on="feature.properties.NAME",
    fill_color="YlGn",
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name=f"Total Percent of African American/Black Population, {year}",
)
tooltip = folium.GeoJsonTooltip(
    fields=['NAME','aa_pct', 'aa_pop', 'total_pop'],
    aliases=['Name: ', 'African American pop %: ', 'African American Population: ', 'Total Population: '],
)

tooltip.add_to(cp.geojson)
cp.add_to(pop_map)

display(pop_map)