# Census API Metric Codes

| Race | Code |
|------|------|
| Total|B03002_001E|
| Black|B03002_004E|
| Asian|B03002_006E|
| Native Hawaiian Pacific Islander|B03002_007E|
| Other|B03002_008E|
| Hispanic or Latino|B03002_012E|
| 2 or More Races|B03002_010E|

| Citizenship / Immigration | Code |
|------|------|
| Foreign Born 1|B06007_033E|
| Foreign Born 2|B05002_013E|
| Not a U.S. citizen|B05001_006E|
| Speak Spanish, speak English less than very well|B06007_037E|
| Speak other languages, speak English less than very well|B06007_040E|

| Income | Code |
|------|------|
| Total income population|B19001_001E|
| Total income less than 10k|B19001_002E|
| Total income 10-15k|B19001_003E|

| Education | Code |
|------|------|
| Less than HS graduate |B07009_002E|
| High school graduate |B07009_003E|
| Some college or associate's degree |B07009_004E|
| Grad or professional degree |B07009_006E|
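
The tables above can also be kept as a single lookup in code, so the request string and the column labels come from one place. This is only a sketch: the name `VARIABLES` is introduced here for illustration, and the labels are borrowed from the column names used later in this notebook.

```python
# Hypothetical lookup built from the tables above: ACS variable code -> column label.
VARIABLES = {
    "B03002_001E": "Total_Race",
    "B03002_004E": "Black",
    "B03002_006E": "Asian",
    "B03002_007E": "Native_Hawaiian_Pacific_Islander",
    "B03002_008E": "Other",
    "B03002_010E": "Two_or_More_Races",
    "B03002_012E": "Hispanic_or_Latino",
    "B06007_033E": "Foreign_Born_1",
    "B05002_013E": "Foreign_Born_2",
    "B05001_006E": "Not_a_us_Citizen",
    "B06007_037E": "Speak_spanish_little_English",
    "B06007_040E": "Speak_other_little_english",
    "B19001_001E": "Total_income_population",
    "B19001_002E": "Total_income_less_than_10k",
    "B19001_003E": "Total_income_10-15k",
    "B07009_002E": "Less_than_HS",
    "B07009_003E": "HS_grad",
    "B07009_004E": "College_grad",
    "B07009_006E": "Graduate_or_professional",
}

# Comma-separated value for the `get=` part of the request, NAME first.
get_clause = ",".join(["NAME"] + list(VARIABLES))
print(get_clause)
```

Because dictionaries keep insertion order, `get_clause` lists the codes in the same order as the tables.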

In [1]:
import pandas as pd
import numpy as np
import geopandas as gpd
import requests

Let's first check which state codes return data, and store the codes whose requests come back empty in a list.

In [4]:
# Candidate two-digit state codes "00" through "59"
full_state_test = ["%.2d" % i for i in range(0,60)]
bad_apples = []
def state_checker(full_state_test):
    for i in full_state_test:
        url = ("https://api.census.gov/data/2015/acs5?get=NAME,B03002_001E"+
               "&for=tract:*&in=state:" + i + "&key=14ba39dd26088efd8d54c4f01d90023f2d4bfc6d")
        response_code = requests.get(url).status_code
        # Anything other than 200 (e.g. 204, no content) means there is no data for this code
        if response_code != 200:
            bad_apples.append([i, response_code])
state_checker(full_state_test)
print("These states return no content. Bad Apples :(\n", bad_apples)

These states return no content. Bad Apples :(
 [['00', 204], ['03', 204], ['07', 204], ['14', 204], ['43', 204], ['52', 204], ['57', 204], ['58', 204], ['59', 204]]


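Counting states alphabetically, the empty codes 03, 07, 14 and 43 would be Arizona (3), Connecticut (7), Indiana (14) and Texas (43), but no state is actually missing. State FIPS codes are not a gapless alphabetical index: those four states are coded 04, 09, 18 and 48, and the empty codes (including 52) are simply not assigned to any state.
The gaps are also why values beyond 50 exist: the 50 states plus the District of Columbia (11) run from 01 to 56.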

http://memorize.com/us-states-in-alphabetical-order
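
As a quick QA sketch, the list of valid state codes can be built directly and compared with the codes that came back empty. This assumes the unassigned codes in the 01 to 56 range are 03, 07, 14, 43 and 52; `UNASSIGNED`, `state_fips` and `empty_in_range` are names introduced here for illustration.

```python
# Assumption: these two-digit codes are not assigned to any state or DC.
UNASSIGNED = {3, 7, 14, 43, 52}

# Two-digit FIPS codes for the 50 states plus the District of Columbia.
state_fips = ["%.2d" % i for i in range(1, 57) if i not in UNASSIGNED]

# Codes in the tested range 00-59 that are not valid state codes.
empty_in_range = ["%.2d" % i for i in range(0, 60) if "%.2d" % i not in state_fips]

print(len(state_fips))
print(empty_in_range)
```

If the assumption holds, `empty_in_range` should match the codes reported as bad apples above.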

In [14]:
# Pull all tract-level rows for one state code; the response is a list of rows, header row first
def pull_census(state, url_yes_no):
    url = ("https://api.census.gov/data/2015/acs5?get=NAME,B03002_001E,B03002_004E,B03002_006E," +
           "B03002_007E,B03002_008E,B03002_010E,B03002_012E," +
           "B06007_033E,B05002_013E,B05001_006E,B06007_037E,B06007_040E,B19001_001E,B19001_002E,B19001_003E," +
           "B07009_002E,B07009_003E,B07009_004E,B07009_006E" +
           "&for=tract:*&in=state:" + state + "&key=14ba39dd26088efd8d54c4f01d90023f2d4bfc6d")
    # Optionally print the request URL for debugging
    if url_yes_no:
        print(url)
    data = requests.get(url).json()
    return data
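
One possible variation, sketched here under assumptions, is to assemble the same kind of URL from parts and read the API key from an environment variable instead of writing it into the notebook. The helper `build_census_url` and the variable name `CENSUS_API_KEY` are hypothetical, not part of the original code.

```python
import os
from urllib.parse import urlencode

BASE_URL = "https://api.census.gov/data/2015/acs5"

def build_census_url(state, variables, key=None):
    """Build a tract-level request URL for one two-digit state code."""
    params = {
        "get": ",".join(["NAME"] + list(variables)),
        "for": "tract:*",
        "in": "state:" + state,
    }
    # Only add the key parameter when a key was supplied.
    if key:
        params["key"] = key
    # Keep ':', '*' and ',' unescaped so the URL reads like the hand-built one.
    return BASE_URL + "?" + urlencode(params, safe=":*,")

# CENSUS_API_KEY is an assumed environment variable name; if unset, no key is sent.
url = build_census_url("01", ["B03002_001E", "B03002_004E"],
                       key=os.environ.get("CENSUS_API_KEY"))
print(url)
```

The resulting string could be passed to `requests.get` in the same way as the URL built inside `pull_census`.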

In [16]:
#Make a master list range and remove the bad apples
bad_codes = [code for code, _ in bad_apples]
master_list = ["%.2d" % i for i in range(1,60)]
master_list = [i for i in master_list if i not in bad_codes]

#Then stitch together all the data frames for the remaining dataset
#(DataFrame.append was removed in pandas 2.0, so collect the frames and concatenate once)
frames = []
for i in master_list:
    newstate = pull_census(i, False)
    #The first row of each response holds the column names, so drop it from the data
    frames.append(pd.DataFrame(newstate, columns = newstate[0])[1:])
master = pd.concat(frames)
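
The stitching step can be tried offline with small stand-in responses. The sketch below assumes each response is a list of rows with the header first and the values as strings; `resp_a`, `resp_b` and the tract rows in them are made up for illustration and are not real census data.

```python
import pandas as pd

# Hypothetical stand-ins for two API responses: header row first, then data rows.
resp_a = [["NAME", "B03002_001E", "state", "county", "tract"],
          ["Census Tract 1, Example County, Alabama", "100", "01", "001", "000100"]]
resp_b = [["NAME", "B03002_001E", "state", "county", "tract"],
          ["Census Tract 2, Sample County, Alaska", "200", "02", "013", "000200"],
          ["Census Tract 3, Sample County, Alaska", "300", "02", "013", "000300"]]

# One frame per response, using the first row as column names.
frames = [pd.DataFrame(r[1:], columns=r[0]) for r in (resp_a, resp_b)]

# Concatenate once and renumber the rows from 0.
stitched = pd.concat(frames, ignore_index=True)

# Counts arrive as strings in this sketch, so convert before doing arithmetic.
stitched["B03002_001E"] = pd.to_numeric(stitched["B03002_001E"])
print(stitched.shape)
```

Collecting the frames in a list and calling `pd.concat` once avoids copying the growing frame on every iteration.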

In [19]:
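#Column Creation
master["GEOID"] = master['state'] + master['county'] + master['tract']
#NAME looks like "Census Tract 201, Autauga County, Alabama"; strip the space left after each comma
master["County Name"] = master["NAME"].str.split(",").str[1].str.strip()
master["State Name"] = master["NAME"].str.split(",").str[2].str.strip()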
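
A tract GEOID is the 2-digit state code, the 3-digit county code and the 6-digit tract code joined together, so it only works when the leading zeros are intact. The sketch below shows the same two steps on plain strings; `make_geoid` and `split_name` are hypothetical helpers written for illustration.

```python
def make_geoid(state, county, tract):
    """Join state (2 digits), county (3) and tract (6) into an 11-digit GEOID."""
    # zfill restores leading zeros in case the codes were read back as numbers.
    return str(state).zfill(2) + str(county).zfill(3) + str(tract).zfill(6)

def split_name(name):
    """Split 'Census Tract 201, Autauga County, Alabama' into its three parts."""
    return [part.strip() for part in name.split(",")]

geoid = make_geoid("1", "1", "20100")
tract_name, county_name, state_name = split_name("Census Tract 201, Autauga County, Alabama")
print(geoid, county_name, state_name)
```

Stripping each part avoids the leading space that a bare split on `","` leaves in front of the county and state names.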

In [20]:
#Columns are renamed by position, so this list must follow the order of the variables in the API request
#Note: 'College_grad' holds B07009_004E (some college or associate's degree)
master.columns = ['Name', 'Total_Race', 'Black', 'Asian', 'Native_Hawaiian_Pacific_Islander', 'Other', 'Two_or_More_Races', 'Hispanic_or_Latino',
                  'Foreign_Born_1', 'Foreign_Born_2', 'Not_a_us_Citizen', 'Speak_spanish_little_English', 'Speak_other_little_english',
                  'Total_income_population', 'Total_income_less_than_10k', 'Total_income_10-15k',
                  'Less_than_HS', 'HS_grad', 'College_grad', 'Graduate_or_professional',
                  'state', 'county', 'tract', 'GEOID', 'County Name', 'State Name']
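
Assigning `master.columns` relies on the list being in exactly the right order. A more forgiving alternative, sketched here on a one-row frame, is to rename by variable code with `DataFrame.rename`; `RENAMES` and `df` are illustrative names and cover only three of the columns.

```python
import pandas as pd

# Hypothetical partial mapping from API column names to readable labels.
RENAMES = {"NAME": "Name", "B03002_001E": "Total_Race", "B03002_004E": "Black"}

# One-row frame with the raw API column names.
df = pd.DataFrame(
    [["Census Tract 201, Autauga County, Alabama", "1948", "150"]],
    columns=["NAME", "B03002_001E", "B03002_004E"],
)

# Rename by key: columns missing from the mapping are left unchanged.
df = df.rename(columns=RENAMES)
print(list(df.columns))
```

With this approach a reordered or extended request does not silently shift the labels onto the wrong columns.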

In [22]:
print("Dataframe Size", master.shape)

Dataframe Size (73056, 26)


In [21]:
master.head()

| | Name | Total_Race | Black | Asian | Native_Hawaiian_Pacific_Islander | Other | Two_or_More_Races | Hispanic_or_Latino | Foreign_Born_1 | Foreign_Born_2 | ... | Less_than_HS | HS_grad | College_grad | Graduate_or_professional | state | county | tract | GEOID | County Name | State Name |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Census Tract 201, Autauga County, Alabama | 1948 | 150 | 12 | 0 | 0 | 0 | 17 | 45 | 45 | ... | 184 | 459 | 258 | 176 | 1 | 1 | 20100 | 1001020100 | Autauga County | Alabama |
| 2 | Census Tract 202, Autauga County, Alabama | 2156 | 1149 | 50 | 0 | 0 | 0 | 17 | 43 | 43 | ... | 356 | 496 | 342 | 70 | 1 | 1 | 20200 | 1001020200 | Autauga County | Alabama |
| 3 | Census Tract 203, Autauga County, Alabama | 2968 | 551 | 41 | 8 | 0 | 0 | 0 | 35 | 35 | ... | 221 | 747 | 674 | 192 | 1 | 1 | 20300 | 1001020300 | Autauga County | Alabama |
| 4 | Census Tract 204, Autauga County, Alabama | 4423 | 162 | 0 | 0 | 48 | 5 | 464 | 133 | 133 | ... | 339 | 1044 | 806 | 257 | 1 | 1 | 20400 | 1001020400 | Autauga County | Alabama |
| 5 | Census Tract 205, Autauga County, Alabama | 10763 | 2674 | 412 | 0 | 0 | 49 | 80 | 346 | 346 | ... | 310 | 1674 | 1999 | 1162 | 1 | 1 | 20500 | 1001020500 | Autauga County | Alabama |
