# Mean Income by Quantile Over Time

The recovery from the recession has looked very different for the bottom earners and the top earners, and it has looked very different in Bridgeport than it did nationally. To illustrate this, we made a graphic that tracks the percent change in mean income for four groups:

1.A The bottom 20% of income-earners in Bridgeport
1.B The top 5% of income-earners in Bridgeport
2.A The bottom 20% of income-earners nationally
2.B The top 5% of income-earners nationally

### ACS API

The mean income for each of these groups is available from the Census API for 2012 through 2016. We want the 1-year version of the American Community Survey. Details on using the API can be found here: https://www.census.gov/data/developers/data-sets/acs-1year.html
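
Before building requests, it may help to see the shape of what comes back. The sketch below parses an illustrative response body: a JSON list of lists with a header row followed by one data row. The numbers are the 2012 national values shown in the sample output later in this notebook; the header row, and the `raw` and `record` names, are assumptions made for illustration.

```python
import json

# Illustrative response body (not fetched live). The data values come from the
# 2012 national sample in this notebook; the header row is an assumption.
raw = '[["B19081_001E","B19081_006E","us"],["11361","319918","1"]]'

rows = json.loads(raw)

# First row is the header, second row holds the values for the one geography.
header, values = rows[0], rows[1]
record = dict(zip(header, values))

# The API returns numbers as strings, so cast before doing any arithmetic.
bottom_20 = int(record["B19081_001E"])
top_5 = int(record["B19081_006E"])
print(bottom_20, top_5)
```

Because each request here covers a single geography, only the second row matters, which is why the code below takes index `[1]` of the parsed response.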

In [35]:
# urllib.request must be imported explicitly; "import urllib" alone does not load it
import pandas,json,urllib.request,re,numpy,time

# API base url
base = "https://api.census.gov/data/"

#
# B19081_001E = mean income for the bottom 20% of income earners
# B19081_006E = mean income for the top 5% of income earners
#
fields = ["B19081_001E","B19081_006E"]

# Build a url for the given year, geography type, code, and fields
def construct_quant_url(year,geo_type,geo_code,quants):
    url = base + str(year)
    
    # The url structure is different for 2016 than it is for the other years because ¯\_(ツ)_/¯
    if (year == 2016):
        url += "/acs"
    url += "/acs1?get="
    # The API expects the variables as a comma-separated list after get=
    url += ",".join(quants)+"&"
    url+= "for="+geo_type
    if geo_code!='':
        url+=(":"+geo_code)
    return(url)

# Request & load data from a url
def url_to_data(url):
    req = urllib.request.Request(url)
    res = urllib.request.urlopen(req)
    data = json.load(res)
    return data

# Gets the API data and formats it as a list of lists.
#
# Sample output for make_long_data(2012,2014,'us','',quants=fields)
#
# [['B19081_001E', 'B19081_006E', 'geo', 'year'],
# ['11361', '319918', '1', 2012],
# ['11544', '339950', '1', 2013],
# ['11859', '346522', '1', 2014]]
#
def make_long_data(min_year,max_year,geo_type,geo_code,quants):
    # Build the header from the quants argument rather than the global fields list
    tar = [list(quants)+["geo","year"]]
    for year in range(min_year,max_year+1):
        url = construct_quant_url(year,geo_type,geo_code,quants)
        new_row = url_to_data(url)[1]
        new_row.append(year)
        tar.append(new_row)
    return tar

# Get data for our fields 2012-2016, for Bridgeport's MSA -- fips code 14860
bridgeport_data = make_long_data(2012,2016,"metropolitan%20statistical%20area/micropolitan%20statistical%20area","14860",fields)

# Get data for our fields 2012-2016, nationally -- don't need a fips code
national_data = make_long_data(2012,2016,"us","",fields)

In [None]:
print(bridgeport_data)
bp_df = pandas.DataFrame(bridgeport_data[1:],columns = bridgeport_data[0])
nat_df = pandas.DataFrame(national_data[1:],columns = national_data[0])

bp_df = bp_df.rename(columns={"B19081_001E":"20","B19081_006E":"95"})
nat_df = nat_df.rename(columns={"B19081_001E":"20","B19081_006E":"95"})
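
The API values arrive as strings, so they need to be converted before computing the percent change that the graphic tracks. Here is a minimal sketch of that step, using the three national sample rows from the comment above instead of a live request. It assumes the baseline is the earliest year in the frame; the names `sample`, `df`, and `pct` are hypothetical and not part of the original analysis.

```python
import pandas

# Rows copied from the sample output shown earlier (national, 2012-2014).
sample = [["B19081_001E", "B19081_006E", "geo", "year"],
          ["11361", "319918", "1", 2012],
          ["11544", "339950", "1", 2013],
          ["11859", "346522", "1", 2014]]

df = pandas.DataFrame(sample[1:], columns=sample[0])
df = df.rename(columns={"B19081_001E": "20", "B19081_006E": "95"})

# Convert the string values to numbers so arithmetic works.
df[["20", "95"]] = df[["20", "95"]].apply(pandas.to_numeric)

# Assumption: percent change is measured against the earliest year present.
df = df.sort_values("year").reset_index(drop=True)
pct = df[["year"]].copy()
for col in ["20", "95"]:
    base_value = df[col].iloc[0]
    pct[col] = (df[col] - base_value) / base_value * 100

print(pct)
```

The same steps should apply to `bp_df` and `nat_df` once a baseline year has been chosen.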

### Earlier Years

We want data going back to 2007. Luckily, the Census collected these numbers. Unluckily, they're not available through the API. We went through and found [TABLE TITLE] from American FactFinder for 2007-2011 (5 tables), downloaded the spreadsheets, and 