In this notebook, I walk through how to download ACS SES data with the CensusData package in Python. At the bottom are some questions for next steps. -val

## helpful documentation: https://jtleider.github.io/censusdata/

# 1. Install CensusData Package

In [1]:
# Needs to be reinstalled every few hours; consider creating a covidcrew env to keep installed packages
!pip install CensusData

Collecting CensusData
  Downloading CensusData-1.8.tar.gz (23.2 MB)
     |████████████████████████████████| 23.2 MB 3.2 MB/s eta 0:00:01
Building wheels for collected packages: CensusData
  Building wheel for CensusData (setup.py) ... done
  Created wheel for CensusData: filename=CensusData-1.8-py3-none-any.whl size=24706120 sha256=337b809ecbbb8de0de0963b7d687cfd7ac199796e3bd79f20a6be522a194c899
  Stored in directory: /home/jovyan/.cache/pip/wheels/eb/74/d3/75a737e0305a81270bd9a0129077c208a4334e3c202e9d4274
Successfully built CensusData
Installing collected packages: CensusData
Successfully installed CensusData-1.8
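
Because the package has to be reinstalled from time to time, one option is to install it only when the import is actually missing. The following is a minimal sketch using only the standard library; `ensure_installed` is a hypothetical helper name, not part of CensusData.

```python
import importlib.util
import subprocess
import sys


def ensure_installed(module_name, pip_name=None, install=True):
    """Return True if `module_name` can be imported, installing it first if allowed."""
    if importlib.util.find_spec(module_name) is not None:
        return True  # already available, nothing to do
    if not install:
        return False
    # Use the current interpreter's pip so the package lands in this environment.
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", pip_name or module_name]
    )
    importlib.invalidate_caches()
    return importlib.util.find_spec(module_name) is not None


# Example call for this notebook (commented out so nothing is installed by accident):
# ensure_installed("censusdata", pip_name="CensusData")
```

The import name (`censusdata`) and the pip name (`CensusData`) differ in capitalization, which is why the sketch takes both.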


In [7]:
# We also need to import these two

import pandas as pd
import censusdata

# 2. Determine the data you want:

### To find the code for the marker you're interested in, download the Excel sheet 'table shells' from https://www.census.gov/programs-surveys/acs/technical-documentation/summary-file-documentation.html and look up the marker there

In [18]:
# here's what that table shells sheet looks like
table_shells = pd.read_excel('ACS2018_Table_Shells.xlsx')
table_shells[:10]

Unnamed: 0,Table ID,Line,UniqueID,Stub,Data Release
0,,,,,
1,B00001,,,UNWEIGHTED SAMPLE COUNT OF THE POPULATION,15.0
2,B00001,,,Universe: Total population,
3,B00001,1.0,B00001_001,Total,
4,,,,,
5,B00002,,,UNWEIGHTED SAMPLE HOUSING UNITS,15.0
6,B00002,,,Universe: Housing units,
7,B00002,1.0,B00002_001,Total,
8,,,,,
9,B01001,,,SEX BY AGE,15.0
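
The `UniqueID` column is the starting point for the codes passed to the download call: the ACS API appends `E` for the estimate and `M` for its margin of error. A small sketch of that conversion, where `to_variable_code` is a hypothetical helper written for illustration:

```python
def to_variable_code(unique_id, kind="E"):
    """Turn a table-shells UniqueID such as 'B01001_001' into an API variable name."""
    if kind not in ("E", "M"):
        raise ValueError("kind must be 'E' (estimate) or 'M' (margin of error)")
    return unique_id.strip() + kind


unique_ids = ["B01001_001", "B01002_001", "B19013_001"]
estimate_codes = [to_variable_code(uid) for uid in unique_ids]
print(estimate_codes)
```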


In [24]:
markers_codes = ['B01001_001E', 'B01002_001E', 'B19013_001E', 'B02001_002E','B02001_003E','B02001_004E','B02001_005E','B02001_006E', 'B02001_007E','B02001_008E']

In [25]:
markers = ['population size', 'median age', 'median household income', 'White alone', 'Black or African American alone','American Indian and Alaska Native alone',
'Asian alone', 'Native Hawaiian and Other Pacific Islander alone','Some other race alone','Two or more races']
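
Keeping the codes and the labels in two separate lists works only as long as both stay in the same order. One alternative, sketched here as a suggestion rather than a requirement, is a single dictionary from which both lists are derived; the name `variables` is an assumption of this sketch.

```python
# One place that ties each ACS variable code to its human-readable label.
variables = {
    "B01001_001E": "population size",
    "B01002_001E": "median age",
    "B19013_001E": "median household income",
    "B02001_002E": "White alone",
    "B02001_003E": "Black or African American alone",
    "B02001_004E": "American Indian and Alaska Native alone",
    "B02001_005E": "Asian alone",
    "B02001_006E": "Native Hawaiian and Other Pacific Islander alone",
    "B02001_007E": "Some other race alone",
    "B02001_008E": "Two or more races",
}

markers_codes = list(variables)      # codes, in insertion order
markers = list(variables.values())   # labels, in the same order
```

With this layout, a downloaded DataFrame could be relabeled with `df.rename(columns=variables)`, which matches columns by code instead of by position.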


# 3. Determine the geographies you want

In [19]:
# find the state code for MA
censusdata.geographies(censusdata.censusgeo([('state', '*')]), 'acs5', 2015)

{'Alabama': censusgeo((('state', '01'),)),
 'Alaska': censusgeo((('state', '02'),)),
 'Arizona': censusgeo((('state', '04'),)),
 'Arkansas': censusgeo((('state', '05'),)),
 'California': censusgeo((('state', '06'),)),
 'Colorado': censusgeo((('state', '08'),)),
 'Connecticut': censusgeo((('state', '09'),)),
 'Delaware': censusgeo((('state', '10'),)),
 'District of Columbia': censusgeo((('state', '11'),)),
 'Florida': censusgeo((('state', '12'),)),
 'Georgia': censusgeo((('state', '13'),)),
 'Hawaii': censusgeo((('state', '15'),)),
 'Idaho': censusgeo((('state', '16'),)),
 'Illinois': censusgeo((('state', '17'),)),
 'Indiana': censusgeo((('state', '18'),)),
 'Iowa': censusgeo((('state', '19'),)),
 'Kansas': censusgeo((('state', '20'),)),
 'Kentucky': censusgeo((('state', '21'),)),
 'Louisiana': censusgeo((('state', '22'),)),
 'Maine': censusgeo((('state', '23'),)),
 'Maryland': censusgeo((('state', '24'),)),
 'Massachusetts': censusgeo((('state', '25'),)),
 'Michigan': censusgeo((('stat

### MA state code: 'Massachusetts': ('state', '25')

In [5]:
# find the county codes for counties in MA
censusdata.geographies(censusdata.censusgeo([('state','25'),('county', '*')]), 'acs5', 2015)

{'Barnstable County, Massachusetts': censusgeo((('state', '25'), ('county', '001'))),
 'Berkshire County, Massachusetts': censusgeo((('state', '25'), ('county', '003'))),
 'Bristol County, Massachusetts': censusgeo((('state', '25'), ('county', '005'))),
 'Dukes County, Massachusetts': censusgeo((('state', '25'), ('county', '007'))),
 'Essex County, Massachusetts': censusgeo((('state', '25'), ('county', '009'))),
 'Franklin County, Massachusetts': censusgeo((('state', '25'), ('county', '011'))),
 'Hampden County, Massachusetts': censusgeo((('state', '25'), ('county', '013'))),
 'Hampshire County, Massachusetts': censusgeo((('state', '25'), ('county', '015'))),
 'Middlesex County, Massachusetts': censusgeo((('state', '25'), ('county', '017'))),
 'Nantucket County, Massachusetts': censusgeo((('state', '25'), ('county', '019'))),
 'Norfolk County, Massachusetts': censusgeo((('state', '25'), ('county', '021'))),
 'Plymouth County, Massachusetts': censusgeo((('state', '25'), ('county', '023'

### We're interested in Norfolk (021), Suffolk (025), and Middlesex (017)
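
Picking the codes out of the printed dictionary by eye works for three counties; for longer lists it can be done in code. The sketch below uses plain tuples as a stand-in for the `censusgeo` objects so it runs without a network call. With the real objects, `geo.params()` should give the same tuple of pairs, but treat that as an assumption to check against the documentation. The Suffolk entry is taken from the code listed above, since the printed output is cut off before it, and `county_codes` is a hypothetical helper.

```python
# Stand-in for the dictionary returned by censusdata.geographies(...):
# plain tuples of (level, code) pairs instead of censusgeo objects.
county_geos = {
    "Middlesex County, Massachusetts": (("state", "25"), ("county", "017")),
    "Nantucket County, Massachusetts": (("state", "25"), ("county", "019")),
    "Norfolk County, Massachusetts": (("state", "25"), ("county", "021")),
    "Suffolk County, Massachusetts": (("state", "25"), ("county", "025")),
}


def county_codes(geos, wanted):
    """Map short county names (e.g. 'Norfolk') to their county codes."""
    codes = {}
    for full_name, params in geos.items():
        short_name = full_name.split(" County")[0]
        if short_name in wanted:
            codes[short_name] = dict(params)["county"]
    return codes


selected = county_codes(county_geos, {"Norfolk", "Suffolk", "Middlesex"})
print(selected)
```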

# 4. Download data you want

### You can download data for the state of MA by county code

In [28]:
# for Norfolk, county code = 021
acs_norfolk = censusdata.download('acs5', 2015,censusdata.censusgeo([('state', '25'), ('county','021')]), markers_codes)
acs_norfolk.columns=markers
acs_norfolk

Unnamed: 0,population size,median age,median household income,White alone,Black or African American alone,American Indian and Alaska Native alone,Asian alone,Native Hawaiian and Other Pacific Islander alone,Some other race alone,Two or more races
"Norfolk County, Massachusetts: Summary level: 050, state:25> county:021",687721,40.9,88262,555100,43069,564,66682,35,8096,14175


In [52]:
# I'm still learning to code, so let's copy the values into a list to work with the data more easily:
data_norfolk = [687721, 40.9 ,88262 ,555100 ,43069 ,564 ,66682 ,35 ,8096 ,14175]
data_norfolk

[687721, 40.9, 88262, 555100, 43069, 564, 66682, 35, 8096, 14175]
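
A bare list loses track of which number is which. Pairing the values with their labels keeps them readable and makes simple checks possible, for example that the race categories add up to the total population. A minimal sketch using the Norfolk values from above; the variable names are illustrative only.

```python
labels = [
    "population size", "median age", "median household income",
    "White alone", "Black or African American alone",
    "American Indian and Alaska Native alone", "Asian alone",
    "Native Hawaiian and Other Pacific Islander alone",
    "Some other race alone", "Two or more races",
]
data_norfolk = [687721, 40.9, 88262, 555100, 43069, 564, 66682, 35, 8096, 14175]

# Pair each label with its value so entries can be looked up by name.
norfolk = dict(zip(labels, data_norfolk))

# The last seven entries are the race categories from table B02001.
race_labels = labels[3:]
race_total = sum(norfolk[label] for label in race_labels)

# Share of the total population in each race category.
shares = {label: norfolk[label] / norfolk["population size"] for label in race_labels}
for label, share in shares.items():
    print(f"{label}: {share:.1%}")
```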

In [29]:
# for Suffolk, county code = 025
acs_suffolk = censusdata.download('acs5', 2015,censusdata.censusgeo([('state', '25'), ('county','025')]), markers_codes)
acs_suffolk.columns=markers
acs_suffolk

Unnamed: 0,population size,median age,median household income,White alone,Black or African American alone,American Indian and Alaska Native alone,Asian alone,Native Hawaiian and Other Pacific Islander alone,Some other race alone,Two or more races
"Suffolk County, Massachusetts: Summary level: 050, state:25> county:025",758919,32.2,55044,421489,169946,2593,65396,107,56190,43198


In [53]:
# I'm still learning to code, so let's copy the values into a list to work with the data more easily:
data_suffolk = [758919 ,32.2 ,55044 ,421489 ,169946 ,2593 ,65396 ,107 ,56190 ,43198]
data_suffolk

[758919, 32.2, 55044, 421489, 169946, 2593, 65396, 107, 56190, 43198]

In [42]:
# for Middlesex, county code = 017
acs_middlesex = censusdata.download('acs5', 2015,censusdata.censusgeo([('state', '25'), ('county','017')]), markers_codes)
acs_middlesex.columns=markers
acs_middlesex

Unnamed: 0,population size,median age,median household income,White alone,Black or African American alone,American Indian and Alaska Native alone,Asian alone,Native Hawaiian and Other Pacific Islander alone,Some other race alone,Two or more races
"Middlesex County, Massachusetts: Summary level: 050, state:25> county:017",1556116,38.5,85118,1230158,75980,2074,163386,424,37162,46932
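
The three download cells differ only in the county code, so they could be folded into one loop. The sketch below reuses the same `censusdata.censusgeo` and `censusdata.download` calls as the cells above; `COUNTY_CODES`, `county_geo_spec`, and `download_counties` are hypothetical names, and `censusdata` is imported inside the function so the helpers can be defined even when the package is not installed.

```python
COUNTY_CODES = {"Norfolk": "021", "Suffolk": "025", "Middlesex": "017"}


def county_geo_spec(county_code, state_code="25"):
    """Build the (level, code) pairs that censusgeo expects for one county."""
    return [("state", state_code), ("county", county_code)]


def download_counties(variable_codes, county_codes=COUNTY_CODES,
                      survey="acs5", year=2015):
    """Download the same variables for several counties; needs network access."""
    import censusdata  # imported lazily, only when a download is requested

    tables = {}
    for name, code in county_codes.items():
        geo = censusdata.censusgeo(county_geo_spec(code))
        tables[name] = censusdata.download(survey, year, geo, variable_codes)
    return tables


# Example call (commented out because it contacts the Census API):
# tables = download_counties(markers_codes)
```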


In [54]:
# I'm still learning to code, so let's copy the values into a list to work with the data more easily:
data_middlesex = [1556116 ,38.5, 85118 ,1230158 ,75980 ,2074 ,163386 ,424 ,37162 ,46932]
data_middlesex

[1556116, 38.5, 85118, 1230158, 75980, 2074, 163386, 424, 37162, 46932]

## Make a table with the values you want

In [74]:
table_allcounties = pd.DataFrame({"Middlesex": data_middlesex , "Suffolk": data_suffolk, "Norfolk": data_norfolk })
table_allcounties = table_allcounties.T

table_allcounties.columns= markers
table_allcounties

Unnamed: 0,population size,median age,median household income,White alone,Black or African American alone,American Indian and Alaska Native alone,Asian alone,Native Hawaiian and Other Pacific Islander alone,Some other race alone,Two or more races
Middlesex,1556116.0,38.5,85118.0,1230158.0,75980.0,2074.0,163386.0,424.0,37162.0,46932.0
Suffolk,758919.0,32.2,55044.0,421489.0,169946.0,2593.0,65396.0,107.0,56190.0,43198.0
Norfolk,687721.0,40.9,88262.0,555100.0,43069.0,564.0,66682.0,35.0,8096.0,14175.0
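
The table above shows every value as a float (for example `1556116.0`) because each county's list mixes integers with one decimal value before the transpose. Stacking the per-county DataFrames with `pd.concat` avoids both the retyping and the float conversion. This is a sketch under assumptions: the three small frames below are stand-ins built from the numbers above (first three columns only), with plain strings as the index, whereas the frames returned by the download use geography objects there.

```python
import pandas as pd

columns = ["population size", "median age", "median household income"]

# Stand-ins for the downloaded one-row DataFrames (values copied from above).
acs_middlesex = pd.DataFrame([[1556116, 38.5, 85118]],
                             index=["Middlesex County, Massachusetts"], columns=columns)
acs_suffolk = pd.DataFrame([[758919, 32.2, 55044]],
                           index=["Suffolk County, Massachusetts"], columns=columns)
acs_norfolk = pd.DataFrame([[687721, 40.9, 88262]],
                           index=["Norfolk County, Massachusetts"], columns=columns)

# Stack the rows; each column keeps its own dtype (ints stay ints).
table = pd.concat([acs_middlesex, acs_suffolk, acs_norfolk])

# Replace the long geography labels with short county names, in the same order.
table.index = ["Middlesex", "Suffolk", "Norfolk"]
print(table)
```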


# Questions:

1. How do we select just the cities grouped by North and South regions from this list?
- Joey is providing county names for the north and south locations
- consider a weighted average of sorts
2. How do we access demographic data (race/ethnicity)? The documentation I'm following suggests this data is published in the decennial census data (code: 'sf1').
- Pranjali will find the code for demographic data, which may only be present at the county level
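
For the weighted-average idea in question 1, one possible starting point is a population-weighted average across the counties in a region. The sketch below uses the three counties from this notebook as if they formed one region, which is purely an assumption for illustration. Note that a weighted average of county medians only approximates the region's true median.

```python
def weighted_average(values, weights):
    """Average of `values`, each counted in proportion to its weight."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Values copied from the county table above.
populations = {"Middlesex": 1556116, "Suffolk": 758919, "Norfolk": 687721}
median_ages = {"Middlesex": 38.5, "Suffolk": 32.2, "Norfolk": 40.9}

names = list(populations)
avg_age = weighted_average(
    [median_ages[name] for name in names],
    [populations[name] for name in names],
)
print(round(avg_age, 2))
```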