### Obtaining data from US Census American Community Survey (ACS) 

This section pulls data from the 2014-2018 ACS 5-year estimates. I will focus only on the Oregon counties discussed with the Project Consultant: Multnomah, Lane, Marion, and Polk. The full dataset is massive, so I will start with something manageable for now.

The tutorial for the CensusData package can be found [HERE](https://towardsdatascience.com/accessing-census-data-with-python-3e2f2b56e20d).

In [4]:
import pandas as pd
import censusdata
pd.set_option('display.expand_frame_repr', False)
pd.set_option('display.precision', 2)

Now use the `search` function of the CensusData package to look up variable codes by concept or label.


In [6]:
sample = censusdata.search('acs5', 2018,'concept', 'sex')
print(len(sample))

6755


In [7]:
print(sample[0])

('B01001A_001E', 'SEX BY AGE (WHITE ALONE)', 'Estimate!!Total')


In [9]:
print(sample[4])

('B01001A_005E', 'SEX BY AGE (WHITE ALONE)', 'Estimate!!Total!!Male!!10 to 14 years')
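
With thousands of matches, it can help to narrow the search results before reading through them. The sketch below assumes the results are plain `(variable, concept, label)` tuples, as the printed samples suggest. `in_table` is a hypothetical helper, not part of CensusData, and the `rows` list reuses tuples printed in this notebook so the example runs without calling the API.

```python
def in_table(results, table_id):
    """Keep only the search hits whose variable code belongs to table_id."""
    prefix = table_id + "_"
    return [row for row in results if row[0].startswith(prefix)]


# Stand-in for the list returned by censusdata.search (tuples copied from this notebook).
rows = [
    ("B01001A_001E", "SEX BY AGE (WHITE ALONE)", "Estimate!!Total"),
    ("B01001A_005E", "SEX BY AGE (WHITE ALONE)",
     "Estimate!!Total!!Male!!10 to 14 years"),
    ("B06011PR_001E",
     "MEDIAN INCOME IN THE PAST 12 MONTHS (IN 2018 INFLATION-ADJUSTED DOLLARS) "
     "BY PLACE OF BIRTH IN PUERTO RICO",
     "Estimate!!Median income in the past 12 months --!!Total"),
]

hits = in_table(rows, "B01001A")
for code, concept, label in hits:
    # The "!!" separators mark nesting levels in the label.
    print(code, "|", label.replace("!!", " > "))
```

The same filter could be applied to `sample` from the cell above, for example `in_table(sample, 'B01001')` to drop the race-specific variants.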


In [23]:
censusdata.search('acs5', 2018, 'label', 'median income', 'detail')[0:100]

[('B06011PR_001E',
  'MEDIAN INCOME IN THE PAST 12 MONTHS (IN 2018 INFLATION-ADJUSTED DOLLARS) BY PLACE OF BIRTH IN PUERTO RICO',
  'Estimate!!Median income in the past 12 months --!!Total'),
 ('B06011PR_002E',
  'MEDIAN INCOME IN THE PAST 12 MONTHS (IN 2018 INFLATION-ADJUSTED DOLLARS) BY PLACE OF BIRTH IN PUERTO RICO',
  'Estimate!!Median income in the past 12 months --!!Total!!Born in Puerto Rico'),
 ('B06011PR_003E',
  'MEDIAN INCOME IN THE PAST 12 MONTHS (IN 2018 INFLATION-ADJUSTED DOLLARS) BY PLACE OF BIRTH IN PUERTO RICO',
  'Estimate!!Median income in the past 12 months --!!Total!!Born in other state of the United States'),
 ('B06011PR_004E',
  'MEDIAN INCOME IN THE PAST 12 MONTHS (IN 2018 INFLATION-ADJUSTED DOLLARS) BY PLACE OF BIRTH IN PUERTO RICO',
  'Estimate!!Median income in the past 12 months --!!Total!!Native; born elsewhere'),
 ('B06011PR_005E',
  'MEDIAN INCOME IN THE PAST 12 MONTHS (IN 2018 INFLATION-ADJUSTED DOLLARS) BY PLACE OF BIRTH IN PUERTO RICO',
  'Estimate!!Me

I'm finding it easier to peruse the parent table directory on the U.S. Census website.
B00001_001E
B00002_001E
B01001_001E
B01001_002E
B01001_026E
B01002_001E
B01002_002E
B01002_003E
B01003_001E
DP05_0002E
DP05_0002PE
DP05_0003E
DP05_0003PE
DP05_0018E
DP05_0018PE
S1901
S1903
B19013
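
The codes listed above mix full variable codes (table, line number, suffix) with bare table IDs, and they come from different table families. A small sketch for splitting them apart follows. `split_code` and `table_type` are hypothetical helpers, and the prefix-to-`tabletype` mapping is my assumption about how CensusData groups tables (`'detail'` for B tables, `'subject'` for S tables, `'profile'` for DP tables); check it against the package documentation before relying on it.

```python
import re

# Table ID, then an optional "_<line><suffix>" part, e.g. B01001_026E or DP05_0002PE.
CODE_RE = re.compile(r"^([A-Z]+\d+[A-Z]*)(?:_(\d+)([A-Z]+))?$")

# Assumed mapping from table prefix to the CensusData tabletype argument.
TABLE_TYPES = {"B": "detail", "S": "subject", "DP": "profile"}


def split_code(code):
    """Return (table_id, line_number or None, suffix or None) for a code."""
    match = CODE_RE.match(code)
    if match is None:
        raise ValueError(f"unrecognized code: {code!r}")
    table, line, suffix = match.groups()
    return table, (int(line) if line else None), suffix


def table_type(code):
    """Look up the assumed tabletype from the letters that start the table ID."""
    table = split_code(code)[0]
    prefix = re.match(r"[A-Z]+", table).group()
    return TABLE_TYPES[prefix]


for code in ["B01001_026E", "DP05_0002PE", "S1901", "B19013"]:
    print(code, split_code(code), table_type(code))
```

In the Census API naming scheme, a trailing `E` marks an estimate and `PE` a percent estimate. If the mapping above holds, the DP05 and S19xx codes would need a separate download call from the B-table variables.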

In [24]:
# Now add geography

states = censusdata.geographies(censusdata.censusgeo([('state', '*')]), 'acs5', 2018)

In [25]:
print(states)

{'Minnesota': censusgeo((('state', '27'),)), 'Mississippi': censusgeo((('state', '28'),)), 'Missouri': censusgeo((('state', '29'),)), 'Montana': censusgeo((('state', '30'),)), 'Nebraska': censusgeo((('state', '31'),)), 'Nevada': censusgeo((('state', '32'),)), 'New Hampshire': censusgeo((('state', '33'),)), 'New Jersey': censusgeo((('state', '34'),)), 'New Mexico': censusgeo((('state', '35'),)), 'New York': censusgeo((('state', '36'),)), 'North Carolina': censusgeo((('state', '37'),)), 'North Dakota': censusgeo((('state', '38'),)), 'Ohio': censusgeo((('state', '39'),)), 'Oklahoma': censusgeo((('state', '40'),)), 'Oregon': censusgeo((('state', '41'),)), 'Pennsylvania': censusgeo((('state', '42'),)), 'Rhode Island': censusgeo((('state', '44'),)), 'South Carolina': censusgeo((('state', '45'),)), 'South Dakota': censusgeo((('state', '46'),)), 'Tennessee': censusgeo((('state', '47'),)), 'Texas': censusgeo((('state', '48'),)), 'Utah': censusgeo((('state', '49'),)), 'Vermont': censusgeo((('sta

In [26]:
print(states['Oregon'])

Summary level: 040, state:41


In [27]:
# Get counties
counties = censusdata.geographies(censusdata.censusgeo([('state', '41'), ('county', '*')]), 'acs5', 2018)

In [28]:
print(counties)

{'Marion County, Oregon': censusgeo((('state', '41'), ('county', '047'))), 'Jackson County, Oregon': censusgeo((('state', '41'), ('county', '029'))), 'Grant County, Oregon': censusgeo((('state', '41'), ('county', '023'))), 'Jefferson County, Oregon': censusgeo((('state', '41'), ('county', '031'))), 'Clackamas County, Oregon': censusgeo((('state', '41'), ('county', '005'))), 'Linn County, Oregon': censusgeo((('state', '41'), ('county', '043'))), 'Tillamook County, Oregon': censusgeo((('state', '41'), ('county', '057'))), 'Baker County, Oregon': censusgeo((('state', '41'), ('county', '001'))), 'Josephine County, Oregon': censusgeo((('state', '41'), ('county', '033'))), 'Umatilla County, Oregon': censusgeo((('state', '41'), ('county', '059'))), 'Lincoln County, Oregon': censusgeo((('state', '41'), ('county', '041'))), 'Columbia County, Oregon': censusgeo((('state', '41'), ('county', '009'))), 'Sherman County, Oregon': censusgeo((('state', '41'), ('county', '055'))), 'Wasco County, Oregon'

In [None]:
# 'Multnomah County, Oregon': censusgeo((('state', '41'), ('county', '051')))
# 'Lane County, Oregon': censusgeo((('state', '41'), ('county', '039')))
# 'Polk County, Oregon': censusgeo((('state', '41'), ('county', '053')))
# 'Marion County, Oregon': censusgeo((('state', '41'), ('county', '047')))
# 'Deschutes County, Oregon': censusgeo((('state', '41'), ('county', '017')))
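
Rather than leaving the county codes in comments, they can be kept in a lookup table and turned into the `(level, value)` pair lists that `censusdata.censusgeo` is given elsewhere in this notebook. This is only a sketch: `COUNTY_FIPS` and `block_group_geo` are hypothetical names, and the codes are the ones noted in the comments above.

```python
STATE_FIPS = "41"  # Oregon, from the states lookup earlier in the notebook

# County FIPS codes copied from the notes above.
COUNTY_FIPS = {
    "Deschutes": "017",
    "Lane": "039",
    "Marion": "047",
    "Multnomah": "051",
    "Polk": "053",
}


def block_group_geo(county_name):
    """Build the pair list for every block group in one county."""
    return [
        ("state", STATE_FIPS),
        ("county", COUNTY_FIPS[county_name]),
        ("block group", "*"),
    ]


for name in sorted(COUNTY_FIPS):
    print(name, block_group_geo(name))
```

Each list can be passed directly to `censusdata.censusgeo(...)`, one county per call.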

In [32]:
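# censusgeo takes (level, value) pairs, one value per level, so request
# each county separately and stack the results.
county_codes = ['017', '039', '047', '051', '053']
variables = ['B00001_001E', 'B00002_001E', 'B01001_001E',
             'B01001_002E', 'B01001_026E',
             'B01002_001E', 'B01002_002E',
             'B01002_003E', 'B01003_001E']
data = pd.concat([
    censusdata.download('acs5', 2018,
                        censusdata.censusgeo([('state', '41'),
                                              ('county', code),
                                              ('block group', '*')]),
                        variables)
    for code in county_codes])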

In [33]:
print(data.head())

                                                    B00001_001E  B00002_001E  B01001_001E  B01001_002E  B01001_026E  B01002_001E  B01002_002E  B01002_003E  B01003_001E
Block Group 1, Census Tract 16, Deschutes Count...         50.0         20.0          874          586          288         32.9         31.0         57.0          874
Block Group 2, Census Tract 16, Deschutes Count...        100.0         50.0         2271         1089         1182         33.5         28.1         36.8         2271
Block Group 1, Census Tract 17, Deschutes Count...        150.0         60.0         3118         1374         1744         40.5         42.3         37.8         3118
Block Group 2, Census Tract 17, Deschutes Count...        100.0         50.0         2370         1201         1169         37.2         34.3         39.7         2370
Block Group 4, Census Tract 17, Deschutes Count...         90.0         40.0         2228          952         1276         32.8  
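
A quick consistency check on the downloaded columns can catch a bad pull early: in table B01001, the male total (`B01001_002E`) plus the female total (`B01001_026E`) should equal the overall total (`B01001_001E`), which should in turn match total population (`B01003_001E`). The sketch below rebuilds the four complete rows printed above into a small frame so it runs on its own. The index labels are abbreviated by me, and `check` is a hypothetical name; with the real data, `data` would be used in its place.

```python
import pandas as pd

# Values copied from the first four rows of the output above.
check = pd.DataFrame(
    {
        "B01001_001E": [874, 2271, 3118, 2370],   # total, sex by age
        "B01001_002E": [586, 1089, 1374, 1201],   # male
        "B01001_026E": [288, 1182, 1744, 1169],   # female
        "B01003_001E": [874, 2271, 3118, 2370],   # total population
    },
    index=["BG 1, Tract 16", "BG 2, Tract 16", "BG 1, Tract 17", "BG 2, Tract 17"],
)

sex_sum_ok = check["B01001_002E"] + check["B01001_026E"] == check["B01001_001E"]
totals_ok = check["B01001_001E"] == check["B01003_001E"]

# Show any rows that fail either check (none are expected for these four rows).
print(check[~(sex_sum_ok & totals_ok)])
```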

In [36]:
# Save the DataFrame to a CSV file, keeping the geography labels as the index
data.to_csv('ACS_2018_2.csv', index=True)
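
When the CSV is read back, the index is plain text rather than `censusgeo` objects, so the geography codes have to be parsed out of the label. A minimal sketch, assuming each label ends with the string form printed earlier for Oregon (`Summary level: 040, state:41`), with nested levels joined by `> `. The longer block-group string below is my assumption about that format, not output copied from this notebook, and `geo_parts` is a hypothetical helper.

```python
def geo_parts(label):
    """Split the 'level:value' pairs that follow 'Summary level: NNN, '."""
    _, _, tail = label.partition("Summary level: ")
    _, _, pairs = tail.partition(", ")
    return dict(part.split(":", 1) for part in pairs.split("> "))


# Format confirmed by the printed output for Oregon.
print(geo_parts("Summary level: 040, state:41"))

# Assumed format for a block-group label; verify against a real row of the CSV.
assumed = "Summary level: 150, state:41> county:017> tract:001600> block group:1"
parts = geo_parts(assumed)

# Concatenating the codes gives a 12-digit block group identifier.
geoid = parts["state"] + parts["county"] + parts["tract"] + parts["block group"]
print(geoid)
```

A plain identifier column like this tends to be easier to join against shapefiles or other tables than the long text label.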