Author: Dana Chermesh Reshef, DRAW Brooklyn<br>
April 2019

### _Digital CEQR -- #3_
# Data: Housing 2010, 2017; from:
- **_ACS 5-year estimates 2013-2017 using Census API_**
- **_ACS 5-year estimates 2006-2010 using Census API_**

----

A user guide for the Census Data API:

# [Census Data API User Guide](https://www.census.gov/content/dam/Census/data/developers/api-user-guide/api-guide.pdf)

The Census Data API is an API that gives the public access to raw statistical data from various Census Bureau data
programs. Spatially, the data are aggregated and usually associated with a
specific Census geographic boundary or area identified by a FIPS code.

## _get your API key from:_ 
https://api.census.gov/data/key_signup.html

**Recommended:** In order to keep your API key confidential, please save your API key in a .py file named **censusAPI.py** as follows:

```python
myAPI = 'XXXXXXXXXXXXXXX'
```
Then import the key into this notebook as in the following cell:
```python
from censusAPI import myAPI
```
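
As a possible variation, the key could also be looked up from an environment variable when `censusAPI.py` is not available. This is only a sketch: the helper name `load_census_key` and the variable name `CENSUS_API_KEY` are assumptions made for illustration and are not part of the Census API or of this notebook.

```python
import os


def load_census_key(env_var='CENSUS_API_KEY'):
    """Return the API key from censusAPI.py, else from an environment variable.

    The function name and the default env_var are hypothetical.
    """
    try:
        # preferred: the local censusAPI.py file described above
        from censusAPI import myAPI
        return myAPI
    except ImportError:
        # fallback: an environment variable; None if it is not set
        return os.environ.get(env_var)
```

Either way, the key stays out of the notebook itself, so the notebook can be shared without exposing it.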

### The complete list of all available datasets for the API is located here:
https://api.census.gov/data.html
### Examples for calling the API for different geography levels (2017):
https://api.census.gov/data/2017/acs/acs5/examples.html
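
Queries for different geography levels differ mainly in their `for` and `in` clauses. The sketch below assembles such a URL from its parts; `build_acs5_url` is a hypothetical helper name, and the optional `key` argument is assumed to hold the API key obtained above.

```python
from urllib.parse import urlencode

BASE = 'https://api.census.gov/data'


def build_acs5_url(year, variables, for_clause, in_clause=None, key=None):
    """Assemble an ACS 5-year query URL (hypothetical helper)."""
    params = [('get', ','.join(variables)), ('for', for_clause)]
    if in_clause is not None:
        params.append(('in', in_clause))
    if key is not None:
        params.append(('key', key))
    # keep ':', ',' and '*' unescaped so the URL stays readable
    query = urlencode(params, safe=':,*')
    return '{}/{}/acs/acs5?{}'.format(BASE, year, query)


# all counties in New York State (state FIPS 36)
county_url = build_acs5_url(2017, ['B25001_001E', 'NAME'], 'county:*', 'state:36')
# all places in all states
place_url = build_acs5_url(2017, ['B25001_001E', 'NAME'], 'place:*', 'state:*')
```

Building the URL in one place makes it harder to mistype the geography clauses when the same variables are requested at several levels.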

In [41]:
# Python 2/3 compatibility; __future__ imports belong before any other statement
from __future__ import print_function, division

# imports for reading in, munging and calculating data
import pandas as pd
import json
import requests
import urllib
import numpy as np

# read in the API key saved in censusAPI.py as
# myAPI = 'XXXXXXXXXXXXXXX'
# request an API key at: https://api.census.gov/data/key_signup.html
from censusAPI import myAPI

# Spatial
import geopandas as gpd
import fiona
import shapely

# Plotting
import matplotlib.pylab as pl
import seaborn as sns
sns.set_style('whitegrid')

%pylab inline

Populating the interactive namespace from numpy and matplotlib


----
# Housing units 2017
### _data were obtained from the ACS 2013-2017 5-year estimate, all counties in New York State_
Variables to be acquired:
- **B25001_001E** |	Total Housing Units (occupied+vacant)
- **B25003_002E** | Owner occupied
- **B25003_003E** | Renter occupied

In [42]:
# read in the available variables; the info you need is in the 5-year ACS data
url = "https://api.census.gov/data/2017/acs/acs5/variables.json"
resp = requests.request('GET', url)
aff1y = json.loads(resp.text)

In [43]:
#turning things into arrays to enable broadcasting
#Python3
affkeys = np.array(list(aff1y['variables'].keys()))

affkeys

array(['B21005_008E', 'B26108_060E', 'B27022_009E', ..., 'B26201_108E',
       'B99191_002E', 'B27002_048E'], dtype='<U14')

In [44]:
# variable codes for the housing-unit estimates
totalHU = 'B25001_001E'
owner = 'B25003_002E'
renter = 'B25003_003E'

aff1y['variables'][totalHU]

{'attributes': 'B25001_001EA,B25001_001M,B25001_001MA',
 'concept': 'HOUSING UNITS',
 'group': 'B25001',
 'label': 'Estimate!!Total',
 'limit': 0,
 'predicateType': 'int'}
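
The same metadata can be used to list every variable in a table group. The following is a minimal sketch on a small stand-in dictionary: only the `B25001_001E` entry is copied from the output above, while the `B25003` entries are simplified placeholders, and `variables_in_group` is a hypothetical helper name.

```python
# tiny sample shaped like aff1y['variables']; the B25003 entries are stand-ins
sample_variables = {
    'B25001_001E': {'concept': 'HOUSING UNITS', 'group': 'B25001',
                    'label': 'Estimate!!Total'},
    'B25003_002E': {'concept': 'TENURE', 'group': 'B25003',
                    'label': 'Estimate!!Total!!Owner occupied'},
    'B25003_003E': {'concept': 'TENURE', 'group': 'B25003',
                    'label': 'Estimate!!Total!!Renter occupied'},
}


def variables_in_group(variables, group):
    """Return {code: label} for every variable whose 'group' matches."""
    return {code: meta['label']
            for code, meta in sorted(variables.items())
            if meta.get('group') == group}


tenure_vars = variables_in_group(sample_variables, 'B25003')
```

In the notebook, the same call would presumably be `variables_in_group(aff1y['variables'], 'B25003')`.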

In [45]:
# HU2017 data for all counties in New York State (state FIPS 36)
totalHU17 = pd.read_json('https://api.census.gov/data/2017/acs/acs5?get='+
                         totalHU + ',' +
                         owner + ',' +
                         renter +',NAME&for=county:*&in=state:36')
# the first row of the response holds the column names
totalHU17.columns = totalHU17.iloc[0]
totalHU17 = totalHU17[1:].copy()

# zero-pad the FIPS codes (state: 2 digits, county: 3 digits) and join them
totalHU17['state'] = totalHU17['state'].astype(str).str.zfill(2)
totalHU17['county'] = totalHU17['county'].astype(str).str.zfill(3)
totalHU17['STCO'] = totalHU17['state'] + totalHU17['county']

totalHU17 = totalHU17.drop(['state', 'county'], axis=1)
totalHU17.columns = ['TotalHousing17', 'Owners17', 'renters17',
                     'Name', 'STCO']

print(totalHU17.shape)
totalHU17.head()

(62, 5)


| | TotalHousing17 | Owners17 | renters17 | Name | STCO |
|---|---|---|---|---|---|
| 1 | 17465 | 9471 | 3068 | Schoharie County, New York | 36095 |
| 2 | 524488 | 97658 | 397698 | Bronx County, New York | 36005 |
| 3 | 206707 | 120606 | 65234 | Onondaga County, New York | 36067 |
| 4 | 29004 | 16085 | 6450 | Fulton County, New York | 36035 |
| 5 | 36352 | 21542 | 10138 | Clinton County, New York | 36019 |
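
The API reply is a list of lists whose first row holds the column names, which is why the cell above promotes row 0 to the header. A minimal offline sketch of that step follows, using two rows taken from the table above; `rows_to_frame` is a hypothetical helper name.

```python
import pandas as pd

# header row plus two data rows, shaped like the API's list-of-lists reply
sample_rows = [
    ['B25001_001E', 'B25003_002E', 'B25003_003E', 'NAME', 'state', 'county'],
    ['17465', '9471', '3068', 'Schoharie County, New York', '36', '095'],
    ['524488', '97658', '397698', 'Bronx County, New York', '36', '005'],
]


def rows_to_frame(rows):
    """Turn a Census API reply (header row first) into a DataFrame."""
    df = pd.DataFrame(rows[1:], columns=rows[0])
    # 2-digit state FIPS + 3-digit county FIPS = 5-digit county identifier
    df['STCO'] = df['state'].str.zfill(2) + df['county'].str.zfill(3)
    return df.drop(['state', 'county'], axis=1)


counties = rows_to_frame(sample_rows)
```

With a live request, the input would presumably come from `requests.get(url).json()` rather than from `sample_rows`.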


----

## Places

City-Suburbs by tenure

In [46]:
# HU2017 data for all places in the US
placeHU17 = pd.read_json('https://api.census.gov/data/2017/acs/acs5?get='+
                         totalHU + ',' +
                         owner + ',' +
                         renter +',NAME&for=place:*&in=state:*')
# the first row of the response holds the column names
placeHU17.columns = placeHU17.iloc[0]
placeHU17 = placeHU17[1:].copy()

# zero-pad the FIPS codes (state: 2 digits, place: 5 digits) and join them
placeHU17['state'] = placeHU17['state'].astype(str).str.zfill(2)
placeHU17['place'] = placeHU17['place'].astype(str).str.zfill(5)
placeHU17['STPL'] = placeHU17['state'] + placeHU17['place']

placeHU17 = placeHU17.drop(['state', 'place'], axis=1)
placeHU17.columns = ['TotalHousing17', 'Owners17', 'renters17',
                     'Name', 'STPL']

print(placeHU17.shape)
placeHU17.head()

(29567, 5)


| | TotalHousing17 | Owners17 | renters17 | Name | STPL |
|---|---|---|---|---|---|
| 1 | 62 | 0 | 44 | Boys Ranch CDP, Texas | 4809796 |
| 2 | 74 | 9 | 53 | Guthrie CDP, Texas | 4831640 |
| 3 | 131 | 53 | 33 | Gail CDP, Texas | 4827972 |
| 4 | 626 | 570 | 42 | Bartonville town, Texas | 4805768 |
| 5 | 235 | 206 | 14 | Annetta North town, Texas | 4803340 |


In [47]:
# note: casting to int drops the leading zero of state FIPS codes below 10
placeHU17.STPL = placeHU17.STPL.astype(int)
placeHU17.dtypes

TotalHousing17    object
Owners17          object
renters17         object
Name              object
STPL               int64
dtype: object
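
The count columns are still `object` (strings) at this point, so they need converting before any arithmetic by tenure. The sketch below shows one way to do that on two rows copied from the place table above; the `renter_share` column name and the 7-digit integer code used to demonstrate zero-padding are illustrative assumptions.

```python
import pandas as pd

# two rows copied from the place table above; values arrive as strings
places = pd.DataFrame({
    'TotalHousing17': ['62', '626'],
    'Owners17': ['0', '570'],
    'renters17': ['44', '42'],
    'Name': ['Boys Ranch CDP, Texas', 'Bartonville town, Texas'],
    'STPL': ['4809796', '4805768'],
})

# convert the estimate columns from strings to numbers
count_cols = ['TotalHousing17', 'Owners17', 'renters17']
for col in count_cols:
    places[col] = pd.to_numeric(places[col])

# renter share of occupied units (owners + renters); hypothetical column name
occupied = places['Owners17'] + places['renters17']
places['renter_share'] = places['renters17'] / occupied

# an integer code that lost its leading zero can be padded back to 7 digits
# (100124 is an illustrative value, not taken from the data above)
restored = pd.Series([100124]).astype(str).str.zfill(7)
```

Keeping `STPL` as a zero-padded string may be the safer choice when it is later joined to shapefiles, whose identifiers are usually stored as text.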