# US Census API

### Requirements
* Install the `census` and `us` modules before getting started by running the following commands from the command line:
    * **`pip install census`**
    * **`pip install us`**

### Documentation
* [Python wrapper for census API](https://github.com/datamade/census)
* [List of available fields and labels](https://gist.github.com/afhaque/60558290d6efd892351c4b64e5c01e9b)

### Import dependencies

In [None]:
import pandas as pd
from census import Census #<-- Python wrapper for census API

# Census API Key
from config import api_key

# provide the api key and the year to establish a session
c = Census(api_key, year=2013)

### Run Census Search to retrieve data on all zip codes (2013 ACS5 Census)

The wrapper provides a number of convenience methods, but the standard `get` function takes a tuple of the field IDs you're interested in and a geographic reference stored in a dictionary, as seen below. In this code, we request these 6 fields for ALL zip codes.

**NOTE:** We're using the `acs5` function set to pull our data from the 5-year American Community Survey.
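As a rough sketch of how the two arguments fit together, the cell below builds the field tuple and the geography dictionary separately before they are passed to `c.acs5.get`. The `FIELD_LABELS` name and the single-ZCTA dictionary are illustrative assumptions, not part of the wrapper; the labels are the ones this notebook uses when renaming columns later.

```python
# Map each Census field ID to the label used later in this notebook.
# FIELD_LABELS is a hypothetical helper name, not something the wrapper defines.
FIELD_LABELS = {
    "NAME": "Name",
    "B19013_001E": "Household Income",
    "B01003_001E": "Population",
    "B01002_001E": "Median Age",
    "B19301_001E": "Per Capita Income",
    "B17001_002E": "Poverty Count",
}

# The wrapper expects the field IDs as a tuple of strings.
fields = tuple(FIELD_LABELS)

# The "*" wildcard asks for every zip code tabulation area.
geo_all_zips = {"for": "zip code tabulation area:*"}

# Assumption: replacing "*" with a specific code narrows the query to one area.
geo_one_zip = {"for": "zip code tabulation area:90210"}

print(fields)
print(geo_all_zips)

# With a live session this would become:
# census_data = c.acs5.get(fields, geo_all_zips)
```

Keeping the IDs and labels in one dictionary means the same mapping can later be reused for renaming the DataFrame columns.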

In [None]:
census_data = c.acs5.get(("NAME", "B19013_001E", "B01003_001E", "B01002_001E", "B19301_001E", "B17001_002E"), {'for': 'zip code tabulation area:*'})

census_data[0]

### Format the response

In [None]:
# Convert to DataFrame
census_pd = pd.DataFrame(census_data)

# Renaming columns to be more user-friendly
census_pd = census_pd.rename(columns={"B01003_001E": "Population",
                                      "B01002_001E": "Median Age",
                                      "B19013_001E": "Household Income",
                                      "B19301_001E": "Per Capita Income",
                                      "B17001_002E": "Poverty Count",
                                      "NAME": "Name", "zip code tabulation area": "Zipcode"})

# The response doesn't include a poverty rate, so compute it ourselves as a percentage: 100 * Poverty Count / Population
census_pd["Poverty Rate"] = 100 * census_pd["Poverty Count"].astype(int) / census_pd["Population"].astype(int)

# Reorder columns and only include ones we're interested in for the final DataFrame
census_pd = census_pd[["Zipcode", "Population", "Median Age", "Household Income",
                       "Per Capita Income", "Poverty Rate"]]

# Preview the result
print("Total number of zip codes in response: " + str(len(census_pd)))
census_pd.head()
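
The poverty rate calculation above assumes every row has a usable count and a non-zero population. A minimal sketch of a more defensive version follows, written in plain Python so it can be checked without an API key. The helper names and the sample records are hypothetical, and treating negative values as missing is an assumption based on the large negative placeholders the API can return for unavailable estimates.

```python
def to_number(value):
    """Convert a raw API value to float, or None if it is missing or unusable."""
    if value is None:
        return None
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    # Assumption: negative estimates are placeholders, not real counts.
    return None if number < 0 else number


def poverty_rate(poverty_count, population):
    """Return the poverty rate as a percentage, or None if it cannot be computed."""
    count = to_number(poverty_count)
    pop = to_number(population)
    if count is None or pop is None or pop == 0:
        return None
    return 100 * count / pop


# Hypothetical records with made-up values, shaped like rows of the response.
sample = [
    {"B17001_002E": "150", "B01003_001E": "1000"},
    {"B17001_002E": "0", "B01003_001E": "0"},     # unpopulated area
    {"B17001_002E": None, "B01003_001E": "500"},  # missing count
]

rates = [poverty_rate(r["B17001_002E"], r["B01003_001E"]) for r in sample]
print(rates)  # [15.0, None, None]
```

Returning `None` rather than raising keeps the unusable rows visible, so they can be dropped or inspected later instead of stopping the whole run.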

### Save to a CSV

In [None]:
census_pd.to_csv("census_data.csv", encoding="utf-8", index=False)
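
One thing to watch when the CSV is loaded again: zip codes that start with zero can lose their leading digits if the column is parsed as a number. With pandas, passing `dtype={"Zipcode": str}` to `pd.read_csv` keeps them as text. If the values have already been read as integers, a small helper along these lines (the name `restore_zip` is hypothetical) can pad them back to five characters.

```python
def restore_zip(value):
    """Return a 5-character, zero-padded zip code string from an int or str."""
    return str(value).strip().zfill(5)


print(restore_zip(2134))     # 02134
print(restore_zip("90210"))  # 90210
```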