# Working with the Census API

### PDF Guide
- https://www.census.gov/content/dam/Census/library/publications/2020/acs/acs_api_handbook_2020_ch02.pdf

You can also browse and download data here: https://data.census.gov/cedsci/table?q=United%20States&tid=PEPPOP2019.PEPANNRES&hidePreview=false

### Available Datasets
- *ACS 1-year estimates (2012–2018):* For areas with populations of 65,000+. The most frequently updated, but with the lowest “resolution,” since it excludes low-population areas and has the smallest sample size.
- *ACS 1-year supplemental estimates (2014–2017):* Supplemental dataset that focuses on lower-population areas of 20,000+.
- *ACS 3-year estimates (2010–2012 to 2011–2013):* For areas with populations of 20,000+; the middle ground between the 1-year and 5-year estimates. Discontinued by the Census Bureau, but old releases can still be accessed.
- *ACS 5-year estimates (2005–2009 to 2014–2018):* Data for all areas. The highest resolution and largest sample size, but the least current.
- *Decennial Census 2010:* Counts every resident of the US; conducted every 10 years.

**Discontinued Datasets:** ACS 3-year estimates
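
As a rough sketch, the datasets above can be mapped to API base URLs as shown below. The path segments (`acs/acs1`, `acs/acsse`, `acs/acs3`, `acs/acs5`, `dec/sf1`) reflect my understanding of the Census Bureau's naming and should be checked against the API's dataset listing. `DATASET_PATHS` and `build_base_url` are hypothetical names used only for illustration. The sketch also assumes that multiyear ACS releases are addressed by their end year (for example, 2018 for 2014–2018).

```python
# Assumed mapping from the datasets listed above to API path segments.
DATASET_PATHS = {
    "acs1": "acs/acs1",                # ACS 1-year estimates
    "acs1_supplemental": "acs/acsse",  # ACS 1-year supplemental estimates
    "acs3": "acs/acs3",                # ACS 3-year estimates (discontinued)
    "acs5": "acs/acs5",                # ACS 5-year estimates
    "decennial_sf1": "dec/sf1",        # Decennial Census, summary file 1
}

HOST = "https://api.census.gov/data"


def build_base_url(year, dataset_key):
    """Join host, year, and dataset path into a base URL (hypothetical helper)."""
    return "/".join([HOST, str(year), DATASET_PATHS[dataset_key]])


print(build_base_url(2018, "acs5"))
print(build_base_url(2010, "decennial_sf1"))
```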

### ACS: Single vs. Multi-Year Estimates
- Single-year and multiyear estimates from the ACS are all “period” estimates derived from a sample collected over a period of time, as opposed to “point-in-time” estimates such as those from past decennial censuses. 
- There are two sets of numbers—both 1-year estimates and 5-year estimates—available for geographic areas with at least 65,000 people, such as the state of Virginia. Less populous areas, such as Bath County in Virginia’s Shenandoah Valley, receive only 5-year estimates.
- Multiyear estimates should be labeled to indicate clearly the full period of time (e.g., “The child poverty rate in 2014–2018 was X percent.”). They do not describe any specific day, month, or year within that time period.
- The primary advantage of using multiyear estimates is the increased statistical reliability of the data compared with that of single-year estimates.

**Reference on which dataset to use and when:**
- https://www.census.gov/content/dam/Census/library/publications/2020/acs/acs_general_handbook_2020_ch03.pdf
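
The population thresholds and labeling rule described above can be sketched in a few lines. This is only an illustration: `available_acs_estimates` and `period_label` are hypothetical helpers, the population figures in the usage lines are made up, and the sketch assumes supplemental estimates are published for every area of 20,000+ people. The linked handbook remains the authority on which product to use.

```python
def available_acs_estimates(population):
    """Return the current ACS products published for an area of this size."""
    products = ["5-year"]  # 5-year estimates cover all areas
    if population >= 20_000:
        products.append("1-year supplemental")
    if population >= 65_000:
        products.append("1-year")
    return products


def period_label(end_year, span_years):
    """Label an estimate with its full collection period, e.g. 2014–2018."""
    if span_years == 1:
        return str(end_year)
    start_year = end_year - span_years + 1
    return f"{start_year}–{end_year}"


# Hypothetical populations, for illustration only
print(available_acs_estimates(4_500))
print(available_acs_estimates(8_500_000))
print(f"The child poverty rate in {period_label(2018, 5)} was X percent.")
```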


In [4]:
import requests
import pandas as pd

### Accessing Decennial Census Data
- 2010 is the latest decennial dataset available; it will be useful for calculating percent changes in total counts once the 2020 data is released.
- The 2020 decennial dataset will not be released until April 1st, 2021.
- "dec/sf1" stands for decennial, summary file 1, which provides the most detailed information available from the 2010 Census about a community's entire population, including cross-tabulations of age, sex, households, families, relationship to householder, housing units, detailed race and Hispanic or Latino origin groups, and group quarters.
- By contrast, "dec/sf2" adds a layer of detail, making information such as age, relationship, and homeownership available for specific race and ethnic groups within a community.

**References:**
- https://www.census.gov/data/developers/data-sets/decennial-census.html
- https://www.census.gov/programs-surveys/decennial-census/guidance/2010.html
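
Variable codes such as `P013001` are hard to read on their own. A minimal sketch for looking up their labels follows. It assumes each dataset exposes a `variables.json` document under its base URL, with a top-level `"variables"` object keyed by variable name. The helper names are hypothetical, and the standard library is used here only to keep the sketch dependency-free (`requests` would work equally well).

```python
import json
from urllib.request import urlopen


def variables_url(year, dataset, host="https://api.census.gov/data"):
    """Build the (assumed) URL of a dataset's variable dictionary."""
    return "/".join([host, str(year), dataset, "variables.json"])


def extract_labels(payload, names):
    """Map each requested variable name to its label, or None if absent."""
    variables = payload.get("variables", {})
    return {name: variables.get(name, {}).get("label") for name in names}


def fetch_labels(year, dataset, names, timeout=30):
    """Download the variable dictionary and return labels for `names`."""
    with urlopen(variables_url(year, dataset), timeout=timeout) as resp:
        payload = json.load(resp)
    return extract_labels(payload, names)


print(variables_url(2010, "dec/sf1"))
```

Calling `fetch_labels(2010, "dec/sf1", ["P013001", "P037001"])` performs a network request, so it is left out of the block itself.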

In [5]:
# Build base URL
host = "https://api.census.gov/data"
year = "2010"
dataset = "dec/sf1"
base_url = "/".join([host, year, dataset])

# Specify Census variables and other predicates
# P013001 = Median Age, P037001 = Average Family Size
get_vars = ["NAME", "P013001", "P037001"]
predicates = {}
predicates["get"] = ",".join(get_vars)
predicates["for"] = "state:*"

# Execute the request and fail early on an HTTP error
r = requests.get(base_url, params=predicates)
r.raise_for_status()

# Construct the data frame
col_names = ['name', 'median_age', 'avg_family_size', 'state']
# Skip the first row, which holds the API's own column headers
states = pd.DataFrame(columns = col_names, data = r.json()[1:])

# Convert each column with numeric data to an appropriate type
states["median_age"] = states["median_age"].astype(float)
states["avg_family_size"] = states["avg_family_size"].astype(float)

print(states)

                    name  median_age  avg_family_size state
0                Alabama        37.9             3.02    01
1                 Alaska        33.8             3.21    02
2                Arizona        35.9             3.19    04
3               Arkansas        37.4             3.00    05
4             California        35.2             3.45    06
5              Louisiana        35.8             3.10    22
6               Kentucky        38.1             2.98    21
7               Colorado        36.1             3.08    08
8            Connecticut        40.0             3.08    09
9               Delaware        38.8             3.06    10
10  District of Columbia        33.8             3.01    11
11               Florida        40.7             3.01    12
12               Georgia        35.3             3.17    13
13                Hawaii        38.6             3.42    15
14                 Idaho        34.6             3.16    16
15              Illinois        36.6    