# Census API
## Table of contents
1. [American Community Survey](#1.-American-Community-Survey)
    1. [Single table](#a.-Single-table)
    1. [Group of tables](#b.-Group-of-tables)
1. [County Business Patterns](#2.-County-Business-Patterns)

In [31]:
import pandas as pd
import requests

In [32]:
api_key = ''  # paste your Census API key here

## 1. American Community Survey
- [List of groups](https://api.census.gov/data/2019/acs/acs5/groups.html)
- [List of variables](https://api.census.gov/data/2019/acs/acs5/variables.html)
- [Understanding table IDs](https://www.census.gov/programs-surveys/acs/guidance/which-data-tool/table-ids-explained.html)

### There are three main parameters to know
1. key - your API key (register for one with the Census Bureau)
1. get - a variable ID such as `B01001_001E`, or a whole group written as `group(B01001)`
1. for - geography
    1. county
    1. state
    1. cbsa
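
The `for` value can also carry a wildcard and be narrowed with a companion `in` parameter, for example every county in one state. The sketch below builds such a query string with the standard library, so the resulting URL can be inspected without sending a request. `build_query_url` is a hypothetical helper, not part of any Census client. The `county:*` and `state:06` values follow the Census API's geography syntax, and leaving out an empty key assumes that low-volume requests are accepted without one.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ACS5_ENDPOINT = "https://api.census.gov/data/2019/acs/acs5"


def build_query_url(endpoint, get, for_, in_=None, key=None):
    """Hypothetical helper: assemble a Census API query URL."""
    params = {"get": get, "for": for_}
    if in_ is not None:
        params["in"] = in_  # parent geography, e.g. "state:06"
    if key:
        params["key"] = key  # assumption: an empty key is simply omitted
    return f"{endpoint}?{urlencode(params)}"


# Total population estimate for every county in California (state FIPS 06).
url = build_query_url(ACS5_ENDPOINT, "B01001_001E", "county:*", in_="state:06")
print(url)

# Decode the query string again to confirm what the server would receive.
print(parse_qs(urlsplit(url).query))
```

Passing the same dictionary to `requests.get(..., params=...)` should produce an equivalent query; building it by hand is only useful for checking the URL before spending a request.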

In [10]:
acs5_endpoint = 'https://api.census.gov/data/2019/acs/acs5'

### a. Single table

In [8]:
# get a single variable: B01001_001E, the total population estimate
acs5_params = {
    'key': api_key,
    'get': 'B01001_001E',
    'for': 'state'
}

In [11]:
acs5_r = requests.get(acs5_endpoint, params=acs5_params)
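
The cells in this notebook call `.json()` on the response without checking it first, so a failed request surfaces only as a confusing decode error. A minimal sketch of a more defensive parser follows. `parse_census_response` and the `FakeResponse` stand-in are hypothetical names introduced here, and the HTML error body is made up for illustration rather than copied from the API.

```python
import json


def parse_census_response(response):
    """Return (header, rows) from a requests.Response-like object."""
    response.raise_for_status()  # raises on 4xx/5xx status codes
    try:
        payload = response.json()
    except ValueError as exc:  # body was not valid JSON
        raise RuntimeError(f"Expected JSON, got: {response.text[:200]!r}") from exc
    if not isinstance(payload, list) or not payload:
        raise RuntimeError("Expected a non-empty list of rows")
    header, *rows = payload  # first row holds the column names
    return header, rows


class FakeResponse:
    """Stand-in for requests.Response so the sketch runs offline."""

    def __init__(self, text):
        self.text = text

    def raise_for_status(self):
        pass  # pretend the status code was 200

    def json(self):
        return json.loads(self.text)


good = FakeResponse('[["B01001_001E", "state"], ["4876250", "01"]]')
header, rows = parse_census_response(good)
print(header, rows)

bad = FakeResponse("<html>Invalid Key</html>")  # made-up error body
try:
    parse_census_response(bad)
except RuntimeError as err:
    print("request failed:", err)
```

In the notebook itself the call would be `parse_census_response(requests.get(acs5_endpoint, params=acs5_params))`.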

In [12]:
acs5_r.json()[0:5]

[['B01001_001E', 'state'],
 ['4876250', '01'],
 ['737068', '02'],
 ['7050299', '04'],
 ['2999370', '05']]

In [13]:
acs5_df = pd.DataFrame(acs5_r.json()[1:], columns=acs5_r.json()[0])

In [14]:
acs5_df.head()

| | B01001_001E | state |
|---|---|---|
| 0 | 4876250 | 01 |
| 1 | 737068 | 02 |
| 2 | 7050299 | 04 |
| 3 | 2999370 | 05 |
| 4 | 39283497 | 06 |
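
Every value in the response arrives as a string, so the population column has to be converted before any arithmetic, while the state FIPS code should stay a string to keep its leading zero. The sketch below shows the idea in plain Python on the first four rows printed earlier; `to_records` is a hypothetical helper, not something the notebook defines.

```python
# First rows of the state-level response shown earlier in this section.
payload = [
    ["B01001_001E", "state"],
    ["4876250", "01"],
    ["737068", "02"],
    ["7050299", "04"],
    ["2999370", "05"],
]


def to_records(payload, int_columns=()):
    """Turn a header-plus-rows payload into dicts, casting chosen columns to int."""
    header, *rows = payload
    records = []
    for row in rows:
        record = dict(zip(header, row))
        for column in int_columns:
            record[column] = int(record[column])
        records.append(record)
    return records


records = to_records(payload, int_columns=["B01001_001E"])
total = sum(record["B01001_001E"] for record in records)
print(records[0], total)
```

With the DataFrame from the cell above, the equivalent conversion would be `acs5_df['B01001_001E'] = acs5_df['B01001_001E'].astype(int)`.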


### b. Group of tables

In [15]:
# let's change the get parameter to request a whole group
acs5_params['get'] = 'group(B01001)'

In [16]:
acs5_r = requests.get(acs5_endpoint, params=acs5_params)

In [19]:
acs5_df = pd.DataFrame(acs5_r.json()[1:], columns=acs5_r.json()[0])

In [21]:
acs5_df.head()

| | B01001_001E | B01001_001EA | B01001_001M | B01001_001MA | B01001_002E | B01001_002EA | B01001_002M | B01001_002MA | B01001_003E | B01001_003EA | ... | B01001_048EA | B01001_048M | B01001_048MA | B01001_049E | B01001_049EA | B01001_049M | B01001_049MA | GEO_ID | NAME | state |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 4876250 | | -555555555 | `*****` | 2359355 | | 1270 | | 149090 | | ... | | 1175 | | 56419 | | 1311 | | 0400000US01 | Alabama | 01 |
| 1 | 737068 | | -555555555 | `*****` | 384915 | | 401 | | 27062 | | ... | | 396 | | 3981 | | 339 | | 0400000US02 | Alaska | 02 |
| 2 | 7050299 | | -555555555 | `*****` | 3504509 | | 349 | | 221817 | | ... | | 1623 | | 78983 | | 1869 | | 0400000US04 | Arizona | 04 |
| 3 | 2999370 | | -555555555 | `*****` | 1471760 | | 979 | | 96986 | | ... | | 973 | | 37257 | | 939 | | 0400000US05 | Arkansas | 05 |
| 4 | 39283497 | | -555555555 | `*****` | 19526298 | | 1141 | | 1254607 | | ... | | 3482 | | 451736 | | 3547 | | 0400000US06 | California | 06 |
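
A group request returns four columns per variable: the estimate (`E`), the margin of error (`M`), and an annotation column for each (`EA`, `MA`). In the output above the margin `-555555555` is paired with the annotation `*****`, which suggests a sentinel rather than a real margin. The sketch below keeps only the estimate and margin columns and maps that sentinel to `None`. `tidy_row` and `to_number` are hypothetical helpers, the sample row is abridged from the Alabama row above, and treating the sentinel as "not applicable" is an assumption to confirm against the Census Bureau's annotation notes.

```python
import re

# Sentinel paired with the "*****" annotation in the output above.
MOE_SENTINEL = "-555555555"

ESTIMATE_COL = re.compile(r"_\d{3}E$")  # e.g. B01001_001E
MARGIN_COL = re.compile(r"_\d{3}M$")    # e.g. B01001_001M


def to_number(value):
    """Convert an API string to int, mapping null and the sentinel to None."""
    if value is None or value == MOE_SENTINEL:
        return None
    return int(value)


def tidy_row(header, row):
    """Keep NAME plus estimate and margin columns, dropping annotation columns."""
    record = dict(zip(header, row))
    tidy = {"NAME": record["NAME"]}
    for column, value in record.items():
        if ESTIMATE_COL.search(column) or MARGIN_COL.search(column):
            tidy[column] = to_number(value)
    return tidy


# Abridged header and Alabama row from the group response.
header = ["B01001_001E", "B01001_001EA", "B01001_001M", "B01001_001MA",
          "B01001_002E", "B01001_002EA", "B01001_002M", "B01001_002MA",
          "GEO_ID", "NAME", "state"]
row = ["4876250", None, "-555555555", "*****",
       "2359355", None, "1270", None,
       "0400000US01", "Alabama", "01"]

alabama = tidy_row(header, row)
print(alabama)
```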


## 2. County Business Patterns
County Business Patterns (CBP) reports establishment counts, employment, and payroll by industry, with industries identified by NAICS codes.
- [CBP landing page](https://api.census.gov/data/2019/cbp.html)
- [CBP variables list](https://api.census.gov/data/2019/cbp/variables.html)
- [Look up NAICS codes](https://www.naics.com/search/)

The main parameters are:
1. key - API key
1. get - comma-separated list of variables
1. for - geography
1. NAICS2017 - NAICS code of the industry to filter on
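
Because `get` is a single comma-separated string, it can be convenient to keep the variables in a Python list and join them when building the request. A small sketch, assuming a hypothetical helper named `build_cbp_params` and, as in the earlier sketch, that an empty key can be left out:

```python
def build_cbp_params(variables, naics, geography="state", key=None):
    """Hypothetical helper: build the params dict for a CBP request."""
    params = {
        "get": ",".join(variables),  # the API expects one comma-separated string
        "for": geography,
        "NAICS2017": naics,          # industry filter, passed as a string
    }
    if key:
        params["key"] = key  # assumption: an empty key is simply omitted
    return params


cbp_example_params = build_cbp_params(["ESTAB", "NAICS2017_LABEL", "NAME"], naics="71")
print(cbp_example_params)
```

The resulting dictionary can be passed straight to `requests.get(cbp_endpoint, params=...)`.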

In [25]:
cbp_endpoint = 'https://api.census.gov/data/2019/cbp'

In [26]:
# let's get the number of Arts, entertainment, and recreation establishments by state
# also return the NAICS code, the NAICS label and the name of the state
cbp_params = {
    'key': api_key,
    'get': 'ESTAB,NAICS2017_LABEL,NAME',
    'for': 'state',
    'NAICS2017': '71'
}

In [27]:
cbp_r = requests.get(cbp_endpoint, params=cbp_params)

In [28]:
cbp_r.json()[0:5]

[['ESTAB', 'NAICS2017_LABEL', 'NAME', 'NAICS2017', 'state'],
 ['692', 'Arts, entertainment, and recreation', 'Mississippi', '71', '28'],
 ['2343', 'Arts, entertainment, and recreation', 'Missouri', '71', '29'],
 ['1277', 'Arts, entertainment, and recreation', 'Montana', '71', '30'],
 ['966', 'Arts, entertainment, and recreation', 'Nebraska', '71', '31']]

In [29]:
cbp_df = pd.DataFrame(cbp_r.json()[1:], columns=cbp_r.json()[0])

In [30]:
cbp_df.head()

| | ESTAB | NAICS2017_LABEL | NAME | NAICS2017 | state |
|---|---|---|---|---|---|
| 0 | 692 | Arts, entertainment, and recreation | Mississippi | 71 | 28 |
| 1 | 2343 | Arts, entertainment, and recreation | Missouri | 71 | 29 |
| 2 | 1277 | Arts, entertainment, and recreation | Montana | 71 | 30 |
| 3 | 966 | Arts, entertainment, and recreation | Nebraska | 71 | 31 |
| 4 | 1759 | Arts, entertainment, and recreation | Nevada | 71 | 32 |
