### Census Data API

#### Getting your Census API Key

Go to: https://www.census.gov/data/developers/about/terms-of-service.html

Scroll down until you see (or click here):

<a href="https://api.census.gov/data/key_signup.html"><img src="images/CensusAPI_Key.png" width="150" height="150"></a>

Follow that link, then enter an organization name and email address.

You will quickly receive an email with a confirmation of your request, including an API Key and a link to activate it. 

Click the link to activate the key, then store the key in a text file. For consistency we named ours CensusAPI_key.txt, and we use that name throughout this project. Keep the email so you can recover the key if you lose the file.


#### Begin using the API

Read your API key from the text file and store it in a variable.

The query URL always starts with: "https://api.census.gov/data"

We then add the year (2022 in this case) and the dataset (acs/acs5, the ACS 5-year estimates).

Search for variable names here: https://api.census.gov/data/2022/acs/acs5/variables.html

As a check, we'll request the male and female populations under 5. The variables are:

- B01001_003E - Estimate!!Total:!!Male:!!Under 5 years
    
- B01001_027E - Estimate!!Total:!!Female:!!Under 5 years

We'll use a census tract near UChicago, tract 4107, for the first request.

In [1]:
import requests

CensusAPI_fn = "CensusAPI_key.txt"

with open(CensusAPI_fn, "r") as file:
    api_key = file.readline().strip()
    
host = "https://api.census.gov/data"
dataset = "acs/acs5"

year = "2022"

In [2]:
geography = 'TRACT:410700'
state = '17'  # Illinois state code
county = '031'  # Cook County code

variables = "NAME,B01001_003E,B01001_027E"

# Construct the API URL
url = f'{host}/{year}/{dataset}?get={variables}&for={geography}&in=state:{state}+county:{county}&key={api_key}'

response = requests.get(url)
response.raise_for_status()  # stop on an HTTP error instead of parsing an error page
data_4107 = response.json()

In [3]:
data_4107

[['NAME', 'B01001_003E', 'B01001_027E', 'state', 'county', 'tract'],
 ['Census Tract 4107; Cook County; Illinois',
  '0',
  '34',
  '17',
  '031',
  '410700']]
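
The response is a list of lists: the first row holds the column names and each later row holds one geography, with every value returned as a string. A minimal sketch of pairing the two, using the response shown above (the helper name `rows_to_records` is ours, not part of any library):

```python
def rows_to_records(data):
    """Pair the header row with each data row, giving one dict per geography."""
    header, *rows = data
    return [dict(zip(header, row)) for row in rows]


# The response shown above, copied in so the example runs without a request.
data_4107 = [
    ["NAME", "B01001_003E", "B01001_027E", "state", "county", "tract"],
    ["Census Tract 4107; Cook County; Illinois", "0", "34", "17", "031", "410700"],
]

record = rows_to_records(data_4107)[0]

# Values arrive as strings, so convert before doing arithmetic.
under5 = int(record["B01001_003E"]) + int(record["B01001_027E"])
print(record["NAME"], "children under 5:", under5)
```

Looking values up by variable code avoids counting positions in the row, which gets error-prone once the request grows past a few variables.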

In [4]:
url

'https://api.census.gov/data/2022/acs/acs5?get=NAME,B01001_003E,B01001_027E&for=TRACT:410700&in=state:17+county:031&key=<YOUR_API_KEY>'
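
Printing the full URL also prints the API key, which is easy to leak when a notebook is shared. A small sketch, assuming the same query layout as above: `build_url` and `mask_key` are hypothetical helper names, and the URL is assembled with the standard library's `urlencode` rather than an f-string.

```python
import re
from urllib.parse import urlencode

HOST = "https://api.census.gov/data"


def build_url(year, dataset, variables, geography, within, api_key):
    """Assemble a query URL with the same parts as the f-string version."""
    query = urlencode(
        {"get": variables, "for": geography, "in": within, "key": api_key},
        safe=":,*",  # keep the separators readable; spaces become '+'
    )
    return f"{HOST}/{year}/{dataset}?{query}"


def mask_key(url):
    """Return the URL with the value of the key parameter hidden."""
    return re.sub(r"([?&]key=)[^&]*", r"\1<hidden>", url)


url = build_url(
    "2022",
    "acs/acs5",
    "NAME,B01001_003E,B01001_027E",
    "TRACT:410700",
    "state:17 county:031",
    "not-a-real-key",  # placeholder, not a working key
)
print(mask_key(url))
```

Displaying `mask_key(url)` instead of `url` keeps the query inspectable while leaving the key out of saved notebook output.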

#### Illinois tract level data

This worked, so we can move on to pulling tract-level data for all of Illinois.

We'll also decide which other variables our analysis needs.

We wanted socioeconomic data, including whether a tract is majority-minority, the poverty rate, the homeownership rate, mobility, and educational attainment.

Here are the codes we will use to retrieve this:

#### Children under 5
- B01001_003E - Males under 5 years

- B01001_027E - Females under 5 years

#### Demographics
- B02001_001E - Total Population

- B02001_002E - White Alone

- B02001_003E - Black Alone

- B02001_005E - Asian Alone

- B03001_003E - Hispanic or Latino (any race)


#### Income/Poverty Status
- B17001_002E - Population with income below the poverty level in the past 12 months

- B19013_001E - Median household income

#### Home Ownership rates

- B25002_002E - Occupied housing units

- B25003_002E - Owner-occupied housing units


#### Mobility
- B07003_004E - Lived in same house 1 year ago

#### Educational Attainment
- B15003_001E - Total population 25 years and over

- B16010_002E - Population 25 years and over with less than a high school diploma

- B16010_041E - Population 25 years and over with a bachelor's degree or higher
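
With this many codes, the request string and the column names can drift apart. One option, sketched here with a hypothetical `VARIABLE_NAMES` mapping, is to keep each code next to a readable name and derive the `get=` string from it. The names are our own labels, not Census identifiers.

```python
# Census variable code -> readable column name (names are our own labels).
VARIABLE_NAMES = {
    "NAME": "DETAILS",
    "B01001_003E": "MALES_UNDER5",
    "B01001_027E": "FEMALES_UNDER5",
    "B02001_001E": "TOTPOP",
    "B02001_002E": "WHITE",
    "B02001_003E": "BLACK",
    "B02001_005E": "ASIAN",
    "B03001_003E": "HISPANIC",
    "B17001_002E": "BELOW_POVERTY_LINE",
    "B19013_001E": "MEDIAN_INCOME",
    "B25002_002E": "TOTAL_OCCUPIED_HOUSES",
    "B25003_002E": "HOMEOWNER_OCCUPIED_HOUSES",
    "B07003_004E": "SAME_HOUSE_AS_LAST_YEAR",
    "B15003_001E": "POP_OVER25",
    "B16010_002E": "LESS_THAN_HS",
    "B16010_041E": "BACHELOR_OR_GREATER",
}

# Dicts preserve insertion order, so the codes come out in the order listed.
variables = ",".join(VARIABLE_NAMES)

# A tract-level response carries state, county, and tract after the
# requested variables, so their names go at the end.
col_names = list(VARIABLE_NAMES.values()) + ["STATE", "COUNTY", "TRACT"]
print(variables)
```

Adding or dropping a variable then means editing one dictionary entry rather than two separate lists.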

In [27]:
geography = 'TRACT:*'
state = '17'  # Illinois state code

variables = "NAME,B01001_003E,B01001_027E,B02001_001E,B02001_002E,B02001_003E,B02001_005E,B03001_003E,B17001_002E,B19013_001E,B25002_002E,B25003_002E,B07003_004E,B15003_001E,B16010_002E,B16010_041E"

# Construct the API URL
url = f'{host}/{year}/{dataset}?get={variables}&for={geography}&in=state:{state}&key={api_key}'

response = requests.get(url)
response.raise_for_status()  # stop on an HTTP error instead of parsing an error page
data = response.json()

In [28]:
data[:5]

[['NAME',
  'B01001_003E',
  'B01001_027E',
  'B02001_001E',
  'B02001_002E',
  'B02001_003E',
  'B02001_005E',
  'B03001_003E',
  'B17001_002E',
  'B19013_001E',
  'B25002_002E',
  'B25003_002E',
  'B07003_004E',
  'B15003_001E',
  'B16010_002E',
  'B16010_041E',
  'state',
  'county',
  'tract'],
 ['Census Tract 1; Adams County; Illinois',
  '112',
  '121',
  '4509',
  '4028',
  '263',
  '60',
  '0',
  '440',
  '61595',
  '2377',
  '2055',
  '3827',
  '3443',
  '341',
  '810',
  '17',
  '001',
  '000100'],
 ['Census Tract 2.01; Adams County; Illinois',
  '49',
  '61',
  '1968',
  '1777',
  '134',
  '9',
  '32',
  '192',
  '44583',
  '887',
  '707',
  '1857',
  '1455',
  '141',
  '390',
  '17',
  '001',
  '000201'],
 ['Census Tract 2.02; Adams County; Illinois',
  '54',
  '31',
  '2473',
  '2171',
  '110',
  '35',
  '51',
  '257',
  '66472',
  '991',
  '679',
  '2025',
  '1491',
  '106',
  '470',
  '17',
  '001',
  '000202'],
 ['Census Tract 4; Adams County; Illinois',
  '145',
  '63'

In [29]:
col_names = ["DETAILS",
             "MALES_UNDER5",
             "FEMALES_UNDER5",
             "TOTPOP",
             "WHITE",
             "BLACK",
             "ASIAN",
             "HISPANIC",
             "BELOW_POVERTY_LINE",
             "MEDIAN_INCOME",
             "TOTAL_OCCUPIED_HOUSES",
             "HOMEOWNER_OCCUPIED_HOUSES",
             "SAME_HOUSE_AS_LAST_YEAR",
             "POP_OVER25",
             "LESS_THAN_HS",
             "BACHELOR_OR_GREATER",
             "STATE",
             "COUNTY",
             "TRACT"]


In [30]:
import pandas as pd

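# data[0] holds the API's variable codes; replace them with readable names
df = pd.DataFrame(data[1:], columns=col_names)
file_path = "data/Census_data_raw.csv"  # the data/ directory must already exist
df.to_csv(file_path, index=False)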
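
Every value in the saved file is still a string, so the rates we are after need a numeric conversion first. Below is a rough sketch in plain Python on the first Illinois row from the output above; the same arithmetic applies column by column once the DataFrame columns are numeric. The definitions are our assumptions: the poverty rate divides by total population (the stricter denominator, B17001_001E, is not among the variables requested), and a tract counts as majority-minority when White Alone is under half of the population. `share` and `tract_rates` are hypothetical helper names.

```python
def share(part, whole):
    """Return part / whole, or None for an empty tract (zero denominator)."""
    return part / whole if whole else None


def tract_rates(row):
    """Turn raw string counts for one tract into rates."""
    n = {key: int(value) for key, value in row.items()}
    white_share = share(n["WHITE"], n["TOTPOP"])
    return {
        # Assumption: total population stands in for the poverty universe.
        "poverty_rate": share(n["BELOW_POVERTY_LINE"], n["TOTPOP"]),
        "homeownership_rate": share(
            n["HOMEOWNER_OCCUPIED_HOUSES"], n["TOTAL_OCCUPIED_HOUSES"]
        ),
        "bachelor_rate": share(n["BACHELOR_OR_GREATER"], n["POP_OVER25"]),
        # Assumption: majority-minority means White Alone is under 50%.
        "majority_minority": None if white_share is None else white_share < 0.5,
    }


# Census Tract 1, Adams County, copied from the response shown earlier.
first_tract = {
    "TOTPOP": "4509",
    "WHITE": "4028",
    "BELOW_POVERTY_LINE": "440",
    "TOTAL_OCCUPIED_HOUSES": "2377",
    "HOMEOWNER_OCCUPIED_HOUSES": "2055",
    "POP_OVER25": "3443",
    "BACHELOR_OR_GREATER": "810",
}

rates = tract_rates(first_tract)
print(rates)
```

Returning `None` for a zero denominator keeps unpopulated tracts from raising a `ZeroDivisionError`; whether to drop or keep those tracts is a separate analysis decision.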