# HFB Data Engineer Exercise 
Using the most timely and accurate data available on [https://www.census.gov/en.html](https://www.census.gov/en.html), write a Python script that pulls the following poverty data for Harris County, TX in the year 2018:

- Estimate of the number of people under the age of 18 in poverty
- Estimate of the number of people of any age in poverty
- Estimate of the median household income

Your script should then write this data to a .csv file, including appropriately named headers. When finished, push your code to a public git repo.

**Requirements**:

- Use an API key to authenticate your requests
- Use the requests library (no custom libraries that pull Census data)

## Loading Required Libraries and the API Key

In [109]:
import pandas as pd
import json
import requests

In [110]:
def get_keys(path):
    with open(path) as f:
        return json.load(f)

# Load the secrets file and take its third value as the Census API key
api_key = list((get_keys('/Users/weesn/Documents/Flatiron/secrets.json')).values())[2]
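
Selecting the key by position (`[2]`) depends on the order of the entries in the secrets file. A minimal sketch of looking the key up by name instead follows. The helper `load_api_key` and the entry name `census_api_key` are hypothetical, since the layout of `secrets.json` is not shown here.

```python
import json
import os
import tempfile


def load_api_key(path, name):
    """Return the value stored under `name` in a JSON secrets file."""
    with open(path) as f:
        secrets = json.load(f)
    if name not in secrets:
        raise KeyError(f"{name!r} not found in {path}")
    return secrets[name]


# Demonstration with a throwaway file; 'census_api_key' is a hypothetical entry name.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "secrets.json")
    with open(demo_path, "w") as f:
        json.dump({"other_key": "abc", "census_api_key": "not-a-real-key"}, f)
    demo_key = load_api_key(demo_path, "census_api_key")
```

A lookup by name fails loudly with a `KeyError` if the entry is missing, rather than silently returning a different secret.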

In [111]:
State = '48'    # FIPS code for Texas, as used by the API
County = '201'  # FIPS code for Harris County, as used by the API
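
The requests in this notebook embed the variable code, geography, and key directly in the URL. As a sketch, the same query can be expressed as a base URL plus a parameter dictionary. The helper `build_census_query` and the placeholder `YOUR_KEY` are hypothetical; the endpoint and parameter names are copied from the requests used in this notebook.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.census.gov/data/2018/acs/acs1"


def build_census_query(variable, state, county, api_key, profile=False):
    """Return (url, params) for one ACS 1-year variable in one county."""
    url = f"{BASE_URL}/profile" if profile else BASE_URL
    params = {
        "get": f"NAME,{variable}",
        "for": f"county:{county}",
        "in": f"state:{state}",
        "key": api_key,
    }
    return url, params


url, params = build_census_query("B19013_001E", "48", "201", "YOUR_KEY")
# Preview the query string; ':' and ',' are kept readable via `safe`.
print(url + "?" + urlencode(params, safe=":,"))
```

The pair can then be passed to `requests.get(url, params=params)`, which takes care of URL encoding.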

Further explanation: [How the Census Bureau Measures Poverty](https://www.census.gov/topics/income-poverty/poverty/guidance/poverty-measures.html).

## Estimate of the number of people under the age of 18 in poverty

In [115]:
Name = 'DP03_0129PE' 
"""Percent Estimate!!PERCENTAGE OF FAMILIES AND PEOPLE WHOSE INCOME IN THE PAST 12 MONTHS IS BELOW THE POVERTY LEVEL!!
All people!!Under 18 years"""

# get request
r_18uPovertyPercent = requests.get(f'https://api.census.gov/data/2018/acs/acs1/profile?get=NAME,{Name}&for=county:{County}&in=state:{State}&key={api_key}')
# Check the HTTP status code; 200 indicates a successful response
r_18uPovertyPercent.status_code 

200

In [116]:
#observing data
r_18uPovertyPercent.json() 

[['NAME', 'DP03_0129PE', 'state', 'county'],
 ['Harris County, Texas', '25.4', '48', '201']]

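The value is returned as a string holding a percentage (here `'25.4'`, meaning 25.4%), so it must be converted to a float and divided by 100 before it can be used as a fraction.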
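
The API returns a list of rows with the header row first. A small sketch of pairing the header with the data row and converting the percentage is shown here; it uses the response printed above as a literal so that it runs offline, and `first_record` is a hypothetical helper name.

```python
def first_record(table):
    """Pair the header row with the first data row of a Census API response."""
    header, row = table[0], table[1]
    return dict(zip(header, row))


# The response body shown above, copied as a literal so the sketch runs offline.
response_table = [
    ["NAME", "DP03_0129PE", "state", "county"],
    ["Harris County, Texas", "25.4", "48", "201"],
]

record = first_record(response_table)
percent_under_18 = float(record["DP03_0129PE"])  # "25.4" -> 25.4
fraction_under_18 = percent_under_18 / 100       # share of 1, for multiplication
```

Looking values up by variable code avoids relying on the `[1][1]` position of the value in the response.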

In [117]:
Name = 'B09001_001E'
"""Estimate!!Total
POPULATION UNDER 18 YEARS BY AGE"""

#get request
r_18uPopulation = requests.get(f'https://api.census.gov/data/2018/acs/acs1?get=NAME,{Name}&for=county:{County}&in=state:{State}&key={api_key}')
# Check the HTTP status code; 200 indicates a successful response
r_18uPopulation.status_code 

200

In [119]:
#observing data
r_18uPopulation.json()

[['NAME', 'B09001_001E', 'state', 'county'],
 ['Harris County, Texas', '1251684', '48', '201']]

Estimate of the number of people under the age of 18 in poverty = (total population under 18) * (percent of people under 18 below the poverty level / 100)

In [121]:
# Convert the string values to numbers before multiplying; round because a count of people must be a whole number
est_18u_poverty = round(int(r_18uPopulation.json()[1][1]) * (float(r_18uPovertyPercent.json()[1][1])/100))
f'Estimate of the number of people under the age of 18 in poverty is {est_18u_poverty} people.'

'Estimate of the number of people under the age of 18 in poverty is 317928 people.'
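
Because the published percentage carries only one decimal place, the product inherits some rounding uncertainty. As a rough worked check using only the two values returned above, and assuming the unrounded percentage lies between 25.35 and 25.45 (an assumption about rounding, not a Census margin of error):

```python
population_under_18 = 1251684   # B09001_001E, from the response above
percent_in_poverty = 25.4       # DP03_0129PE, from the response above

estimate = round(population_under_18 * percent_in_poverty / 100)

# Assumption: the published percent was rounded to one decimal place,
# so the unrounded value lies within +/- 0.05 percentage points.
low = population_under_18 * (percent_in_poverty - 0.05) / 100
high = population_under_18 * (percent_in_poverty + 0.05) / 100

print(estimate, round(low), round(high))
```

Under that assumption, rounding of the percentage alone can move the estimate by roughly 626 people in either direction.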

## Estimate of the number of people of any age in poverty

In [101]:
Name = 'B17001_002E'
"""Estimate!!Total!!Income in the past 12 months below poverty level"""

# get request
r_PopPoverty = requests.get(f'https://api.census.gov/data/2018/acs/acs1?get=NAME,{Name}&for=county:{County}&in=state:{State}&key={api_key}')
# Check the HTTP status code; 200 indicates a successful response
r_PopPoverty.status_code 

200

In [122]:
# Observing Data
r_PopPoverty.json()

[['NAME', 'B17001_002E', 'state', 'county'],
 ['Harris County, Texas', '771892', '48', '201']]

In [103]:
PopPoverty = int(r_PopPoverty.json()[1][1])
f'Estimate of the number of people of any age in poverty is {PopPoverty} people'

'Estimate of the number of people of any age in poverty is 771892 people'

## Estimate of the median household income

In [123]:
Name = 'B19013_001E'
"""Estimate!!Median household income in the past 12 months (in 2018 inflation-adjusted dollars)"""

# get Request
r_median_household_income = requests.get(f'https://api.census.gov/data/2018/acs/acs1?get=NAME,{Name}&for=county:{County}&in=state:{State}&key={api_key}')
# Check the HTTP status code; 200 indicates a successful response
r_median_household_income.status_code 

200

In [125]:
# Observing Data
r_median_household_income.json()

[['NAME', 'B19013_001E', 'state', 'county'],
 ['Harris County, Texas', '60232', '48', '201']]

In [97]:
median_household_income = int(r_median_household_income.json()[1][1])  # convert from str so the CSV holds a number, like PopPoverty
f'Estimate of the median household income is ${median_household_income}'

'Estimate of the median household income is $60232'

## Creating a CSV File Containing the Answers

In [126]:
# index=False keeps the DataFrame's row index out of the file, so every column has a named header
pd.DataFrame(data=[[est_18u_poverty, PopPoverty, median_household_income]],
             columns=['Estimate_of_the_number_of_people_under_the_age_of_18_in_poverty',
                      'Estimate_of_the_number_of_people_of_any_age_in_poverty',
                      'Estimate_of_the_median_household_income']).to_csv('./Exercise_Answers.csv', index=False)
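
As a cross-check that does not rely on pandas, the sketch below writes the same three values with the standard library `csv` module and reads them back. This is an alternative illustration, not the method used above. The file name `exercise_answers_check.csv` is hypothetical, and the values are copied from the results earlier in this notebook.

```python
import csv
import os
import tempfile

headers = [
    "Estimate_of_the_number_of_people_under_the_age_of_18_in_poverty",
    "Estimate_of_the_number_of_people_of_any_age_in_poverty",
    "Estimate_of_the_median_household_income",
]
values = [317928, 771892, 60232]  # results computed earlier in this notebook

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "exercise_answers_check.csv")  # hypothetical file name
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(headers)
        writer.writerow(values)
    # Read the file back to confirm the header and row round-trip.
    with open(path, newline="") as f:
        rows = list(csv.reader(f))
```

Note that `csv.reader` returns every field as a string, so the numbers come back as `'317928'`, `'771892'`, and `'60232'`.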