### Data Gathering: Seattle Demographics

The data was obtained through the API made available by the City of Seattle. The source dataset combines Community Reporting Area (CRA) boundaries with American Community Survey (ACS) data and census reports, and it backs an interactive map of neighborhood demographics in Seattle. It is part of the ACS 5-Year Series (this release covers 2013-2017) and was last updated in January 2020.

The dataset can be found [here](https://data-seattlecitygis.opendata.arcgis.com/datasets/a-community-reporting-areas-profile-acs-5-year-2013-2017?orderBy=PROFILE_NAME), and the API interface can be found [here](https://data-seattlecitygis.opendata.arcgis.com/datasets/a-community-reporting-areas-profile-acs-5-year-2013-2017/geoservice?orderBy=PROFILE_NAME). 

#### Imports

In [1]:
import pandas as pd
import numpy as np 
import requests 

from bs4 import BeautifulSoup

In [58]:
# Link generated by City of Seattle's API 
# API interface can be found at the link indicated in the top section
source = "https://gisrevprxy.seattle.gov/arcgis/rest/services/CENSUS_EXT/CRA_ACS_5Y17/MapServer/0/query?where=1%3D1&outFields=OBJECTID,CRA_NO,CRA_GRP,GEN_ALIAS,DETL_NAMES,NEIGHDIST,AREA_SQMI,DISPLAY_NAME,TOTAL_POPULATION,MEDIAN_AGE,HOUSEHOLDS,FAMILY_HOUSEHOLDS,PCT_FAM_HH,NONFAMILY_HOUSEHOLDS,TOTAL_HOUSING_UNITS,OCCUPIED_HOUSING_UNITS,PCT_OCC_HU,VACANT_HOUSING_UNITS,PCT_VACANT_HU,OWNER_OCCUPIED_HOUSING_UNITS,PCT_OWN_OCC_HU,RENTER_OCCUPIED_HOUSING_UNITS,PCT_RENT_OCC_HU,MEDIAN_GROSS_RENT,RENT_GRAPI_COMPUTED,PERCENT_UNEMPLOYED,POP_DENSITY_ACRE,HU_DENSITY_ACRE,HH_DENSITY_ACRE,SHAPE_Area,PCT_ALL_FAMILY_UNDER_POVERTY,PCT_POPULATION_UNDER_POVERTY,HU_VALUE_MEDIAN_DOLLARS,AVERAGE_HOUSEHOLD_SIZE,CIVILIAN_LABOR_FORCE_UNEMPLOYD,CIVILIAN_LABOR_FORCE,CIVILIAN_LABOR_FORCE_EMPLOYED,NOT_IN_LABOR_FORCE,MEDIAN_HH_INC_PAST_12MO_DOLLAR&returnGeometry=false&outSR=4326&f=json"
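The long query string above can be easier to read and adjust when it is assembled from a parameter dictionary. The following is a minimal sketch using only the standard library; the `build_query_url` helper and the shortened field list are illustrative assumptions and are not part of the original notebook.

```python
from urllib.parse import urlencode

# Base endpoint copied from the `source` URL (everything before the "?")
BASE_URL = (
    "https://gisrevprxy.seattle.gov/arcgis/rest/services/"
    "CENSUS_EXT/CRA_ACS_5Y17/MapServer/0/query"
)


def build_query_url(fields, base_url=BASE_URL):
    """Hypothetical helper: assemble the query URL from a list of field names."""
    params = {
        "where": "1=1",              # no row filter: return every reporting area
        "outFields": ",".join(fields),
        "returnGeometry": "false",   # attributes only, no boundary geometry
        "outSR": "4326",
        "f": "json",
    }
    # safe="," keeps the commas in outFields literal instead of encoding them as %2C
    return f"{base_url}?{urlencode(params, safe=',')}"


# Shortened field list for illustration; the notebook requests many more fields
example_url = build_query_url(["OBJECTID", "GEN_ALIAS", "TOTAL_POPULATION"])
print(example_url)
```

Keeping the field names in a Python list makes it simpler to add or drop a column without editing a single very long string.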


In [59]:
# checking success of request
response = requests.get(source)

response.status_code

200
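A `200` status shows that the HTTP exchange succeeded, but it may not guarantee a usable payload: ArcGIS REST services commonly report a failed query inside the JSON body under an `error` key. The sketch below checks the parsed body before it is used. The `validate_payload` helper, the assumed error shape, and both sample payloads are hypothetical and shown for illustration only.

```python
def validate_payload(payload):
    """Hypothetical helper: raise if the parsed JSON does not look like a query result."""
    if "error" in payload:
        # Assumed error shape: {"error": {"code": ..., "message": ...}}
        err = payload["error"]
        raise ValueError(
            f"Service returned an error: {err.get('code')} {err.get('message')}"
        )
    if not payload.get("features"):
        raise ValueError("Response contains no features")
    return payload["features"]


# Made-up payloads for illustration only
good = {"features": [{"attributes": {"OBJECTID": 1}}]}
bad = {"error": {"code": 400, "message": "Invalid query"}}

print(len(validate_payload(good)))

try:
    validate_payload(bad)
except ValueError as exc:
    print(exc)
```

In the notebook this would be called as `validate_payload(response.json())`, optionally after `response.raise_for_status()` to surface non-2xx responses as exceptions.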

In [60]:
# the endpoint returns JSON (f=json), so the body is parsed directly; no HTML parsing is needed
seattle = response.json()

# displaying all features gathered from this dataset
seattle["features"][0]["attributes"].keys()

dict_keys(['OBJECTID', 'CRA_NO', 'CRA_GRP', 'GEN_ALIAS', 'DETL_NAMES', 'NEIGHDIST', 'AREA_SQMI', 'DISPLAY_NAME', 'TOTAL_POPULATION', 'MEDIAN_AGE', 'HOUSEHOLDS', 'FAMILY_HOUSEHOLDS', 'PCT_FAM_HH', 'NONFAMILY_HOUSEHOLDS', 'TOTAL_HOUSING_UNITS', 'OCCUPIED_HOUSING_UNITS', 'PCT_OCC_HU', 'VACANT_HOUSING_UNITS', 'PCT_VACANT_HU', 'OWNER_OCCUPIED_HOUSING_UNITS', 'PCT_OWN_OCC_HU', 'RENTER_OCCUPIED_HOUSING_UNITS', 'PCT_RENT_OCC_HU', 'MEDIAN_GROSS_RENT', 'RENT_GRAPI_COMPUTED', 'PERCENT_UNEMPLOYED', 'POP_DENSITY_ACRE', 'HU_DENSITY_ACRE', 'HH_DENSITY_ACRE', 'SHAPE_Area', 'PCT_ALL_FAMILY_UNDER_POVERTY', 'PCT_POPULATION_UNDER_POVERTY', 'HU_VALUE_MEDIAN_DOLLARS', 'AVERAGE_HOUSEHOLD_SIZE', 'CIVILIAN_LABOR_FORCE_UNEMPLOYD', 'CIVILIAN_LABOR_FORCE', 'CIVILIAN_LABOR_FORCE_EMPLOYED', 'NOT_IN_LABOR_FORCE', 'MEDIAN_HH_INC_PAST_12MO_DOLLAR'])

#### Formatting and DataFrame Creation

In [29]:
# create list of column names from the first feature's attributes
headers = list(seattle["features"][0]["attributes"])

# create dictionary to populate data with: one empty list per column
df_dict = {}

for head in headers:
    df_dict[head] = []

In [48]:
type(df_dict["OBJECTID"])

list

In [50]:
# append each feature's attribute values to the matching column list
for feature in seattle["features"]:
    for column in df_dict:
        df_dict[column].append(feature["attributes"][column])

In [52]:
df = pd.DataFrame(df_dict)
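The dictionary-of-lists approach above works as written. As an alternative sketch, `pd.DataFrame` also accepts a list of row dictionaries, so the `attributes` of each feature can be passed in directly. The `sample` payload below is made up; only its nesting mirrors the real response.

```python
import pandas as pd

# Made-up stand-in for the parsed response; values are not real data
sample = {
    "features": [
        {"attributes": {"OBJECTID": 1, "GEN_ALIAS": "Area A", "TOTAL_POPULATION": 100}},
        {"attributes": {"OBJECTID": 2, "GEN_ALIAS": "Area B", "TOTAL_POPULATION": 200}},
    ]
}

# Each feature's "attributes" dict is one row; column names come from its keys
records = [feature["attributes"] for feature in sample["features"]]
sample_df = pd.DataFrame(records)

print(sample_df.shape)
```

With the real payload, the equivalent call would be `pd.DataFrame([f["attributes"] for f in seattle["features"]])`.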

In [53]:
df.head()

| | OBJECTID | CRA_NO | CRA_GRP | GEN_ALIAS | DETL_NAMES | NEIGHDIST | AREA_SQMI | DISPLAY_NAME | TOTAL_POPULATION | MEDIAN_AGE | ... | SHAPE_Area | PCT_ALL_FAMILY_UNDER_POVERTY | PCT_POPULATION_UNDER_POVERTY | HU_VALUE_MEDIAN_DOLLARS | AVERAGE_HOUSEHOLD_SIZE | CIVILIAN_LABOR_FORCE_UNEMPLOYD | CIVILIAN_LABOR_FORCE | CIVILIAN_LABOR_FORCE_EMPLOYED | NOT_IN_LABOR_FORCE | MEDIAN_HH_INC_PAST_12MO_DOLLAR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 10.4 | 10 | Ballard | Ballard, West Woodland, Adams | Ballard | 0.77 | CRA - Ballard | 8649 | 34.3 | ... | 21472310.0 | 0.0 | 7.0 | 543200 | 1.62 | 324 | 6313 | 5989 | 1736 | 79162 |
| 1 | 2 | 10.1 | 10 | North Beach/Blue Ridge | Crown Hill, North Beach, Blue Ridge | Ballard | 2.01 | CRA - North Beach-Blue Ridge | 12701 | 42.6 | ... | 55950010.0 | 6.6 | 7.8 | 658600 | 2.38 | 268 | 7787 | 7519 | 2666 | 94804 |
| 2 | 3 | 7.1 | 7 | Montlake/Portage Bay | Montlake, Portage Bay, Interlaken Park, Eastla... | Northeast | 1.49 | CRA - Montlake-Portage Bay | 9732 | 37.3 | ... | 41429080.0 | 2.0 | 4.6 | 821250 | 2.09 | 110 | 6518 | 6408 | 1941 | 132573 |
| 3 | 4 | 12.2 | 12 | Interbay | Interbay | Magnolia/Queen Anne | 1.9 | CRA - Interbay | 11024 | 34.4 | ... | 52907760.0 | 4.5 | 8.7 | 571300 | 1.92 | 334 | 7885 | 7551 | 1675 | 74679 |
| 4 | 5 | 6.3 | 6 | North Capitol Hill | North Capitol Hill, Capitol Hill, North Broadway | East | 0.44 | CRA - North Capitol Hill | 4807 | 36.1 | ... | 12356840.0 | 1.2 | 2.3 | 896200 | 1.93 | 149 | 3415 | 3266 | 759 | 96220 |


In [56]:
df.to_csv("../datasets/seattle_demographics.csv", index=False)
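Passing `index=False` matters when the file is read back later. The sketch below, which uses a made-up two-row frame and an in-memory buffer instead of a file on disk, shows that writing the default index produces an extra `Unnamed: 0` column on reload. The `round_trip` helper is hypothetical.

```python
import io

import pandas as pd

# Made-up frame for illustration; values are not real data
demo = pd.DataFrame({"OBJECTID": [1, 2], "GEN_ALIAS": ["Area A", "Area B"]})


def round_trip(frame, **to_csv_kwargs):
    """Hypothetical helper: write a frame to CSV in memory and read it back."""
    buffer = io.StringIO()
    frame.to_csv(buffer, **to_csv_kwargs)
    buffer.seek(0)
    return pd.read_csv(buffer)


with_index = round_trip(demo)                   # default: index is written
without_index = round_trip(demo, index=False)   # matches the call in the notebook

print(list(with_index.columns))     # ['Unnamed: 0', 'OBJECTID', 'GEN_ALIAS']
print(list(without_index.columns))  # ['OBJECTID', 'GEN_ALIAS']
```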