### Data Gathering: Seattle Demographics

The data was obtained through the API made available by the City of Seattle. The source dataset combines Community Reporting Area (CRA) boundaries with American Community Survey (ACS) data and census reports, and it powers an interactive map of neighborhood demographics in Seattle. The dataset belongs to the ACS 5-Year Series (this release covers 2013-2017) and was last updated in January 2020.

The dataset can be found [here](https://data-seattlecitygis.opendata.arcgis.com/datasets/a-community-reporting-areas-profile-acs-5-year-2013-2017?orderBy=PROFILE_NAME), and the API interface can be found [here](https://data-seattlecitygis.opendata.arcgis.com/datasets/a-community-reporting-areas-profile-acs-5-year-2013-2017/geoservice?orderBy=PROFILE_NAME). 

#### Imports

In [1]:
import pandas as pd
import numpy as np 
import requests 

from bs4 import BeautifulSoup

In [2]:
# Link generated by City of Seattle's API 
# API interface can be found at the link indicated in the top section
source = "https://gisrevprxy.seattle.gov/arcgis/rest/services/CENSUS_EXT/CRA_ACS_5Y17/MapServer/0/query?where=1%3D1&outFields=OBJECTID,CRA_NO,CRA_GRP,GEN_ALIAS,DETL_NAMES,NEIGHDIST,AREA_SQMI,DISPLAY_NAME,TOTAL_POPULATION,MEDIAN_AGE,HOUSEHOLDS,FAMILY_HOUSEHOLDS,PCT_FAM_HH,NONFAMILY_HOUSEHOLDS,TOTAL_HOUSING_UNITS,OCCUPIED_HOUSING_UNITS,PCT_OCC_HU,VACANT_HOUSING_UNITS,PCT_VACANT_HU,OWNER_OCCUPIED_HOUSING_UNITS,PCT_OWN_OCC_HU,RENTER_OCCUPIED_HOUSING_UNITS,PCT_RENT_OCC_HU,MEDIAN_GROSS_RENT,RENT_GRAPI_COMPUTED,PERCENT_UNEMPLOYED,POP_DENSITY_ACRE,HU_DENSITY_ACRE,HH_DENSITY_ACRE,SHAPE_Area,PCT_ALL_FAMILY_UNDER_POVERTY,PCT_POPULATION_UNDER_POVERTY,HU_VALUE_MEDIAN_DOLLARS,AVERAGE_HOUSEHOLD_SIZE,CIVILIAN_LABOR_FORCE_UNEMPLOYD,CIVILIAN_LABOR_FORCE,CIVILIAN_LABOR_FORCE_EMPLOYED,NOT_IN_LABOR_FORCE,MEDIAN_HH_INC_PAST_12MO_DOLLAR&returnGeometry=false&outSR=4326&f=json"
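The generated link packs every query option into one long string. As a minimal sketch of how the same kind of URL can be assembled from named parameters, the block below uses the standard library's `urlencode`. The base endpoint and parameter names are copied from the link above; `BASE_URL`, `out_fields`, `params`, and `query_url` are illustrative names, and the field list is shortened for readability.

```python
from urllib.parse import urlencode

# Base query endpoint, copied from the generated link above
BASE_URL = (
    "https://gisrevprxy.seattle.gov/arcgis/rest/services/"
    "CENSUS_EXT/CRA_ACS_5Y17/MapServer/0/query"
)

# A short subset of the fields, for illustration only
out_fields = ["OBJECTID", "CRA_NO", "GEN_ALIAS", "TOTAL_POPULATION", "MEDIAN_GROSS_RENT"]

params = {
    "where": "1=1",  # condition that matches every row
    "outFields": ",".join(out_fields),
    "returnGeometry": "false",
    "outSR": 4326,
    "f": "json",
}

# safe="," leaves the commas in outFields unescaped, as in the generated link
query_url = f"{BASE_URL}?{urlencode(params, safe=',')}"
print(query_url)
```

Keeping the field list in a Python list makes it easier to add or drop columns than editing the URL by hand.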


In [3]:
# checking success of request
response = requests.get(source)

response.status_code

200
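A 200 status only confirms that the HTTP exchange succeeded. Some JSON APIs may report a failed query inside the response body instead, so it can be worth checking the parsed payload as well. The sketch below assumes a failure would appear as a top-level `"error"` key; `extract_features` is a hypothetical helper, and both payloads are hand-written stand-ins rather than real API output.

```python
def extract_features(payload):
    """Return the list of features, or raise if the payload looks like an error."""
    # Assumption: a failed query is reported as a top-level "error" object
    if "error" in payload:
        raise ValueError(f"query failed: {payload['error']}")
    features = payload.get("features", [])
    if not features:
        raise ValueError("query returned no features")
    return features


# Hand-written stand-in payloads, not real API output
ok_payload = {"features": [{"attributes": {"OBJECTID": 1}}]}
bad_payload = {"error": {"code": 400, "message": "Invalid query"}}

print(len(extract_features(ok_payload)))
```

In the notebook, the helper would be called on `response.json()` before any further processing.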

In [4]:
# the endpoint returns JSON (f=json), so the body is parsed directly; no HTML parsing is needed
seattle = response.json()

# displaying all features gathered from this dataset
seattle["features"][0]["attributes"].keys()

dict_keys(['OBJECTID', 'CRA_NO', 'CRA_GRP', 'GEN_ALIAS', 'DETL_NAMES', 'NEIGHDIST', 'AREA_SQMI', 'DISPLAY_NAME', 'TOTAL_POPULATION', 'MEDIAN_AGE', 'HOUSEHOLDS', 'FAMILY_HOUSEHOLDS', 'PCT_FAM_HH', 'NONFAMILY_HOUSEHOLDS', 'TOTAL_HOUSING_UNITS', 'OCCUPIED_HOUSING_UNITS', 'PCT_OCC_HU', 'VACANT_HOUSING_UNITS', 'PCT_VACANT_HU', 'OWNER_OCCUPIED_HOUSING_UNITS', 'PCT_OWN_OCC_HU', 'RENTER_OCCUPIED_HOUSING_UNITS', 'PCT_RENT_OCC_HU', 'MEDIAN_GROSS_RENT', 'RENT_GRAPI_COMPUTED', 'PERCENT_UNEMPLOYED', 'POP_DENSITY_ACRE', 'HU_DENSITY_ACRE', 'HH_DENSITY_ACRE', 'SHAPE_Area', 'PCT_ALL_FAMILY_UNDER_POVERTY', 'PCT_POPULATION_UNDER_POVERTY', 'HU_VALUE_MEDIAN_DOLLARS', 'AVERAGE_HOUSEHOLD_SIZE', 'CIVILIAN_LABOR_FORCE_UNEMPLOYD', 'CIVILIAN_LABOR_FORCE', 'CIVILIAN_LABOR_FORCE_EMPLOYED', 'NOT_IN_LABOR_FORCE', 'MEDIAN_HH_INC_PAST_12MO_DOLLAR'])

#### Formatting and DataFrame Creation

In [8]:
# create list of column names
headers = list(seattle["features"][0]["attributes"])

# create dictionary to populate data with
df_dict = {}

for head in headers:
    df_dict[head] = []

In [9]:
type(df_dict["OBJECTID"])

list

In [10]:
# append each feature's attribute values to the matching column list
for feature in seattle["features"]:
    for column in df_dict:
        df_dict[column].append(feature["attributes"][column])

In [11]:
df = pd.DataFrame(df_dict)
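The dictionary-of-lists approach above works as written. As an alternative sketch, pandas can also build a DataFrame directly from a list of row dictionaries, which avoids the intermediate dictionary. The `sample` payload below is a hand-written stand-in with the same shape as the API response, and its values are placeholders rather than real data.

```python
import pandas as pd

# Hand-written stand-in with the same shape as the API response; values are placeholders
sample = {
    "features": [
        {"attributes": {"OBJECTID": 1, "GEN_ALIAS": "Area A", "MEDIAN_GROSS_RENT": 1000}},
        {"attributes": {"OBJECTID": 2, "GEN_ALIAS": "Area B", "MEDIAN_GROSS_RENT": 1200}},
    ]
}

# one dictionary per row; pandas takes the column names from the keys
records = [feature["attributes"] for feature in sample["features"]]
sample_df = pd.DataFrame(records)
print(sample_df.shape)
```

Applied to the notebook's data, the same pattern would read `pd.DataFrame([f["attributes"] for f in seattle["features"]])`.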

In [14]:
df["MEDIAN_GROSS_RENT"]

0     1542
1     1476
2     1723
3     1490
4     1576
5     1336
6     1596
7     1519
8     1305
9     1283
10    1238
11    1316
12    1178
13    1045
14    1527
15    1549
16    1804
17     988
18     844
19    1199
20    1341
21    1563
22    1761
23    1476
24    1058
25    1138
26     959
27    1387
28    1503
29    1358
30    1350
31    1316
32    1639
33    1530
34    1361
35    1299
36    1573
37    1950
38     776
39    1445
40     928
41    1142
42    1739
43    1487
44    1561
45    1391
46    1667
47    1117
48    1398
49    1401
50    1465
51    1265
52    1569
Name: MEDIAN_GROSS_RENT, dtype: int64

In [13]:
df.to_csv("../datasets/seattle_demographics.csv", index=False)
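`to_csv` does not create missing folders, so the export fails if the target directory does not exist yet. The sketch below shows one way to create the directory first and then confirm the file reads back. It writes to a temporary directory so it can run anywhere; `demo_df` is a placeholder frame standing in for `df`.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Placeholder frame; in the notebook this would be `df`
demo_df = pd.DataFrame({"OBJECTID": [1, 2], "MEDIAN_GROSS_RENT": [1000, 1200]})

with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "datasets" / "seattle_demographics.csv"
    # create the target folder (and any parents) if it is missing
    out_path.parent.mkdir(parents=True, exist_ok=True)
    demo_df.to_csv(out_path, index=False)
    # read the file back to check that the export round-trips
    round_trip = pd.read_csv(out_path)

print(round_trip.head())
```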