# Data Processing for PUMS Data

This notebook is used to filter and recode the PUMS dataset, run logistic regressions, and export the summarized data for use in Observable.

In [1]:
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn import metrics
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

Documentation: https://august.csscr.washington.edu/~data/ACS/5-Year/2015-2019/2019ACS-PUMS/WA/document%20files/

PUMS:
* Housing file:
* NP: number of persons associated with this record
* GRPIP: gross rent as a percentage of household income (int)
* FINCP: family income past 12 months (int)
* FPARC: family presence (char): b = not a family, 4 = no related children, 1-3 has children
* HINCP: household income last 12 months
* HUPAC: presence and age of children: 4 = no children, b = N/A, 1-3 with children
* MULTG: multigenerational household (char): 1 = no, 2 = yes, b = NA
* MV: when moved into house (char): 1. 12 months or less, 7. 30 years or more
* 
* Person:
* WAGP: Wages or salary income past 12 months (integers)
* AGEP: age
* RAC1P: detailed race code: 1 white alone, 2 black alone, 3-5 native americans alone, 7 native islander alone, 9 two or more races
* RACAIAN: american indian/alaskan native (alone or in combination with others): char, 1 = yes
* RACBLK: black (1+): 1
* RACNH: native Hawaiian
* RACWHT: white recode
* RC: related child: 1 = yes
* SCIENGP: Field of degree science and engineering flag - NSF definition (yes = 1)
* SCIENGRLP: Field of degree science and engineering related flag (yes = 1)
* SCHL: education attainment char: 21 bachelors, 24 doctorate
* SEX: 1 male, 2 female
* POVPIP: Income-to-poverty ratio recode 501, .501 percent or more, 0-500 below 501%

Washington poverty:
* Household size 1: 24980
* 2: 33820
* 3: 42660
* 4: 51500
* 5: 60340
* 6: 69180
* 20: 192940 (around 8k per step)

Merge on serialno

https://august.csscr.washington.edu/~data/ACS/5-Year/2015-2019/2019ACS-PUMS/WA/document%20files/2019%20ACS%205-year%20PUMS_README.pdf

Weighting:
* PWGTP: Person's weight for generating statistics on individuals (such as age).
* WGTP: Household weight for generat

To produce estimates or tabulations of characteristics from the PUMS, add the weights of all
persons or HUs that possess the characteristic of interest.2
 For instance, if the characteristic of
interest is “total number of black teachers”, simply determine the race and occupation of all
persons and cumulate the weights of those who match the characteristics of interest. To obtain
estimates of proportions, divide the weighted estimate of persons or HUs with a given
characteristic by the weighted estimate of the denominator. For example, the proportion of
“black teachers” is obtained by dividing the weighted estimate of black teachers by the estimate
of teachers.

In [140]:
# Person file

df = pd.read_csv("/Users/chaya/Downloads/WA20195y_person.csv", low_memory=False)
# df.describe(include="PUMA")
to_keep = [
    'RT', 'SERIALNO', 'SPORDER', 'PUMA', 'ADJINC', 'PWGTP',
    'AGEP', 'WAGP', 'RAC1P', 'RACAIAN', 'RACBLK', 'RACNH',
    'RACWHT', 'RC', 'SCIENGP', 'SCIENGRLP', 'SCHL',
    'SEX', 'POVPIP', 'SOCP' # removed OCCP
]
df_seattle = df[(df.PUMA >= 11601) & (df.PUMA <= 11605) & (df.AGEP >= 18)]
df_seattle[to_keep].to_csv("/Users/chaya/Downloads/seattle_people.csv")



In [141]:
# Read reduced file back in
df = pd.read_csv("/Users/chaya/Downloads/seattle_people.csv", low_memory=False)

# Add categorical labels
# Categorize variables

df['SOCP_str'] = df['SOCP'].astype('str')

def tech(x):
    if x[0:2] == '15':
        return 1
    else:
        return 0

df['tech'] = df['SOCP_str'].apply(tech)

sw_seattle = 'PUMA == 11605'
cap_hill = 'PUMA == 11604'
queen_anne = 'PUMA == 11603'
ne_seattle = 'PUMA = 11602'
nw_seattle = 'PUMA = 11601'

def qa(x):
    return 1 if x == 11603 else 0
def ch(x):
    return 1 if x == 11604 else 0
def sw(x):
    return 1 if x == 11605 else 0
def ne(x):
    return 1 if x == 11602 else 0
def nw(x):
    return 1 if x == 11601 else 0
def heq(x):
    if (x == 11603 or x == 11601):
        return 1
    elif (x == 11605 or x == 1604):
        return 0
    else:
        return -1

df['queen_anne'] = df['PUMA'].apply(qa)
df['cap_hill'] = df['PUMA'].apply(ch)
df['sw_seattle'] = df['PUMA'].apply(sw)
df['ne_seattle'] = df['PUMA'].apply(ne)
df['nw_seattle'] = df['PUMA'].apply(nw)
df['high_equity_area'] = df['PUMA'].apply(heq)

category = pd.cut(df.AGEP,bins=[-1,34,100],labels=[0,1])
df.insert(1,'age_18_34',category)

category = pd.cut(df.WAGP,bins=[-1,199999,1000000],labels=[0,1])
df.insert(1,'high_income',category)

category = pd.cut(df.WAGP,bins=[-1,30000,1000000],labels=[0,1])
df.insert(1,'low_income',category)

df['SCIENGP'] = df['SCIENGP'].fillna(3.0)
df['SCIENGRLP'] = df['SCIENGRLP'].fillna(3.0)
df['OCCP'] = df['SCIENGRLP'].fillna(0.0)

category = pd.cut(df.SCIENGP,bins=[0,1,4],labels=[1,0])
df.insert(1,'stem_degree',category)

category = pd.cut(df.SCIENGRLP,bins=[0,1,4],labels=[1,0])
df.insert(1,'stem_related_degree',category)

category = pd.cut(df.SCHL,bins=[0,20,30],labels=[0,1])
df.insert(1,'bach_degree_or_higher',category)

df['sum_race'] = df.RACAIAN + df.RACBLK + df.RACNH

category = pd.cut(df.sum_race,bins=[-1,0,5],labels=[0,1])
df.insert(1,'bipoc',category)

category = pd.cut(df.sum_race,bins=[-1,0,5],labels=[1,0])
df.insert(1,'non_bipoc',category)

category = pd.cut(df.SEX,bins=[-1,1,3],labels=[1,0])
df.insert(1,'male',category)

df.describe()

Unnamed: 0.1,Unnamed: 0,SPORDER,PUMA,ADJINC,PWGTP,AGEP,WAGP,RAC1P,RACAIAN,RACBLK,...,POVPIP,tech,queen_anne,cap_hill,sw_seattle,ne_seattle,nw_seattle,high_equity_area,OCCP,sum_race
count,26311.0,26311.0,26311.0,26311.0,26311.0,26311.0,26311.0,26311.0,26311.0,26311.0,...,24600.0,26311.0,26311.0,26311.0,26311.0,26311.0,26311.0,26311.0,26311.0,26311.0
mean,187048.294212,1.575501,11602.930105,1049044.0,23.413629,44.582266,52281.076356,2.390749,0.017635,0.064118,...,375.609431,0.069629,0.183383,0.166546,0.208164,0.231044,0.210862,-0.003345,2.361978,0.083615
std,106848.720491,0.941702,1.438363,26319.69,16.111739,18.454186,77239.684736,2.498937,0.131624,0.244967,...,163.394605,0.254525,0.386988,0.372577,0.406002,0.421509,0.407929,0.889862,0.575125,0.290221
min,22.0,1.0,11601.0,1010145.0,1.0,18.0,0.0,1.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-1.0,1.0,0.0
25%,94562.5,1.0,11602.0,1031452.0,14.0,29.0,0.0,1.0,0.0,0.0,...,253.0,0.0,0.0,0.0,0.0,0.0,0.0,-1.0,2.0,0.0
50%,191171.0,1.0,11603.0,1054606.0,21.0,41.0,29400.0,1.0,0.0,0.0,...,501.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.0,0.0
75%,280838.0,2.0,11604.0,1073449.0,26.0,58.0,75000.0,2.0,0.0,0.0,...,501.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,3.0,0.0
max,372897.0,16.0,11605.0,1080470.0,267.0,94.0,536000.0,9.0,1.0,1.0,...,501.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,3.0,3.0


In [None]:
df.dtypes

# Check nulls

df.isnull().sum()

df.head()

In [103]:
flags = [
    'PUMA',    
    'bach_degree_or_higher', 
    'stem_related_degree',
    'stem_degree',
    'low_income', 
    'high_income',
    'age_18_34',
    'bipoc',
    'RACWHT',
    'male',
    'tech',
    'queen_anne',
    'cap_hill',
    'nw_seattle',
    'sw_seattle',
    'ne_seattle',
    'high_equity_area'
    ]

for flag in flags:
    print(df[flag].value_counts())

11602    6079
11601    5548
11605    5477
11603    4825
11604    4382
Name: PUMA, dtype: int64
1    15474
0    10837
Name: bach_degree_or_higher, dtype: int64
0    24998
1     1313
Name: stem_related_degree, dtype: int64
0    18562
1     7749
Name: stem_degree, dtype: int64
0    13611
1    12700
Name: low_income, dtype: int64
0    25201
1     1110
Name: high_income, dtype: int64
1    16417
0     9894
Name: age_18_34, dtype: int64
0    24210
1     2101
Name: bipoc, dtype: int64
1    19994
0     6317
Name: RACWHT, dtype: int64
0    13325
1    12986
Name: male, dtype: int64
0    24479
1     1832
Name: tech, dtype: int64
False    21486
True      4825
Name: queen_anne, dtype: int64
False    21929
True      4382
Name: cap_hill, dtype: int64
False    20763
True      5548
Name: nw_seattle, dtype: int64
False    20834
True      5477
Name: sw_seattle, dtype: int64
False    20232
True      6079
Name: ne_seattle, dtype: int64
-1    10461
 1    10373
 0     5477
Name: high_equity_area, dtype: int64

In [104]:
features = [
    'bach_degree_or_higher', 
    # 'stem_related_degree',
    # 'stem_degree',
    # 'low_income', 
    # 'high_income',
    'age_18_34',
    # 'RACWHT',
    'bipoc',
    'male',
    # 'tech'
    # 'queen_anne',
    # 'cap_hill',
    # 'nw_seattle',
    # 'sw_seattle',
    # 'ne_seattle',
    # 'high_equity_area'
    ]

features_weight = []
for feat in features:
    features_weight.append(feat)
features_weight.append('PWGTP')

print(features)
print(features_weight)

['bach_degree_or_higher', 'age_18_34', 'bipoc', 'male', 'queen_anne', 'cap_hill', 'nw_seattle', 'sw_seattle', 'ne_seattle', 'high_equity_area']
['bach_degree_or_higher', 'age_18_34', 'bipoc', 'male', 'queen_anne', 'cap_hill', 'nw_seattle', 'sw_seattle', 'ne_seattle', 'high_equity_area', 'PWGTP']


In [1]:

X = df[features_weight]

# y = df['low_income']
y = df['high_income']

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
print(f"sizes: X_train: {len(X_train)}, X_test: {len(X_test)}")
unique, counts = np.unique(y_train, return_counts=True)
print(f"unique values of y_train: {dict(zip(unique, counts))}")
unique, counts = np.unique(y_test, return_counts=True)
print(f"unique values of test: {dict(zip(unique, counts))}")

logistic_regression= LogisticRegression()
logistic_regression.fit(X_train[features],y_train,sample_weight=X_train['PWGTP'])
y_pred=logistic_regression.predict(X_test[features])
print("y_pred: ", len(y_pred))
unique, counts = np.unique(y_pred, return_counts=True)
print(dict(zip(unique, counts)))
print(logistic_regression.classes_)
print('Accuracy: ',metrics.accuracy_score(y_test, y_pred))

NameError: name 'df' is not defined

## Model high income

In [135]:
# Model with all data
X = df[['bach_degree_or_higher', 'age_18_34', 'bipoc', 'male', 'PWGTP']]

# y = df['low_income']
y = df['high_income']

log_high_inc = LogisticRegression()
log_high_inc.fit(X[['bach_degree_or_higher', 'age_18_34', 'bipoc', 'male']],y,sample_weight=X['PWGTP'])
to_predict = X_test[['bach_degree_or_higher', 'age_18_34', 'bipoc', 'male']][0:0]
# bach degree, young, non-BIPOC, male
to_predict.loc[0] = [1, 1, 0, 1]
# bach degree, young, BIPOC, male
to_predict.loc[1] = [1, 1, 1, 1]
# bach degree, young, non-BIPOC, non-male
to_predict.loc[2] = [1, 1, 0, 0]
# bach degree, young, BIPOC, non-male
to_predict.loc[3] = [1, 1, 1, 0]
# bach degree, not young, non-BIPOC, male
# df_empty.loc[2] = [1, 0, 0, 1]
# no bach degree, not young, not BIPOC, non-male
to_predict.loc[4] = [0, 0, 0, 0]

preds = log_high_inc.predict_proba(to_predict)

# show the inputs and predicted probabilities
for i in range(len(to_predict)):
	print("X=%s, Predicted=%s" % (to_predict.loc[i], preds[i]))


X=bach_degree_or_higher    1
age_18_34                1
bipoc                    0
male                     1
Name: 0, dtype: int64, Predicted=[0.85616022 0.14383978]
X=bach_degree_or_higher    1
age_18_34                1
bipoc                    1
male                     1
Name: 1, dtype: int64, Predicted=[0.92646561 0.07353439]
X=bach_degree_or_higher    1
age_18_34                1
bipoc                    0
male                     0
Name: 2, dtype: int64, Predicted=[0.95646564 0.04353436]
X=bach_degree_or_higher    1
age_18_34                1
bipoc                    1
male                     0
Name: 3, dtype: int64, Predicted=[0.9789496 0.0210504]
X=bach_degree_or_higher    0
age_18_34                0
bipoc                    0
male                     0
Name: 4, dtype: int64, Predicted=[0.99857411 0.00142589]


In [72]:
coeffs = log_high_inc.coef_[0]
sig_arg = 1.0*coeffs[0] + 1.0 * coeffs[1] + 0.0 * coeffs[2] + 1.0 * coeffs[3]
print(coeffs)
print(sig_arg)


[-1.48909645 -1.52152214 -2.91707441 -1.24369043]
-4.254309021894497


In [73]:
def stable_sigmoid(x):
    sig = np.where(x < 0, np.exp(x)/(1 + np.exp(x)), 1/(1 + np.exp(-x)))
    return sig

print(stable_sigmoid(sig_arg))

0.014004003766962985


## Outcomes from tech workers in high/low equity areas

In [137]:
# Model for tech workers in high/low equity areas
X_heq = df[df.high_equity_area != -1]
X = X_heq[['tech', 'high_equity_area', 'PWGTP']]
# y = df['low_income']
y_heq = df.query('high_equity_area != -1')
y = y_heq['high_income']

log_heq = LogisticRegression()
log_heq.fit(X[['tech', 'high_equity_area']],y,sample_weight=X['PWGTP'])

to_predict = X[['tech', 'high_equity_area']][0:0]
# tech worker in high equity area
to_predict.loc[0] = [1, 1]
# tech worker in low equity area
to_predict.loc[1] = [1, 0]
# non-tech worker in high equity area
to_predict.loc[2] = [0, 1]
# non-tech worker in low equity area
to_predict.loc[3] = [0, 0]

preds = log_heq.predict_proba(to_predict)
# show the inputs and predicted probabilities
print("Tech workers in high/low equity areas")
for i in range(len(to_predict)):
	print("X=%s, Predicted=%s" % (to_predict.loc[i], preds[i]))

print(f"Probability of high for tech vs. nontech workers in high equity area: {preds[0][1] / preds[2][1]}")
print(f"Probability of high for tech vs. nontech workers in low equity area: {preds[1][1] / preds[3][1]}")


print("Tech workers in all areas")
# Model for tech workers in all areas
X = df[['tech', 'PWGTP']]
# y = df['low_income']
y = df['high_income']
log_tech = LogisticRegression()
log_tech.fit(X[['tech']],y,sample_weight=X['PWGTP'])
to_predict = X[['tech']][0:0]
# tech worker
to_predict.loc[0] = [1]
# non-tech worker
to_predict.loc[1] = [0]

preds = log_tech.predict_proba(to_predict)
# show the inputs and predicted probabilities
for i in range(len(to_predict)):
	print("X=%s, Predicted=%s" % (to_predict.loc[i], preds[i]))
print(f"Probability of high for tech vs. nontech workers: {preds[0][1] / preds[1][1]}")


Tech workers in high/low equity areas
X=tech                1
high_equity_area    1
Name: 0, dtype: int64, Predicted=[0.87486201 0.12513799]
X=tech                1
high_equity_area    0
Name: 1, dtype: int64, Predicted=[0.93568778 0.06431222]
X=tech                0
high_equity_area    1
Name: 2, dtype: int64, Predicted=[0.95785891 0.04214109]
X=tech                0
high_equity_area    0
Name: 3, dtype: int64, Predicted=[0.97929708 0.02070292]
Probability of high for tech vs. nontech workers in high equity area: 2.9695002546334726
Probability of high for tech vs. nontech workers in low equity area: 3.1064319802327325
Tech workers in all areas
X=tech    1
Name: 0, dtype: int64, Predicted=[0.89059379 0.10940621]
X=tech    0
Name: 1, dtype: int64, Predicted=[0.96373481 0.03626519]
Probability of high for tech vs. nontech workers: 3.016837960849857


## High equity areas - white / young / etc.

In [138]:
# Model with all data
X_heq = df[df.high_equity_area != -1]
X = X_heq[['bach_degree_or_higher', 'age_18_34', 'bipoc', 'male', 'high_equity_area', 'PWGTP']]

y_heq = df.query('high_equity_area != -1')
y = y_heq['high_income']


log_hinc_eq = LogisticRegression()
log_hinc_eq.fit(X[['bach_degree_or_higher', 'age_18_34', 'bipoc', 'male', 'high_equity_area']],y,sample_weight=X['PWGTP'])
to_predict = X[['bach_degree_or_higher', 'age_18_34', 'bipoc', 'male', 'high_equity_area']][0:0]
# bach degree, young, non-BIPOC, male, high equity
to_predict.loc[0] = [1, 1, 0, 1, 1]
# bach degree, young, BIPOC, male, high equity
to_predict.loc[1] = [1, 1, 1, 1, 1]
# bach degree, young, non-BIPOC, male, low equity
to_predict.loc[2] = [1, 1, 0, 1, 0]
# bach degree, young, BIPOC, male, low equity
to_predict.loc[3] = [1, 1, 1, 1, 0]

# bach degree, young, non-BIPOC, non-male
# to_predict.loc[2] = [1, 1, 0, 0]
# bach degree, young, BIPOC, non-male
# to_predict.loc[3] = [1, 1, 1, 0]
# bach degree, not young, non-BIPOC, male
# df_empty.loc[2] = [1, 0, 0, 1]
# no bach degree, not young, not BIPOC, non-male
# to_predict.loc[4] = [0, 0, 0, 0]

preds = log_hinc_eq.predict_proba(to_predict)

# show the inputs and predicted probabilities
for i in range(len(to_predict)):
	print("X=%s, Predicted=%s" % (to_predict.loc[i], preds[i]))
print(f"Non-BIPOC vs. BIPOC in high equity: {preds[0][1] / preds[1][1]}")
print(f"Non-BIPOC vs. BIPOC in low equity: {preds[2][1] / preds[3][1]}")

X=bach_degree_or_higher    1
age_18_34                1
bipoc                    0
male                     1
high_equity_area         1
Name: 0, dtype: int64, Predicted=[0.84112665 0.15887335]
X=bach_degree_or_higher    1
age_18_34                1
bipoc                    1
male                     1
high_equity_area         1
Name: 1, dtype: int64, Predicted=[0.92773353 0.07226647]
X=bach_degree_or_higher    1
age_18_34                1
bipoc                    0
male                     1
high_equity_area         0
Name: 2, dtype: int64, Predicted=[0.91475127 0.08524873]
X=bach_degree_or_higher    1
age_18_34                1
bipoc                    1
male                     1
high_equity_area         0
Name: 3, dtype: int64, Predicted=[0.96298905 0.03701095]
Non-BIPOC vs. BIPOC in high equity: 2.1984379371976526
Non-BIPOC vs. BIPOC in low equity: 2.303338345332858


## Model high income in Queen Anne

In [139]:
# Model with all data
X = (df.query('queen_anne == 1'))[['bach_degree_or_higher', 'age_18_34', 'bipoc', 'male', 'PWGTP']]

# y = df['low_income']
y = (df.query('queen_anne == 1'))['high_income']

log_high_inc = LogisticRegression()
log_high_inc.fit(X[['bach_degree_or_higher', 'age_18_34', 'bipoc', 'male']],y,sample_weight=X['PWGTP'])
to_predict = X[['bach_degree_or_higher', 'age_18_34', 'bipoc', 'male']][0:0]
# bach degree, young, non-BIPOC, male
to_predict.loc[0] = [1, 1, 0, 1]
# bach degree, young, BIPOC, male
to_predict.loc[1] = [1, 1, 1, 1]
# bach degree, young, non-BIPOC, non-male
to_predict.loc[2] = [1, 1, 0, 0]
# bach degree, young, BIPOC, non-male
to_predict.loc[3] = [1, 1, 1, 0]
# bach degree, not young, non-BIPOC, male
# df_empty.loc[2] = [1, 0, 0, 1]
# no bach degree, not young, not BIPOC, non-male
to_predict.loc[4] = [0, 0, 0, 0]

preds = log_high_inc.predict_proba(to_predict)

# show the inputs and predicted probabilities
for i in range(len(to_predict)):
	print("X=%s, Predicted=%s" % (to_predict.loc[i], preds[i]))

print(f"Non-BIPOC vs. BIPOC male: {preds[0][1] / preds[1][1]}")
print(f"Non-BIPOC male vs. non-BIPOC non-male: {preds[0][1] / preds[2][1]}")


X=bach_degree_or_higher    1
age_18_34                1
bipoc                    0
male                     1
Name: 0, dtype: int64, Predicted=[0.82881466 0.17118534]
X=bach_degree_or_higher    1
age_18_34                1
bipoc                    1
male                     1
Name: 1, dtype: int64, Predicted=[0.91670102 0.08329898]
X=bach_degree_or_higher    1
age_18_34                1
bipoc                    0
male                     0
Name: 2, dtype: int64, Predicted=[0.92782425 0.07217575]
X=bach_degree_or_higher    1
age_18_34                1
bipoc                    1
male                     0
Name: 3, dtype: int64, Predicted=[0.9669087 0.0330913]
X=bach_degree_or_higher    0
age_18_34                0
bipoc                    0
male                     0
Name: 4, dtype: int64, Predicted=[0.99659061 0.00340939]
Non-BIPOC vs. BIPOC male: 2.055071297169244
Non-BIPOC male vs. non-BIPOC non-male: 2.3717847214776544


https://machinelearningmastery.com/make-predictions-scikit-learn/

## Output counts

In [143]:
features = [
    'bach_degree_or_higher', 
    'stem_related_degree',
    # 'stem_degree',
    'low_income', 
    'high_income',
    'age_18_34',
    'RACWHT',
    'bipoc',
    'non_bipoc',
    'male',
    'tech',
    'queen_anne',
    'cap_hill',
    'nw_seattle',
    'sw_seattle',
    'ne_seattle',
    'high_equity_area'
    ]

totals = df.groupby(features)['PWGTP'].sum()
totals.describe()
totals.to_csv("/Users/chaya/Downloads/inc_counts2.csv")


In [134]:
df['PWGTP'].sum()

616036

In [144]:
(df[df['low_income'] == 1])['PWGTP'].sum()

311784

## Create SQLite Database

https://stackoverflow.com/questions/14431646/how-to-write-pandas-dataframe-to-sqlite-with-index

```
CREATE TABLE IF NOT EXISTS "inc_table"(
  "bach_degree_or_higher" INTEGER,
  "stem_related_degree" INTEGER,
  "low_income" INTEGER,
  "high_income" INTEGER,
  "age_18_34" INTEGER,
  "RACWHT" INTEGER,
  "bipoc" INTEGER,
  "non_bipoc" INTEGER,
  "male" INTEGER,
  "tech" INTEGER,
  "queen_anne" INTEGER,
  "cap_hill" INTEGER,
  "nw_seattle" INTEGER,
  "sw_seattle" INTEGER,
  "ne_seattle" INTEGER,
  "high_equity_area" INTEGER,
  "PWGTP" INTEGER
);
```