# Exploring COVID in Chicago 

<img src = 'https://upload.wikimedia.org/wikipedia/commons/thumb/8/82/Chicago_sunrise_1.jpg/640px-Chicago_sunrise_1.jpg' width = 800>

#### Dataset: `Chicago_Demographic_Covid_Full.csv`

**Given what we know about John Snow's Grand Experiment, how can we best measure COVID-19's impact on the communities in Chicago?**

In May 2020, Kevin Credit of the Center for Spatial Data Science at the University of Chicago wrote "Neighborhood inequity: Exploring the factors underlying racial and ethnic disparities in COVID-19 testing and infection rates using ZIP code data in Chicago and New York".

Kevin kindly shared his data with us. His research also would not have been possible without open-source research data from the Illinois Department of Public Health (https://www.dph.illinois.gov/covid19/covid19-statistics) and the U.S. Census Bureau (https://data.census.gov/cedsci).

In [2]:
import pandas as pd
pd.options.mode.chained_assignment = None
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
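# The bare 'seaborn' style name is deprecated in recent matplotlib releases;
# seaborn's own theme setter applies a similar look.
sns.set_theme()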

In [14]:
full = pd.read_csv('../data/Chicago_Demographic_Covid_Full.csv')

In [15]:
full

Unnamed: 0,ZIP,POP,POPDENS,PER0_44,PER45_64,PER65,PERW,PERHIS,PERBLK,PERAUTO,...,PERCROWD,PEROFFTC,PERHSRV,PERPSRV,PERFOOD,PERCLEAN,FDTRTPER,WS__5,CASE_5_1,TEST_5_1
0,60601,14675,0.015708,0.642726,0.215877,0.141397,0.659421,0.086814,0.052266,0.261451,...,0.484966,0.810139,0.007382,0.000868,0.00304,0.010421,0.0,55.353747,51,347
1,60604,782,0.003261,0.599744,0.28133,0.118926,0.612532,0.043478,0.047315,0.150602,...,7.611549,0.766932,0.0,0.0,0.0,0.0,0.0,52.203151,7,36
2,60605,27519,0.008531,0.671536,0.23511,0.093354,0.573458,0.058432,0.16534,0.346656,...,2.484985,0.714412,0.010401,0.014108,0.030927,0.000885,0.0,47.900487,122,751
3,60606,3101,0.005443,0.633344,0.227668,0.138987,0.686553,0.062883,0.023541,0.242353,...,2.451226,0.750098,0.012162,0.0357,0.009808,0.0,0.0,55.775578,30,210
4,60607,29591,0.004992,0.808084,0.137711,0.054206,0.52178,0.083032,0.143963,0.362396,...,2.960289,0.701144,0.005382,0.022648,0.028759,0.016706,0.0,61.040068,247,1036
5,60608,79205,0.004852,0.690234,0.210631,0.099135,0.191251,0.506862,0.172426,0.559437,...,6.32193,0.409088,0.033471,0.020396,0.109536,0.056192,0.166667,61.559835,718,2303
6,60609,61495,0.003067,0.672445,0.214099,0.113456,0.14915,0.534352,0.243288,0.671268,...,7.988031,0.338874,0.035461,0.028845,0.101057,0.068783,0.413793,39.705417,684,1764
7,60610,39019,0.013017,0.64953,0.195956,0.154514,0.687076,0.06666,0.145211,0.301776,...,1.680455,0.705117,0.012628,0.005429,0.038887,0.005198,0.0,53.438216,144,954
8,60611,32426,0.015313,0.544779,0.236909,0.218312,0.714458,0.052458,0.02754,0.264482,...,1.835496,0.761555,0.005898,0.007977,0.01755,0.002466,0.0,93.067471,97,980
9,60612,34311,0.003534,0.704526,0.204074,0.091399,0.213751,0.122585,0.599341,0.473753,...,4.759398,0.534926,0.052931,0.029782,0.051374,0.031542,0.238095,71.271572,375,1854


In [16]:
full.columns

Index(['ZIP', 'POP', 'POPDENS', 'PER0_44', 'PER45_64', 'PER65', 'PERW',
       'PERHIS', 'PERBLK', 'PERAUTO', 'PERTRAN', 'PERPEDB', 'PERTELE',
       'MEDINC', 'PERCROWD', 'PEROFFTC', 'PERHSRV', 'PERPSRV', 'PERFOOD',
       'PERCLEAN', 'FDTRTPER', 'WS__5', 'CASE_5_1', 'TEST_5_1'],
      dtype='object')

### Variables Key

* ZIP: ZIP code

* POP: Population
* POPDENS: Population density (per m²)

**Age**
* PER0_44: Percent of people age 0 to 44
* PER45_64: Percent of people age 45 to 64
* PER65: Percent of people 65 and over


**Racial/Ethnic**
* PERW: Percent white
* PERHIS: Percent Hispanic
* PERBLK: Percent Black

**Commuting Types**
* PERAUTO: Percent automobile commuters
* PERTRAN: Percent public transportation commuters
* PERPEDB: Percent pedestrian and bike commuters
* PERTELE: Percent teleworkers (work from home)

**Socio-economic status**
* MEDINC: Median household income

**Household Structure**
* PERCROWD: Percent housing units with >1 person per room

**Occupations**
* PEROFFTC: Percent office workers
* PERHSRV: Percent healthcare service workers
* PERPSRV: Percent public service workers
* PERFOOD: Percent food workers
* PERCLEAN: Percent cleaning service workers

**Healthy Environments**
* FDTRTPER: Percent food desert tracts
* WS__5: Hospital accessibility score

**Covid-19**
* CASE_5_1: Number of positive tests for the week ending 5/1
* TEST_5_1: Number of tests performed for the week ending 5/1
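
Although the key describes the age columns as percents, the values in the table preview look like fractions between 0 and 1. A quick sanity check, sketched here on three rows copied from the preview (the `ages` frame and the `age_total` and `shares_sum_to_one` names are illustrative only, not part of the dataset), is to confirm that the three age shares add up to 1 for each ZIP code:

```python
import numpy as np
import pandas as pd

# Three rows copied from the table preview; in the notebook you would
# select these columns from `full` instead of retyping them.
ages = pd.DataFrame({
    "ZIP": [60601, 60604, 60605],
    "PER0_44": [0.642726, 0.599744, 0.671536],
    "PER45_64": [0.215877, 0.281330, 0.235110],
    "PER65": [0.141397, 0.118926, 0.093354],
})

# Sum the three age shares row by row.
age_total = ages[["PER0_44", "PER45_64", "PER65"]].sum(axis=1)

# Allow a small tolerance for rounding in the published values.
shares_sum_to_one = np.allclose(age_total, 1.0, atol=1e-4)
print(age_total)
print("Age shares sum to 1:", shares_sum_to_one)
```

This check only covers the age columns. Other columns, such as PERCROWD, appear to use a different scale in the preview, so it is worth inspecting each one before comparing them.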


## Task: Generate a new column of data for positivity rate and add it to the dataframe.
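
One possible approach, sketched under the assumption that positivity rate means positive tests divided by tests performed (`CASE_5_1 / TEST_5_1`). The column name `POS_RATE` is a hypothetical choice, and the small `sample` frame holds three rows copied from the table preview so the snippet runs on its own; in the notebook the same two lines would apply to `full`.

```python
import pandas as pd

# Three rows copied from the table preview; replace `sample` with `full`
# when working in the notebook.
sample = pd.DataFrame({
    "ZIP": [60601, 60604, 60605],
    "CASE_5_1": [51, 7, 122],
    "TEST_5_1": [347, 36, 751],
})

# Mask ZIP codes with zero tests so the division yields NaN instead of inf.
tests = sample["TEST_5_1"].where(sample["TEST_5_1"] > 0)

# POS_RATE is a hypothetical column name, not part of the original dataset.
sample["POS_RATE"] = sample["CASE_5_1"].div(tests)
print(sample)
```

Storing the result as a fraction keeps it on the same 0 to 1 scale as the age and race columns; multiply by 100 if a percentage is easier to read.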