# Analysis for Final Map 
## Lily Cao

The most important map I created shows the average difference between metrics 4 and 5 for each Chicago ward. Earlier, I found that the average difference across all of Chicago in 2016 was -0.00211, suggesting that outages are associated with a 0.211% *decrease* in crime rate. But does this relationship hold in every area of Chicago? To answer this, I calculated the average difference within each of the city's 50 wards.

In [1]:
import pandas as pd

In [76]:
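# Load both street-light outage datasets and stack them into one table.
street_all_out = pd.read_csv('Street All Out.csv')
street_one_out = pd.read_csv('Street One Out.csv')
df = pd.concat([street_all_out, street_one_out]).dropna()
# Drop requests whose status marks them as duplicates.
df = df[~df['Status'].str.contains('Dup')]
# Keep only requests created in 2016.
df = df.loc[df['Creation Date'].str.contains('2016')]
df.head()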

Unnamed: 0,Creation Date,Status,Completion Date,Service Request Number,Type of Service Request,Street Address,ZIP Code,X Coordinate,Y Coordinate,Ward,Police District,Community Area,Latitude,Longitude,Location
152147,01/01/2016,Completed,01/04/2016,16-00011231,Street Lights - All/Out,8500 S MARYLAND AVE,60619.0,1183375.0,1848798.0,8.0,6.0,44.0,41.740292,-87.603712,"(41.740292180994, -87.603711740043)"
152148,01/01/2016,Completed,01/07/2016,16-00012453,Street Lights - All/Out,1341 S CENTRAL PARK AVE,60623.0,1152592.0,1893323.0,24.0,10.0,29.0,41.863138,-87.715327,"(41.863137937144, -87.715327086831)"
152149,01/01/2016,Completed,01/04/2016,16-00008824,Street Lights - All/Out,9222 S VINCENNES AVE,60620.0,1171092.0,1843569.0,21.0,22.0,73.0,41.726221,-87.648869,"(41.726220778599, -87.648869410126)"
152150,01/01/2016,Completed,01/04/2016,16-00011796,Street Lights - All/Out,8633 S MARYLAND AVE,60619.0,1183402.0,1847917.0,8.0,6.0,44.0,41.737874,-87.60364,"(41.737874156443, -87.603639940115)"
152151,01/01/2016,Completed,01/04/2016,16-00012426,Street Lights - All/Out,8600 S COTTAGE GROVE AVE,60619.0,1183064.0,1848128.0,6.0,6.0,44.0,41.738463,-87.604875,"(41.738462650302, -87.604874776375)"
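
The 2016 filter above matches the text "2016" anywhere in the date string. A minimal sketch of the same filter using parsed dates follows. The `requests` frame and its rows are hypothetical stand-ins for `df`, and the `%m/%d/%Y` format is an assumption based on the sample rows shown above.

```python
import pandas as pd

# Hypothetical sample rows standing in for the real service-request table.
requests = pd.DataFrame({
    'Creation Date': ['12/31/2015', '01/01/2016', '06/15/2016', '01/02/2017'],
    'Status': ['Completed', 'Completed - Dup', 'Open', 'Completed'],
})

# Parse the dates once, assuming the MM/DD/YYYY layout seen in the output above.
created = pd.to_datetime(requests['Creation Date'], format='%m/%d/%Y')

# Flag duplicate requests by status text, then keep non-duplicates from 2016.
is_dup = requests['Status'].str.contains('Dup')
requests_2016 = requests[(created.dt.year == 2016) & ~is_dup]
```

Comparing on the parsed year avoids accidental matches if the date column ever changes format or gains a time component.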


In [77]:
metric_45 = pd.read_csv('metric45.csv')
metric_45 = metric_45.dropna()
metric_45.head()

Unnamed: 0,street,4th metric,5th metric,4th > 5th metric,longitude,latitude,metric diff
0,1 E 118TH ST,0.0,0.005731,False,-87.622716,41.67985,-0.005731
1,1 E 63RD ST,0.0,0.035191,False,-87.625375,41.780058,-0.035191
2,1 E 69TH ST,0.033333,0.0,True,-87.624942,41.769194,0.033333
3,1 E WACKER DR,0.0,0.00277,False,-87.627975,41.886814,-0.00277
4,1 N CENTRAL AVE,0.0,0.033241,False,-87.764864,41.880246,-0.033241


I created a dictionary mapping each streetlight address to its ward.

In [89]:
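# Map each street address in metric_45 to the set of wards recorded for it in df.
d = {}
for street in metric_45['street']:
    if street not in d:
        d[street] = set(df.loc[df['Street Address'] == street, 'Ward'])

# Collect the ward sets in insertion order (one per unique street address).
values = list(d.values())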

In [90]:
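# Take one ward from each set. This assumes every address matched at least one
# request; an empty set would raise a KeyError on pop().
all_wards = [v.pop() for v in values]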

With one ward per address, I added a ward column to metric_45.

In [87]:
metric_45['ward'] = all_wards
metric_45.head()

Unnamed: 0,street,4th metric,5th metric,4th > 5th metric,longitude,latitude,metric diff,ward
0,1 E 118TH ST,0.0,0.005731,False,-87.622716,41.67985,-0.005731,9.0
1,1 E 63RD ST,0.0,0.035191,False,-87.625375,41.780058,-0.035191,20.0
2,1 E 69TH ST,0.033333,0.0,True,-87.624942,41.769194,0.033333,6.0
3,1 E WACKER DR,0.0,0.00277,False,-87.627975,41.886814,-0.00277,42.0
4,1 N CENTRAL AVE,0.0,0.033241,False,-87.764864,41.880246,-0.033241,29.0
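
The dictionary-based lookup above could also be sketched with `Series.map`, which keeps each ward aligned with its row even when an address repeats or has no matching request. The `requests` and `metrics` frames below are hypothetical miniature stand-ins for `df` and `metric_45`, and keeping the first ward per address is an assumption about how ties should be resolved.

```python
import pandas as pd

# Hypothetical miniature stand-ins for df (service requests) and metric_45.
requests = pd.DataFrame({
    'Street Address': ['1 E 118TH ST', '1 E 118TH ST', '1 E 63RD ST'],
    'Ward': [9.0, 9.0, 20.0],
})
metrics = pd.DataFrame({
    'street': ['1 E 118TH ST', '1 E 63RD ST', '1 E 69TH ST'],
    'metric diff': [-0.005731, -0.035191, 0.033333],
})

# One ward per address: keep the first ward recorded for each address.
street_to_ward = (
    requests.drop_duplicates(subset='Street Address')
            .set_index('Street Address')['Ward']
)

# Addresses with no matching request get NaN instead of raising an error.
metrics['ward'] = metrics['street'].map(street_to_ward)
```

Rows left with NaN can then be inspected or dropped explicitly before averaging.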


In [161]:
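import numpy as np

# Unique wards in ascending order, skipping the first (smallest) entry.
ward_set = sorted(set(all_wards))[1:]

# Collect the 'metric diff' values for each ward.
m_diff = []
for ward in ward_set:
    m_diff.append(metric_45.loc[metric_45['ward'] == ward, 'metric diff'])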

In [179]:
# Average 'metric diff' within each ward, in the same order as ward_set.
mean_diffs = [np.mean(m) for m in m_diff]

ward_df = pd.DataFrame(ward_set, columns=['ward'])
ward_df['metric diff avg.'] = mean_diffs
ward_df.head()

Unnamed: 0,ward,metric diff avg.
0,1.0,-0.001774
1,2.0,0.000573
2,3.0,-0.003864
3,4.0,-0.005524
4,5.0,0.000133
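
The per-ward averaging above can also be sketched as a single `groupby`. This is a minimal illustration on a hypothetical `metrics` frame standing in for `metric_45`, not the code that produced the table above.

```python
import pandas as pd

# Hypothetical sample standing in for metric_45 with its ward column.
metrics = pd.DataFrame({
    'ward': [9.0, 9.0, 20.0, 6.0],
    'metric diff': [-0.005731, 0.001731, -0.035191, 0.033333],
})

# Group rows by ward and average 'metric diff' within each group.
# groupby sorts the group keys by default, so wards come out in ascending order.
ward_avg = (
    metrics.groupby('ward')['metric diff']
           .mean()
           .rename('metric diff avg.')
           .reset_index()
)
```

Because the ward label travels with each average, the result does not depend on two separate lists staying in the same order.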


To map these averages in QGIS, I needed a CSV containing each ward's boundaries. I downloaded the ward data from the Chicago Data Portal: https://data.cityofchicago.org/Facilities-Geographic-Boundaries/Boundaries-Wards-2015-/sp34-6z76

This dataset describes Chicago's ward boundaries from 2015 to the present.

In [176]:
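ward_bound = pd.read_csv('WARDS_2015.csv')
# Sort by ward number so rows line up positionally with mean_diffs.
# This assumes mean_diffs holds one value per ward, in ascending ward order.
ward_bound = ward_bound.sort_values(by=['WARD'])
ward_bound['metric diff avg.'] = mean_diffs
# Rank wards from the largest to the smallest average difference.
ward_bound = ward_bound.sort_values(by=['metric diff avg.'], ascending=False)
ward_bound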

Unnamed: 0,the_geom,WARD,SHAPE_Leng,SHAPE_Area,metric diff avg.
11,MULTIPOLYGON (((-87.7182670339195 41.968802949...,35,67016.637939,57297720.0,0.007368
4,MULTIPOLYGON (((-87.66420403810295 42.02126158...,49,38122.692826,49733460.0,0.005138
6,MULTIPOLYGON (((-87.80310674705102 41.94000768...,29,107529.243573,128819100.0,0.004918
16,MULTIPOLYGON (((-87.7468767639261 41.939274637...,31,50635.783154,69739760.0,0.004582
22,MULTIPOLYGON (((-87.72098294925175 41.88805384...,28,119977.208819,142879700.0,0.004152
17,MULTIPOLYGON (((-87.6597679100119 41.972676253...,47,53371.30551,87363640.0,0.003688
43,MULTIPOLYGON (((-87.76192932195755 41.94874803...,36,91959.983641,88624180.0,0.002572
7,MULTIPOLYGON (((-87.71438187841963 41.82673338...,14,90165.797407,143011000.0,0.001891
33,MULTIPOLYGON (((-87.63393002737043 41.93301293...,43,48544.534907,65206370.0,0.000686
10,MULTIPOLYGON (((-87.66136715149712 41.92723211...,2,110739.852187,53934810.0,0.000573
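
Attaching the averages to the boundary table relies on both being in the same ward order. A key-based merge is one way to make that join explicit, sketched here under stated assumptions: `bounds` and `ward_avg` are hypothetical stand-ins for `ward_bound` and `ward_df`, and the geometry strings are placeholders rather than real polygons.

```python
import pandas as pd

# Hypothetical stand-in for the ward boundary table (placeholder geometries).
bounds = pd.DataFrame({
    'the_geom': ['MULTIPOLYGON (...)', 'MULTIPOLYGON (...)', 'MULTIPOLYGON (...)'],
    'WARD': [2, 35, 49],
})

# Hypothetical stand-in for the per-ward averages, with wards stored as floats.
ward_avg = pd.DataFrame({
    'ward': [35.0, 2.0],
    'metric diff avg.': [0.007368, 0.000573],
})

# Match the integer type of WARD before joining on the ward number.
ward_avg['ward'] = ward_avg['ward'].astype(int)

# Left merge keeps every boundary row; wards without an average get NaN.
merged = (
    bounds.merge(ward_avg, left_on='WARD', right_on='ward', how='left')
          .drop(columns='ward')
)

# With no path, to_csv returns the CSV text; passing a file path writes to disk.
csv_text = merged.to_csv(index=False)
```

The merged table keeps the geometry column next to each average, which is the layout needed for a CSV that a GIS tool can read.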
