<h1>Snow-Related 311 Complaints in Chicago, 2019-2023</h1>
11 January 2024

This analysis looks at all four types of snow-related 311 complaints:
<table>
    <tr><td><strong>SR_SHORT_CODE</strong></td><td><strong>Complaint Description</strong></td><td><strong>Responsible Department</strong></td></tr>
    <tr><td>SWSNOREM</td><td>Snow – Uncleared Sidewalk Complaint</td><td>Streets and Sanitation</td></tr>
    <tr><td>SDO</td><td>Ice and Snow Removal Request</td><td>Streets and Sanitation</td></tr>
    <tr><td>SDW</td><td>Snow - Object/Dibs Removal Request</td><td>CDOT</td></tr>
    <tr><td>SNPBLBS</td><td>Snow Removal - Protected Bike Lane or Bridge Sidewalk</td><td>CDOT</td></tr> 
</table>

My analysis steps:
<ol>
<li><a href="#read">Read Data</a>
<li><a href="#tabulate">Tabulate by Community Area</a>
<li><a href="#summarize">Summarize Results</a>
</ol>

<br>
Note that snow "seasons" run from July 1 to June 30 of the following year. For example, in my processed data set, the season '2020-2021' covers July 1, 2020 through June 30, 2021.
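
The July 1 cutoff can be expressed as a small helper. This is only a sketch of the rule described above: the function name `season_label` is hypothetical, and the processed data set may have been labeled some other way.

```python
import pandas as pd

def season_label(timestamp):
    """Return a 'YYYY-YYYY' snow season label for a date (hypothetical helper)."""
    ts = pd.Timestamp(timestamp)
    # Seasons start on July 1, so January through June belong to the
    # season that began in the previous calendar year
    start_year = ts.year if ts.month >= 7 else ts.year - 1
    return f"{start_year}-{start_year + 1}"

# Two CREATED_DATE values taken from the sample rows of the complaints file
print(season_label("2021-01-27 13:13:50"))
print(season_label("2020-11-11 19:40:31"))
```

Both sample dates fall in the same season, matching the `season` column shown for those rows.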

<h3>Preliminary Findings</h3>
<ul>
    <li>The five neighborhoods with the most uncleared sidewalk 311 complaints per capita are (in order) Lincoln Square, Logan Square, Uptown, West Town, and Lincoln Park.
    <li>The five neighborhoods with the fewest uncleared sidewalk 311 complaints per capita are (in order) West Pullman, Riverdale, Mount Greenwood, Pullman, and Hegewisch. These all appear to be car-centric neighborhoods.
</ul>

<a name="read"></a>
# 1. Read Data

In [2]:
import pandas as pd
#import requests
#import datetime as dt #would only need this if I manipulated dates post-API data retrieval

### Read snow complaints

In [3]:
df = pd.read_csv("../../data/05-finalized/311-complaints-snow-all-types.csv")
df.head()

Unnamed: 0,SR_NUMBER,SR_SHORT_CODE,CREATED_DATE,STREET_ADDRESS,COMMUNITY_AREA,WARD,STATUS,ORIGIN,CLOSED_DATE,LATITUDE,LONGITUDE,SR_TYPE,year,month,date,season,GEOID,COMMUNITY_NAME
0,SR21-00149474,SDO,2021-01-27 13:13:50,1700 W 15TH ST,28,28,Completed,Mobile Device,2021-01-27 19:50:47,41.861457,-87.668881,Ice and Snow Removal Request,2021,1,2021-01-27,2020-2021,28,Near West Side
1,SR21-00177207,SDO,2021-02-01 10:19:34,1300 S HEATH AVE,28,28,Completed,Mobile Device,2021-02-01 15:12:41,41.864744,-87.684402,Ice and Snow Removal Request,2021,2,2021-02-01,2020-2021,28,Near West Side
2,SR21-00179217,SDO,2021-02-01 13:17:16,3242 W FULTON BLVD,27,28,Completed,Mobile Device,2021-02-01 21:10:54,41.886675,-87.707985,Ice and Snow Removal Request,2021,2,2021-02-01,2020-2021,27,East Garfield Park
3,SR21-00269268,SDO,2021-02-17 12:43:47,819 S BISHOP ST,28,28,Completed,Mobile Device,2021-02-19 00:58:56,41.871081,-87.662624,Ice and Snow Removal Request,2021,2,2021-02-17,2020-2021,28,Near West Side
4,SR20-05442947,SDW,2020-11-11 19:40:31,3401 W 53RD ST,63,14,Completed,Mobile Device,2020-11-16 06:50:26,41.796901,-87.708733,Snow - Object/Dibs Removal Request,2020,11,2020-11-11,2020-2021,63,Gage Park


In [4]:
len(df)

67361

### read community population

In [5]:
# retrieved on 1/11/24, but 2020 Census Population figures should be static

df_population = pd.read_csv("../../data/00-raw/population_cmap_2022.csv")

In [6]:
# simplify dataframe to get only essentials. COMMUNITY_CAPS is useful for linking with community areas boundary shapefiles 
df_population = df_population[['GEOG','2020_POP']]
df_population = df_population.rename(columns={'GEOG':'COMMUNITY_NAME'})
df_population['COMMUNITY_CAPS']=df_population['COMMUNITY_NAME'].str.upper()
df_population.head()

Unnamed: 0,COMMUNITY_NAME,2020_POP,COMMUNITY_CAPS
0,Albany Park,48396,ALBANY PARK
1,Archer Heights,14196,ARCHER HEIGHTS
2,Armour Square,13890,ARMOUR SQUARE
3,Ashburn,41098,ASHBURN
4,Auburn Gresham,44878,AUBURN GRESHAM


<a name="tabulate"></a>
# 2. Tabulate All 311 Complaints by Community Area, 2019-2023
This exploratory data analysis covers all four types of snow-related complaints. Skip to the next section for the "uncleared sidewalk" complaints, which were the focus of the Plow the Sidewalks article for South Side Weekly.

In [7]:
df_community_by_type = df.pivot_table(index='COMMUNITY_NAME', columns='SR_TYPE', values='SR_NUMBER', aggfunc='size', fill_value=0)
df_community_by_type.head()

COMMUNITY_NAME,Ice and Snow Removal Request,Snow - Object/Dibs Removal Request,Snow Removal - Protected Bike Lane or Bridge Sidewalk,Snow – Uncleared Sidewalk Complaint
Albany Park,540,195,16,400
Archer Heights,205,253,0,60
Armour Square,65,52,16,77
Ashburn,1506,75,1,108
Auburn Gresham,1449,179,1,128


### merge 311 complaints data with community data

In [8]:
df_community_summary = pd.merge(left=df_community_by_type,right=df_population,on='COMMUNITY_NAME')
df_community_summary.head()

Unnamed: 0,COMMUNITY_NAME,Ice and Snow Removal Request,Snow - Object/Dibs Removal Request,Snow Removal - Protected Bike Lane or Bridge Sidewalk,Snow – Uncleared Sidewalk Complaint,2020_POP,COMMUNITY_CAPS
0,Albany Park,540,195,16,400,48396,ALBANY PARK
1,Archer Heights,205,253,0,60,14196,ARCHER HEIGHTS
2,Armour Square,65,52,16,77,13890,ARMOUR SQUARE
3,Ashburn,1506,75,1,108,41098,ASHBURN
4,Auburn Gresham,1449,179,1,128,44878,AUBURN GRESHAM
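
Because `pd.merge` defaults to an inner join, any community whose name is spelled differently in the two tables would be dropped without a message. The sketch below shows one way to check for that using the `indicator` option. The two small frames are toy stand-ins, and the "The Loop" / "Loop" mismatch with its numbers is invented purely for illustration.

```python
import pandas as pd

# Toy stand-ins for the complaints and population tables.
# The third row of each frame is an invented mismatch, not real data.
complaints = pd.DataFrame({
    "COMMUNITY_NAME": ["Albany Park", "Ashburn", "The Loop"],
    "complaints": [400, 108, 50],
})
population = pd.DataFrame({
    "COMMUNITY_NAME": ["Albany Park", "Ashburn", "Loop"],
    "2020_POP": [48396, 41098, 10000],
})

# An outer join with indicator=True adds a '_merge' column that records
# whether each row was found in the left table, the right table, or both
check = pd.merge(complaints, population, on="COMMUNITY_NAME",
                 how="outer", indicator=True)
unmatched = check[check["_merge"] != "both"]
print(unmatched[["COMMUNITY_NAME", "_merge"]])
```

An empty `unmatched` frame would suggest that every community name lined up in both tables.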


### calculate per capita rates of 311 complaints by community

In [9]:
# complaints per 10,000 residents per year, averaged over the 4 seasons in the data
rate_columns = {
    'Streets Per 10k': 'Ice and Snow Removal Request',
    'Dibs Per 10k': 'Snow - Object/Dibs Removal Request',
    'Sidewalks Per 10k': 'Snow – Uncleared Sidewalk Complaint',
    'Bike-Bridge Per 10k': 'Snow Removal - Protected Bike Lane or Bridge Sidewalk',
}

# (10000/4) scales the 4-season count to an annual rate per 10,000 residents
for rate_col, count_col in rate_columns.items():
    df_community_summary[rate_col] = \
    (10000/4)*df_community_summary[count_col]/df_community_summary['2020_POP']

In [10]:
df_community_summary.head()

Unnamed: 0,COMMUNITY_NAME,Ice and Snow Removal Request,Snow - Object/Dibs Removal Request,Snow Removal - Protected Bike Lane or Bridge Sidewalk,Snow – Uncleared Sidewalk Complaint,2020_POP,COMMUNITY_CAPS,Streets Per 10k,Dibs Per 10k,Sidewalks Per 10k,Bike-Bridge Per 10k
0,Albany Park,540,195,16,400,48396,ALBANY PARK,27.894867,10.073147,20.662865,0.826515
1,Archer Heights,205,253,0,60,14196,ARCHER HEIGHTS,36.101719,44.554804,10.566357,0.0
2,Armour Square,65,52,16,77,13890,ARMOUR SQUARE,11.699064,9.359251,13.858891,2.87977
3,Ashburn,1506,75,1,108,41098,ASHBURN,91.610297,4.562266,6.569663,0.06083
4,Auburn Gresham,1449,179,1,128,44878,AUBURN GRESHAM,80.718838,9.971478,7.130443,0.055707
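
As a worked check of the rate formula, the Albany Park sidewalk figure can be recomputed by hand from the count and population shown in the table. The helper name `per_10k_per_year` is hypothetical and exists only for this sketch.

```python
def per_10k_per_year(count, population, years=4):
    """Complaints per 10,000 residents per year, averaged over `years` seasons."""
    return (10000 / years) * count / population

# Albany Park: 400 uncleared sidewalk complaints, 2020 population 48,396
rate = per_10k_per_year(400, 48396)
print(round(rate, 6))
```

The result should agree with the `Sidewalks Per 10k` value in the Albany Park row.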


<a name="summarize"></a>
# 3. Summarize Stats for 311 Uncleared Sidewalk Complaints

In [13]:
# filter to look at only uncleared sidewalk complaints
df_uncleared = df[(df['SR_SHORT_CODE']=='SWSNOREM')]
len(df_uncleared)

21079

### uncleared complaints by season

In [14]:
df_by_season = df_uncleared.groupby('season').agg(complaints=('SR_NUMBER','count')).reset_index()
df_by_season

Unnamed: 0,season,complaints
0,2019-2020,6541
1,2020-2021,6494
2,2021-2022,6416
3,2022-2023,1628


### uncleared complaints by complaint origin

In [15]:
df_by_type = df_uncleared.groupby('ORIGIN').agg(complaints=('SR_NUMBER','count')).reset_index()
df_by_type

Unnamed: 0,ORIGIN,complaints
0,Alderman's Office,724
1,E-Mail,22
2,Generated In House,1
3,Internet,7068
4,Mobile Device,8260
5,Open311 Interface,1
6,Phone Call,4818
7,Salesforce Mobile App,36
8,spot-open311-Chicago+Works,97
9,spot-open311-SeeClickFix,52


In [16]:
# consolidate low-volume origins into 'Other'
# assign() returns a new DataFrame, which avoids pandas' SettingWithCopyWarning on the filtered slice
other_origins = ['E-Mail', 'Generated In House', 'Open311 Interface','Salesforce Mobile App','spot-open311-Chicago+Works','spot-open311-SeeClickFix']
df_uncleared = df_uncleared.assign(ORIGIN_binned=df_uncleared['ORIGIN'].replace(other_origins, 'Other'))


In [17]:
df_by_type = df_uncleared.groupby('ORIGIN_binned').agg(complaints=('SR_NUMBER','count')).reset_index()
df_by_type

Unnamed: 0,ORIGIN_binned,complaints
0,Alderman's Office,724
1,Internet,7068
2,Mobile Device,8260
3,Other,209
4,Phone Call,4818
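
The binned counts are easier to compare as shares of the total. A minimal sketch follows, with the counts re-entered by hand from the table above. The names `origin_counts` and `share_pct` are assumptions introduced here, not columns in the processed data.

```python
import pandas as pd

# Counts copied from the binned origin table above
origin_counts = pd.DataFrame({
    "ORIGIN_binned": ["Alderman's Office", "Internet", "Mobile Device", "Other", "Phone Call"],
    "complaints": [724, 7068, 8260, 209, 4818],
})

# Percentage of all uncleared sidewalk complaints coming from each origin
total = origin_counts["complaints"].sum()
origin_counts["share_pct"] = (100 * origin_counts["complaints"] / total).round(1)
print(origin_counts.sort_values("share_pct", ascending=False))
```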


### uncleared complaints by status

In [18]:
df_by_status = df_uncleared.groupby('STATUS')['SR_NUMBER'].agg('count').reset_index()
df_by_status

Unnamed: 0,STATUS,SR_NUMBER
0,Canceled,978
1,Completed,20041
2,Open,60


### uncleared complaints per capita by community

In [19]:
df_uncleared_by_community = df_community_summary[['COMMUNITY_NAME','Sidewalks Per 10k','Snow – Uncleared Sidewalk Complaint','2020_POP','COMMUNITY_CAPS']].sort_values(by='Sidewalks Per 10k', ascending=False)
df_uncleared_by_community.head(10)

Unnamed: 0,COMMUNITY_NAME,Sidewalks Per 10k,Snow – Uncleared Sidewalk Complaint,2020_POP,COMMUNITY_CAPS
39,Lincoln Square,57.354176,929,40494,LINCOLN SQUARE
40,Logan Square,55.361753,1587,71665,LOGAN SQUARE
66,Uptown,46.124655,1055,57182,UPTOWN
75,West Town,45.824267,1609,87781,WEST TOWN
38,Lincoln Park,42.628951,1202,70492,LINCOLN PARK
37,Lake View,40.393013,1665,103050,LAKE VIEW
50,North Center,37.23586,523,35114,NORTH CENTER
34,Irving Park,36.19561,752,51940,IRVING PARK
7,Avondale,35.096671,509,36257,AVONDALE
21,Edgewater,26.511653,597,56296,EDGEWATER
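
The listing above shows only the highest rates. The lowest rates could be pulled out the same way with `nsmallest`. The sketch below runs on a four-row sample copied from the output above, so its result illustrates the call only and does not reproduce the citywide bottom five.

```python
import pandas as pd

# Four rows copied from the top-10 output above, used only to demonstrate the call
sample = pd.DataFrame({
    "COMMUNITY_NAME": ["Lincoln Square", "Logan Square", "Uptown", "Edgewater"],
    "Sidewalks Per 10k": [57.354176, 55.361753, 46.124655, 26.511653],
})

# nsmallest returns the n rows with the lowest values, sorted in ascending order
lowest = sample.nsmallest(2, "Sidewalks Per 10k")
print(lowest)
```

On the full table, `df_uncleared_by_community.nsmallest(5, 'Sidewalks Per 10k')` would presumably give the five communities with the fewest complaints per capita.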


In [20]:
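# total uncleared sidewalk complaints across communities; should equal len(df_uncleared), i.e. no complaints were dropped in the merge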
df_uncleared_by_community['Snow – Uncleared Sidewalk Complaint'].sum()

21079

# Export

In [21]:
df_uncleared_by_community.to_csv("../../results/ssw01-plow/311_uncleared_by_community.csv", index=False)

In [27]:
# complaints by origin by ward
#df_uncleared.pivot_table(index='WARD',columns='ORIGIN',aggfunc = 'count', values='SR_NUMBER')