<h1>Snow-Related 311 Complaints in Chicago, 2019-2023</h1>
11 January 2024

This analysis looks at all four types of snow-related 311 complaints:
<table>
    <tr><td><strong>SR_SHORT_CODE</strong></td><td><strong>Complaint Description</strong></td><td><strong>Responsible Department</strong></td></tr>
    <tr><td>SWSNOREM</td><td>Snow – Uncleared Sidewalk Complaint</td><td>Streets and Sanitation</td></tr>
    <tr><td>SDO</td><td>Ice and Snow Removal Request</td><td>Streets and Sanitation</td></tr>
    <tr><td>SDW</td><td>Snow - Object/Dibs Removal Request</td><td>CDOT</td></tr>
    <tr><td>SNPBLBS</td><td>Snow Removal - Protected Bike Lane or Bridge Sidewalk</td><td>CDOT</td></tr> 
</table>

My analysis steps:
<ol>
<li><a href="#read">Read Data</a>
<li><a href="#tabulate">Tabulate by Community Area</a>
<li><a href="#summarize">Summarize Results</a>
</ol>

<br>
Note that snow "seasons" run from July 1 through June 30 of the following year. For example, in my processed data set the season '2020-2021' covers July 1, 2020 through June 30, 2021.
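
The step that builds the `season` column is not part of this notebook. As a minimal sketch of the rule described above, the hypothetical helper `snow_season` (my name for illustration, not a function from the prep script) maps a date to its season label:

```python
from datetime import date


def snow_season(d: date) -> str:
    """Return the snow-season label for a date, e.g. '2020-2021'."""
    # Seasons start on July 1, so July-December belong to the season
    # that begins in the same calendar year; January-June belong to
    # the season that began the previous calendar year.
    start_year = d.year if d.month >= 7 else d.year - 1
    return f"{start_year}-{start_year + 1}"


print(snow_season(date(2020, 7, 1)))   # first day of a season
print(snow_season(date(2021, 6, 30)))  # last day of the same season
print(snow_season(date(2021, 7, 1)))   # first day of the next season
```

The first two dates fall in the same season and the third starts the next one.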

<h3>Preliminary Findings</h3>
<ul>
    <li>The five neighborhoods with the most 311 complaints per capita are (in order) Lincoln Square, Logan Square, Uptown, West Town, and Lincoln Park.
    <li>The five neighborhoods with the fewest 311 complaints per capita are (in order) West Pullman, Riverdale, Mount Greenwood, Pullman, and Hegewisch. These all appear to be car-centric neighborhoods.
</ul>

<a name="read"></a>
# 1. Read Data

In [1]:
import pandas as pd
import requests
#import datetime as dt #would only need this if I manipulated dates post-API data retrieval

### Read snow complaints

In [2]:
df = pd.read_csv("../data/02-prepped/311-complaints-snow.csv")
df.head()

In [3]:
len(df)

#### filter for just full snow seasons, from July 2019 to June 2023

In [4]:
df = df[(df['season'].isin(['2019-2020','2020-2021','2021-2022','2022-2023']))]
len(df)

### read community population

In [5]:
# retrieved on 1/11/24, but 2020 Census Population figures should be static

df_population = pd.read_csv("../data/01-raw/population_cmap_2022.csv")

In [6]:
# simplify dataframe to get only essentials. COMMUNITY_CAPS is useful for linking with community areas boundary shapefiles 
df_population = df_population[['GEOG','2020_POP']]
df_population = df_population.rename(columns={'GEOG':'COMMUNITY_NAME'})
df_population['COMMUNITY_CAPS']=df_population['COMMUNITY_NAME'].str.upper()
df_population.head()

<a name="tabulate"></a>
# 2. Tabulate 311 Complaints by Community Area, 2019-2023

In [7]:
df_community_by_type = df.pivot_table(index='COMMUNITY_NAME', columns='SR_TYPE', values='SR_NUMBER', aggfunc='size', fill_value=0)
df_community_by_type.head()

### merge 311 complaints data with community data

In [8]:
df_community_summary = pd.merge(left=df_community_by_type,right=df_population,on='COMMUNITY_NAME')
df_community_summary.head()
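

`pd.merge` defaults to an inner join, so a community area whose name is spelled differently in the two tables would silently drop out of the summary. One way to check for that, sketched here on toy frames (the names, counts, and populations below are made up for illustration and are not from the real data), is an outer merge with `indicator=True`:

```python
import pandas as pd

# Toy stand-ins for the complaint counts and population tables.
# All values are hypothetical; "O'Hare" vs "OHare" mimics a spelling mismatch.
counts = pd.DataFrame(
    {"COMMUNITY_NAME": ["Uptown", "O'Hare", "Loop"], "complaints": [12, 3, 7]}
)
population = pd.DataFrame(
    {"COMMUNITY_NAME": ["Uptown", "OHare", "Loop"], "2020_POP": [1000, 2000, 3000]}
)

# An outer merge keeps every row from both sides; the _merge column
# records whether each row came from the left, the right, or both.
check = pd.merge(counts, population, on="COMMUNITY_NAME", how="outer", indicator=True)
unmatched = check[check["_merge"] != "both"]
print(unmatched[["COMMUNITY_NAME", "_merge"]])
```

If `unmatched` is empty on the real tables, the inner merge above loses no community areas.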

### calculate per capita rates of 311 complaints by community

In [9]:
# complaints per 10,000 residents per season, averaged over the 4 seasons
df_community_summary['Streets Per 10k'] = \
(10000/4)*df_community_summary['Ice and Snow Removal Request']/df_community_summary['2020_POP']

df_community_summary['Dibs Per 10k'] = \
(10000/4)*df_community_summary['Snow - Object/Dibs Removal Request']/df_community_summary['2020_POP']

df_community_summary['Sidewalks Per 10k'] = \
(10000/4)*df_community_summary['Snow – Uncleared Sidewalk Complaint']/df_community_summary['2020_POP']

df_community_summary['Bike-Bridge Per 10k'] = \
(10000/4)*df_community_summary['Snow Removal - Protected Bike Lane or Bridge Sidewalk']/df_community_summary['2020_POP']
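

The four calculations above share one formula: total complaints, divided by the number of seasons and by population, scaled to 10,000 residents. A minimal sketch of that formula as a reusable function follows. The helper name `annual_rate_per_10k` and the toy numbers are assumptions for illustration only:

```python
import pandas as pd


def annual_rate_per_10k(
    counts: pd.Series, population: pd.Series, n_seasons: int = 4
) -> pd.Series:
    """Average complaints per season per 10,000 residents."""
    # Algebraically the same as (10000 / n_seasons) * counts / population.
    return counts * 10_000 / (n_seasons * population)


# Toy data with made-up counts and populations.
toy = pd.DataFrame(
    {"complaints": [200, 40], "2020_POP": [50_000, 20_000]},
    index=["Area A", "Area B"],
)
toy["per_10k"] = annual_rate_per_10k(toy["complaints"], toy["2020_POP"])
print(toy)
```

For Area A, 200 complaints over 4 seasons is 50 per season; with 50,000 residents that is 10 complaints per 10,000 residents per season.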

In [10]:
df_community_summary.head()

<a name="summarize"></a>
# 3. Summarize Stats for 311 Uncleared Sidewalk Complaints

In [11]:
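# filter to look at only uncleared sidewalk complaints
# .copy() makes this an independent dataframe, so adding columns later does not trigger SettingWithCopyWarning
df_uncleared = df[df['SR_SHORT_CODE']=='SWSNOREM'].copy()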
len(df_uncleared)

### uncleared complaints by season

In [12]:
df_by_season = df_uncleared.groupby('season').agg(complaints=('SR_NUMBER','count')).reset_index()
df_by_season

### uncleared complaints by complaint origin

In [13]:
df_by_type = df_uncleared.groupby('ORIGIN').agg(complaints=('SR_NUMBER','count')).reset_index()
df_by_type

In [14]:
# consolidate the listed origins into a single 'Other' category
other_origins = ['E-Mail', 'Generated In House', 'Open311 Interface', 'Salesforce Mobile App',
                 'spot-open311-Chicago+Works', 'spot-open311-SeeClickFix']
df_uncleared['ORIGIN_binned'] = df_uncleared['ORIGIN'].replace(other_origins, 'Other')

In [15]:
df_by_type = df_uncleared.groupby('ORIGIN_binned').agg(complaints=('SR_NUMBER','count')).reset_index()
df_by_type
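

The binning above names each origin to fold into 'Other' by hand. An alternative, sketched here on toy data, is to bin by frequency so that any origin below a cutoff becomes 'Other'. The cutoff of 2 and the toy counts are hypothetical, and this is not the method used in this notebook:

```python
import pandas as pd

# Toy complaint origins with made-up frequencies.
toy = pd.DataFrame(
    {
        "ORIGIN": ["Phone Call"] * 5
        + ["Internet"] * 4
        + ["E-Mail"]
        + ["Generated In House"]
    }
)

# Count each origin and keep those that meet the (hypothetical) cutoff.
counts = toy["ORIGIN"].value_counts()
keep = counts[counts >= 2].index

# Series.where keeps values where the condition holds and
# substitutes "Other" everywhere else.
toy["ORIGIN_binned"] = toy["ORIGIN"].where(toy["ORIGIN"].isin(keep), "Other")
binned_counts = toy["ORIGIN_binned"].value_counts()
print(binned_counts)
```

A frequency cutoff adapts automatically if new origin values appear in a later data pull, while the explicit list keeps the categories fixed and easier to audit.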

### uncleared complaints by status

In [16]:
df_by_status = df_uncleared.groupby('STATUS')['SR_NUMBER'].agg('count').reset_index()
df_by_status

### uncleared complaints per capita by community

In [17]:
df_uncleared_by_community = df_community_summary[['COMMUNITY_NAME','Sidewalks Per 10k','Snow – Uncleared Sidewalk Complaint','2020_POP','COMMUNITY_CAPS']].sort_values(by='Sidewalks Per 10k', ascending=False)
df_uncleared_by_community.head(10)

In [18]:
df_uncleared_by_community.to_csv("../results/311_uncleared_by_community.csv", index=False)