<h1>Snow Violations</h1>
9 January 2024

Revisiting 311 complaints related to snow clearance, I'm broadening my review to cover all snow-related categories, not just uncleared sidewalk complaints.
<br>
My analysis steps:
<ol>
<li><a href="#docs">Review API Documentation</a>
<li><a href="#import">Import Libraries</a>
<li><a href="#retrieve_data">Get Data</a>
<li><a href="#review">Explore Data</a>
</ol>

<h3>Preliminary Findings</h3>
<ul>
    <li>Most complaints come in by phone call (30,697), followed by internet (17,551), mobile device (16,278), and alderman's office (3,353).
<li>Most complaints have a status of Completed.
</ul>

<a name = "docs"></a>
    <h1>1. Review Documentation</h1>

<h3>311 data</h3>
<ul>
    <li>Chicago Open Data Portal: <a href="https://data.cityofchicago.org/Service-Requests/311-Service-Requests/v6vf-nfxy">https://data.cityofchicago.org/Service-Requests/311-Service-Requests/v6vf-nfxy</a>
    <li>Chicago API: <a href="https://data.cityofchicago.org/resource/v6vf-nfxy.json">https://data.cityofchicago.org/resource/v6vf-nfxy.json</a>
        <li>API Documentation: <a href="https://dev.socrata.com/foundry/data.cityofchicago.org/v6vf-nfxy">https://dev.socrata.com/foundry/data.cityofchicago.org/v6vf-nfxy</a>
            <li><b>Developer Portal (Socrata):</b> <a href="https://dev.socrata.com/">https://dev.socrata.com/</a> (general reference for Socrata)<br>
                <ul>
                    <li>SoQL <code>like</code>: <a href="https://dev.socrata.com/docs/functions/like.html">https://dev.socrata.com/docs/functions/like.html</a>
                </ul>
</ul>

<a name = "import"></a>
<h1>2. Import Libraries</h1>

In [1]:
import pandas as pd
import requests
#import datetime as dt #would only need this if I manipulated dates post-API data retrieval

<a name = "retrieve_data"></a>
    <h1>3. Get Data</h1>

In [13]:
base_url = "https://data.cityofchicago.org/resource/v6vf-nfxy.json"
#select = "SR_NUMBER, SR_TYPE, SR_SHORT_CODE, CREATED_DATE, STREET_ADDRESS, COMMUNITY_AREA, WARD, OWNER_DEPARTMENT, STATUS, ORIGIN, CLOSED_DATE"
where = "SR_TYPE like '%25Snow%25'"
limit = 99999

url = f"{base_url}?$WHERE={where}&$LIMIT={limit}"
#url = f"{base_url}?$SELECT={select}&$WHERE={where}&$LIMIT={limit}"
print(url)

https://data.cityofchicago.org/resource/v6vf-nfxy.json?$WHERE=SR_TYPE like '%25Snow%25'&$LIMIT=99999
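
The `%25` in the `where` string above is a hand-encoded `%`, the SoQL `like` wildcard. As an alternative sketch, the standard library's `urllib.parse.urlencode` can do the percent-encoding, so the filter can be written with plain `%` signs. The `safe` characters below are my own choice to keep the printed URL readable, not something the API requires.

```python
from urllib.parse import urlencode, quote

base_url = "https://data.cityofchicago.org/resource/v6vf-nfxy.json"

# Write the SoQL filter with literal % wildcards; urlencode escapes them.
params = {
    "$where": "SR_TYPE like '%Snow%'",
    "$limit": 99999,
}

# quote_via=quote encodes spaces as %20 rather than "+";
# safe="$'" leaves "$" and "'" unescaped purely for readability.
query = urlencode(params, quote_via=quote, safe="$'")
url = f"{base_url}?{query}"
print(url)
```

Passing the same `params` dictionary to `requests.get(base_url, params=params)` should have a similar effect, since `requests` encodes query parameters itself.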


In [16]:
response = requests.get(url)
data = response.json()
print(response)

<Response [200]>
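
A single request with `$limit=99999` only returns everything while the result set stays below that size. If it grows, one option is to page through the results with Socrata's `$limit` and `$offset` parameters. The sketch below is an assumption about how that could look, not part of the original analysis: `fetch_all` and `socrata_page` are hypothetical helper names, and the page size and `$order` column are arbitrary choices.

```python
def fetch_all(fetch_page, page_size=50000):
    """Collect rows page by page until a short (or empty) page is returned.

    fetch_page(limit, offset) is a hypothetical callable that returns one
    page of rows as a list of dicts.
    """
    rows = []
    offset = 0
    while True:
        page = fetch_page(page_size, offset)
        rows.extend(page)
        if len(page) < page_size:
            break  # last page reached
        offset += page_size
    return rows


def socrata_page(limit, offset):
    """Fetch one page of snow-related requests (needs network access)."""
    import requests  # imported here so fetch_all stays usable without it

    params = {
        "$where": "SR_TYPE like '%Snow%'",
        "$order": "sr_number",  # stable ordering so pages do not overlap
        "$limit": limit,
        "$offset": offset,
    }
    response = requests.get(
        "https://data.cityofchicago.org/resource/v6vf-nfxy.json",
        params=params,
        timeout=60,
    )
    response.raise_for_status()  # fail loudly on a non-200 response
    return response.json()
```

With network access, `data = fetch_all(socrata_page)` would then stand in for the single `requests.get(url)` call.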


In [17]:
df = pd.DataFrame(data)
df.head()

Unnamed: 0,sr_number,sr_type,sr_short_code,owner_department,status,origin,created_date,last_modified_date,closed_date,street_address,...,y_coordinate,latitude,longitude,location,city,state,created_department,parent_sr_number,electrical_district,sanitation_division_days
0,SR19-00102142,Snow - Object/Dibs Removal Request,SDW,Streets and Sanitation,Completed,Internet,2019-01-22T17:47:53.000,2020-02-13T18:22:55.000,2019-03-15T07:25:34.000,23 S Drake AVE,...,,,,,,,,,,
1,SR22-00050149,Ice and Snow Removal Request,SDO,Streets and Sanitation,Completed,Mobile Device,2022-01-10T11:05:32.000,2023-09-27T00:01:48.000,2022-01-10T11:06:41.000,4151 W WASHINGTON BLVD,...,1900052.806391,41.881691809,-87.730165409,"{'latitude': '41.881691808747455', 'longitude'...",,,,,,
2,SR19-00123488,Snow - Object/Dibs Removal Request,SDW,Streets and Sanitation,Completed,Internet,2019-01-27T16:13:36.000,2020-02-13T18:27:44.000,2019-03-15T07:22:15.000,2320 N Luna AVE,...,,,,,,,,,,
3,SR23-01930406,Ice and Snow Removal Request,SDO,Streets and Sanitation,Completed,Internet,2023-11-06T21:46:58.000,2023-11-09T18:25:10.000,2023-11-09T18:25:10.000,1410 E 62ND ST,...,1864254.816946813,41.782635001,-87.5906235,"{'latitude': '41.78263500094009', 'longitude':...",Chicago,Illinois,,,,
4,SR23-01979807,Snow - Object/Dibs Removal Request,SDW,Streets and Sanitation,Completed,Mobile Device,2023-11-14T14:38:50.000,2023-11-27T11:56:32.000,2023-11-27T11:56:32.000,1421 N MENARD AVE,...,1908973.258247206,41.906376001,-87.770403,"{'latitude': '41.90637600094052', 'longitude':...",,,,,,


In [18]:
# total number of records whose SR_TYPE contains "Snow"
len(df)

68551

In [19]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 68551 entries, 0 to 68550
Data columns (total 38 columns):
 #   Column                    Non-Null Count  Dtype 
---  ------                    --------------  ----- 
 0   sr_number                 68551 non-null  object
 1   sr_type                   68551 non-null  object
 2   sr_short_code             68551 non-null  object
 3   owner_department          68551 non-null  object
 4   status                    68551 non-null  object
 5   origin                    68551 non-null  object
 6   created_date              68551 non-null  object
 7   last_modified_date        68551 non-null  object
 8   closed_date               68440 non-null  object
 9   street_address            68349 non-null  object
 10  zip_code                  64325 non-null  object
 11  street_number             68349 non-null  object
 12  street_direction          68345 non-null  object
 13  street_name               68349 non-null  object
 14  street_type           
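
Every column listed in the `df.info()` output above has dtype `object` because the JSON API returns all values as strings. A minimal sketch of converting a few of them follows, using two rows copied from the `df.head()` output; which columns to convert is my assumption.

```python
import pandas as pd

# Two sample rows (values copied from the df.head() output above).
sample = pd.DataFrame({
    "sr_number": ["SR19-00102142", "SR22-00050149"],
    "created_date": ["2019-01-22T17:47:53.000", "2022-01-10T11:05:32.000"],
    "closed_date": ["2019-03-15T07:25:34.000", "2022-01-10T11:06:41.000"],
    "latitude": [None, "41.881691809"],
    "longitude": [None, "-87.730165409"],
})

# Dates arrive as ISO 8601 strings; convert them to datetime64.
for col in ["created_date", "closed_date"]:
    sample[col] = pd.to_datetime(sample[col])

# Coordinates arrive as strings; missing values become NaN.
for col in ["latitude", "longitude"]:
    sample[col] = pd.to_numeric(sample[col], errors="coerce")

print(sample.dtypes)
```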

In [7]:
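# parse date
# the API returns lower-case column names; upper-case them to match the cells below
df.columns = df.columns.str.upper()
df['CREATED_DATE'] = pd.to_datetime(df['CREATED_DATE'])
df['year'] = df['CREATED_DATE'].dt.year
df['date'] = df['CREATED_DATE'].dt.date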

In [8]:
df.head()

Unnamed: 0,SR_NUMBER,SR_TYPE,SR_SHORT_CODE,CREATED_DATE,STREET_ADDRESS,COMMUNITY_AREA,WARD,OWNER_DEPARTMENT,STATUS,ORIGIN,CLOSED_DATE,year,date
0,SR21-00149474,Ice and Snow Removal Request,SDO,2021-01-27 13:13:50,1700 W 15TH ST,28,28,Streets and Sanitation,Completed,Mobile Device,2021-01-27T19:50:47.000,2021,2021-01-27
1,SR21-00177207,Ice and Snow Removal Request,SDO,2021-02-01 10:19:34,1300 S HEATH AVE,28,28,Streets and Sanitation,Completed,Mobile Device,2021-02-01T15:12:41.000,2021,2021-02-01
2,SR21-00179217,Ice and Snow Removal Request,SDO,2021-02-01 13:17:16,3242 W FULTON BLVD,27,28,Streets and Sanitation,Completed,Mobile Device,2021-02-01T21:10:54.000,2021,2021-02-01
3,SR21-00269268,Ice and Snow Removal Request,SDO,2021-02-17 12:43:47,819 S BISHOP ST,28,28,Streets and Sanitation,Completed,Mobile Device,2021-02-19T00:58:56.000,2021,2021-02-17
4,SR20-05442947,Snow - Object/Dibs Removal Request,SDW,2020-11-11 19:40:31,3401 W 53RD ST,63,14,Streets and Sanitation,Completed,Mobile Device,2020-11-16T06:50:26.000,2020,2020-11-11


In [9]:
df[df['SR_SHORT_CODE']=='SWSNOREM'].head()

Unnamed: 0,SR_NUMBER,SR_TYPE,SR_SHORT_CODE,CREATED_DATE,STREET_ADDRESS,COMMUNITY_AREA,WARD,OWNER_DEPARTMENT,STATUS,ORIGIN,CLOSED_DATE,year,date
5,SR19-02929866,Snow – Uncleared Sidewalk Complaint,SWSNOREM,2019-11-12 09:31:20,,,,CDOT - Department of Transportation,Completed,Internet,2019-11-21T09:15:03.000,2019,2019-11-12
81,SR20-05649092,Snow – Uncleared Sidewalk Complaint,SWSNOREM,2020-12-17 13:24:37,5200 S BLACKSTONE AVE,41.0,4.0,CDOT - Department of Transportation,Completed,Phone Call,2020-12-18T15:12:51.000,2020,2020-12-17
96,SR21-00001364,Snow – Uncleared Sidewalk Complaint,SWSNOREM,2021-01-01 11:40:33,66 E CHESTNUT ST,8.0,42.0,CDOT - Department of Transportation,Completed,Mobile Device,2021-01-04T14:33:44.000,2021,2021-01-01
97,SR21-00000774,Snow – Uncleared Sidewalk Complaint,SWSNOREM,2021-01-01 09:11:15,2648 N WHIPPLE ST,22.0,32.0,CDOT - Department of Transportation,Completed,Internet,2021-01-06T13:37:58.000,2021,2021-01-01
99,SR20-05723378,Snow – Uncleared Sidewalk Complaint,SWSNOREM,2020-12-31 07:19:37,5015 N SPRINGFIELD AVE,14.0,39.0,CDOT - Department of Transportation,Completed,Mobile Device,2021-01-08T15:30:57.000,2020,2020-12-31


<a name = "review"></a>
# 4. Explore Data

### Complaints by Status

In [10]:
df_by_status = df.pivot_table(index='STATUS', columns='year', values='SR_NUMBER', aggfunc='size', fill_value=0)
df_by_status

STATUS,2018,2019,2020,2021,2022,2023,2024
Canceled,0,12,18,556,375,32,0
Completed,1,3750,7881,29569,23600,2550,62
Open,0,3,13,21,31,7,56


### Complaints by Origin

In [12]:
df_by_origin = df.groupby('ORIGIN')['SR_NUMBER'].agg('count').reset_index()
df_by_origin

Unnamed: 0,ORIGIN,SR_NUMBER
0,Alderman's Office,3353
1,Chicago Police Department,2
2,City Department,19
3,E-Mail,56
4,Generated In House,1
5,Internet,17551
6,Mobile Device,16278
7,Open311,1
8,Open311 Interface,2
9,Phone Call,30697
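
To put the counts above in proportion, each origin's share of the total can be computed. This sketch rebuilds the counts from the table above as a Series; on the full DataFrame, `df['ORIGIN'].value_counts(normalize=True)` should give the same ratios directly.

```python
import pandas as pd

# Counts copied from the df_by_origin output above.
origin_counts = pd.Series({
    "Alderman's Office": 3353,
    "Chicago Police Department": 2,
    "City Department": 19,
    "E-Mail": 56,
    "Generated In House": 1,
    "Internet": 17551,
    "Mobile Device": 16278,
    "Open311": 1,
    "Open311 Interface": 2,
    "Phone Call": 30697,
})

# Percentage of all complaints per origin, rounded to one decimal place.
share = (origin_counts / origin_counts.sum() * 100).round(1)
print(share.sort_values(ascending=False))
```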


In [11]:
df_by_origin_by_year = df.pivot_table(index='ORIGIN', columns='year', values='SR_NUMBER', aggfunc='size', fill_value=0)
df_by_origin_by_year

ORIGIN,2018,2019,2020,2021,2022,2023,2024
Alderman's Office,0,187,227,1240,1624,73,2
Chicago Police Department,0,0,0,0,0,2,0
City Department,0,0,3,16,0,0,0
E-Mail,0,1,2,37,11,5,0
Generated In House,0,0,0,0,1,0,0
Internet,0,944,2129,7727,5986,741,24
Mobile Device,1,1078,2883,5500,5913,861,42
Open311,0,1,0,0,0,0,0
Open311 Interface,0,0,0,1,0,1,0
Phone Call,0,1543,2624,15324,10277,879,50


### 311 Complaints by Year
Note that 2024 is only a partial year and that data are missing for 1/1/23 to 1/12/23.

In [None]:
#get latest date
max_date = df['date'].max()
max_date

In [None]:
df_by_year = df.groupby('year')['date'].agg(['count', 'min', 'max']).reset_index()
df_by_year

### Complaint Type Codes (SR_SHORT_CODE)

In [None]:
#connect short codes with descriptions
df_types = df.pivot_table(index=['SR_TYPE','OWNER_DEPARTMENT'], columns='SR_SHORT_CODE', values='SR_NUMBER', aggfunc='size', fill_value=0)
df_types

In [None]:
# separate name so the by-year summary above is not overwritten
df_type_by_year = df.pivot_table(index='SR_TYPE', columns='year', values='SR_NUMBER', aggfunc='size', fill_value=0)
df_type_by_year

# Prior Analysis (November 2023)

### 311 Complaints by Address
Look for repeat offenders: addresses with more than one 311 complaint.

In [None]:
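# df_311 is not defined in this notebook; assumed here to be the uncleared-sidewalk
# subset (SWSNOREM) that the November 2023 analysis covered
df_311 = df[df['SR_SHORT_CODE'] == 'SWSNOREM']
df_by_address = df_311.groupby('STREET_ADDRESS').agg(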
    count=('STREET_ADDRESS', 'size'),
    min_date=('date', 'min'),
    max_date=('date', 'max')
).reset_index()
df_repeat_offenders = df_by_address[df_by_address['count'] > 1].sort_values(by='count', ascending=False)
df_repeat_offenders
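
The `df.head()` output in section 3 shows addresses in mixed case ("23 S Drake AVE") alongside upper case ("4151 W WASHINGTON BLVD"), so grouping on the raw string could split one address across several groups. A minimal sketch of normalizing the address first follows. The sample rows and the `address_key` column are hypothetical and only illustrate the effect.

```python
import pandas as pd

# Hypothetical rows: the same address written in two different cases.
sample = pd.DataFrame({
    "STREET_ADDRESS": ["23 S Drake AVE", "23 S DRAKE AVE", "1410 E 62ND ST"],
    "SR_NUMBER": ["A", "B", "C"],  # placeholder IDs, not real request numbers
})

# Upper-case and trim whitespace so case variants fall into one group.
sample["address_key"] = sample["STREET_ADDRESS"].str.upper().str.strip()

counts = (
    sample.groupby("address_key")
    .size()
    .reset_index(name="count")
    .sort_values("count", ascending=False)
)
print(counts)
```

Grouping on the raw `STREET_ADDRESS` column would report three addresses with one complaint each; grouping on the normalized key reports two addresses, one of them with two complaints.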

### 311 Complaints by Ward

In [None]:
df_by_ward = df_311.groupby('WARD')['SR_NUMBER'].agg(['count']).reset_index().sort_values(by='count', ascending=False)
df_by_ward