<h1>Snow Violations</h1>
9 January 2024

Revisiting 311 complaints related to snow clearance, I'm broadening my review to cover all related categories, not just uncleared sidewalk complaints.
<br>
My analysis steps:
<ol>
<li><a href="#docs">Review API Documentation</a>
<li><a href="#import">Import Libraries</a>
<li><a href="#retrieve_data">Get Data</a>
<li><a href="#review">Explore Data</a>
</ol>

<h3>Preliminary Findings</h3>
<ul>
    <li>Most complaints originate from phone calls (30,697), followed by internet (17,551), mobile (16,278), and the alderman's office (3,353)
    <li>Most complaints have a status of Completed
</ul>

<a name = "docs"></a>
    <h1>1. Review Documentation</h1>

<h3>311 data</h3>
<ul>
    <li>Chicago Open Data Portal: <a href="https://data.cityofchicago.org/Service-Requests/311-Service-Requests/v6vf-nfxy">https://data.cityofchicago.org/Service-Requests/311-Service-Requests/v6vf-nfxy</a>
    <li>Chicago API: <a href="https://data.cityofchicago.org/resource/v6vf-nfxy.json">https://data.cityofchicago.org/resource/v6vf-nfxy.json</a>
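    <li>API Documentation: <a href="https://dev.socrata.com/foundry/data.cityofchicago.org/v6vf-nfxy">https://dev.socrata.com/foundry/data.cityofchicago.org/v6vf-nfxy</a>
    <li><b>Developer Portal (Socrata):</b> <a href="https://dev.socrata.com/">https://dev.socrata.com/</a> (general reference for Socrata)
        <ul>
            <li>SoQL <code>like</code>: <a href="https://dev.socrata.com/docs/functions/like.html">https://dev.socrata.com/docs/functions/like.html</a>
        </ul>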
</ul>

<a name = "import"></a>
<h1>2. Import Libraries</h1>

In [1]:
import pandas as pd
import requests
#import datetime as dt #would only need this if I manipulated dates post-API data retrieval

<a name = "retrieve_data"></a>
    <h1>3. Get Data</h1>

In [2]:
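base_url = "https://data.cityofchicago.org/resource/v6vf-nfxy.json"
# optional column list, currently unused (all columns are retrieved)
#select = "SR_NUMBER, SR_TYPE, SR_SHORT_CODE, CREATED_DATE, STREET_ADDRESS, COMMUNITY_AREA, WARD, OWNER_DEPARTMENT, STATUS, ORIGIN, CLOSED_DATE"
# '%25' is the URL-encoded form of '%', the wildcard used by SoQL LIKE
where = "SR_TYPE like '%25Snow%25'"
# upper bound on rows returned; must exceed the number of matching records
limit = 99999

url = f"{base_url}?$WHERE={where}&$LIMIT={limit}"
#url = f"{base_url}?$SELECT={select}&$WHERE={where}&$LIMIT={limit}"
print(url)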

https://data.cityofchicago.org/resource/v6vf-nfxy.json?$WHERE=SR_TYPE like '%25Snow%25'&$LIMIT=99999
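
The query above hand-encodes each `%` wildcard as `%25`. As a minimal sketch of an alternative (not what this notebook ran), the standard library's `urlencode` can do the escaping so the clause can be written with plain `%` characters. The lowercase `$where` and `$limit` names follow the Socrata documentation; the variable names `params`, `encoded_url`, and `decoded` are illustrative.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

base_url = "https://data.cityofchicago.org/resource/v6vf-nfxy.json"

# Write the SoQL clause with plain '%' wildcards; urlencode() escapes them.
params = {
    "$where": "SR_TYPE like '%Snow%'",
    "$limit": 99999,
}

# safe="$" keeps the leading '$' of the Socrata parameter names readable.
encoded_url = f"{base_url}?{urlencode(params, safe='$')}"
print(encoded_url)

# Decoding the query string recovers the original clause.
decoded = parse_qs(urlsplit(encoded_url).query)
print(decoded["$where"][0])
```

Passing the same dictionary to `requests.get(base_url, params=params)` should have a similar effect, since `requests` encodes query parameters itself.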


In [3]:
response = requests.get(url)
data = response.json()
print(response)

<Response [200]>


In [4]:
df = pd.DataFrame(data)
df.head()

Unnamed: 0,sr_number,sr_type,sr_short_code,owner_department,status,origin,created_date,last_modified_date,closed_date,street_address,...,y_coordinate,latitude,longitude,location,city,state,created_department,parent_sr_number,electrical_district,sanitation_division_days
0,SR19-00102142,Snow - Object/Dibs Removal Request,SDW,Streets and Sanitation,Completed,Internet,2019-01-22T17:47:53.000,2020-02-13T18:22:55.000,2019-03-15T07:25:34.000,23 S Drake AVE,...,,,,,,,,,,
1,SR22-00050149,Ice and Snow Removal Request,SDO,Streets and Sanitation,Completed,Mobile Device,2022-01-10T11:05:32.000,2023-09-27T00:01:48.000,2022-01-10T11:06:41.000,4151 W WASHINGTON BLVD,...,1900052.806391,41.881691809,-87.730165409,"{'latitude': '41.881691808747455', 'longitude'...",,,,,,
2,SR19-00123488,Snow - Object/Dibs Removal Request,SDW,Streets and Sanitation,Completed,Internet,2019-01-27T16:13:36.000,2020-02-13T18:27:44.000,2019-03-15T07:22:15.000,2320 N Luna AVE,...,,,,,,,,,,
3,SR23-01930406,Ice and Snow Removal Request,SDO,Streets and Sanitation,Completed,Internet,2023-11-06T21:46:58.000,2023-11-09T18:25:10.000,2023-11-09T18:25:10.000,1410 E 62ND ST,...,1864254.816946813,41.782635001,-87.5906235,"{'latitude': '41.78263500094009', 'longitude':...",Chicago,Illinois,,,,
4,SR23-01979807,Snow - Object/Dibs Removal Request,SDW,Streets and Sanitation,Completed,Mobile Device,2023-11-14T14:38:50.000,2023-11-27T11:56:32.000,2023-11-27T11:56:32.000,1421 N MENARD AVE,...,1908973.258247206,41.906376001,-87.770403,"{'latitude': '41.90637600094052', 'longitude':...",,,,,,


In [5]:
# total number of records whose SR_TYPE contains "Snow"
len(df)

68578

In [6]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 68578 entries, 0 to 68577
Data columns (total 38 columns):
 #   Column                    Non-Null Count  Dtype 
---  ------                    --------------  ----- 
 0   sr_number                 68578 non-null  object
 1   sr_type                   68578 non-null  object
 2   sr_short_code             68578 non-null  object
 3   owner_department          68578 non-null  object
 4   status                    68578 non-null  object
 5   origin                    68578 non-null  object
 6   created_date              68578 non-null  object
 7   last_modified_date        68578 non-null  object
 8   closed_date               68469 non-null  object
 9   street_address            68376 non-null  object
 10  zip_code                  64345 non-null  object
 11  street_number             68376 non-null  object
 12  street_direction          68372 non-null  object
 13  street_name               68376 non-null  object
 14  street_type           
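
Every column listed above has dtype `object`: the JSON payload delivers dates and coordinates as strings. A minimal sketch of converting a few of them follows, using two rows copied from the `df.head()` output; the choice of columns and the `errors="coerce"` setting (which turns unparseable or missing values into `NaT`/`NaN`) are assumptions, not steps taken in the original notebook.

```python
import pandas as pd

# Two rows copied from the df.head() output above; None marks empty cells.
sample = pd.DataFrame({
    "sr_number": ["SR19-00102142", "SR22-00050149"],
    "created_date": ["2019-01-22T17:47:53.000", "2022-01-10T11:05:32.000"],
    "closed_date": ["2019-03-15T07:25:34.000", "2022-01-10T11:06:41.000"],
    "latitude": [None, "41.881691809"],
    "longitude": [None, "-87.730165409"],
})

# Timestamps arrive as ISO 8601 strings.
for col in ["created_date", "closed_date"]:
    sample[col] = pd.to_datetime(sample[col], errors="coerce")

# Coordinates arrive as numeric strings; missing values become NaN.
for col in ["latitude", "longitude"]:
    sample[col] = pd.to_numeric(sample[col], errors="coerce")

print(sample.dtypes)
```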

In [7]:
# parse dates (the API returns lowercase column names)
df['created_date'] = pd.to_datetime(df['created_date'])
df['year'] = df['created_date'].dt.year
df['date'] = df['created_date'].dt.date

In [None]:
df.head()

In [None]:
df[df['sr_short_code']=='SWSNOREM'].head()

<a name = "review"></a>
# 4. Explore Data

### Complaints by Status

In [None]:
df_by_status = df.pivot_table(index='status', columns='year', values='sr_number', aggfunc='size', fill_value=0)
df_by_status

### Complaints by Origin

In [None]:
df_by_origin = df.groupby('origin')['sr_number'].agg('count').reset_index()
df_by_origin
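
To rank origins from most to least common, as in the preliminary findings, `value_counts()` is one alternative to the `groupby` count because it sorts in descending order by default. The sketch below uses a handful of made-up rows purely to show the shape of the result; the `complaints` and `share` column names are illustrative.

```python
import pandas as pd

# Made-up rows, only to show the shape of the result.
sample = pd.DataFrame({
    "origin": ["Phone Call", "Internet", "Phone Call",
               "Mobile Device", "Phone Call", "Internet"],
})

# value_counts() sorts from most to least frequent.
origin_counts = (
    sample["origin"]
    .value_counts()
    .rename_axis("origin")
    .reset_index(name="complaints")
)

# Share of all complaints contributed by each origin.
origin_counts["share"] = origin_counts["complaints"] / origin_counts["complaints"].sum()
print(origin_counts)
```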

In [None]:
df_by_origin_by_year = df.pivot_table(index='origin', columns='year', values='sr_number', aggfunc='size', fill_value=0)
df_by_origin_by_year

### 311 Complaints by Year
Note that 2024 is only a partial year, and data are missing for 1/1/23 to 1/12/23.

In [None]:
#get latest date
max_date = df['date'].max()
max_date

In [None]:
df_by_year = df.groupby('year')['date'].agg(['count', 'min', 'max']).reset_index()
df_by_year

### Complaint Type Codes (SR_SHORT_CODE)

In [None]:
#connect short codes with descriptions
df_types = df.pivot_table(index=['sr_type','owner_department'], columns='sr_short_code', values='sr_number', aggfunc='size', fill_value=0)
df_types

In [None]:
# separate name so the per-year summary in df_by_year is not overwritten
df_type_by_year = df.pivot_table(index='sr_type', columns='year', values='sr_number', aggfunc='size', fill_value=0)
df_type_by_year