<h1>Snow Violations</h1>
9 November 2023

This analysis looks at Chicago 311 complaints filed as "Snow – Uncleared Sidewalk Complaint" (short code SWSNOREM).<br>
<br>
My analysis steps:
<ol>
<li><a href="#docs">Review API Documentation</a>
<li><a href="#import">Import Libraries</a>
<li><a href="#retrieve_data">Get Data</a>
<li><a href="#summarize">Summarize Data</a>
</ol>

<h3>Preliminary Findings</h3>
<ul>
    <li>4165 W BERTEAU AVE has the most complaints (49), filed between 11/13/19 and 2/18/21
    <li>The earliest 2019 complaint is dated 1/12/19 (a single earlier record is dated 12/30/18), and data are missing between 1/1/23 and 1/12/23
    <li>Complaints increased by 13% from 2020 to 2021 (the first two full years)
    <li>Complaints increased by about 12% from 2021 to 2022
    <li>Ward totals range from 1542 complaints in the 32nd Ward down to 88 in the lowest ward (the ward number must be read from the WARD column, not the row index; the 27th Ward has 741 complaints)
</ul>

<h3>Possible Next Steps</h3>
<ul>
    <li>TBD
</ul>

<h3>Questions</h3>
<ul>
    <li>Why does the 32nd Ward have so many 311 complaints?
        <li>TBD
</ul>

<a name = "docs"></a>
<h1>1. Review API Documentation</h1>

<h3>Socrata Portal Info</h3>
 <ul>
<li><b>API Docs:</b> <a href="https://dev.socrata.com/">https://dev.socrata.com/</a> (general reference for Socrata)<br>
    </ul>   

<h3>311 data</h3>
<ul>
    <li>Dataset: https://data.cityofchicago.org/Service-Requests/311-Service-Requests/v6vf-nfxy
    <li>API: https://data.cityofchicago.org/resource/v6vf-nfxy.json
</ul>

<a name = "import"></a>
<h1>2. Import Libraries</h1>

In [1]:
import pandas as pd
import requests
# import datetime as dt  # only needed if dates are manipulated after retrieving the API data

<a name = "retrieve_data"></a>
<h1>3. Get Data</h1>

In [2]:
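base_url = "https://data.cityofchicago.org/resource/v6vf-nfxy.json"
# SoQL query pieces: columns to return, row filter, and row cap
select = "SR_NUMBER, CREATED_DATE, STREET_ADDRESS, COMMUNITY_AREA, WARD"
where = "SR_SHORT_CODE='SWSNOREM'"  # uncleared-sidewalk snow complaints only
limit = 99999  # must exceed the number of matching rows, or the result is cut off at this cap

url = f"{base_url}?$SELECT={select}&$WHERE={where}&$LIMIT={limit}"
print(url)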

https://data.cityofchicago.org/resource/v6vf-nfxy.json?$SELECT=SR_NUMBER, CREATED_DATE, STREET_ADDRESS, COMMUNITY_AREA, WARD&$WHERE=SR_SHORT_CODE='SWSNOREM'&$LIMIT=99999
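The printed URL contains raw spaces and quotes, which are only encoded when the request is sent. As a minimal sketch, assuming the lowercase parameter names (`$select`, `$where`, `$limit`) shown in the Socrata documentation, the same query can be percent-encoded explicitly with the standard library. `build_soql_url` is a hypothetical helper name used here for illustration, not part of any library.

```python
from urllib.parse import quote, urlencode


def build_soql_url(base_url, select, where, limit):
    """Return base_url with percent-encoded SoQL parameters (hypothetical helper)."""
    params = {"$select": select, "$where": where, "$limit": limit}
    # Keep "$" and "," readable; encode spaces, quotes, and "=" inside values.
    query = urlencode(params, quote_via=quote, safe="$,")
    return f"{base_url}?{query}"


example_url = build_soql_url(
    "https://data.cityofchicago.org/resource/v6vf-nfxy.json",
    "SR_NUMBER,CREATED_DATE,STREET_ADDRESS,COMMUNITY_AREA,WARD",
    "SR_SHORT_CODE='SWSNOREM'",
    99999,
)
print(example_url)
```

Building the query from a dictionary also makes it harder to drop an `&` or a `$` when a filter is added later.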


In [3]:
response = requests.get(url, timeout=60)
response.raise_for_status()  # stop on an HTTP error instead of parsing an error body
data = response.json()
print(response)

<Response [200]>
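A single request with a large `$LIMIT` works while the result stays under the cap, but it would quietly drop rows if the dataset outgrew it. As a hedged sketch of an alternative, the loop below pages through results until a short page comes back. `fetch_all` and `fake_page` are hypothetical names, and the fake pager stands in for the API so the example runs offline.

```python
def fetch_all(fetch_page, page_size=1000):
    """Collect rows page by page until a short (or empty) page marks the end."""
    rows = []
    offset = 0
    while True:
        page = fetch_page(limit=page_size, offset=offset)
        rows.extend(page)
        if len(page) < page_size:
            return rows
        offset += page_size


# Stand-in for the API so the sketch runs offline: 2,500 made-up rows.
fake_rows = [{"SR_NUMBER": f"SR-{i}"} for i in range(2500)]


def fake_page(limit, offset):
    """Return one slice of the made-up rows, like one page of API results."""
    return fake_rows[offset:offset + limit]


all_rows = fetch_all(fake_page, page_size=1000)
print(len(all_rows))
```

Against the real endpoint, `fetch_page` would presumably wrap a `requests.get` call that passes `$limit` and `$offset`, plus an `$order` clause so that pages stay stable between requests. That wiring is an assumption based on the Socrata documentation and is not tested here.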


In [4]:
df_311 = pd.DataFrame(data)
df_311.head()

Unnamed: 0,SR_NUMBER,CREATED_DATE,STREET_ADDRESS,COMMUNITY_AREA,WARD
0,SR23-01894991,2023-11-01T09:19:22.000,4555 N SHERIDAN RD,3,46
1,SR23-01897633,2023-11-01T13:23:12.000,3469 N BROADWAY ST,6,44
2,SR23-01903733,2023-11-02T11:59:47.000,5237 W WARWICK AVE,15,30
3,SR19-01048063,2019-02-21T10:26:44.000,2548 W RASCHER AVE,4,40
4,SR19-00125910,2019-01-28T10:27:35.000,116 W Pershing RD,38,3


In [5]:
df_311['CREATED_DATE'] = pd.to_datetime(df_311['CREATED_DATE'])
df_311['year'] = df_311['CREATED_DATE'].dt.year
df_311['date'] = df_311['CREATED_DATE'].dt.date
df_311

Unnamed: 0,SR_NUMBER,CREATED_DATE,STREET_ADDRESS,COMMUNITY_AREA,WARD,year,date
0,SR23-01894991,2023-11-01 09:19:22,4555 N SHERIDAN RD,3,46,2023,2023-11-01
1,SR23-01897633,2023-11-01 13:23:12,3469 N BROADWAY ST,6,44,2023,2023-11-01
2,SR23-01903733,2023-11-02 11:59:47,5237 W WARWICK AVE,15,30,2023,2023-11-02
3,SR19-01048063,2019-02-21 10:26:44,2548 W RASCHER AVE,4,40,2019,2019-02-21
4,SR19-00125910,2019-01-28 10:27:35,116 W Pershing RD,38,3,2019,2019-01-28
...,...,...,...,...,...,...,...
21506,SR23-00304094,2023-02-26 16:55:36,1923 W ARGYLE ST,4,47,2023,2023-02-26
21507,SR23-00298636,2023-02-25 07:30:17,3701 W CULLOM AVE,16,33,2023,2023-02-25
21508,SR23-00303780,2023-02-26 15:20:57,2237 W GIDDINGS ST,4,47,2023,2023-02-26
21509,SR23-00298980,2023-02-25 09:03:05,441 W BELDEN AVE,7,43,2023,2023-02-25
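With a `date` column available, stretches with no complaints at all can be checked directly rather than spotted by eye. The sketch below is one possible approach, using only the standard library; `largest_gaps` is a hypothetical helper and the sample dates are illustrative, not pulled from the API.

```python
from datetime import date


def largest_gaps(dates, top=3):
    """Return the longest runs between consecutive distinct dates as (days, start, end)."""
    days = sorted(set(dates))
    gaps = [((later - earlier).days, earlier, later)
            for earlier, later in zip(days, days[1:])]
    return sorted(gaps, reverse=True)[:top]


# Illustrative dates only.
sample = [date(2022, 12, 28), date(2022, 12, 29), date(2023, 1, 13), date(2023, 1, 14)]
print(largest_gaps(sample, top=1))
```

In this notebook the call would be something like `largest_gaps(df_311['date'])`, assuming the column has no missing values. Snow complaints are seasonal, so long summer gaps are expected; only winter gaps point to possibly missing data.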


In [6]:
df_311.head()

Unnamed: 0,SR_NUMBER,CREATED_DATE,STREET_ADDRESS,COMMUNITY_AREA,WARD,year,date
0,SR23-01894991,2023-11-01 09:19:22,4555 N SHERIDAN RD,3,46,2023,2023-11-01
1,SR23-01897633,2023-11-01 13:23:12,3469 N BROADWAY ST,6,44,2023,2023-11-01
2,SR23-01903733,2023-11-02 11:59:47,5237 W WARWICK AVE,15,30,2023,2023-11-02
3,SR19-01048063,2019-02-21 10:26:44,2548 W RASCHER AVE,4,40,2019,2019-02-21
4,SR19-00125910,2019-01-28 10:27:35,116 W Pershing RD,38,3,2019,2019-01-28


<a name = "summarize"></a>
# 4. Summarize Data

### 311 Complaints by Year
Note that 2023 is not yet a full year, and data are missing for 1/1/23 to 1/12/23.

In [7]:
df_by_year = df_311.groupby('year')['date'].agg(['count', 'min', 'max']).reset_index()
df_by_year['pctChange'] = df_by_year['count'].pct_change() * 100  # change from the previous year, in percent
df_by_year

Unnamed: 0,year,count,min,max,pctChange
0,2018,1,2018-12-30,2018-12-30,
1,2019,1978,2019-01-12,2019-12-31,197700.0
2,2020,5389,2020-01-01,2020-12-31,172.446916
3,2021,6090,2021-01-01,2021-12-28,13.007979
4,2022,6807,2022-01-01,2022-12-29,11.773399
5,2023,1246,2023-01-13,2023-11-02,-81.695314
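The year-over-year figures can be re-derived by hand from the counts in the table, which also makes the direction of each comparison explicit (earlier year to later year). This is a small arithmetic check, not part of the original analysis.

```python
# Yearly complaint counts copied from the table above (full or near-full years only).
counts = {2019: 1978, 2020: 5389, 2021: 6090, 2022: 6807}


def pct_change(old, new):
    """Percent change going from old to new."""
    return (new - old) / old * 100


for year in (2021, 2022):
    change = pct_change(counts[year - 1], counts[year])
    print(f"{year - 1} to {year}: {change:.1f}%")
```

This prints 13.0% for 2020 to 2021 and 11.8% for 2021 to 2022, matching the `pctChange` column. The 2019 and 2023 rows are left out because neither covers a complete year.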


### 311 Complaints by Address
Look for repeat offenders: addresses with more than one 311 complaint.

In [8]:
df_by_address = df_311.groupby('STREET_ADDRESS').agg(
    count=('STREET_ADDRESS', 'size'),
    min_date=('date', 'min'),
    max_date=('date', 'max')
).reset_index()
df_repeat_offenders = df_by_address[df_by_address['count'] > 1].sort_values(by='count', ascending=False)
df_repeat_offenders

Unnamed: 0,STREET_ADDRESS,count,min_date,max_date
9223,4165 W BERTEAU AVE,49,2019-11-13,2021-02-18
4962,241 W SCOTT ST,38,2021-01-05,2023-01-29
7496,3330 N LAKE SHORE DR,25,2019-11-11,2023-01-31
5023,2424 N KEDZIE BLVD,24,2019-11-12,2022-01-11
12098,579 W HAWTHORNE PL,24,2019-11-13,2020-01-23
...,...,...,...,...
6197,2817 W LOGAN BLVD,2,2019-11-12,2020-02-26
6208,2820 W BERWYN AVE,2,2022-01-04,2022-04-28
6209,2820 W GLENLAKE AVE,2,2021-02-03,2022-01-03
6218,2822 W LYNDALE ST,2,2022-01-26,2022-02-03
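Grouping on the raw `STREET_ADDRESS` text is case-sensitive, and at least one address in the data (`116 W Pershing RD`) is mixed-case, so the same location could be split across several groups. A minimal sketch of normalizing addresses before counting is shown below; `normalize_address` is a hypothetical helper and the rows are illustrative.

```python
from collections import Counter


def normalize_address(address):
    """Upper-case and collapse whitespace so spelling variants group together."""
    return " ".join(address.upper().split())


# Illustrative rows only; the second and third differ just by case and spacing.
addresses = ["4165 W BERTEAU AVE", "116 W Pershing RD", "116 W PERSHING  RD"]
counts_by_address = Counter(normalize_address(a) for a in addresses)
print(counts_by_address.most_common())
```

In the notebook, a comparable step would be to upper-case and strip `df_311['STREET_ADDRESS']` with the pandas `.str` methods before the `groupby`. Whether this changes the repeat-offender ranking has not been checked here.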


### 311 Complaints by Ward

In [9]:
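# Count complaints per ward; the left-hand column of the output is the row index, not the ward number
df_by_ward = df_311.groupby('WARD')['SR_NUMBER'].agg(['count']).reset_index().sort_values(by='count', ascending=False)
df_by_ward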

Unnamed: 0,WARD,count
25,32,1542
41,47,1495
0,1,1219
38,44,971
37,43,894
34,40,835
11,2,771
40,46,767
19,27,741
45,50,657
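In the output above, the left-hand column is the row index left over from `reset_index()`, and it is easy to mistake for the ward number (row 25 is the 32nd Ward). The API also returns wards as text, so they group in string order. As a hedged sketch on made-up rows, converting `WARD` to a number and replacing the index with a 1-based rank removes the ambiguity.

```python
import pandas as pd

# Illustrative rows only; the real frame is df_311 from the cells above.
df = pd.DataFrame({
    "SR_NUMBER": ["a", "b", "c", "d", "e", "f"],
    "WARD": ["32", "32", "32", "1", "1", "27"],
})

by_ward = (
    df.assign(WARD=pd.to_numeric(df["WARD"]))  # "32" -> 32
      .groupby("WARD")["SR_NUMBER"]
      .count()
      .sort_values(ascending=False)
      .rename("count")
      .reset_index()
)
# Replace the leftover row index with a 1-based rank.
by_ward.index = by_ward.index + 1
by_ward.index.name = "rank"
print(by_ward)
```

With the rank as the index, the first row reads as "rank 1, ward 32", so the ward number can only come from the `WARD` column.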
