<h1>Snow Clearance Fines, 2019-2023</h1>
13 February 2024

This analysis looks at fines levied for uncleared sidewalks, based on data obtained through a FOIA request to the Department of Administrative Hearings (H064920-011124.xlsx). The dataset contains 3058 records dated from 1/1/2001 to 9/12/2023; filtering to those between 7/1/2019 and 6/30/2023 leaves 2560 records. Four of these could not be geocoded because the address was "unknown".<br>
<br>
<div style="color:red;">My analysis steps:</div>
<ol>
<li><a href="#read">Read Data</a>
    <li><a href="#summarize">Summarize</a>: by issuing department, by year, by community area
        <li><a href="#community">Community Summary</a>
</ol>

### Record Count
<ul>
    <li>2556 valid records
        <li>1912 dockets
            <li>1735 addresses
                <li>some addresses have been fined by multiple agencies
    </ul>

### Record Review
<ul>
    <li>some addresses have been fined by both CDOT and Streets & Sanitation
        <li>within a docket with multiple records, each record is identical except for the fine amount
    </ul>
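
The observation that records within a docket differ only in the fine amount can be checked directly. The following is a minimal sketch on toy data, not the check actually used for this review; the helper name `inconsistent_dockets` and the sample rows are hypothetical.

```python
import pandas as pd

# Toy records (hypothetical values) shaped like the cleaned dataframe
records = pd.DataFrame({
    "docket": ["A1", "A1", "A1", "B2", "B2"],
    "address": ["3100 S INDIANA"] * 3 + ["1 N STATE", "2 N STATE"],
    "dept": ["TRANPORT"] * 3 + ["STRTSAN"] * 2,
    "fine_amount": [0.0, 150.0, 500.0, 50.0, 50.0],
})

def inconsistent_dockets(df, key="docket", ignore=("fine_amount",)):
    """Return docket IDs whose records differ in any column other than `ignore`."""
    cols = [c for c in df.columns if c != key and c not in ignore]
    # Count distinct values per docket in every non-ignored column
    distinct = df.groupby(key)[cols].nunique(dropna=False)
    return distinct[(distinct > 1).any(axis=1)].index.tolist()

bad = inconsistent_dockets(records)
print(bad)  # docket B2 has two different addresses in this toy data
```

An empty list on the real data would support the claim; any docket returned would deserve a manual look.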

### Preliminary Findings
<ul>
    <li>Englewood, Garfield Ridge, and West Englewood are the three communities with the highest number of dockets per capita
    <li>17 of the 25 snow clearance dockets issued by police were in Englewood, West Englewood, Garfield Ridge, and Belmont Cragin
    </ul>

<a name="read"></a>
# 1. Read Data

In [22]:
import pandas as pd
import requests
import numpy as np
import altair as alt
#import datetime as dt #would only need this if I manipulated dates post-API data retrieval

In [23]:
df = pd.read_csv("../data/fines-geocoded-w-communities.csv")
df.head()

Unnamed: 0,field_1,Docket Number,Violation Date,Violation Address,Issuing Department Code,Imposed Fine Detailed,year,month,date,season,...,latlong,community,area,shape_area,perimeter,area_num_1,area_numbe,comarea_id,comarea,shape_len
0,416,20DT000917,2020/02/14,3100 S INDIANA,TRANPORT,0.0,2020,2,2020/02/14,2019-2020,...,"41.838241,-87.622033",DOUGLAS,0.0,46004620.0,0.0,35.0,35.0,0.0,0.0,31027.05451
1,417,20DT000917,2020/02/14,3100 S INDIANA,TRANPORT,150.0,2020,2,2020/02/14,2019-2020,...,"41.838241,-87.622033",DOUGLAS,0.0,46004620.0,0.0,35.0,35.0,0.0,0.0,31027.05451
2,418,20DT000917,2020/02/14,3100 S INDIANA,TRANPORT,500.0,2020,2,2020/02/14,2019-2020,...,"41.838241,-87.622033",DOUGLAS,0.0,46004620.0,0.0,35.0,35.0,0.0,0.0,31027.05451
3,757,21DT000478,2021/01/28,3317 S PRAIRIE,TRANPORT,110.0,2021,1,2021/01/28,2020-2021,...,"41.834323092878925,-87.62050356526285",DOUGLAS,0.0,46004620.0,0.0,35.0,35.0,0.0,0.0,31027.05451
4,773,21DT000493,2021/01/29,3658 S PRAIRIE,TRANPORT,500.0,2021,1,2021/01/29,2020-2021,...,"41.828145583678925,-87.62059581323",DOUGLAS,0.0,46004620.0,0.0,35.0,35.0,0.0,0.0,31027.05451


In [24]:
len(df)

2557

### parse lat and long

In [25]:
#parse lat and long
df['lat']=df['latlong'].str.split(',').str[0]
df['long']=df['latlong'].str.split(',').str[1]
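
The split above leaves `lat` and `long` as strings. If numeric coordinates are needed later (for mapping or range checks), a hedged alternative is to split once and convert. This is a sketch on a small sample frame, not a change to the cells in this notebook.

```python
import pandas as pd

# Two latlong strings taken from the head() output above
sample = pd.DataFrame({
    "latlong": ["41.838241,-87.622033", "41.834323092878925,-87.62050356526285"]
})

# Split once into two columns, then convert each to float
parts = sample["latlong"].str.split(",", expand=True)
sample["lat"] = pd.to_numeric(parts[0])
sample["long"] = pd.to_numeric(parts[1])

print(sample.dtypes)
```

Grouping on float coordinates behaves the same as grouping on the strings as long as each address has a single geocode, which is assumed here.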

### rename and reduce columns

In [26]:
df = df[['Docket Number','Cleaned Address','Issuing Department Code','Imposed Fine Detailed','date','season','community','lat','long']]

In [27]:
df=df.rename(columns={"Docket Number":"docket","Cleaned Address":"address","Issuing Department Code":"dept","Imposed Fine Detailed":"fine_amount","date":"violation_date"})

### review and clean records with no community assigned

In [28]:
df[df["community"].isna()]

Unnamed: 0,docket,address,dept,fine_amount,violation_date,season,community,lat,long
2551,19DS72153L,300 W WASHINGTON ST,STRTSAN,0.0,2019/11/12,2019-2020,,41.882868,-88.210529
2552,19DS72153L,300 W WASHINGTON ST,STRTSAN,150.0,2019/11/12,2019-2020,,41.882868,-88.210529
2553,19DS72153L,300 W WASHINGTON ST,STRTSAN,500.0,2019/11/12,2019-2020,,41.882868,-88.210529
2554,21CP002398,1850 W MARQUETTE ST,POLICE,0.0,2021/02/12,2020-2021,,42.3236639,-87.8381484
2555,21CP002398,1850 W MARQUETTE ST,POLICE,50.0,2021/02/12,2020-2021,,42.3236639,-87.8381484
2556,21DT002747,5450 S 47TH,TRANPORT,500.0,2021/01/27,2020-2021,,41.8654712,-87.7423005


One address, 5450 S 47TH (docket 21DT002747), is not a valid address. It was geocoded to 41.8654712,-87.7423005, at the edge of Lawndale near Roosevelt, which doesn't look right. If the intended address is 5450 W. 47th, it would be in suburban Stickney. So I'm choosing to drop this record.
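
Eyeballing a geocode like this can be complemented by a rough range check. The sketch below flags coordinates that fall outside an approximate bounding box for Chicago. The box limits are my own rough assumption, not values from the source data, and `outside_chicago` is a hypothetical helper. The sample rows come from the table above.

```python
import pandas as pd

# Approximate bounding box for the city of Chicago (assumed values)
CHI_LAT = (41.64, 42.03)
CHI_LONG = (-87.95, -87.52)

def outside_chicago(df, lat_col="lat", long_col="long"):
    """Return rows whose coordinates are missing or fall outside the assumed box."""
    lat = pd.to_numeric(df[lat_col], errors="coerce")
    lon = pd.to_numeric(df[long_col], errors="coerce")
    inside = lat.between(*CHI_LAT) & lon.between(*CHI_LONG)
    return df[~inside]

geocoded = pd.DataFrame({
    "address": ["300 W WASHINGTON ST", "1850 W MARQUETTE ST", "3100 S INDIANA"],
    "lat": ["41.882868", "42.3236639", "41.838241"],
    "long": ["-88.210529", "-87.8381484", "-87.622033"],
})

suspect = outside_chicago(geocoded)
print(suspect["address"].tolist())
```

A box check only catches points that land outside the city; a wrong point inside the city, like the Lawndale one, would still need a manual review.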

In [29]:
#drop record with invalid address
index_to_drop = df[df['address'] == '5450 S 47TH'].index
df = df.drop(index_to_drop)

In [30]:
# manually assign community areas for the other two addresses
df.loc[df['address'] == '300 W WASHINGTON ST', 'community'] = 'LOOP'
df.loc[df['address'] == '1850 W MARQUETTE ST', 'community'] = 'WEST ENGLEWOOD'

In [31]:
df.tail()

Unnamed: 0,docket,address,dept,fine_amount,violation_date,season,community,lat,long
2551,19DS72153L,300 W WASHINGTON ST,STRTSAN,0.0,2019/11/12,2019-2020,LOOP,41.882868,-88.210529
2552,19DS72153L,300 W WASHINGTON ST,STRTSAN,150.0,2019/11/12,2019-2020,LOOP,41.882868,-88.210529
2553,19DS72153L,300 W WASHINGTON ST,STRTSAN,500.0,2019/11/12,2019-2020,LOOP,41.882868,-88.210529
2554,21CP002398,1850 W MARQUETTE ST,POLICE,0.0,2021/02/12,2020-2021,WEST ENGLEWOOD,42.3236639,-87.8381484
2555,21CP002398,1850 W MARQUETTE ST,POLICE,50.0,2021/02/12,2020-2021,WEST ENGLEWOOD,42.3236639,-87.8381484


<a name="summarize"></a>
# 2. Summarize by Docket

In [32]:
# one row per docket: count its records and sum its fines
df_dockets=df.groupby(['docket','dept','address','lat','long','community','violation_date']).agg(
    n_records=('docket','count'),
    total_fine=('fine_amount', 'sum')
).reset_index()

In [33]:
df_dockets.head()

Unnamed: 0,docket,dept,address,lat,long,community,violation_date,n_records,total_fine
0,19DS68300L,STRTSAN,4710 S WESTERN AVE,41.807859,-87.68479703503766,BRIGHTON PARK,2019/11/13,1,150.0
1,19DS69216L,STRTSAN,1425 W MORSE AVE,42.0074513,-87.6668285,ROGERS PARK,2019/11/13,1,50.0
2,19DS70010L,STRTSAN,715 E 47TH ST,41.8093383,-87.6080127,GRAND BOULEVARD,2019/11/13,1,150.0
3,19DS72153L,STRTSAN,300 W WASHINGTON ST,41.882868,-88.210529,LOOP,2019/11/12,3,650.0
4,19DS72160L,STRTSAN,6929 N SHERIDAN RD,41.9598134,-87.654693,UPTOWN,2019/11/14,1,500.0


### by issuing department

In [34]:
# by issuing department
df_dockets.groupby('dept').agg(
    total_fine=('total_fine', 'sum'),
    n_dockets=('docket','count'),
    n_records=('n_records','sum')
).reset_index()

Unnamed: 0,dept,total_fine,n_dockets,n_records
0,BAFCONP,0.0,2,2
1,POLICE,1700.0,25,30
2,STRTSAN,176610.0,497,659
3,TRANPORT,406959.0,1388,1865


# 3. Summarize by Address

In [35]:
# create pivot table
df_addresses = df_dockets.pivot_table(index=['address', 'lat','long','community'],
                             columns='dept',
                             values='total_fine',
                             aggfunc=['count', 'sum'],
                             fill_value=0)

# Flatten the MultiIndex columns into names like count_STRTSAN and sum_TRANPORT
df_addresses.columns = ['_'.join(col).strip() for col in df_addresses.columns.values]

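# Earliest and latest violation date per address (computed for reference; not joined into df_addresses)
df_addresses_dates = df_dockets.groupby('address')['violation_date'].agg(['min', 'max']).reset_index()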

# Add total columns for each row
df_addresses['n_dockets'] = df_addresses.filter(like='count_').sum(axis=1)
df_addresses['total_fines'] = df_addresses.filter(like='sum_').sum(axis=1)

df_addresses.reset_index(inplace=True)
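
To make the pivot step easier to follow, here is a minimal sketch of the same pattern on three toy dockets (hypothetical addresses and amounts). It shows how `aggfunc=['count', 'sum']` produces two-level columns and how flattening yields the `count_*` and `sum_*` names used above.

```python
import pandas as pd

# Toy docket-level rows (hypothetical)
dockets = pd.DataFrame({
    "address": ["1 A ST", "1 A ST", "2 B AVE"],
    "dept": ["STRTSAN", "TRANPORT", "TRANPORT"],
    "total_fine": [150.0, 500.0, 0.0],
})

# One row per address, one column per (aggregation, department) pair
wide = dockets.pivot_table(index="address", columns="dept", values="total_fine",
                           aggfunc=["count", "sum"], fill_value=0)

# ('count', 'STRTSAN') -> 'count_STRTSAN', and so on
wide.columns = ["_".join(col) for col in wide.columns]

# Row totals across departments
wide["n_dockets"] = wide.filter(like="count_").sum(axis=1)
wide["total_fines"] = wide.filter(like="sum_").sum(axis=1)
wide = wide.reset_index()
print(wide)
```

Note that a docket with a $0 fine still counts toward `n_dockets`, which is why an address can show one docket and no fines.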

In [36]:
df_addresses.head()

Unnamed: 0,address,lat,long,community,count_BAFCONP,count_POLICE,count_STRTSAN,count_TRANPORT,sum_BAFCONP,sum_POLICE,sum_STRTSAN,sum_TRANPORT,n_dockets,total_fines
0,10 N KILBOURN AVE,41.88096235,-87.73835576302265,WEST GARFIELD PARK,0,0,1,0,0.0,0.0,150.0,0.0,1,150.0
1,100 E CHESTNUT ST,41.8985878,-87.62589246260751,NEAR NORTH SIDE,0,0,0,1,0.0,0.0,0.0,550.0,1,550.0
2,100 N KEDZIE,41.9467224,-87.7078351,IRVING PARK,0,0,0,1,0.0,0.0,0.0,0.0,1,0.0
3,100 N KEDZIE AVE,41.88315915,-87.70652906434054,EAST GARFIELD PARK,0,0,3,0,0.0,0.0,150.0,0.0,3,150.0
4,100 W GRAND,41.7655813,-87.6216949,GREATER GRAND CROSSING,0,0,0,1,0.0,0.0,0.0,500.0,1,500.0


In [37]:
def get_department(BAFCONP, POLICE, STRTSAN, TRANPORT):
    conditions = [
        (TRANPORT >= 1) & (STRTSAN >= 1),
        (TRANPORT >= 1),
        (STRTSAN >= 1),
        (POLICE >= 1),
        (BAFCONP >= 1)
    ]
    choices = ['cdot_and_streets', 'cdot', 'streets', 'police', 'bafconp']
    return np.select(conditions, choices, default='unknown')

In [38]:
df_addresses['dept']= get_department(df_addresses['count_BAFCONP'],
                                     df_addresses['count_POLICE'],
                                     df_addresses['count_STRTSAN'],
                                     df_addresses['count_TRANPORT'])
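
`np.select` returns the choice for the first condition that is true, so the order of the list in `get_department` acts as a priority. The sketch below uses toy counts (hypothetical) to show the consequence: an address fined by both CDOT and police is labeled `cdot`, and the police docket does not show up in the label.

```python
import numpy as np
import pandas as pd

# Toy per-address docket counts (hypothetical)
counts = pd.DataFrame({
    "count_BAFCONP": [0, 0, 0, 1, 0],
    "count_POLICE": [0, 1, 0, 0, 0],
    "count_STRTSAN": [1, 0, 2, 0, 0],
    "count_TRANPORT": [1, 3, 0, 0, 0],
})

# Same conditions and order as get_department above
conditions = [
    (counts["count_TRANPORT"] >= 1) & (counts["count_STRTSAN"] >= 1),
    counts["count_TRANPORT"] >= 1,
    counts["count_STRTSAN"] >= 1,
    counts["count_POLICE"] >= 1,
    counts["count_BAFCONP"] >= 1,
]
choices = ["cdot_and_streets", "cdot", "streets", "police", "bafconp"]

# The first matching condition wins; rows matching none get the default
labels = np.select(conditions, choices, default="unknown")
print(list(labels))
```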

In [39]:
df_addresses.head()

Unnamed: 0,address,lat,long,community,count_BAFCONP,count_POLICE,count_STRTSAN,count_TRANPORT,sum_BAFCONP,sum_POLICE,sum_STRTSAN,sum_TRANPORT,n_dockets,total_fines,dept
0,10 N KILBOURN AVE,41.88096235,-87.73835576302265,WEST GARFIELD PARK,0,0,1,0,0.0,0.0,150.0,0.0,1,150.0,streets
1,100 E CHESTNUT ST,41.8985878,-87.62589246260751,NEAR NORTH SIDE,0,0,0,1,0.0,0.0,0.0,550.0,1,550.0,cdot
2,100 N KEDZIE,41.9467224,-87.7078351,IRVING PARK,0,0,0,1,0.0,0.0,0.0,0.0,1,0.0,cdot
3,100 N KEDZIE AVE,41.88315915,-87.70652906434054,EAST GARFIELD PARK,0,0,3,0,0.0,0.0,150.0,0.0,3,150.0,streets
4,100 W GRAND,41.7655813,-87.6216949,GREATER GRAND CROSSING,0,0,0,1,0.0,0.0,0.0,500.0,1,500.0,cdot


<a name="community"></a>
# 4. Dockets by Community

In [40]:
df_dockets

Unnamed: 0,docket,dept,address,lat,long,community,violation_date,n_records,total_fine
0,19DS68300L,STRTSAN,4710 S WESTERN AVE,41.807859,-87.68479703503766,BRIGHTON PARK,2019/11/13,1,150.0
1,19DS69216L,STRTSAN,1425 W MORSE AVE,42.0074513,-87.6668285,ROGERS PARK,2019/11/13,1,50.0
2,19DS70010L,STRTSAN,715 E 47TH ST,41.8093383,-87.6080127,GRAND BOULEVARD,2019/11/13,1,150.0
3,19DS72153L,STRTSAN,300 W WASHINGTON ST,41.882868,-88.210529,LOOP,2019/11/12,3,650.0
4,19DS72160L,STRTSAN,6929 N SHERIDAN RD,41.9598134,-87.654693,UPTOWN,2019/11/14,1,500.0
...,...,...,...,...,...,...,...,...,...
1907,23DT000667,TRANPORT,5825 S PULASKI RD,41.786671111642875,-87.72287320717435,WEST ELSDON,2023/01/31,1,150.0
1908,23DT000668,TRANPORT,3959 W 58TH ST,41.78769055748644,-87.72197548711587,WEST ELSDON,2023/01/31,1,150.0
1909,23DT000669,TRANPORT,4122 W 63RD ST,41.77880795,-87.72624239552133,WEST LAWN,2023/02/01,1,500.0
1910,23DT000940,TRANPORT,6501 W ARCHER,41.7920606,-87.78442464913152,GARFIELD RIDGE,2023/02/01,1,150.0


In [68]:
df_community = df_dockets.pivot_table(index='community',
                             columns='dept',
                             values='total_fine',
                             aggfunc=['count'],
                             fill_value=0).reset_index()
# Flatten the MultiIndex in columns
df_community.columns = ['_'.join(col).strip() for col in df_community.columns.values]

df_community = df_community.rename(columns={'community_':'community'})

# Renumber the row index (already reset above, so this is only a safeguard)
df_community.reset_index(drop=True, inplace=True)

df_community['n_dockets']=df_community['count_BAFCONP']+df_community['count_POLICE']+df_community['count_STRTSAN']+df_community['count_TRANPORT']

df_community

Unnamed: 0,community,count_BAFCONP,count_POLICE,count_STRTSAN,count_TRANPORT,n_dockets
0,ALBANY PARK,0,0,0,15,15
1,ARCHER HEIGHTS,0,0,0,12,12
2,ARMOUR SQUARE,0,0,0,26,26
3,ASHBURN,0,0,0,2,2
4,AUBURN GRESHAM,0,0,3,16,19
...,...,...,...,...,...,...
66,WEST LAWN,0,0,2,15,17
67,WEST PULLMAN,0,1,1,3,5
68,WEST RIDGE,0,0,7,17,24
69,WEST TOWN,0,0,1,96,97


### read community population

In [69]:
# retrieved on 1/11/24, but 2020 Census Population figures should be static
df_population = pd.read_csv("../data/population_cmap_2022.csv")

# simplify dataframe to get only essentials
df_population = df_population[['GEOID','GEOG','2020_POP']]
df_population = df_population.rename(columns={'GEOG':'COMMUNITY_NAME'})
df_population['COMMUNITY_CAPS']=df_population['COMMUNITY_NAME'].str.upper()
df_population.head()

Unnamed: 0,GEOID,COMMUNITY_NAME,2020_POP,COMMUNITY_CAPS
0,14,Albany Park,48396,ALBANY PARK
1,57,Archer Heights,14196,ARCHER HEIGHTS
2,34,Armour Square,13890,ARMOUR SQUARE
3,70,Ashburn,41098,ASHBURN
4,71,Auburn Gresham,44878,AUBURN GRESHAM


### merge in community population data

In [74]:
df_community_summary = pd.merge(df_community,df_population,left_on='community',right_on='COMMUNITY_CAPS',how='right')
df_community_summary.head()

Unnamed: 0,community,count_BAFCONP,count_POLICE,count_STRTSAN,count_TRANPORT,n_dockets,GEOID,COMMUNITY_NAME,2020_POP,COMMUNITY_CAPS
0,ALBANY PARK,0.0,0.0,0.0,15.0,15.0,14,Albany Park,48396,ALBANY PARK
1,ARCHER HEIGHTS,0.0,0.0,0.0,12.0,12.0,57,Archer Heights,14196,ARCHER HEIGHTS
2,ARMOUR SQUARE,0.0,0.0,0.0,26.0,26.0,34,Armour Square,13890,ARMOUR SQUARE
3,ASHBURN,0.0,0.0,0.0,2.0,2.0,70,Ashburn,41098,ASHBURN
4,AUBURN GRESHAM,0.0,0.0,3.0,16.0,19.0,71,Auburn Gresham,44878,AUBURN GRESHAM
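
Because the merge matches on community names, a spelling difference between the two files would silently drop dockets (with `how='right'`) or leave a community with empty counts. One way to check, sketched here on toy frames, is an outer merge with `indicator=True`. The misspelled name `ARCHER HTS` and the counts are hypothetical; the populations come from the table above.

```python
import pandas as pd

# Toy inputs: one deliberately mismatched community name
community_counts = pd.DataFrame({
    "community": ["ALBANY PARK", "ARCHER HTS"],
    "n_dockets": [15, 12],
})
population = pd.DataFrame({
    "COMMUNITY_CAPS": ["ALBANY PARK", "ARCHER HEIGHTS", "ASHBURN"],
    "2020_POP": [48396, 14196, 41098],
})

# The outer merge keeps unmatched rows from both sides; _merge records the source
check = community_counts.merge(population, left_on="community",
                               right_on="COMMUNITY_CAPS", how="outer",
                               indicator=True)

left_only = check.loc[check["_merge"] == "left_only", "community"].tolist()
right_only = check.loc[check["_merge"] == "right_only", "COMMUNITY_CAPS"].tolist()
print("dockets without population:", left_only)
print("population without dockets:", right_only)
```

Names that appear only on the population side may simply be communities with no dockets, in which case filling their counts with 0 before computing rates is a reasonable option.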


In [81]:
# dockets per 10,000 residents per year, averaged over the 4 years of data
df_community_summary['dp10k'] = \
(10000/4)*df_community_summary['n_dockets']/df_community_summary['2020_POP']
df_community_summary['streets_p10k'] = \
(10000/4)*df_community_summary['count_STRTSAN']/df_community_summary['2020_POP']
df_community_summary['cdot_p10k'] = \
(10000/4)*df_community_summary['count_TRANPORT']/df_community_summary['2020_POP']
df_community_summary['police_p10k'] = \
(10000/4)*df_community_summary['count_POLICE']/df_community_summary['2020_POP']
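
As a worked check of the rate formula, the sketch below wraps the same arithmetic in a small function (`rate_per_10k_per_year` is a hypothetical helper name) and applies it to the Englewood figures shown in the sorted tables: 69 dockets, 58 of them from Streets and Sanitation, and a 2020 population of 24,369.

```python
def rate_per_10k_per_year(count, population, years=4):
    """Dockets per 10,000 residents per year, averaged over `years` years."""
    return (10000 / years) * count / population

# Englewood: all dockets, then Streets and Sanitation dockets only
englewood_all = rate_per_10k_per_year(69, 24369)
englewood_streets = rate_per_10k_per_year(58, 24369)

print(round(englewood_all, 6), round(englewood_streets, 6))
```

The function also works on whole pandas columns, so the four repeated assignments could call it with `df_community_summary['n_dockets']` and so on, if that reads better.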

### sort overall

In [79]:
df_community_summary.sort_values(by='dp10k',ascending = False).head()

Unnamed: 0,community,count_BAFCONP,count_POLICE,count_STRTSAN,count_TRANPORT,n_dockets,GEOID,COMMUNITY_NAME,2020_POP,COMMUNITY_CAPS,dp10k,streets_p10k
23,ENGLEWOOD,0.0,4.0,58.0,7.0,69.0,68,Englewood,24369,ENGLEWOOD,7.078666,5.950183
27,GARFIELD RIDGE,0.0,3.0,2.0,86.0,91.0,56,Garfield Ridge,35439,GARFIELD RIDGE,6.419481,0.141088
70,WEST ENGLEWOOD,0.0,6.0,60.0,9.0,75.0,67,West Englewood,29647,WEST ENGLEWOOD,6.324417,5.059534
28,GRAND BOULEVARD,0.0,0.0,5.0,47.0,52.0,38,Grand Boulevard,24589,GRAND BOULEVARD,5.286917,0.508357
54,OAKLAND,0.0,0.0,1.0,12.0,13.0,36,Oakland,6799,OAKLAND,4.780115,0.367701


In [82]:
df_community_summary.sort_values(by='streets_p10k',ascending = False).head()

Unnamed: 0,community,count_BAFCONP,count_POLICE,count_STRTSAN,count_TRANPORT,n_dockets,GEOID,COMMUNITY_NAME,2020_POP,COMMUNITY_CAPS,dp10k,streets_p10k,cdot_p10k,police_p10k
23,ENGLEWOOD,0.0,4.0,58.0,7.0,69.0,68,Englewood,24369,ENGLEWOOD,7.078666,5.950183,0.718125,0.410357
70,WEST ENGLEWOOD,0.0,6.0,60.0,9.0,75.0,67,West Englewood,29647,WEST ENGLEWOOD,6.324417,5.059534,0.75893,0.505953
11,BRIGHTON PARK,0.0,0.0,70.0,11.0,81.0,58,Brighton Park,45053,BRIGHTON PARK,4.494706,3.884314,0.610392,0.0
49,NEW CITY,0.0,0.0,62.0,7.0,69.0,61,New City,43628,NEW CITY,3.953883,3.552764,0.401119,0.0
29,GREATER GRAND CROSSING,0.0,1.0,34.0,1.0,36.0,69,Greater Grand Crossing,31471,GREATER GRAND CROSSING,2.859776,2.700899,0.079438,0.079438


In [83]:
df_community_summary.sort_values(by='cdot_p10k',ascending = False).head()

Unnamed: 0,community,count_BAFCONP,count_POLICE,count_STRTSAN,count_TRANPORT,n_dockets,GEOID,COMMUNITY_NAME,2020_POP,COMMUNITY_CAPS,dp10k,streets_p10k,cdot_p10k,police_p10k
27,GARFIELD RIDGE,0.0,3.0,2.0,86.0,91.0,56,Garfield Ridge,35439,GARFIELD RIDGE,6.419481,0.141088,6.066763,0.211631
28,GRAND BOULEVARD,0.0,0.0,5.0,47.0,52.0,38,Grand Boulevard,24589,GRAND BOULEVARD,5.286917,0.508357,4.77856,0.0
2,ARMOUR SQUARE,0.0,0.0,0.0,26.0,26.0,34,Armour Square,13890,ARMOUR SQUARE,4.679626,0.0,4.679626,0.0
38,LINCOLN PARK,0.0,0.0,2.0,126.0,128.0,7,Lincoln Park,70492,LINCOLN PARK,4.539522,0.07093,4.468592,0.0
54,OAKLAND,0.0,0.0,1.0,12.0,13.0,36,Oakland,6799,OAKLAND,4.780115,0.367701,4.412414,0.0


# 5. Export for Analysis

In [19]:
df_addresses.to_csv("../results/fines_by_address.csv")
df_dockets.to_csv("../results/fines_by_docket.csv")