<h1>Snow Clearance Fines, 2019-2023</h1>
31 January 2024
updated 3/6/24

This analysis examines administrative hearing dockets for uncleared sidewalks, using data obtained through a FOIA request to the Department of Administrative Hearings (H064920-011124.xlsx).

Calculations of total fines are commented out for now. I rebuilt the data pipeline in March 2024 to support part II of the investigation, and the fine totals still need to be revised to work with the new pipeline.

<br>
My analysis steps:
<ol>
    <li><a href="#read">Read Data</a></li>
    <li><a href="#summarize">Summarize Citywide</a> - by season, by issuing department, by address</li>
    <li><a href="#community">Summarize by Community</a></li>
</ol>

### Preliminary Findings
<ul>
    <li>Englewood, West Englewood, and Garfield Ridge are the three communities with the highest number of dockets per capita (roughly 6 to 7 dockets per 10,000 residents per season)</li>
    <li>17 of the 25 snow-clearance dockets issued by police were in Englewood, West Englewood, Garfield Ridge, and Belmont Cragin</li>
    </ul>

<a name="read"></a>
# 1. Read Data

In [1]:
import pandas as pd
# import requests
import numpy as np
import altair as alt
#import datetime as dt #would only need this if I manipulated dates post-API data retrieval

In [2]:
df = pd.read_csv("../../data/05-finalized/dockets-summary.csv")
df.rename(columns={'community':'community_caps'},inplace=True)
df.head()

Unnamed: 0,docket,dept,violation_address,lat,long,community_caps,ward_1523,violation_date,n_records
0,19DS68300L,STRTSAN,4710 S WESTERN AVE,41.807859,-87.68479703503766,BRIGHTON PARK,15,2019-11-13,2
1,19DS69216L,STRTSAN,1425 W MORSE AVE,42.0074513,-87.6668285,ROGERS PARK,49,2019-11-13,1
2,19DS70010L,STRTSAN,715 E 47TH ST,41.8093383,-87.6080127,GRAND BOULEVARD,4,2019-11-13,1
3,19DS72153L,STRTSAN,300 W WASHINGTON ST,41.8818694,-87.7401431,WEST GARFIELD PARK,28,2019-11-12,5
4,19DS72160L,STRTSAN,6929 N SHERIDAN RD,41.9598134,-87.654693,UPTOWN,46,2019-11-14,1


In [3]:
len(df)

1918

<a name="summarize"></a>
# 2. Summarize Citywide

### by season

In [4]:
def get_season(month, year):
    """Return the winter season a date falls in, formatted as yyyy-yyyy.

    month: numeric month of the year (1-12)
    year: 4-digit year (int)

    A season runs from July through June, e.g. 2022-2023 means
    winter 2022-2023, or July 2022 to June 2023.
    """
    if month >= 7:
        return f"{year}-{year + 1}"
    return f"{year - 1}-{year}"

In [5]:
# format and parse dates
df['violation_date'] = pd.to_datetime(df['violation_date'])
df['year'] = df['violation_date'].dt.year
df['month'] = df['violation_date'].dt.month
df['date'] = df['violation_date'].dt.date
df['season'] = df.apply(lambda row: get_season(row['month'], row['year']), axis=1)
df.head()

Unnamed: 0,docket,dept,violation_address,lat,long,community_caps,ward_1523,violation_date,n_records,year,month,date,season
0,19DS68300L,STRTSAN,4710 S WESTERN AVE,41.807859,-87.68479703503766,BRIGHTON PARK,15,2019-11-13,2,2019,11,2019-11-13,2019-2020
1,19DS69216L,STRTSAN,1425 W MORSE AVE,42.0074513,-87.6668285,ROGERS PARK,49,2019-11-13,1,2019,11,2019-11-13,2019-2020
2,19DS70010L,STRTSAN,715 E 47TH ST,41.8093383,-87.6080127,GRAND BOULEVARD,4,2019-11-13,1,2019,11,2019-11-13,2019-2020
3,19DS72153L,STRTSAN,300 W WASHINGTON ST,41.8818694,-87.7401431,WEST GARFIELD PARK,28,2019-11-12,5,2019,11,2019-11-12,2019-2020
4,19DS72160L,STRTSAN,6929 N SHERIDAN RD,41.9598134,-87.654693,UPTOWN,46,2019-11-14,1,2019,11,2019-11-14,2019-2020
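
The row-wise `apply` above works well at this data size. As a minimal sketch of an equivalent vectorized approach, the season start year can be derived directly from the date column. The dates below are hypothetical and chosen only to straddle the July cutoff; the names `dates`, `start_year`, and `seasons` are illustrative and not part of the notebook.

```python
import pandas as pd

# Hypothetical dates (not from the FOIA data) on both sides of the July cutoff
dates = pd.Series(pd.to_datetime(["2019-11-13", "2020-02-01", "2020-06-30", "2020-07-01"]))

# January-June belong to the season that started the previous July,
# so subtract 1 from the calendar year for those months.
start_year = dates.dt.year - (dates.dt.month < 7).astype(int)

# Build the yyyy-yyyy label used elsewhere in this notebook
seasons = start_year.astype(str) + "-" + (start_year + 1).astype(str)
print(seasons.tolist())
```

Under these assumptions the result should agree with `get_season` for every row, which makes it easy to cross-check one against the other.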


In [6]:
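# by season
df.groupby('season').agg(
#     sum_fine_amt=('Imposed Fine Detailed', 'sum'),
#     max_fine=('Imposed Fine Detailed', 'max'),
    n_dockets=('docket','nunique'),
    # counts rows; with one row per docket this matches n_dockets
    # (it is not the sum of the n_records column)
    n_records=('docket','count')
).reset_index()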

Unnamed: 0,season,n_dockets,n_records
0,2019-2020,357,357
1,2020-2021,764,764
2,2021-2022,700,700
3,2022-2023,97,97
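
In the table above, `n_dockets` and `n_records` are identical because counting `docket` values counts rows, and the summary file appears to hold one row per docket. If the goal were instead to total the underlying records per season, one option is to sum the `n_records` column. This is a sketch on made-up rows, and the intent behind the original aggregation is an assumption on my part; `toy`, `n_rows`, and `total_records` are hypothetical names.

```python
import pandas as pd

# Hypothetical rows shaped like dockets-summary.csv: one row per docket
toy = pd.DataFrame({
    "season": ["2019-2020", "2019-2020", "2020-2021"],
    "docket": ["A1", "A2", "B1"],
    "n_records": [2, 5, 1],
})

summary = toy.groupby("season").agg(
    n_dockets=("docket", "nunique"),     # distinct dockets
    n_rows=("docket", "count"),          # rows, equal to n_dockets here
    total_records=("n_records", "sum"),  # underlying records across dockets
).reset_index()
print(summary)
```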


### by issuing department

In [7]:
# by issuing department
df.groupby('dept').agg(
#     sum_fine_amt=('Imposed Fine Detailed', 'sum'),
    n_dockets=('docket','nunique'),
    n_records=('docket','count')
).reset_index()

Unnamed: 0,dept,n_dockets,n_records
0,BAFCONP,2,2
1,POLICE,25,25
2,STRTSAN,497,497
3,TRANPORT,1393,1393
4,unknown,1,1


### by address

In [8]:
df_by_address = df.groupby(['violation_address']).agg(
#     sum_fine_amt=('Imposed Fine Detailed', 'sum'),
    n_dockets=('docket','nunique'),
    n_records=('docket','count')
).reset_index()
df_by_address[df_by_address['n_dockets']>=3].sort_values("n_dockets",ascending=False).head()

Unnamed: 0,violation_address,n_dockets,n_records
717,3110 W 61ST ST,6,6
1,100 UNKNOWN,4,4
1427,627 W RANDOLPH ST,4,4
1058,4710 S WESTERN AVE,4,4
1722,932 W 59TH ST,4,4


### distribution of fines by year

In [9]:
# bins = [0, 1, 150, 500, 501, 5000]
# # with right=False each bin is [lo, hi), so the labels name half-open ranges
# labels = ['0', '1-149', '150-499', '500', '501-4999']

# df['Binned Fine'] = pd.cut(df['Imposed Fine Detailed'], bins=bins, labels=labels, right=False, include_lowest=True)

# pivot_table = df.pivot_table(
#     values='Imposed Fine Detailed',  # Column whose non-null entries are counted
#     index='Binned Fine',  # Use the binned column as the new index
#     columns='season',  # Using 'season' for the columns
#     aggfunc='count',  # Count the fines in each bin (not a sum of amounts)
#     fill_value=0  # Replace NaN with 0
# )

# pivot_table

In [10]:
# pivot_table_long = pivot_table.reset_index().melt(id_vars='Binned Fine', var_name='Season', value_name='Count')

# # Create the histogram (bar chart) with Altair
# chart = alt.Chart(pivot_table_long).mark_bar().encode(
#     x=alt.X('Binned Fine:N', title='Fine Amount Bins', sort=labels),  # Ensure custom order
#     y=alt.Y('sum(Count):Q', title='Count'),
#     color='Season:N',
#     column='Season:N'  # Separate charts for each season
# ).properties(
#     width=220,
#     height=200
# ).resolve_scale(
#     y='independent'  # This allows each chart to have a separate Y-axis scale
# )

# chart.display()
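
Before the fine distribution is re-enabled, it may help to confirm how `pd.cut` treats the bin edges. With `right=False`, each bin includes its lower edge and excludes its upper edge, so a fine of exactly 150 lands in the third bin and a fine of exactly 5000 falls outside every bin. The sketch below uses hypothetical fine amounts, not values from the FOIA data, and the labels are written to match the half-open intervals.

```python
import pandas as pd

# Hypothetical fine amounts placed on and around each bin edge
fines = pd.Series([0, 1, 149, 150, 499, 500, 501, 4999])

bins = [0, 1, 150, 500, 501, 5000]
labels = ["0", "1-149", "150-499", "500", "501-4999"]

# right=False makes every bin closed on the left and open on the right: [lo, hi)
binned = pd.cut(fines, bins=bins, labels=labels, right=False)
print(pd.DataFrame({"fine": fines, "bin": binned}))
```

If the intended grouping is instead "up to and including 150", the bin edges rather than the labels would need to change.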

<a name="community"></a>
# 3. Summarize by Community

### by community

In [11]:
df_community = df.groupby('community_caps').agg(
#     sum_fine_amt=('Imposed Fine Detailed', 'sum'),
    n_dockets=('docket','nunique'),
    n_records=('docket','count')
).reset_index()
df_community.head()

Unnamed: 0,community_caps,n_dockets,n_records
0,ALBANY PARK,15,15
1,ARCHER HEIGHTS,14,14
2,ARMOUR SQUARE,26,26
3,ASHBURN,2,2
4,AUBURN GRESHAM,19,19


In [12]:
df_community.dtypes

community_caps    object
n_dockets          int64
n_records          int64
dtype: object

### read community population

In [13]:
# retrieved on 1/11/24, but 2020 Census Population figures should be static
df_population = pd.read_csv("../../data/05-finalized/census-by-community.csv")

In [14]:
# simplify dataframe to get only essentials
df_population = df_population[['community_caps','2020_pop','community_name']]
df_population.head()

Unnamed: 0,community_caps,2020_pop,community_name
0,ALBANY PARK,48396,Albany Park
1,ARCHER HEIGHTS,14196,Archer Heights
2,ARMOUR SQUARE,13890,Armour Square
3,ASHBURN,41098,Ashburn
4,AUBURN GRESHAM,44878,Auburn Gresham


In [15]:
df_population.dtypes

community_caps    object
2020_pop           int64
community_name    object
dtype: object

### merge in community population data

In [16]:
df_community_summary = pd.merge(df_community,df_population,on='community_caps')
df_community_summary.head()

Unnamed: 0,community_caps,n_dockets,n_records,2020_pop,community_name
0,ALBANY PARK,15,15,48396,Albany Park
1,ARCHER HEIGHTS,14,14,14196,Archer Heights
2,ARMOUR SQUARE,26,26,13890,Armour Square
3,ASHBURN,2,2,41098,Ashburn
4,AUBURN GRESHAM,19,19,44878,Auburn Gresham
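
`pd.merge` defaults to an inner join, so any community whose name does not match exactly between the two files is dropped without warning. A minimal sketch of one way to surface those rows uses the `how="outer"` and `indicator=True` options. The community names and counts below are placeholders, not real data, and `dockets`, `population`, and `unmatched` are hypothetical names.

```python
import pandas as pd

# Placeholder stand-ins for df_community and df_population
dockets = pd.DataFrame({"community_caps": ["ALPHA", "BETA"], "n_dockets": [15, 3]})
population = pd.DataFrame({"community_caps": ["ALPHA", "GAMMA"], "2020_pop": [10000, 20000]})

# An outer join keeps every row; the _merge column records where each came from.
# validate raises an error if a community name repeats on either side.
merged = pd.merge(
    dockets, population,
    on="community_caps", how="outer",
    indicator=True, validate="one_to_one",
)

# Rows that are not "both" are names present in only one of the two files
unmatched = merged[merged["_merge"] != "both"]
print(unmatched[["community_caps", "_merge"]])
```

A community that appears only in the population file may simply have no dockets, while one that appears only in the docket file usually points to a spelling mismatch.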


In [17]:
# dockets per 10,000 residents per season, averaged over the 4 seasons (2019-2020 to 2022-2023)
df_community_summary['dockets per 10k'] = \
(10000/4)*df_community_summary['n_dockets']/df_community_summary['2020_pop']
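
As a worked check of the rate formula, the same arithmetic can be applied to a single community. The sketch below uses Englewood's docket count and 2020 population as they appear in this notebook's output; the function name `dockets_per_10k_per_season` is hypothetical and only wraps the expression above.

```python
def dockets_per_10k_per_season(n_dockets: int, population: int, n_seasons: int = 4) -> float:
    """Dockets per 10,000 residents, averaged over n_seasons seasons."""
    return (10_000 / n_seasons) * n_dockets / population


# Englewood: 69 dockets over four seasons, 2020 population 24,369
englewood_rate = dockets_per_10k_per_season(69, 24369)
print(englewood_rate)
```

The result is about 7.08, so Englewood averaged roughly 7 dockets per 10,000 residents in each of the four seasons.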

In [18]:
df_community_summary

Unnamed: 0,community_caps,n_dockets,n_records,2020_pop,community_name,dockets per 10k
0,ALBANY PARK,15,15,48396,Albany Park,0.774857
1,ARCHER HEIGHTS,14,14,14196,Archer Heights,2.465483
2,ARMOUR SQUARE,26,26,13890,Armour Square,4.679626
3,ASHBURN,2,2,41098,Ashburn,0.121660
4,AUBURN GRESHAM,19,19,44878,Auburn Gresham,1.058425
...,...,...,...,...,...,...
66,WEST LAWN,17,17,33662,West Lawn,1.262551
67,WEST PULLMAN,5,5,26104,West Pullman,0.478854
68,WEST RIDGE,24,24,77122,West Ridge,0.777988
69,WEST TOWN,97,97,87781,West Town,2.762557


In [19]:
df_community_summary.sort_values('dockets per 10k',ascending=False).head(10)

Unnamed: 0,community_caps,n_dockets,n_records,2020_pop,community_name,dockets per 10k
20,ENGLEWOOD,69,69,24369,Englewood,7.078666
64,WEST ENGLEWOOD,75,75,29647,West Englewood,6.324417
24,GARFIELD RIDGE,89,89,35439,Garfield Ridge,6.278394
25,GRAND BOULEVARD,52,52,24589,Grand Boulevard,5.286917
51,OAKLAND,13,13,6799,Oakland,4.780115
17,EAST GARFIELD PARK,38,38,19992,East Garfield Park,4.751901
2,ARMOUR SQUARE,26,26,13890,Armour Square,4.679626
10,BRIGHTON PARK,82,82,45053,Brighton Park,4.550196
35,LINCOLN PARK,127,127,70492,Lincoln Park,4.504057
33,KENWOOD,33,33,19116,Kenwood,4.315756


In [20]:
df_community_summary.to_csv("../../results/ssw-part1-admin-hearings-by-community.csv", index=False)