<h1>Snow Clearance Fines, 2019-2023</h1>
15 February 2024

This analysis looks at fines levied for uncleared sidewalks, based on FOIA data requested from the Department of Administrative Hearings (H064920-011124.xlsx). The dataset contains 3058 records dated from 1/1/2001 to 9/12/2023; filtering for violations between 7/1/2019 and 6/30/2023 leaves 2560 records. Four of these could not be geocoded because the address was listed as "unknown".<br>
<br>
<ol>
<li><a href="#read">Read Data</a>
<li><a href="#docket">Summarize by Docket</a> - roll up fines data to get one record per court docket
<li><a href="#address">Summarize by Address</a> - roll up dockets to get one record per address
<li><a href="#community">Summarize Dockets by Community</a> - look for patterns across Chicago community areas
</ol>

### Record Count
<ul>
    <li>2556 valid fines records from July 1 2019 to June 30 2023.
    <li>1912 dockets. Some dockets contain multiple fines records, which are identical except for a different fine amount in each record.
    <li>1735 addresses. Some addresses have been fined by multiple agencies, such as both CDOT and Streets & Sanitation, with a separate court docket for each.
    </ul>
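The three counts above (records, dockets, addresses) can be reproduced with a few pandas calls. This is a minimal sketch on made-up rows, assuming the renamed columns `docket`, `address`, and `dept` used later in this notebook; the values are illustrative only and are not taken from the FOIA data.

```python
import pandas as pd

# Toy stand-in for the fines table (illustrative values only)
fines = pd.DataFrame({
    "docket":  ["D1", "D1", "D2", "D3", "D4"],
    "address": ["A", "A", "A", "B", "C"],
    "dept":    ["TRANPORT", "TRANPORT", "STRTSAN", "TRANPORT", "POLICE"],
    "fine_amount": [0.0, 150.0, 50.0, 500.0, 50.0],
})

n_records = len(fines)                      # one row per fine record
n_dockets = fines["docket"].nunique()       # one docket can hold several records
n_addresses = fines["address"].nunique()    # one address can have several dockets

# Addresses with dockets from more than one department
depts_per_address = fines.groupby("address")["dept"].nunique()
multi_agency = depts_per_address[depts_per_address > 1].index.tolist()

print(n_records, n_dockets, n_addresses, multi_agency)
```

On the real data, the same calls on `df` after cleaning should give the record, docket, and address totals listed above.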

### Preliminary Findings    
<ul>
    <li>73% of court dockets were issued by CDOT, 26% by Streets and Sanitation. The remaining 1% were issued by the police or Business Affairs and Consumer Protection.
    <li>Englewood, Garfield Ridge, and West Englewood are the three communities with the highest number of dockets per capita.
    <li>Only 25 court dockets were issued by police. West Englewood, Englewood, and Garfield Ridge have the highest rates and account for half the dockets citywide.
    <li>For dockets issued by CDOT, Garfield Ridge, Grand Boulevard, and Armour Square have the highest per capita rate.
    <li>For dockets issued by Streets and Sanitation, Englewood, West Englewood, and Brighton Park have the highest per capita rate.
    </ul>
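Department shares and per capita rates like those above could be computed along the following lines. This is a sketch on toy data, not the calculation behind the findings: the community names `X` and `Y` and the `population` figures are hypothetical placeholders, and the rate is expressed per 1,000 residents as one possible convention.

```python
import pandas as pd

# Toy docket-level table: one row per docket (illustrative values only)
dockets = pd.DataFrame({
    "docket":    ["D1", "D2", "D3", "D4"],
    "dept":      ["TRANPORT", "TRANPORT", "TRANPORT", "STRTSAN"],
    "community": ["X", "X", "Y", "X"],
})

# Share of dockets by issuing department, in percent
dept_share = dockets["dept"].value_counts(normalize=True).mul(100).round(1)

# HYPOTHETICAL population figures, indexed by community name
population = pd.Series({"X": 2000, "Y": 1000})

# Dockets per 1,000 residents; pandas aligns the two Series on community name
dockets_per_1000 = dockets["community"].value_counts() * 1000 / population

print(dept_share)
print(dockets_per_1000.sort_values(ascending=False))
```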

<a name="read"></a>
# 1. Read and Prepare Geocoded Fines Data

In [1]:
import pandas as pd
import requests
import altair as alt
import numpy as np
#import datetime as dt #would only need this if I manipulated dates post-API data retrieval

Note the following data preparation steps, completed prior to this notebook:
<ol>
<li>Prepared data by parsing dates and correcting data entry errors in addresses; see <a href="fines-01-prep-data.ipynb">fines-01-prep-data.ipynb</a>.
<li>Geocoded addresses to identify lat and long coordinates, and spatially joined addresses to Community Areas shapefile. I did this offline in QGIS.
</ol>

In [2]:
df = pd.read_csv("../data/03-geocoded/fines-geocoded-w-communities.csv")
df.head()

Unnamed: 0,field_1,Docket Number,Violation Date,Violation Address,Issuing Department Code,Imposed Fine Detailed,year,month,date,season,...,latlong,community,area,shape_area,perimeter,area_num_1,area_numbe,comarea_id,comarea,shape_len
0,416,20DT000917,2020/02/14,3100 S INDIANA,TRANPORT,0.0,2020,2,2020/02/14,2019-2020,...,"41.838241,-87.622033",DOUGLAS,0.0,46004620.0,0.0,35.0,35.0,0.0,0.0,31027.05451
1,417,20DT000917,2020/02/14,3100 S INDIANA,TRANPORT,150.0,2020,2,2020/02/14,2019-2020,...,"41.838241,-87.622033",DOUGLAS,0.0,46004620.0,0.0,35.0,35.0,0.0,0.0,31027.05451
2,418,20DT000917,2020/02/14,3100 S INDIANA,TRANPORT,500.0,2020,2,2020/02/14,2019-2020,...,"41.838241,-87.622033",DOUGLAS,0.0,46004620.0,0.0,35.0,35.0,0.0,0.0,31027.05451
3,757,21DT000478,2021/01/28,3317 S PRAIRIE,TRANPORT,110.0,2021,1,2021/01/28,2020-2021,...,"41.834323092878925,-87.62050356526285",DOUGLAS,0.0,46004620.0,0.0,35.0,35.0,0.0,0.0,31027.05451
4,773,21DT000493,2021/01/29,3658 S PRAIRIE,TRANPORT,500.0,2021,1,2021/01/29,2020-2021,...,"41.828145583678925,-87.62059581323",DOUGLAS,0.0,46004620.0,0.0,35.0,35.0,0.0,0.0,31027.05451


In [3]:
len(df)

2557

### parse lat and long

In [4]:
# split the "lat,long" string into two columns (the values remain strings)
df['lat'] = df['latlong'].str.split(',').str[0]
df['long'] = df['latlong'].str.split(',').str[1]

### rename and reduce columns
In future iterations, this renaming should be done upstream.

In [5]:
df = df[['Docket Number','Cleaned Address','Issuing Department Code','Imposed Fine Detailed','date','season','community','lat','long']]

In [6]:
df=df.rename(columns={"Docket Number":"docket","Cleaned Address":"address","Issuing Department Code":"dept","Imposed Fine Detailed":"fine_amount","date":"violation_date"})

### review and clean records with no community assigned

In [7]:
df[df["community"].isna()]

Unnamed: 0,docket,address,dept,fine_amount,violation_date,season,community,lat,long
2551,19DS72153L,300 W WASHINGTON ST,STRTSAN,0.0,2019/11/12,2019-2020,,41.882868,-88.210529
2552,19DS72153L,300 W WASHINGTON ST,STRTSAN,150.0,2019/11/12,2019-2020,,41.882868,-88.210529
2553,19DS72153L,300 W WASHINGTON ST,STRTSAN,500.0,2019/11/12,2019-2020,,41.882868,-88.210529
2554,21CP002398,1850 W MARQUETTE ST,POLICE,0.0,2021/02/12,2020-2021,,42.3236639,-87.8381484
2555,21CP002398,1850 W MARQUETTE ST,POLICE,50.0,2021/02/12,2020-2021,,42.3236639,-87.8381484
2556,21DT002747,5450 S 47TH,TRANPORT,500.0,2021/01/27,2020-2021,,41.8654712,-87.7423005


One address, 5450 S 47TH (docket 21DT002747), is not a valid address. It was geocoded to 41.8654712,-87.7423005, at the edge of Lawndale near Roosevelt, which doesn't look right. If it is really 5450 W. 47th, it would be in suburban Stickney. So I'm choosing to skip it.
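A coarse bounding-box test is one way to surface geocodes like these before reviewing them by hand. This is a minimal sketch: the box limits are my rough approximation of Chicago's extent (an assumption, not an official boundary), and the rows are copied from the tables in this notebook.

```python
import pandas as pd

# Rows copied from the tables in this notebook; lat/long are strings, as in df
pts = pd.DataFrame({
    "address": ["3100 S INDIANA", "300 W WASHINGTON ST",
                "1850 W MARQUETTE ST", "5450 S 47TH"],
    "lat":  ["41.838241", "41.882868", "42.3236639", "41.8654712"],
    "long": ["-87.622033", "-88.210529", "-87.8381484", "-87.7423005"],
})

# ASSUMPTION: rough bounding box around the city, not an official boundary
LAT_MIN, LAT_MAX = 41.6, 42.1
LONG_MIN, LONG_MAX = -87.95, -87.5

# Convert to numbers; anything unparseable becomes NaN and is flagged as well
lat = pd.to_numeric(pts["lat"], errors="coerce")
long = pd.to_numeric(pts["long"], errors="coerce")

in_box = lat.between(LAT_MIN, LAT_MAX) & long.between(LONG_MIN, LONG_MAX)
outside = pts.loc[~in_box, "address"].tolist()
print(outside)
```

The check flags 300 W WASHINGTON ST and 1850 W MARQUETTE ST, whose coordinates fall outside the box. It does not flag 5450 S 47TH, which lands inside the city at an implausible spot, so a test like this supplements manual review rather than replacing it.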

In [8]:
# drop the record with the invalid address
index_to_drop = df[df['address'] == '5450 S 47TH'].index
df = df.drop(index_to_drop)

In [9]:
# manually assign community areas for the other two addresses
# (their geocoded lat/long are left unchanged here)
df.loc[df['address'] == '300 W WASHINGTON ST', 'community'] = 'LOOP'
df.loc[df['address'] == '1850 W MARQUETTE ST', 'community'] = 'WEST ENGLEWOOD'

In [10]:
df.tail()

Unnamed: 0,docket,address,dept,fine_amount,violation_date,season,community,lat,long
2551,19DS72153L,300 W WASHINGTON ST,STRTSAN,0.0,2019/11/12,2019-2020,LOOP,41.882868,-88.210529
2552,19DS72153L,300 W WASHINGTON ST,STRTSAN,150.0,2019/11/12,2019-2020,LOOP,41.882868,-88.210529
2553,19DS72153L,300 W WASHINGTON ST,STRTSAN,500.0,2019/11/12,2019-2020,LOOP,41.882868,-88.210529
2554,21CP002398,1850 W MARQUETTE ST,POLICE,0.0,2021/02/12,2020-2021,WEST ENGLEWOOD,42.3236639,-87.8381484
2555,21CP002398,1850 W MARQUETTE ST,POLICE,50.0,2021/02/12,2020-2021,WEST ENGLEWOOD,42.3236639,-87.8381484


<a name="docket"></a>
# 2. Summarize by Docket

In [11]:
# roll up to one record per docket: count fine records and sum fine amounts
df_dockets=df.groupby(['docket','dept','address','lat','long','community','violation_date','season']).agg(
    n_records=('docket','count'),
    total_fine=('fine_amount', 'sum')
).reset_index()

In [12]:
df_dockets.head()

Unnamed: 0,docket,dept,address,lat,long,community,violation_date,season,n_records,total_fine
0,19DS68300L,STRTSAN,4710 S WESTERN AVE,41.807859,-87.68479703503766,BRIGHTON PARK,2019/11/13,2019-2020,1,150.0
1,19DS69216L,STRTSAN,1425 W MORSE AVE,42.0074513,-87.6668285,ROGERS PARK,2019/11/13,2019-2020,1,50.0
2,19DS70010L,STRTSAN,715 E 47TH ST,41.8093383,-87.6080127,GRAND BOULEVARD,2019/11/13,2019-2020,1,150.0
3,19DS72153L,STRTSAN,300 W WASHINGTON ST,41.882868,-88.210529,LOOP,2019/11/12,2019-2020,3,650.0
4,19DS72160L,STRTSAN,6929 N SHERIDAN RD,41.9598134,-87.654693,UPTOWN,2019/11/14,2019-2020,1,500.0
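Because the roll-up groups on several columns, any row with a missing value in one of the grouping keys is dropped by pandas by default (`dropna=True`). The sketch below illustrates that behavior with the three fine records of docket 19DS72153L from the table above, plus one made-up docket (`TOY0001`) with no community. The helper name `roll_up` is hypothetical, and only two grouping keys are used to keep the example short.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "docket":      ["19DS72153L", "19DS72153L", "19DS72153L", "TOY0001"],
    "community":   ["LOOP", "LOOP", "LOOP", np.nan],   # TOY0001 is made up
    "fine_amount": [0.0, 150.0, 500.0, 500.0],
})

def roll_up(frame, dropna=True):
    """Hypothetical helper: one row per docket with record count and fine total."""
    return (
        frame.groupby(["docket", "community"], dropna=dropna)
        .agg(n_records=("fine_amount", "count"),
             total_fine=("fine_amount", "sum"))
        .reset_index()
    )

default_rollup = roll_up(toy)               # drops the row with a missing community
keep_missing = roll_up(toy, dropna=False)   # keeps it as its own group

# Dockets that vanished because a grouping key was missing
missing_dockets = set(toy["docket"]) - set(default_rollup["docket"])
print(default_rollup)
print(missing_dockets)
```

Comparing the set of dockets before and after the roll-up, as in the last step, is a cheap way to confirm nothing was lost.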


<a name="address"></a>
# 3. Summarize by Address

### summarize total amount of fines and number of dockets by department, by address

In [13]:
# create pivot table
df_addresses = df_dockets.pivot_table(index=['address', 'lat','long','community'],
                             columns='dept',
                             values='total_fine',
                             aggfunc=['count', 'sum'],
                             fill_value=0)

# Rename columns for clarity
df_addresses.columns = ['_'.join(col).strip() for col in df_addresses.columns.values]

# first and last violation date per address (computed here but not joined to df_addresses)
df_addresses_dates = df_dockets.groupby('address')['violation_date'].agg(['min', 'max']).reset_index()

# Add total columns for each row
df_addresses['n_dockets'] = df_addresses.filter(like='count_').sum(axis=1)
df_addresses['total_fines'] = df_addresses.filter(like='sum_').sum(axis=1)

df_addresses.reset_index(inplace=True)

In [14]:
df_addresses.head()

Unnamed: 0,address,lat,long,community,count_BAFCONP,count_POLICE,count_STRTSAN,count_TRANPORT,sum_BAFCONP,sum_POLICE,sum_STRTSAN,sum_TRANPORT,n_dockets,total_fines
0,10 N KILBOURN AVE,41.88096235,-87.73835576302265,WEST GARFIELD PARK,0,0,1,0,0.0,0.0,150.0,0.0,1,150.0
1,100 E CHESTNUT ST,41.8985878,-87.62589246260751,NEAR NORTH SIDE,0,0,0,1,0.0,0.0,0.0,550.0,1,550.0
2,100 N KEDZIE,41.9467224,-87.7078351,IRVING PARK,0,0,0,1,0.0,0.0,0.0,0.0,1,0.0
3,100 N KEDZIE AVE,41.88315915,-87.70652906434054,EAST GARFIELD PARK,0,0,3,0,0.0,0.0,150.0,0.0,3,150.0
4,100 W GRAND,41.7655813,-87.6216949,GREATER GRAND CROSSING,0,0,0,1,0.0,0.0,0.0,500.0,1,500.0
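The pivot, column flattening, and row totals can be easier to follow on a tiny table. The sketch below repeats the same steps on three made-up dockets; the addresses `A` and `B` and the fine amounts are illustrative only.

```python
import pandas as pd

# Toy docket-level table (illustrative values only)
dockets = pd.DataFrame({
    "address":    ["A", "A", "B"],
    "dept":       ["STRTSAN", "TRANPORT", "TRANPORT"],
    "total_fine": [150.0, 500.0, 0.0],
})

# One row per address; one count and one sum column per department
wide = dockets.pivot_table(index="address",
                           columns="dept",
                           values="total_fine",
                           aggfunc=["count", "sum"],
                           fill_value=0)

# Columns are (aggfunc, dept) pairs; flatten them to names like "count_STRTSAN"
wide.columns = ["_".join(col) for col in wide.columns]

# Row totals across departments
wide["n_dockets"] = wide.filter(like="count_").sum(axis=1)
wide["total_fines"] = wide.filter(like="sum_").sum(axis=1)

wide = wide.reset_index()
print(wide)
```

Address `A` ends up with two dockets totaling 650.0, and `B` with one docket and a zero total, which matches how a docket with a 0.0 fine still counts toward `n_dockets` in the output above.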


In [15]:
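def get_department(BAFCONP, POLICE, STRTSAN, TRANPORT):
    """Label each address by the department(s) that issued its dockets.

    np.select uses the first matching condition, so an address cited by
    CDOT or Streets & Sanitation is labeled by that department even if
    POLICE or BAFCONP also issued a docket there.
    """
    conditions = [
        (TRANPORT >= 1) & (STRTSAN >= 1),
        (TRANPORT >= 1),
        (STRTSAN >= 1),
        (POLICE >= 1),
        (BAFCONP >= 1)
    ]
    choices = ['cdot_and_streets', 'cdot', 'streets', 'police', 'bafconp']
    return np.select(conditions, choices, default='unknown')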

In [16]:
df_addresses['dept']= get_department(df_addresses['count_BAFCONP'],
                                     df_addresses['count_POLICE'],
                                     df_addresses['count_STRTSAN'],
                                     df_addresses['count_TRANPORT'])

In [17]:
df_addresses.head()

Unnamed: 0,address,lat,long,community,count_BAFCONP,count_POLICE,count_STRTSAN,count_TRANPORT,sum_BAFCONP,sum_POLICE,sum_STRTSAN,sum_TRANPORT,n_dockets,total_fines,dept
0,10 N KILBOURN AVE,41.88096235,-87.73835576302265,WEST GARFIELD PARK,0,0,1,0,0.0,0.0,150.0,0.0,1,150.0,streets
1,100 E CHESTNUT ST,41.8985878,-87.62589246260751,NEAR NORTH SIDE,0,0,0,1,0.0,0.0,0.0,550.0,1,550.0,cdot
2,100 N KEDZIE,41.9467224,-87.7078351,IRVING PARK,0,0,0,1,0.0,0.0,0.0,0.0,1,0.0,cdot
3,100 N KEDZIE AVE,41.88315915,-87.70652906434054,EAST GARFIELD PARK,0,0,3,0,0.0,0.0,150.0,0.0,3,150.0,streets
4,100 W GRAND,41.7655813,-87.6216949,GREATER GRAND CROSSING,0,0,0,1,0.0,0.0,0.0,500.0,1,500.0,cdot
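Since `np.select` returns the choice for the first condition that matches, the `dept` label depends on the order of the conditions. The sketch below copies `get_department` from the cell above and runs it on four made-up rows to show the precedence; the extra `n_depts` column is my own suggestion for spotting multi-agency addresses, not part of the original analysis.

```python
import numpy as np
import pandas as pd

def get_department(BAFCONP, POLICE, STRTSAN, TRANPORT):
    # Copied from the notebook cell above
    conditions = [
        (TRANPORT >= 1) & (STRTSAN >= 1),
        (TRANPORT >= 1),
        (STRTSAN >= 1),
        (POLICE >= 1),
        (BAFCONP >= 1),
    ]
    choices = ['cdot_and_streets', 'cdot', 'streets', 'police', 'bafconp']
    return np.select(conditions, choices, default='unknown')

# Toy docket counts per address (illustrative values only)
counts = pd.DataFrame({
    "count_BAFCONP":  [0, 0, 0, 1],
    "count_POLICE":   [0, 1, 1, 0],
    "count_STRTSAN":  [1, 0, 0, 0],
    "count_TRANPORT": [1, 1, 0, 0],
})

labels = get_department(counts["count_BAFCONP"], counts["count_POLICE"],
                        counts["count_STRTSAN"], counts["count_TRANPORT"])

# Suggested cross-check: number of distinct departments with at least one docket
n_depts = (counts > 0).sum(axis=1)

print(list(labels))
print(n_depts.tolist())
```

The second row has both a police and a CDOT docket but is labeled `cdot`, so a count such as `n_depts` is needed to recover every address fined by more than one agency.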


# 5. Export for Analysis

In [18]:
df_addresses.to_csv("../data/04-standardized/fines-by-address.csv", index=False)
df_dockets.to_csv("../data/04-standardized/fines-by-docket.csv", index=False)