## Toronto Parking Tickets Analysis ##

#### Dataset Description ####

Approximately 2.8 million parking tickets are issued annually across the City of Toronto. This dataset contains non-identifiable information relating to each parking ticket issued for each calendar year. The tickets are issued by Toronto Police Services (TPS) personnel as well as persons certified and authorized to issue tickets by TPS.

This data set contains complete records only; incomplete records in the City database are excluded. Records may be incomplete for a variety of reasons, e.g. the vehicle registration is out-of-province, or the ticket was paid before staff entered the ticket data. The volume of incomplete records is low relative to the overall volume and therefore has an insignificant impact on trend analysis.

Please note, you may need to download an open source application that splits the file into little chunks to import into Excel. We are hoping to present this data in the future in smaller file sizes.
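As an alternative to splitting the file for Excel, a large CSV can be processed in pieces with pandas. The sketch below is only an illustration: the helper name `count_tickets_by_code` and the in-memory sample rows are made up for this example, and it assumes the file has an `infraction_code` column like the one used later in this notebook.

```python
import io
from collections import Counter

import pandas as pd


def count_tickets_by_code(csv_source, chunksize=100_000):
    """Tally tickets per infraction_code without loading the whole file at once.

    Hypothetical helper for illustration; csv_source may be a path or a file-like object.
    """
    totals = Counter()
    # With chunksize set, read_csv yields DataFrames of at most `chunksize` rows
    for chunk in pd.read_csv(csv_source, chunksize=chunksize):
        totals.update(chunk["infraction_code"].value_counts().to_dict())
    return dict(totals)


# Tiny in-memory stand-in for a real CSV file (rows are made up)
sample_csv = io.StringIO(
    "tag_number_masked,infraction_code\n"
    "***00001,29\n"
    "***00002,29\n"
    "***00003,16\n"
)
print(count_tickets_by_code(sample_csv, chunksize=2))
```

On the real files, the same function could be pointed at a path such as the one passed to `pd.read_csv` below, with a larger `chunksize`.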

Data source: https://open.toronto.ca/dataset/parking-tickets/

In [47]:
# Import libraries

import pandas as pd
import requests
import numpy as np
import plotly.express as px
import plotly.io as pio
import plotly.graph_objects as go

In [48]:
# Get the dataset metadata by passing the package id to the package_show endpoint
# For example, to retrieve the metadata for this dataset:

url = "https://ckan0.cf.opendata.inter.prod-toronto.ca/api/3/action/package_show"
params = { "id": "8c233bc2-1879-44ff-a0e4-9b69a9032c54"}
package = requests.get(url, params = params).json()
#print(package['result']['resources'])
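The commented-out `print` above dumps every resource in full. A more readable listing might look like the sketch below. It assumes each resource entry carries the usual CKAN fields `name`, `format`, and `url`; the helper `list_resources` and the `example_package` dictionary are hypothetical and stand in for the real response so that no network call is needed.

```python
def list_resources(package):
    """Return (name, format, url) for each resource in a package_show response."""
    resources = package.get("result", {}).get("resources", [])
    return [(r.get("name"), r.get("format"), r.get("url")) for r in resources]


# Hypothetical response fragment, for illustration only
example_package = {
    "result": {
        "resources": [
            {
                "name": "example-resource",
                "format": "ZIP",
                "url": "https://example.com/example.zip",
            },
        ]
    }
}

for name, fmt, url in list_resources(example_package):
    print(name, fmt, url)
```

Calling `list_resources(package)` on the response fetched above should give a compact overview of the files available for download.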

In [49]:
# Import and prepare data

df = pd.read_csv('parking-tickets-2018/Parking_Tags_Data_2018_1.csv')  # read_csv already returns a DataFrame
df.shape

(750000, 11)

In [50]:
df.head()

Unnamed: 0,tag_number_masked,date_of_infraction,infraction_code,infraction_description,set_fine_amount,time_of_infraction,location1,location2,location3,location4,province
0,***92517,20180101,16,PARK-WITHIN 9M INTERSECT ROAD,50,0.0,S/S,PRYOR AVE,E/O,CLOVERDALE RD,ON
1,***71708,20180101,29,PARK PROHIBITED TIME NO PERMIT,30,2.0,NR,266 DOVERCOURT RD,,,ON
2,***92311,20180101,29,PARK PROHIBITED TIME NO PERMIT,30,2.0,NR,15 FAIRBANK AVE,,,ON
3,***92312,20180101,29,PARK PROHIBITED TIME NO PERMIT,30,2.0,NR,15 FAIRBANK AVE,,,ON
4,***71709,20180101,29,PARK PROHIBITED TIME NO PERMIT,30,3.0,NR,266 DOVERCOURT RD,,,ON


In [51]:
df.dtypes

tag_number_masked          object
date_of_infraction          int64
infraction_code             int64
infraction_description     object
set_fine_amount             int64
time_of_infraction        float64
location1                  object
location2                  object
location3                  object
location4                  object
province                   object
dtype: object

In [52]:
# Check for NaN values in the columns used below
print(df['tag_number_masked'].isnull().values.any())
print(df['infraction_description'].isnull().values.any())

False
False
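The check above covers two columns only. A possible way to see missing values for every column at once is sketched below; the helper name `missing_value_counts` is made up, and the small `sample` frame is a stand-in for `df` built from a few location values shown in the `df.head()` output.

```python
import pandas as pd


def missing_value_counts(frame):
    """Return the number of missing (NaN/None) values in each column."""
    return frame.isna().sum()


# Small stand-in for df; location3 is empty for two of the three rows
sample = pd.DataFrame(
    {
        "location2": ["PRYOR AVE", "266 DOVERCOURT RD", "15 FAIRBANK AVE"],
        "location3": ["E/O", None, None],
    }
)
print(missing_value_counts(sample))
```

Running `missing_value_counts(df)` on the full data would show which of the location columns are sparsely populated.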


In [53]:
# Extract year, month, and day of the month into their own columns (as strings)
df['infraction_yr'] = df.date_of_infraction.astype(str).str[:4]
df['infraction_mth'] = df.date_of_infraction.astype(str).str[4:6]
df['infraction_date'] = df.date_of_infraction.astype(str).str[6:8]
df.tail()

Unnamed: 0,tag_number_masked,date_of_infraction,infraction_code,infraction_description,set_fine_amount,time_of_infraction,location1,location2,location3,location4,province,infraction_yr,infraction_mth,infraction_date
749995,***36033,20180517,207,PARK MACHINE-REQD FEE NOT PAID,30,1237.0,NR,37 ELM ST,,,ON,2018,5,17
749996,***52274,20180517,3,PARK ON PRIVATE PROPERTY,30,1237.0,AT,33 DAVISVILLE AVE,,,ON,2018,5,17
749997,***53832,20180517,5,PARK-SIGNED HWY-PROHIBIT DY/TM,50,1237.0,OPP,188 MC CAUL ST,,,ON,2018,5,17
749998,***56548,20180517,9,STOP-SIGNED HWY-PROHIBIT TM/DY,60,1237.0,NR,124 AVENUE RD,,,ON,2018,5,17
749999,***59300,20180517,29,PARK PROHIBITED TIME NO PERMIT,30,1237.0,NR,508 MARKHAM ST,,,ON,2018,5,17
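String slicing works because `date_of_infraction` is always an eight-digit `YYYYMMDD` value. An alternative, sketched here under that same assumption, is to parse the column into real timestamps with `pd.to_datetime`, which also gives access to calendar attributes such as the weekday. The column names `infraction_ts`, `infraction_mth_num`, and `infraction_weekday` are illustrative and not part of the original notebook.

```python
import pandas as pd

# Two dates taken from the head/tail output above
sample = pd.DataFrame({"date_of_infraction": [20180101, 20180517]})

# Parse the integer YYYYMMDD values into datetime64 timestamps
parsed = pd.to_datetime(sample["date_of_infraction"].astype(str), format="%Y%m%d")

sample["infraction_ts"] = parsed
sample["infraction_mth_num"] = parsed.dt.month        # integer month, 1-12
sample["infraction_weekday"] = parsed.dt.day_name()   # e.g. "Monday"
print(sample)
```

A single timestamp column also sorts chronologically and can be used directly as the x-axis of a time series plot.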


In [77]:
# Count tickets per day; after the groupby, tag_number_masked holds the daily ticket count
daily_infraction_df = pd.DataFrame(df.groupby(['infraction_yr','infraction_mth', 'infraction_date']).tag_number_masked.count())
daily_infraction_df = daily_infraction_df.reset_index()
# Build a YYYY-MM-DD label for each day to use as the plot axis
daily_infraction_df['infraction_ymd'] = daily_infraction_df['infraction_yr'].map(str) + '-' + daily_infraction_df['infraction_mth'].map(str) + '-' + daily_infraction_df['infraction_date'].map(str)
daily_infraction_df.head()

Unnamed: 0,infraction_yr,infraction_mth,infraction_date,tag_number_masked,infraction_ymd
0,2018,1,1,1269,2018-01-01
1,2018,1,2,5489,2018-01-02
2,2018,1,3,5104,2018-01-03
3,2018,1,4,5002,2018-01-04
4,2018,1,5,4177,2018-01-05


## Plot Total Infractions by Month ##

In [81]:
# Plot the number of infractions per day as a line chart
fig = px.line(daily_infraction_df, x='infraction_ymd', y='tag_number_masked', title='Number of Infractions Per Day')
fig.show()

In [74]:
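# Plot infraction volume by month: sum the daily counts so each month is a single bar

monthly_infraction_df = daily_infraction_df.groupby('infraction_mth', as_index=False)['tag_number_masked'].sum()
fig2 = px.bar(monthly_infraction_df,
              x='infraction_mth',
              y='tag_number_masked',
              title='Number of Infractions Per Month')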
fig2.show()

In [68]:
# Display daily infraction counts as a heatmap (day of month on x, month on y)

fig_heatmap = go.Figure(data=go.Heatmap(
                        z=daily_infraction_df['tag_number_masked'],
                        x=daily_infraction_df['infraction_date'], 
                        y=daily_infraction_df['infraction_mth'],
                        colorscale='Mint'
                        ))

fig_heatmap.update_xaxes(title_text = 'Date')
fig_heatmap.update_yaxes(title_text = 'Month')
fig_heatmap.show()
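The heatmap above is built from three long-form columns. For inspecting the same numbers as a table, one option is to pivot the daily counts into a month-by-day matrix, as sketched below. The helper `month_by_day_matrix` is hypothetical; the first two counts in `sample_daily` come from the daily table shown earlier, and the February value is made up.

```python
import pandas as pd


def month_by_day_matrix(daily):
    """Pivot daily counts into a matrix with months as rows and days as columns."""
    return daily.pivot(
        index="infraction_mth",
        columns="infraction_date",
        values="tag_number_masked",
    )


# Small stand-in for daily_infraction_df (the "02" row is invented)
sample_daily = pd.DataFrame(
    {
        "infraction_mth": ["01", "01", "02"],
        "infraction_date": ["01", "02", "01"],
        "tag_number_masked": [1269, 5489, 10],
    }
)
matrix = month_by_day_matrix(sample_daily)
print(matrix)
```

Days with no data for a given month show up as NaN. The matrix could also be fed to `go.Heatmap` by passing `matrix.values` as `z`, `matrix.columns` as `x`, and `matrix.index` as `y`.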