# Parking Violations Issued - Fiscal Year 2022
This dataset covers parking violations issued between July 1, 2021 and June 30, 2022. In New York City, the fiscal year begins on July 1 of one calendar year and ends on June 30 of the following calendar year.

Download the dataset from: https://data.cityofnewyork.us/City-Government/Parking-Violations-Issued-Fiscal-Year-2022/7mxj-7a6y/about_data

In [1]:
import pandas as pd 
import os 

In [4]:
# Load 5 columns of data
df = pd.read_parquet(
    "C:\\Users\\hamza\\Documents\\Github\\nyc_parking_violations_2022.parquet",
    columns=["Registration State", "Violation Description", "Vehicle Body Type", "Issue Date", "Summons Number"]
)

In [5]:
df.sample(10)

Unnamed: 0,Registration State,Violation Description,Vehicle Body Type,Issue Date,Summons Number
10920273,NY,70A-Reg. Sticker Expired (NYS),4DSD,03/22/2022,8833996980
423124,NY,53-Safety Zone,SUBN,07/22/2021,8796048256
8410885,NJ,PHTO SCHOOL ZN SPEED VIOLATION,UT,01/10/2022,4763790262
13050754,NY,50-Crosswalk,SUBN,05/05/2022,8995023417
12500612,NY,21-No Parking (street clean),4DSD,04/13/2022,8785021477
14904219,NY,38-Failure to Dsplay Meter Rec,4DSD,06/18/2022,8582084766
15202064,NY,20A-No Parking (Non-COM),SUBN,05/31/2022,8906913102
14116079,NY,PHTO SCHOOL ZN SPEED VIOLATION,SUBN,06/02/2022,4781333760
6635777,NY,,SUBN,11/26/2021,1484521900
13310268,NY,PHTO SCHOOL ZN SPEED VIOLATION,BUS,05/12/2022,4779362416


### Which parking violation is most commonly committed by vehicles from each U.S. state?
That is, print the most frequently committed parking violation for each registration state.

In [6]:
# Group by state and violations
violations_count = df.groupby(['Registration State', 'Violation Description']).size()

# add violations count to dataframe
violations_df = violations_count.reset_index(name='Count')

# Sort by violation count for each state
violations_df_sorted = violations_df.sort_values(by=['Registration State', 'Count'], ascending=[True, False])

# Choose only the highest violation count for each state
violations_by_state = violations_df_sorted.groupby('Registration State').first().reset_index()
violations_by_state

Unnamed: 0,Registration State,Violation Description,Count
0,99,,17550
1,AB,14-No Standing,22
2,AK,PHTO SCHOOL ZN SPEED VIOLATION,125
3,AL,PHTO SCHOOL ZN SPEED VIOLATION,3668
4,AR,PHTO SCHOOL ZN SPEED VIOLATION,537
...,...,...,...
62,VT,PHTO SCHOOL ZN SPEED VIOLATION,3024
63,WA,21-No Parking (street clean),3732
64,WI,14-No Standing,1639
65,WV,PHTO SCHOOL ZN SPEED VIOLATION,1185
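
The "top violation per state" step can also be written with `idxmax`, which picks the whole row holding the largest count in each group. The sketch below is illustrative only: `toy`, `counts`, `top_idx`, and `top_by_state` are hypothetical names, and the six rows are made-up data that reuse the notebook's column names, not values from the real dataset.

```python
import pandas as pd

# Hypothetical toy data using the same column names as the notebook
toy = pd.DataFrame({
    "Registration State": ["NY", "NY", "NY", "NJ", "NJ", "NJ"],
    "Violation Description": [
        "50-Crosswalk", "14-No Standing", "14-No Standing",
        "53-Safety Zone", "53-Safety Zone", "50-Crosswalk",
    ],
})

# Count each (state, violation) pair
counts = (
    toy.groupby(["Registration State", "Violation Description"])
    .size()
    .reset_index(name="Count")
)

# For each state, find the index label of the row with the largest count
top_idx = counts.groupby("Registration State")["Count"].idxmax()

# Select those rows as-is, one per state
top_by_state = counts.loc[top_idx].reset_index(drop=True)
print(top_by_state)
```

One reason to consider this variant: `GroupBy.first()` returns the first non-null value in each column separately, whereas `idxmax` followed by `.loc` always returns an intact row. On ties, `idxmax` keeps the first row it encounters.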


### Which vehicle body types are most frequently involved in parking violations?

In [None]:
# Group by vehicle body type
body_types = df.groupby(['Vehicle Body Type']).size().reset_index(name='Count')

# Sort by count, largest first
body_types_sorted = body_types.sort_values('Count', ascending=False)
body_types_sorted

Unnamed: 0,Vehicle Body Type,Count
792,SUBN,6449007
50,4DSD,4402991
918,VAN,1317899
290,DELV,436430
663,PICK,429798
...,...,...
183,CARY,1
421,ISUZ,1
423,IXMR,1
139,BILB,1
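
Raw counts are easier to compare when paired with each body type's share of all tickets. The following is a minimal sketch using `Series.value_counts`, assuming a made-up six-row frame; `toy`, `body_counts`, and `summary` are hypothetical names and the values are not taken from the real dataset.

```python
import pandas as pd

# Hypothetical toy data using the notebook's column name
toy = pd.DataFrame(
    {"Vehicle Body Type": ["SUBN", "4DSD", "SUBN", "VAN", "SUBN", "4DSD"]}
)

# value_counts() counts each body type and sorts from most to least frequent
body_counts = toy["Vehicle Body Type"].value_counts()

# normalize=True returns each type's fraction of all rows instead of a count
body_share = toy["Vehicle Body Type"].value_counts(normalize=True)

# Combine both into one table; the assignment aligns on the body-type index
summary = body_counts.to_frame("Count")
summary["Share"] = body_share.round(3)
print(summary)
```

Applied to the full data, a table like this would also make it easier to spot the long tail of body-type codes that appear only once.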


### How do parking violations vary across days of the week?
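The result below shows that far more tickets are issued on weekdays than on weekends, which makes intuitive sense: each weekday accounts for roughly 2.5 to 2.9 million summonses, compared with about 1.1 million on Saturdays and fewer than 0.5 million on Sundays.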

In [29]:
weekday_names = {
    0: "Monday",
    1: "Tuesday",
    2: "Wednesday",
    3: "Thursday",
    4: "Friday",
    5: "Saturday",
    6: "Sunday",
}

# Convert Issue Date to Datetime object 
df['Issue Date'] = df['Issue Date'].astype('datetime64[ms]')

# Map each Issue Date to the corresponding Day in the week
df["issue_weekday"] = df["Issue Date"].dt.weekday.map(weekday_names)

# Count summonses per weekday, sorted from fewest to most
df.groupby(['issue_weekday'])["Summons Number"].count().sort_values()

issue_weekday
Sunday        462992
Saturday     1108385
Monday       2488563
Wednesday    2760088
Tuesday      2809949
Friday       2891679
Thursday     2913951
Name: Summons Number, dtype: int64
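
The weekday breakdown can also be produced in calendar order rather than sorted by count. The sketch below is one possible variant, not the notebook's method: it parses dates with an explicit format, uses the built-in `dt.day_name()` instead of a hand-written mapping, and reindexes to Monday through Sunday. The frame `toy` and the names `day_order` and `per_day` are hypothetical, and the format string assumes dates look like `MM/DD/YYYY`, as in the sample shown earlier.

```python
import pandas as pd

# Hypothetical toy data; dates follow the MM/DD/YYYY pattern seen in the sample
toy = pd.DataFrame({
    "Issue Date": ["03/22/2022", "07/22/2021", "01/10/2022", "05/05/2022", "03/22/2022"],
    "Summons Number": [1, 2, 3, 4, 5],
})

# Parse with an explicit format so day and month cannot be swapped silently
toy["Issue Date"] = pd.to_datetime(toy["Issue Date"], format="%m/%d/%Y")

day_order = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]

# day_name() yields English weekday names; reindex puts them in calendar
# order and fills days with no tickets with 0
per_day = (
    toy["Issue Date"].dt.day_name()
    .value_counts()
    .reindex(day_order, fill_value=0)
)
print(per_day)
```

Calendar order makes the weekday and weekend contrast visible at a glance, while sorting by count (as in the cell above) highlights the extremes.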