# Parking Violations Issued - Fiscal Year 2022
This dataset covers parking violations issued between July 1, 2021 and June 30, 2022. In New York City, the fiscal year begins on July 1 of one calendar year and ends on June 30 of the following calendar year.
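
As a small illustration of the fiscal-year rule described above, the sketch below maps a calendar date to the fiscal year that contains it. The function name `nyc_fiscal_year` is hypothetical and not part of the dataset or any library. It assumes the fiscal year is named after the calendar year in which it ends, which matches the "Fiscal Year 2022" label on this dataset.

```python
from datetime import date


def nyc_fiscal_year(d: date) -> int:
    """Return the fiscal year containing d (hypothetical helper).

    Assumes the fiscal year runs July 1 through June 30 and is named
    after the calendar year in which it ends.
    """
    # Dates from July onward belong to the fiscal year ending next June.
    return d.year + 1 if d.month >= 7 else d.year


print(nyc_fiscal_year(date(2021, 7, 1)))   # first day covered by this dataset
print(nyc_fiscal_year(date(2022, 6, 30)))  # last day covered by this dataset
```

A check like this can help confirm that every `Issue Date` in the file actually falls inside the expected fiscal year.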

Download the dataset from: https://data.cityofnewyork.us/City-Government/Parking-Violations-Issued-Fiscal-Year-2022/7mxj-7a6y/about_data

In [1]:
import pandas as pd 
import os 

In [None]:
# Load 5 columns of data
df = pd.read_parquet(
    "C:\\Users\\hamza\\Documents\\Github\\nyc_parking_violations_2022.parquet",
    columns=["Registration State", "Violation Description", "Vehicle Body Type", "Issue Date", "Summons Number"]
)

In [3]:
df.sample(10)

Unnamed: 0,Registration State,Violation Description,Vehicle Body Type,Issue Date,Summons Number
2342640,NY,38-Failure to Dsplay Meter Rec,MOPD,08/07/2021,8897121925
4352680,NY,21-No Parking (street clean),4DSD,10/08/2021,8952488350
8612165,MA,38-Failure to Dsplay Meter Rec,SUBN,12/29/2021,8683647160
8112825,NY,38-Failure to Dsplay Meter Rec,4DSD,01/17/2022,8844632147
14626895,NY,PHTO SCHOOL ZN SPEED VIOLATION,4DSD,06/15/2022,4783027079
14350358,NY,20A-No Parking (Non-COM),SUBN,06/03/2022,8905330290
5922864,NY,PHTO SCHOOL ZN SPEED VIOLATION,4DSD,11/09/2021,4756047427
298472,NY,PHTO SCHOOL ZN SPEED VIOLATION,SUBN,07/23/2021,4741566997
14933130,NJ,19-No Stand (bus stop),SUBN,06/17/2022,8702347283
6924937,NY,21-No Parking (street clean),SUBN,12/03/2021,8884307119


In [5]:
# Load all columns of data
df2 = pd.read_parquet(
    "C:\\Users\\hamza\\Documents\\Github\\nyc_parking_violations_2022.parquet"
)
df2.sample(10)

Unnamed: 0,Summons Number,Plate ID,Registration State,Plate Type,Issue Date,Violation Code,Vehicle Body Type,Vehicle Make,Issuing Agency,Street Code1,...,Vehicle Color,Unregistered Vehicle?,Vehicle Year,Meter Number,Feet From Curb,Violation Post Code,Violation Description,No Standing or Stopping Violation,Hydrant Violation,Double Parking Violation
3063769,8825079503,JBV4532,NC,PAS,09/11/2021,40,4DSD,DODGE,T,0,...,WHITE,,0,,0,05,40-Fire Hydrant,,,
2698030,8968282286,15540MN,NY,COM,08/10/2021,46,VAN,FORD,T,59520,...,WH,,2020,,0,42,46B-Double Parking (Com-100Ft),,,
9078473,4768031717,HPE5616,NY,PAS,02/15/2022,36,SUBN,GMC,V,0,...,BK,,2020,,0,,PHTO SCHOOL ZN SPEED VIOLATION,,,
2218422,8810070148,539253N,NJ,PAS,08/25/2021,21,SUBN,HONDA,T,16290,...,RED,,0,,0,51-A,21-No Parking (street clean),,,
9393396,2003272124,176701B,ME,PAS,02/09/2022,66,TRLR,N/S,S,46420,...,WH,,0,,0,,Detached Trailer,,,
11085475,8883677237,EFB6583,NY,PAS,03/16/2022,74,4DSD,ME/BE,T,25390,...,WH,,2020,,0,10,74A-Improperly Displayed Plate,,,
5917308,4756034391,CRA8144,NY,PAS,11/09/2021,36,PICK,NISSA,V,0,...,GY,,2008,,0,,PHTO SCHOOL ZN SPEED VIOLATION,,,
844940,8966968806,16340PF,NY,APP,07/06/2021,19,TRAC,FRUEH,T,78050,...,RD,,2005,,0,09,19-No Stand (bus stop),,,
2115921,4746116003,KNU9861,NY,PAS,08/20/2021,36,SUBN,HYUND,V,0,...,BL,,2021,,0,,PHTO SCHOOL ZN SPEED VIOLATION,,,
1738553,8977078430,KPC7654,NY,PAS,08/05/2021,21,SUBN,CHEVR,T,10030,...,GR,,2002,,0,E,21-No Parking (street clean),,,


### Save the data as CSV for analysis in SQL Server Management Studio (SSMS)

In [6]:
df.to_csv("C:\\Users\\hamza\\Documents\\Github\\nyc_parking_violations_2022.csv", index=False)

### Which parking violation is most commonly committed by vehicles from each U.S. state?
i.e., print the most commonly committed parking violation for each registration state.

In [6]:
# Group by state and violations
violations_count = df.groupby(['Registration State', 'Violation Description']).size()

# add violations count to dataframe
violations_df = violations_count.reset_index(name='Count')

# Sort by violation count for each state
violations_df_sorted = violations_df.sort_values(by=['Registration State', 'Count'], ascending=[True, False])

# Choose only the highest violation count for each state
violations_by_state = violations_df_sorted.groupby('Registration State').first().reset_index()
violations_by_state

Unnamed: 0,Registration State,Violation Description,Count
0,99,,17550
1,AB,14-No Standing,22
2,AK,PHTO SCHOOL ZN SPEED VIOLATION,125
3,AL,PHTO SCHOOL ZN SPEED VIOLATION,3668
4,AR,PHTO SCHOOL ZN SPEED VIOLATION,537
...,...,...,...
62,VT,PHTO SCHOOL ZN SPEED VIOLATION,3024
63,WA,21-No Parking (street clean),3732
64,WI,14-No Standing,1639
65,WV,PHTO SCHOOL ZN SPEED VIOLATION,1185
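
An alternative way to pick the top violation per state is sketched below on a tiny made-up sample. The values are invented for illustration and are not taken from the dataset. It uses `idxmax` to select the whole row with the largest count in each group. This may be worth considering because `GroupBy.first()` returns the first non-null value of each column separately, so if a description were missing, the description and count could come from different rows.

```python
import pandas as pd

# Tiny synthetic sample standing in for the real data (values are made up).
sample = pd.DataFrame({
    "Registration State": ["NY", "NY", "NY", "NJ", "NJ", "NJ"],
    "Violation Description": [
        "14-No Standing", "14-No Standing", "40-Fire Hydrant",
        "40-Fire Hydrant", "40-Fire Hydrant", "14-No Standing",
    ],
})

# Count each (state, violation) pair.
counts = (
    sample.groupby(["Registration State", "Violation Description"])
    .size()
    .reset_index(name="Count")
)

# idxmax gives the row label of the largest Count within each state;
# .loc then pulls those complete rows, keeping description and count paired.
top_rows = counts.loc[counts.groupby("Registration State")["Count"].idxmax()]
top_by_state = top_rows.reset_index(drop=True)
print(top_by_state)
```

Ties are resolved by whichever row appears first, so a state with two equally common violations reports only one of them.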


### Which vehicle body types are most frequently involved in parking violations?

In [None]:
# Group by vehicle body type
body_types = df.groupby(['Vehicle Body Type']).size().reset_index(name='Count')

# Sort by count, largest first
body_types_sorted = body_types.sort_values('Count', ascending=False)
body_types_sorted

Unnamed: 0,Vehicle Body Type,Count
792,SUBN,6449007
50,4DSD,4402991
918,VAN,1317899
290,DELV,436430
663,PICK,429798
...,...,...
183,CARY,1
421,ISUZ,1
423,IXMR,1
139,BILB,1
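
Raw counts can be hard to compare at a glance, so the sketch below shows one way to express each body type as a share of all tickets using `value_counts(normalize=True)`. The sample frame is made up for illustration; on the real data the same call would be applied to `df["Vehicle Body Type"]`.

```python
import pandas as pd

# Tiny synthetic sample (values are made up, not taken from the dataset).
sample = pd.DataFrame({"Vehicle Body Type": ["SUBN", "SUBN", "4DSD", "VAN"]})

# normalize=True returns fractions of the total instead of raw counts,
# sorted from most to least frequent.
body_type_share = sample["Vehicle Body Type"].value_counts(normalize=True)
print(body_type_share)
```

The many body types that occur only once in the output above are a sign that the column contains a long tail of rare codes, which a share-based view makes easier to spot.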


### How do parking violations vary across days of the week?
The result below shows that far more parking tickets are issued on weekdays than on weekends, which makes intuitive sense.

In [29]:
weekday_names = {
    0: "Monday",
    1: "Tuesday",
    2: "Wednesday",
    3: "Thursday",
    4: "Friday",
    5: "Saturday",
    6: "Sunday",
}

# Convert Issue Date (MM/DD/YYYY strings) to datetime
df['Issue Date'] = pd.to_datetime(df['Issue Date'], format='%m/%d/%Y')

# Map each Issue Date to the corresponding Day in the week
df["issue_weekday"] = df["Issue Date"].dt.weekday.map(weekday_names)

# Count summonses per weekday, sorted from fewest to most
df.groupby(['issue_weekday'])["Summons Number"].count().sort_values()

issue_weekday
Sunday        462992
Saturday     1108385
Monday       2488563
Wednesday    2760088
Tuesday      2809949
Friday       2891679
Thursday     2913951
Name: Summons Number, dtype: int64
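
The output above is sorted by count, which scrambles the calendar order of the days. As a possible alternative, the sketch below uses `dt.day_name()` instead of a manual mapping and reindexes the result so the days appear Monday through Sunday. The sample rows reuse a few dates in the format shown earlier, and the summons numbers are placeholders.

```python
import pandas as pd

# Tiny synthetic sample; summons numbers are placeholders.
sample = pd.DataFrame({
    "Issue Date": ["08/07/2021", "10/08/2021", "12/29/2021", "01/17/2022", "06/15/2022"],
    "Summons Number": [1, 2, 3, 4, 5],
})

# Parse MM/DD/YYYY strings into datetimes.
sample["Issue Date"] = pd.to_datetime(sample["Issue Date"], format="%m/%d/%Y")

weekday_order = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]

# day_name() returns English day names by default; reindex puts them in
# calendar order and fills days with no tickets with 0.
weekday_counts = (
    sample["Issue Date"].dt.day_name()
    .value_counts()
    .reindex(weekday_order, fill_value=0)
)
print(weekday_counts)
```

Keeping the days in calendar order makes the weekday versus weekend pattern easier to read, especially if the result is later plotted.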

