# Parking Violations Issued - Fiscal Year 2022
This dataset covers parking violations issued between July 1, 2021 and June 30, 2022. In New York City, the fiscal year begins on July 1 of one calendar year and ends on June 30 of the following calendar year.

Download the dataset from: https://data.cityofnewyork.us/City-Government/Parking-Violations-Issued-Fiscal-Year-2022/7mxj-7a6y/about_data

In [2]:
import pandas as pd 
import os 

In [8]:
# Load data
df0 = pd.read_parquet(
    "C:\\Users\\hamza\\Documents\\Github\\nyc_parking_violations_2022.parquet"
)

In [9]:
df0.sample(10)

Unnamed: 0,Summons Number,Plate ID,Registration State,Plate Type,Issue Date,Violation Code,Vehicle Body Type,Vehicle Make,Issuing Agency,Street Code1,...,Vehicle Color,Unregistered Vehicle?,Vehicle Year,Meter Number,Feet From Curb,Violation Post Code,Violation Description,No Standing or Stopping Violation,Hydrant Violation,Double Parking Violation
6015184,4757392151,CNE2105,NY,PAS,11/19/2021,36,SUBN,TOYOT,V,0,...,WH,,2015,,0,,PHTO SCHOOL ZN SPEED VIOLATION,,,
1175576,8823898250,JKC1811,NY,PAS,07/09/2021,21,4DSD,HYUN,T,14140,...,GY,,2019,,0,07,21-No Parking (street clean),,,
3719086,8911223049,63312MB,NY,COM,09/13/2021,69,VAN,NISSA,T,34230,...,BL,,2012,102416.0,0,02,69-Fail to Dsp Prking Mtr Rcpt,,,
2059573,4745516459,UGK6503,VA,PAS,08/17/2021,36,4D,HONDA,V,17550,...,,,2020,,0,,PHTO SCHOOL ZN SPEED VIOLATION,,,
716865,8940852400,13542JU,NY,COM,07/12/2021,21,SUBN,CHEVR,T,8590,...,GY,,0,,0,14,21-No Parking (street clean),,,
1085538,4738834496,JPD6000,NY,PAS,07/02/2021,36,SUBN,HONDA,V,0,...,RD,,2019,,0,,PHTO SCHOOL ZN SPEED VIOLATION,,,
4204536,5601044910,KRS2866,NY,PAS,10/07/2021,12,SUBN,TOYOT,V,0,...,GR,,2021,,0,,MOBILE BUS LANE VIOLATION,,,
751185,8947720800,52275MH,NY,COM,06/29/2021,38,VAN,ME/BE,T,15710,...,WH,,2014,114439.0,0,49,38-Failure to Dsplay Meter Rec,,,
7410666,8734748222,DZNL28,FL,PAS,12/09/2021,40,SUBN,MAZDA,T,8390,...,SILVE,,0,,3,T,40-Fire Hydrant,,,
9733788,8867039118,DNG2624,NY,PAS,02/17/2022,20,4DSD,INFIN,T,36320,...,BK,,2011,,0,28,20A-No Parking (Non-COM),,,


# Split data into chunks to accommodate memory limits
- Data file is too large for memory

In [12]:
# Load only the columns needed for the analysis from the Parquet file
df = pd.read_parquet(
    "C:\\Users\\hamza\\Documents\\Github\\nyc_parking_violations_2022.parquet",
    columns=[
        "Registration State", "Violation Description", "Vehicle Body Type", "Issue Date", "Summons Number",
        "Plate Type", "Vehicle Make", "Vehicle Color", "Street Code1", "Vehicle Year"
    ]
)

# Define the number of rows per chunk
chunk_size = 1_000_000
# Slice the DataFrame into consecutive chunks of at most chunk_size rows
chunks = [df.iloc[i:i + chunk_size] for i in range(0, len(df), chunk_size)]

### Save Parquet data as CSV files for analysis in SQL Server Management Studio (SSMS)

In [13]:
# Saving each chunk to a CSV file
for index, chunk in enumerate(chunks):
    chunk.to_csv(f"C:\\Users\\hamza\\Documents\\Github\\nyc_parking_violations_2022_part{index+1}.csv", index=False)
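
The chunk-and-save steps above can also be wrapped in small helpers. The following is only a sketch: `iter_chunks`, `write_chunks`, the `demo` frame, and the `demo` file stem are hypothetical names introduced for illustration, and the output goes to a temporary directory instead of the path used in this notebook. Like the loop above, it assumes the full DataFrame is already in memory.

```python
from pathlib import Path
import tempfile

import pandas as pd


def iter_chunks(frame, chunk_size):
    """Yield consecutive row slices of at most `chunk_size` rows."""
    for start in range(0, len(frame), chunk_size):
        yield frame.iloc[start:start + chunk_size]


def write_chunks(frame, out_dir, stem, chunk_size):
    """Write each chunk to `<stem>_part<N>.csv` and return the file paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, chunk in enumerate(iter_chunks(frame, chunk_size), start=1):
        path = out_dir / f"{stem}_part{number}.csv"
        chunk.to_csv(path, index=False)
        paths.append(path)
    return paths


# Tiny stand-in frame (hypothetical values) so the sketch runs anywhere.
demo = pd.DataFrame({"Summons Number": range(5), "Plate Type": list("ABCDE")})

with tempfile.TemporaryDirectory() as tmp:
    written = write_chunks(demo, tmp, "demo", chunk_size=2)
    part_names = [p.name for p in written]
    row_counts = [len(pd.read_csv(p)) for p in written]
```

Using a generator means only one slice is referenced at a time, and `pathlib.Path` avoids hand-escaping backslashes in Windows paths.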

### Which vehicle body types are most frequently involved in parking violations?

In [None]:
# Group by vehicle body type
body_types = df.groupby(['Vehicle Body Type']).size().reset_index(name='Count')

# Sort by count, most frequent first
body_types_sorted = body_types.sort_values('Count', ascending=False)
body_types_sorted

Unnamed: 0,Vehicle Body Type,Count
792,SUBN,6449007
50,4DSD,4402991
918,VAN,1317899
290,DELV,436430
663,PICK,429798
...,...,...
183,CARY,1
421,ISUZ,1
423,IXMR,1
139,BILB,1
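
As an alternative to the `groupby(...).size()` pattern above, `Series.value_counts()` gives the same frequency ranking in one call and can also report each category's share. This is a minimal sketch on a made-up miniature frame (the `sample` data is hypothetical, reusing body type codes from the output above), not the notebook's actual result.

```python
import pandas as pd

# Hypothetical miniature sample; the codes mirror those in the output above.
sample = pd.DataFrame(
    {"Vehicle Body Type": ["SUBN", "4DSD", "SUBN", "VAN", "SUBN", "4DSD", None]}
)

# value_counts() sorts by frequency (descending) and skips missing values.
counts = sample["Vehicle Body Type"].value_counts()

# normalize=True returns each category's share of the non-missing rows.
shares = sample["Vehicle Body Type"].value_counts(normalize=True)

# Keep only the most frequent categories.
top_two = counts.head(2)
```

On the full dataset, `head(n)` on the result would also hide the long tail of body type codes that appear only once.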


### How do parking violations vary across days of the week?
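The result below shows that far more parking tickets are issued on weekdays than on weekends: each weekday has roughly 2.5 to 2.9 million summonses, compared with about 1.1 million on Saturdays and fewer than 0.5 million on Sundays. This makes intuitive sense.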

In [29]:
weekday_names = {
    0: "Monday",
    1: "Tuesday",
    2: "Wednesday",
    3: "Thursday",
    4: "Friday",
    5: "Saturday",
    6: "Sunday",
}

# Convert Issue Date to Datetime object 
df['Issue Date'] = df['Issue Date'].astype('datetime64[ms]')

# Map each Issue Date to the corresponding Day in the week
df["issue_weekday"] = df["Issue Date"].dt.weekday.map(weekday_names)

# Count summonses per weekday, sorted from fewest to most
df.groupby(['issue_weekday'])["Summons Number"].count().sort_values()

issue_weekday
Sunday        462992
Saturday     1108385
Monday       2488563
Wednesday    2760088
Tuesday      2809949
Friday       2891679
Thursday     2913951
Name: Summons Number, dtype: int64
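
The weekday mapping above can also be done with pandas' built-in `dt.day_name()`, and the counts can be put in calendar order instead of sorted by size. The sketch below is illustrative only: the four dates are taken from the sample rows shown earlier, the `MM/DD/YYYY` format is an assumption based on how those rows are displayed, and `published` simply re-enters the counts printed above to compute the weekend share.

```python
import pandas as pd

CALENDAR_ORDER = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]

# Four issue dates copied from the sample rows (assumed MM/DD/YYYY).
dates = pd.Series(["11/19/2021", "07/09/2021", "09/13/2021", "02/17/2022"])
parsed = pd.to_datetime(dates, format="%m/%d/%Y")

# day_name() returns English weekday names without a manual lookup dict.
day_names = parsed.dt.day_name()

# Count per weekday, then reindex so every day appears in calendar order.
per_day = day_names.value_counts().reindex(CALENDAR_ORDER, fill_value=0)

# Counts as printed in the output above, reordered Monday through Sunday.
published = pd.Series({
    "Sunday": 462992, "Saturday": 1108385, "Monday": 2488563,
    "Wednesday": 2760088, "Tuesday": 2809949, "Friday": 2891679,
    "Thursday": 2913951,
}).reindex(CALENDAR_ORDER)

# Fraction of all summonses issued on Saturday or Sunday.
weekend_share = published[["Saturday", "Sunday"]].sum() / published.sum()
```

With the printed counts, `weekend_share` comes out at roughly one tenth of all summonses, which quantifies the weekday-versus-weekend gap.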