# NY Motor Vehicle Collisions – Exploratory Data Analysis

In this notebook, I analyze New York City's motor vehicle crash data, available at https://data.cityofnewyork.us/Public-Safety/Motor-Vehicle-Collisions-Crashes/h9gi-nx95


The data contains information from all police-reported motor vehicle collisions in NYC. The information for this dataset is collated from the police report, called MV104-AN, which is required to be filled out for collisions where someone is injured or killed, or where there is at least $1,000 worth of damage.

Data is available from 2012-07-01 onwards; for this analysis, however, we limit ourselves to the period up to 2023-08-15, which is when the data was downloaded.

The data contains over 2 million observations, which allows us to explore several aspects of vehicle crashes in New York City's boroughs.

The data dictionary for the data is also available at the URL above.

First, we perform unstructured exploration of the data, and then try to answer the following questions:

- I look for which borough has had the maximum number of crashes reported since 2012.

- I relate the number of crashes to the borough's population to find out which borough has the maximum number of crashes for every 100,000 people. Even though the data does not have this information, we can combine the crash data with the population estimates for the boroughs, also available from the City of New York's website (https://data.cityofnewyork.us/City-Government/New-York-City-Population-by-Borough-1950-2040/xywu-7bv9).

| Borough | Population |
| --- | --- |
| Bronx | 1446788 |
| Brooklyn | 2648452 |
| Manhattan | 1638281 |
| Queens | 2330295 |
| Staten Island | 487155 |  

  
- I look for the leading cause of crashes

- I also look for the top-3 causes of crashes, and try to calculate what proportion of all crashes are caused by these top-3 causes.

- I then look to some of the more serious implications of crashes by examining how many accidents involved at least one fatality.

- I then compute, on average, out of every 1000 accidents, how many have resulted in at least one person dead.

- I also look for missing data and try to compute the proportion of accidents in the data that do not have a Borough code.

- The fields 'VEHICLE TYPE CODE 1' and 'VEHICLE TYPE CODE 2' represent the first two vehicles involved in the accident. I look for which combination of vehicles is involved in the largest number of accidents.


In [3]:
import pandas as pd
import numpy as np
import seaborn as sns
import os
import matplotlib.pyplot as plt
import phik
from IPython.display import Markdown as md

In [4]:
df = pd.read_pickle(r"shared/Motor_Vehicle_Collisions_-_Crashes.pkl")
df

Unnamed: 0,CRASH DATE_CRASH TIME,BOROUGH,ZIP CODE,LATITUDE,LONGITUDE,LOCATION,ON STREET NAME,CROSS STREET NAME,OFF STREET NAME,NUMBER OF PERSONS INJURED,...,CONTRIBUTING FACTOR VEHICLE 2,CONTRIBUTING FACTOR VEHICLE 3,CONTRIBUTING FACTOR VEHICLE 4,CONTRIBUTING FACTOR VEHICLE 5,COLLISION_ID,VEHICLE TYPE CODE 1,VEHICLE TYPE CODE 2,VEHICLE TYPE CODE 3,VEHICLE TYPE CODE 4,VEHICLE TYPE CODE 5
0,2021-09-11 02:39:00,,,,,,WHITESTONE EXPRESSWAY,20 AVENUE,,2.0,...,Unspecified,,,,4455765,Sedan,Sedan,,,
1,2022-03-26 11:45:00,,,,,,QUEENSBORO BRIDGE UPPER,,,1.0,...,,,,,4513547,Sedan,,,,
2,2022-06-29 06:55:00,,,,,,THROGS NECK BRIDGE,,,0.0,...,Unspecified,,,,4541903,Sedan,Pick-up Truck,,,
3,2021-09-11 09:35:00,BROOKLYN,11208.0,40.667202,-73.866500,"(40.667202, -73.8665)",,,1211 LORING AVENUE,0.0,...,,,,,4456314,Sedan,,,,
4,2021-12-14 08:13:00,BROOKLYN,11233.0,40.683304,-73.917274,"(40.683304, -73.917274)",SARATOGA AVENUE,DECATUR STREET,,0.0,...,,,,,4486609,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2018240,2023-07-03 18:05:00,,,40.866806,-73.931010,"(40.866806, -73.93101)",RIVERSIDE DRIVE,,,0.0,...,Unspecified,,,,4648110,Sedan,Sedan,,,
2018241,2023-07-22 21:39:00,BRONX,10457.0,40.844177,-73.902920,"(40.844177, -73.90292)",EAST 174 STREET,WEBSTER AVENUE,,1.0,...,,,,,4648117,Sedan,,,,
2018242,2023-07-02 17:55:00,MANHATTAN,10006.0,40.711033,-74.014540,"(40.711033, -74.01454)",WEST STREET,LIBERTY STREET,,0.0,...,,,,,4648366,Taxi,,,,
2018243,2023-07-22 13:15:00,QUEENS,11433.0,40.691580,-73.793190,"(40.69158, -73.79319)",110 AVENUE,157 STREET,,1.0,...,Driver Inattention/Distraction,,,,4648129,Station Wagon/Sport Utility Vehicle,E-Bike,,,


## 1. Borough with the second highest number of crashes reported since 2012

In [45]:
#list the total number of crashes reported in each borough.
df.BOROUGH.value_counts()

BOROUGH
BROOKLYN         441026
QUEENS           372457
MANHATTAN        313266
BRONX            205345
STATEN ISLAND     58297
Name: count, dtype: int64

In [9]:
# Count crashes per borough, including rows with a missing borough (NaN); value_counts sorts from largest to smallest by default.
df.BOROUGH.value_counts(dropna = False)

BOROUGH
NaN              627854
BROOKLYN         441026
QUEENS           372457
MANHATTAN        313266
BRONX            205345
STATEN ISLAND     58297
Name: count, dtype: int64

In [10]:
# Ignore NaN (this is the default behavior of value_counts)
df.BOROUGH.value_counts(dropna = True)

BOROUGH
BROOKLYN         441026
QUEENS           372457
MANHATTAN        313266
BRONX            205345
STATEN ISLAND     58297
Name: count, dtype: int64
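
Since `value_counts()` returns the boroughs sorted by count, the second-highest borough can also be read off programmatically. The sketch below rebuilds the counts from the output above as a stand-alone Series so that it runs on its own; in the notebook, `borough_counts` would simply be `df.BOROUGH.value_counts()`. The variable names are my own and are only illustrative.

```python
import pandas as pd

# Counts copied from the value_counts() output above; in the notebook this
# would be: borough_counts = df.BOROUGH.value_counts()
borough_counts = pd.Series(
    {
        "BROOKLYN": 441026,
        "QUEENS": 372457,
        "MANHATTAN": 313266,
        "BRONX": 205345,
        "STATEN ISLAND": 58297,
    },
    name="count",
)

# Sort explicitly so the result does not depend on the input order,
# then take the entry at position 1 (the second largest).
ranked = borough_counts.sort_values(ascending=False)
second_highest = ranked.index[1]
print(second_highest, ranked.iloc[1])
```

With the counts shown above, this picks out Queens.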

## 2. Borough with the minimum number of crashes adjusted for population

We relate the number of crashes to the borough's population to find out which borough has the minimum number of crashes for every 100,000 people. Even though the data does not have this information, we can combine the crash data with the population estimates for the boroughs, also available from the City of New York's website (https://data.cityofnewyork.us/City-Government/New-York-City-Population-by-Borough-1950-2040/xywu-7bv9).

| Borough | Population |
| --- | --- |
| Bronx | 1446788 |
| Brooklyn | 2648452 |
| Manhattan | 1638281 |
| Queens | 2330295 |
| Staten Island | 487155 |

In [15]:
#Make a DataFrame for population of each Borough
df_Population = pd.DataFrame(
          {'BOROUGH': ["Bronx", "Brooklyn", "Manhattan","Queens","Staten Island"],
           'Population': [1446788, 2648452, 1638281, 2330295, 487155]})
df_Population

Unnamed: 0,BOROUGH,Population
0,Bronx,1446788
1,Brooklyn,2648452
2,Manhattan,1638281
3,Queens,2330295
4,Staten Island,487155


In [46]:
df.BOROUGH.value_counts()

BOROUGH
BROOKLYN         441026
QUEENS           372457
MANHATTAN        313266
BRONX            205345
STATEN ISLAND     58297
Name: count, dtype: int64

In [48]:
# Make a DataFrame of crash counts for each borough (values copied from the value_counts() output above)

df_Crush = pd.DataFrame(
          {'BOROUGH': ["Bronx", "Brooklyn", "Manhattan","Queens","Staten Island"],
           'Crush': [205345, 441026, 313266, 372457, 58297]})
df_Crush

Unnamed: 0,BOROUGH,Crush
0,Bronx,205345
1,Brooklyn,441026
2,Manhattan,313266
3,Queens,372457
4,Staten Island,58297


In [60]:
# Merge the two DataFrames on the shared BOROUGH column
df_merged = df_Population.merge(df_Crush, how='inner', on='BOROUGH')
df_merged

Unnamed: 0,BOROUGH,Population,Crush
0,Bronx,1446788,205345
1,Brooklyn,2648452,441026
2,Manhattan,1638281,313266
3,Queens,2330295,372457
4,Staten Island,487155,58297


In [63]:
#Create a new column to calculate crashes for every 100,000 people in each Borough
df_merged['crashes for every 100,000 people'] = (df_merged['Crush'] / df_merged['Population']) * 100000
df_merged

Unnamed: 0,BOROUGH,Population,Crush,"crashes for every 100,000 people"
0,Bronx,1446788,205345,14193.164444
1,Brooklyn,2648452,441026,16652.217975
2,Manhattan,1638281,313266,19121.628097
3,Queens,2330295,372457,15983.255339
4,Staten Island,487155,58297,11966.827806


In [77]:
#Sort crashes for every 100,000 people in each Borough from largest to smallest
df_merged.sort_values(['crashes for every 100,000 people'], ascending=[False])

Unnamed: 0,BOROUGH,Population,Crush,"crashes for every 100,000 people"
2,Manhattan,1638281,313266,19121.628097
1,Brooklyn,2648452,441026,16652.217975
3,Queens,2330295,372457,15983.255339
0,Bronx,1446788,205345,14193.164444
4,Staten Island,487155,58297,11966.827806
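
Typing the crash counts in by hand works, but it is easy to mistype a number. As a hedged alternative, the sketch below derives the per-capita rate directly from a counts Series and lets pandas align the boroughs by name. The counts and populations are the ones shown above; in the notebook, `crash_counts` would be `df.BOROUGH.value_counts()`. The variable names are my own.

```python
import pandas as pd

# Crash counts as returned by df.BOROUGH.value_counts() (upper-case labels).
crash_counts = pd.Series(
    {
        "BROOKLYN": 441026,
        "QUEENS": 372457,
        "MANHATTAN": 313266,
        "BRONX": 205345,
        "STATEN ISLAND": 58297,
    }
)

# Population estimates from the table above (title-case labels).
population = pd.Series(
    {
        "Bronx": 1446788,
        "Brooklyn": 2648452,
        "Manhattan": 1638281,
        "Queens": 2330295,
        "Staten Island": 487155,
    }
)

# Convert the upper-case labels to title case so both Series share labels.
crash_counts.index = crash_counts.index.str.title()

# Division aligns the two Series on their index labels.
crashes_per_100k = crash_counts / population * 100_000

lowest = crashes_per_100k.idxmin()
print(crashes_per_100k.sort_values())
print("Lowest rate:", lowest)
```

This reproduces the rates in the table above, with Staten Island at the bottom.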


## 3. Analyzing the leading cause of crashes

In [81]:
# Filter out 'Unspecified' (rows with a missing contributing factor are kept)
filtered_df = df[df['CONTRIBUTING FACTOR VEHICLE 1'] != 'Unspecified']
filtered_df

Unnamed: 0,CRASH DATE_CRASH TIME,BOROUGH,ZIP CODE,LATITUDE,LONGITUDE,LOCATION,ON STREET NAME,CROSS STREET NAME,OFF STREET NAME,NUMBER OF PERSONS INJURED,NUMBER OF PERSONS KILLED,NUMBER OF PEDESTRIANS INJURED,NUMBER OF PEDESTRIANS KILLED,NUMBER OF CYCLIST INJURED,NUMBER OF CYCLIST KILLED,NUMBER OF MOTORIST INJURED,NUMBER OF MOTORIST KILLED,CONTRIBUTING FACTOR VEHICLE 1,CONTRIBUTING FACTOR VEHICLE 2,CONTRIBUTING FACTOR VEHICLE 3,CONTRIBUTING FACTOR VEHICLE 4,CONTRIBUTING FACTOR VEHICLE 5,COLLISION_ID,VEHICLE TYPE CODE 1,VEHICLE TYPE CODE 2,VEHICLE TYPE CODE 3,VEHICLE TYPE CODE 4,VEHICLE TYPE CODE 5
0,2021-09-11 02:39:00,,,,,,WHITESTONE EXPRESSWAY,20 AVENUE,,2.0,0.0,0,0,0,0,2,0,Aggressive Driving/Road Rage,Unspecified,,,,4455765,Sedan,Sedan,,,
1,2022-03-26 11:45:00,,,,,,QUEENSBORO BRIDGE UPPER,,,1.0,0.0,0,0,0,0,1,0,Pavement Slippery,,,,,4513547,Sedan,,,,
2,2022-06-29 06:55:00,,,,,,THROGS NECK BRIDGE,,,0.0,0.0,0,0,0,0,0,0,Following Too Closely,Unspecified,,,,4541903,Sedan,Pick-up Truck,,,
4,2021-12-14 08:13:00,BROOKLYN,11233.0,40.683304,-73.917274,"(40.683304, -73.917274)",SARATOGA AVENUE,DECATUR STREET,,0.0,0.0,0,0,0,0,0,0,,,,,,4486609,,,,,
6,2021-12-14 17:05:00,,,40.709183,-73.956825,"(40.709183, -73.956825)",BROOKLYN QUEENS EXPRESSWAY,,,0.0,0.0,0,0,0,0,0,0,Passing Too Closely,Unspecified,,,,4486555,Sedan,Tractor Truck Diesel,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2018238,2023-07-22 10:40:00,,,,,,CLEARVIEW EXPRESSWAY,NORTHERN BOULEVARD,,3.0,0.0,0,0,0,0,3,0,Following Too Closely,Unspecified,,,,4647804,Station Wagon/Sport Utility Vehicle,Sedan,,,
2018239,2023-06-16 00:00:00,,,40.854310,-73.930090,"(40.85431, -73.93009)",WEST 189 STREET,,,1.0,0.0,1,0,0,0,0,0,Backing Unsafely,,,,,4648255,Station Wagon/Sport Utility Vehicle,,,,
2018240,2023-07-03 18:05:00,,,40.866806,-73.931010,"(40.866806, -73.93101)",RIVERSIDE DRIVE,,,0.0,0.0,0,0,0,0,0,0,Turning Improperly,Unspecified,,,,4648110,Sedan,Sedan,,,
2018242,2023-07-02 17:55:00,MANHATTAN,10006.0,40.711033,-74.014540,"(40.711033, -74.01454)",WEST STREET,LIBERTY STREET,,0.0,0.0,0,0,0,0,0,0,Driver Inattention/Distraction,,,,,4648366,Taxi,,,,


In [83]:
# Find the most common contributing factor
leading_cause = filtered_df['CONTRIBUTING FACTOR VEHICLE 1'].value_counts().idxmax()
leading_cause

'Driver Inattention/Distraction'

In [85]:
# Count of the leading cause
leading_count = filtered_df['CONTRIBUTING FACTOR VEHICLE 1'].value_counts().max()
leading_count

401262

In [86]:
# Calculate the leading cause as a percentage of all rows whose factor is not 'Unspecified'
total_accidents = len(filtered_df)
leading_proportion = (leading_count / total_accidents) * 100
leading_proportion

30.27229539746618
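
One caveat about the denominator: comparing with `!= 'Unspecified'` keeps rows where the contributing factor is missing (row 4 in the filtered output above is one such row), so those rows still count towards `len(filtered_df)`. The sketch below uses a small made-up Series, not the real data, to show the difference between the two denominators. Whether missing factors should be excluded is a judgment call; this is only an illustration.

```python
import pandas as pd

# Made-up sample of contributing factors, including one missing value.
factors = pd.Series(
    [
        "Unspecified",
        None,
        "Driver Inattention/Distraction",
        "Driver Inattention/Distraction",
        "Backing Unsafely",
    ]
)

# The != comparison evaluates to True for missing values, so they are kept.
filtered = factors[factors != "Unspecified"]

# Excluding missing values as well leaves only rows with a known factor.
known = factors[factors.notna() & (factors != "Unspecified")]

# value_counts(normalize=True) ignores missing values and returns shares.
leading_share_known = known.value_counts(normalize=True).iloc[0]

print(len(filtered), len(known), leading_share_known)
```

In this toy sample the leading cause accounts for 2 of 4 rows with the first denominator and 2 of 3 rows with the second.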

## 4. Top 5 causes of crashes 

In [87]:
# Count the top 5 causes ('Unspecified' was already filtered out above)
top_5_causes_counts = filtered_df['CONTRIBUTING FACTOR VEHICLE 1'].value_counts().head(5)
top_5_causes_counts

CONTRIBUTING FACTOR VEHICLE 1
Driver Inattention/Distraction    401262
Failure to Yield Right-of-Way     119166
Following Too Closely             107467
Backing Unsafely                   75042
Other Vehicular                    62688
Name: count, dtype: int64

In [90]:
#Calculate the proportion of the top 5 causes
top_5_proportion = top_5_causes_counts.sum() / len(filtered_df) * 100
top_5_proportion

57.760829990592285

## 5. Total Count of Accidents that Involved Two or More Fatalities

In [91]:
#Filter for accidents with 2 or more fatalities
accidents_with_2_or_more_fatalities = df[df['NUMBER OF PERSONS KILLED'] >= 2]
accidents_with_2_or_more_fatalities

Unnamed: 0,CRASH DATE_CRASH TIME,BOROUGH,ZIP CODE,LATITUDE,LONGITUDE,LOCATION,ON STREET NAME,CROSS STREET NAME,OFF STREET NAME,NUMBER OF PERSONS INJURED,NUMBER OF PERSONS KILLED,NUMBER OF PEDESTRIANS INJURED,NUMBER OF PEDESTRIANS KILLED,NUMBER OF CYCLIST INJURED,NUMBER OF CYCLIST KILLED,NUMBER OF MOTORIST INJURED,NUMBER OF MOTORIST KILLED,CONTRIBUTING FACTOR VEHICLE 1,CONTRIBUTING FACTOR VEHICLE 2,CONTRIBUTING FACTOR VEHICLE 3,CONTRIBUTING FACTOR VEHICLE 4,CONTRIBUTING FACTOR VEHICLE 5,COLLISION_ID,VEHICLE TYPE CODE 1,VEHICLE TYPE CODE 2,VEHICLE TYPE CODE 3,VEHICLE TYPE CODE 4,VEHICLE TYPE CODE 5
4137,2021-12-16 00:15:00,BRONX,10461.0,40.849550,-73.853050,"(40.84955, -73.85305)",MORRIS PARK AVENUE,HAIGHT AVENUE,,0.0,2.0,0,0,0,0,0,2,Unsafe Speed,,,,,4487222,Station Wagon/Sport Utility Vehicle,,,,
22638,2021-05-21 00:00:00,,,40.868080,-73.908580,"(40.86808, -73.90858)",MAJOR DEEGAN EXPRESSWAY,,,0.0,2.0,0,0,0,0,0,2,Unsafe Speed,Unspecified,,,,4419608,Refrigerated Van,Sedan,,,
23322,2021-05-22 04:46:00,,,40.738457,-73.939810,"(40.738457, -73.93981)",BORDEN AVENUE,,,0.0,3.0,0,0,0,0,0,3,Unsafe Speed,,,,,4419561,Sedan,,,,
24991,2021-12-24 09:21:00,MANHATTAN,10065.0,40.762802,-73.965675,"(40.762802, -73.965675)",EAST 61 STREET,3 AVENUE,,0.0,2.0,0,1,0,1,0,0,Unspecified,Unspecified,Unspecified,,,4490224,Box Truck,Station Wagon/Sport Utility Vehicle,E-Bike,,
44643,2021-06-18 10:00:00,,,40.675354,-73.952774,"(40.675354, -73.952774)",ROGERS AVENUE,,,0.0,2.0,0,0,0,0,0,2,Unsafe Lane Changing,Unspecified,,,,4430850,Motorcycle,Dump,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1990142,2023-04-11 00:58:00,,,40.631496,-73.884476,"(40.631496, -73.884476)",BELT PARKWAY,,,0.0,2.0,0,0,0,0,0,2,Driver Inattention/Distraction,Unspecified,,,,4619931,Station Wagon/Sport Utility Vehicle,Sedan,,,
1997299,2023-05-07 07:46:00,,,40.617180,-74.038770,"(40.61718, -74.03877)",SHORE ROAD,,,1.0,3.0,0,0,0,0,1,3,Unsafe Speed,,,,,4627379,Sedan,,,,
2006974,2023-06-05 04:24:00,QUEENS,11420.0,40.680008,-73.823050,"(40.680008, -73.82305)",117 STREET,111 AVENUE,,0.0,2.0,0,0,0,0,0,2,Traffic Control Disregarded,Traffic Control Disregarded,Unspecified,Unspecified,,4635512,Sedan,Sedan,Sedan,Sedan,
2010710,2023-06-24 03:39:00,,,40.666748,-73.764534,"(40.666748, -73.764534)",BELT PARKWAY,,,0.0,2.0,0,0,0,0,0,2,Unsafe Speed,Unspecified,,,,4640445,Sedan,Station Wagon/Sport Utility Vehicle,,,
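
The filtered table lists the matching accidents, but the total count asked for in the heading is just the number of matching rows. A minimal sketch on a made-up column (not the real data) is shown below; in the notebook the same idea would be `len(accidents_with_2_or_more_fatalities)`.

```python
import pandas as pd

# Made-up sample; the missing value is included only to show how it is handled.
toy = pd.DataFrame({"NUMBER OF PERSONS KILLED": [0.0, 1.0, 2.0, 3.0, None]})

# Comparisons with a missing value evaluate to False, so such rows are not counted.
at_least_two = toy["NUMBER OF PERSONS KILLED"] >= 2

# Summing a boolean mask counts the True entries.
count_two_or_more = int(at_least_two.sum())
print(count_two_or_more)
```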


## 6. Accidents with At Least One Fatality per 1000 Accidents

In [98]:
#Count accidents with 1 or more fatalities
More_than_one_Dead = df[df['NUMBER OF PERSONS KILLED'] >= 1]
More_than_one_Dead

Unnamed: 0,CRASH DATE_CRASH TIME,BOROUGH,ZIP CODE,LATITUDE,LONGITUDE,LOCATION,ON STREET NAME,CROSS STREET NAME,OFF STREET NAME,NUMBER OF PERSONS INJURED,NUMBER OF PERSONS KILLED,NUMBER OF PEDESTRIANS INJURED,NUMBER OF PEDESTRIANS KILLED,NUMBER OF CYCLIST INJURED,NUMBER OF CYCLIST KILLED,NUMBER OF MOTORIST INJURED,NUMBER OF MOTORIST KILLED,CONTRIBUTING FACTOR VEHICLE 1,CONTRIBUTING FACTOR VEHICLE 2,CONTRIBUTING FACTOR VEHICLE 3,CONTRIBUTING FACTOR VEHICLE 4,CONTRIBUTING FACTOR VEHICLE 5,COLLISION_ID,VEHICLE TYPE CODE 1,VEHICLE TYPE CODE 2,VEHICLE TYPE CODE 3,VEHICLE TYPE CODE 4,VEHICLE TYPE CODE 5
39,2021-07-09 00:43:00,,,40.720535,-73.888850,"(40.720535, -73.88885)",ELIOT AVENUE,,,0.0,1.0,0,1,0,0,0,0,Unspecified,,,,,4456659,Bus,,,,
148,2021-12-12 09:09:00,,,40.840360,-73.918070,"(40.84036, -73.91807)",JEROME AVENUE,,,0.0,1.0,0,1,0,0,0,0,Unspecified,,,,,4487210,Taxi,,,,
591,2021-04-15 15:18:00,BROOKLYN,11209.0,40.620487,-74.029305,"(40.620487, -74.029305)",4 AVENUE,FOREST PLACE,,0.0,1.0,0,1,0,0,0,0,Driver Inattention/Distraction,,,,,4408063,Station Wagon/Sport Utility Vehicle,,,,
605,2021-04-15 22:36:00,,,,,,Trans- Manhattan Expressway,Amsterdam Avenue,,4.0,1.0,0,0,0,0,4,1,Alcohol Involvement,,,,,4407693,Sedan,,,,
1320,2021-04-17 13:31:00,,,40.782463,-73.978830,"(40.782463, -73.97883)",AMSTERDAM AVENUE,,,0.0,1.0,0,1,0,0,0,0,Unsafe Speed,,,,,4408062,E-Bike,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2016625,2023-07-01 00:27:00,QUEENS,11372.0,40.753536,-73.886900,"(40.753536, -73.8869)",80 STREET,34 AVENUE,,3.0,1.0,0,0,0,0,3,1,Traffic Control Disregarded,Unspecified,,,,4643896,Motorcycle,,,,
2016626,2023-07-09 09:25:00,QUEENS,11103.0,40.764730,-73.912110,"(40.76473, -73.91211)",28 AVENUE,42 STREET,,0.0,1.0,0,0,0,0,0,1,Traffic Control Disregarded,Unspecified,,,,4643897,Station Wagon/Sport Utility Vehicle,Moped,,,
2016774,2023-07-17 22:15:00,BROOKLYN,11236.0,40.632435,-73.888180,"(40.632435, -73.88818)",ROCKAWAY PARKWAY,SKIDMORE AVENUE,,0.0,1.0,0,0,0,0,0,1,Failure to Yield Right-of-Way,Unspecified,,,,4646703,Sedan,Motorcycle,,,
2018044,2023-07-22 11:17:00,QUEENS,11429.0,40.705220,-73.727880,"(40.70522, -73.72788)",112 AVENUE,CROSS ISLAND PARKWAY,,0.0,1.0,0,0,0,0,0,1,Unsafe Speed,,,,,4648067,Station Wagon/Sport Utility Vehicle,,,,


In [97]:
total_accidents = len(df)
total_accidents

2018245

In [100]:
# Calculate the number of fatal accidents per 1000 accidents
proportion_More_than_one_Dead = len(More_than_one_Dead) / total_accidents*1000
proportion_More_than_one_Dead

1.3893258747079764

## 7. Proportion of Accidents in the Data That Do Not Have a Cross Street Name

In [101]:
# Count the number of accidents with a missing (null) 'CROSS STREET NAME'; blank strings are not counted here
missing_cross_street = df['CROSS STREET NAME'].isnull().sum()
missing_cross_street

755532

In [103]:
# Get the total number of accidents
total_accidents = len(df)
total_accidents

2018245

In [107]:
#Calculate the proportion of accidents without a 'Cross Street Name'
proportion_missing_cross_street = (missing_cross_street / total_accidents) * 100
proportion_missing_cross_street

37.4350983156158
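
`isnull()` only catches true missing values. If the column also contained empty or whitespace-only strings, they would not be counted. I have not checked whether the real data contains such values, so the sketch below uses a made-up Series to show how both cases could be counted together.

```python
import pandas as pd

# Made-up sample with one null, one empty string and one whitespace-only string.
cross_street = pd.Series(
    ["DECATUR STREET", None, "", "   ", "20 AVENUE"],
    name="CROSS STREET NAME",
)

# Share of true missing values, as a percentage.
pct_null = cross_street.isna().mean() * 100

# Treat nulls as empty strings, strip whitespace, then test for emptiness.
null_or_blank = cross_street.fillna("").str.strip().eq("")
pct_null_or_blank = null_or_blank.mean() * 100

print(pct_null, pct_null_or_blank)
```

Taking the mean of a boolean mask gives the share of `True` values directly, which avoids a separate division by `len(df)`.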

## 8. Which Combination of Vehicles Has the Most Accidents?

In [109]:
# Group by 'VEHICLE TYPE CODE 1' and 'VEHICLE TYPE CODE 2' and count the number of accidents for each combination
# (rows where either vehicle type is missing are dropped by groupby)
vehicle_combinations = df.groupby(['VEHICLE TYPE CODE 1', 'VEHICLE TYPE CODE 2']).size()

# Sort the counts in descending order to get the combination with the most accidents at the top
most_common_combination = vehicle_combinations.sort_values(ascending=False).head(1)

print(most_common_combination)

VEHICLE TYPE CODE 1  VEHICLE TYPE CODE 2
Sedan                Sedan                  197944
dtype: int64
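
The grouping above treats the pair as ordered and case-sensitive, so (Sedan, Taxi) and (Taxi, Sedan) are counted separately, as are "Sedan" and "SEDAN". If the order of the two vehicles does not matter, the pairs could be normalized first. The sketch below does this on a made-up DataFrame, not the real data; the `" / "` separator and variable names are my own choices.

```python
import pandas as pd

# Made-up sample with mixed case, swapped order and one missing value.
toy = pd.DataFrame(
    {
        "VEHICLE TYPE CODE 1": ["Sedan", "Taxi", "SEDAN", "Sedan", None],
        "VEHICLE TYPE CODE 2": ["Taxi", "Sedan", "taxi", "Sedan", "Sedan"],
    }
)

# Keep only rows where both vehicle types are present.
pairs = toy[["VEHICLE TYPE CODE 1", "VEHICLE TYPE CODE 2"]].dropna()

# Normalize whitespace and case so that 'Sedan' and 'SEDAN' match.
normalized = pairs.apply(lambda col: col.str.strip().str.upper())

# Sort each pair so that (TAXI, SEDAN) and (SEDAN, TAXI) become the same key.
unordered = normalized.apply(lambda row: " / ".join(sorted(row)), axis=1)

pair_counts = unordered.value_counts()
most_common_pair = pair_counts.idxmax()
print(pair_counts)
```

On the full dataset the row-wise `apply` may be slow, so this is meant as an illustration of the idea rather than a tuned implementation.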


## 9. Proportion of Alcohol-Involved Crashes That Resulted in a Fatality

In [21]:
# Show all columns
pd.set_option('display.max_columns', None)
df.head()

Unnamed: 0,CRASH DATE_CRASH TIME,BOROUGH,ZIP CODE,LATITUDE,LONGITUDE,LOCATION,ON STREET NAME,CROSS STREET NAME,OFF STREET NAME,NUMBER OF PERSONS INJURED,NUMBER OF PERSONS KILLED,NUMBER OF PEDESTRIANS INJURED,NUMBER OF PEDESTRIANS KILLED,NUMBER OF CYCLIST INJURED,NUMBER OF CYCLIST KILLED,NUMBER OF MOTORIST INJURED,NUMBER OF MOTORIST KILLED,CONTRIBUTING FACTOR VEHICLE 1,CONTRIBUTING FACTOR VEHICLE 2,CONTRIBUTING FACTOR VEHICLE 3,CONTRIBUTING FACTOR VEHICLE 4,CONTRIBUTING FACTOR VEHICLE 5,COLLISION_ID,VEHICLE TYPE CODE 1,VEHICLE TYPE CODE 2,VEHICLE TYPE CODE 3,VEHICLE TYPE CODE 4,VEHICLE TYPE CODE 5
0,2021-09-11 02:39:00,,,,,,WHITESTONE EXPRESSWAY,20 AVENUE,,2.0,0.0,0,0,0,0,2,0,Aggressive Driving/Road Rage,Unspecified,,,,4455765,Sedan,Sedan,,,
1,2022-03-26 11:45:00,,,,,,QUEENSBORO BRIDGE UPPER,,,1.0,0.0,0,0,0,0,1,0,Pavement Slippery,,,,,4513547,Sedan,,,,
2,2022-06-29 06:55:00,,,,,,THROGS NECK BRIDGE,,,0.0,0.0,0,0,0,0,0,0,Following Too Closely,Unspecified,,,,4541903,Sedan,Pick-up Truck,,,
3,2021-09-11 09:35:00,BROOKLYN,11208.0,40.667202,-73.8665,"(40.667202, -73.8665)",,,1211 LORING AVENUE,0.0,0.0,0,0,0,0,0,0,Unspecified,,,,,4456314,Sedan,,,,
4,2021-12-14 08:13:00,BROOKLYN,11233.0,40.683304,-73.917274,"(40.683304, -73.917274)",SARATOGA AVENUE,DECATUR STREET,,0.0,0.0,0,0,0,0,0,0,,,,,,4486609,,,,,


In [39]:
# Filter for alcohol involvement
alcohol_involvement_df = df[df['CONTRIBUTING FACTOR VEHICLE 1'] == 'Alcohol Involvement']
alcohol_involvement_df

Unnamed: 0,CRASH DATE_CRASH TIME,BOROUGH,ZIP CODE,LATITUDE,LONGITUDE,LOCATION,ON STREET NAME,CROSS STREET NAME,OFF STREET NAME,NUMBER OF PERSONS INJURED,NUMBER OF PERSONS KILLED,NUMBER OF PEDESTRIANS INJURED,NUMBER OF PEDESTRIANS KILLED,NUMBER OF CYCLIST INJURED,NUMBER OF CYCLIST KILLED,NUMBER OF MOTORIST INJURED,NUMBER OF MOTORIST KILLED,CONTRIBUTING FACTOR VEHICLE 1,CONTRIBUTING FACTOR VEHICLE 2,CONTRIBUTING FACTOR VEHICLE 3,CONTRIBUTING FACTOR VEHICLE 4,CONTRIBUTING FACTOR VEHICLE 5,COLLISION_ID,VEHICLE TYPE CODE 1,VEHICLE TYPE CODE 2,VEHICLE TYPE CODE 3,VEHICLE TYPE CODE 4,VEHICLE TYPE CODE 5
45,2022-04-24 21:40:00,BRONX,10452.0,40.843906,-73.924130,"(40.843906, -73.92413)",BOSCOBEL PLACE,UNIVERSITY AVENUE,,0.0,0.0,0,0,0,0,0,0,Alcohol Involvement,Unspecified,,,,4522156,Taxi,Station Wagon/Sport Utility Vehicle,,,
68,2021-12-09 02:45:00,QUEENS,11422.0,40.653023,-73.738950,"(40.653023, -73.73895)",149 AVENUE,HUXLEY STREET,,1.0,0.0,0,0,0,0,1,0,Alcohol Involvement,,,,,4485026,Sedan,,,,
103,2022-03-26 15:45:00,BRONX,10472.0,40.833965,-73.862900,"(40.833965, -73.8629)",WHITE PLAINS ROAD,CROSS BRONX EXPRESSWAY,,2.0,0.0,0,0,0,0,2,0,Alcohol Involvement,Unspecified,,,,4514202,Station Wagon/Sport Utility Vehicle,,,,
192,2022-03-26 11:20:00,,,40.624763,-73.965180,"(40.624763, -73.96518)",CONEY ISLAND AVENUE,,,1.0,0.0,0,0,0,0,1,0,Alcohol Involvement,Driver Inattention/Distraction,,,,4513691,Sedan,,,,
201,2022-03-26 23:00:00,QUEENS,11420.0,40.676304,-73.816284,"(40.676304, -73.816284)",,,115-36 122 STREET,0.0,0.0,0,0,0,0,0,0,Alcohol Involvement,Unspecified,Unspecified,,,4513952,Station Wagon/Sport Utility Vehicle,Station Wagon/Sport Utility Vehicle,Station Wagon/Sport Utility Vehicle,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2018033,2023-07-10 01:00:00,,,40.858910,-73.922966,"(40.85891, -73.922966)",10 AVENUE,,,0.0,0.0,0,0,0,0,0,0,Alcohol Involvement,,,,,4648257,Sedan,,,,
2018070,2023-07-22 20:42:00,,,40.648705,-74.010254,"(40.648705, -74.010254)",4 AVENUE,,,0.0,0.0,0,0,0,0,0,0,Alcohol Involvement,Unspecified,,,,4647823,Pick-up Truck,Van,,,
2018081,2023-07-22 03:38:00,QUEENS,11435.0,40.706284,-73.809150,"(40.706284, -73.80915)",,,87-45 148 STREET,0.0,0.0,0,0,0,0,0,0,Alcohol Involvement,Other Vehicular,,,,4647463,Station Wagon/Sport Utility Vehicle,Station Wagon/Sport Utility Vehicle,,,
2018208,2023-07-22 01:33:00,,,40.822890,-73.955740,"(40.82289, -73.95574)",RIVERSIDE DRIVE,,,0.0,0.0,0,0,0,0,0,0,Alcohol Involvement,Unspecified,,,,4647440,Sedan,Sedan,,,


In [78]:
# Filter the alcohol-involved crashes that had at least one fatality
fatalities = alcohol_involvement_df[alcohol_involvement_df['NUMBER OF PERSONS KILLED'] > 0]
fatalities

Unnamed: 0,CRASH DATE_CRASH TIME,BOROUGH,ZIP CODE,LATITUDE,LONGITUDE,LOCATION,ON STREET NAME,CROSS STREET NAME,OFF STREET NAME,NUMBER OF PERSONS INJURED,NUMBER OF PERSONS KILLED,NUMBER OF PEDESTRIANS INJURED,NUMBER OF PEDESTRIANS KILLED,NUMBER OF CYCLIST INJURED,NUMBER OF CYCLIST KILLED,NUMBER OF MOTORIST INJURED,NUMBER OF MOTORIST KILLED,CONTRIBUTING FACTOR VEHICLE 1,CONTRIBUTING FACTOR VEHICLE 2,CONTRIBUTING FACTOR VEHICLE 3,CONTRIBUTING FACTOR VEHICLE 4,CONTRIBUTING FACTOR VEHICLE 5,COLLISION_ID,VEHICLE TYPE CODE 1,VEHICLE TYPE CODE 2,VEHICLE TYPE CODE 3,VEHICLE TYPE CODE 4,VEHICLE TYPE CODE 5
605,2021-04-15 22:36:00,,,,,,Trans- Manhattan Expressway,Amsterdam Avenue,,4.0,1.0,0,0,0,0,4,1,Alcohol Involvement,,,,,4407693,Sedan,,,,
2358,2021-09-11 03:01:00,,,,,,LIE OUTER ROADWAY (CDR),,,2.0,1.0,0,0,0,0,1,1,Alcohol Involvement,Unspecified,Unspecified,,,4457192,Sedan,Motorcycle,E-Bike,,
4456,2021-04-21 00:55:00,QUEENS,11419.0,40.692135,-73.834850,"(40.692135, -73.83485)",ATLANTIC AVENUE,111 STREET,,3.0,1.0,0,0,0,0,3,1,Alcohol Involvement,Unspecified,Unspecified,,,4409176,Sedan,Pick-up Truck,Station Wagon/Sport Utility Vehicle,,
7494,2021-04-27 01:57:00,,,40.742300,-73.781235,"(40.7423, -73.781235)",LONG ISLAND EXPRESSWAY,,,0.0,1.0,0,1,0,0,0,0,Alcohol Involvement,,,,,4411011,Sedan,,,,
11450,2021-09-19 23:17:00,MANHATTAN,10017.0,40.751892,-73.967600,"(40.751892, -73.9676)",1 AVENUE,EAST 47 STREET,,0.0,1.0,0,0,0,0,0,1,Alcohol Involvement,Unspecified,,,,4458952,Motorscooter,Station Wagon/Sport Utility Vehicle,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1980170,2023-03-03 18:45:00,,,,,,Van Wyck Expwy Service Road,106 AVENUE,,0.0,1.0,0,1,0,0,0,0,Alcohol Involvement,,,,,4609999,Station Wagon/Sport Utility Vehicle,,,,
1991137,2023-04-15 22:45:00,BRONX,10473.0,40.821010,-73.865750,"(40.82101, -73.86575)",SOUND VIEW AVENUE,LAFAYETTE AVENUE,,0.0,1.0,0,1,0,0,0,0,Alcohol Involvement,,,,,4620845,Sedan,,,,
1991461,2023-04-16 21:58:00,BROOKLYN,11201.0,40.690240,-73.994370,"(40.69024, -73.99437)",ATLANTIC AVENUE,CLINTON STREET,,0.0,1.0,0,1,0,0,0,0,Alcohol Involvement,Unspecified,,,,4621295,Sedan,Sedan,,,
2005296,2023-04-16 22:50:00,BROOKLYN,11230.0,40.630096,-73.977060,"(40.630096, -73.97706)",,,976 MC DONALD AVENUE,0.0,1.0,0,0,0,1,0,0,Alcohol Involvement,,,,,4621050,Bike,,,,


In [79]:
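# Average number of persons killed per 100 alcohol-involved crashes
# (this equals the percentage of fatal crashes only if no crash has more than one fatality)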
proportion_fatalities = alcohol_involvement_df['NUMBER OF PERSONS KILLED'].mean() * 100 
proportion_fatalities

0.4756187661618027
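
The mean of 'NUMBER OF PERSONS KILLED' counts deaths, not crashes, so a crash with two fatalities contributes twice. The percentage of crashes with at least one fatality can instead be taken from a boolean mask. The sketch below contrasts the two on a made-up column (not the real data); in the notebook the second quantity would correspond to `len(fatalities) / len(alcohol_involvement_df) * 100`.

```python
import pandas as pd

# Made-up sample: four crashes, one of which has two fatalities.
toy = pd.DataFrame({"NUMBER OF PERSONS KILLED": [0.0, 0.0, 2.0, 0.0]})

# Average number of deaths per 100 crashes.
deaths_per_100_crashes = toy["NUMBER OF PERSONS KILLED"].mean() * 100

# Percentage of crashes with at least one fatality.
pct_fatal_crashes = (toy["NUMBER OF PERSONS KILLED"] > 0).mean() * 100

print(deaths_per_100_crashes, pct_fatal_crashes)
```

The two measures agree only when every fatal crash has exactly one fatality.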

## 10. Proportion of crashes that occur during the evening rush hour, defined as starting at 4 PM and ending before 7 PM

In [80]:
df

Unnamed: 0,CRASH DATE_CRASH TIME,BOROUGH,ZIP CODE,LATITUDE,LONGITUDE,LOCATION,ON STREET NAME,CROSS STREET NAME,OFF STREET NAME,NUMBER OF PERSONS INJURED,NUMBER OF PERSONS KILLED,NUMBER OF PEDESTRIANS INJURED,NUMBER OF PEDESTRIANS KILLED,NUMBER OF CYCLIST INJURED,NUMBER OF CYCLIST KILLED,NUMBER OF MOTORIST INJURED,NUMBER OF MOTORIST KILLED,CONTRIBUTING FACTOR VEHICLE 1,CONTRIBUTING FACTOR VEHICLE 2,CONTRIBUTING FACTOR VEHICLE 3,CONTRIBUTING FACTOR VEHICLE 4,CONTRIBUTING FACTOR VEHICLE 5,COLLISION_ID,VEHICLE TYPE CODE 1,VEHICLE TYPE CODE 2,VEHICLE TYPE CODE 3,VEHICLE TYPE CODE 4,VEHICLE TYPE CODE 5
0,2021-09-11 02:39:00,,,,,,WHITESTONE EXPRESSWAY,20 AVENUE,,2.0,0.0,0,0,0,0,2,0,Aggressive Driving/Road Rage,Unspecified,,,,4455765,Sedan,Sedan,,,
1,2022-03-26 11:45:00,,,,,,QUEENSBORO BRIDGE UPPER,,,1.0,0.0,0,0,0,0,1,0,Pavement Slippery,,,,,4513547,Sedan,,,,
2,2022-06-29 06:55:00,,,,,,THROGS NECK BRIDGE,,,0.0,0.0,0,0,0,0,0,0,Following Too Closely,Unspecified,,,,4541903,Sedan,Pick-up Truck,,,
3,2021-09-11 09:35:00,BROOKLYN,11208.0,40.667202,-73.866500,"(40.667202, -73.8665)",,,1211 LORING AVENUE,0.0,0.0,0,0,0,0,0,0,Unspecified,,,,,4456314,Sedan,,,,
4,2021-12-14 08:13:00,BROOKLYN,11233.0,40.683304,-73.917274,"(40.683304, -73.917274)",SARATOGA AVENUE,DECATUR STREET,,0.0,0.0,0,0,0,0,0,0,,,,,,4486609,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2018240,2023-07-03 18:05:00,,,40.866806,-73.931010,"(40.866806, -73.93101)",RIVERSIDE DRIVE,,,0.0,0.0,0,0,0,0,0,0,Turning Improperly,Unspecified,,,,4648110,Sedan,Sedan,,,
2018241,2023-07-22 21:39:00,BRONX,10457.0,40.844177,-73.902920,"(40.844177, -73.90292)",EAST 174 STREET,WEBSTER AVENUE,,1.0,0.0,1,0,0,0,0,0,Unspecified,,,,,4648117,Sedan,,,,
2018242,2023-07-02 17:55:00,MANHATTAN,10006.0,40.711033,-74.014540,"(40.711033, -74.01454)",WEST STREET,LIBERTY STREET,,0.0,0.0,0,0,0,0,0,0,Driver Inattention/Distraction,,,,,4648366,Taxi,,,,
2018243,2023-07-22 13:15:00,QUEENS,11433.0,40.691580,-73.793190,"(40.69158, -73.79319)",110 AVENUE,157 STREET,,1.0,0.0,0,0,0,0,0,0,Driver Inattention/Distraction,Driver Inattention/Distraction,,,,4648129,Station Wagon/Sport Utility Vehicle,E-Bike,,,


In [89]:
df[(df['CRASH DATE_CRASH TIME'].dt.hour >= 16) & (df['CRASH DATE_CRASH TIME'].dt.hour < 19)]

Unnamed: 0,CRASH DATE_CRASH TIME,BOROUGH,ZIP CODE,LATITUDE,LONGITUDE,LOCATION,ON STREET NAME,CROSS STREET NAME,OFF STREET NAME,NUMBER OF PERSONS INJURED,NUMBER OF PERSONS KILLED,NUMBER OF PEDESTRIANS INJURED,NUMBER OF PEDESTRIANS KILLED,NUMBER OF CYCLIST INJURED,NUMBER OF CYCLIST KILLED,NUMBER OF MOTORIST INJURED,NUMBER OF MOTORIST KILLED,CONTRIBUTING FACTOR VEHICLE 1,CONTRIBUTING FACTOR VEHICLE 2,CONTRIBUTING FACTOR VEHICLE 3,CONTRIBUTING FACTOR VEHICLE 4,CONTRIBUTING FACTOR VEHICLE 5,COLLISION_ID,VEHICLE TYPE CODE 1,VEHICLE TYPE CODE 2,VEHICLE TYPE CODE 3,VEHICLE TYPE CODE 4,VEHICLE TYPE CODE 5
6,2021-12-14 17:05:00,,,40.709183,-73.956825,"(40.709183, -73.956825)",BROOKLYN QUEENS EXPRESSWAY,,,0.0,0.0,0,0,0,0,0,0,Passing Too Closely,Unspecified,,,,4486555,Sedan,Tractor Truck Diesel,,,
11,2021-12-14 16:50:00,QUEENS,11413.0,40.675884,-73.755770,"(40.675884, -73.75577)",SPRINGFIELD BOULEVARD,EAST GATE PLAZA,,0.0,0.0,0,0,0,0,0,0,Turning Improperly,Unspecified,,,,4487127,Sedan,Station Wagon/Sport Utility Vehicle,,,
15,2021-12-14 17:58:00,BROOKLYN,11217.0,40.681580,-73.974630,"(40.68158, -73.97463)",,,480 DEAN STREET,0.0,0.0,0,0,0,0,0,0,Passing Too Closely,Unspecified,,,,4486604,Tanker,Station Wagon/Sport Utility Vehicle,,,
24,2021-12-13 17:40:00,STATEN ISLAND,10301.0,40.631650,-74.087620,"(40.63165, -74.08762)",VICTORY BOULEVARD,WOODSTOCK AVENUE,,1.0,0.0,0,0,0,0,1,0,Unspecified,Unspecified,,,,4487001,Sedan,Sedan,,,
25,2021-12-14 17:31:00,BROOKLYN,11230.0,40.623104,-73.958090,"(40.623104, -73.95809)",EAST 18 STREET,AVENUE K,,1.0,0.0,1,0,0,0,0,0,Unspecified,,,,,4486516,Sedan,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2018219,2023-07-17 16:39:00,BRONX,10451.0,40.816550,-73.919550,"(40.81655, -73.91955)",EAST 149 STREET,COURTLANDT AVENUE,,0.0,0.0,0,0,0,0,0,0,Passing Too Closely,Unspecified,,,,4648282,Sedan,Box Truck,,,
2018223,2023-07-22 16:43:00,BROOKLYN,11225.0,40.655800,-73.962030,"(40.6558, -73.96203)",,,197 OCEAN AVENUE,0.0,0.0,0,0,0,0,0,0,Driver Inattention/Distraction,Passing or Lane Usage Improper,,,,4648050,Station Wagon/Sport Utility Vehicle,Station Wagon/Sport Utility Vehicle,,,
2018236,2023-07-22 16:15:00,,,,,,PELHAM PARKWAY NORTH,STILLWELL AVENUE,,1.0,0.0,0,0,0,0,1,0,Turning Improperly,Unspecified,,,,4647987,Ambulance,Moped,,,
2018240,2023-07-03 18:05:00,,,40.866806,-73.931010,"(40.866806, -73.93101)",RIVERSIDE DRIVE,,,0.0,0.0,0,0,0,0,0,0,Turning Improperly,Unspecified,,,,4648110,Sedan,Sedan,,,


In [98]:
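# 414023 is taken to be the number of rows returned by the rush-hour filter above; 2018245 is len(df)
proportion = 414023/2018245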
proportion

0.20514010935243243
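
Instead of typing the row counts in by hand, the proportion can be computed from the filter itself. The sketch below shows the idea on four made-up timestamps chosen to sit on the boundaries of the rush-hour window; the column name matches the notebook, the other names are my own.

```python
import pandas as pd

# Made-up timestamps around the 4 PM and 7 PM boundaries.
toy = pd.DataFrame(
    {
        "CRASH DATE_CRASH TIME": pd.to_datetime(
            [
                "2023-07-03 15:59:00",
                "2023-07-03 16:00:00",
                "2023-07-03 18:59:00",
                "2023-07-03 19:00:00",
            ]
        )
    }
)

hour = toy["CRASH DATE_CRASH TIME"].dt.hour

# From 16:00 (inclusive) up to, but not including, 19:00.
in_rush_hour = (hour >= 16) & (hour < 19)

rush_hour_count = int(in_rush_hour.sum())
rush_hour_share = in_rush_hour.mean()
print(rush_hour_count, rush_hour_share)
```

Only the 16:00 and 18:59 crashes fall inside the window, so the share in this toy sample is one half.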

## 11. Proportion of crashes involving motorcycles that resulted in injuries but no fatalities


In [52]:
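# Note: str.contains is case-sensitive by default, so this matches 'MOTORCYCLE' but not 'Motorcycle'
motorcycle_crashes = df[(df['VEHICLE TYPE CODE 1'].str.contains("MOTORCYCLE", na=False)) |
                        (df['VEHICLE TYPE CODE 2'].str.contains("MOTORCYCLE", na=False))]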
motorcycle_crashes

Unnamed: 0,CRASH DATE_CRASH TIME,BOROUGH,ZIP CODE,LATITUDE,LONGITUDE,LOCATION,ON STREET NAME,CROSS STREET NAME,OFF STREET NAME,NUMBER OF PERSONS INJURED,NUMBER OF PERSONS KILLED,NUMBER OF PEDESTRIANS INJURED,NUMBER OF PEDESTRIANS KILLED,NUMBER OF CYCLIST INJURED,NUMBER OF CYCLIST KILLED,NUMBER OF MOTORIST INJURED,NUMBER OF MOTORIST KILLED,CONTRIBUTING FACTOR VEHICLE 1,CONTRIBUTING FACTOR VEHICLE 2,CONTRIBUTING FACTOR VEHICLE 3,CONTRIBUTING FACTOR VEHICLE 4,CONTRIBUTING FACTOR VEHICLE 5,COLLISION_ID,VEHICLE TYPE CODE 1,VEHICLE TYPE CODE 2,VEHICLE TYPE CODE 3,VEHICLE TYPE CODE 4,VEHICLE TYPE CODE 5
41575,2021-06-19 16:40:00,BRONX,10451.0,40.822120,-73.911575,"(40.82212, -73.911575)",,,3114 3 AVENUE,1.0,0.0,0,0,0,0,1,0,Driver Inattention/Distraction,Unsafe Speed,,,,4429297,Station Wagon/Sport Utility Vehicle,MOTORCYCLE,,,
42089,2021-06-20 21:55:00,QUEENS,11370.0,40.759525,-73.883255,"(40.759525, -73.883255)",31 AVENUE,85 STREET,,2.0,0.0,0,0,0,0,2,0,Unspecified,Unspecified,,,,4429992,Sedan,MOTORCYCLE,,,
194297,2020-10-23 08:22:00,QUEENS,11106.0,40.759457,-73.935890,"(40.759457, -73.93589)",22 STREET,36 AVENUE,,1.0,0.0,0,0,0,0,1,0,Unsafe Speed,Unspecified,Unspecified,Unspecified,,4360737,Station Wagon/Sport Utility Vehicle,MOTORCYCLE,Station Wagon/Sport Utility Vehicle,Station Wagon/Sport Utility Vehicle,
1128296,2016-04-26 00:22:00,,,,,,CROSS ISLAND PKWY,GRAND CENTRAL PARKWAY,,1.0,0.0,0,0,0,0,1,0,,,,,,3427761,MOTORCYCLE,,,,
1153960,2016-03-12 01:05:00,,,,,,,,,1.0,0.0,0,0,0,0,1,0,Unspecified,Unspecified,,,,3404152,MOTORCYCLE,TAXI,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1924669,2012-07-01 18:35:00,MANHATTAN,10037,40.814774,-73.940369,"(40.8147743, -73.9403692)",LENOX AVENUE,WEST 136 STREET,,0.0,0.0,0,0,0,0,0,0,Unspecified,Unspecified,,,,66342,SPORT UTILITY / STATION WAGON,MOTORCYCLE,,,
1924743,2012-07-08 00:11:00,BRONX,10461,40.846075,-73.833504,"(40.8460746, -73.8335039)",WESTCHESTER AVENUE,PILGRIM AVENUE,,2.0,0.0,0,0,0,0,2,0,Unspecified,Unspecified,,,,88585,PASSENGER VEHICLE,MOTORCYCLE,,,
1924796,2012-07-05 01:00:00,BROOKLYN,11234,40.628168,-73.927866,"(40.6281676, -73.9278657)",UTICA AVENUE,AVENUE J,,0.0,0.0,0,0,0,0,0,0,Unspecified,Unspecified,,,,125701,MOTORCYCLE,UNKNOWN,,,
1924915,2012-07-08 20:30:00,,,,,,BROOKVILLE BOULEVARD,137 AVENUE,,1.0,0.0,0,0,0,0,1,0,Unspecified,Unspecified,,,,219502,MOTORCYCLE,PASSENGER VEHICLE,,,


In [54]:
# Motorcycle crashes with at least one person injured and nobody killed.
injuries_no_fatalities = motorcycle_crashes[(motorcycle_crashes['NUMBER OF PERSONS INJURED'] > 0) &
                                            (motorcycle_crashes['NUMBER OF PERSONS KILLED'] == 0)]
injuries_no_fatalities

Unnamed: 0,CRASH DATE_CRASH TIME,BOROUGH,ZIP CODE,LATITUDE,LONGITUDE,LOCATION,ON STREET NAME,CROSS STREET NAME,OFF STREET NAME,NUMBER OF PERSONS INJURED,NUMBER OF PERSONS KILLED,NUMBER OF PEDESTRIANS INJURED,NUMBER OF PEDESTRIANS KILLED,NUMBER OF CYCLIST INJURED,NUMBER OF CYCLIST KILLED,NUMBER OF MOTORIST INJURED,NUMBER OF MOTORIST KILLED,CONTRIBUTING FACTOR VEHICLE 1,CONTRIBUTING FACTOR VEHICLE 2,CONTRIBUTING FACTOR VEHICLE 3,CONTRIBUTING FACTOR VEHICLE 4,CONTRIBUTING FACTOR VEHICLE 5,COLLISION_ID,VEHICLE TYPE CODE 1,VEHICLE TYPE CODE 2,VEHICLE TYPE CODE 3,VEHICLE TYPE CODE 4,VEHICLE TYPE CODE 5
41575,2021-06-19 16:40:00,BRONX,10451.0,40.822120,-73.911575,"(40.82212, -73.911575)",,,3114 3 AVENUE,1.0,0.0,0,0,0,0,1,0,Driver Inattention/Distraction,Unsafe Speed,,,,4429297,Station Wagon/Sport Utility Vehicle,MOTORCYCLE,,,
42089,2021-06-20 21:55:00,QUEENS,11370.0,40.759525,-73.883255,"(40.759525, -73.883255)",31 AVENUE,85 STREET,,2.0,0.0,0,0,0,0,2,0,Unspecified,Unspecified,,,,4429992,Sedan,MOTORCYCLE,,,
194297,2020-10-23 08:22:00,QUEENS,11106.0,40.759457,-73.935890,"(40.759457, -73.93589)",22 STREET,36 AVENUE,,1.0,0.0,0,0,0,0,1,0,Unsafe Speed,Unspecified,Unspecified,Unspecified,,4360737,Station Wagon/Sport Utility Vehicle,MOTORCYCLE,Station Wagon/Sport Utility Vehicle,Station Wagon/Sport Utility Vehicle,
1128296,2016-04-26 00:22:00,,,,,,CROSS ISLAND PKWY,GRAND CENTRAL PARKWAY,,1.0,0.0,0,0,0,0,1,0,,,,,,3427761,MOTORCYCLE,,,,
1153960,2016-03-12 01:05:00,,,,,,,,,1.0,0.0,0,0,0,0,1,0,Unspecified,Unspecified,,,,3404152,MOTORCYCLE,TAXI,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1924135,2012-07-01 14:52:00,,,,,,,,,1.0,0.0,0,0,0,0,1,0,Failure to Keep Right,,,,,2896729,MOTORCYCLE,,,,
1924284,2012-07-07 05:10:00,,,,,,FARMERS BOULEVARD,140 AVENUE,,1.0,0.0,0,0,0,0,1,0,Unspecified,Unspecified,,,,268146,PASSENGER VEHICLE,MOTORCYCLE,,,
1924395,2012-07-02 01:00:00,,,,,,,,,4.0,0.0,0,0,0,0,4,0,Unspecified,Unspecified,,,,2918660,PASSENGER VEHICLE,MOTORCYCLE,,,
1924743,2012-07-08 00:11:00,BRONX,10461,40.846075,-73.833504,"(40.8460746, -73.8335039)",WESTCHESTER AVENUE,PILGRIM AVENUE,,2.0,0.0,0,0,0,0,2,0,Unspecified,Unspecified,,,,88585,PASSENGER VEHICLE,MOTORCYCLE,,,


In [56]:
# Share of motorcycle crashes that caused injuries but no deaths.
proportion = len(injuries_no_fatalities) / len(motorcycle_crashes)
proportion

0.5004565018912221
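
The filter above is case-sensitive and only inspects the first two vehicle columns, while the outputs in this notebook show vehicle types in mixed case (for example `Sedan` next to `MOTORCYCLE`). The sketch below shows one possible variant that matches case-insensitively across every `VEHICLE TYPE CODE` column. The `toy` frame and the `vehicle_mask` helper are hypothetical and exist only for illustration. On the real data this variant could return a different proportion from the one reported above.

```python
import pandas as pd

# Hypothetical toy data with made-up values, standing in for `df`.
toy = pd.DataFrame({
    "VEHICLE TYPE CODE 1": ["MOTORCYCLE", "Sedan", "Motorcycle", "Sedan", None],
    "VEHICLE TYPE CODE 2": [None, "MOTORCYCLE", "TAXI", "Sedan", "Bike"],
    "VEHICLE TYPE CODE 3": [None, None, None, "motorcycle", None],
    "NUMBER OF PERSONS INJURED": [1.0, 0.0, 2.0, 1.0, 0.0],
    "NUMBER OF PERSONS KILLED": [0.0, 0.0, 1.0, 0.0, 0.0],
})


def vehicle_mask(frame, keyword):
    """Return True for rows where any vehicle-type column contains keyword.

    Hypothetical helper: the match ignores case and treats missing
    values as no match.
    """
    type_cols = [c for c in frame.columns if c.startswith("VEHICLE TYPE CODE")]
    mask = pd.Series(False, index=frame.index)
    for col in type_cols:
        mask |= frame[col].str.contains(keyword, case=False, na=False, regex=False)
    return mask


motorcycle_toy = toy[vehicle_mask(toy, "motorcycle")]

# At least one person injured and nobody killed.
injured_not_killed = motorcycle_toy[
    (motorcycle_toy["NUMBER OF PERSONS INJURED"] > 0)
    & (motorcycle_toy["NUMBER OF PERSONS KILLED"] == 0)
]

toy_proportion = len(injured_not_killed) / len(motorcycle_toy)
print(len(motorcycle_toy), len(injured_not_killed), toy_proportion)
```

In the toy frame, four of the five rows involve a motorcycle and two of those have injuries without a fatality, so the share is 0.5.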

## 12. Crashes involving bicycles as one of the vehicles

In [62]:
# Crashes where vehicle 1 or vehicle 2 is recorded exactly as 'BICYCLE'.
bicycle_crashes = df[(df['VEHICLE TYPE CODE 1'] == 'BICYCLE') | (df['VEHICLE TYPE CODE 2'] == 'BICYCLE')]
bicycle_crashes

Unnamed: 0,CRASH DATE_CRASH TIME,BOROUGH,ZIP CODE,LATITUDE,LONGITUDE,LOCATION,ON STREET NAME,CROSS STREET NAME,OFF STREET NAME,NUMBER OF PERSONS INJURED,NUMBER OF PERSONS KILLED,NUMBER OF PEDESTRIANS INJURED,NUMBER OF PEDESTRIANS KILLED,NUMBER OF CYCLIST INJURED,NUMBER OF CYCLIST KILLED,NUMBER OF MOTORIST INJURED,NUMBER OF MOTORIST KILLED,CONTRIBUTING FACTOR VEHICLE 1,CONTRIBUTING FACTOR VEHICLE 2,CONTRIBUTING FACTOR VEHICLE 3,CONTRIBUTING FACTOR VEHICLE 4,CONTRIBUTING FACTOR VEHICLE 5,COLLISION_ID,VEHICLE TYPE CODE 1,VEHICLE TYPE CODE 2,VEHICLE TYPE CODE 3,VEHICLE TYPE CODE 4,VEHICLE TYPE CODE 5
218167,2020-10-06 16:43:00,QUEENS,11423.0,40.729210,-73.781166,"(40.72921, -73.781166)",188 STREET,UNION TURNPIKE,,1.0,0.0,1,0,0,0,0,0,Reaction to Uninvolved Vehicle,,,,,4355439,BICYCLE,,,,
1077740,2016-07-07 08:07:00,QUEENS,11373.0,,,,BROADWAY,BAXTER AVENUE,,0.0,1.0,0,0,0,1,0,0,Pedestrian/Bicyclist/Other Pedestrian Error/Co...,Unspecified,Unspecified,,,3485897,BICYCLE,PASSENGER VEHICLE,BICYCLE,,
1092878,2016-06-17 16:06:00,BROOKLYN,11203.0,,,,UTICA AVENUE,RUTLAND ROAD,,1.0,0.0,0,0,1,0,0,0,Unspecified,Unspecified,Unspecified,Unspecified,,3470666,BICYCLE,BICYCLE,,,
1093258,2016-06-18 03:40:00,QUEENS,11105.0,40.768888,-73.906908,"(40.7688877, -73.9069078)",SOUND STREET,ASTORIA BLVD NORTH,,0.0,0.0,0,0,0,0,0,0,Unspecified,Unspecified,,,,3463912,BICYCLE,PASSENGER VEHICLE,,,
1144089,2016-04-05 20:27:00,,,,,,FLATBUSH AVENUE,LINCOLN ROAD,,0.0,0.0,0,0,0,0,0,0,Unspecified,Unspecified,,,,3417759,PASSENGER VEHICLE,BICYCLE,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1924860,2012-07-09 17:37:00,BROOKLYN,11222,40.720601,-73.954754,"(40.7206006, -73.9547539)",BEDFORD AVENUE,NORTH 12 STREET,,1.0,0.0,0,0,1,0,0,0,Passenger Distraction,Unspecified,,,,198192,TAXI,BICYCLE,,,
1924881,2012-07-02 09:46:00,MANHATTAN,10002,40.717724,-73.985765,"(40.7177239, -73.9857652)",DELANCEY STREET,CLINTON STREET,,1.0,0.0,0,0,1,0,0,0,Outside Car Distraction,Unspecified,,,,12187,PICK-UP TRUCK,BICYCLE,,,
1924949,2012-07-07 18:40:00,,,40.867335,-73.822707,"(40.8673349, -73.8227066)",,,,1.0,0.0,0,0,1,0,0,0,Unspecified,Unspecified,,,,2912116,PASSENGER VEHICLE,BICYCLE,,,
1924950,2012-07-06 13:33:00,BROOKLYN,11209,40.625780,-74.024154,"(40.6257805, -74.0241544)",5 AVENUE,80 STREET,,1.0,0.0,0,0,1,0,0,0,Other Vehicular,Unspecified,,,,140835,PASSENGER VEHICLE,BICYCLE,,,
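
The equality test above only picks up rows spelled exactly `BICYCLE`. Because vehicle types are free text, other spellings may exist in the data. The sketch below normalizes the labels before comparing them against a small set of accepted spellings. The `toy` frame, the `normalized` helper, and the contents of `BICYCLE_LABELS` are assumptions for illustration. Running `value_counts()` on the real columns would show which labels actually occur.

```python
import pandas as pd

# Hypothetical toy data with made-up values, standing in for `df`.
toy = pd.DataFrame({
    "VEHICLE TYPE CODE 1": ["BICYCLE", "Sedan", " bicycle ", "TAXI"],
    "VEHICLE TYPE CODE 2": [None, "Bike", "Sedan", "Sedan"],
})

# Assumed set of spellings to treat as a bicycle; adjust after inspecting the data.
BICYCLE_LABELS = {"BICYCLE", "BIKE"}


def normalized(col):
    """Strip surrounding whitespace and upper-case a text column."""
    return col.str.strip().str.upper()


is_bicycle = (
    normalized(toy["VEHICLE TYPE CODE 1"]).isin(BICYCLE_LABELS)
    | normalized(toy["VEHICLE TYPE CODE 2"]).isin(BICYCLE_LABELS)
)
bicycle_toy = toy[is_bicycle]

# For comparison: the exact-match filter used in the cell above.
exact_only = toy[
    (toy["VEHICLE TYPE CODE 1"] == "BICYCLE")
    | (toy["VEHICLE TYPE CODE 2"] == "BICYCLE")
]
print(len(bicycle_toy), len(exact_only))
```

On the toy frame the normalized filter keeps three rows, while the exact match keeps one.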
