# **Comprehensive Analysis of Motor Vehicle Collisions: Injury Patterns, Causes, and Trends**

## **1. Introduction**

### 1.1. Purpose
The purpose of this report is to conduct a thorough analysis of motor vehicle collision data to uncover patterns and common factors associated with injuries, identify the primary causes of collisions, understand vehicle involvement, and examine monthly trends. Additionally, the report aims to generate a collision heatmap to visualize collision hotspots. The insights derived from this analysis can help in formulating effective strategies to enhance road safety and reduce the frequency and severity of collisions.

### 1.2. About the Dataset
- **Crash Date**: The date on which the collision occurred.
- **Crash Time**: The time at which the collision occurred.
- **Borough**: The borough in which the collision occurred.
- **Zip Code**: The ZIP code of the location where the collision occurred.
- **Latitude**: The latitude coordinate of the collision location.
- **Longitude**: The longitude coordinate of the collision location.
- **Location**: The combined latitude and longitude coordinates of the collision location.
- **On Street Name**: The name of the street where the collision occurred.
- **Cross Street Name**: The name of the cross street near the collision location.
- **Off Street Name**: The name of an off-street near the collision location.
- **Number of Persons Injured**: The number of people injured in the collision.
- **Contributing Factor Vehicle 1**: The primary contributing factor for the collision from the perspective of the first vehicle.
- **Collision ID**: A unique identifier for the collision incident.
- **Vehicle Type Code 1**: The type of the first vehicle involved in the collision.

## **2. Descriptive Analysis**

### 2.1. Number of Persons Injured

| Statistic | Value          |
|:----------|:---------------|
| count     | 293,256 |
| mean      | 0.45      |
| std       | 0.78       |
| min       | 0.00       |
| 25%       | 0.00      |
| 50%       | 0.00     |
| 75%       | 1.00       |
| max       | 40.00      |

**Interpretation**: Across 293,256 recorded collisions, the mean number of persons injured is 0.45, with a standard deviation of 0.78. The median of 0.00 shows that at least half of the collisions involved no injuries, and the 75th percentile of 1.00 shows that three quarters of the collisions injured at most one person. The maximum of 40 persons injured in a single collision points to rare but extreme cases.
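
To make the quartile reading concrete, the following minimal sketch recomputes the interquartile range from the quartiles reported in the table above and compares the maximum against an upper fence. The 1.5 × IQR fence is a common convention assumed here for illustration; the original analysis does not define an outlier threshold.

```python
# Quartiles and maximum taken from the descriptive statistics table in section 2.1.
q1 = 0.0           # 25th percentile
q3 = 1.0           # 75th percentile
max_injured = 40.0

# Interquartile range: the spread of the middle 50% of collisions.
iqr = q3 - q1

# Upper fence under the 1.5 * IQR convention (an assumption for illustration,
# not a threshold used in the report).
upper_fence = q3 + 1.5 * iqr

print(f"IQR = {iqr}")
print(f"Upper fence = {upper_fence}")
print(f"Maximum exceeds fence: {max_injured > upper_fence}")
```

Under this convention, any collision with three or more persons injured would count as unusually severe for this dataset.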

### 2.2. Contributing Factor Vehicle 1

| Contributing Factor Vehicle 1                                      | Count  |
|:-------------------------------------------------------------------|:-------|
| Unspecified                                                        | 74,120 |
| Driver Inattention/Distraction                                     | 72,368 |
| Failure to Yield Right-of-Way                                      | 19,494 |
| Following Too Closely                                              | 19,486 |
| Passing or Lane Usage Improper                                     | 12,388 |
| Passing Too Closely                                                | 10,994 |
| Backing Unsafely                                                   | 9,752  |
| Unsafe Speed                                                       | 9,550  |
| Other Vehicular                                                    | 8,166  |
| Traffic Control Disregarded                                        | 7,601  |
| Unsafe Lane Changing                                               | 6,371  |
| Turning Improperly                                                 | 6,129  |
| Driver Inexperience                                                | 5,392  |
| Alcohol Involvement                                                | 4,484  |
| Reaction to Uninvolved Vehicle                                     | 4,272  |
| Pavement Slippery                                                  | 2,590  |
| View Obstructed/Limited                                            | 2,547  |
| Pedestrian/Bicyclist/Other Pedestrian Error/Confusion              | 2,529  |
| Aggressive Driving/Road Rage                                       | 2,237  |
| Oversized Vehicle                                                  | 1,364  |
| Brakes Defective                                                   | 1,208  |
| Fell Asleep                                                        | 1,149  |
| Passenger Distraction                                              | 716    |
| Steering Failure                                                   | 704    |
| Obstruction/Debris                                                 | 667    |
| Outside Car Distraction                                            | 602    |
| Lost Consciousness                                                 | 525    |
| Tire Failure/Inadequate                                            | 505    |
| Glare                                                              | 485    |
| Illness                                                            | 480    |
| Pavement Defective                                                 | 367    |
| Fatigued/Drowsy                                                    | 358    |
| Failure to Keep Right                                              | 331    |
| Driverless/Runaway Vehicle                                         | 291    |
| Drugs (illegal)                                                    | 273    |
| Animals Action                                                     | 270    |
| Accelerator Defective                                              | 205    |
| Cell Phone (hand-Held)                                             | 136    |
| Physical Disability                                                | 136    |
| Traffic Control Device Improper/Non-Working                        | 133    |
| Lane Marking Improper/Inadequate                                   | 93     |
| Tinted Windows                                                     | 58     |
| Using On Board Navigation Device                                   | 36     |
| Other Lighting Defects                                             | 34     |
| Vehicle Vandalism                                                  | 34     |
| Headlights Defective                                               | 33     |
| Prescription Medication                                            | 33     |
| Tow Hitch Defective                                                | 32     |
| Eating or Drinking                                                 | 29     |
| Other Electronic Device                                            | 28     |
| Shoulders Defective/Improper                                       | 14     |
| Cell Phone (hands-free)                                            | 12     |
| Texting                                                            | 11     |
| Windshield Inadequate                                              | 6      |
| Listening/Using Headphones                                         | 6      |

**Interpretation**: The two largest categories are "Unspecified" (74,120) and "Driver Inattention/Distraction" (72,368). Because "Unspecified" records no cause, driver inattention or distraction is the most common identified factor. Other notable factors include "Failure to Yield Right-of-Way" (19,494), "Following Too Closely" (19,486), and "Passing or Lane Usage Improper" (12,388). These categories mark the areas that most need attention to improve road safety.

### 2.3. Vehicle Type Code

| Vehicle Type Code 1                     | Count  |
|:----------------------------------------|-------:|
| Sedan                                   | 138,139|
| Station Wagon/Sport Utility Vehicle     | 103,652|
| Taxi                                    | 8,334  |
| Pick-up Truck                           | 6,391  |
| Box Truck                               | 5,134  |
| MOTORHOME                               | 1      |
| MINI SCHOO                              | 1      |
| dump                                    | 1      |
| Gas Scoote                              | 1      |
| City                                    | 1      |

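**Interpretation**: The majority of collisions involve sedans (138,139) and station wagons/sport utility vehicles (103,652). Other vehicle types involved in collisions include taxis (8,334), pick-up trucks (6,391), and box trucks (5,134). The table lists the five most frequent labels followed by five labels that occur only once, such as motorhomes, mini school buses, and gas scooters. Several of these rare labels look like truncated or inconsistently cased free-text entries (for example "MINI SCHOO" and "Gas Scoote"), so the same vehicle type may be spread across more than one label.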
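
The mixed casing in the labels above (for example "MOTORHOME" and "dump") suggests that a normalization step could merge variants before counting. The sketch below is one possible approach, not the method used in this report: the sample labels and the `normalize_label` helper are hypothetical, and only whitespace and casing are handled.

```python
from collections import Counter

# Hypothetical raw labels that mimic the mixed casing seen in the table above.
raw_labels = ["Sedan", "SEDAN", " sedan ", "Taxi", "taxi", "dump", "Dump"]


def normalize_label(label: str) -> str:
    """Trim surrounding whitespace and apply title case so case variants collapse."""
    return label.strip().title()


# Count collisions per normalized label.
normalized_counts = Counter(normalize_label(label) for label in raw_labels)

for label, count in normalized_counts.most_common():
    print(f"{label}: {count}")
```

Title-casing also changes some legitimate labels (for instance, "Pick-up Truck" would become "Pick-Up Truck"), and it cannot repair truncated entries such as "MINI SCHOO", so a real cleanup would likely need an explicit mapping as well.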

## **3. Injury Analysis**

### 3.1. Injury Count Analysis

| Number of Persons Injured | Number of Collisions |
|:--------------------------|---------------------:|
| 0.0                       | 195,639              |
| 1.0                       | 75,931               |
| 2.0                       | 14,535               |
| 3.0                       | 4,526                |
| 4.0                       | 1,638                |
| 5.0                       | 596                  |
| 6.0                       | 202                  |
| 7.0                       | 103                  |
| 8.0                       | 36                   |
| 9.0                       | 19                   |
| 10.0                      | 13                   |
| 11.0                      | 4                    |
| 12.0                      | 1                    |
| 13.0                      | 3                    |
| 14.0                      | 2                    |
| 15.0                      | 4                    |
| 16.0                      | 1                    |
| 17.0                      | 1                    |
| 18.0                      | 1                    |
| 40.0                      | 1                    |

**Interpretation**: The injury count analysis indicates that most collisions (195,639, about 66.7% of the 293,256 total) result in no injuries. Collisions with one injury account for 75,931 incidents, and collisions with two injuries account for 14,535 incidents. Collisions with higher injury counts are progressively rarer, and a single collision with 40 injuries stands out as a rare but severe incident.
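
As a consistency check, the mean reported in section 2.1 can be recomputed from this frequency table. The sketch below copies the counts from the table above into a plain dictionary (the variable names are illustrative) and derives the total number of collisions, the weighted mean, and the share of collisions without injuries.

```python
# Injury distribution copied from the table in section 3.1:
# {number of persons injured: number of collisions}
injury_distribution = {
    0: 195_639, 1: 75_931, 2: 14_535, 3: 4_526, 4: 1_638, 5: 596,
    6: 202, 7: 103, 8: 36, 9: 19, 10: 13, 11: 4, 12: 1, 13: 3,
    14: 2, 15: 4, 16: 1, 17: 1, 18: 1, 40: 1,
}

# Total collisions and total persons injured across all collisions.
total_collisions = sum(injury_distribution.values())
total_injured = sum(n * count for n, count in injury_distribution.items())

# Weighted mean: persons injured per collision.
mean_injured = total_injured / total_collisions

# Share of collisions in which nobody was injured.
share_no_injury = injury_distribution[0] / total_collisions

print(f"Total collisions: {total_collisions:,}")
print(f"Mean persons injured: {mean_injured:.2f}")
print(f"Share with no injuries: {share_no_injury:.1%}")
```

The total matches the count of 293,256 in section 2.1, and the weighted mean rounds to the reported 0.45.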

## **4. Common Causes Analysis**

### 4.1. Most Common Contributing Factors for Collisions

| Contributing Factor                  | Number of Collisions |
|:-------------------------------------|---------------------:|
| Unspecified                          | 74,120               |
| Driver Inattention/Distraction       | 72,368               |
| Failure to Yield Right-of-Way        | 19,494               |
| Following Too Closely                | 19,486               |
| Passing or Lane Usage Improper       | 12,388               |
| Passing Too Closely                  | 10,994               |
| Backing Unsafely                     | 9,752                |
| Unsafe Speed                         | 9,550                |
| Other Vehicular                      | 8,166                |
| Traffic Control Disregarded          | 7,601                |

**Interpretation**: "Unspecified" and "Driver Inattention/Distraction" are the two largest categories. Since "Unspecified" does not name a cause, driver inattention or distraction is the leading identified factor by a wide margin. Other significant factors include "Failure to Yield Right-of-Way," "Following Too Closely," and "Passing or Lane Usage Improper." Addressing these contributing factors can potentially reduce the number of collisions.
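
Because "Unspecified" names no cause, it can help to look at shares among the specified factors only. The sketch below does this for the ten categories in the table above; restricting the denominator to these top-ten rows is a simplifying assumption for illustration, so the shares are not shares of all collisions.

```python
# Top ten contributing factors copied from the table in section 4.1.
top_factors = {
    "Unspecified": 74_120,
    "Driver Inattention/Distraction": 72_368,
    "Failure to Yield Right-of-Way": 19_494,
    "Following Too Closely": 19_486,
    "Passing or Lane Usage Improper": 12_388,
    "Passing Too Closely": 10_994,
    "Backing Unsafely": 9_752,
    "Unsafe Speed": 9_550,
    "Other Vehicular": 8_166,
    "Traffic Control Disregarded": 7_601,
}

# Drop the category that does not identify a cause.
specified = {k: v for k, v in top_factors.items() if k != "Unspecified"}
specified_total = sum(specified.values())

# Share of each factor among the specified top-ten factors.
shares = {k: v / specified_total for k, v in specified.items()}
leading_factor = max(specified, key=specified.get)

for factor, share in shares.items():
    print(f"{factor}: {share:.1%}")
```

Within this subset, driver inattention or distraction accounts for roughly 43% of collisions, more than three times the share of the next factor.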

## **5. Vehicle Involvement Analysis**

### 5.1. Most Common Vehicle Types Involved in Collisions

| Vehicle Type                        | Number of Collisions |
|:------------------------------------|---------------------:|
| Sedan                               | 138,139              |
| Station Wagon/Sport Utility Vehicle | 103,652              |
| Taxi                                | 8,334                |
| Pick-up Truck                       | 6,391                |
| Box Truck                           | 5,134                |
| Bus                                 | 4,447                |
| Bike                                | 4,055                |
| Motorcycle                          | 2,279                |
| Tractor Truck Diesel                | 2,248                |
| E-Bike                              | 1,840                |

**Interpretation**: Sedans and station wagons/sport utility vehicles are the most frequently involved in collisions, accounting for 138,139 and 103,652 collisions respectively.

## **6. Vehicle Type by Severity Analysis**

### 6.1. Vehicle Type by Severity Distribution

![image.png](attachment:506c79dd-c0c9-4868-845a-42d213d28366.png)

**Interpretation**: Sedans and sport utility vehicles (SUVs) account for the most injuries, with 60,685 and 44,454 respectively, likely because they are the most common vehicles on the road. Vulnerable road users on bicycles, motorcycles, and e-bikes have high injury counts relative to the number of collisions they are involved in, with 3,662, 1,745, and 1,741 injuries respectively, which highlights their increased risk in traffic and the need for enhanced protective measures. Commercial vehicles, including taxis, pick-up trucks, and box trucks, also contribute significantly to injury counts, with 4,787, 2,332, and 1,218 injuries respectively, indicating a need for targeted safety interventions. E-scooters, an emerging vehicle type, are involved in a notable number of injury-causing collisions, with 1,217 injuries recorded, indicating potential safety issues and growing exposure to traffic risks. These findings point to the need for stronger road safety measures, particularly for the most commonly involved vehicle types and for vulnerable road users.
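
The comparison of injuries relative to collision counts can be made explicit by dividing the injury totals quoted above by the collision counts from section 5.1. The sketch below does this for the eight vehicle types that appear in both places; the dictionary names are illustrative. Note that the ratio is persons injured per collision in which the type is listed as vehicle 1, so it does not say who was injured.

```python
# Collision counts from the table in section 5.1.
collisions = {
    "Sedan": 138_139,
    "Station Wagon/Sport Utility Vehicle": 103_652,
    "Taxi": 8_334,
    "Pick-up Truck": 6_391,
    "Box Truck": 5_134,
    "Bike": 4_055,
    "Motorcycle": 2_279,
    "E-Bike": 1_840,
}

# Injury totals quoted in the interpretation in section 6.1.
injuries = {
    "Sedan": 60_685,
    "Station Wagon/Sport Utility Vehicle": 44_454,
    "Taxi": 4_787,
    "Pick-up Truck": 2_332,
    "Box Truck": 1_218,
    "Bike": 3_662,
    "Motorcycle": 1_745,
    "E-Bike": 1_741,
}

# Persons injured per collision for each vehicle type.
rates = {vehicle: injuries[vehicle] / collisions[vehicle] for vehicle in collisions}

# Print from the highest to the lowest rate.
for vehicle, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{vehicle}: {rate:.2f} injured per collision")
```

On these figures, e-bikes, bikes, and motorcycles each average more than 0.75 persons injured per collision, compared with about 0.44 for sedans.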

## **7. Collision Analysis**

### 7.1. Collision Trend Analysis by Month and Year

![image.png](attachment:48c15750-9e72-4867-89b4-92b3a5319fbd.png)

**Interpretation**: The trend analysis of collisions by month and year reveals several significant patterns. In January 2020, collisions peaked at 14,366, followed by a sharp decline to 4,128 by April 2020, likely due to the COVID-19 pandemic and subsequent lockdowns reducing traffic volumes. A gradual recovery is observed from mid-2020, reaching around 10,000 by early 2021 as restrictions eased. The data shows fluctuations in mid-2021, peaking at 10,608 in June 2021, possibly due to varying lockdown measures and seasonal travel changes. A dramatic decline occurs around July 2022, with collisions dropping to 658 and remaining very low through early 2023. This could reflect new traffic regulations or safety measures, but a drop of this size is large enough that incomplete records in the dataset should be ruled out before drawing that conclusion. Throughout 2023 and into early 2024, the number of collisions remains consistently low.
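
The size of the two declines can be expressed as percentage changes. The sketch below uses only the monthly counts quoted in the paragraph above; the `percent_change` helper is illustrative, and the second comparison is against the June 2021 peak rather than the preceding month, because the June 2022 count is not given in the text.

```python
def percent_change(old: int, new: int) -> float:
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100


# Monthly collision counts quoted in the interpretation in section 7.1.
jan_2020, apr_2020 = 14_366, 4_128
jun_2021, jul_2022 = 10_608, 658

# Decline during the first months of the COVID-19 pandemic.
covid_drop = percent_change(jan_2020, apr_2020)

# Decline from the mid-2021 peak to July 2022.
later_drop = percent_change(jun_2021, jul_2022)

print(f"Jan 2020 to Apr 2020: {covid_drop:.1f}%")
print(f"Jun 2021 to Jul 2022: {later_drop:.1f}%")
```

The second decline, close to 94%, is considerably steeper than the roughly 71% drop seen at the start of the pandemic.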

### 7.2. Collision Heatmap by Location

![image.png](attachment:20cd3a7d-c828-4cdd-83ea-ae6a04b75637.png)

## **8. Conclusion**

### 8.1. Summary of Findings

This report provides a comprehensive analysis of collision data, revealing critical insights into the factors contributing to collisions and their impact on public safety. The analysis of the number of persons injured indicates that while most collisions do not result in injuries, there are significant outliers with severe injury counts, highlighting the variability in collision severity. Driver inattention and distraction emerge as the leading identified cause of collisions, second in count only to the "Unspecified" category. The data also shows that sedans and SUVs are the most frequently involved vehicle types, reflecting their widespread use. Vulnerable road users like cyclists and motorcyclists experience high injury counts relative to the number of collisions they are involved in, underscoring the need for enhanced protective measures.

The trend analysis reveals a dramatic decrease in collisions during the early months of the COVID-19 pandemic, followed by fluctuations and a gradual recovery as restrictions eased. A sharp decline in recorded collisions from July 2022 onward could reflect new traffic regulations or safety measures, although the data alone does not establish the cause and incomplete records should be ruled out first. These findings highlight the importance of continuous monitoring and adaptive strategies to sustain improvements in road safety. Addressing driver behavior, enhancing protection for vulnerable users, and maintaining effective regulations are crucial steps in reducing collisions and ensuring public safety.

### 8.2. Conclusion

In conclusion, addressing driver behavior through education and enforcement, improving safety measures for vulnerable road users, and maintaining effective traffic regulations are essential steps in reducing collisions and enhancing public safety. By focusing on these areas, policymakers and safety advocates can make meaningful strides in creating safer road environments for all users.

## **Python Code**

### 1. Descriptive Analysis

In [None]:
import pandas as pd

file_path = '/Users/nahoemi/Downloads/MVC Dataset_I.xlsx'
df = pd.read_excel(file_path, sheet_name='Sheet1')

number_of_persons_injured_stats = df['NUMBER OF PERSONS INJURED'].describe()
print("Descriptive Statistics for 'Number of Persons Injured':")
print(number_of_persons_injured_stats)

contributing_factor_vehicle_1_stats = df['CONTRIBUTING FACTOR VEHICLE 1'].value_counts()
print("\nDescriptive Statistics for 'Contributing Factor Vehicle 1':")
print(contributing_factor_vehicle_1_stats)

vehicle_type_code_1_stats = df['VEHICLE TYPE CODE 1'].value_counts()
print("\nDescriptive Statistics for 'Vehicle Type Code 1':")
print(vehicle_type_code_1_stats)

### 2. Injury Counts Analysis

In [None]:
import pandas as pd
from IPython.display import display

file_path = '/Users/nahoemi/Downloads/MVC Dataset_I.xlsx'
df = pd.read_excel(file_path, sheet_name='Sheet1')

# Count collisions for each injury total, ordered by number of persons injured.
injury_counts = df['NUMBER OF PERSONS INJURED'].value_counts().sort_index()

# Convert the counts to a two-column table for display.
injury_counts_df = injury_counts.reset_index()
injury_counts_df.columns = ['Number of Persons Injured', 'Number of Collisions']

display(injury_counts_df)

### 3. Common Causes Analysis

In [None]:
import pandas as pd
from IPython.display import display

file_path = '/Users/nahoemi/Downloads/MVC Dataset_I.xlsx'
df = pd.read_excel(file_path, sheet_name='Sheet1')

# Keep the ten most frequent contributing factors (value_counts sorts descending).
common_causes = df['CONTRIBUTING FACTOR VEHICLE 1'].value_counts().head(10)

# Convert the counts to a two-column table for display.
common_causes_df = common_causes.reset_index()
common_causes_df.columns = ['Contributing Factor', 'Number of Collisions']

display(common_causes_df)

### 4. Vehicle Involvement Analysis

In [None]:
import pandas as pd
from IPython.display import display

file_path = '/Users/nahoemi/Downloads/MVC Dataset_I.xlsx'
df = pd.read_excel(file_path, sheet_name='Sheet1')

# Keep the ten most frequent vehicle types (value_counts sorts descending).
vehicle_types = df['VEHICLE TYPE CODE 1'].value_counts().head(10)

# Convert the counts to a two-column table for display.
vehicle_types_df = vehicle_types.reset_index()
vehicle_types_df.columns = ['Vehicle Type', 'Number of Collisions']

display(vehicle_types_df)

### 5. Vehicle Type by Severity Analysis

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

file_path = '/Users/nahoemi/Downloads/MVC Dataset_I.xlsx'
df = pd.read_excel(file_path, sheet_name='Sheet1')

severity_by_vehicle_type = df.groupby('VEHICLE TYPE CODE 1')['NUMBER OF PERSONS INJURED'].sum().sort_values(ascending=False).head(10)

plt.figure(figsize=(10, 6))
ax = severity_by_vehicle_type.plot(kind='bar')
plt.title('Severity of Collisions by Vehicle Type')
plt.xlabel('Vehicle Type')
plt.ylabel('Total Number of Persons Injured')
plt.xticks(rotation=45, ha='right')

for p in ax.patches:
    ax.annotate(f'{int(p.get_height())}', (p.get_x() + p.get_width() / 2., p.get_height()), ha='center', va='baseline')

plt.show()

### 6. Trends Analysis of Month and Year

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

file_path = '/Users/nahoemi/Downloads/MVC Dataset_I.xlsx'
df = pd.read_excel(file_path, sheet_name='Sheet1')

df['CRASH DATE'] = pd.to_datetime(df['CRASH DATE'])

df['Month'] = df['CRASH DATE'].dt.month
df['Year'] = df['CRASH DATE'].dt.year

monthly_trends = df.groupby(['Year', 'Month']).size().reset_index(name='Number of Collisions')
monthly_trends['Date'] = pd.to_datetime(monthly_trends[['Year', 'Month']].assign(DAY=1))

plt.figure(figsize=(14, 7))
plt.plot(monthly_trends['Date'], monthly_trends['Number of Collisions'], marker='o')

for i, row in monthly_trends.iterrows():
    plt.annotate(row['Number of Collisions'], (row['Date'], row['Number of Collisions']), textcoords="offset points", xytext=(0,10), ha='center')

plt.title('Trend Analysis of Collisions by Month and Year')
plt.xlabel('Date')
plt.ylabel('Number of Collisions')
plt.xticks(rotation=45)
plt.grid(True)
plt.tight_layout()
plt.show()

### 7. Collision Heatmap by Location

In [None]:
import folium
from folium.plugins import HeatMap
import pandas as pd

file_path = '/Users/nahoemi/Downloads/MVC Dataset_I.xlsx'
df = pd.read_excel(file_path, sheet_name='Sheet1')

location_data = df.dropna(subset=['LATITUDE', 'LONGITUDE'])

base_map = folium.Map(location=[location_data['LATITUDE'].mean(), location_data['LONGITUDE'].mean()], zoom_start=10)

# Build the [latitude, longitude] pairs in one step instead of iterating row by row.
heat_data = location_data[['LATITUDE', 'LONGITUDE']].values.tolist()
HeatMap(heat_data).add_to(base_map)

base_map.save('collision_heatmap.html')

base_map