<h2> Project: United Kingdom Road Accident Data Analysis</h2>
<h3> Inclusive Years: 2019 - 2022</h3>
<h3> Analyst: Rogemson P. Molina</h3>

<h2> Importing Libraries</h2>

In [1]:
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
from scipy.stats import f_oneway

import warnings
warnings.filterwarnings('ignore')

<h2> Importing data file</h2>

In [2]:
accident = pd.read_csv('dataset\\accident_data.csv')

<h2> Accessing the dataframe</h2>

In [3]:
accident

Unnamed: 0,Index,Accident_Severity,Accident Date,Latitude,Light_Conditions,District Area,Longitude,Number_of_Casualties,Number_of_Vehicles,Road_Surface_Conditions,Road_Type,Urban_or_Rural_Area,Weather_Conditions,Vehicle_Type
0,200701BS64157,Serious,05/06/2019,51.506187,Darkness - lights lit,Kensington and Chelsea,-0.209082,1,2,Dry,Single carriageway,Urban,Fine no high winds,Car
1,200701BS65737,Serious,02/07/2019,51.495029,Daylight,Kensington and Chelsea,-0.173647,1,2,Wet or damp,Single carriageway,Urban,Raining no high winds,Car
2,200701BS66127,Serious,26/08/2019,51.517715,Darkness - lighting unknown,Kensington and Chelsea,-0.210215,1,3,Dry,,Urban,,Taxi/Private hire car
3,200701BS66128,Serious,16/08/2019,51.495478,Daylight,Kensington and Chelsea,-0.202731,1,4,Dry,Single carriageway,Urban,Fine no high winds,Bus or coach (17 or more pass seats)
4,200701BS66837,Slight,03/09/2019,51.488576,Darkness - lights lit,Kensington and Chelsea,-0.192487,1,2,Dry,,Urban,,Other vehicle
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
660674,201091NM01760,Slight,18/02/2022,57.374005,Daylight,Highland,-3.467828,2,1,Dry,Single carriageway,Rural,Fine no high winds,Car
660675,201091NM01881,Slight,21/02/2022,57.232273,Darkness - no lighting,Highland,-3.809281,1,1,Frost or ice,Single carriageway,Rural,Fine no high winds,Car
660676,201091NM01935,Slight,23/02/2022,57.585044,Daylight,Highland,-3.862727,1,3,Frost or ice,Single carriageway,Rural,Fine no high winds,Car
660677,201091NM01964,Serious,23/02/2022,57.214898,Darkness - no lighting,Highland,-3.823997,1,2,Wet or damp,Single carriageway,Rural,Fine no high winds,Motorcycle over 500cc


<h2> Checking and filling up null values</h2>

In [4]:
accident['Latitude'] = accident['Latitude'].fillna(accident['Latitude'].mode()[0])
accident['Longitude'] = accident['Longitude'].fillna(accident['Longitude'].mode()[0])

In [5]:
accident['Road_Surface_Conditions'] = accident['Road_Surface_Conditions'].fillna('Unknown road surface condition')

In [6]:
accident['Road_Type'] = accident['Road_Type'].fillna('Unknown road type')

In [7]:
accident['Urban_or_Rural_Area'] = accident['Urban_or_Rural_Area'].fillna(accident['Urban_or_Rural_Area'].mode()[0])

In [8]:
accident['Weather_Conditions'] = accident['Weather_Conditions'].fillna('Unknown weather condition')

In [9]:
accident.isnull().sum()

Index                      0
Accident_Severity          0
Accident Date              0
Latitude                   0
Light_Conditions           0
District Area              0
Longitude                  0
Number_of_Casualties       0
Number_of_Vehicles         0
Road_Surface_Conditions    0
Road_Type                  0
Urban_or_Rural_Area        0
Weather_Conditions         0
Vehicle_Type               0
dtype: int64

<h2> Changing data types</h2>

<h3> Before changing data types</h3>


In [10]:
accident.dtypes

Index                       object
Accident_Severity           object
Accident Date               object
Latitude                   float64
Light_Conditions            object
District Area               object
Longitude                  float64
Number_of_Casualties         int64
Number_of_Vehicles           int64
Road_Surface_Conditions     object
Road_Type                   object
Urban_or_Rural_Area         object
Weather_Conditions          object
Vehicle_Type                object
dtype: object

In [11]:
accident['Accident_Severity'] = accident['Accident_Severity'].astype('category')

In [12]:
accident['Accident Date'] = pd.to_datetime(accident['Accident Date'], dayfirst=True, errors='coerce')

In [13]:
accident['Light_Conditions'] = accident['Light_Conditions'].astype('category')

In [14]:
accident['District Area'] = accident['District Area'].astype('category')

In [15]:
accident['Vehicle_Type'] = accident['Vehicle_Type'].astype('category')

In [16]:
accident['Road_Surface_Conditions'] = accident['Road_Surface_Conditions'].astype('category')

In [17]:
accident['Road_Type'] = accident['Road_Type'].astype('category')

In [18]:
accident['Urban_or_Rural_Area'] = accident['Urban_or_Rural_Area'].astype('category')

In [19]:
accident['Weather_Conditions'] = accident['Weather_Conditions'].astype('category')

<h3> After changing data types</h3>

In [20]:
accident.dtypes

Index                              object
Accident_Severity                category
Accident Date              datetime64[ns]
Latitude                          float64
Light_Conditions                 category
District Area                    category
Longitude                         float64
Number_of_Casualties                int64
Number_of_Vehicles                  int64
Road_Surface_Conditions          category
Road_Type                        category
Urban_or_Rural_Area              category
Weather_Conditions               category
Vehicle_Type                     category
dtype: object

In [21]:
accident.isnull().sum()

Index                      0
Accident_Severity          0
Accident Date              0
Latitude                   0
Light_Conditions           0
District Area              0
Longitude                  0
Number_of_Casualties       0
Number_of_Vehicles         0
Road_Surface_Conditions    0
Road_Type                  0
Urban_or_Rural_Area        0
Weather_Conditions         0
Vehicle_Type               0
dtype: int64

<h2> Adding new columns (Year, Month, Day, DayofWeek, Season)</h2>

In [22]:
accident['Year'] = accident['Accident Date'].dt.year
accident['Month'] = accident['Accident Date'].dt.month
accident['Day'] = accident['Accident Date'].dt.day
accident['DayofWeek'] = accident['Accident Date'].dt.dayofweek

In [23]:
def month_to_season(month):
    if month in [12, 1, 2]:
        return 'Winter'
    elif month in [3, 4, 5]:
        return 'Spring'
    elif month in [6, 7, 8]:
        return 'Summer'
    elif month in [9, 10, 11]:
        return 'Fall'
    else:
        return np.nan

accident['Season'] = accident['Month'].apply(month_to_season)
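
The same month-to-season lookup can also be expressed as a dictionary passed to `Series.map`, which avoids calling a Python function once per row. The following is a minimal sketch on a toy series; the name `SEASON_BY_MONTH` and the sample values are illustrative assumptions and are not part of the original notebook.

```python
import pandas as pd

# Hypothetical lookup table mirroring month_to_season above
SEASON_BY_MONTH = {
    12: 'Winter', 1: 'Winter', 2: 'Winter',
    3: 'Spring', 4: 'Spring', 5: 'Spring',
    6: 'Summer', 7: 'Summer', 8: 'Summer',
    9: 'Fall', 10: 'Fall', 11: 'Fall',
}

# Toy data standing in for accident['Month']
months = pd.Series([6, 9, 2, 12])

# Keys missing from the dictionary map to NaN, like the else branch
seasons = months.map(SEASON_BY_MONTH)
print(seasons.tolist())
```

Both versions should give the same labels; the dictionary form mainly keeps the mapping in one place.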

In [24]:
accident

Unnamed: 0,Index,Accident_Severity,Accident Date,Latitude,Light_Conditions,District Area,Longitude,Number_of_Casualties,Number_of_Vehicles,Road_Surface_Conditions,Road_Type,Urban_or_Rural_Area,Weather_Conditions,Vehicle_Type,Year,Month,Day,DayofWeek,Season
0,200701BS64157,Serious,2019-06-05,51.506187,Darkness - lights lit,Kensington and Chelsea,-0.209082,1,2,Dry,Single carriageway,Urban,Fine no high winds,Car,2019,6,5,2,Summer
1,200701BS65737,Serious,2019-07-02,51.495029,Daylight,Kensington and Chelsea,-0.173647,1,2,Wet or damp,Single carriageway,Urban,Raining no high winds,Car,2019,7,2,1,Summer
2,200701BS66127,Serious,2019-08-26,51.517715,Darkness - lighting unknown,Kensington and Chelsea,-0.210215,1,3,Dry,Unknown road type,Urban,Unknown weather condition,Taxi/Private hire car,2019,8,26,0,Summer
3,200701BS66128,Serious,2019-08-16,51.495478,Daylight,Kensington and Chelsea,-0.202731,1,4,Dry,Single carriageway,Urban,Fine no high winds,Bus or coach (17 or more pass seats),2019,8,16,4,Summer
4,200701BS66837,Slight,2019-09-03,51.488576,Darkness - lights lit,Kensington and Chelsea,-0.192487,1,2,Dry,Unknown road type,Urban,Unknown weather condition,Other vehicle,2019,9,3,1,Fall
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
660674,201091NM01760,Slight,2022-02-18,57.374005,Daylight,Highland,-3.467828,2,1,Dry,Single carriageway,Rural,Fine no high winds,Car,2022,2,18,4,Winter
660675,201091NM01881,Slight,2022-02-21,57.232273,Darkness - no lighting,Highland,-3.809281,1,1,Frost or ice,Single carriageway,Rural,Fine no high winds,Car,2022,2,21,0,Winter
660676,201091NM01935,Slight,2022-02-23,57.585044,Daylight,Highland,-3.862727,1,3,Frost or ice,Single carriageway,Rural,Fine no high winds,Car,2022,2,23,2,Winter
660677,201091NM01964,Serious,2022-02-23,57.214898,Darkness - no lighting,Highland,-3.823997,1,2,Wet or damp,Single carriageway,Rural,Fine no high winds,Motorcycle over 500cc,2022,2,23,2,Winter


<h1> INSIGHTS</h1>

<h2> UNIVARIATE</h2>


<h2> Question 1: How many accidents happened in daylight?</h2>
<h3> Insight 1: A total of 484,880 accidents happened in daylight.</h3>

In [25]:
accident[accident['Light_Conditions'] == 'Daylight'].value_counts().sum()

np.int64(484880)
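
A row count like the one above can also be obtained by summing the boolean mask directly. The sketch below uses a made-up three-row frame (the column values are assumptions for illustration) to show why this can matter: `DataFrame.value_counts()` drops rows that contain missing values by default, so the two approaches only agree once nulls have been filled, as they were earlier in this notebook.

```python
import pandas as pd

# Toy frame; the None in Road_Type stands in for a missing value
toy = pd.DataFrame({
    'Light_Conditions': ['Daylight', 'Daylight', 'Darkness - lights lit'],
    'Road_Type': ['Single carriageway', None, 'Dual carriageway'],
})

is_daylight = toy['Light_Conditions'] == 'Daylight'

# Summing a boolean mask counts every matching row
daylight_count = int(is_daylight.sum())

# value_counts() skips rows that contain missing values by default
via_value_counts = int(toy[is_daylight].value_counts().sum())

print(daylight_count, via_value_counts)
```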

<h2> Question 2: How many accidents happened in 2022?</h2>
<h3> Insight 2: A total of 144,419 accidents happened in 2022.</h3>

In [26]:
accident[accident['Year'] == 2022].value_counts().sum()

np.int64(144419)

<h2> Question 3: How many accidents happened during the winter months of 2019?</h2>
<h3> Insight 3: A total of 43,316 accidents happened during the winter months (January, February and December) of 2019.</h3>

In [27]:
winter_months = [1, 2, 12]
accident[(accident['Year'] == 2019) & (accident['Month'].isin(winter_months))].shape[0]

43316

<h2> Question 4: How many accidents happened in each month?</h2>
<h3> Insight 4: November (60,424) and October (59,580) have more accidents than any other month.</h3>

In [28]:
accident['Month'].value_counts().reset_index(name='Accidents')

Unnamed: 0,Month,Accidents
0,11,60424
1,10,59580
2,7,57445
3,6,56481
4,9,56455
5,5,56352
6,3,54086
7,8,53913
8,1,52872
9,12,51836


<h2> Question 5: How many accidents happened at each severity level?</h2>
<h3> Insight 5: Slight accidents are by far the most common (563,801), followed by serious (88,217) and fatal (8,661).</h3>

In [29]:
accident['Accident_Severity'].value_counts()

Accident_Severity
Slight     563801
Serious     88217
Fatal        8661
Name: count, dtype: int64

<h2> Question 6: Which district area has the most accidents?</h2>
<h3> Insight 6: Birmingham is the most frequent District Area (the mode), so it has the most recorded accidents.</h3>

In [30]:
accident['District Area'].mode()[0]

'Birmingham'

<h2> BIVARIATE</h2>

<h2> Question 7: What is the average number of casualties for each accident severity?</h2>
<h3> Insight 7: Fatal accidents have the highest average number of casualties (≈1.90), compared with ≈1.47 for serious and ≈1.33 for slight accidents.</h3>

In [31]:
accident.groupby('Accident_Severity')['Number_of_Casualties'].mean()

Accident_Severity
Fatal      1.903129
Serious    1.467280
Slight     1.331402
Name: Number_of_Casualties, dtype: float64

<h2> Question 8: Which vehicle types are most often involved in accidents in "Snowing no high winds" weather?</h2>
<h3> Insight 8: Cars are involved in the most accidents, with a total of 4,748.</h3>

In [32]:
snowing_accidents = accident[accident["Weather_Conditions"] == "Snowing no high winds"]
snowing_accidents['Vehicle_Type'].value_counts()

Vehicle_Type
Car                                      4748
Van / Goods 3.5 tonnes mgw or under       296
Motorcycle over 500cc                     254
Bus or coach (17 or more pass seats)      219
Goods 7.5 tonnes mgw and over             149
Motorcycle 125cc and under                146
Taxi/Private hire car                     136
Motorcycle 50cc and under                  75
Motorcycle over 125cc and up to 500cc      72
Goods over 3.5t. and under 7.5t            52
Other vehicle                              50
Agricultural vehicle                       22
Minibus (8 - 16 passenger seats)           16
Pedal cycle                                 3
Data missing or out of range                0
Ridden horse                                0
Name: count, dtype: int64

<h2> Question 9: Which vehicle types are most often involved in accidents in "Fine no high winds" weather?</h2>
<h3> Insight 9: Cars are involved in the most accidents, with a total of 392,668.</h3>

In [33]:
fine_accidents = accident[accident["Weather_Conditions"] == "Fine no high winds"]
fine_accidents['Vehicle_Type'].value_counts()

Vehicle_Type
Car                                      392668
Van / Goods 3.5 tonnes mgw or under       26877
Bus or coach (17 or more pass seats)      20398
Motorcycle over 500cc                     20245
Goods 7.5 tonnes mgw and over             13589
Motorcycle 125cc and under                12064
Taxi/Private hire car                     10452
Motorcycle over 125cc and up to 500cc      6069
Motorcycle 50cc and under                  6017
Goods over 3.5t. and under 7.5t            4823
Other vehicle                              4450
Minibus (8 - 16 passenger seats)           1552
Agricultural vehicle                       1513
Pedal cycle                                 161
Data missing or out of range                  4
Ridden horse                                  3
Name: count, dtype: int64

<h2> Question 10: Which vehicle types are most often involved in accidents in "Fog or mist" weather?</h2>
<h3> Insight 10: Cars are involved in the most accidents, with a total of 2,641.</h3>

In [34]:
fog_accidents = accident[accident["Weather_Conditions"] == "Fog or mist"]
fog_accidents['Vehicle_Type'].value_counts()

Vehicle_Type
Car                                      2641
Van / Goods 3.5 tonnes mgw or under       192
Bus or coach (17 or more pass seats)      134
Motorcycle over 500cc                     118
Goods 7.5 tonnes mgw and over              93
Motorcycle 125cc and under                 82
Taxi/Private hire car                      78
Motorcycle 50cc and under                  50
Motorcycle over 125cc and up to 500cc      45
Other vehicle                              38
Goods over 3.5t. and under 7.5t            31
Minibus (8 - 16 passenger seats)           14
Agricultural vehicle                        9
Pedal cycle                                 3
Data missing or out of range                0
Ridden horse                                0
Name: count, dtype: int64

<h2> Question 11: Under which light condition do most accidents in the Highland district area happen?</h2>
<h3> Insight 11: Most accidents in Highland happen in daylight (1,524).</h3>

In [35]:
highland_accidents = accident[accident['District Area'] == 'Highland']
highland_accidents['Light_Conditions'].value_counts()

Light_Conditions
Daylight                       1524
Darkness - no lighting          336
Darkness - lights lit           132
Darkness - lighting unknown      21
Darkness - lights unlit           8
Name: count, dtype: int64

<h2> Question 12: On which road surface condition do most accidents with 2 or more casualties happen?</h2>
<h3> Insight 12: Dry roads have the most accidents with 2 or more casualties (99,995), followed by wet or damp roads (50,319).</h3>

In [36]:
casualty_condition = accident[accident['Number_of_Casualties'] >= 2]
casualty_condition.groupby('Road_Surface_Conditions').size().reset_index(name='Accident_Count')

Unnamed: 0,Road_Surface_Conditions,Accident_Count
0,Dry,99995
1,Flood over 3cm. deep,308
2,Frost or ice,4419
3,Snow,1483
4,Unknown road surface condition,114
5,Wet or damp,50319


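<h2> Question 13: Which road type has the fewest casualties?</h2>
<h3> Insight 13: The query returns "Unknown road type" (5,642 casualties), the placeholder filled in for missing road types; that placeholder has to be excluded before a named road type such as slip road can be confirmed as having the fewest casualties.</h3>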

In [37]:
road_type_casualties = accident.groupby('Road_Type')['Number_of_Casualties'].sum().reset_index()
road_type_casualties.nsmallest(1, 'Number_of_Casualties')

Unnamed: 0,Road_Type,Number_of_Casualties
5,Unknown road type,5642
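
Because the smallest total belongs to the placeholder category, one option is to filter that placeholder out before calling `nsmallest`. The following is a sketch under stated assumptions: the frame `totals` and its numbers are made up for illustration and do not come from the dataset.

```python
import pandas as pd

# Toy totals; the numbers are invented for illustration only
totals = pd.DataFrame({
    'Road_Type': ['Single carriageway', 'Slip road', 'Unknown road type'],
    'Number_of_Casualties': [900, 40, 25],
})

# Drop the placeholder that was filled in for missing road types
known = totals[totals['Road_Type'] != 'Unknown road type']

# Smallest total among the named road types only
least = known.nsmallest(1, 'Number_of_Casualties')
print(least)
```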


<h2> Question 14: Which date had the most accidents?</h2>
<h3> Insight 14: 2019-11-30 had the most accidents, with 704 on a single day.</h3>

In [38]:
accident.groupby('Accident Date').size().reset_index(name='Accident_Count').nlargest(1, 'Accident_Count')

Unnamed: 0,Accident Date,Accident_Count
333,2019-11-30,704


<h2> Question 15: Which date had the most serious accidents?</h2>
<h3> Insight 15: The data indicates that September 20, 2020, experienced the highest number of serious accidents, with 99 incidents reported.</h3>

In [39]:
date_with_serious_accidents = accident[accident['Accident_Severity'] == 'Serious']
date_with_serious_accidents.groupby('Accident Date').size().reset_index(name='Serious accident count').nlargest(1, 'Serious accident count')

Unnamed: 0,Accident Date,Serious accident count
628,2020-09-20,99


<h2> Question 16: On which road type do most serious bus accidents happen?</h2>
<h3> Insight 16: The data indicates that single carriageways are the most common road type for serious bus accidents, with 2,648 incidents reported.</h3>

In [40]:
serious_bus_accident = accident[(accident['Accident_Severity'] == 'Serious') & (accident['Vehicle_Type'] == 'Bus or coach (17 or more pass seats)')]
serious_bus_accident.groupby('Road_Type').size().reset_index(name='Accident count').nlargest(1, 'Accident count')

Unnamed: 0,Road_Type,Accident count
3,Single carriageway,2648


<h2> Question 17: Is there a correlation between the number of vehicles and the number of casualties?</h2>
<h3> Insight 17: There is a weak positive correlation (r ≈ 0.23) between the number of vehicles and the number of casualties.</h3>

In [41]:
accident['Number_of_Vehicles'].corr(accident['Number_of_Casualties'])

np.float64(0.22888886126927557)
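
One way to judge how strong this coefficient is, offered here as an illustrative sketch rather than part of the original analysis, is to square it. For a simple linear relationship, r squared is the share of variance in one variable that the other accounts for.

```python
# Pearson r reported by the cell above
r = 0.22888886126927557

# r squared: share of variance explained by a linear fit
r_squared = r ** 2
print(round(r_squared, 3))  # 0.052
```

A value of about 0.05 suggests the number of vehicles explains only around 5% of the variation in casualties.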

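<h2> Question 18: Does the number of casualties differ between light conditions?</h2>
<h3> Insight 18: The one-way ANOVA comparing "Daylight" with "Darkness - no lighting" returns a p-value of approximately 0, so the mean number of casualties differs significantly between these two light conditions.</h3>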

In [42]:
f_stats, p_value = f_oneway(accident[accident['Light_Conditions'] == 'Daylight']['Number_of_Casualties'],
                            accident[accident['Light_Conditions'] == 'Darkness - no lighting']['Number_of_Casualties'])
p_value

np.float64(0.0)
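
The p-value from `f_oneway` is usually read against a significance level chosen in advance. The sketch below uses made-up casualty counts, and the threshold `alpha = 0.05` is a conventional assumption rather than something stated in this notebook.

```python
from scipy.stats import f_oneway

# Toy casualty counts per light condition (invented values)
daylight = [1, 1, 2, 1, 1, 2, 1]
dark_unlit = [2, 3, 2, 4, 3, 2, 3]

# f_oneway returns the F statistic and the p-value
f_stat, p_value = f_oneway(daylight, dark_unlit)

alpha = 0.05  # assumed significance level
means_differ = p_value < alpha
print(means_differ)
```

A p-value below the threshold means the null hypothesis of equal group means is rejected. With several hundred thousand rows, even small differences in means tend to produce p-values this close to zero.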

<h2> Question 19: Is there a correlation between the number of vehicles and latitude?</h2>
<h3> Insight 19: There is essentially no linear correlation (r ≈ -0.04) between the number of vehicles and the latitude.</h3>

In [43]:
accident['Number_of_Vehicles'].corr(accident['Latitude'])

np.float64(-0.04002683678406357)

<h2> Question 20: How many accidents happened in urban and rural areas at each severity level?</h2>
<h3> Insight 20: Urban areas account for more accidents overall, especially in the serious and slight categories, while rural areas have a notably higher count of fatal accidents (5,601 versus 3,060).</h3>

In [44]:
accident.groupby(['Urban_or_Rural_Area', 'Accident_Severity']).size().unstack(fill_value=0)

Accident_Severity,Fatal,Serious,Slight
Urban_or_Rural_Area,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
Rural,5601,37312,196077
Unallocated,0,1,10
Urban,3060,50904,367714
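
Raw counts can be converted into shares to compare the two areas on equal terms. This sketch copies the Rural and Urban counts from the table above into a plain dictionary (the variable names are illustrative) and computes fatal accidents as a percentage of all accidents in each area.

```python
# Counts copied from the table above
counts = {
    'Rural': {'Fatal': 5601, 'Serious': 37312, 'Slight': 196077},
    'Urban': {'Fatal': 3060, 'Serious': 50904, 'Slight': 367714},
}

# Fatal accidents as a percentage of all accidents in each area
fatal_share = {
    area: 100 * by_severity['Fatal'] / sum(by_severity.values())
    for area, by_severity in counts.items()
}

for area, share in fatal_share.items():
    print(f"{area}: {share:.2f}%")
```

This gives roughly 2.34% for rural areas against 0.73% for urban areas.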


<h2> MULTIVARIATE</h2>

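<h2> Question 21: Which vehicle type has the fewest casualties under the "Darkness - no lighting" light condition?</h2>
<h3> Insight 21: The filter must use the exact label "Darkness - no lighting"; with "Darkness - No Lighting" it matches no rows and every vehicle type sums to 0, which is the all-zero result shown below.</h3>

In [45]:
# Category labels are case-sensitive: the label is 'Darkness - no lighting'
accident[accident['Light_Conditions'] == 'Darkness - no lighting'].groupby('Vehicle_Type')['Number_of_Casualties'].sum().reset_index().nsmallest(1, 'Number_of_Casualties')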

Unnamed: 0,Vehicle_Type,Number_of_Casualties
0,Agricultural vehicle,0


<h2> Question 22: Which vehicle types are involved in fatal accidents with 2 or more casualties in urban areas on dry roads?</h2>
<h3> Insight 22: The data indicates that 88.57% of fatal accidents with 2 or more casualties in urban areas on dry roads involved cars.</h3>

In [46]:
dry_urban = accident[(accident['Urban_or_Rural_Area'] == 'Urban') &
                     (accident['Road_Surface_Conditions'] == 'Dry') &
                     (accident['Number_of_Casualties'] >= 2) &
                     (accident['Accident_Severity'] == 'Fatal')]
dry_urban.groupby('Vehicle_Type').size().reset_index(name='Total count')

Unnamed: 0,Vehicle_Type,Total count
0,Agricultural vehicle,1
1,Bus or coach (17 or more pass seats),26
2,Car,480
3,Data missing or out of range,0
4,Goods 7.5 tonnes mgw and over,9
5,Goods over 3.5t. and under 7.5t,5
6,Minibus (8 - 16 passenger seats),2
7,Motorcycle 125cc and under,20
8,Motorcycle 50cc and under,10
9,Motorcycle over 125cc and up to 500cc,10


<h2> Question 23: On which road type do most fatal accidents involving "Motorcycle 125cc and under" happen?</h2>
<h3> Insight 23: Most fatal accidents involving motorcycles of 125cc and under happen on single carriageways (150 accidents).</h3>

In [47]:
motor_fatal = accident[(accident['Vehicle_Type'] == 'Motorcycle 125cc and under') & (accident['Accident_Severity'] == 'Fatal')]
motor_fatal.groupby('Road_Type').size().reset_index(name='Accident Count').nlargest(1, 'Accident Count')

Unnamed: 0,Road_Type,Accident Count
3,Single carriageway,150


<h2> Question 24: How many serious accidents happened in rural areas with 2 or more vehicles involved?</h2>
<h3> Insight 24: There are 22,513 serious accidents that happened in rural areas with 2 or more vehicles involved, which accounts for 3.41% of the total accidents.</h3>

In [48]:
insight_five = accident[(accident['Accident_Severity'] == 'Serious') & (accident['Urban_or_Rural_Area'] == 'Rural') & (accident['Number_of_Vehicles'] > 1)]

insight_five.shape[0]

22513
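
The percentage quoted in the insight can be reproduced from numbers already shown in this notebook. This is a small arithmetic sketch; the variable names are illustrative, and the total is taken as the sum of the three severity counts from Question 5.

```python
# Severity counts from Question 5 add up to the total number of rows
total_accidents = 563801 + 88217 + 8661

# Result of the cell above
serious_rural_multi_vehicle = 22513

share = 100 * serious_rural_multi_vehicle / total_accidents
print(f"{share:.2f}%")  # 3.41%
```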

<h2> Question 25: Under which weather condition do most serious accidents on single carriageways happen?</h2>
<h3> Insight 25: The weather condition where most of the serious accidents happen on single carriageways is "Fine no high winds," with a total count of 57,396 incidents, which accounts for 8.69% of the total accidents.</h3>

In [49]:
serious_single_carriageway = accident[(accident['Accident_Severity'] == 'Serious') & (accident['Road_Type'] == 'Single carriageway')]
serious_accident_single = serious_single_carriageway.groupby('Weather_Conditions').size().reset_index(name='Accident count')
serious_accident_single.nlargest(1, 'Accident count')

Unnamed: 0,Weather_Conditions,Accident count
1,Fine no high winds,57396


<h2> Question 26: In which season do most fatal accidents in rural areas happen?</h2>
<h3> Insight 26: Most fatal accidents in rural areas occur during the summer, with a total count of 1,502 incidents, accounting for 0.23% of the total accidents</h3>

In [50]:
fatal_rural = accident[(accident['Urban_or_Rural_Area'] == 'Rural') & (accident['Accident_Severity'] == 'Fatal')]
fatal_rural.groupby('Season').size().reset_index(name='Accident count')

Unnamed: 0,Season,Accident count
0,Fall,1452
1,Spring,1386
2,Summer,1502
3,Winter,1261


<h2> Question 26: On which day of the week do most car accidents happen during summer?</h2>
<h3> Insight 26: Most car accidents during summer happen on Saturdays (DayofWeek 5, since pandas counts Monday as 0), with a total count of 20,464 incidents, accounting for 3.10% of the total accidents</h3>

In [51]:
car_summer = accident[(accident['Vehicle_Type'] == 'Car') & (accident['Season'] == 'Summer')]
car_summer.groupby('DayofWeek').size().reset_index(name='Accident count')

Unnamed: 0,DayofWeek,Accident count
0,0,14943
1,1,17720
2,2,19126
3,3,18610
4,4,18446
5,5,20464
6,6,17333
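
The DayofWeek codes are easier to read once they are turned into names. pandas `dt.dayofweek` follows the same convention as Python's `datetime.date.weekday()`, with Monday as 0. The sketch below uses only the standard library; `DAY_NAMES` is a hypothetical helper list, and the date is taken from the first row of the dataset.

```python
from datetime import date

# Monday is 0 in both date.weekday() and pandas dt.dayofweek
DAY_NAMES = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
             'Friday', 'Saturday', 'Sunday']

# First row of the dataset: 2019-06-05 has DayofWeek 2
first_row_day = DAY_NAMES[date(2019, 6, 5).weekday()]

# The highest count in the table above has DayofWeek 5
busiest_day = DAY_NAMES[5]
print(first_row_day, busiest_day)
```

Under this convention code 4 is Friday and code 5 is Saturday.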


<h2> Question 27: Under which weather condition do most fatal accidents on Saturdays happen?</h2>
<h3> Insight 27: Most fatal accidents on Saturdays (DayofWeek 5, with Monday as 0) occur under the "Fine no high winds" weather condition, with a total count of 1,077 incidents</h3>

In [52]:
# DayofWeek 5 is Saturday: pandas dt.dayofweek counts Monday as 0 (Friday would be 4)
sat_fatal = accident[(accident['DayofWeek'] == 5) & (accident['Accident_Severity'] == 'Fatal')]
sat_fatal.groupby('Weather_Conditions').size().reset_index(name='Accident count')

Unnamed: 0,Weather_Conditions,Accident count
0,Fine + high winds,24
1,Fine no high winds,1077
2,Fog or mist,12
3,Other,28
4,Raining + high winds,27
5,Raining no high winds,136
6,Snowing + high winds,1
7,Snowing no high winds,1
8,Unknown weather condition,20


<h2> Insight 28: Cars have the highest count of accidents across all severities, with 6,577 fatal, 66,461 serious, and 424,954 slight accidents. This suggests that cars are involved in the majority of accidents, possibly due to their prevalence on the road.</h2>

<h2> Insight 29: Motorcycles, especially those over 500cc, exhibit a notable number of accidents: 339 fatal, 3,457 serious, and 21,861 slight accidents. This highlights the need for targeted safety measures for motorcycle riders, such as improved protective gear and stricter enforcement of traffic regulations. </h2>

<h2> Insight 30: Buses and coaches (17 or more passenger seats) also present a substantial number of accidents: 325 fatal, 3,373 serious, and 22,180 slight accidents. This could point to potential safety improvements needed in public transportation systems, such as better driver training and regular vehicle maintenance checks.</h2>

In [53]:
accident.groupby(['Accident_Severity', 'Vehicle_Type']).size().unstack().T

Accident_Severity,Fatal,Serious,Slight
Vehicle_Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
Agricultural vehicle,21,282,1644
Bus or coach (17 or more pass seats),325,3373,22180
Car,6577,66461,424954
Data missing or out of range,0,0,6
Goods 7.5 tonnes mgw and over,216,2321,14770
Goods over 3.5t. and under 7.5t,67,857,5172
Minibus (8 - 16 passenger seats),29,276,1671
Motorcycle 125cc and under,189,2031,13049
Motorcycle 50cc and under,95,1014,6494
Motorcycle over 125cc and up to 500cc,105,1014,6537


In [54]:
accident.groupby(['Road_Surface_Conditions', 'Season', 'Year']).size().unstack()

Unnamed: 0_level_0,Year,2019,2020,2021,2022
Road_Surface_Conditions,Season,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
Dry,Fall,34026,28107,28389,25834
Dry,Spring,36242,31025,33083,30492
Dry,Summer,34878,32756,33396,31219
Dry,Winter,20202,21618,15415,11139
Flood over 3cm. deep,Fall,38,114,108,31
Flood over 3cm. deep,Spring,38,50,20,19
Flood over 3cm. deep,Summer,138,74,65,36
Flood over 3cm. deep,Winter,98,93,67,28
Frost or ice,Fall,260,529,116,849
Frost or ice,Spring,101,205,103,343


In [55]:
accident.groupby(['Accident_Severity', 'Vehicle_Type', 'Light_Conditions']).size().unstack()

Unnamed: 0_level_0,Light_Conditions,Darkness - lighting unknown,Darkness - lights lit,Darkness - lights unlit,Darkness - no lighting,Daylight
Accident_Severity,Vehicle_Type,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Fatal,Agricultural vehicle,0,5,1,4,11
Fatal,Bus or coach (17 or more pass seats),2,65,1,65,192
Fatal,Car,56,1410,35,1200,3876
Fatal,Data missing or out of range,0,0,0,0,0
Fatal,Goods 7.5 tonnes mgw and over,1,49,1,43,122
Fatal,Goods over 3.5t. and under 7.5t,0,11,0,18,38
Fatal,Minibus (8 - 16 passenger seats),1,4,0,7,17
Fatal,Motorcycle 125cc and under,0,52,0,33,104
Fatal,Motorcycle 50cc and under,1,22,2,15,55
Fatal,Motorcycle over 125cc and up to 500cc,1,22,0,17,65
