## UK Accident Data

The dataset contains detailed records of road traffic accidents across the UK, including:
- Collision details (time, location, weather, road type, speed limit, lighting)
- Vehicles involved (type, manoeuvre, movement, load)
- Casualties (age, gender, role, severity)

The definitions of these parameters are given in [Instructions for the Completion of Road Accident Reports](https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/995424/stats20-2005.pdf).


In [2]:
!ls ../data/

dft-road-casualty-statistics-casualty-2023.csv
dft-road-casualty-statistics-collision-2023.csv
dft-road-casualty-statistics-vehicle-2023.csv
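
The three files describe the same collisions at different levels (collision, vehicle, casualty), so they can be linked through a shared key. The sketch below shows one way this might be done with `pandas.merge`. It uses tiny made-up frames rather than the real CSVs, and it assumes that `accident_index` is present in all three files and that the `vehicle_reference` and `casualty_reference` columns exist under those names; check these assumptions against the actual column lists before relying on them.

```python
import pandas as pd

# Tiny synthetic stand-ins for the three CSV files (all values are made up).
collisions = pd.DataFrame({
    "accident_index": ["A1", "A2"],
    "accident_severity": [3, 2],
})
vehicles = pd.DataFrame({
    "accident_index": ["A1", "A2", "A2"],
    "vehicle_reference": [1, 1, 2],   # assumed column name
    "vehicle_type": [9, 9, 1],        # assumed column name
})
casualties = pd.DataFrame({
    "accident_index": ["A1", "A2", "A2"],
    "vehicle_reference": [1, 1, 2],   # assumed column name
    "casualty_reference": [1, 1, 2],  # assumed column name
    "casualty_severity": [3, 3, 2],   # assumed column name
})

# One row per casualty, enriched with its vehicle and its collision.
# validate="many_to_one" raises if the right-hand keys are not unique.
merged = (
    casualties
    .merge(vehicles, on=["accident_index", "vehicle_reference"],
           how="left", validate="many_to_one")
    .merge(collisions, on="accident_index",
           how="left", validate="many_to_one")
)
print(merged)
```

A left join from the casualty table keeps every casualty even if its vehicle or collision record is missing, which makes such gaps visible as `NaN` rather than silently dropping rows.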


In [19]:
# load libraries
import pandas as pd
import numpy as np
import plotly.express as px
import pathlib

# path to the data directory
data_dir = pathlib.Path('../data/')

### Loading road collision data

In [23]:
# collision statistics (note: df_casualties holds the collision table, one row per accident)
df_casualties = (pd.read_csv(
    data_dir/'dft-road-casualty-statistics-collision-2023.csv',
    low_memory=False)
    .set_index('accident_index'))
df_casualties.head()

| accident_index | accident_year | accident_reference | location_easting_osgr | location_northing_osgr | longitude | latitude | police_force | accident_severity | number_of_vehicles | number_of_casualties | ... | light_conditions | weather_conditions | road_surface_conditions | special_conditions_at_site | carriageway_hazards | urban_or_rural_area | did_police_officer_attend_scene_of_accident | trunk_road_flag | lsoa_of_accident_location | enhanced_severity_collision |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2023010419171 | 2023 | 10419171 | 525060.0 | 170416.0 | -0.202878 | 51.418974 | 1 | 3 | 1 | 1 | ... | 4 | 8 | 2 | 0 | 0 | 1 | 1 | 2 | E01003383 | -1 |
| 2023010419183 | 2023 | 10419183 | 535463.0 | 198745.0 | -0.042464 | 51.671155 | 1 | 3 | 3 | 2 | ... | 4 | 1 | 1 | 0 | 0 | 1 | 1 | 2 | E01001547 | -1 |
| 2023010419189 | 2023 | 10419189 | 508702.0 | 177696.0 | -0.435789 | 51.487777 | 1 | 3 | 2 | 1 | ... | 4 | 1 | 1 | 0 | 0 | 1 | 1 | 2 | E01002448 | -1 |
| 2023010419191 | 2023 | 10419191 | 520341.0 | 190175.0 | -0.263972 | 51.597575 | 1 | 3 | 2 | 1 | ... | 4 | 9 | 1 | 0 | 0 | 1 | 1 | 2 | E01000129 | -1 |
| 2023010419192 | 2023 | 10419192 | 527255.0 | 176963.0 | -0.168976 | 51.477324 | 1 | 3 | 2 | 1 | ... | 4 | 1 | 1 | 0 | 0 | 1 | 1 | 2 | E01004583 | -1 |


In [24]:
# get columns
df_casualties.columns

Index(['accident_year', 'accident_reference', 'location_easting_osgr',
       'location_northing_osgr', 'longitude', 'latitude', 'police_force',
       'accident_severity', 'number_of_vehicles', 'number_of_casualties',
       'date', 'day_of_week', 'time', 'local_authority_district',
       'local_authority_ons_district', 'local_authority_highway',
       'first_road_class', 'first_road_number', 'road_type', 'speed_limit',
       'junction_detail', 'junction_control', 'second_road_class',
       'second_road_number', 'pedestrian_crossing_human_control',
       'pedestrian_crossing_physical_facilities', 'light_conditions',
       'weather_conditions', 'road_surface_conditions',
       'special_conditions_at_site', 'carriageway_hazards',
       'urban_or_rural_area', 'did_police_officer_attend_scene_of_accident',
       'trunk_road_flag', 'lsoa_of_accident_location',
       'enhanced_severity_collision'],
      dtype='object')
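
Most of these columns hold numeric codes rather than readable values. As a minimal sketch of decoding one of them, the example below maps `accident_severity` to labels using the STATS19 coding (1 = Fatal, 2 = Serious, 3 = Slight). The frame `df` and the column name `severity_label` are illustrative only; in the notebook the same mapping could be applied to `df_casualties`.

```python
import pandas as pd

# STATS19 coding for accident_severity.
SEVERITY_LABELS = {1: "Fatal", 2: "Serious", 3: "Slight"}

# Synthetic example values; in the notebook this would be df_casualties.
df = pd.DataFrame({"accident_severity": [3, 3, 2, 1, 3]})

# Add a readable label column (hypothetical name) next to the numeric code.
df["severity_label"] = df["accident_severity"].map(SEVERITY_LABELS)

# Count collisions per severity label.
counts = df["severity_label"].value_counts()
print(counts)
```

A label column like this also lets plotting functions treat severity as a category instead of a continuous number.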

In [41]:
fig = px.scatter_map(
    df_casualties,
    lat='latitude',
    lon='longitude',
    width=800,
    height=600,
    color='accident_severity',
    hover_data=['accident_severity'],  # show severity on hover
)

fig.update_layout(
    coloraxis_colorbar={'title': 'Accident Severity'},
    # scatter_map draws on the `map` subplot, so the style key is map_style, not mapbox_style
    map_style='open-street-map')
fig.show()

Check for completeness of the data
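
A per-column count of missing values can complement `info()`. The sketch below uses a small synthetic frame and counts both true nulls and the value `-1`. Treating `-1` as a "missing or out of range" placeholder is an assumption here, suggested by the `enhanced_severity_collision` values in the preview above; the names `n_null` and `n_minus_one` are illustrative.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the collision table (values are made up).
df = pd.DataFrame({
    "longitude": [-0.20, np.nan, -0.43],
    "latitude": [51.41, np.nan, 51.48],
    "enhanced_severity_collision": [-1, -1, 3],
})

# True nulls (NaN) per column.
null_counts = df.isna().sum()

# Occurrences of -1 per column (assumed to be a missing-data placeholder).
sentinel_counts = (df == -1).sum()

summary = pd.DataFrame({"n_null": null_counts, "n_minus_one": sentinel_counts})
print(summary)
```

Note that a legitimate `-1` in a numeric column would also be counted, so this check is best limited to coded columns.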

In [42]:
df_casualties.info()

<class 'pandas.core.frame.DataFrame'>
Index: 104258 entries, 2023010419171 to 2023991462793
Data columns (total 36 columns):
 #   Column                                       Non-Null Count   Dtype  
---  ------                                       --------------   -----  
 0   accident_year                                104258 non-null  int64  
 1   accident_reference                           104258 non-null  object 
 2   location_easting_osgr                        104246 non-null  float64
 3   location_northing_osgr                       104246 non-null  float64
 4   longitude                                    104246 non-null  float64
 5   latitude                                     104246 non-null  float64
 6   police_force                                 104258 non-null  int64  
 7   accident_severity                            104258 non-null  int64  
 8   number_of_vehicles                           104258 non-null  int64  
 9   number_of_casualties                         