- 🔭 I’m currently working on Analysing Road Traffic Accidents in the UK for the year 2019
- 🌱 I’m also learning Linux, Tableau, Power BI and SQL
- 📫 Email: osadebamwen.matthew@gmail.com
- ⚡ Fun fact: I think I'm good at Chess even though I barely win any game nowadays.

This analysis uses the road safety data published at http://data.gov.uk/dataset/road-accidents-safety-data
The UK government publishes detailed road safety data covering road accidents, the types of vehicles involved, and the resulting casualties and injuries. The data is split into three datasets: Accidents, Vehicles and Casualties. A summary of each is presented in Table 1. Every dataset includes the ‘accident index’, which identifies the accident a record belongs to. I initially planned to merge all three datasets before starting the analysis, but the merged table would have been higher-dimensional and harder to work with, so I decided to work with the individual datasets and merge them only when needed. Note also that many of the attributes are categorical, so codes are provided to indicate their respective meanings.
Table 1

| Dataset | Unique Identifier | Number of Attributes | Number of Rows |
|---|---|---|---|
| Accidents | Accident Index | 32 | 117536 |
| Vehicles | Vehicle Reference | 23 | 216381 |
| Casualties | Casualty Reference | 16 | 153158 |

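The merge-only-when-needed approach can be sketched with pandas as follows. This is a minimal illustration, not the notebook's actual code: the tiny in-memory frames stand in for the real CSV files, and the column names (`Accident_Index`, `Accident_Severity`, `Vehicle_Reference`, `Vehicle_Type`) as well as the helper `attach_accident_info` are assumptions made for the example.

```python
import pandas as pd

# Toy stand-ins for the Accidents and Vehicles datasets.
# Column names are assumed for illustration and may differ from the real files.
accidents = pd.DataFrame({
    "Accident_Index": ["A1", "A2", "A3"],
    "Accident_Severity": [3, 2, 3],
})
vehicles = pd.DataFrame({
    "Accident_Index": ["A1", "A1", "A2"],
    "Vehicle_Reference": [1, 2, 1],
    "Vehicle_Type": [9, 3, 9],
})


def attach_accident_info(detail, accidents, columns):
    """Hypothetical helper: add selected accident-level columns to a
    vehicle- or casualty-level table, joining on the accident index."""
    keep = ["Accident_Index", *columns]
    # validate="many_to_one" raises if the accident index is not unique
    # on the accidents side, which would silently duplicate rows.
    return detail.merge(
        accidents[keep],
        on="Accident_Index",
        how="left",
        validate="many_to_one",
    )


merged = attach_accident_info(vehicles, accidents, ["Accident_Severity"])
print(merged)
```

Pulling in only the columns a given question needs keeps the working table narrow, which is the point of not merging everything up front.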
It may sound far-fetched to suggest that certain months, days, or hours are more dangerous than others. This report analyses the UK accident data to test that idea and to give insight into the following questions:

- (a) Are there significant hours of the day, and days of the week, on which accidents occur?
- (b) For motorbikes, are there significant hours of the day, and days of the week, on which accidents occur?
- (c) For pedestrians involved in accidents, are there significant hours of the day, and days of the week, on which they are more likely to be involved?
- (d) What impact, if any, does daylight saving time have on road traffic accidents in the week after it starts and the week after it stops?
- (e) What impact, if any, do sunrise and sunset times have on road traffic accidents?
- (f) Are there particular types of vehicles (engine capacity, age of vehicle, etc.) that are more frequently involved in road traffic accidents?
- (g) Are there particular conditions (weather, geographic location, situations) that generate more road traffic accidents?
- (h) How do driver-related variables (e.g., age of the driver and the purpose of the journey) affect the outcome?
- (i) Can we use the data supplied to predict when and where accidents will occur, and the severity of the injuries sustained, in order to improve road safety? How well do our models compare to government models?

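As a starting point for the hour-of-day and day-of-week questions, accidents can be counted per hour and weekday. The sketch below is illustrative only: it assumes `Date` and `Time` columns stored as `dd/mm/yyyy` and `HH:MM` strings, uses a few made-up rows in place of the real Accidents file, and the function name `hour_by_weekday_counts` is hypothetical.

```python
import pandas as pd

# Made-up rows standing in for the Accidents dataset.
# The "Date" and "Time" column names and formats are assumptions.
accidents = pd.DataFrame({
    "Date": ["04/01/2019", "04/01/2019", "05/01/2019", "07/01/2019"],
    "Time": ["08:15", "08:45", "17:30", "08:05"],
})


def hour_by_weekday_counts(df):
    """Count accidents for every (hour of day, day of week) pair."""
    stamp = pd.to_datetime(df["Date"] + " " + df["Time"],
                           format="%d/%m/%Y %H:%M")
    # Rows are hours (0-23), columns are weekday names; cells are counts.
    return pd.crosstab(stamp.dt.hour, stamp.dt.day_name(),
                       rownames=["hour"], colnames=["weekday"])


counts = hour_by_weekday_counts(accidents)
print(counts)
```

A table like this only shows raw counts. Judging whether a particular hour or day is significantly more dangerous would still need a statistical test or a comparison against traffic volume.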
Please open the `BigData_&_DataMining.ipynb` Python notebook above to see the detailed analysis.