Because of GitHub's limited rendering support, the graphs will not render on GitHub. Please use nbviewer, or simply click the icon at the top-right corner, to explore the analysis in full detail.
This project (Write a Data Science Blog Post) is part of Udacity's Data Scientist Nanodegree Program. The detailed analysis, with all needed code, is posted in this GitHub repository.
Road safety is a pressing concern for many countries, where road crash fatalities and disabilities are gradually being recognized as a major public health issue. In this project I wanted to utilize data science and machine learning to help solve real-world problems...
The packages and libraries used in this project are listed below:
- Python Version (3.6.8)
- Pandas (Data Processing and Manipulation)
- NumPy (Data Processing and Manipulation)
- Sklearn (Machine Learning model)
- Plotly (Interactive Plots)
- Folium (Map Visualization)
- Dython (Categorical Correlation)
The motivation for the project is to study and understand the nature of car accidents and how it has changed over the years. To better guide the analysis, I formed the following questions related to road safety and traffic accidents:
- What is the severity of accidents over the last decade?
- When do accidents usually happen?
- Where do cyclist accidents usually happen?
- Under which circumstances do accidents happen? Is there any correlation between these features?
- What is the age distribution of drivers involved in the accidents?
- What are the characteristics of casualties impacted in the accidents?
- What are the main factors causing an accident, and can we predict the severity based on these factors?
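The correlation question above involves mostly categorical features. One common association measure for two categorical variables is Cramér's V, sketched below in plain Python. This is only an illustration: it is the uncorrected form of the statistic, the `cramers_v` helper and the toy labels are made up for the example, and it is not necessarily the exact computation used in the notebook or by Dython.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long sequences of labels.

    Returns a value in [0, 1]: 0 means no association, 1 means one variable
    fully determines the other. (Hypothetical helper, for illustration only.)
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed count per (x, y) cell
    row = Counter(xs)              # marginal counts of x
    col = Counter(ys)              # marginal counts of y

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, rx in row.items():
        for y, cy in col.items():
            expected = rx * cy / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row), len(col)) - 1
    if k == 0:                     # a constant variable carries no association
        return 0.0
    return sqrt(chi2 / (n * k))


# Toy, made-up labels: not taken from the real dataset.
weather = ["rain", "rain", "dry", "dry"]
severity_linked = ["serious", "serious", "slight", "slight"]
severity_unlinked = ["serious", "slight", "serious", "slight"]

v_linked = cramers_v(weather, severity_linked)
v_unlinked = cramers_v(weather, severity_unlinked)
print(v_linked, v_unlinked)
```

Unlike Pearson correlation, this measure is symmetric and unsigned, so it indicates the strength of an association but not its direction.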
- Jupyter notebook with complete analysis code
- HTML copy of Jupyter notebook
- Data files. The UK Road Safety dataset is separated into three main files:
  - Accidents(0514-2017).csv: detailed road safety data about the circumstances of personal injury road accidents in GB, indexed by (Accident_Index).
  - Vehicles(0514-2017).csv: details of the vehicles involved in traffic accidents.
  - Casualties(0514-2017).csv: details of the consequential casualties involved in traffic accidents.
- lookup_mapping.csv: lookup tables to encode data variables.
- Due to the size of the data, it is not uploaded to this repository; it can be accessed directly from the UK Open Data site.
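Since the accident records are indexed by `Accident_Index`, the vehicle and casualty tables can be combined with them through a join on that key. The sketch below shows the idea with Pandas on tiny in-memory stand-ins rather than the real CSV files. It assumes that the vehicle and casualty files also carry an `Accident_Index` column; the other column names and all values are illustrative, and `attach_accident_info` is a hypothetical helper.

```python
import pandas as pd

# Tiny in-memory stand-ins for the three CSV files; all values are made up.
accidents = pd.DataFrame({
    "Accident_Index": ["A1", "A2"],
    "Accident_Severity": [3, 2],      # illustrative column name
})
vehicles = pd.DataFrame({
    "Accident_Index": ["A1", "A1", "A2"],
    "Vehicle_Type": [9, 1, 9],        # illustrative column name
})
casualties = pd.DataFrame({
    "Accident_Index": ["A1", "A2", "A2"],
    "Casualty_Class": [1, 1, 3],      # illustrative column name
})


def attach_accident_info(detail, accidents):
    """Left-join accident-level columns onto a per-vehicle or per-casualty table.

    validate="many_to_one" makes Pandas raise an error if Accident_Index is
    not unique in the accidents table, which would silently duplicate rows.
    """
    return detail.merge(
        accidents, on="Accident_Index", how="left", validate="many_to_one"
    )


vehicles_full = attach_accident_info(vehicles, accidents)
casualties_full = attach_accident_info(casualties, accidents)
print(vehicles_full)
print(casualties_full)
```

A left join keeps every vehicle or casualty row even if its accident record is missing, which makes gaps in the data visible as missing values instead of dropping rows.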
The results of the analysis are described in detail in the Jupyter notebook and summarized in this published Medium blog post.
This project uses UK Road Safety Data from 2005 to 2017. The dataset is published by the Department for Transport under the Open Government Licence. The data consists of detailed road safety data about the circumstances of personal injury road accidents, the types of vehicles involved, and the consequential casualties.