First project in my udacity-misk Data Science nano-degree: Blog post

qoraraf/Udacity_DS_project_1

Udacity_DS_project_1

First project in my udacity-misk Data Science nano-degree: Blog post

Traffic accidents have become one of the main problems the Kingdom of Saudi Arabia is dealing with. In this project I look at the number of children involved in those accidents and try to find out in which months and regions their percentage is lowest.

  • What is the total number of accidents per month?
  • What is the percentage of children involved in these accidents per month?
  • What is the percentage of children involved in each region over the whole year?
  • Among the largest regions in the Kingdom of Saudi Arabia, which has the highest percentage of children involved in traffic accidents?
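
The percentage questions above come down to a grouped ratio. The following is a minimal sketch with made-up numbers; the column names `month`, `region`, `children` and `all_ages` are assumptions for illustration and are not the headers of the actual workbook, and it assumes the percentage is taken over everyone involved.

```python
import pandas as pd

# Made-up records; column names are assumptions, not the real headers.
records = pd.DataFrame({
    "month": ["Muharram", "Muharram", "Safar", "Safar"],
    "region": ["A", "B", "A", "B"],
    "children": [5, 15, 10, 20],
    "all_ages": [100, 100, 50, 150],
})


def children_share(frame, by):
    """Percentage of children among everyone involved, per group."""
    # Sum the counts first, then divide, so large groups weigh more.
    totals = frame.groupby(by, sort=False)[["children", "all_ages"]].sum()
    return (100 * totals["children"] / totals["all_ages"]).round(1)


per_month = children_share(records, "month")
per_region = children_share(records, "region")

print(per_month.idxmin())   # month with the lowest share of children
print(per_region.idxmax())  # region with the highest share of children
```

Summing the counts before dividing gives a different result from averaging per-row percentages whenever the groups differ in size; which of the two the notebook uses is not stated here.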

Files included:

  • traffic-accident-statistics-as-of-1439-h.xls (data file in Excel format). It contains 17 sheets, all with the same number of columns and the same headers. I concatenated them into a single pandas dataframe.
    The data is provided by the Ministry of Interior - General Directorate of Traffic.

  • traffic_KSA.ipynb (Jupyter notebook using Python 3.8, and common libraries).
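
Combining the sheets of the workbook into one dataframe could look roughly like the sketch below. It is an illustration under assumptions rather than the notebook's actual code: the helper names `combine_sheets` and `load_workbook`, the extra `sheet` column, and the demo headers are all hypothetical. The demo uses two small in-memory frames in place of the real 17 sheets.

```python
import pandas as pd


def combine_sheets(sheets):
    """Stack a {sheet_name: DataFrame} mapping into one DataFrame."""
    frames = []
    for name, frame in sheets.items():
        frame = frame.copy()
        frame["sheet"] = name  # hypothetical column: remember the source sheet
        frames.append(frame)
    return pd.concat(frames, ignore_index=True)


def load_workbook(path):
    # sheet_name=None asks pandas for every sheet as a dict of DataFrames.
    # Reading the legacy .xls format needs an extra engine such as xlrd.
    return combine_sheets(pd.read_excel(path, sheet_name=None))


# Tiny stand-in for two sheets (made-up values, made-up headers).
demo_sheets = {
    "sheet_a": pd.DataFrame({"region": ["R1", "R2"], "accidents": [10, 20]}),
    "sheet_b": pd.DataFrame({"region": ["R1", "R2"], "accidents": [30, 40]}),
}
combined = combine_sheets(demo_sheets)
print(combined.shape)  # (4, 3)
```

With the real file, `load_workbook("traffic-accident-statistics-as-of-1439-h.xls")` would be the entry point. Concatenating this way only works cleanly because every sheet shares the same headers.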

Modules installed to produce the Jupyter notebook:

  • pandas
  • matplotlib
  • numpy
  • plotly
  • plotly express
  • seaborn

Acknowledgement:

The data I used was sent to me by my mentor, Mr. Haroon. I highly appreciate his help.
The data can be found on the Saudi Open Data portal at:
https://data.gov.sa/Data/en/dataset/traffic-accident-statistics-as-of-1439-h

Key Steps for Project

Following the CRISP-DM process in finding solutions

  1. I used the traffic accident data for KSA for the year 1439 in the Hijri calendar.

  2. My business questions are:

    • What is the total number of accidents per month?
    • What is the percentage of children involved in these accidents per month?
    • What is the percentage of children involved in each region over the whole year?
    • Among the largest regions in the Kingdom of Saudi Arabia, which has the highest percentage of children involved in traffic accidents?
  3. I have created a Jupyter Notebook to explore and analyse the data:

    Preparing the data:

    • Read all sheets in the Excel workbook and combine them.
    • Clean the data, rename the columns, and change the Arabic month names to English.
    • Create a dataframe for the data containing the age categories.
    • Provide visualized answers to my business questions.
  4. Please go to my blog post, where I present my insights:
    https://medium.com/@fatsammar/traffic-accidents-analysis-in-saudi-arabia-during-1939-hijri-df581248c1d4
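
The renaming and month-translation steps listed under step 3 could be sketched as follows. This is a minimal illustration, not the notebook's code: the Arabic header names in `COLUMN_NAMES` are hypothetical, the `clean` helper is invented for this example, and the month spellings are the common ones, which may differ from those used in the workbook.

```python
import pandas as pd

# Common Arabic spellings of the Hijri months; the workbook may differ.
HIJRI_MONTHS = {
    "محرم": "Muharram",
    "صفر": "Safar",
    "ربيع الأول": "Rabi al-Awwal",
    "ربيع الآخر": "Rabi al-Thani",
    "جمادى الأولى": "Jumada al-Ula",
    "جمادى الآخرة": "Jumada al-Akhirah",
    "رجب": "Rajab",
    "شعبان": "Shaban",
    "رمضان": "Ramadan",
    "شوال": "Shawwal",
    "ذو القعدة": "Dhu al-Qadah",
    "ذو الحجة": "Dhu al-Hijjah",
}

# Hypothetical header names, not taken from the actual file.
COLUMN_NAMES = {"الشهر": "month", "المنطقة": "region"}


def clean(frame):
    """Rename columns to English and translate the month names."""
    frame = frame.rename(columns=COLUMN_NAMES)  # returns a new frame
    months = frame["month"].astype(str).str.strip()
    # Values without a known translation are kept as they are.
    frame["month"] = months.map(lambda m: HIJRI_MONTHS.get(m, m))
    return frame


raw = pd.DataFrame({
    "الشهر": [" محرم", "رمضان ", "unknown"],
    "المنطقة": ["A", "B", "C"],
})
tidy = clean(raw)
print(tidy["month"].tolist())
```

Stripping whitespace before the lookup guards against padded cells, which are common in spreadsheets, and keeping unmatched values unchanged makes any spelling that is missing from the mapping easy to spot.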
