# Capstone Project - Car accidents in Seattle

## Applied Data Science IBM Course

## Table of contents
* [Introduction](#intro)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

## Introduction <a name="intro"></a>

According to the Washington State Department of Transportation (WSDOT), car accidents are frequent in **Seattle**: an incident is estimated to occur every **4 minutes**, and a fatal accident every 20 hours. Fatal accidents rose from 508 in 2016 to 525 in 2017, leaving a total of **555 fatalities**. Conditions that typically contribute to fatal accidents include alcohol use, motorcycle use, and the involvement of pedestrians. Here we examine the types of incidents that occurred in Seattle and the conditions that led to them, using the record of incidents, locations and associated conditions kept since 2004. Correlations between areas and parameters such as alcohol use, vehicle type, weather and street conditions are used to identify the most common types of accident and their causes.

### Audience

- Traffic departments, such as the WSDOT, that are looking to reduce the number of incidents.
- General public, including pedestrians, cyclists and drivers, who can choose their routes based on the likelihood of an incident.
- Car insurance companies looking to estimate costs from the number and type of incidents.

## Data <a name='data'></a>

The data provided by SPD and Traffic Records consist of **194,673** incidents reported in Seattle from 2004 to the present. Every accident has primary and secondary keys, along with its specific location, date and time. All kinds of collisions are recorded, including those involving cars, pedestrians and cyclists. In total, more than 15 different conditions are recorded, e.g. weather, type of location (alley, mid-block, intersection), alcohol or drug use, etc. In addition, the severity of each accident is given by its severity code. We start by looking at the **areas** with the most fatal incidents reported. Then, a classification based on common conditions is used to assess the severity of an accident. Based on our findings, we report which conditions and areas in Seattle should be a priority for future prevention of fatal car accidents.

In [1]:
import numpy as np
import pandas as pd
import pylab as pl
import wget

In [2]:
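# Download the CSV into the working directory; wget.download returns the local file name
data = wget.download('https://s3.us.cloud-object-storage.appdomain.cloud/cf-courses-data/CognitiveClass/DP0701EN/version-2/Data-Collisions.csv')
# low_memory=False makes pandas infer each column's dtype from the whole file rather than chunk by chunk
collisions = pd.read_csv(data, low_memory=False)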



In [4]:
collisions.shape

(194673, 38)

In [5]:
collisions.head(3)

| | SEVERITYCODE | X | Y | OBJECTID | INCKEY | COLDETKEY | REPORTNO | STATUS | ADDRTYPE | INTKEY | ... | ROADCOND | LIGHTCOND | PEDROWNOTGRNT | SDOTCOLNUM | SPEEDING | ST_COLCODE | ST_COLDESC | SEGLANEKEY | CROSSWALKKEY | HITPARKEDCAR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2 | -122.323148 | 47.70314 | 1 | 1307 | 1307 | 3502005 | Matched | Intersection | 37475.0 | ... | Wet | Daylight | | | | 10 | Entering at angle | 0 | 0 | N |
| 1 | 1 | -122.347294 | 47.647172 | 2 | 52200 | 52200 | 2607959 | Matched | Block | | ... | Wet | Dark - Street Lights On | | 6354039.0 | | 11 | From same direction - both going straight - bo... | 0 | 0 | N |
| 2 | 1 | -122.33454 | 47.607871 | 3 | 26700 | 26700 | 1482393 | Matched | Block | | ... | Dry | Daylight | | 4323031.0 | | 32 | One parked--one moving | 0 | 0 | N |


In [7]:
collisions.groupby(['SEVERITYDESC'])['LOCATION'].value_counts()

SEVERITYDESC                    LOCATION                                                
Injury Collision                AURORA AVE N BETWEEN N 117TH PL AND N 125TH ST              120
                                6TH AVE AND JAMES ST                                        107
                                N NORTHGATE WAY BETWEEN MERIDIAN AVE N AND CORLISS AVE N     94
                                RAINIER AVE S BETWEEN S BAYVIEW ST AND S MCCLELLAN ST        94
                                AURORA AVE N BETWEEN N 130TH ST AND N 135TH ST               88
                                                                                           ... 
Property Damage Only Collision  YALE AVE E BETWEEN YALE PL E AND E NEWTON ST                  1
                                YALE AVE N BETWEEN FAIRVIEW NR AVE N AND DEAD END 2           1
                                YESLER WAY BETWEEN ALASKAN E RDWY WAY AND WESTERN AVE         1
                                YORK RD S BETWE
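
The grouped count above lists every location, so most of the output is a long tail of places with a single collision. A minimal sketch of how the listing could be limited to the busiest locations per severity class is shown below. The helper `top_locations`, the result name `n_collisions`, and the toy `sample` frame with made-up street names are illustrative assumptions, not part of the original notebook; only the column names `SEVERITYDESC` and `LOCATION` come from the data.

```python
import pandas as pd

# Toy stand-in for the collisions table (hypothetical street names).
sample = pd.DataFrame(
    {
        "SEVERITYDESC": ["Injury Collision"] * 6
        + ["Property Damage Only Collision"] * 3,
        "LOCATION": ["A ST", "A ST", "A ST", "B ST", "B ST", "C ST",
                     "D ST", "D ST", "A ST"],
    }
)


def top_locations(df, n=5):
    """Return the n most frequent locations within each severity class."""
    counts = (
        df.groupby("SEVERITYDESC")["LOCATION"]
        .value_counts()          # sorted in descending order within each group
        .rename("n_collisions")
    )
    # Because each group is already sorted, head(n) keeps its top n rows.
    return counts.groupby(level="SEVERITYDESC").head(n)


result = top_locations(sample, n=2)
print(result)
```

On the full data, the same call would be `top_locations(collisions, n=5)`, assuming `collisions` has been loaded as in the earlier cell.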

In [8]:
collisions.groupby(['SEVERITYDESC'])['UNDERINFL'].value_counts()

SEVERITYDESC                    UNDERINFL
Injury Collision                N            30896
                                0            22701
                                Y             1939
                                1             1623
Property Damage Only Collision  N            69378
                                0            57693
                                Y             3187
                                1             2372
Name: UNDERINFL, dtype: int64
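
The output above shows that `UNDERINFL` is encoded in four ways (`N`, `0`, `Y`, `1`). The sketch below merges them into two categories and computes the share of injury collisions in each, using only the counts printed above. It assumes that `Y` and `1` both mean "under the influence" and that `N` and `0` both mean "not under the influence"; the source does not confirm this, and the names `flag_map`, `under_influence` and `injury_share` are illustrative.

```python
import pandas as pd

# Counts copied from the grouped output above: (severity, raw flag, count).
counts = pd.DataFrame(
    [
        ("Injury Collision", "N", 30896),
        ("Injury Collision", "0", 22701),
        ("Injury Collision", "Y", 1939),
        ("Injury Collision", "1", 1623),
        ("Property Damage Only Collision", "N", 69378),
        ("Property Damage Only Collision", "0", 57693),
        ("Property Damage Only Collision", "Y", 3187),
        ("Property Damage Only Collision", "1", 2372),
    ],
    columns=["SEVERITYDESC", "UNDERINFL", "count"],
)

# Assumption: "Y"/"1" are equivalent, and so are "N"/"0".
flag_map = {"N": "no", "0": "no", "Y": "yes", "1": "yes"}
counts["under_influence"] = counts["UNDERINFL"].map(flag_map)

# Sum the merged categories and spread the severity classes into columns.
table = (
    counts.groupby(["under_influence", "SEVERITYDESC"])["count"]
    .sum()
    .unstack("SEVERITYDESC")
)

# Fraction of collisions in each category that resulted in an injury.
injury_share = table["Injury Collision"] / table.sum(axis=1)
print(table)
print(injury_share.round(3))
```

Under that assumption, roughly 39% of collisions flagged as under the influence were injury collisions, against about 30% of the rest. On the full table, the same mapping could be applied with `collisions["UNDERINFL"].astype(str).map(flag_map)`, which leaves missing values unmapped.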