# Grand Rapids Traffic Accident Project
## Part 1: Data Cleaning and Exploration

Created by: Kate Meredith
Date: 11.28.22

## Table of Contents

* [1. Background](#header1)
    * [1.1 Data Dictionary](#subheader11)
* [2. Importing Libraries](#header2)
* [3. Importing the Data](#header3)

## 1. Background <a class="anchor" id="header1"></a>

This dataset covers traffic accidents that occurred in Grand Rapids, Michigan, USA, from 2007 to 2017. The
data comes from [Grand Rapids Open Data](https://grdata-grandrapids.opendata.arcgis.com/datasets/grandrapids::cgr-crash-data/explore?location=0.000000%2C0.000000%2C2.62).

This [web page](https://services2.arcgis.com/L81TiOwAPO1ZvU9b/arcgis/rest/services/CGRCrashData/FeatureServer/0) lists the variables and their data types.

### 1.1 Data Dictionary <a class="anchor" id="subheader11"></a>


* OBJECTID
* ROADSOFTID
* BIKE 
* CITY
* COUNTY
* CRASHDATE
* CRASHSEVER
* CRASHTYPE 
* WORKZNEACT 
* WORKZNECLO 
* WORKZNETYP 
* CTRLMILEPT 
* CTRLSECT 
* DAYOFMONTH
* DAYOFWEEK 
* ANIMAL 
* D1COND 
* D1DRINKIN
* D1HAZACT
* D1INJURY 
* D1INTENT 
* D2COND 
* D2DRINKIN 
* D2HAZACT
* D2INJURY
* D2INTENT
* D3COND
* D3DRINKIN
* D3HAZACT
* D3INJURY
* D3INTENT
* DRINKING
* DRIVER1AGE
* DRIVER1SEX
* DRIVER2AGE
* DRIVER2SEX
* DRIVER3AGE
* DRIVER3SEX
* EMRGVEH
* FARMEQUIP
* FLEEINGSIT
* FWSEGID
* GRTINJSEVE
* HITANDRUN
* HOUR
* INTERNAME
* LIGHTING
* MDOTREG
* MILEPOINT
* MONTH
* MOTORCYCLE
* NOATYPEINJ
* NOBTYPEINJ
* NOCTYPEINJ
* NONTRAFFIC
* NUMOFINJ
* NUMOFKILL
* NUMOFOCCUP
* NUMOFUNINJ
* NUMOFVEHIC
* ORV
* PEDESTRIAN
* PRNAME
* PRNO
* REFDIR
* REFDIST
* ROUTECLASS
* ROUTENUM
* SCHOOLBUS
* SNOWMOBILE
* SPDLMTPOST
* SPEEDLIMIT
* SURFCOND
* TRAFCTLDEV
* TRAIN
* TRUCKBUS
* TRUNKLINE
* UD10NUM
* V1DEFECT
* V1DAMAGE
* V1HARMEVT1
* V1HARMEVT2
* V1HARMEVT3
* V1HARMEVT4
* V1MSTHARME
* V1SPECCAT
* V1TRAILER
* V1VIOLATOR
* V1WIMPCTPT
* V2DEFECT
* V2DAMAGE
* V2HARMEVT1
* V2HARMEVT2
* V2HARMEVT3
* V2HARMEVT4
* V2MSTHARME
* V2SPECCAT
* V2TRAILER
* V2VIOLATOR
* V2WIMPCTPT
* V3DEFECT
* V3DAMAGE
* V3HARMEVT1
* V3HARMEVT2
* V3HARMEVT3
* V3HARMEVT4
* V3MSTHARME
* V3SPECCAT
* V3TRAILER
* V3VIOLATOR
* V3WIMPCTPT
* VEH1DIR
* VEH1TYPE
* VEH1USE
* VEH2DIR
* VEH2TYPE
* VEH2USE
* VEH3DIR
* VEH3TYPE
* VEH3USE
* WEATHER
* WHEREONRD
* YEAR
* RDCITYTWP
* ROAD_USER1
* ROAD_USER2
* ROAD_USER3
* ROAD_USER4
* RDLEGALSYS
* RDLGLCODE
* RDNFC
* RDNFCCODE
* RDNUMLANES
* RDSUBTYPDS
* RDSUBTYPE
* RDSURFTYPE
* RDUSRINVID
* RDWIDTH
* FRAMEWORK

## 2. Importing Libraries <a class="anchor" id="header2"></a>

Importing libraries to support data cleaning and exploration.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

## 3. Importing the Data <a class="anchor" id="header3"></a>

In [2]:
#importing the data
crash_df = pd.read_csv('CGR_Crash_Data.csv')

In [6]:
#checking data shape
crash_df.shape

(74309, 142)

The dataset has 74,309 rows and 142 columns.

In [3]:
#previewing first 5 rows
crash_df.head()

| | X | Y | OBJECTID | ROADSOFTID | BIKE | CITY | COUNTY | CRASHDATE | CRASHSEVER | CRASHTYPE | ... | RDLGLCODE | RDNFC | RDNFCCODE | RDNUMLANES | RDSUBTYPDS | RDSUBTYPE | RDSURFTYPE | RDUSRINVID | RDWIDTH | FRAMEWORK |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | -85.650003 | 42.919854 | 1 | 2589528 | No | Grand Rapids | Kent | 2008/06/16 | Property Damage Only | Backing | ... | 5 | Local | 7 | 2 | Asphalt-Standard | 35 | Asphalt | | 26.0 | 17 |
| 1 | -85.625665 | 42.92471 | 2 | 2593183 | No | Grand Rapids | Kent | 2008/08/30 | Property Damage Only | Fixed Object | ... | 5 | Local | 7 | 2 | Asphalt-Standard | 35 | Asphalt | | 26.0 | 17 |
| 2 | -85.655282 | 43.000972 | 3 | 2582102 | No | Grand Rapids | Kent | 2008/02/13 | Property Damage Only | Other Driveway | ... | 5 | Local | 7 | 2 | Asphalt-Standard | 35 | Asphalt | | 29.0 | 17 |
| 3 | -85.643314 | 42.928172 | 4 | 2579820 | No | Grand Rapids | Kent | 2008/01/25 | Property Damage Only | Angle Straight | ... | 5 | Local | 7 | 2 | Asphalt-Standard | 35 | Asphalt | | 27.0 | 17 |
| 4 | -85.665571 | 42.968854 | 5 | 2594624 | No | Grand Rapids | Kent | 2008/09/26 | Property Damage Only | Backing | ... | 4 | Local | 7 | 2 | Asphalt-Standard | 35 | Asphalt | | 30.0 | 17 |
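
The preview above replaces the middle columns with `...` because pandas limits how many columns it renders. As a minimal sketch, the `display.max_columns` option can be lifted temporarily to show every column. The frame `wide_df` and its `COL0`..`COL29` names are hypothetical stand-ins for `crash_df`, used only so the example runs without the CSV file.

```python
import pandas as pd

# Hypothetical stand-in for crash_df: one row, 30 columns.
wide_df = pd.DataFrame([list(range(30))], columns=[f"COL{i}" for i in range(30)])

# With a small column limit, pandas elides the middle columns with "...".
with pd.option_context("display.max_columns", 10):
    truncated_text = repr(wide_df)

# With the limit removed (None), every column is rendered.
with pd.option_context("display.max_columns", None):
    full_text = repr(wide_df)

print(full_text)
```

Using `pd.option_context` keeps the change local to the `with` block, so later cells keep the default compact display.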


In [4]:
#previewing last 5 rows
crash_df.tail()

| | X | Y | OBJECTID | ROADSOFTID | BIKE | CITY | COUNTY | CRASHDATE | CRASHSEVER | CRASHTYPE | ... | RDLGLCODE | RDNFC | RDNFCCODE | RDNUMLANES | RDSUBTYPDS | RDSUBTYPE | RDSURFTYPE | RDUSRINVID | RDWIDTH | FRAMEWORK |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 74304 | -85.68841 | 42.997895 | 74305 | 2558829 | No | Grand Rapids | Kent | 2017/03/20 | Fatal | Other Driveway | ... | 4 | Other Principal Arterial | 3 | 2 | Asphalt-Standard | 35 | Asphalt | | 42.0 | 17 |
| 74305 | -85.66167 | 42.94136 | 74306 | 2574652 | No | Grand Rapids | Kent | 2017/11/18 | Fatal | Pedestrian | ... | 4 | Minor Arterial | 4 | 2 | Asphalt-Standard | 35 | Asphalt | | 42.0 | 17 |
| 74306 | -85.661243 | 42.939724 | 74307 | 2563322 | No | Grand Rapids | Kent | 2017/06/02 | Fatal | Side-Swipe Same | ... | 5 | Local | 7 | 2 | Asphalt-Standard | 35 | Asphalt | | 30.0 | 17 |
| 74307 | -85.584837 | 42.971042 | 74308 | 2568387 | No | Grand Rapids | Kent | 2017/08/23 | Fatal | Pedestrian | ... | 3 | Local | 7 | 2 | Undefined | 30 | Undefined | | 26.1 | 17 |
| 74308 | -85.676419 | 42.984836 | 74309 | 2558314 | No | Grand Rapids | Kent | 2017/03/18 | Fatal | Angle Straight | ... | 1 | Major Collector | 5 | 2 | Composite | 36 | Composite | | 46.0 | 17 |
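
In the previews above, `CRASHDATE` holds values such as `2008/06/16`, a year/month/day layout separated by slashes. As a hedged sketch of one possible cleaning step, the strings can be parsed with `pd.to_datetime`. The small `dates_df` frame is a hypothetical stand-in built from three values copied from the previews, and it assumes the column is read in as text.

```python
import pandas as pd

# Three CRASHDATE values copied from the previews; dates_df is a stand-in.
dates_df = pd.DataFrame({"CRASHDATE": ["2008/06/16", "2008/08/30", "2017/03/20"]})

# Parse the YYYY/MM/DD strings into datetime64 values.
dates_df["CRASHDATE"] = pd.to_datetime(dates_df["CRASHDATE"], format="%Y/%m/%d")

# Date parts are now available through the .dt accessor.
print(dates_df["CRASHDATE"].dt.year.tolist())  # [2008, 2008, 2017]
```

Passing an explicit `format` avoids ambiguity between day and month and makes the parse fail loudly if a value does not match the expected layout.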


In [5]:
#previewing sample of data
crash_df.sample(10)

| | X | Y | OBJECTID | ROADSOFTID | BIKE | CITY | COUNTY | CRASHDATE | CRASHSEVER | CRASHTYPE | ... | RDLGLCODE | RDNFC | RDNFCCODE | RDNUMLANES | RDSUBTYPDS | RDSUBTYPE | RDSURFTYPE | RDUSRINVID | RDWIDTH | FRAMEWORK |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 14785 | -85.568654 | 42.912593 | 14786 | 2623512 | No | Grand Rapids | Kent | 2010/02/27 | Property Damage Only | Rear End Straight | ... | 4 | Minor Arterial | 4 | 2 | Asphalt-Standard | 35 | Asphalt | | 52.0 | 17 |
| 19879 | -85.622045 | 42.962757 | 19880 | 2639272 | No | Grand Rapids | Kent | 2010/12/04 | Injury | Angle Turn | ... | 4 | Other Principal Arterial | 3 | 2 | Asphalt-Standard | 35 | Asphalt | | 40.0 | 17 |
| 42529 | -85.59072 | 42.972756 | 42530 | 2714566 | No | Grand Rapids | Kent | 2014/05/14 | Property Damage Only | Rear End Straight | ... | 1 | Other Principal Arterial | 3 | 5 | Concrete-Standard | 37 | Concrete | | 0.0 | 17 |
| 6398 | -85.680666 | 42.934679 | 6399 | 2597438 | No | Grand Rapids | Kent | 2008/11/10 | Injury | Fixed Object | ... | 1 | Other Freeway | 2 | 3 | Concrete-Standard | 37 | Concrete | | 0.0 | 17 |
| 44918 | -85.687986 | 42.975808 | 44919 | 2712147 | No | Grand Rapids | Kent | 2014/04/30 | Property Damage Only | Rear End Straight | ... | 4 | Minor Arterial | 4 | 2 | Asphalt-Standard | 35 | Asphalt | | 38.0 | 17 |
| 42988 | -85.705958 | 42.952631 | 42989 | 2722933 | No | Grand Rapids | Kent | 2014/11/07 | Property Damage Only | Other Object | ... | 1 | Interstate | 1 | 2 | Concrete-Standard | 37 | Concrete | | 0.0 | 17 |
| 71523 | -85.681565 | 42.927816 | 71524 | 2576647 | No | Grand Rapids | Kent | 2017/12/26 | Property Damage Only | Rear End Straight | ... | 4 | Other Principal Arterial | 3 | 2 | Concrete-Standard | 37 | Concrete | | 55.0 | 17 |
| 55180 | -85.639901 | 42.998972 | 55181 | 2732858 | No | Grand Rapids | Kent | 2015/04/26 | Property Damage Only | Misc. Multiple Vehicle | ... | 4 | Other Principal Arterial | 3 | 2 | Asphalt-Standard | 35 | Asphalt | | 50.0 | 17 |
| 29042 | -85.678832 | 42.963738 | 29043 | 2664091 | No | Grand Rapids | Kent | 2012/02/16 | Property Damage Only | Side-Swipe Same | ... | 1 | Other Freeway | 2 | 4 | Concrete-Standard | 37 | Concrete | | 0.0 | 17 |
| 3623 | -85.629283 | 42.975378 | 3624 | 2597613 | No | Grand Rapids | Kent | 2008/11/24 | Property Damage Only | Fixed Object | ... | 4 | Major Collector | 5 | 2 | Asphalt-Standard | 35 | Asphalt | | 32.0 | 17 |
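
Because `sample(10)` draws rows at random, rerunning the cell returns different rows each time. If a repeatable preview is wanted, one option is the `random_state` parameter of `DataFrame.sample`. This is a minimal sketch. `demo_df` is a hypothetical stand-in for `crash_df`, and the seed value 42 is arbitrary.

```python
import pandas as pd

# Hypothetical stand-in for crash_df with 100 rows.
demo_df = pd.DataFrame({"OBJECTID": range(1, 101)})

# Using the same seed returns the same rows on every run.
first_draw = demo_df.sample(10, random_state=42)
second_draw = demo_df.sample(10, random_state=42)

print(first_draw.equals(second_draw))  # True
```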


In [8]:
print(crash_df.columns.tolist())

['X', 'Y', 'OBJECTID', 'ROADSOFTID', 'BIKE', 'CITY', 'COUNTY', 'CRASHDATE', 'CRASHSEVER', 'CRASHTYPE', 'WORKZNEACT', 'WORKZNECLO', 'WORKZNETYP', 'CTRLMILEPT', 'CTRLSECT', 'DAYOFMONTH', 'DAYOFWEEK', 'ANIMAL', 'D1COND', 'D1DRINKIN', 'D1HAZACT', 'D1INJURY', 'D1INTENT', 'D2COND', 'D2DRINKIN', 'D2HAZACT', 'D2INJURY', 'D2INTENT', 'D3COND', 'D3DRINKIN', 'D3HAZACT', 'D3INJURY', 'D3INTENT', 'DRINKING', 'DRIVER1AGE', 'DRIVER1SEX', 'DRIVER2AGE', 'DRIVER2SEX', 'DRIVER3AGE', 'DRIVER3SEX', 'EMRGVEH', 'FARMEQUIP', 'FLEEINGSIT', 'FWSEGID', 'GRTINJSEVE', 'HITANDRUN', 'HOUR', 'INTERNAME', 'LIGHTING', 'MDOTREG', 'MILEPOINT', 'MONTH', 'MOTORCYCLE', 'NOATYPEINJ', 'NOBTYPEINJ', 'NOCTYPEINJ', 'NONTRAFFIC', 'NUMOFINJ', 'NUMOFKILL', 'NUMOFOCCUP', 'NUMOFUNINJ', 'NUMOFVEHIC', 'ORV', 'PEDESTRIAN', 'PRNAME', 'PRNO', 'PUBLICPROP', 'REFDIR', 'REFDIST', 'ROUTECLASS', 'ROUTENUM', 'SCHOOLBUS', 'SNOWMOBILE', 'SPDLMTPOST', 'SPEEDLIMIT', 'SURFCOND', 'TRAFCTLDEV', 'TRAIN', 'TRUCKBUS', 'TRUNKLINE', 'UD10NUM', 'V1DEFECT', 'V
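
The printed column list includes `X`, `Y`, and `PUBLICPROP`, which do not appear in the data dictionary in section 1.1. A comparison like the one sketched below can surface such gaps. The lists here are abbreviated, hypothetical excerpts, with `demo_df` standing in for `crash_df`. In practice the full dictionary and `crash_df.columns` would be used.

```python
import pandas as pd

# Abbreviated excerpt of the data dictionary (assumption: kept as a list).
dictionary_fields = ["OBJECTID", "ROADSOFTID", "BIKE", "PRNO", "REFDIR"]

# Empty stand-in frame carrying a few column names from the printed list.
demo_df = pd.DataFrame(
    columns=["X", "Y", "OBJECTID", "ROADSOFTID", "BIKE", "PRNO", "PUBLICPROP", "REFDIR"]
)

# Columns present in the data but not documented, in file order.
undocumented = [col for col in demo_df.columns if col not in dictionary_fields]

# Dictionary entries that have no matching column in the data.
missing = [name for name in dictionary_fields if name not in demo_df.columns]

print(undocumented)  # ['X', 'Y', 'PUBLICPROP']
print(missing)       # []
```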

## References

* [Showing all columns as a list](https://www.statology.org/pandas-show-all-columns/)