# Project 5

### Table of Contents

* [Problem Statement](#Problem-Statement)
* [Data Dictionary](#Data-Dictionary)
* [2019 Crash Data](#2019-Crash-Data)
* [2020 Crash Data](#2020-Crash-Data)

### Problem Statement 

Is there a day of the week and/or time of day that correlates with car crashes in Chicago?

Data source: https://data.cityofchicago.org/Transportation/Traffic-Crashes-Crashes/85ca-t3if

Project repositories:

* https://git.generalassemb.ly/DSI-R-Metropolis/project_5
* https://git.generalassemb.ly/dbvaddi/Project5

### Data Dictionary



|Feature|Type|Dataset|Description|
|---|---|---|---|

Imports

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

pd.set_option('display.max_columns', None)
pd.set_option('display.max_colwidth', 55)

### 2019 Crash Data

In [2]:
df_19 = pd.read_csv('Traffic_Crashes__Crashes_2019.csv')

In [3]:
df_19.shape

(117427, 49)

Checking for null values.

In [4]:
df_19.isnull().sum()

CRASH_RECORD_ID                       0
RD_NO                                 0
CRASH_DATE_EST_I                 108787
CRASH_DATE                            0
POSTED_SPEED_LIMIT                    0
TRAFFIC_CONTROL_DEVICE                0
DEVICE_CONDITION                      0
WEATHER_CONDITION                     0
LIGHTING_CONDITION                    0
FIRST_CRASH_TYPE                      0
TRAFFICWAY_TYPE                       0
LANE_CNT                          98175
ALIGNMENT                             0
ROADWAY_SURFACE_COND                  0
ROAD_DEFECT                           0
REPORT_TYPE                        3609
CRASH_TYPE                            0
INTERSECTION_RELATED_I            89521
NOT_RIGHT_OF_WAY_I               111617
HIT_AND_RUN_I                     84161
DAMAGE                                0
DATE_POLICE_NOTIFIED                  0
PRIM_CONTRIBUTORY_CAUSE               0
SEC_CONTRIBUTORY_CAUSE                0
STREET_NO                             0


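Dropping columns that are mostly null. Among those visible in the output above, null shares range from about 72% ('HIT_AND_RUN_I') to about 95% ('NOT_RIGHT_OF_WAY_I').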

In [5]:
df_19.drop(columns = ['PHOTOS_TAKEN_I', 'STATEMENTS_TAKEN_I', 'DOORING_I', 'WORK_ZONE_I', 'WORK_ZONE_TYPE', 
                       'WORKERS_PRESENT_I', 'HIT_AND_RUN_I', 'INTERSECTION_RELATED_I', 'NOT_RIGHT_OF_WAY_I',
                       'LANE_CNT', 'CRASH_DATE_EST_I'], inplace = True)
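
The list of sparse columns above is typed by hand. As a minimal sketch, the same selection can be derived from the null share of each column. The toy frame, its values, and the 0.7 cutoff are assumptions made for illustration, not values taken from the notebook.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for df_19; values are illustrative only.
toy = pd.DataFrame({
    "RD_NO": ["A1", "A2", "A3", "A4", "A5"],
    "LANE_CNT": [np.nan, np.nan, np.nan, np.nan, 2.0],                      # 80% null
    "REPORT_TYPE": ["ON SCENE", None, "ON SCENE", "ON SCENE", "ON SCENE"],  # 20% null
})

NULL_THRESHOLD = 0.7  # assumed cutoff, not taken from the notebook

# isnull().mean() gives the fraction of missing values in each column.
null_share = toy.isnull().mean()
sparse_cols = null_share[null_share > NULL_THRESHOLD].index.tolist()

# Drop only the columns whose null share exceeds the cutoff.
toy_trimmed = toy.drop(columns=sparse_cols)

print(null_share.round(2))
print(sparse_cols)
```

Deriving the list this way keeps the stated threshold and the dropped columns in sync if the data is refreshed.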

In [6]:
df_19.isnull().sum().sort_values()

CRASH_RECORD_ID                     0
CRASH_MONTH                         0
CRASH_DAY_OF_WEEK                   0
CRASH_HOUR                          0
NUM_UNITS                           0
BEAT_OF_OCCURRENCE                  0
STREET_NAME                         0
STREET_DIRECTION                    0
STREET_NO                           0
PRIM_CONTRIBUTORY_CAUSE             0
DATE_POLICE_NOTIFIED                0
DAMAGE                              0
CRASH_TYPE                          0
SEC_CONTRIBUTORY_CAUSE              0
ROAD_DEFECT                         0
ROADWAY_SURFACE_COND                0
ALIGNMENT                           0
TRAFFICWAY_TYPE                     0
RD_NO                               0
FIRST_CRASH_TYPE                    0
LIGHTING_CONDITION                  0
WEATHER_CONDITION                   0
DEVICE_CONDITION                    0
TRAFFIC_CONTROL_DEVICE              0
CRASH_DATE                          0
POSTED_SPEED_LIMIT                  0
INJURIES_UNK

Dropping 'INJURIES_UNKNOWN' since all values are 0.

In [7]:
df_19.drop(columns = ['INJURIES_UNKNOWN'], inplace = True)
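
A column whose values are all identical can also be found programmatically. The sketch below uses a made-up two-column frame; only the column names come from the dataset.

```python
import pandas as pd

# Toy frame; the values are invented for illustration.
toy = pd.DataFrame({
    "INJURIES_TOTAL": [0, 1, 0, 2],
    "INJURIES_UNKNOWN": [0, 0, 0, 0],  # constant, so it carries no information
})

# nunique(dropna=False) counts distinct values, treating NaN as a value;
# a count of 1 means every row holds the same thing.
constant_cols = [col for col in toy.columns if toy[col].nunique(dropna=False) == 1]

toy = toy.drop(columns=constant_cols)
print(constant_cols)
```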

Dropping irrelevant columns.

In [8]:
df_19.drop(columns = ['REPORT_TYPE', 'CRASH_RECORD_ID', 'DATE_POLICE_NOTIFIED', 'STREET_NAME'],
           inplace = True)

Dropping columns 'LATITUDE', 'LONGITUDE', and 'LOCATION' since their numerical values represent neither magnitude nor order.

In [9]:
df_19.drop(columns = ['LATITUDE','LONGITUDE','LOCATION'], inplace = True)

Dropping 'CRASH_DATE', since we have the hour, day of the week and month for aggregation.

In [10]:
df_19.drop(columns = ['CRASH_DATE'], inplace = True)
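
For reference, the three time columns can be reproduced from a timestamp. This is a sketch under two assumptions: that 'CRASH_DATE' is stored as `MM/DD/YYYY HH:MM:SS AM/PM` text, and that 'CRASH_DAY_OF_WEEK' is coded Sunday=1 through Saturday=7. The second assumption is consistent with the sample rows shown later (hour 0, day 3, month 1, matching Tuesday, January 1, 2019), but neither is confirmed by the notebook. The timestamps below are made up.

```python
import pandas as pd

# Hypothetical timestamps in the assumed text format.
toy = pd.DataFrame({"CRASH_DATE": ["01/01/2019 12:15:00 AM", "01/06/2019 05:30:00 PM"]})

ts = pd.to_datetime(toy["CRASH_DATE"], format="%m/%d/%Y %I:%M:%S %p")

toy["CRASH_HOUR"] = ts.dt.hour
toy["CRASH_MONTH"] = ts.dt.month
# pandas codes Monday=0 ... Sunday=6; shift to the assumed Sunday=1 ... Saturday=7.
toy["CRASH_DAY_OF_WEEK"] = (ts.dt.dayofweek + 1) % 7 + 1

print(toy)
```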

Dropping 'DAMAGE', 'WEATHER_CONDITION', 'TRAFFIC_CONTROL_DEVICE', 'STREET_DIRECTION', and 'SEC_CONTRIBUTORY_CAUSE' since none of them provides a helpful correlation.

In [11]:
df_19.drop(columns = ['DAMAGE', 'WEATHER_CONDITION', 'TRAFFIC_CONTROL_DEVICE', 'STREET_DIRECTION',
                      'SEC_CONTRIBUTORY_CAUSE'], inplace = True)

Filling in nulls with 0.

In [12]:
injury_cols = ['INJURIES_NO_INDICATION', 'INJURIES_REPORTED_NOT_EVIDENT', 'INJURIES_TOTAL',
               'INJURIES_INCAPACITATING', 'INJURIES_FATAL', 'INJURIES_NON_INCAPACITATING']

# Fill with the integer 0 (not the string '0') so the columns stay numeric.
df_19[injury_cols] = df_19[injury_cols].fillna(0)
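
A note on the fill value for count columns like these: passing the string `'0'` to `fillna` instead of the integer `0` leaves a numeric column holding a mix of numbers and text, which breaks sums and comparisons later on. The sketch below shows the difference on a made-up three-row column.

```python
import numpy as np
import pandas as pd

# Toy column with one missing value; the numbers are invented.
toy = pd.DataFrame({"INJURIES_TOTAL": [1.0, np.nan, 2.0]})

as_string = toy["INJURIES_TOTAL"].fillna("0")  # mixes floats with a str
as_number = toy["INJURIES_TOTAL"].fillna(0)    # stays numeric

print(as_string.dtype)
print(as_number.dtype)
print(as_number.sum())
```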

Filling nulls with 'NO INDICATION OF INJURY' for column 'MOST_SEVERE_INJURY' since that is the most common value and most likely to be the case if nothing was reported.

In [13]:
df_19['MOST_SEVERE_INJURY'] = df_19['MOST_SEVERE_INJURY'].fillna('NO INDICATION OF INJURY')

Dropping rows that still contain null values.

In [14]:
df_19.dropna(inplace = True)

In [15]:
df_19['PRIM_CONTRIBUTORY_CAUSE'].value_counts()

UNABLE TO DETERMINE                                                                 43005
FAILING TO YIELD RIGHT-OF-WAY                                                       12913
FOLLOWING TOO CLOSELY                                                               12137
NOT APPLICABLE                                                                       6673
IMPROPER OVERTAKING/PASSING                                                          5594
FAILING TO REDUCE SPEED TO AVOID CRASH                                               5553
IMPROPER BACKING                                                                     5092
IMPROPER LANE USAGE                                                                  4466
IMPROPER TURNING/NO SIGNAL                                                           4099
DRIVING SKILLS/KNOWLEDGE/EXPERIENCE                                                  3492
WEATHER                                                                              2084
DISREGARDI

In [16]:
df_19.head()

Unnamed: 0,RD_NO,POSTED_SPEED_LIMIT,DEVICE_CONDITION,LIGHTING_CONDITION,FIRST_CRASH_TYPE,TRAFFICWAY_TYPE,ALIGNMENT,ROADWAY_SURFACE_COND,ROAD_DEFECT,CRASH_TYPE,PRIM_CONTRIBUTORY_CAUSE,STREET_NO,BEAT_OF_OCCURRENCE,NUM_UNITS,MOST_SEVERE_INJURY,INJURIES_TOTAL,INJURIES_FATAL,INJURIES_INCAPACITATING,INJURIES_NON_INCAPACITATING,INJURIES_REPORTED_NOT_EVIDENT,INJURIES_NO_INDICATION,CRASH_HOUR,CRASH_DAY_OF_WEEK,CRASH_MONTH
0,JC111663,35,NO CONTROLS,UNKNOWN,PARKED MOTOR VEHICLE,NOT DIVIDED,STRAIGHT AND LEVEL,UNKNOWN,NO DEFECTS,NO INJURY / DRIVE AWAY,UNABLE TO DETERMINE,1600,235,2,NO INDICATION OF INJURY,0,0,0,0,0,1,0,3,1
1,JC100635,30,NO CONTROLS,DARKNESS,PARKED MOTOR VEHICLE,ONE-WAY,STRAIGHT AND LEVEL,WET,NO DEFECTS,NO INJURY / DRIVE AWAY,UNABLE TO DETERMINE,3332,1123,2,NO INDICATION OF INJURY,0,0,0,0,0,1,0,3,1
2,JC100780,30,NO CONTROLS,"DARKNESS, LIGHTED ROAD",PARKED MOTOR VEHICLE,DIVIDED - W/MEDIAN (NOT RAISED),STRAIGHT AND LEVEL,UNKNOWN,UNKNOWN,INJURY AND / OR TOW DUE TO CRASH,UNABLE TO DETERMINE,1649,1922,2,NO INDICATION OF INJURY,0,0,0,0,0,1,0,3,1
3,JC100033,20,FUNCTIONING PROPERLY,"DARKNESS, LIGHTED ROAD",TURNING,ONE-WAY,STRAIGHT AND LEVEL,WET,NO DEFECTS,NO INJURY / DRIVE AWAY,IMPROPER OVERTAKING/PASSING,188,122,2,NO INDICATION OF INJURY,0,0,0,0,0,2,0,3,1
4,JC100171,30,NO CONTROLS,"DARKNESS, LIGHTED ROAD",PARKED MOTOR VEHICLE,ONE-WAY,STRAIGHT AND LEVEL,DRY,UNKNOWN,NO INJURY / DRIVE AWAY,"OPERATING VEHICLE IN ERRATIC, RECKLESS, CARELESS, N...",4054,1624,3,NO INDICATION OF INJURY,0,0,0,0,0,5,0,3,1


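Mapping the target variable. 'MOST_SEVERE_INJURY' becomes binary: 1 for fatal, incapacitating, and non-incapacitating injuries, and 0 for 'REPORTED, NOT EVIDENT' and 'NO INDICATION OF INJURY'.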

In [17]:
df_19['MOST_SEVERE_INJURY'].value_counts()

NO INDICATION OF INJURY     101146
NONINCAPACITATING INJURY      9190
REPORTED, NOT EVIDENT         4981
INCAPACITATING INJURY         2008
FATAL                          102
Name: MOST_SEVERE_INJURY, dtype: int64

In [18]:
df_19['MOST_SEVERE_INJURY'] = df_19['MOST_SEVERE_INJURY'].map({'FATAL': 1, 
                                                               'INCAPACITATING INJURY': 1, 
                                                               'NONINCAPACITATING INJURY': 1, 
                                                               'REPORTED, NOT EVIDENT': 0, 
                                                               'NO INDICATION OF INJURY': 0})

In [19]:
df_19['MOST_SEVERE_INJURY'].value_counts()

0    106127
1     11300
Name: MOST_SEVERE_INJURY, dtype: int64
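
With the counts above, the positive class is 11300 of 117427 rows, or about 9.6%, so the target is imbalanced. The sketch below repeats the same mapping on a made-up five-row series and adds a guard, since `Series.map` returns NaN for any label missing from the dictionary. The guard and the toy series are additions for illustration, not part of the notebook.

```python
import pandas as pd

# Toy labels; the real column has 117427 rows.
severity = pd.Series([
    "NO INDICATION OF INJURY", "FATAL", "REPORTED, NOT EVIDENT",
    "INCAPACITATING INJURY", "NO INDICATION OF INJURY",
], name="MOST_SEVERE_INJURY")

injury_map = {
    "FATAL": 1,
    "INCAPACITATING INJURY": 1,
    "NONINCAPACITATING INJURY": 1,
    "REPORTED, NOT EVIDENT": 0,
    "NO INDICATION OF INJURY": 0,
}

target = severity.map(injury_map)

# A label absent from injury_map would show up here as NaN.
if target.isna().any():
    raise ValueError("unmapped severity label found")

target = target.astype(int)

# normalize=True reports class shares instead of raw counts.
print(target.value_counts(normalize=True))
```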

Dummifying the remaining categorical columns (every non-numeric column except the 'RD_NO' identifier).

In [20]:
dum = pd.get_dummies(df_19, columns = ['DEVICE_CONDITION',  
                                       'LIGHTING_CONDITION',
                                       'FIRST_CRASH_TYPE',
                                       'TRAFFICWAY_TYPE',
                                       'ALIGNMENT',
                                       'ROADWAY_SURFACE_COND',
                                       'ROAD_DEFECT',
                                       'CRASH_TYPE',
                                       'PRIM_CONTRIBUTORY_CAUSE'])

In [21]:
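# With no `on`, merge joins on all shared columns, keeping the original categoricals next to their dummies.
df_19 = df_19.merge(dum)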

In [22]:
df_19.shape

(117427, 138)
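
Calling `merge` without `on` joins on every column the two frames have in common. That works here because the row count stays at 117427, but rows that are identical across all shared columns would be multiplied by such a join. One alternative, sketched below on a made-up three-row frame, is to encode only the categorical columns and attach the result with `pd.concat`, which aligns on the row index and needs no join keys. The name `cat_cols` is introduced for this example.

```python
import pandas as pd

# Toy frame; only the column names come from the dataset.
toy = pd.DataFrame({
    "RD_NO": ["A1", "A2", "A3"],
    "POSTED_SPEED_LIMIT": [30, 30, 35],
    "ALIGNMENT": ["STRAIGHT AND LEVEL", "CURVE, LEVEL", "STRAIGHT AND LEVEL"],
})

cat_cols = ["ALIGNMENT"]  # hypothetical list of columns to encode

# Encoding a sub-frame returns only the indicator columns, prefixed by column name.
dummies = pd.get_dummies(toy[cat_cols])

# concat along axis=1 lines rows up by index, so the row count cannot change.
combined = pd.concat([toy, dummies], axis=1)

print(combined.shape)
print(list(combined.columns))
```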

In [23]:
df_19.head()

Unnamed: 0,RD_NO,POSTED_SPEED_LIMIT,DEVICE_CONDITION,LIGHTING_CONDITION,FIRST_CRASH_TYPE,TRAFFICWAY_TYPE,ALIGNMENT,ROADWAY_SURFACE_COND,ROAD_DEFECT,CRASH_TYPE,PRIM_CONTRIBUTORY_CAUSE,STREET_NO,BEAT_OF_OCCURRENCE,NUM_UNITS,MOST_SEVERE_INJURY,INJURIES_TOTAL,INJURIES_FATAL,INJURIES_INCAPACITATING,INJURIES_NON_INCAPACITATING,INJURIES_REPORTED_NOT_EVIDENT,INJURIES_NO_INDICATION,CRASH_HOUR,CRASH_DAY_OF_WEEK,CRASH_MONTH,DEVICE_CONDITION_FUNCTIONING IMPROPERLY,DEVICE_CONDITION_FUNCTIONING PROPERLY,DEVICE_CONDITION_MISSING,DEVICE_CONDITION_NO CONTROLS,DEVICE_CONDITION_NOT FUNCTIONING,DEVICE_CONDITION_OTHER,DEVICE_CONDITION_UNKNOWN,DEVICE_CONDITION_WORN REFLECTIVE MATERIAL,LIGHTING_CONDITION_DARKNESS,"LIGHTING_CONDITION_DARKNESS, LIGHTED ROAD",LIGHTING_CONDITION_DAWN,LIGHTING_CONDITION_DAYLIGHT,LIGHTING_CONDITION_DUSK,LIGHTING_CONDITION_UNKNOWN,FIRST_CRASH_TYPE_ANGLE,FIRST_CRASH_TYPE_ANIMAL,FIRST_CRASH_TYPE_FIXED OBJECT,FIRST_CRASH_TYPE_HEAD ON,FIRST_CRASH_TYPE_OTHER NONCOLLISION,FIRST_CRASH_TYPE_OTHER OBJECT,FIRST_CRASH_TYPE_OVERTURNED,FIRST_CRASH_TYPE_PARKED MOTOR VEHICLE,FIRST_CRASH_TYPE_PEDALCYCLIST,FIRST_CRASH_TYPE_PEDESTRIAN,FIRST_CRASH_TYPE_REAR END,FIRST_CRASH_TYPE_REAR TO FRONT,FIRST_CRASH_TYPE_REAR TO REAR,FIRST_CRASH_TYPE_REAR TO SIDE,FIRST_CRASH_TYPE_SIDESWIPE OPPOSITE DIRECTION,FIRST_CRASH_TYPE_SIDESWIPE SAME DIRECTION,FIRST_CRASH_TYPE_TRAIN,FIRST_CRASH_TYPE_TURNING,TRAFFICWAY_TYPE_ALLEY,TRAFFICWAY_TYPE_CENTER TURN LANE,TRAFFICWAY_TYPE_DIVIDED - W/MEDIAN (NOT RAISED),TRAFFICWAY_TYPE_DIVIDED - W/MEDIAN BARRIER,TRAFFICWAY_TYPE_DRIVEWAY,"TRAFFICWAY_TYPE_FIVE POINT, OR MORE",TRAFFICWAY_TYPE_FOUR WAY,TRAFFICWAY_TYPE_L-INTERSECTION,TRAFFICWAY_TYPE_NOT DIVIDED,TRAFFICWAY_TYPE_NOT REPORTED,TRAFFICWAY_TYPE_ONE-WAY,TRAFFICWAY_TYPE_OTHER,TRAFFICWAY_TYPE_PARKING LOT,TRAFFICWAY_TYPE_RAMP,TRAFFICWAY_TYPE_ROUNDABOUT,TRAFFICWAY_TYPE_T-INTERSECTION,TRAFFICWAY_TYPE_TRAFFIC ROUTE,TRAFFICWAY_TYPE_UNKNOWN,TRAFFICWAY_TYPE_UNKNOWN INTERSECTION TYPE,TRAFFICWAY_TYPE_Y-INTERSECTION,ALIGNMENT_CURVE ON GRADE,ALIGNMENT_CURVE ON HILLCREST,"ALIGNMENT_CURVE, LEVEL",ALIGNMENT_STRAIGHT AND LEVEL,ALIGNMENT_STRAIGHT ON GRADE,ALIGNMENT_STRAIGHT ON HILLCREST,ROADWAY_SURFACE_COND_DRY,ROADWAY_SURFACE_COND_ICE,ROADWAY_SURFACE_COND_OTHER,"ROADWAY_SURFACE_COND_SAND, MUD, DIRT",ROADWAY_SURFACE_COND_SNOW OR SLUSH,ROADWAY_SURFACE_COND_UNKNOWN,ROADWAY_SURFACE_COND_WET,ROAD_DEFECT_DEBRIS ON ROADWAY,ROAD_DEFECT_NO DEFECTS,ROAD_DEFECT_OTHER,"ROAD_DEFECT_RUT, HOLES",ROAD_DEFECT_SHOULDER DEFECT,ROAD_DEFECT_UNKNOWN,ROAD_DEFECT_WORN SURFACE,CRASH_TYPE_INJURY AND / OR TOW DUE TO CRASH,CRASH_TYPE_NO INJURY / DRIVE AWAY,PRIM_CONTRIBUTORY_CAUSE_ANIMAL,PRIM_CONTRIBUTORY_CAUSE_BICYCLE ADVANCING LEGALLY ON RED LIGHT,PRIM_CONTRIBUTORY_CAUSE_CELL PHONE USE OTHER THAN TEXTING,PRIM_CONTRIBUTORY_CAUSE_DISREGARDING OTHER TRAFFIC SIGNS,PRIM_CONTRIBUTORY_CAUSE_DISREGARDING ROAD MARKINGS,PRIM_CONTRIBUTORY_CAUSE_DISREGARDING STOP SIGN,PRIM_CONTRIBUTORY_CAUSE_DISREGARDING TRAFFIC SIGNALS,PRIM_CONTRIBUTORY_CAUSE_DISREGARDING YIELD SIGN,PRIM_CONTRIBUTORY_CAUSE_DISTRACTION - FROM INSIDE VEHICLE,PRIM_CONTRIBUTORY_CAUSE_DISTRACTION - FROM OUTSIDE VEHICLE,"PRIM_CONTRIBUTORY_CAUSE_DISTRACTION - OTHER ELECTRONIC DEVICE (NAVIGATION DEVICE, DVD PLAYER, ETC.)",PRIM_CONTRIBUTORY_CAUSE_DRIVING ON WRONG SIDE/WRONG WAY,PRIM_CONTRIBUTORY_CAUSE_DRIVING SKILLS/KNOWLEDGE/EXPERIENCE,PRIM_CONTRIBUTORY_CAUSE_EQUIPMENT - VEHICLE CONDITION,"PRIM_CONTRIBUTORY_CAUSE_EVASIVE ACTION DUE TO ANIMAL, OBJECT, NONMOTORIST",PRIM_CONTRIBUTORY_CAUSE_EXCEEDING AUTHORIZED SPEED LIMIT,PRIM_CONTRIBUTORY_CAUSE_EXCEEDING SAFE SPEED FOR CONDITIONS,PRIM_CONTRIBUTORY_CAUSE_FAILING TO REDUCE SPEED TO AVOID CRASH,PRIM_CONTRIBUTORY_CAUSE_FAILING TO YIELD RIGHT-OF-WAY,PRIM_CONTRIBUTORY_CAUSE_FOLLOWING TOO CLOSELY,PRIM_CONTRIBUTORY_CAUSE_HAD BEEN DRINKING (USE WHEN ARREST IS NOT MADE),PRIM_CONTRIBUTORY_CAUSE_IMPROPER BACKING,PRIM_CONTRIBUTORY_CAUSE_IMPROPER LANE USAGE,PRIM_CONTRIBUTORY_CAUSE_IMPROPER OVERTAKING/PASSING,PRIM_CONTRIBUTORY_CAUSE_IMPROPER TURNING/NO SIGNAL,PRIM_CONTRIBUTORY_CAUSE_MOTORCYCLE ADVANCING LEGALLY ON RED LIGHT,PRIM_CONTRIBUTORY_CAUSE_NOT APPLICABLE,PRIM_CONTRIBUTORY_CAUSE_OBSTRUCTED CROSSWALKS,"PRIM_CONTRIBUTORY_CAUSE_OPERATING VEHICLE IN ERRATIC, RECKLESS, CARELESS, NEGLIGENT OR AGGRESSIVE MANNER",PRIM_CONTRIBUTORY_CAUSE_PASSING STOPPED SCHOOL BUS,PRIM_CONTRIBUTORY_CAUSE_PHYSICAL CONDITION OF DRIVER,PRIM_CONTRIBUTORY_CAUSE_RELATED TO BUS STOP,PRIM_CONTRIBUTORY_CAUSE_ROAD CONSTRUCTION/MAINTENANCE,PRIM_CONTRIBUTORY_CAUSE_ROAD ENGINEERING/SURFACE/MARKING DEFECTS,PRIM_CONTRIBUTORY_CAUSE_TEXTING,PRIM_CONTRIBUTORY_CAUSE_TURNING RIGHT ON RED,PRIM_CONTRIBUTORY_CAUSE_UNABLE TO DETERMINE,PRIM_CONTRIBUTORY_CAUSE_UNDER THE INFLUENCE OF ALCOHOL/DRUGS (USE WHEN ARREST IS EFFECTED),"PRIM_CONTRIBUTORY_CAUSE_VISION OBSCURED (SIGNS, TREE LIMBS, BUILDINGS, ETC.)",PRIM_CONTRIBUTORY_CAUSE_WEATHER
0,JC111663,35,NO CONTROLS,UNKNOWN,PARKED MOTOR VEHICLE,NOT DIVIDED,STRAIGHT AND LEVEL,UNKNOWN,NO DEFECTS,NO INJURY / DRIVE AWAY,UNABLE TO DETERMINE,1600,235,2,0,0,0,0,0,0,1,0,3,1,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0
1,JC100635,30,NO CONTROLS,DARKNESS,PARKED MOTOR VEHICLE,ONE-WAY,STRAIGHT AND LEVEL,WET,NO DEFECTS,NO INJURY / DRIVE AWAY,UNABLE TO DETERMINE,3332,1123,2,0,0,0,0,0,0,1,0,3,1,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0
2,JC100780,30,NO CONTROLS,"DARKNESS, LIGHTED ROAD",PARKED MOTOR VEHICLE,DIVIDED - W/MEDIAN (NOT RAISED),STRAIGHT AND LEVEL,UNKNOWN,UNKNOWN,INJURY AND / OR TOW DUE TO CRASH,UNABLE TO DETERMINE,1649,1922,2,0,0,0,0,0,0,1,0,3,1,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0
3,JC100033,20,FUNCTIONING PROPERLY,"DARKNESS, LIGHTED ROAD",TURNING,ONE-WAY,STRAIGHT AND LEVEL,WET,NO DEFECTS,NO INJURY / DRIVE AWAY,IMPROPER OVERTAKING/PASSING,188,122,2,0,0,0,0,0,0,2,0,3,1,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,JC100171,30,NO CONTROLS,"DARKNESS, LIGHTED ROAD",PARKED MOTOR VEHICLE,ONE-WAY,STRAIGHT AND LEVEL,DRY,UNKNOWN,NO INJURY / DRIVE AWAY,"OPERATING VEHICLE IN ERRATIC, RECKLESS, CARELESS, N...",4054,1624,3,0,0,0,0,0,0,5,0,3,1,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0


##### Saving Cleaned DataFrame

In [24]:
df_19.to_csv('crash_19', index = False)

In [25]:
# for col in df_19.columns:
#     print(df_19[col].value_counts(ascending=False))

In [26]:
# correlation = df_19.corr()[['CRASH_DAY_OF_WEEK']]
# plt.figure(figsize=(12,12))
# sns.heatmap(data= correlation,
#             cmap= 'coolwarm',      
#             annot= True,
#             vmax= 1,
#             vmin= -1);
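
The commented-out heatmap correlates every column with 'CRASH_DAY_OF_WEEK', which treats the day code as a magnitude. One alternative way to look at the problem statement, sketched here on made-up rows, is to count crashes per day and per day-and-hour cell. This is an illustration of one possible approach, not the analysis the notebook goes on to perform.

```python
import pandas as pd

# Toy rows; the real frame has one row per crash.
toy = pd.DataFrame({
    "CRASH_DAY_OF_WEEK": [3, 3, 3, 6, 6, 1],
    "CRASH_HOUR": [0, 17, 17, 17, 8, 0],
})

# Crash counts per day code, ordered by day.
by_day = toy["CRASH_DAY_OF_WEEK"].value_counts().sort_index()

# Rows are days, columns are hours, cells are crash counts (0 where none occurred).
day_hour = pd.crosstab(toy["CRASH_DAY_OF_WEEK"], toy["CRASH_HOUR"])

print(by_day)
print(day_hour)
```

A table like `day_hour` can be passed to `sns.heatmap` to show which day and hour combinations have the most crashes.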