# Chicago Car Crashes

## Table Of Contents
<font size=3rem>
    
0 -**[ INTRO](#Introduction)**<br>
1 -**[ OBTAIN](#Obtain)**<br>
2 -**[ SCRUB](#Scrub)**<br>
3 -**[ EXPLORE](#Explore)**<br>
4 -**[ MODEL](#Model)**<br>
5 -**[ INTERPRET](#Interpret)**<br>
6 -**[ CONCLUSIONS & RECOMMENDATIONS](#Conclusions-&-Recommendations)**<br>
</font>
___

# Introduction
Student: Thomas Cornett

Pace: Part-Time

Instructor: Amber Yandow


In this notebook I run a multi-variable classification on the Chicago Car Crash dataset while answering a few questions:
    
    What types of conditions contribute to the most accidents (e.g. weather conditions, road type, human error)?
    Are there common areas where most accidents occur (e.g. highways, back roads, intersections)?
    Is the time of day or the day of the week a factor (e.g. do more crashes happen on weekdays during commuting hours)?
    
    
    

## Imports

In [1]:
import pandas as pd
import pandas_profiling as pp
import scipy.stats as stats
import statsmodels.api as sm
import statsmodels.stats.api as sms
import mlxtend
from scipy.stats import zscore
from statsmodels.formula.api import ols
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from mlxtend.feature_selection import SequentialFeatureSelector as SFS
from math import sqrt
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.neighbors import NearestNeighbors, KNeighborsClassifier, KNeighborsRegressor
from sklearn.model_selection import cross_val_predict, KFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from statsmodels.stats.outliers_influence import variance_inflation_factor

## Adjusting settings
Unfortunately (or not), there is a lot of data: about 2.2m rows of crash data to sift through. I am adjusting some display settings first to make it easier to look through.

In [2]:
pd.set_option('display.max_rows', 50)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)

## Functions
To keep things tidy, I define all the functions used later in the project here.

In [3]:
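def get_data():
    # Forward slashes avoid the invalid escape sequences ('\C', '\A') that
    # backslash paths produce in normal string literals; they also work on Windows.
    crash_data = pd.read_csv('Data/Chicago_car_wrecks.csv')
    vehicle = pd.read_csv('Data/Accidents_vehicles.csv')
    persons = pd.read_csv('Data/Accidents_people.csv')
    return crash_data, vehicle, persons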

In [4]:
def Cleaning(persons,vehicle,crash_data):
        columns_to_remove = ['RD_NO','UNIT_NO','CMRC_VEH_I','VEHICLE_DEFECT','TRAVEL_DIRECTION','MANEUVER','TOWED_I','FIRE_I',
                            'OCCUPANT_CNT','EXCEED_SPEED_LIMIT_I','TOWED_BY','TOWED_TO','AREA_00_I','AREA_01_I','AREA_02_I','AREA_03_I','AREA_04_I','AREA_05_I','AREA_06_I',
                            'AREA_07_I','AREA_08_I','AREA_09_I','AREA_10_I','AREA_11_I','AREA_12_I','AREA_99_I','FIRST_CONTACT_POINT','CMV_ID','USDOT_NO','CCMC_NO','ILCC_NO',
                            'COMMERCIAL_SRC','GVWR','CARRIER_NAME','CARRIER_STATE','CARRIER_CITY','HAZMAT_PLACARDS_I','HAZMAT_NAME','UN_NO','HAZMAT_PRESENT_I',
                            'HAZMAT_REPORT_I','HAZMAT_REPORT_NO','MCS_REPORT_I','MCS_REPORT_NO','HAZMAT_VIO_CAUSE_CRASH_I','MCS_VIO_CAUSE_CRASH_I','IDOT_PERMIT_NO',
                            'WIDE_LOAD_I','TRAILER1_WIDTH','TRAILER2_WIDTH','TRAILER1_LENGTH','TRAILER2_LENGTH','HOSPITAL','EMS_AGENCY','EMS_RUN_NO', 'PEDPEDAL_ACTION',
                            'PEDPEDAL_VISIBILITY','PEDPEDAL_LOCATION','CELL_PHONE_USE','INTERSECTION_RELATED_I','NOT_RIGHT_OF_WAY_I','HIT_AND_RUN_I','CRASH_DATE_EST_I',
                            'LANE_CNT','BEAT_OF_OCCURRENCE','PHOTOS_TAKEN_I','STATEMENTS_TAKEN_I','DOORING_I','WORK_ZONE_I','WORK_ZONE_TYPE','WORKERS_PRESENT_I', 'INJURIES_TOTAL',
                            'INJURIES_FATAL','INJURIES_INCAPACITATING','INJURIES_NON_INCAPACITATING','INJURIES_REPORTED_NOT_EVIDENT','INJURIES_NO_INDICATION',
                            'INJURIES_UNKNOWN','CRASH_HOUR','CRASH_DAY_OF_WEEK', 'CRASH_MONTH','SEAT_NO','HAZMAT_OUT_OF_SERVICE_I','MCS_OUT_OF_SERVICE_I','VEHICLE_ID_y',
                            'HAZMAT_CLASS','RD_NO_x','CRASH_DATE_x','VEHICLE_ID_x','RD_NO_y','CRASH_DATE_y','CRASH_UNIT_ID']
        # Fill missing values with a single placeholder. A second fillna(0)
        # would be a no-op because no NaN values remain after this step.
        persons = persons.fillna('UNKNOWN')
        vehicle = vehicle.fillna('UNKNOWN')
        crash_data = crash_data.fillna('UNKNOWN')

        # Join the three tables on the shared crash identifier.
        data = pd.merge(crash_data, vehicle, on='CRASH_RECORD_ID', how='inner')
        full_info = pd.merge(data, persons, on='CRASH_RECORD_ID', how='inner')
        clean_data = full_info.drop(columns=columns_to_remove)
        clean = clean_data.drop_duplicates()

        # Title-case the column names, e.g. CRASH_DATE -> Crash_Date.
        clean.columns = clean.columns.str.title()
        persons.columns = persons.columns.str.title()
        vehicle.columns = vehicle.columns.str.title()
        crash_data.columns = crash_data.columns.str.title()
        
        return clean,crash_data,persons,vehicle

# Obtain
I will be working with the Chicago Crash Dataset provided by the City of Chicago as part of its Vision Zero plan. It is split into three datasets: Persons, Vehicles, and Crashes. I am pulling all three so that I can get a bigger picture of what could be causing the crashes. Each dataset's online location is linked below so it is easier for others to find what I am looking at.

In [5]:
Crashes,Vehicles,Persons = get_data()



## Initial look through

### Traffic Crashes:
Dataset location: https://data.cityofchicago.org/Transportation/Traffic-Crashes-Crashes/85ca-t3if


In [6]:
display(Crashes.head())
Crashes.info()

Unnamed: 0,CRASH_RECORD_ID,RD_NO,CRASH_DATE_EST_I,CRASH_DATE,POSTED_SPEED_LIMIT,TRAFFIC_CONTROL_DEVICE,DEVICE_CONDITION,WEATHER_CONDITION,LIGHTING_CONDITION,FIRST_CRASH_TYPE,TRAFFICWAY_TYPE,LANE_CNT,ALIGNMENT,ROADWAY_SURFACE_COND,ROAD_DEFECT,REPORT_TYPE,CRASH_TYPE,INTERSECTION_RELATED_I,NOT_RIGHT_OF_WAY_I,HIT_AND_RUN_I,DAMAGE,DATE_POLICE_NOTIFIED,PRIM_CONTRIBUTORY_CAUSE,SEC_CONTRIBUTORY_CAUSE,STREET_NO,STREET_DIRECTION,STREET_NAME,BEAT_OF_OCCURRENCE,PHOTOS_TAKEN_I,STATEMENTS_TAKEN_I,DOORING_I,WORK_ZONE_I,WORK_ZONE_TYPE,WORKERS_PRESENT_I,NUM_UNITS,MOST_SEVERE_INJURY,INJURIES_TOTAL,INJURIES_FATAL,INJURIES_INCAPACITATING,INJURIES_NON_INCAPACITATING,INJURIES_REPORTED_NOT_EVIDENT,INJURIES_NO_INDICATION,INJURIES_UNKNOWN,CRASH_HOUR,CRASH_DAY_OF_WEEK,CRASH_MONTH,LATITUDE,LONGITUDE,LOCATION
0,4fd0a3e0897b3335b94cd8d5b2d2b350eb691add56c62d...,JC343143,,07/10/2019 05:56:00 PM,35,NO CONTROLS,NO CONTROLS,CLEAR,DAYLIGHT,TURNING,ONE-WAY,,STRAIGHT AND LEVEL,DRY,NO DEFECTS,ON SCENE,NO INJURY / DRIVE AWAY,,,,"OVER $1,500",07/10/2019 06:16:00 PM,IMPROPER BACKING,UNABLE TO DETERMINE,2158,N,MARMORA AVE,2515.0,,,,,,,2,NO INDICATION OF INJURY,0.0,0.0,0.0,0.0,0.0,3.0,0.0,17,4,7,41.919664,-87.773288,POINT (-87.773287883007 41.919663832993)
1,009e9e67203442370272e1a13d6ee51a4155dac65e583d...,JA329216,,06/30/2017 04:00:00 PM,35,STOP SIGN/FLASHER,FUNCTIONING PROPERLY,CLEAR,DAYLIGHT,TURNING,NOT DIVIDED,4.0,STRAIGHT AND LEVEL,DRY,NO DEFECTS,ON SCENE,INJURY AND / OR TOW DUE TO CRASH,Y,,,"OVER $1,500",06/30/2017 04:01:00 PM,FAILING TO YIELD RIGHT-OF-WAY,NOT APPLICABLE,8301,S,CICERO AVE,834.0,,,,,,,2,NO INDICATION OF INJURY,0.0,0.0,0.0,0.0,0.0,3.0,0.0,16,6,6,41.741804,-87.740954,POINT (-87.740953581987 41.741803598989)
2,ee9283eff3a55ac50ee58f3d9528ce1d689b1c4180b4c4...,JD292400,,07/10/2020 10:25:00 AM,30,TRAFFIC SIGNAL,FUNCTIONING PROPERLY,CLEAR,DAYLIGHT,REAR END,FOUR WAY,,STRAIGHT AND LEVEL,DRY,NO DEFECTS,ON SCENE,NO INJURY / DRIVE AWAY,,,,"OVER $1,500",07/10/2020 10:25:00 AM,FAILING TO YIELD RIGHT-OF-WAY,FAILING TO YIELD RIGHT-OF-WAY,1632,E,67TH ST,331.0,,,,,,,3,NO INDICATION OF INJURY,0.0,0.0,0.0,0.0,0.0,3.0,0.0,10,6,7,41.773456,-87.585022,POINT (-87.585022352022 41.773455972008)
3,f8960f698e870ebdc60b521b2a141a5395556bc3704191...,JD293602,,07/11/2020 01:00:00 AM,30,NO CONTROLS,NO CONTROLS,CLEAR,DARKNESS,PARKED MOTOR VEHICLE,DIVIDED - W/MEDIAN (NOT RAISED),,STRAIGHT AND LEVEL,DRY,NO DEFECTS,NOT ON SCENE (DESK REPORT),NO INJURY / DRIVE AWAY,,,Y,$500 OR LESS,07/11/2020 08:30:00 AM,UNABLE TO DETERMINE,UNABLE TO DETERMINE,110,E,51ST ST,224.0,,,,,,,2,NO INDICATION OF INJURY,0.0,0.0,0.0,0.0,0.0,3.0,0.0,1,7,7,41.802119,-87.622115,POINT (-87.622114914961 41.802118543011)
4,8eaa2678d1a127804ee9b8c35ddf7d63d913c14eda61d6...,JD290451,,07/08/2020 02:00:00 PM,20,NO CONTROLS,NO CONTROLS,CLEAR,DAYLIGHT,PARKED MOTOR VEHICLE,DRIVEWAY,,STRAIGHT AND LEVEL,DRY,NO DEFECTS,ON SCENE,NO INJURY / DRIVE AWAY,,,,"OVER $1,500",07/08/2020 02:15:00 PM,UNABLE TO DETERMINE,UNABLE TO DETERMINE,412,W,OHARE ST,1654.0,,,,,,,2,NO INDICATION OF INJURY,0.0,0.0,0.0,0.0,0.0,1.0,0.0,14,4,7,,,


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 471122 entries, 0 to 471121
Data columns (total 49 columns):
 #   Column                         Non-Null Count   Dtype  
---  ------                         --------------   -----  
 0   CRASH_RECORD_ID                471122 non-null  object 
 1   RD_NO                          467952 non-null  object 
 2   CRASH_DATE_EST_I               35166 non-null   object 
 3   CRASH_DATE                     471122 non-null  object 
 4   POSTED_SPEED_LIMIT             471122 non-null  int64  
 5   TRAFFIC_CONTROL_DEVICE         471122 non-null  object 
 6   DEVICE_CONDITION               471122 non-null  object 
 7   WEATHER_CONDITION              471122 non-null  object 
 8   LIGHTING_CONDITION             471122 non-null  object 
 9   FIRST_CRASH_TYPE               471122 non-null  object 
 10  TRAFFICWAY_TYPE                471122 non-null  object 
 11  LANE_CNT                       198961 non-null  float64
 12  ALIGNMENT                     

### Vehicles:
Dataset Location: https://data.cityofchicago.org/Transportation/Traffic-Crashes-Vehicles/68nd-jvt3

In [7]:
display(Vehicles.head())
Vehicles.info()

Unnamed: 0,CRASH_UNIT_ID,CRASH_RECORD_ID,RD_NO,CRASH_DATE,UNIT_NO,UNIT_TYPE,NUM_PASSENGERS,VEHICLE_ID,CMRC_VEH_I,MAKE,MODEL,LIC_PLATE_STATE,VEHICLE_YEAR,VEHICLE_DEFECT,VEHICLE_TYPE,VEHICLE_USE,TRAVEL_DIRECTION,MANEUVER,TOWED_I,FIRE_I,OCCUPANT_CNT,EXCEED_SPEED_LIMIT_I,TOWED_BY,TOWED_TO,AREA_00_I,AREA_01_I,AREA_02_I,AREA_03_I,AREA_04_I,AREA_05_I,AREA_06_I,AREA_07_I,AREA_08_I,AREA_09_I,AREA_10_I,AREA_11_I,AREA_12_I,AREA_99_I,FIRST_CONTACT_POINT,CMV_ID,USDOT_NO,CCMC_NO,ILCC_NO,COMMERCIAL_SRC,GVWR,CARRIER_NAME,CARRIER_STATE,CARRIER_CITY,HAZMAT_PLACARDS_I,HAZMAT_NAME,UN_NO,HAZMAT_PRESENT_I,HAZMAT_REPORT_I,HAZMAT_REPORT_NO,MCS_REPORT_I,MCS_REPORT_NO,HAZMAT_VIO_CAUSE_CRASH_I,MCS_VIO_CAUSE_CRASH_I,IDOT_PERMIT_NO,WIDE_LOAD_I,TRAILER1_WIDTH,TRAILER2_WIDTH,TRAILER1_LENGTH,TRAILER2_LENGTH,TOTAL_VEHICLE_LENGTH,AXLE_CNT,VEHICLE_CONFIG,CARGO_BODY_TYPE,LOAD_TYPE,HAZMAT_OUT_OF_SERVICE_I,MCS_OUT_OF_SERVICE_I,HAZMAT_CLASS
0,829999,24ddf9fd8542199d832e1c223cc474e5601b356f1d77a6...,JD124535,01/22/2020 06:25:00 AM,1,DRIVER,,796949.0,,INFINITI,UNKNOWN,IL,2017.0,NONE,PASSENGER,PERSONAL,N,STRAIGHT AHEAD,,,1.0,,,,,Y,Y,,,,,,,,,,,,FRONT,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1,749947,81dc0de2ed92aa62baccab641fa377be7feb1cc47e6554...,JC451435,09/28/2019 03:30:00 AM,1,DRIVER,,834816.0,,HONDA,CIVIC,IL,2016.0,UNKNOWN,PASSENGER,PERSONAL,N,STRAIGHT AHEAD,,,1.0,,,,,Y,,,,,,,,,,,,,FRONT,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
2,749949,81dc0de2ed92aa62baccab641fa377be7feb1cc47e6554...,JC451435,09/28/2019 03:30:00 AM,2,PARKED,,834819.0,,TOYOTA,YARIS,IL,2010.0,NONE,UNKNOWN/NA,PERSONAL,N,PARKED,,,0.0,,,,,,,,,,,,,Y,,,,,ROOF,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
3,749950,81dc0de2ed92aa62baccab641fa377be7feb1cc47e6554...,JC451435,09/28/2019 03:30:00 AM,3,PARKED,,834817.0,,GENERAL MOTORS CORPORATION (GMC),SIERRA,IL,2008.0,UNKNOWN,UNKNOWN/NA,UNKNOWN/NA,N,PARKED,,,0.0,,,,,,,,,,,,,Y,,,,,ROOF,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
4,871921,af84fb5c8d996fcd3aefd36593c3a02e6e7509eeb27568...,JD208731,04/13/2020 10:50:00 PM,2,DRIVER,,827212.0,,BUICK,ENCORE,IL,,NONE,PASSENGER,PERSONAL,W,STRAIGHT AHEAD,,,1.0,,,,,,,Y,,,,,,,,,,,FRONT-RIGHT,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 963888 entries, 0 to 963887
Data columns (total 72 columns):
 #   Column                    Non-Null Count   Dtype  
---  ------                    --------------   -----  
 0   CRASH_UNIT_ID             963888 non-null  int64  
 1   CRASH_RECORD_ID           963888 non-null  object 
 2   RD_NO                     957106 non-null  object 
 3   CRASH_DATE                963888 non-null  object 
 4   UNIT_NO                   963888 non-null  int64  
 5   UNIT_TYPE                 962443 non-null  object 
 6   NUM_PASSENGERS            144790 non-null  float64
 7   VEHICLE_ID                941455 non-null  float64
 8   CMRC_VEH_I                17882 non-null   object 
 9   MAKE                      941450 non-null  object 
 10  MODEL                     941308 non-null  object 
 11  LIC_PLATE_STATE           862168 non-null  object 
 12  VEHICLE_YEAR              789522 non-null  float64
 13  VEHICLE_DEFECT            941455 non-null  o

### Persons:
Dataset location: https://data.cityofchicago.org/Transportation/Traffic-Crashes-People/u6pd-qa9d

In [8]:
display(Persons.head())
Persons.info()

Unnamed: 0,PERSON_ID,PERSON_TYPE,CRASH_RECORD_ID,RD_NO,VEHICLE_ID,CRASH_DATE,SEAT_NO,CITY,STATE,ZIPCODE,SEX,AGE,DRIVERS_LICENSE_STATE,DRIVERS_LICENSE_CLASS,SAFETY_EQUIPMENT,AIRBAG_DEPLOYED,EJECTION,INJURY_CLASSIFICATION,HOSPITAL,EMS_AGENCY,EMS_RUN_NO,DRIVER_ACTION,DRIVER_VISION,PHYSICAL_CONDITION,PEDPEDAL_ACTION,PEDPEDAL_VISIBILITY,PEDPEDAL_LOCATION,BAC_RESULT,BAC_RESULT VALUE,CELL_PHONE_USE
0,O749947,DRIVER,81dc0de2ed92aa62baccab641fa377be7feb1cc47e6554...,JC451435,834816.0,09/28/2019 03:30:00 AM,,CHICAGO,IL,60651.0,M,25.0,IL,D,NONE PRESENT,DEPLOYMENT UNKNOWN,NONE,NO INDICATION OF INJURY,,,,UNKNOWN,UNKNOWN,UNKNOWN,,,,TEST NOT OFFERED,,
1,O871921,DRIVER,af84fb5c8d996fcd3aefd36593c3a02e6e7509eeb27568...,JD208731,827212.0,04/13/2020 10:50:00 PM,,CHICAGO,IL,60620.0,M,37.0,IL,,SAFETY BELT USED,DID NOT DEPLOY,NONE,NO INDICATION OF INJURY,,,,NONE,NOT OBSCURED,NORMAL,,,,TEST NOT OFFERED,,
2,O10018,DRIVER,71162af7bf22799b776547132ebf134b5b438dcf3dac6b...,HY484534,9579.0,11/01/2015 05:00:00 AM,,,,,X,,,,USAGE UNKNOWN,DEPLOYMENT UNKNOWN,NONE,NO INDICATION OF INJURY,,,,IMPROPER BACKING,UNKNOWN,UNKNOWN,,,,TEST NOT OFFERED,,
3,O10038,DRIVER,c21c476e2ccc41af550b5d858d22aaac4ffc88745a1700...,HY484750,9598.0,11/01/2015 08:00:00 AM,,,,,X,,,,USAGE UNKNOWN,DEPLOYMENT UNKNOWN,UNKNOWN,NO INDICATION OF INJURY,,,,UNKNOWN,UNKNOWN,UNKNOWN,,,,TEST NOT OFFERED,,
4,O10039,DRIVER,eb390a4c8e114c69488f5fb8a097fe629f5a92fd528cf4...,HY484778,9600.0,11/01/2015 10:15:00 AM,,,,,X,,,,USAGE UNKNOWN,DEPLOYMENT UNKNOWN,UNKNOWN,NO INDICATION OF INJURY,,,,UNKNOWN,UNKNOWN,UNKNOWN,,,,TEST NOT OFFERED,,


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1045663 entries, 0 to 1045662
Data columns (total 30 columns):
 #   Column                 Non-Null Count    Dtype  
---  ------                 --------------    -----  
 0   PERSON_ID              1045663 non-null  object 
 1   PERSON_TYPE            1045663 non-null  object 
 2   CRASH_RECORD_ID        1045663 non-null  object 
 3   RD_NO                  1038665 non-null  object 
 4   VEHICLE_ID             1024792 non-null  float64
 5   CRASH_DATE             1045663 non-null  object 
 6   SEAT_NO                214325 non-null   float64
 7   CITY                   774040 non-null   object 
 8   STATE                  782684 non-null   object 
 9   ZIPCODE                707029 non-null   object 
 10  SEX                    1030159 non-null  object 
 11  AGE                    748528 non-null   float64
 12  DRIVERS_LICENSE_STATE  620287 non-null   object 
 13  DRIVERS_LICENSE_CLASS  539291 non-null   object 
 14  SAFETY_EQUIPMENT  

### Initial Observations:
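At first glance the data has a lot of gaps: LANE_CNT is filled in for only 198,961 of 471,122 crashes, NUM_PASSENGERS for 144,790 of 963,888 vehicles, and AGE for 748,528 of 1,045,663 persons. So let's get to it and clean it up!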
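To put a number on how much is missing, a quick per-column summary helps. The sketch below is a minimal illustration on a made-up toy frame; the helper name `missing_share` is my own and not part of the notebook, and the same call could be pointed at `Crashes`, `Vehicles`, or `Persons`.

```python
import numpy as np
import pandas as pd

def missing_share(df: pd.DataFrame) -> pd.Series:
    """Return the fraction of missing values per column, largest first."""
    return df.isna().mean().sort_values(ascending=False)

# Toy frame standing in for Crashes; the values are invented for illustration.
toy = pd.DataFrame({
    "CRASH_RECORD_ID": ["a", "b", "c", "d"],
    "LANE_CNT": [4.0, np.nan, np.nan, np.nan],
    "CRASH_DATE_EST_I": [np.nan, np.nan, "Y", "Y"],
})

share = missing_share(toy)
print(share)
```

Columns near the top of this ranking are the first candidates for dropping rather than filling.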

# Scrub
Cleaning the data: getting rid of NaN values and condensing the dataset down to a more manageable size.
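One thing to watch when filling NaN values: putting the string 'UNKNOWN' into a numeric column such as VEHICLE_YEAR or AGE turns that column into mixed text and numbers. A possible alternative, sketched here under my own assumptions (the helper `fill_by_dtype` is hypothetical and not what `Cleaning` does), is to fill by column type. Whether 0 is a sensible stand-in for a missing year or age is a separate judgement call.

```python
import numpy as np
import pandas as pd

def fill_by_dtype(df: pd.DataFrame) -> pd.DataFrame:
    """Fill non-numeric columns with 'UNKNOWN' and numeric columns with 0."""
    out = df.copy()
    num_cols = out.select_dtypes(include="number").columns
    text_cols = out.select_dtypes(exclude="number").columns
    out[text_cols] = out[text_cols].fillna("UNKNOWN")
    out[num_cols] = out[num_cols].fillna(0)
    return out

# Toy frame standing in for Vehicles; the values are invented for illustration.
toy = pd.DataFrame({
    "MAKE": ["HONDA", None],
    "VEHICLE_YEAR": [2016.0, np.nan],
})

filled = fill_by_dtype(toy)
print(filled.dtypes)
```

With this approach the numeric columns stay numeric, so they can still be scaled or binned later.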

In [9]:
# Keep the filled, title-cased source tables alongside the merged frame in case we need them later.
clean,crash_data,persons,vehicle = Cleaning(Persons,Vehicles,Crashes)

In [11]:
clean.head()

Unnamed: 0,Crash_Record_Id,Posted_Speed_Limit,Traffic_Control_Device,Device_Condition,Weather_Condition,Lighting_Condition,First_Crash_Type,Trafficway_Type,Alignment,Roadway_Surface_Cond,Road_Defect,Report_Type,Crash_Type,Damage,Date_Police_Notified,Prim_Contributory_Cause,Sec_Contributory_Cause,Street_No,Street_Direction,Street_Name,Num_Units,Most_Severe_Injury,Latitude,Longitude,Location,Unit_Type,Num_Passengers,Make,Model,Lic_Plate_State,Vehicle_Year,Vehicle_Type,Vehicle_Use,Total_Vehicle_Length,Axle_Cnt,Vehicle_Config,Cargo_Body_Type,Load_Type,Person_Id,Person_Type,Crash_Date,City,State,Zipcode,Sex,Age,Drivers_License_State,Drivers_License_Class,Safety_Equipment,Airbag_Deployed,Ejection,Injury_Classification,Driver_Action,Driver_Vision,Physical_Condition,Bac_Result,Bac_Result Value
0,4fd0a3e0897b3335b94cd8d5b2d2b350eb691add56c62d...,35,NO CONTROLS,NO CONTROLS,CLEAR,DAYLIGHT,TURNING,ONE-WAY,STRAIGHT AND LEVEL,DRY,NO DEFECTS,ON SCENE,NO INJURY / DRIVE AWAY,"OVER $1,500",07/10/2019 06:16:00 PM,IMPROPER BACKING,UNABLE TO DETERMINE,2158,N,MARMORA AVE,2,NO INDICATION OF INJURY,41.919664,-87.773288,POINT (-87.773287883007 41.919663832993),DRIVER,UNKNOWN,GEO,UNKNOWN,IL,1995.0,PASSENGER,PERSONAL,UNKNOWN,UNKNOWN,UNKNOWN,UNKNOWN,UNKNOWN,O690420,DRIVER,07/10/2019 05:56:00 PM,CHICAGO,IL,60639,M,31.0,IL,D,USAGE UNKNOWN,NOT APPLICABLE,NONE,NO INDICATION OF INJURY,IMPROPER BACKING,UNKNOWN,NORMAL,TEST NOT OFFERED,UNKNOWN
1,4fd0a3e0897b3335b94cd8d5b2d2b350eb691add56c62d...,35,NO CONTROLS,NO CONTROLS,CLEAR,DAYLIGHT,TURNING,ONE-WAY,STRAIGHT AND LEVEL,DRY,NO DEFECTS,ON SCENE,NO INJURY / DRIVE AWAY,"OVER $1,500",07/10/2019 06:16:00 PM,IMPROPER BACKING,UNABLE TO DETERMINE,2158,N,MARMORA AVE,2,NO INDICATION OF INJURY,41.919664,-87.773288,POINT (-87.773287883007 41.919663832993),DRIVER,UNKNOWN,GEO,UNKNOWN,IL,1995.0,PASSENGER,PERSONAL,UNKNOWN,UNKNOWN,UNKNOWN,UNKNOWN,UNKNOWN,O690421,DRIVER,07/10/2019 05:56:00 PM,CHICAGO,IL,60639,M,43.0,IL,D,USAGE UNKNOWN,NOT APPLICABLE,NONE,NO INDICATION OF INJURY,NONE,NOT OBSCURED,NORMAL,TEST NOT OFFERED,UNKNOWN
2,4fd0a3e0897b3335b94cd8d5b2d2b350eb691add56c62d...,35,NO CONTROLS,NO CONTROLS,CLEAR,DAYLIGHT,TURNING,ONE-WAY,STRAIGHT AND LEVEL,DRY,NO DEFECTS,ON SCENE,NO INJURY / DRIVE AWAY,"OVER $1,500",07/10/2019 06:16:00 PM,IMPROPER BACKING,UNABLE TO DETERMINE,2158,N,MARMORA AVE,2,NO INDICATION OF INJURY,41.919664,-87.773288,POINT (-87.773287883007 41.919663832993),DRIVER,UNKNOWN,GEO,UNKNOWN,IL,1995.0,PASSENGER,PERSONAL,UNKNOWN,UNKNOWN,UNKNOWN,UNKNOWN,UNKNOWN,P157329,PASSENGER,07/10/2019 05:56:00 PM,CHICAGO,IL,60625,M,28.0,UNKNOWN,UNKNOWN,USAGE UNKNOWN,NOT APPLICABLE,NONE,NO INDICATION OF INJURY,UNKNOWN,UNKNOWN,UNKNOWN,UNKNOWN,UNKNOWN
3,4fd0a3e0897b3335b94cd8d5b2d2b350eb691add56c62d...,35,NO CONTROLS,NO CONTROLS,CLEAR,DAYLIGHT,TURNING,ONE-WAY,STRAIGHT AND LEVEL,DRY,NO DEFECTS,ON SCENE,NO INJURY / DRIVE AWAY,"OVER $1,500",07/10/2019 06:16:00 PM,IMPROPER BACKING,UNABLE TO DETERMINE,2158,N,MARMORA AVE,2,NO INDICATION OF INJURY,41.919664,-87.773288,POINT (-87.773287883007 41.919663832993),DRIVER,1.0,NISSAN,SENTRA (DATSUN AND NISSAN HAVE MERGED),IL,2012.0,PASSENGER,PERSONAL,UNKNOWN,UNKNOWN,UNKNOWN,UNKNOWN,UNKNOWN,O690420,DRIVER,07/10/2019 05:56:00 PM,CHICAGO,IL,60639,M,31.0,IL,D,USAGE UNKNOWN,NOT APPLICABLE,NONE,NO INDICATION OF INJURY,IMPROPER BACKING,UNKNOWN,NORMAL,TEST NOT OFFERED,UNKNOWN
4,4fd0a3e0897b3335b94cd8d5b2d2b350eb691add56c62d...,35,NO CONTROLS,NO CONTROLS,CLEAR,DAYLIGHT,TURNING,ONE-WAY,STRAIGHT AND LEVEL,DRY,NO DEFECTS,ON SCENE,NO INJURY / DRIVE AWAY,"OVER $1,500",07/10/2019 06:16:00 PM,IMPROPER BACKING,UNABLE TO DETERMINE,2158,N,MARMORA AVE,2,NO INDICATION OF INJURY,41.919664,-87.773288,POINT (-87.773287883007 41.919663832993),DRIVER,1.0,NISSAN,SENTRA (DATSUN AND NISSAN HAVE MERGED),IL,2012.0,PASSENGER,PERSONAL,UNKNOWN,UNKNOWN,UNKNOWN,UNKNOWN,UNKNOWN,O690421,DRIVER,07/10/2019 05:56:00 PM,CHICAGO,IL,60639,M,43.0,IL,D,USAGE UNKNOWN,NOT APPLICABLE,NONE,NO INDICATION OF INJURY,NONE,NOT OBSCURED,NORMAL,TEST NOT OFFERED,UNKNOWN
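The five rows above all share one Crash_Record_Id: each vehicle (GEO, NISSAN) is paired with each person in the crash. That happens because the merge key is the crash ID alone, so every vehicle row matches every person row of the same crash. The toy example below (invented values) shows the effect and one possible alternative, joining persons to vehicles on both CRASH_RECORD_ID and VEHICLE_ID. This is only a sketch: in the real data some persons have no VEHICLE_ID (1,024,792 of 1,045,663 are non-null), and an inner join on that key would drop them.

```python
import pandas as pd

# Two vehicles and three persons in a single crash.
vehicles = pd.DataFrame({
    "CRASH_RECORD_ID": ["c1", "c1"],
    "VEHICLE_ID": [1.0, 2.0],
    "MAKE": ["GEO", "NISSAN"],
})
persons = pd.DataFrame({
    "CRASH_RECORD_ID": ["c1", "c1", "c1"],
    "VEHICLE_ID": [1.0, 2.0, 1.0],
    "PERSON_ID": ["O1", "O2", "P1"],
})

# Joining on the crash ID alone pairs every vehicle with every person.
by_crash = pd.merge(vehicles, persons, on="CRASH_RECORD_ID", how="inner")

# Adding VEHICLE_ID to the key keeps each person with their own vehicle.
by_vehicle = pd.merge(
    vehicles, persons, on=["CRASH_RECORD_ID", "VEHICLE_ID"], how="inner"
)

print(len(by_crash), len(by_vehicle))  # 6 rows versus 3 rows
```

Any counts taken from the merged frame should keep this duplication in mind, for example by counting unique Crash_Record_Id values instead of rows.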


# Explore
Now that everything is prepped and cleaned, it is time to get into the nitty-gritty of the data and take a look at what's under the hood!
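For the time-of-day and day-of-week question, one way to start is to parse Crash_Date and tabulate crashes by weekday and hour. This is a sketch under a few assumptions: the function name is my own, the date format is inferred from the sample rows (such as 07/10/2019 05:56:00 PM), and the frame is first reduced to one row per crash so repeated vehicle and person rows are not counted twice.

```python
import pandas as pd

def crashes_by_weekday_and_hour(clean: pd.DataFrame) -> pd.DataFrame:
    """Count crashes per weekday (rows) and hour of day (columns)."""
    # Keep one row per crash before counting.
    crashes = clean.drop_duplicates(subset="Crash_Record_Id")
    when = pd.to_datetime(crashes["Crash_Date"], format="%m/%d/%Y %I:%M:%S %p")
    return pd.crosstab(
        when.dt.day_name(), when.dt.hour, rownames=["weekday"], colnames=["hour"]
    )

# Toy frame with the two columns the function needs; c1 appears twice on purpose.
toy = pd.DataFrame({
    "Crash_Record_Id": ["c1", "c1", "c2"],
    "Crash_Date": [
        "07/10/2019 05:56:00 PM",
        "07/10/2019 05:56:00 PM",
        "06/30/2017 04:00:00 PM",
    ],
})

table = crashes_by_weekday_and_hour(toy)
print(table)
```

The resulting table can be passed straight to a heatmap to see whether commuting hours stand out.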

# Model
Let's look more closely at the classification side of the project and run the data through some machine learning algorithms.
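As a starting point, here is a rough sketch of what a classification pipeline could look like. It is not the final model: the target (Crash_Type), the three features, and the toy rows with their labels are all my own assumptions for illustration, and predicting on the training rows is only a smoke test, not an evaluation.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy rows using column names from the cleaned frame; values are invented.
toy = pd.DataFrame({
    "Weather_Condition": ["CLEAR", "RAIN", "CLEAR", "SNOW", "RAIN", "CLEAR"],
    "Lighting_Condition": ["DAYLIGHT", "DARKNESS", "DAYLIGHT",
                           "DARKNESS", "DARKNESS", "DAYLIGHT"],
    "Posted_Speed_Limit": [30, 35, 30, 20, 35, 30],
    "Crash_Type": [
        "NO INJURY / DRIVE AWAY",
        "INJURY AND / OR TOW DUE TO CRASH",
        "NO INJURY / DRIVE AWAY",
        "INJURY AND / OR TOW DUE TO CRASH",
        "INJURY AND / OR TOW DUE TO CRASH",
        "NO INJURY / DRIVE AWAY",
    ],
})

categorical = ["Weather_Condition", "Lighting_Condition"]
numeric = ["Posted_Speed_Limit"]

# One-hot encode the text columns and scale the numeric one.
preprocess = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
    ("num", StandardScaler(), numeric),
])
model = make_pipeline(preprocess, KNeighborsClassifier(n_neighbors=1))

model.fit(toy[categorical + numeric], toy["Crash_Type"])
predictions = model.predict(toy[categorical + numeric])
print(predictions)
```

On the real data the frame would be split with `train_test_split` first, and `n_neighbors` would need tuning rather than being fixed at 1.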

# Interpret
What the data says to me about the questions posed in the introduction.

# Conclusions & Recommendations
What I would recommend the City of Chicago do.

# Thank you
Thank you for taking a look at my work!