# Generate Fire Incident Features

### Introduction

This notebook generates feature data from the file `matched_Fire_Incidents.csv`. These features will serve as the target variables in modeling.

### Load libraries and CSV

In [1]:
import numpy as np
import pandas as pd
from datetime import datetime

path = 'C:\\Users\\Kevin\\Desktop\\Fire Risk\\Model_matched_to_EAS'
# Keep three columns and drop rows where any of them is NaN
incident_df = pd.read_csv(path + '\\' + 'matched_Fire_Incidents.csv',
                          low_memory=False)[['Incident Date', 'Primary Situation', 'EAS']].dropna()

incident_df['Incident Date'] = pd.to_datetime(incident_df['Incident Date'])
incident_df['Incident_Year'] = incident_df['Incident Date'].dt.year
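
As a small illustration of the date handling above, the sketch below parses a few date strings and derives the year. The sample frame and the use of `errors='coerce'` are assumptions for illustration and are not part of the original cell; coercion turns unparseable strings into `NaT` instead of raising an error.

```python
import pandas as pd

# Hypothetical sample; the real data comes from matched_Fire_Incidents.csv
sample = pd.DataFrame({'Incident Date': ['2015-06-20', '2006-08-05', 'not a date']})

# errors='coerce' converts unparseable values to NaT instead of raising
sample['Incident Date'] = pd.to_datetime(sample['Incident Date'], errors='coerce')

# Drop rows whose date failed to parse, then derive the year
sample = sample.dropna(subset=['Incident Date']).copy()
sample['Incident_Year'] = sample['Incident Date'].dt.year
print(sample)
```

Whether silent coercion is appropriate depends on how clean the date column is; raising on bad values (the default) is the stricter choice.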


In [2]:
incident_df.head()

Unnamed: 0,Incident Date,Primary Situation,EAS,Incident_Year
0,2015-06-20,"113 cooking fire, confined to container",451005.0,2015
1,2006-08-05,111 - building fire,422528.0,2006
2,2005-05-30,"151 - outside rubbish, trash or waste fire",489180.0,2005
3,2010-11-28,"113 - cooking fire, confined to container",360149.0,2010
4,2003-10-05,154 - dumpster/outside trash receptacle fire,279186.0,2003


### Create larger incident category groupings 

In [3]:
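# Take the first three characters as the incident code, e.g. '111'.
# Short codes keep trailing characters: '1 -' stays '1 -', '10 -' becomes '10 '.
incident_df['code'] = incident_df['Primary Situation'].str[:3]
pd.set_option("display.max_rows", 999)
incident_df.groupby(['Primary Situation', 'code']).count()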

Primary Situation,code,Incident Date,EAS,Incident_Year
1 -,1 -,26,26,26
10 -,10,30,30,30
"100 - fire, other",100,721,721,721
"100 fire, other",100,823,823,823
11 -,11,2,2,2
111 - building fire,111,3770,3770,3770
111 building fire,111,909,909,909
112 - fires in struct. other than in a bldg.,112,294,294,294
112 fires in structure other than in a building,112,58,58,58
"113 - cooking fire, confined to container",113,9540,9540,9540

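The table shows the same code written in two text variants (for example `111 - building fire` and `111 building fire`) as well as short codes such as `1 -` and `10 -`. The sketch below compares the fixed-width slice with an alternative that captures only the leading digits. The regular expression is an assumption for illustration; it presumes every value starts with a numeric code.

```python
import pandas as pd

# A few 'Primary Situation' values taken from the table above
situations = pd.Series([
    '1 -',
    '10 -',
    '100 - fire, other',
    '111 building fire',
    '113 - cooking fire, confined to container',
])

# Fixed-width slice: first three characters of each value
sliced = situations.str[:3]

# Alternative: capture only the leading run of digits
digits = situations.str.extract(r'^\s*(\d+)', expand=False)

print(pd.DataFrame({'sliced': sliced, 'digits': digits}))
```

With the digit-based variant, one- and two-digit codes come out without dashes or spaces, so any lookup keyed on them would need keys such as `'1'` rather than `'1 -'`.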

The raw descriptions do not fall into obvious groups, but the codes suggest the following categories:

1, 10, 11, 100, 112  "FIRE OTHER"  
111                  "BUILDING FIRE"  
113                  "COOKING FIRE"  
114-118              "TRASH FIRE (INDOOR)"  
120-138              "VEHICLE FIRE"  
140-173              "OUTDOOR FIRE"  

The next cell implements this mapping.

In [4]:
di = {'FIRE OTHER': ['1 -', '10', '100', '11', '112'], 
      'BUILDING FIRE': ['111'], 
      'COOKING FIRE': ['113'], 
      'TRASH FIRE (INDOOR)': ['114','115','116','117','118'],
      'VEHICLE FIRE': ['120', '121', '122', '123', '130', '131', '132', '133', '134', '135', '136', '137', '138'],
      'OUTDOOR FIRE': ['140', '141', '142', '143', '150', '151', '152', '153', '154', '155', '160', '161', '162', '163', '164', '170', '173']}
# Reverse the mapping so that each code points to its category
di = {d: c for c, d_list in di.items()
      for d in d_list}
# Map each code to its category in the new 'Incident_Cat' column
incident_df['Incident_Cat'] = incident_df['code'].map(di)
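
The reversal step can be seen in isolation on a toy example. The two-category dictionary and the code `'999'` are hypothetical; the point is that `Series.map` with a dictionary returns `NaN` for any value that has no key.

```python
import pandas as pd

# Category -> list of codes (toy subset of the mapping above)
groups = {'BUILDING FIRE': ['111'], 'TRASH FIRE (INDOOR)': ['114', '115']}

# Reverse into code -> category so it can be passed to Series.map
code_to_group = {code: group
                 for group, codes in groups.items()
                 for code in codes}

codes = pd.Series(['111', '115', '999'])  # '999' is a made-up, unmapped code
mapped = codes.map(code_to_group)
print(mapped)
```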

In [5]:
incident_df['Incident_Cat'].value_counts()

COOKING FIRE           12182
OUTDOOR FIRE           10689
TRASH FIRE (INDOOR)     6885
BUILDING FIRE           4679
VEHICLE FIRE            2895
FIRE OTHER              1922
Name: Incident_Cat, dtype: int64
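
`value_counts()` leaves out missing values by default, so codes without a key in `di` would not appear in the tally above. The "FIRE OTHER" total of 1922 matches the grouped counts for codes 100 and 112 plus the 26 rows coded `1 -` (721 + 823 + 294 + 58 + 26 = 1922), which suggests that the 32 rows listed under `10 -` and `11 -` may have been left unmapped, possibly because their sliced codes carry a trailing space. A minimal sketch of how to check, using a hypothetical four-row frame rather than the real data:

```python
import pandas as pd

# Hypothetical stand-in for incident_df; two codes carry a trailing space
df = pd.DataFrame({'code': ['111', '10 ', '113', '11 ']})
mapping = {'111': 'BUILDING FIRE', '113': 'COOKING FIRE',
           '10': 'FIRE OTHER', '11': 'FIRE OTHER'}
df['Incident_Cat'] = df['code'].map(mapping)

# dropna=False makes the unmapped (NaN) rows visible in the tally
counts = df['Incident_Cat'].value_counts(dropna=False)
print(counts)

# List the raw codes that failed to map
unmapped = df.loc[df['Incident_Cat'].isna(), 'code'].unique().tolist()
print(unmapped)
```

If such a check on the real data does list codes, stripping whitespace from `code` before mapping would be one possible remedy.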

### Clean up and save data

In [6]:
# Constant indicator: every row represents one incident
incident_df['Incident_Dummy'] = 1
incident_df = incident_df[['Incident Date', 
                           'EAS', 
                           'Incident_Year', 
                           'Incident_Cat', 
                           'Incident_Dummy']] 

In [87]:
incident_df.head()

Unnamed: 0,Incident Date,EAS,Incident_Year,Incident_Cat,Incident_Dummy
0,2015-06-20,451005.0,2015,COOKING FIRE,1
1,2006-08-05,422528.0,2006,BUILDING FIRE,1
2,2005-05-30,489180.0,2005,OUTDOOR FIRE,1
3,2010-11-28,360149.0,2010,COOKING FIRE,1
4,2003-10-05,279186.0,2003,OUTDOOR FIRE,1
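
The notebook does not show how `Incident_Dummy` is used after the merge. One plausible use, given here purely as an assumption, is to sum it per address, year and category to obtain incident counts. The rows below are hypothetical.

```python
import pandas as pd

# Hypothetical rows in the shape of the frame above
df = pd.DataFrame({
    'EAS': [451005.0, 451005.0, 422528.0],
    'Incident_Year': [2015, 2015, 2006],
    'Incident_Cat': ['COOKING FIRE', 'COOKING FIRE', 'BUILDING FIRE'],
    'Incident_Dummy': [1, 1, 1],
})

# One row per (EAS, year), one column per category, values are incident counts
counts = (df.pivot_table(index=['EAS', 'Incident_Year'],
                         columns='Incident_Cat',
                         values='Incident_Dummy',
                         aggfunc='sum',
                         fill_value=0)
            .reset_index())
print(counts)
```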


In [8]:
# Export data
incident_df.to_csv(path_or_buf=path + '\\' + 'fireincident_data_formerge_20170917.csv', index=False)
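
Building paths by concatenating `'\\'` ties the notebook to Windows. A minimal sketch of the same export using `pathlib`, which picks the separator for the current platform; the temporary directory and the one-row frame are stand-ins for the real project folder and data.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical one-row frame standing in for incident_df
df = pd.DataFrame({'EAS': [451005.0], 'Incident_Cat': ['COOKING FIRE']})

# A temporary directory stands in for the project folder
with tempfile.TemporaryDirectory() as tmp:
    out_file = Path(tmp) / 'fireincident_data_formerge_20170917.csv'
    df.to_csv(out_file, index=False)

    # Read the file back to confirm what was written
    round_trip = pd.read_csv(out_file)

print(round_trip)
```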