## Climate Predictor and Wildfires Join

#### Author: Ryan Gan

#### Date: 2018-07-01

This notebook joins the climate predictors to the wildfire data. Each climate predictor is also lagged by 1 to 12 months. Normalizing each variable is something to consider later.
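One way the normalization idea noted above could look is a z-score per variable. This is only a sketch on made-up values: `zscore_columns` is a hypothetical helper that does not appear elsewhere in this notebook, and whether to standardize across all grid cells (as here) or within each grid cell is still an open choice.

```python
import pandas as pd

def zscore_columns(df, columns):
    """Return a copy of df with each listed column centered and scaled to unit variance."""
    out = df.copy()
    for col in columns:
        # population standard deviation (ddof=0); the sample version would also be reasonable
        out[col] = (out[col] - out[col].mean()) / out[col].std(ddof=0)
    return out

# toy data standing in for the climate variables (values are made up)
toy = pd.DataFrame({
    'temp_c': [10.0, 20.0, 30.0],
    'rhum_perc': [40.0, 50.0, 90.0],
})
toy_z = zscore_columns(toy, ['temp_c', 'rhum_perc'])
```

If this is applied to the real data, the lagged columns would probably need the same treatment so that the current and lagged values stay on one scale.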

In [39]:
import pandas as pd
import numpy as np

### Temperature

Importing temperature data.

In [10]:
# read temp
temp = pd.read_csv('../data/model_data/1948-2018_mon_temp_us.csv', index_col=0)


Lag the monthly temperature data by 1 to 12 months, then filter to the dates that have fire data.

In [11]:
# lag for a year
number_lags = 12

for lag in range(1, number_lags + 1):
    temp['temp_c_lag' + str(lag)] = temp.groupby('grid_id')['temp_c'].shift(lag)
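The `shift`-based lag moves values by rows, so it only equals a lag in months if rows are in date order within each `grid_id` and no months are missing. A minimal sketch of a reusable version that sorts first is below. `add_lags` and the toy data are hypothetical, and the source files may well be sorted already.

```python
import pandas as pd

def add_lags(df, value_col, number_lags, group_col='grid_id', date_col='date'):
    """Return a sorted copy of df with value_col lagged 1..number_lags rows within each group."""
    # sort so that a one-row shift corresponds to the previous date in the same group
    out = df.sort_values([group_col, date_col]).reset_index(drop=True)
    for lag in range(1, number_lags + 1):
        out[value_col + '_lag' + str(lag)] = out.groupby(group_col)[value_col].shift(lag)
    return out

# toy data: grid 1 is deliberately out of date order (values are made up)
toy = pd.DataFrame({
    'grid_id': [1, 1, 1, 2, 2, 2],
    'date': pd.to_datetime(['1979-03-01', '1979-01-01', '1979-02-01',
                            '1979-01-01', '1979-02-01', '1979-03-01']),
    'temp_c': [3.0, 1.0, 2.0, 10.0, 20.0, 30.0],
})
toy_lagged = add_lags(toy, 'temp_c', number_lags=2)
```

The same helper would cover humidity and precipitation by changing `value_col`, which avoids repeating the loop in each section.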

In [12]:
# convert date from string to datetime
temp['date'] = pd.to_datetime(temp['date'])
# filter to 1979 to 2016 for years I have fire data
temp = temp[(temp['date'] >= '1979-01-01') & (temp['date'] <= '2016-12-01')]
# print date range to make sure filter worked
print(temp['date'].min(), temp['date'].max())

1979-01-01 00:00:00 2016-12-01 00:00:00


### Humidity

Importing humidity.

In [13]:
# read rhum
rhum = pd.read_csv('../data/model_data/1948-2018_mon_rhum_us.csv', index_col=0)


In [14]:
# lag for a year
number_lags = 12

for lag in range(1, number_lags + 1):
    rhum['rhum_perc_lag' + str(lag)] = rhum.groupby('grid_id')['rhum_perc'].shift(lag)

In [15]:
# convert date from string to datetime
rhum['date'] = pd.to_datetime(rhum['date'])
# filter to 1979 to 2016 for years I have fire data
rhum = rhum[(rhum['date'] >= '1979-01-01') & (rhum['date'] <= '2016-12-01')]
# print date range to make sure filter worked
print(rhum['date'].min(), rhum['date'].max())

1979-01-01 00:00:00 2016-12-01 00:00:00


### Precipitation

Loading precipitation.

In [16]:
# read precipitation
prec = pd.read_csv('../data/model_data/1948-2018_mon_pr_wtr_us.csv', index_col=0)


In [17]:
# lag for a year
number_lags = 12

for lag in range(1, number_lags + 1):
    prec['prec_kgm2_lag' + str(lag)] = prec.groupby('grid_id')['prec_kgm2'].shift(lag)

In [18]:
# convert date from string to datetime
prec['date'] = pd.to_datetime(prec['date'])
# filter to 1979 to 2016 for years I have fire data
prec = prec[(prec['date'] >= '1979-01-01') & (prec['date'] <= '2016-12-01')]
# print date range to make sure filter worked
print(prec['date'].min(), prec['date'].max())

1979-01-01 00:00:00 2016-12-01 00:00:00


### State Indicator

Loading the indicator of which state each grid ID falls in. I plan to model fires only for grid cells in US states, since that is where my fire data comes from, and the state indicator is an easy way to exclude cells over water or in Canada.

In [19]:
# load grid state
grid_state = pd.read_csv('../data/model_data/grid_state.csv', index_col=0)
# convert to categorical
grid_state["state"] = grid_state["state"].astype('category')

In [20]:
grid_state['state'].describe()

count       819
unique       46
top       Texas
freq         65
Name: state, dtype: object

### Joining Climate Predictors

In [21]:
# merge temp and precip
climate = temp.merge(prec, on=['grid_id', 'glon', 'glat', 'date'], how='left')

In [23]:
# merge climate to humidity
climate = climate.merge(rhum, on=['grid_id', 'glon', 'glat', 'date'], how='left')

In [26]:
# merge states in; leaving NaNs as is
climate = climate.merge(grid_state, on=['grid_id', 'glon', 'glat'], how='left')
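Left joins like the ones above can silently duplicate rows or leave gaps if the keys do not line up. A hedged sketch of how that could be checked uses the `validate` and `indicator` arguments of `DataFrame.merge`. The toy frames and their values are made up, and treating each climate table as unique on grid cell and date is an assumption about the real files.

```python
import pandas as pd

keys = ['grid_id', 'glon', 'glat', 'date']

# toy stand-ins for the temperature and precipitation tables (values are made up)
toy_temp = pd.DataFrame({
    'grid_id': [1, 1, 2],
    'glon': [-118.0, -118.0, -93.0],
    'glat': [45.0, 45.0, 35.0],
    'date': pd.to_datetime(['1979-01-01', '1979-02-01', '1979-01-01']),
    'temp_c': [1.5, 3.0, 8.0],
})
toy_prec = pd.DataFrame({
    'grid_id': [1, 1],
    'glon': [-118.0, -118.0],
    'glat': [45.0, 45.0],
    'date': pd.to_datetime(['1979-01-01', '1979-02-01']),
    'prec_kgm2': [12.0, 9.5],
})

# validate raises pandas.errors.MergeError if the keys are duplicated on either side;
# indicator adds a '_merge' column recording where each row found a match
toy_climate = toy_temp.merge(toy_prec, on=keys, how='left',
                             validate='one_to_one', indicator=True)

# rows from the left table that found no match on the right
unmatched = toy_climate[toy_climate['_merge'] == 'left_only']
```

For the state join, `validate='many_to_one'` would be the analogous check, since many climate rows map to one grid cell.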

In [27]:
climate.dtypes

grid_id                     int64
glon                      float64
glat                      float64
date               datetime64[ns]
temp_c                    float64
temp_c_lag1               float64
temp_c_lag2               float64
temp_c_lag3               float64
temp_c_lag4               float64
temp_c_lag5               float64
temp_c_lag6               float64
temp_c_lag7               float64
temp_c_lag8               float64
temp_c_lag9               float64
temp_c_lag10              float64
temp_c_lag11              float64
temp_c_lag12              float64
prec_kgm2                 float64
prec_kgm2_lag1            float64
prec_kgm2_lag2            float64
prec_kgm2_lag3            float64
prec_kgm2_lag4            float64
prec_kgm2_lag5            float64
prec_kgm2_lag6            float64
prec_kgm2_lag7            float64
prec_kgm2_lag8            float64
prec_kgm2_lag9            float64
prec_kgm2_lag10           float64
prec_kgm2_lag11           float64
prec_kgm2_lag1

### Wildfire Events

Loading the gridded wildfire data with the binary fire indicator (1 = fire, 0 = no fire). This builds the dataset used to estimate wildfire likelihood for a given location.

Import cleaned and prepared wildfire data.

In [87]:
# load fires count and indicator
fires = pd.read_csv('../data/model_data/1979-2016_wildfire_grid.csv', index_col=0)
# convert date to datetime
fires['date'] = pd.to_datetime(fires['date'])

In [93]:
# join fires with climate data
fire_climate = climate.merge(fires, on=['grid_id', 'date'], how='left')

# filter to only states
fire_climate = fire_climate[fire_climate.state.notna()]

In [96]:
# set fire to 0 where no fire was recorded; rows with a NaN state were already dropped
fire_climate['fire'] = fire_climate['fire'].fillna(0)
# make month variable
fire_climate['month'] = fire_climate['date'].dt.month.astype('category')
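The cell above stores the calendar month. If a coarser season term is also wanted, one option is a month-to-season lookup. This is a sketch only: the meteorological season boundaries (December to February as winter, and so on) and the `season` column name are assumptions, not something this notebook defines.

```python
import pandas as pd

# meteorological seasons for the northern hemisphere (an assumed convention)
month_to_season = {12: 'winter', 1: 'winter', 2: 'winter',
                   3: 'spring', 4: 'spring', 5: 'spring',
                   6: 'summer', 7: 'summer', 8: 'summer',
                   9: 'fall', 10: 'fall', 11: 'fall'}

# toy dates standing in for the monthly records
toy = pd.DataFrame({'date': pd.to_datetime(['1979-01-01', '1979-04-01',
                                            '1979-07-01', '1979-12-01'])})
toy['month'] = toy['date'].dt.month
toy['season'] = toy['month'].map(month_to_season).astype('category')
```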

In [108]:
# view types
fire_climate.dtypes

grid_id                     int64
glon                      float64
glat                      float64
date               datetime64[ns]
temp_c                    float64
temp_c_lag1               float64
temp_c_lag2               float64
temp_c_lag3               float64
temp_c_lag4               float64
temp_c_lag5               float64
temp_c_lag6               float64
temp_c_lag7               float64
temp_c_lag8               float64
temp_c_lag9               float64
temp_c_lag10              float64
temp_c_lag11              float64
temp_c_lag12              float64
prec_kgm2                 float64
prec_kgm2_lag1            float64
prec_kgm2_lag2            float64
prec_kgm2_lag3            float64
prec_kgm2_lag4            float64
prec_kgm2_lag5            float64
prec_kgm2_lag6            float64
prec_kgm2_lag7            float64
prec_kgm2_lag8            float64
prec_kgm2_lag9            float64
prec_kgm2_lag10           float64
prec_kgm2_lag11           float64
prec_kgm2_lag1

In [109]:
# save file for analysis
#fire_climate.to_csv('../data/model_data/1979-2016_fire_likelihood.csv')

### Wildfire Acres and Class

Making a second dataset that uses the same climate predictors to estimate fire size class and acres burned.

In [112]:
fire_class = pd.read_csv('../data/model_data/1979-2016_wildfire_info_us.csv', index_col=0)

In [124]:
# subset certain fire variables; copy so the date conversion below
# modifies an independent DataFrame rather than a slice of fire_class
fire_size = fire_class[['CAUSE', 'SPECCAUSE', 'SIZECLASS', 'STATE', 
                        'TOTALACRES','lat', 'lon', 'grid_id','date']].copy()

# convert to date
fire_size['date'] = pd.to_datetime(fire_size['date'])


In [125]:
fire_size = fire_size.merge(climate, on=['grid_id', 'date'], how='left')

In [127]:
# compare grid lat and lon to fire lat and lon
coord_check = fire_size[['lat', 'lon', 'glat', 'glon']]

Checking fire coordinates against grid coordinates.

In [128]:
coord_check.head()

|   | lat | lon | glat | glon |
|---|-----|-----|------|------|
| 0 | 44.741389 | -117.934167 | 45.0 | -118.0 |
| 1 | 42.397222 | -122.19 | 42.0 | -122.0 |
| 2 | 34.756944 | -93.436111 | 35.0 | -93.0 |
| 3 | 36.373611 | -112.333056 | 36.0 | -112.0 |
| 4 | 45.821667 | -102.401667 | 46.0 | -102.0 |


In [129]:
coord_check.tail()

|   | lat | lon | glat | glon |
|---|-----|-----|------|------|
| 298220 | 48.999444 | -114.994722 | 49.0 | -115.0 |
| 298221 | 48.999722 | -117.438333 | 49.0 | -117.0 |
| 298222 | 49.0 | -120.582222 | 49.0 | -121.0 |
| 298223 | 37.363889 | -84.129722 | 37.0 | -84.0 |
| 298224 | 49.0 | -118.683333 | 49.0 | -119.0 |
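The rows shown suggest each fire is matched to the nearest whole-degree grid point. A numeric version of this check is sketched below using the five `head()` rows copied from the output above. The 0.5 degree tolerance assumes a 1-degree grid, which is inferred from these rows rather than stated anywhere in this notebook.

```python
import pandas as pd

# the five rows shown by coord_check.head()
coords = pd.DataFrame({
    'lat': [44.741389, 42.397222, 34.756944, 36.373611, 45.821667],
    'lon': [-117.934167, -122.19, -93.436111, -112.333056, -102.401667],
    'glat': [45.0, 42.0, 35.0, 36.0, 46.0],
    'glon': [-118.0, -122.0, -93.0, -112.0, -102.0],
})

# absolute offset between each fire location and its assigned grid point
lat_diff = (coords['lat'] - coords['glat']).abs()
lon_diff = (coords['lon'] - coords['glon']).abs()

# flag rows where the fire sits more than half a grid step from the grid point
tolerance = 0.5
mismatched = coords[(lat_diff > tolerance) | (lon_diff > tolerance)]
print(len(mismatched), lat_diff.max(), lon_diff.max())
```

Running the same comparison on the full `coord_check` frame would cover every fire record rather than only the first and last five.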


In [130]:
# write fire size file
#fire_size.to_csv('../data/model_data/1979-2016_fire_size.csv')