# WA State Wildfire List

My goal for this notebook is to download images of locations where there have been wildfires as well as locations where there have not.  The images will be labeled for the purpose of training a neural network. 

The data consists of 13,379 rows (per the `df.info()` output below). Each row describes one distinct wildfire: where it started, its cause, and so on.

# Importing Libraries:

In [1]:
import pandas as pd
import numpy as np

# Plots and Graphs:
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.express as px
import scikitplot as skplt 
%matplotlib inline

import geopandas as gpd
import folium 

import requests
from IPython.display import Image, display

import random

import urllib.request

import warnings
warnings.filterwarnings('ignore')

# Shows all columns
pd.set_option('display.max_columns', None)

# Importing Data:

In [2]:
df = pd.read_csv('DNR_Fire_Statistics_2008_-_Present.csv')

In [3]:
df.head()

Unnamed: 0,X,Y,OBJECTID,FIREEVENT_ID,INCIDENT_NO,INCIDENT_NM,INCIDENT_ID,COUNTY_LABEL_NM,FIRE_TWP_WHOLE_NO,FIRE_TWP_FRACT_NO,FIRE_RGE_WHOLE_NO,FIRE_RGE_FRACT_NO,FIRE_RGE_DIR_FLG,FIRE_SECT_NO,SITE_ELEV,FIREGCAUSE_LABEL_NM,FIRESCAUSE_LABEL_NM,BURNESCAPE_RSN_LABEL_NM,ACRES_BURNED,START_DT,START_TM,DSCVR_DT,DSCVR_TM,CONTROL_DT,CONTROL_TM,FIRE_OUT_DT,FIRE_OUT_TM,BURN_MERCH_AREA,BURN_REPROD_AREA,BURN_NONSTOCK_AREA,FIREEVNT_CLASS_CD,FIREEVNT_CLASS_LABEL_NM,SECTION_SUBDIV_PTS_ID,LAT_COORD,LON_COORD,RES_ORDER_NO,NON_DNR_RES_ORDER_NO,START_OWNER_AGENCY_NM,START_JURISDICTION_AGENCY_NM,PROTECTION_TYPE,REGION_NAME
0,-13335670.0,6191981.0,1,49996,26,LITTLE PEACOCK,49829,OKANOGAN,35,0,24,0,E,26,2927.0,Recreation,Camper,,0.01,2017/05/28 08:00:00+00,1800,2017/05/29 00:00:00+00,1335.0,2017/05/29 00:00:00+00,1420.0,2017/06/21 00:00:00+00,1328.0,,,,1,Classified,662491,48.50915,-119.79637,,,DNR,DNR,DNR Protection FFPA,NORTHEAST
1,-13460350.0,5765132.0,2,50035,7,Turkey Ranch,49868,KLICKITAT,5,0,15,0,E,22,2000.0,Debris Burn,,Extinguish,0.25,2017/05/23 08:00:00+00,1715,2017/05/23 00:00:00+00,1650.0,2017/05/23 00:00:00+00,1935.0,2017/05/25 00:00:00+00,1300.0,,,0.25,1,Classified,372894,45.904947,-120.916377,WA-SES-050,,Private,DNR,DNR Protection FFPA,SOUTHEAST
2,-13643230.0,5913875.0,4,5021,90,1050 Fire,5163,THURSTON,16,0,2,0,E,33,350.0,Lightning,,,9.68,2008/08/17 08:00:00+00,300,2008/08/17 00:00:00+00,300.0,2008/08/17 00:00:00+00,1210.0,2008/09/15 00:00:00+00,1500.0,,9.68,,1,Classified,435486,46.82695,-122.55925,WA-PCS-0090,,Private,DNR,DNR Protection FFPA,PACIFIC CASC
3,-13063970.0,6094150.0,5,7882,104,BEAR LAKE,8024,SPOKANE,28,0,43,0,E,15,1800.0,Lightning,,,0.1,2009/06/17 08:00:00+00,2015,2009/06/17 00:00:00+00,2030.0,2009/06/18 00:00:00+00,1221.0,2009/07/02 00:00:00+00,1234.0,0.1,,,1,Classified,552562,47.92356,-117.35562,,,Other Government,DNR,DNR Protection FFPA,NORTHEAST
4,-13685870.0,5932888.0,6,47157,68,HWY 101,46930,THURSTON,17,0,2,0,W,21,170.0,Recreation,Camper,,0.01,2016/07/25 08:00:00+00,1730,2016/07/25 00:00:00+00,1730.0,2016/07/25 00:00:00+00,1815.0,2016/07/26 00:00:00+00,1530.0,,,,1,Classified,560860,46.94369,-122.94224,WA-SPS-0097,,Private,DNR,DNR Protection FFPA,SO PUGET


In [4]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 13379 entries, 0 to 13378
Data columns (total 41 columns):
X                               13379 non-null float64
Y                               13379 non-null float64
OBJECTID                        13379 non-null int64
FIREEVENT_ID                    13379 non-null int64
INCIDENT_NO                     13379 non-null int64
INCIDENT_NM                     13379 non-null object
INCIDENT_ID                     13379 non-null int64
COUNTY_LABEL_NM                 13379 non-null object
FIRE_TWP_WHOLE_NO               13379 non-null int64
FIRE_TWP_FRACT_NO               13379 non-null int64
FIRE_RGE_WHOLE_NO               13379 non-null int64
FIRE_RGE_FRACT_NO               13379 non-null int64
FIRE_RGE_DIR_FLG                13379 non-null object
FIRE_SECT_NO                    13379 non-null int64
SITE_ELEV                       12176 non-null float64
FIREGCAUSE_LABEL_NM             13379 non-null object
FIRESCAUSE_LABEL_NM             1

# Setting aside a new DF:

Keeping only a few columns in order to sort out collecting satellite imagery. The row limit in the cell below is commented out, so every row is kept; re-enable it to test on just the first few rows.

In [5]:
# Selecting columns (copy so the edits below don't act on a view of df)
df_test = df[['INCIDENT_NM', 'START_DT', 'ACRES_BURNED', 'LAT_COORD', 'LON_COORD']].copy()

# Renaming columns
df_test = df_test.rename(columns={"INCIDENT_NM": "name",
                                  "START_DT": "date",
                                  "ACRES_BURNED": "acres",
                                  "LAT_COORD": "lat",
                                  "LON_COORD": "lon"})

# Formatting the date: the round trip through strftime drops the time and UTC offset
df_test['date'] = pd.to_datetime(df_test['date'])
df_test['date'] = df_test['date'].dt.strftime('%m/%d/%Y')
df_test['date'] = pd.to_datetime(df_test['date'], format='%m/%d/%Y')

# Rounding coordinates to 6 places:
df_test['lat'] = df_test['lat'].round(6)
df_test['lon'] = df_test['lon'].round(6)

# For center coordinates in the URL
df_test['lat'] = df_test['lat'].astype(str)
df_test['lon'] = df_test['lon'].astype(str)
df_test['center'] = df_test['lat'] + ',' + df_test['lon']

# Uncomment to limit the test to the first 5 rows
#df_test = df_test[:5]
# Showing the dataframe
df_test.head()

| | name | date | acres | lat | lon | center |
|---|---|---|---|---|---|---|
| 0 | LITTLE PEACOCK | 2017-05-28 | 0.01 | 48.50915 | -119.79637 | 48.50915,-119.79637 |
| 1 | Turkey Ranch | 2017-05-23 | 0.25 | 45.904947 | -120.916377 | 45.904947,-120.916377 |
| 2 | 1050 Fire | 2008-08-17 | 9.68 | 46.82695 | -122.55925 | 46.82695,-122.55925 |
| 3 | BEAR LAKE | 2009-06-17 | 0.1 | 47.92356 | -117.35562 | 47.92356,-117.35562 |
| 4 | HWY 101 | 2016-07-25 | 0.01 | 46.94369 | -122.94224 | 46.94369,-122.94224 |


Next steps:
- Use the Google Static Maps API to get the satellite imagery
- Reference: [extracting a satellite image from Google Maps given a lat/long rectangle](https://stackoverflow.com/questions/9087166/how-can-i-extract-a-satellite-image-from-google-maps-given-a-lat-long-rectangle)

## Collecting API Key:

In [6]:
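# top secret ;)
# Read the key from a local file; strip() drops any trailing newline
with open('/Users/Thomas/Desktop/capstone/google_api/gmap_api_key.txt', 'r') as key_file:
    key = key_file.read().strip()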
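
A minimal sketch of an alternative way to load the key, so the hard-coded path is only a fallback. The function name `load_api_key` and the environment variable `GMAP_API_KEY` are assumptions made up for illustration; they are not part of the original notebook.

```python
import os
from pathlib import Path


def load_api_key(path=None, env_var="GMAP_API_KEY"):
    """Return the API key from an environment variable, else from a text file.

    `env_var` is a hypothetical name chosen for this sketch.
    """
    key = os.environ.get(env_var)
    if key:
        return key.strip()
    if path is None:
        raise RuntimeError(f"Set {env_var} or pass the path to a key file")
    # strip() removes the trailing newline most editors add to the file
    return Path(path).read_text().strip()
```

Usage would look like `key = load_api_key('/Users/Thomas/Desktop/capstone/google_api/gmap_api_key.txt')`, which keeps the key out of the notebook either way.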

# URL Request Crafting:

## Setting Parameters:

In [7]:
img_size = '350x350' # Let's try this for now. Width x height in pixels; gets multiplied by scale for the final resolution
# 350x350 seems to be about the minimum usable image resolution
# ~100 kB each x ~20k images = ~2 GB for the batch (wf and nwf combined)

img_format = 'jpg' # For compressibility. PNG options are 'png8' (8-bit) and 'png32' (32-bit)

map_scale = '1' # For the scale parameter

maptype = 'satellite' # Obvious reason

zoom = '15' # Try that to start
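
To get a feel for how much ground a 350x350 image covers at zoom 15, here is a rough sketch using the commonly cited Web Mercator approximation (about 156,543 meters per pixel at the equator at zoom 0, with 256-pixel tiles). That Google's imagery follows this approximation closely is an assumption here, and the function names are made up for illustration.

```python
import math

# Meters per pixel at the equator at zoom 0 (Web Mercator, 256-px tiles).
EQUATOR_M_PER_PX_ZOOM0 = 156543.03392


def meters_per_pixel(lat_deg, zoom):
    """Approximate ground resolution at a latitude and zoom level (scale=1)."""
    return EQUATOR_M_PER_PX_ZOOM0 * math.cos(math.radians(lat_deg)) / (2 ** zoom)


def image_footprint_m(lat_deg, zoom, size_px=350):
    """Approximate width (or height) in meters of a square image of size_px pixels."""
    return meters_per_pixel(lat_deg, zoom) * size_px


# Washington sits at roughly 45.5 to 49 degrees north; 47.5 is a mid-range value.
print(round(meters_per_pixel(47.5, 15), 2), "m/px")
print(round(image_footprint_m(47.5, 15)), "m per side")
```

Under these assumptions each image spans a bit over a kilometer per side, which helps when judging whether zoom 15 captures enough of the terrain around an ignition point.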

## URL:

In [8]:
# Chopping the URL into manageable pieces

a = 'https://maps.googleapis.com/maps/api/staticmap?' # Base
b = 'center=' # Center
# Enter center
c = '&zoom=' # Zoom
# Enter zoom
d = '&maptype=' # Map type
# Enter maptype
e = '&size=' # Image size
# Enter image size
f = '&scale=' # Scale
# Enter scale
g = '&format=' # Image format (the API returns PNG unless told otherwise)
# Enter image format
h = '&key='
# Enter key

# Creating the URL:
url1 = a + b
url2 = c + zoom + d + maptype + e + img_size + f + map_scale + g + img_format + h + key
# URL = url1 + row['center'] + url2
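
As an alternative to string concatenation, the same request can be sketched with `urllib.parse.urlencode`, which handles escaping. The helper name `build_static_map_url` and its defaults are assumptions for illustration; the query parameters themselves (`center`, `zoom`, `size`, `scale`, `maptype`, `format`, `key`) are the ones the Static Maps API accepts.

```python
from urllib.parse import urlencode

BASE_URL = "https://maps.googleapis.com/maps/api/staticmap"


def build_static_map_url(center, key, zoom=15, size="350x350",
                         maptype="satellite", img_format="jpg", scale=1):
    """Build a Static Maps request URL for one 'lat,lon' center string."""
    params = {
        "center": center,
        "zoom": zoom,
        "size": size,
        "scale": scale,
        "maptype": maptype,
        "format": img_format,
        "key": key,
    }
    # urlencode percent-escapes the comma in the center value
    return BASE_URL + "?" + urlencode(params)


print(build_static_map_url("48.50915,-119.79637", key="YOUR_KEY"))
```

Keeping the parameters in a dict makes it harder to define a setting and then forget to include it in the URL.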

## Collecting Test WF Images:

In [9]:
# Unnecessary to keep showing images for now...

for index, row in df_test.iterrows():
    url = url1 + row['center'] + url2
    r = requests.get(url)
    display(Image(r.content))

## Looping and Downloading:

In [10]:
# Downloading the wf images to local file

for index, row in df_test.iterrows():
    url = url1 + row['center'] + url2
    urllib.request.urlretrieve(url, 
                               "/Users/Thomas/Desktop/capstone/images/wf_imgs/" 
                               + row['center'] 
                               + '.jpg')

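
With thousands of requests, a single failed download would stop the loop above partway through. Below is a hedged sketch of a more forgiving version that skips files already on disk and records failures instead of raising. The function name `download_images` and the injectable `fetch` argument are assumptions for illustration, not part of the original notebook.

```python
import os
import urllib.request


def download_images(centers, make_url, out_dir, fetch=urllib.request.urlretrieve):
    """Download one image per 'lat,lon' center, skipping files that already exist.

    make_url: function mapping a center string to a request URL.
    fetch: callable(url, dest); defaults to urllib.request.urlretrieve.
    Returns (downloaded, failed), two lists of center strings.
    """
    os.makedirs(out_dir, exist_ok=True)
    downloaded, failed = [], []
    for center in centers:
        dest = os.path.join(out_dir, center + ".jpg")
        if os.path.exists(dest):
            continue  # already fetched on an earlier run
        try:
            fetch(make_url(center), dest)
            downloaded.append(center)
        except OSError:  # urllib's URLError and HTTPError are OSError subclasses
            failed.append(center)
    return downloaded, failed
```

In the notebook this could be called as `download_images(df_test['center'], lambda c: url1 + c + url2, "/Users/Thomas/Desktop/capstone/images/wf_imgs/")`, and the `failed` list retried later.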

# Generating Random Lon and Lat Pairs:

Generating random coordinates to serve as the 'non-wildfire' locations for the neural network.

In [11]:
print('Latitude Range: ', min(df.LAT_COORD), ' - ', max(df.LAT_COORD))
print('Longitude Range: ', min(df.LON_COORD), ' - ', max(df.LON_COORD))

Latitude Range:  45.557811  -  48.999629999999996
Longitude Range:  -124.66326000000001  -  -116.94345700000001


Note: Changing the minimum longitude to -124.0, because values below -124 push the satellite image too far out into the ocean.

In [12]:
nwf_size = 10_000 # 10 thousand ought to be enough

new_lats = np.random.uniform(low = min(df.LAT_COORD), 
                             high = max(df.LAT_COORD), 
                             size = (nwf_size,))

new_lons = np.random.uniform(low = -124, 
                             high = max(df.LON_COORD), 
                             size = (nwf_size,))

# Named coords rather than d, which already holds a URL piece above
coords = {'lat': new_lats, 'lon': new_lons}
df_nwf = pd.DataFrame(data=coords)

# Rounding coordinates to 6 places
df_nwf['lat'] = df_nwf['lat'].round(6)
df_nwf['lon'] = df_nwf['lon'].round(6)

# For center coordinates in the URL
df_nwf['lat'] = df_nwf['lat'].astype(str)
df_nwf['lon'] = df_nwf['lon'].astype(str)
df_nwf['center'] = df_nwf['lat'] + ',' + df_nwf['lon']

df_nwf.head() # No Wild Fire

| | lat | lon | center |
|---|---|---|---|
| 0 | 48.390571 | -122.046509 | 48.390571,-122.046509 |
| 1 | 47.755541 | -118.816292 | 47.755541,-118.816292 |
| 2 | 48.738237 | -117.654688 | 48.738237,-117.654688 |
| 3 | 45.587224 | -122.794764 | 45.587224,-122.794764 |
| 4 | 48.930474 | -119.113149 | 48.930474,-119.113149 |
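
The random points above change on every run, so the downloaded 'non-wildfire' set can't be regenerated exactly. A small sketch of a seeded version using only the standard library follows. The bounds are copied from the printed ranges (with the western edge moved to -124.0 as noted), while the function name `random_centers` and the seed value are assumptions for illustration.

```python
import random

# Bounds from the printed latitude/longitude ranges, western edge set to -124.0
LAT_MIN, LAT_MAX = 45.557811, 48.99963
LON_MIN, LON_MAX = -124.0, -116.943457


def random_centers(n, seed=42):
    """Return n reproducible 'lat,lon' strings drawn uniformly from the bounding box."""
    rng = random.Random(seed)  # private generator, so the global state is untouched
    centers = []
    for _ in range(n):
        lat = round(rng.uniform(LAT_MIN, LAT_MAX), 6)
        lon = round(rng.uniform(LON_MIN, LON_MAX), 6)
        centers.append(f"{lat},{lon}")
    return centers


sample = random_centers(5)
print(sample)
```

Calling `random_centers(5)` twice returns the same list, which makes it possible to re-download or audit the exact same locations later.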


## Collecting Test NWF Images:

In [13]:
# Unnecessary to keep showing images for now...

for index, row in df_nwf.iterrows():
    url = url1 + row['center'] + url2
    r = requests.get(url)
    display(Image(r.content))

In [14]:
# Downloading the nwf images to local file

for index, row in df_nwf.iterrows():
    url = url1 + row['center'] + url2
    urllib.request.urlretrieve(url, 
                               "/Users/Thomas/Desktop/capstone/images/nwf_imgs/" 
                               + row['center'] 
                               + '.jpg')


# Conclusion:

I now have ~22k images to use for my neural network. Now I'm ready to build and train the model! <br><br>
__Wildfire Images:__
- 12,133 (duplicates filtered out)<br>

__Non-Wildfire Images:__
- 10,019

__Image examples:__<br>
__Areas with wildfires:__
![text](example_images/wf1.jpg)
![text](example_images/wf2.jpg)
![text](example_images/wf3.jpg)

__Areas without wildfires:__
![text](example_images/nwf1.jpg)
![text](example_images/nwf2.jpg)
![text](example_images/nwf3.jpg)