My first Kernel, yay!

I thought it would be fun to create a literal 'heat map' of the US: a map drawn entirely from the locations of wildfire occurrences, colored with Peter Kovesi's 'fire' colormap from colorcet (https://bokeh.github.io/colorcet/).

This notebook creates two maps: one that shows the number of wildfires for each geographical location, and one that shows the average size of the wildfires for each geographical location.

First we import what we need:

In [None]:
import sqlite3
import pandas as pd
import numpy as np
import colorcet as cc
from bokeh.io import output_notebook
from bokeh.plotting import figure, show
from bokeh.models import ColumnDataSource, LogColorMapper

Next, we extract the columns we need from the database and load them into a pandas dataframe:

In [None]:
cnx = sqlite3.connect('../input/FPA_FOD_20170508.sqlite')
df = pd.read_sql_query("SELECT LATITUDE, LONGITUDE, FIRE_SIZE, STATE FROM fires", cnx)
df.head(5)

Suppress pandas' chained-assignment warnings (SettingWithCopyWarning), which I believe are false positives here:

In [None]:
pd.options.mode.chained_assignment = None

Remove all wildfires in Alaska, Hawaii and Puerto Rico, because they don't fit on our map nicely:

In [None]:
# Keep only rows outside AK, HI and PR; .copy() gives an independent dataframe to modify later
new = df.loc[~df['STATE'].isin(['AK', 'HI', 'PR'])].copy()

Group wildfires that occurred near each other. To do this, I round all latitude and longitude values down to one decimal place (cells of 0.1 degree), combine the two values into a new attribute (LL_COMBO) and group the dataframe by this attribute.

In [None]:
# Round coordinates down to 0.1 degree so nearby fires share the same cell
new.loc[:,'LATITUDE'] = np.floor(new.loc[:,'LATITUDE']*10)/10
new.loc[:,'LONGITUDE'] = np.floor(new.loc[:,'LONGITUDE']*10)/10
# Text key identifying each cell, e.g. '34.5--118.3' (longitudes are negative)
new.loc[:,'LL_COMBO'] = new.loc[:,'LATITUDE'].map(str) + '-' + new.loc[:,'LONGITUDE'].map(str)
grouped = new.groupby(['LL_COMBO', 'LATITUDE', 'LONGITUDE'])
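
To make the binning step a bit more concrete, here is a minimal sketch of the same idea in plain Python, using a few made-up coordinates. The helper name `bin_coordinate` and the sample points are only for illustration and are not part of the dataset or the notebook.

```python
import math
from collections import Counter

def bin_coordinate(value, bins_per_degree=10):
    """Round a coordinate down to the nearest 1/bins_per_degree of a degree."""
    return math.floor(value * bins_per_degree) / bins_per_degree

# Made-up (latitude, longitude) points, for illustration only.
points = [(34.56, -118.23), (34.51, -118.29), (34.62, -118.31)]

# Count how many points fall into each 0.1-degree cell.
cell_counts = Counter(
    (bin_coordinate(lat), bin_coordinate(lon)) for lat, lon in points
)
print(cell_counts)
```

Note that rounding down moves negative longitudes further west (-118.23 ends up in the -118.3 cell). Every point is shifted the same way, so the picture on the map should not be distorted by it.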

Create the datasource that is needed for the first heat map (showing the number of wildfires per geographic location).

In [None]:
number_of_wf = grouped['FIRE_SIZE'].agg(['count']).reset_index()
number_of_wf.head(5)

Create the datasource that is needed for the second heat map (showing the average size of wildfires per geographic location). 

In [None]:
size_of_wf = grouped['FIRE_SIZE'].agg(['mean']).reset_index()
size_of_wf.head(5)
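
As a side note, both statistics could probably also be computed in a single `agg` call. The sketch below shows this on a tiny made-up dataframe (the values are invented for illustration); the notebook itself keeps the two datasources separate.

```python
import pandas as pd

# Made-up fires that have already been binned to 0.1-degree cells.
toy = pd.DataFrame({
    'LATITUDE':  [34.5, 34.5, 34.6],
    'LONGITUDE': [-118.3, -118.3, -118.4],
    'FIRE_SIZE': [1.0, 3.0, 10.0],
})

# One row per cell, with the number of fires and their average size.
stats = (toy.groupby(['LATITUDE', 'LONGITUDE'])['FIRE_SIZE']
            .agg(['count', 'mean'])
            .reset_index())
print(stats)
```

The resulting `count` and `mean` columns have the same names as the ones used by the two heat maps.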

Create and show the first heat map:

In [None]:
source = ColumnDataSource(number_of_wf)
p1 = figure(title="Number of wildfires occurring from 1992 to 2015 "
            "(lighter color means more wildfires)",
            # Bokeh 3.x renamed these two arguments to width and height
            toolbar_location=None, plot_width=600, plot_height=400)
p1.background_fill_color = "black"
p1.grid.grid_line_color = None
p1.axis.visible = False
color_mapper = LogColorMapper(palette=cc.fire)
glyph = p1.circle('LONGITUDE', 'LATITUDE', source=source,
          color={'field': 'count', 'transform' : color_mapper},
          size=1)
output_notebook()
show(p1)
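
A quick illustration of why a logarithmic color mapper is a reasonable choice when a few locations have far more wildfires than most others. The sketch below is my own simplified approximation of log versus linear mapping, not Bokeh's actual implementation; the function names and the value range (1 to 1000, 256 colors) are assumptions made for the example.

```python
import math

def log_palette_index(value, low, high, n_colors):
    """Map value to a palette index on a logarithmic scale (simplified)."""
    fraction = (math.log(value) - math.log(low)) / (math.log(high) - math.log(low))
    # Clamp so that value == high still lands on the last color.
    return min(max(math.floor(fraction * n_colors), 0), n_colors - 1)

def linear_palette_index(value, low, high, n_colors):
    """Map value to a palette index on a linear scale, for comparison."""
    fraction = (value - low) / (high - low)
    return min(max(math.floor(fraction * n_colors), 0), n_colors - 1)

# A location with 10 fires, when counts range from 1 to 1000:
print(log_palette_index(10, 1, 1000, 256))
print(linear_palette_index(10, 1, 1000, 256))
```

On the linear scale, a location with 10 fires stays almost as dark as a location with 1 fire, while the logarithmic scale already moves it about a third of the way up the palette.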

Create and show the second heat map:

In [None]:
source = ColumnDataSource(size_of_wf)
p2 = figure(title="Average size of wildfires occurring from 1992 to 2015 "
            "(lighter color means bigger fire)",
            # Bokeh 3.x renamed these two arguments to width and height
            toolbar_location=None, plot_width=600, plot_height=400)
p2.background_fill_color = "black"
p2.grid.grid_line_color = None
p2.axis.visible = False
glyph = p2.circle('LONGITUDE', 'LATITUDE', source=source,
          color={'field': 'mean', 'transform' : color_mapper},
          size=1)
show(p2)

Conclusion:
I am not from the US, so I am not really familiar with the geography, but it seems to me that wildfires occur more often in areas where lots of people live (makes sense, right?). These wildfires also tend to be smaller, probably because they are contained more quickly (fires are detected sooner and fire departments are closer by).