# **Wildfires across ecoregions in the United States**

**Introduction**

In the Western United States, the frequency of large fires has been increasing. Fire ignitions, which are necessary to start new wildfires, are predominantly anthropogenic. Wildfire ignitions have increased in recent decades as human populations expand into more areas (Nagy et al., 2018). However, not all ignitions lead to large fires: ignitions during weather unfavorable to fire spread result in small fires that can be easily suppressed. Ignitions that occur during weather favorable for combustion and fire spread, by contrast, lead to rapid spread rates. When suppression is absent, limited by severe weather, or overtaken by fire spread, this rapid growth can lead to large fires.

A key question in understanding and managing wildfires in the United States is where large fire frequency is increasing and what might be causing the increase. There is evidence that climate change is increasing the incidence of weather conditions favorable to fire spread in the Western US (Goss et al., 2020; Barbero et al., 2015). Whether this trend is also occurring, or will occur, in the Eastern US is an open question. Because the Eastern US has a more humid climate, wildfires there typically occur during periods of drought (dry spells). It is unclear how climate change will alter drought and its ecological impacts in the Eastern US (Djajapranata, 2023), so whether the risk of large fires will increase in the East remains unknown. The goal of this analysis is to understand how large fire frequency has changed in different regions of the United States, which provides a baseline from which to hypothesize about future changes in wildfire regimes.


>**Sources**
>
>Barbero, R., Abatzoglou, J. T., Larkin, N. K., Kolden, C. A., & Stocks, B. (2015). Climate change presents increased potential for very large fires in the contiguous United States. International Journal of Wildland Fire, 24(7), 892-899.
>
>Djajapranata, C. (2023, June 9). Are Wildfires Coming for the East Coast? Georgetown University. https://www.georgetown.edu/news/is-the-east-coast-the-next-wildfire-hotspot/
>
>Goss, M., Swain, D. L., Abatzoglou, J. T., Sarhadi, A., Kolden, C. A., Williams, A. P., & Diffenbaugh, N. S. (2020). Climate change is increasing the likelihood of extreme autumn wildfire conditions across California. Environmental Research Letters, 15(9), 094016.
>
>Nagy, R. C., Fusco, E., Bradley, B., Abatzoglou, J. T., & Balch, J. (2018). Human-related ignitions increase the number of large wildfires across US ecoregions. Fire, 1(1), 4.



In [146]:
import os
import warnings
import io
import zipfile

import cartopy.crs as ccrs
import earthpy as et
import geopandas as gpd
import geoviews as gv
import holoviews as hv
import hvplot.pandas
import pandas as pd
import pyogrio
import matplotlib.pyplot as plt
import numpy as np

import requests
from scipy.stats import percentileofscore

gv.extension('bokeh')

warnings.filterwarnings('ignore')

## Ecoregions dataset

Bailey's ecoregions dataset contains data on ecoregions defined by the U.S. Forest Service.

>**Citation**
>
> (dataset) (2022). Bailey's Ecoregions and Subregions Dataset. U.S. Forest Service. https://data.nal.usda.gov/dataset/baileys-ecoregions-and-subregions-dataset-0. Accessed 2023-10-19.

In [147]:
# Read the ecoregions GeoJSON directly from the USFS ArcGIS Hub
ecoregions_gdf = gpd.read_file("https://data-usfs.hub.arcgis.com/datasets/usfs::baileys-ecoregions-and-subregions-dataset.geojson?outSR=%7B%22latestWkid%22%3A4269%2C%22wkid%22%3A4269%7D")

# ecoregions_gdf

Test plot to check if we correctly imported the data.

In [148]:
# ecoregions_gdf.hvplot()

## Fire occurrence data

The spatial wildfire occurrence dataset is generated by the United States Forest Service. The dataset contains point locations, final fire size, and discovery date for all recorded wildfires that occurred in the United States from 1992 to 2020. 

>**Data Citation**
>
>Short, Karen C. 2022. Spatial wildfire occurrence data for the United States, 1992-2020 [FPA_FOD_20221014]. 6th Edition. Fort Collins, CO: Forest Service Research Data Archive. https://doi.org/10.2737/RDS-2013-0009.6

In [149]:
# Download the zipped wildfire occurrence geodatabase
fire_url = (
    "https://www.fs.usda.gov/rds/archive/products/RDS-2013-0009.6"
    "/RDS-2013-0009.6_Data_Format2_GDB.zip"
)
user_agent = (
    'Mozilla/5.0 (X11; Linux x86_64; rv:60.0) '
    'Gecko/20100101 Firefox/81.0'
)
r = requests.get(url=fire_url, headers={'User-Agent': user_agent})
# Stop early if the download failed instead of unzipping an error page
r.raise_for_status()

# Unzip data
data_path = os.path.join(
    et.io.HOME, et.io.DATA_NAME, 
    'earthpy-downloads', 'RDS-2013-0009.6_Data_Format2_GDB')
archive = zipfile.ZipFile(io.BytesIO(r.content))
archive.extractall(path=data_path)
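
The cell above downloads and extracts the archive every time it runs. The following is a minimal sketch of one way to skip that work when the data are already on disk. The helper name `extract_once` and its `download` argument are hypothetical and not part of the original notebook; `download` is assumed to be any zero-argument callable that returns the zip file as bytes.

```python
import io
import os
import zipfile


def extract_once(target_dir, download):
    """Extract a zip archive into target_dir unless it is already populated.

    `download` is only called when extraction is needed, so an existing
    copy of the data avoids the network request entirely.
    Returns True when extraction ran and False when it was skipped.
    """
    if os.path.isdir(target_dir) and os.listdir(target_dir):
        return False
    with zipfile.ZipFile(io.BytesIO(download())) as archive:
        archive.extractall(path=target_dir)
    return True


# Possible use with the variables defined in the cell above (assumption):
# extract_once(
#     data_path,
#     lambda: requests.get(fire_url, headers={'User-Agent': user_agent}).content)
```

Passing the download as a callable, rather than passing the bytes, is what lets the helper avoid the request when the target directory already holds files.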

In [150]:
fire_path = os.path.join(data_path, 'Data', 'FPA_FOD_20221014.gdb')

# Read the Fires layer only once per session to avoid re-reading the geodatabase
if 'fire_gdf' not in globals():
    print('fire_gdf does not exist. Loading...')
    fire_gdf = pyogrio.read_dataframe(fire_path, layer='Fires')
# fire_gdf.head()

# fire_gdf.info()

In [151]:
# Clean the fire dataset
fire_clean_gdf = (
    fire_gdf
    [['FOD_ID', 
    'DISCOVERY_DATE', 'FIRE_SIZE', 'geometry']]
    .set_index('FOD_ID')
)
# Convert discovery date to the datetime format
fire_clean_gdf.DISCOVERY_DATE = pd.to_datetime(fire_clean_gdf.DISCOVERY_DATE)

# Convert the fire_clean_gdf to the same CRS as the ecoregions
fire_clean_gdf = fire_clean_gdf.to_crs(ecoregions_gdf.crs)

# fire_clean_gdf

In [152]:
# fire_clean_gdf.plot()

## Visualize fire density across ecoregions

In [153]:
# Spatial join the fires with the ecoregions
fire_region_gdf = ( 
    ecoregions_gdf
    [['ECO_US_', 'geometry']]
    .sjoin(
        fire_clean_gdf,
        how='inner', # only include rows for fires that are within an ecoregion boundary
        predicate='intersects')
)

fire_summary_gdf = (
    fire_region_gdf
        .groupby(['ECO_US_', 
                  fire_region_gdf.DISCOVERY_DATE.dt.year
                  ])
        .agg(
            max_fire_size=('FIRE_SIZE', 'max'),
            num_fires=('index_right', 'count'))
)
# fire_summary_gdf
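
To make the grouping step concrete, here is a small sketch on made-up records. The values in `toy_fires` are hypothetical and only mimic the columns used above; the fires are counted via `FIRE_SIZE` so that the example does not depend on the index column that `sjoin` adds.

```python
import pandas as pd

# Hypothetical records standing in for the joined fire/ecoregion table
toy_fires = pd.DataFrame({
    "ECO_US_": [1, 1, 1, 2, 2],
    "DISCOVERY_DATE": pd.to_datetime(
        ["2000-06-01", "2000-08-15", "2020-07-04", "2000-05-20", "2020-09-09"]),
    "FIRE_SIZE": [0.5, 120.0, 3.0, 10.0, 4500.0],
})

# Group by ecoregion and discovery year, then summarize each group
toy_summary = (
    toy_fires
    .groupby(["ECO_US_", toy_fires.DISCOVERY_DATE.dt.year])
    .agg(
        max_fire_size=("FIRE_SIZE", "max"),
        num_fires=("FIRE_SIZE", "count"))
)
print(toy_summary)
```

The result has one row per (ecoregion, year) pair, which is the same two-level index that the later `.loc[(slice(None), 2020), :]` selections rely on.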

In [154]:
# Calculate the area of each ecoregion in an Albers equal-area CRS
# (EPSG:5070, NAD83 / Conus Albers; 9822 is the EPSG code of the Albers
# projection method rather than of a CRS)
ecoregions_gdf['area_ha'] = (
    ecoregions_gdf.to_crs(5070).area
    # Convert square meters to hectares
    / 10000
)

# Create gdf for fire density
fire_density_gdf = (
    ecoregions_gdf.set_index('ECO_US_')
    .join(fire_summary_gdf)
    [['num_fires', 'area_ha', 'geometry']]
)

# Create new column for fire density (number of fires in the ecoregion / total area of ecoregion)
fire_density_gdf['fire_density_per_ha'] = (
    fire_density_gdf.num_fires / fire_density_gdf.area_ha)
    
# fire_density_gdf

In [155]:
# fire_density_gdf.dropna()

# Create histogram with specified bin width
# plt.hist(data, edgecolor='black', bins=np.arange(min(data), max(data) + w, w))

# fire_density_gdf.hist(["fire_density_per_ha"], edgecolor='black', bins=50)

In [156]:
# Simplify the polygons to speed up plotting (tolerance is in CRS units, here degrees)
fire_density_gdf.geometry = fire_density_gdf.geometry.simplify(tolerance=0.1)
# fire_density_gdf.index.names

In [157]:
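# Select the 2020 rows; copy so that adding a column does not write to a view
fire_density_2020_gdf = fire_density_gdf.loc[(slice(None), 2020), :].copy()

# Percentile rank of each ecoregion's fire density among all ecoregions in 2020
for index, row in fire_density_2020_gdf.iterrows():
    percentile_score = percentileofscore(
        fire_density_2020_gdf['fire_density_per_ha'],
        row['fire_density_per_ha'])
    fire_density_2020_gdf.loc[index, 'percentile_score'] = percentile_score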

# fire_density_2020_gdf

In [158]:
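# Select the 2000 rows; copy so that adding a column does not write to a view
fire_density_2000_gdf = fire_density_gdf.loc[(slice(None), 2000), :].copy()

# Percentile rank of each ecoregion's fire density among all ecoregions in 2000
for index, row in fire_density_2000_gdf.iterrows():
    percentile_score = percentileofscore(
        fire_density_2000_gdf['fire_density_per_ha'],
        row['fire_density_per_ha'])
    fire_density_2000_gdf.loc[index, 'percentile_score'] = percentile_score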

# fire_density_2000_gdf
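
The row-by-row percentile loops above could probably be replaced by a single vectorized call. The sketch below uses made-up densities (`toy_density` is hypothetical) and pandas' `rank(pct=True)`, which averages tied ranks. For data without missing values this should give the same numbers as `percentileofscore` with its default `kind='rank'`, though that equivalence is worth checking against the real data before swapping it in.

```python
import pandas as pd

# Hypothetical fire densities for four ecoregions in a single year
toy_density = pd.DataFrame(
    {"fire_density_per_ha": [0.002, 0.010, 0.004, 0.010]},
    index=pd.Index([10, 20, 30, 40], name="ECO_US_"),
)

# rank(pct=True) averages tied ranks and divides by the number of rows,
# so multiplying by 100 gives a percentile rank for every row at once
toy_density["percentile_score"] = (
    toy_density["fire_density_per_ha"].rank(pct=True) * 100
)
print(toy_density)
```

Here the two tied ecoregions share ranks 3 and 4 out of 4, so both receive the average, 87.5.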

In [159]:
center_lon = -98.5796
center_lat = 39.8282

poly_plot = (
    gv.Polygons(
        fire_density_2000_gdf
        .reset_index(),
        vdims=['percentile_score', 'num_fires', 'ECO_US_'])
    .opts(
        width=600, height=600,
        colorbar=True, color='percentile_score',
        cmap='plasma_r',
        # line_color='white',
        xaxis='bare', yaxis='bare', tools=['hover'],
        title='Percentiles of fire density per ha in 2000 for US ecoregions'

    )
)

(gv.tile_sources.EsriImagery * poly_plot).opts(
        data_aspect=1,
        width=800
        )

In [160]:
poly_plot = (
    gv.Polygons(
        fire_density_2020_gdf
        .reset_index(),
        vdims=['percentile_score', 'num_fires', 'ECO_US_'])
    .opts(
        width=600, height=600,
        colorbar=True, color='percentile_score',
        cmap='plasma_r',
        # line_color='white',
        xaxis='bare', yaxis='bare', tools=['hover'],
        title='Percentiles of fire density per ha in 2020 for US ecoregions'

    )
)

(gv.tile_sources.EsriImagery * poly_plot).opts(
        data_aspect=1,
        width=800
    )

These maps show that in 2000, ecoregions in the Southeastern United States had higher fire densities (number of fires per hectare) than other ecoregions in that year.

In 2020, however, ecoregions in California and the Northeast had higher fire densities than other ecoregions in that year.

Because the percentiles are computed within each year, the maps show relative rankings rather than absolute change. Further analysis could explore whether fire density has increased more over time in certain ecoregions than in others.
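
One possible starting point for that follow-up is a least-squares slope of fire density against year for each ecoregion. This is a sketch on made-up numbers, not a result from the notebook: `toy_density` and `linear_trend` are hypothetical, and the index levels are named to mirror those of `fire_density_gdf`.

```python
import numpy as np
import pandas as pd

# Hypothetical yearly densities for two ecoregions
toy_index = pd.MultiIndex.from_product(
    [[1, 2], [2000, 2010, 2020]], names=["ECO_US_", "DISCOVERY_DATE"])
toy_density = pd.Series(
    [1.0, 2.0, 3.0, 5.0, 5.0, 5.0],
    index=toy_index, name="fire_density_per_ha")


def linear_trend(series):
    """Least-squares slope of density against year (density units per year)."""
    years = series.index.get_level_values("DISCOVERY_DATE").to_numpy(dtype=float)
    # Centering the years leaves the slope unchanged and keeps the fit well scaled
    slope, _intercept = np.polyfit(
        years - years.mean(), series.to_numpy(dtype=float), deg=1)
    return slope


# One slope per ecoregion; positive values indicate increasing density
toy_trend = toy_density.groupby(level="ECO_US_").apply(linear_trend)
print(toy_trend)
```

A straight-line slope is only one way to summarize change; with 29 years of noisy counts, a rank-based trend test might be a more robust choice.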

In [163]:
%%capture
%%bash
jupyter nbconvert wildfire.ipynb --to html --no-input