# Wildfires by Large River Basins (HUC-4) 

<figure style="display: inline-block; border: 1px dotted gray; margin: 20px;">
    <img alt="Rim Fire and American Fire large" src="https://live.staticflickr.com/8519/8597688091_85571d79ce_w.jpg" style="float:left; height: 500px; vertical-align: top;"/>
    <figcaption style="text-align: center; height: 400px; vertical-align: top" > Nevada forest fire fueled by invasive cheatgrass, 2007
        <br><a href="https://www.flickr.com/photos/usfwshq/8597688091">U.S. Fish and Wildlife Service Headquarters</a>, Public domain, via Flickr
        </figcaption>
</figure>


### Overview
* Over the past several decades, there has been an increase in large wildfires in the United States. 

* Wildfire risk depends on environmental factors, including the availability of fuel, weather, and local topography (Prestemon et al., 2013; Moore, 2021; Nagy et al., 2018).

* Human-caused ignitions are responsible for 84% of all wildfires and about half of the total area burned (Balch et al., 2017).

* This page will explore variation in the density and size of wildfires across large river basins in the United States (as defined by HUC-4 sub-regions from the USGS).

### Data Description
* This analysis uses fire data from the Fire Program Analysis fire-occurrence database (FPA FOD). The dataset includes 2.3 million wildfire records.

* Watershed subregions are based on the USGS Watershed Boundary Dataset. The analysis used the HUC-4 level, which represents 245 subregions with shared hydrologic features (USGS, n.d.).

* Fire data was joined to subregional watershed data and to state boundaries, respectively, to examine variations in fire patterns across these two geographical areas.

* The density and average size of fires were calculated for each HUC-4 sub-region.
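
The two summary statistics described above can be sketched on a toy table. This is only an illustration: the region names, fire sizes, and areas are hypothetical, and it assumes each fire has already been tagged with its sub-region name by the spatial join.

```python
import pandas as pd

# Hypothetical joined table: one row per fire, tagged with its sub-region
fires = pd.DataFrame({
    "name": ["Basin A", "Basin A", "Basin A", "Basin B"],
    "FIRE_SIZE": [10.0, 20.0, 30.0, 100.0],
})

# Hypothetical sub-region areas in hectares
areas = pd.Series({"Basin A": 1_000.0, "Basin B": 4_000.0}, name="area_ha")

# Count fires and average their size within each sub-region
summary = fires.groupby("name").agg(
    num_fires=("FIRE_SIZE", "count"),
    avg_fire_size=("FIRE_SIZE", "mean"),
)

# Density is the fire count divided by the sub-region area
summary = summary.join(areas)
summary["fires_per_ha"] = summary["num_fires"] / summary["area_ha"]
print(summary)
```

Dividing by area matters because sub-regions differ widely in size, so raw counts alone would favor the largest basins.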

### Data Citation
* U.S. Geological Survey (2023). Watershed Boundary Dataset (v2.3.1), accessed October 8, 2023 at https://www.usgs.gov/national-hydrography/access-national-hydrography-products

* Short, Karen C. 2022. Spatial wildfire occurrence data for the United States, 1992-2020 [FPA_FOD_20221014]. 6th Edition. Fort Collins, CO: Forest Service Research Data Archive. https://doi.org/10.2737/RDS-2013-0009.6

In [1]:
# Import Packages
import os
import pathlib

import cartopy.crs as ccrs
import earthpy as et
import geopandas as gpd
import geoviews as gv
import holoviews as hv
import hvplot.pandas
import pandas as pd
import pyogrio
from bokeh.models import HoverTool

In [2]:
# Get url for watershed boundaries
wbd_url = (
    "https://prd-tnm.s3.amazonaws.com/StagedProducts/"
    "Hydrography/WBD/National/GDB/WBD_National_GDB.zip"
)

# Get data from url
wbd_path = et.data.get_data(url=wbd_url)

In [3]:
# Get HUC-4 layer
wbd4_gdb = gpd.read_file(
    os.path.join(wbd_path, "WBD_National_GDB.gdb"),
    driver="OpenFileGDB",
    layer="WBDHU4",
    from_disk=True,
)

In [4]:
# Define url for the fire data
fire_url = (
    "https://www.fs.usda.gov/rds/archive/products/"
    "RDS-2013-0009.6/RDS-2013-0009.6_Data_Format2_GDB.zip"
)

# Define directory for fire data
fire_dir = et.data.get_data(url=fire_url)

In [5]:
# Define path for fire geodatabase
fire_path = os.path.join(fire_dir, "Data", "FPA_FOD_20221014.gdb")

# Load data
if "fire_gdf" not in globals():
    fire_gdf = pyogrio.read_dataframe(fire_path, layer="Fires")

In [6]:
# Select variables
fire_clean_gdf = fire_gdf[
    [
        "FOD_ID",
        "LATITUDE",
        "LONGITUDE",
        "DISCOVERY_DATE",
        "NWCG_GENERAL_CAUSE",
        "FIRE_SIZE",
        "geometry",
    ]
].set_index("FOD_ID")

# Reformat Discovery Date as a Datetime
fire_clean_gdf.DISCOVERY_DATE = pd.to_datetime(fire_clean_gdf.DISCOVERY_DATE)

# Change CRS
fire_clean_gdf = fire_clean_gdf.to_crs(wbd4_gdb.crs)

In [7]:
# Join fire data to the watershed data
fire_reg_gdf = wbd4_gdb.sjoin(
    fire_clean_gdf, 
    how="inner", 
    predicate="intersects"
)

# Calculate maximum fire size and number of fires by year and watershed
fire_reg_gdf = (fire_reg_gdf.groupby(
    ["name", fire_reg_gdf.DISCOVERY_DATE.dt.year])
    .agg(max_fire_size=("FIRE_SIZE", "max"), 
         num_fires=("index_right", "count"))
)

In [8]:
# Compute area of each watershed
wbd4_gdb["area_ha"] = (
    wbd4_gdb.to_crs(9822).area
    / 10000  # Convert to hectares
)

# Sum the annual fire counts for each watershed
fire_count_df = (
    fire_reg_gdf
    .reset_index()
    [["name", "num_fires"]]
    .groupby("name").sum()
)
fire_density_gdf = (wbd4_gdb.set_index("name")
                    .join(fire_count_df)
                    [["num_fires", "area_ha", "geometry"]]
                    )

# Calculate density of fires per hectare
# (the column name says "mha", but the values are fires per hectare)
fire_density_gdf["fire_density_per_mha"] = (
    fire_density_gdf["num_fires"] / fire_density_gdf["area_ha"]
)

In [9]:
# Create gdf for Average Size of Fires
fire_avgsz_reg_gdf = wbd4_gdb.sjoin(
    fire_clean_gdf, 
    how="inner", 
    predicate="intersects"
)

# Calculate average fire size by region
fire_avgsz_reg_gdf = (fire_avgsz_reg_gdf.groupby(
    ["name", fire_avgsz_reg_gdf.DISCOVERY_DATE.dt.year])
    .agg(avg_fire_size=("FIRE_SIZE", "mean"))
)

# Summarize average fire size over years
fire_avg_df = (
    fire_avgsz_reg_gdf
    .reset_index()
    [["name", "avg_fire_size"]]
    .groupby("name").mean()
)

# Join back to geodata
fire_stats_gdf = (wbd4_gdb.set_index("name")
                    .join(fire_avg_df)
                    [["geometry", "avg_fire_size"]]
                    )

In [10]:
# Merge data for plots
merged_summ_gdf = (
    fire_reg_gdf.reset_index()
    .merge(fire_avgsz_reg_gdf.reset_index(), on=["name", "DISCOVERY_DATE"])
    .set_index(["name", "DISCOVERY_DATE"])
)

In [11]:
# Set values for ylabels and titles
labels = pd.DataFrame(dict(
    column_name = ['num_fires', "avg_fire_size", 'max_fire_size'],
    ylabel = ['Number of Fires', 'Average Size of Fires (acres)', 'Largest Fire Size (acres)'],
    title = ['Number of fires in the sub-region', 'Average fire size in the sub-region', 'Largest recorded fire in sub-region']))

def fire_plot(region_name, df=merged_summ_gdf, labels=labels):
    """
    Create a multi-panel plot for a region

    Parameters
    ----------
    region_name : str
      The name of the region to generate a plot for. Must exist
      in the 'name' index of df.
    df : pd.DataFrame
      The dataframe with the data to plot. Columns must match
      an item in labels.column_name to be plotted.
    labels : pd.DataFrame
      Plot labels. Must have a 'column_name', 'ylabel', and 'title'
      columns with str values. Each row will be a subplot.

    Returns
    -------
    plot : hv.core.layout.Layout
      A holoviews plot layout or similar. For use with hv.DynamicMap.
    """
    # Generate a subplot for each row in the labels
    subplots = []
    # Iterate through the labels row by row
    for i, labs in labels.iterrows():
        # Create subplot
        subplot = (
            df.xs(region_name, level='name')
            [[labs.column_name]]
            .hvplot(xlabel="Year", 
                    ylabel=labs.ylabel, 
                    title=labs.title,
                    height=250))
        subplots.append(subplot)

    # Stack the subplots vertically
    plot = hv.Layout(subplots).cols(1)
    return plot

# Create a dropdown menu to switch between regions
if not os.path.exists('/workspaces/pth6570.github.io/notebooks/wildfire_huc4_plots.html'):
  hv.save((
      hv.DynamicMap(
          # The plotting function for the three-panel fire history
          fire_plot,
          # Define the dimension for the dropdown
          kdims=[('subregion', 'Sub-Region')])
      # Add the explicit indexing - region names as a bokeh dimension
      .redim.values(subregion=merged_summ_gdf.reset_index().name)
  ), "wildfire_huc4_plots.html")


## Wildfire Density and Size by HUC-4 Sub-Regions
The number of wildfires, as well as the average and maximum fire size, per year in each HUC-4 subregion can be explored using [the following tool](https://pth6570.github.io/notebooks/wildfire_huc4_plots.html).
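
As a rough sketch of what sits behind that tool, the per-year summaries can be built with a single grouped aggregation. The fire records below are hypothetical; the column names follow the ones used in this notebook.

```python
import pandas as pd

# Hypothetical fire records, already tagged with a sub-region name
fires = pd.DataFrame({
    "name": ["Basin A", "Basin A", "Basin A", "Basin B"],
    "DISCOVERY_DATE": pd.to_datetime(
        ["1992-06-01", "1992-08-15", "1993-07-04", "1992-05-20"]
    ),
    "FIRE_SIZE": [10.0, 30.0, 5.0, 100.0],
})

# Group by sub-region and discovery year, then compute all three statistics
yearly = fires.groupby(
    ["name", fires["DISCOVERY_DATE"].dt.year]
).agg(
    num_fires=("FIRE_SIZE", "count"),
    avg_fire_size=("FIRE_SIZE", "mean"),
    max_fire_size=("FIRE_SIZE", "max"),
)

# Select one sub-region's time series, as the plotting function does
print(yearly.xs("Basin A", level="name"))
```

The result is indexed by sub-region and year, which is the shape the plotting function expects when it selects a region with `xs`.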

#### Wildfire Density in HUC-4 Subregions
* The map below shows the density of wildfires in HUC-4 Subregions.

* The highest density of wildfires is in the southwestern U.S., the southeastern U.S., as well as in the New York/New Jersey area. Specifically, the Lower Hudson-Long Island subregion has the highest density of wildfires.
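
Densities expressed per hectare are very small numbers, which can make the color bar hard to read. One option is to rescale to fires per million hectares. The helper below is a hypothetical illustration and is not part of the notebook's pipeline.

```python
def fires_per_million_ha(num_fires, area_ha):
    """Return fire density as fires per million hectares."""
    if area_ha <= 0:
        raise ValueError("area_ha must be positive")
    # Multiply before dividing to keep the arithmetic exact for whole numbers
    return num_fires * 1_000_000 / area_ha


# Hypothetical sub-region: 2,500 fires over 5 million hectares
density = fires_per_million_ha(2_500, 5_000_000)
print(density)  # 500.0
```

The same sub-region would show a density of 0.0005 when expressed per hectare.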

In [12]:
# Simplify to reduce rendering time.
fire_density_gdf.geometry = (
    fire_density_gdf
    .geometry
    .simplify(tolerance=0.1)
)

not_contig_list = ["Arctic Alaska", "Northwest Alaska", "Lower Yukon River",
           "Southwest Alaska", "Middle Yukon River", "Upper Yukon River",
           "South Central Alaska", "Southeast Alaska", "Kauai", "Oahu",
           "Molokai", "Maui", "Lanai", "Hawaii", "Kahoolawe", "Puerto Rico"]

fire_density_contig_gdf = fire_density_gdf.drop(not_contig_list, axis="rows")

# Create choropleth plot using fire density
den_plt = gv.Polygons(
    fire_density_contig_gdf
    .reset_index()
    .dropna()[["fire_density_per_mha", "geometry", "name"]]
)

# Customize plot options
hover_tool = HoverTool(tooltips=[
        ("name", "@name"),
        ("Density","@fire_density_per_mha{.01,5}")])

den_plt.opts(
    width=650,
    height=500,
    data_aspect=1,
    colorbar=True,
    cmap="inferno_r",
    tools=[hover_tool],
    projection=ccrs.PlateCarree(central_longitude=-121),
    title="Density of Wildfires (per hectare) by HUC-4 Subregions"
)

den_plt

In [13]:
# Split data in half by year range
fire_reg_unindex_gdf = fire_reg_gdf.reset_index()
fire_early_gdf = fire_reg_unindex_gdf[fire_reg_unindex_gdf["DISCOVERY_DATE"] < 2006]
fire_later_gdf = fire_reg_unindex_gdf[fire_reg_unindex_gdf["DISCOVERY_DATE"] >= 2006]

In [14]:
# Sum the annual fire counts for each watershed (1992 to 2005)
fire_early_count_df = (
    fire_early_gdf
    .reset_index()
    [["name", "num_fires"]]
    .groupby("name").sum()
)

# Calculate density for data 1992 to 2005
fire_density_early_gdf = (wbd4_gdb.set_index("name")
                    .join(fire_early_count_df)
                    [["num_fires", "area_ha", "geometry"]]
                    )

# Calculate density of fires per hectare
fire_density_early_gdf["fire_density_per_mha_early"] = (
    fire_density_early_gdf["num_fires"] / fire_density_early_gdf["area_ha"]
)

In [15]:
# Sum the annual fire counts for each watershed (2006 to 2020)
fire_late_count_df = (
    fire_later_gdf
    .reset_index()
    [["name", "num_fires"]]
    .groupby("name").sum()
)

# Calculate density for data 2006 to 2020
fire_density_late_gdf = (wbd4_gdb.set_index("name")
                    .join(fire_late_count_df)
                    [["num_fires", "area_ha", "geometry"]]
                    )

# Calculate density of fires per hectare
fire_density_late_gdf["fire_density_per_mha"] = (
    fire_density_late_gdf["num_fires"] / fire_density_late_gdf["area_ha"]
)

In [16]:
# Merge early and later sections of data
fire_density_diffyr_df = fire_density_early_gdf.merge(
    fire_density_late_gdf, how="left", on="name"
)

# Calculate differences (later period minus earlier period).
# After the merge, "_x" columns are 1992-2005 and "_y" columns are 2006-2020.
fire_density_diffyr_df["difference_ha"] = (
    fire_density_diffyr_df["fire_density_per_mha"]
    - fire_density_diffyr_df["fire_density_per_mha_early"]
)
fire_density_diffyr_df["difference_num"] = (
    fire_density_diffyr_df["num_fires_y"]
    - fire_density_diffyr_df["num_fires_x"]
)

# Rename variables
fire_density_diffyr_df = fire_density_diffyr_df.rename(
    columns={"geometry_x": "geometry"}
)

# Drop non-contiguous regions
fire_density_diffyr_contig_df = fire_density_diffyr_df.drop(
    not_contig_list, axis="rows"
)

# Select columns for data
fire_density_diffyr_contig_df = (fire_density_diffyr_contig_df.reset_index()
                                 [["geometry", "name", "difference_ha", "difference_num"]]
)

#### Wildfire Density in HUC-4 Subregions over Time
* The map below shows the change in density of wildfires in HUC-4 Subregions, comparing the first half of the fire data (1992-2005) to the latter half (2006-2020).

* The largest decreases in fire density occurred in the southeast, including the following sub-regions: Edisto-Santee, Ogeechee-Savannah, Alabama, Pascagoula, and Pearl.

* The largest increases have occurred in the southwest, south-central, and northeast regions. The Lower Hudson-Long Island sub-region had the largest change.

* Although the southeast had the largest decrease in wildfire density, it remains among the regions with the highest overall density (see figure above).
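
The comparison between the two periods can be sketched as follows. The counts and area are hypothetical, and the split year of 2006 follows the one used in this notebook; the difference is taken as the later period minus the earlier period, so negative values mean fewer fires per hectare in 2006-2020.

```python
import pandas as pd

# Hypothetical annual fire counts for one sub-region
yearly = pd.DataFrame({
    "name": ["Basin A"] * 4,
    "year": [2004, 2005, 2006, 2007],
    "num_fires": [10, 20, 5, 15],
})

# Hypothetical sub-region area in hectares
area_ha = pd.Series({"Basin A": 1_000.0})

# Label each year with its period, then total the counts per period
period = yearly["year"].map(lambda y: "early" if y < 2006 else "late")
counts = yearly.groupby(["name", period])["num_fires"].sum().unstack()

# Convert counts to densities and take later minus earlier
density = counts.div(area_ha, axis="index")
density["difference"] = density["late"] - density["early"]
print(density)
```

Note that the two periods differ slightly in length (14 and 15 years), so dividing each total by its number of years before differencing would be a reasonable refinement.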

In [17]:
# Convert to a GeoDataFrame so the geometry can be simplified and plotted
fire_density_diffyr_contig_gdf = gpd.GeoDataFrame(
    fire_density_diffyr_contig_df,
    crs=wbd4_gdb.crs,
    geometry=fire_density_diffyr_contig_df.geometry,
)

# Simplify to reduce rendering time.
fire_density_diffyr_contig_gdf.geometry = (
    fire_density_diffyr_contig_gdf
    .geometry
    .simplify(tolerance=0.1)
)


# Create choropleth plot using the change in fire density
diff_plt = gv.Polygons(
    fire_density_diffyr_contig_gdf
    .reset_index()
    .dropna()[["difference_ha", "geometry", "name"]]
)

# Customize plot options
hover_tool_diff = HoverTool(tooltips=[
        ("name", "@name"),
        ("Difference in Density of Fires per hectare","@difference_ha{.01,5}")])

diff_plt.opts(
    width=650,
    height=500,
    data_aspect=1,
    colorbar=True,
    cmap="coolwarm",
    tools=[hover_tool_diff],
    projection=ccrs.PlateCarree(central_longitude=-121),
    title=("Difference in density of wildfires (per hectare) "
    "between earlier (1992 to 2005) and later (2006 to 2020) periods by Subregions")
)

# Set range for coloring
diff_plt = diff_plt.redim.range(difference_ha=(-.006,.006))
diff_plt

### Average Annual Fire Size by HUC-4 Subregion
* The HUC-4 subregions in the western half of the U.S. tend to have larger wildfires than those in the eastern half of the country.

* Average annual fire size is particularly high in the Arkansas-Keystone, Loup, Platte, and North Canadian sub-regions.
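
"Average annual fire size" here means the mean of the yearly means, since the analysis first averages fire size within each year and then averages those values across years. That is not the same as the mean over all fires pooled together. A minimal sketch with hypothetical fire sizes shows the difference:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (year, fire size) records for a single sub-region
fires = [(2000, 1.0), (2000, 3.0), (2000, 2.0), (2001, 100.0)]

# Group fire sizes by year
by_year = defaultdict(list)
for year, size in fires:
    by_year[year].append(size)

# Average annual fire size: the mean of the yearly means
annual_means = {year: mean(sizes) for year, sizes in by_year.items()}
avg_annual = mean(annual_means.values())

# Pooled mean: every fire weighted equally regardless of year
pooled = mean(size for _, size in fires)

print(avg_annual, pooled)  # 51.0 26.5
```

A single year with one very large fire pulls the mean of yearly means up much more than the pooled mean, which is worth keeping in mind when reading the map.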

In [18]:
# Simplify to reduce rendering time.
fire_stats_gdf.geometry = (
    fire_stats_gdf
    .geometry
    .simplify(tolerance=0.1)
)

fire_stats_contig_gdf = fire_stats_gdf.drop(not_contig_list, axis="rows")

# Create choropleth plot using average fire size
avg_size_plt = gv.Polygons(
    fire_stats_contig_gdf
    .reset_index()
    .dropna()[["avg_fire_size", "geometry", "name"]]
)

# Customize plot options
hover_tool = HoverTool(tooltips=[
        ("name", "@name"),
        ("Average Fire Size","@avg_fire_size")])

avg_size_plt.opts(
    width=650,
    height=500,
    data_aspect=1,
    colorbar=True,
    cmap="inferno_r",
    tools=[hover_tool],
    projection=ccrs.PlateCarree(central_longitude=-121),
    title="Average Size of Wildfires (1992-2020) by HUC-4 Subregions"
)

In [19]:
avg_size_plt

### References
Balch, J. K., Bradley, B. A., Abatzoglou, J. T., Nagy, R. C., Fusco, E. J., & Mahood, A. L. (2017). Human-started wildfires expand the fire niche across the United States. Proceedings of the National Academy of Sciences, 114(11), 2946-2951.

Environmental Protection Agency (2023, July 21). Climate Change Indicators: Wildfires. Environmental Protection Agency. Accessed October 10, 2023 from: https://www.epa.gov/climate-indicators/climate-change-indicators-wildfires#ref21

Mietkiewicz, N., Balch, J. K., Schoennagel, T., Leyk, S., St. Denis, L. A., & Bradley, B. A. (2020). In the line of fire: consequences of human-ignited wildfires to homes in the US (1992–2015). Fire, 3(3), 50.

Moore, A. (2021, Dec 3). Explainer: How Wildfires Start and Spread. NC State University. Accessed October 10, 2023 from: https://cnr.ncsu.edu/news/2021/12/explainer-how-wildfires-start-and-spread/

Nagy, R. C., Fusco, E., Bradley, B., Abatzoglou, J. T., & Balch, J. (2018). Human-related ignitions increase the number of large wildfires across US ecoregions. Fire, 1(1), 4.

Prestemon, J. P., et al. (2013). Wildfire ignitions: a review of the science and recommendations for empirical modeling (p. 24). Asheville, NC, USA: US Department of Agriculture, Forest Service, Southern Research Station.

In [20]:
%%capture
%%bash
jupyter nbconvert wildfires_by_huc4.ipynb --to html --no-input