## Wildfire Trends in California - Exploratory Data Analysis

Here, we use the cleaned dataset to perform exploratory data analysis, extract meaningful observations, and build visualisations that showcase the data and its associated patterns.

In [58]:
import pandas as pd
import numpy as np
import plotly.graph_objects as go
import plotly.express as px
from plotly.subplots import make_subplots

In [59]:
cleaned_df = pd.read_csv("Wildfire-Analysis-Cleaned.csv")
cleaned_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1442 entries, 0 to 1441
Data columns (total 15 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   AcresBurned        1439 non-null   float64
 1   AdminUnit          1442 non-null   object 
 2   ArchiveYear        1442 non-null   int64  
 3   Counties           1442 non-null   object 
 4   Extinguished       1383 non-null   object 
 5   Fatalities         21 non-null     float64
 6   Latitude           1442 non-null   float64
 7   Longitude          1442 non-null   float64
 8   MajorIncident      1442 non-null   bool   
 9   Name               1442 non-null   object 
 10  PersonnelInvolved  186 non-null    float64
 11  Started            1442 non-null   object 
 12  WaterTenders       136 non-null    float64
 13  fire_duration      1383 non-null   float64
 14  AdminUnitCleaned   1441 non-null   object 
dtypes: bool(1), float64(7), int64(1), object(6)
memory usage: 159.3+ KB


### Severity of Wildfires

To start our exploratory data analysis, let's compute the yearly statistics using the `ArchiveYear` column and determine the severity of wildfires each year and the effort needed to contain them.

In [60]:
cleaned_year_df = cleaned_df.groupby(['ArchiveYear']).sum(numeric_only=True)
unique_years = cleaned_year_df.index.tolist()
print(f"The unique years in the dataset are {unique_years}")
cleaned_year_df.head()

The unique years in the dataset are [2013, 2014, 2015, 2016, 2017, 2018, 2019]


ArchiveYear,AcresBurned,Fatalities,Latitude,Longitude,MajorIncident,PersonnelInvolved,WaterTenders,fire_duration
2013,500555.0,0.0,5176.147908,-16805.590818,41,18017.0,388.0,484.320833
2014,305145.0,0.0,2677.185631,-8539.742163,30,4289.0,40.0,726.739583
2015,357623.0,1.0,3533.320825,-11223.175616,40,4509.0,81.0,575.571528
2016,456975.0,0.0,5658.129142,-18199.61512,43,4125.0,63.0,1400.629167
2017,1692756.0,75.0,15932.341499,-51652.457971,92,8468.0,85.0,71330.684722


Let's see the severity of these fires each year by plotting the total acres of land burned and the number of incidents reported against each calendar year.

In [61]:
fig1 = go.Figure()
total_acres = cleaned_year_df['AcresBurned'].values
total_incidents = cleaned_df.groupby(['ArchiveYear']).size().values
color_intensities = ['#fbe7c6', '#f5b461', '#ec7b3a', '#d33f21', '#a50026']
fig1.add_trace(go.Bar(
    x=unique_years,
    y=total_acres,
    name="Total Acres Burned",
    marker=dict(
        color=total_acres,
        colorscale=color_intensities,
        showscale=True,
        colorbar=dict(
            title='Total Acres Burned',
            x=1.12,
            xanchor='left'
        )
    ),
    opacity=0.78
))

fig1.add_trace(go.Scatter(
    x=unique_years,
    y=total_incidents,
    name="Number of Wildfire Incidents",
    yaxis="y2",
    mode="lines+markers",
    marker=dict(size=8, color='red', line=dict(width=2, color='black'))
))

fig1.update_layout(
    title="Total acres of land burned due to wildfires in California",
    xaxis=dict(title="Year"),
    yaxis=dict(
        title="Total Acres of Land Burned",
        color='black',
        showgrid=False,
        showline=True,
        zeroline=True
    ),
    yaxis2=dict(
        title="Total Number of Wildfire Incidents",
        overlaying="y",
        side="right",
        color='black',
        showgrid=False,
        showline=True,
        zeroline=True
    ),
    legend=dict(x=0.01, y=0.99),
    template="plotly_white",
    bargap=0.2,
    width=1200,
    height=700
)

fig1.show()

In [62]:
percent2018_over2017_incidents = (total_incidents[5] / total_incidents[4]) * 100
percent2018_over2017_acresBurned = (total_acres[5] / total_acres[4]) * 100

print(f"Percentage of number of incidents in year 2018 to that of year 2017: {percent2018_over2017_incidents:.2f}%")
print(f"Percentage of land burned in year 2018 to that of year 2017: {percent2018_over2017_acresBurned:.2f}%")

Percentage of number of incidents in year 2018 to that of year 2017: 71.63%
Percentage of land burned in year 2018 to that of year 2017: 197.31%


#### Observation 1

 >
 > Far more land was burned by wildfires in 2017 and 2018 than in any other year in the dataset.
 >

#### Observation 2

 >
 > The total number of incidents in 2018 was **28%** less than the number of incidents in 2017.
 >
 > The total land burned in 2018 was **97%** more than the total land burned due to wildfires in 2017.
 >

This tells us that the wildfires had a much more devastating impact in 2018 than in 2017.
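
The step from the printed ratios to the percentages quoted in Observation 2 is a small piece of arithmetic, sketched below. The helper name `ratio_to_change` is illustrative and not part of the original notebook; the two inputs are the ratios printed by the cell above.

```python
def ratio_to_change(percent_of_baseline: float) -> float:
    """Convert 'X is p% of the baseline' into a signed percentage change."""
    return percent_of_baseline - 100.0


# Ratios printed above: 2018 expressed as a percentage of 2017.
incidents_change = ratio_to_change(71.63)
acres_change = ratio_to_change(197.31)

print(f"Change in incidents, 2017 to 2018: {incidents_change:+.2f}%")
print(f"Change in acres burned, 2017 to 2018: {acres_change:+.2f}%")
```

A ratio below 100% therefore reads as a decrease (about 28% fewer incidents) and a ratio above 100% as an increase (about 97% more land burned).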

### Major vs Minor Incidents over the years

Let's further understand the severity and occurrence of wildfires by observing the spread of incident types across these years.

In [63]:
fig2 = px.bar(
    cleaned_df,
    x='ArchiveYear',
    y='AcresBurned',
    color='MajorIncident',
    color_discrete_sequence=["#12C617", "#F60700"],
    barmode='group',
    labels={'AcresBurned': 'Total Acres Burned', 'ArchiveYear': 'Year'},
    title="Total land area burned due to wildfires in California"
)

fig2.update_layout(
    xaxis=dict(tickangle=45, title="Year", showgrid=False, showline=True, zeroline=True),
    yaxis=dict(title="Total acres of land burned", showgrid=False, showline=True, zeroline=True),
    template="plotly_white",
    width=1200,
    height=700
)
fig2.show()

In [64]:
LandBurned_2016_MajorIncident = cleaned_df[(cleaned_df['MajorIncident'] == True) & (cleaned_df['ArchiveYear'] == 2016)]['AcresBurned'].sum()
LandBurned_2017_MajorIncident = cleaned_df[(cleaned_df['MajorIncident'] == True) & (cleaned_df['ArchiveYear'] == 2017)]['AcresBurned'].sum()
LandBurned_2018_MajorIncident = cleaned_df[(cleaned_df['MajorIncident'] == True) & (cleaned_df['ArchiveYear'] == 2018)]['AcresBurned'].sum()

LandBurned_2016_Total = cleaned_df[(cleaned_df['ArchiveYear'] == 2016)]['AcresBurned'].sum()
LandBurned_2017_Total = cleaned_df[(cleaned_df['ArchiveYear'] == 2017)]['AcresBurned'].sum()
LandBurned_2018_Total = cleaned_df[ (cleaned_df['ArchiveYear'] == 2018)]['AcresBurned'].sum()

PercentLandBurned_2016 = np.divide(LandBurned_2016_MajorIncident, LandBurned_2016_Total) * 100
PercentLandBurned_2017 = np.divide(LandBurned_2017_MajorIncident, LandBurned_2017_Total) * 100
PercentLandBurned_2018 = np.divide(LandBurned_2018_MajorIncident, LandBurned_2018_Total) * 100

print(f" Land burned in Major Incidents vs Total Land Burned in 2016: {PercentLandBurned_2016:.2f}%")
print(f" Land burned in Major Incidents vs Total Land Burned in 2017: {PercentLandBurned_2017:.2f}%")
print(f" Land burned in Major Incidents vs Total Land Burned in 2018: {PercentLandBurned_2018:.2f}%")

 Land burned in Major Incidents vs Total Land Burned in 2016: 63.82%
 Land burned in Major Incidents vs Total Land Burned in 2017: 71.84%
 Land burned in Major Incidents vs Total Land Burned in 2018: 87.35%


#### Observation 3

> As anticipated, nearly 87% of the land burned in California in 2018 was due to major wildfire incidents. Notably, major incidents accounted for over 60% of the land burned in each year from 2016 to 2018, a striking trend that underscores a growing cause for concern.
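
The three per-year filters above can also be written as a single grouped computation that covers every year at once. The following is a minimal sketch, assuming the same `ArchiveYear`, `AcresBurned` and `MajorIncident` columns; the function name `major_incident_share` and the toy data are hypothetical and only illustrate the shape of the calculation.

```python
import pandas as pd


def major_incident_share(df: pd.DataFrame) -> pd.Series:
    """Percentage of acres burned per year that came from major incidents."""
    total = df.groupby("ArchiveYear")["AcresBurned"].sum()
    major = df[df["MajorIncident"]].groupby("ArchiveYear")["AcresBurned"].sum()
    # Years without any major incident get a share of 0 instead of NaN.
    major = major.reindex(total.index, fill_value=0.0)
    return (major / total * 100).round(2)


# Toy data (not taken from the wildfire dataset).
toy_df = pd.DataFrame({
    "ArchiveYear": [2016, 2016, 2017, 2017, 2018],
    "AcresBurned": [300.0, 100.0, 50.0, 150.0, 80.0],
    "MajorIncident": [True, False, False, True, False],
})

shares = major_incident_share(toy_df)
print(shares)
```

Applied to `cleaned_df`, the same function should return one percentage per year, which makes it easier to check whether the trend holds outside 2016 to 2018.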

#### Observation 4

This concern also appears to have been noted by the state government, and remedial action may have been taken, as there was a significant budget increase after 2018.
The following excerpt from the [California State Budget Summary, Pg. 3](http://www.ebudget.ca.gov/2019-20/pdf/Enacted/BudgetSummary/FullBudgetSummary.pdf) for 2019-2020 is consistent with this.

> The Budget includes critical investments needed to sustain and improve California’s
emergency preparedness, response, and recovery capabilities. This includes
$240.3 million to augment the California Department of Forestry and Fire Protection's
(CAL FIRE's) firefighting capabilities by adding 13 additional year‑round engines,
replacing Vietnam War-era helicopters, deploying new air tankers, and investing in
technology and data analytics that support CAL FIRE's initial fire suppression strategies.
The Budget also provides a sizable investment in forest management to increase fire
prevention and complete additional fuel reduction projects, including increased
prescribed fire crews. 

The lower number of wildfire incidents reported in 2019 is also consistent with this hypothesis.

### Personnel involved in Firefighting

Now that we have seen how the severity of the wildfires has evolved over the years, the next question to address is: what about the personnel involved in putting out the fires?

In [65]:
ratio = np.divide(cleaned_year_df['WaterTenders'], cleaned_year_df['PersonnelInvolved']) * 100
years = cleaned_year_df.index.astype(str)
fig3 = make_subplots(rows=1, cols=3, subplot_titles=[
    "Personnel Involved",
    "Water Tenders Involved",
    "Water Tenders to Personnel Ratio (%)"
], shared_yaxes=True)

fig3.add_trace(go.Bar(
    x=cleaned_year_df['PersonnelInvolved'],
    y=years,
    orientation='h',
    name="Personnel",
    marker=dict(color='rgba(255,80,80,0.75)'),
), row=1, col=1)

fig3.add_trace(go.Bar(
    x=cleaned_year_df['WaterTenders'],
    y=years,
    orientation='h',
    name="Water Tenders",
    marker=dict(color='rgba(222,45,38,0.75)'),
), row=1, col=2)

fig3.add_trace(go.Bar(
    x=ratio,
    y=years,
    orientation='h',
    name="Ratio (%)",
    marker=dict(color='rgba(200,30,30,0.75)'),
), row=1, col=3)

fig3.update_layout(
    height=700,
    width=1200,
    title_text="Personnel and Water Tenders in Fire Fighting Over the Years",
    showlegend=False,
    template="plotly_white",
    margin=dict(t=60, b=40, l=40, r=20),
    plot_bgcolor='white'
)

fig3.update_xaxes(title_text="Number of Personnel Involved", row=1, col=1, showgrid=False, showline=True, zeroline=True)
fig3.update_xaxes(title_text="Number of Water Tenders Involved", row=1, col=2, showgrid=False, showline=True, zeroline=True)
fig3.update_xaxes(title_text="Ratio of Water Tenders to Personnel (%)", row=1, col=3, showgrid=False, showline=True, zeroline=True)
fig3.update_yaxes(title_text="Years", row=1, col=1, showgrid=False, showline=True, zeroline=True)
fig3.show()

In [66]:
# Personnel involved in 2018 expressed as a percentage of the 2013 figure
percent2018_over2013_personnel = (cleaned_year_df.loc[2018, 'PersonnelInvolved'] / cleaned_year_df.loc[2013, 'PersonnelInvolved']) * 100
print(f"Personnel involved in 2018 as a percentage of 2013: {percent2018_over2013_personnel:.2f}%")

Personnel involved in 2018 as a percentage of 2013: 76.42%


#### Observation 5

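The visualization and the ratio above show that fewer personnel were recorded in 2018 than in 2013, even though far more land burned in 2018. One plausible explanation is that advancements in wildfire handling technology between 2013 and 2018 allowed fewer personnel to contain larger and more expansive wildfires. Note, however, that `PersonnelInvolved` is populated for only 186 of the 1,442 incidents, so the yearly totals reflect a small subset of fires.

> In other words, about twice as many wildfire incidents were handled in 2018 as in 2013, with roughly three-fourths of the personnel.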
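
Because `sum()` silently skips missing values, it helps to check how many incidents per year actually report a personnel figure before comparing yearly totals. The sketch below assumes the `ArchiveYear` and `PersonnelInvolved` columns from the dataset; the function name `personnel_coverage` and the toy data are hypothetical.

```python
import numpy as np
import pandas as pd


def personnel_coverage(df: pd.DataFrame) -> pd.DataFrame:
    """Per-year incident count, rows with a personnel figure, and their total."""
    grouped = df.groupby("ArchiveYear")["PersonnelInvolved"]
    summary = pd.DataFrame({
        "incidents": grouped.size(),        # all rows, including NaN
        "with_personnel": grouped.count(),  # non-null rows only
        "personnel_total": grouped.sum(),   # NaN values are skipped
    })
    summary["coverage_pct"] = (
        summary["with_personnel"] / summary["incidents"] * 100
    ).round(1)
    return summary


# Toy data (not taken from the wildfire dataset).
toy_df = pd.DataFrame({
    "ArchiveYear": [2013, 2013, 2018, 2018, 2018, 2018],
    "PersonnelInvolved": [100.0, np.nan, 60.0, np.nan, np.nan, np.nan],
})

coverage = personnel_coverage(toy_df)
print(coverage)
```

If the coverage differs sharply between two years, a difference in `personnel_total` may say more about reporting than about the workforce actually deployed.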

### Geographical distribution of these wildfires

The natural next question is how these numbers are distributed across California. Are there any wildfire hotspots, or are the fires evenly spread across all the counties?

We can get this information by looking at how many wildfires are handled by each administrative unit in every county. Let us begin by counting the incidents recorded for each entry in the `Counties` column, which the cells below use to stand in for CAL FIRE's administrative divisions.

In [67]:
admin_index = cleaned_df["Counties"].value_counts().index
admin_counts = cleaned_df["Counties"].value_counts().values
admin_counts

array([125,  83,  61,  59,  58,  55,  52,  52,  50,  46,  43,  42,  40,
        34,  33,  33,  31,  31,  28,  27,  25,  25,  24,  23,  23,  22,
        22,  20,  20,  17,  17,  17,  16,  16,  15,  13,  13,  12,  11,
        11,  11,  10,  10,  10,   7,   6,   6,   5,   5,   5,   4,   4,
         3,   3,   2,   2,   2,   1,   1])

In [68]:
fig4 = make_subplots(rows=1, cols=1)
fig4.add_trace(
    go.Box(
        x=admin_counts,
        name="Boxplot",
        boxpoints="outliers",
        orientation='h'
    ),
    row=1, col=1
)
fig4.update_layout(
    title_text="Distribution of wildfires handled by administrative units",
    width=1200,
    height=500,
    showlegend=False,
    plot_bgcolor='white'
)
fig4.update_xaxes(title_text="Number of wildfires handled", row=1, col=1)
fig4.show()

#### Observation 6

Most administrative units handle approximately 15-20 wildfires over the 7 years covered (2013-2019), with the two most wildfire-prone counties recording 83 and 125 wildfires.
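
The box plot can be cross-checked numerically. The sketch below recomputes the median, the quartiles and the usual 1.5 × IQR upper fence from the counts printed above, using only the standard library. The quartile method (`inclusive`) is an assumption; Plotly may compute quartiles slightly differently, so the whisker positions in the figure need not match exactly.

```python
import statistics

# Per-county incident counts, copied from the array printed above.
county_counts = [
    125, 83, 61, 59, 58, 55, 52, 52, 50, 46, 43, 42, 40,
    34, 33, 33, 31, 31, 28, 27, 25, 25, 24, 23, 23, 22,
    22, 20, 20, 17, 17, 17, 16, 16, 15, 13, 13, 12, 11,
    11, 11, 10, 10, 10, 7, 6, 6, 5, 5, 5, 4, 4,
    3, 3, 2, 2, 2, 1, 1,
]

median = statistics.median(county_counts)
q1, _, q3 = statistics.quantiles(county_counts, n=4, method="inclusive")

# Conventional box-plot rule: points beyond Q3 + 1.5 * IQR count as outliers.
upper_fence = q3 + 1.5 * (q3 - q1)
outliers = sorted(c for c in county_counts if c > upper_fence)

print(f"median={median}, Q1={q1}, Q3={q3}, upper fence={upper_fence}")
print(f"outliers: {outliers}")
```

Under these assumptions the median is 17, which sits inside the 15-20 range quoted in Observation 6, and only the counts 83 and 125 fall above the upper fence.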

In [69]:
zonecount_df = pd.DataFrame({
    "AdministrativeUnits": admin_index,
    "WildfiresHandled": admin_counts
})
fig5 = px.bar(
    zonecount_df,
    x="WildfiresHandled",
    y="AdministrativeUnits",
    orientation='h',
    color="WildfiresHandled",
    color_continuous_scale="Spectral",
    title="Number of Wildfire Incidents Handled Based on Administrative Zones"
)

fig5.update_layout(
    height=700,
    width=1200,
    xaxis=dict(title="Number of Wildfire Incidents", showgrid=False, showline=True, zeroline=True),
    yaxis=dict(title="Administrative Zone", showgrid=False, showline=True, zeroline=True),
    plot_bgcolor='white',
)
fig5.show()

#### Observation 7

Looking at the wildfires handled by the different administrative zones of the California fire department, it is clear that `Riverside`, `San Luis Obispo`, `Butte` and `San Bernardino` are the top four administrative zones by number of wildfire incidents.

Next, we will use the latitude and longitude data to answer the question we started with: how are these wildfires geographically distributed?

In [70]:
fig6 = make_subplots(
    rows=2, cols=2,
    subplot_titles=[
        "Wildfires with size as total acres burned",
        "Major Incident Wildfires",
        "Fatalities",
        "Personnel Involved"
    ],
    shared_xaxes=True,
    shared_yaxes=True
)

x_range = [30.45, 43.05]
y_range = [-124.55, -115.80]
color_col = "ArchiveYear"
colorscale = "Rainbow"

# Build first scatter chart - Acres burned vs latitude and longitude
scatter_one_df = cleaned_df.dropna(subset=['Latitude', 'Longitude', 'AcresBurned', 'ArchiveYear'])
scatterOne = px.scatter(scatter_one_df, x="Latitude", y="Longitude", color="ArchiveYear", size=scatter_one_df["AcresBurned"] * 100, color_continuous_scale=colorscale, opacity=0.8)
for trace in scatterOne.data:
    trace.showlegend = False
    fig6.add_trace(trace, row=1, col=1)

# Build second scatter chart - Major Incident vs latitude and longitude
scatter_two_df = cleaned_df.dropna(subset=['Latitude', 'Longitude', 'MajorIncident', 'ArchiveYear'])
scatterTwo = px.scatter(scatter_two_df, x="Latitude", y="Longitude", color="ArchiveYear", size=scatter_two_df["MajorIncident"] * 10, color_continuous_scale=colorscale, opacity=0.3)
for trace in scatterTwo.data:
    trace.showlegend = False
    fig6.add_trace(trace, row=1, col=2)

# Build third scatter chart - Fatalities vs latitude and longitude
scatter_three_df = cleaned_df.dropna(subset=['Latitude', 'Longitude', 'Fatalities', 'ArchiveYear'])
scatterThree = px.scatter(scatter_three_df, x="Latitude", y="Longitude", color="ArchiveYear", size=scatter_three_df["Fatalities"] * 50, color_continuous_scale=colorscale, opacity=0.7)
for trace in scatterThree.data:
    trace.showlegend = False
    fig6.add_trace(trace, row=2, col=1)

# Build fourth scatter chart - Personnel Involved vs latitude and longitude
scatter_four_df = cleaned_df.dropna(subset=['Latitude', 'Longitude', 'PersonnelInvolved', 'ArchiveYear'])
scatterFour = px.scatter(scatter_four_df, x="Latitude", y="Longitude", color="ArchiveYear", size=scatter_four_df["PersonnelInvolved"] / 2, color_continuous_scale=colorscale, opacity=0.5)
for trace in scatterFour.data:
    trace.showlegend = False
    fig6.add_trace(trace, row=2, col=2)

for i in range(1, 3):
    for j in range(1, 3):
        fig6.update_xaxes(range=x_range, title_text="Latitude", row=i, col=j, showgrid=False, showline=True, zeroline=True)
        fig6.update_yaxes(range=y_range, title_text="Longitude", row=i, col=j, showgrid=False, showline=True, zeroline=True)

fig6.update_layout(
    height=900,
    width=1200,
    title_text="Wildfire Incident Scatter Matrix by Feature",
    showlegend=False,
    plot_bgcolor='white'
)
fig6.show()

In [71]:
cleaned_df[ (cleaned_df['ArchiveYear'] >=  2017)].count()/cleaned_df.count() *100

AcresBurned          68.380820
AdminUnit            68.446602
ArchiveYear          68.446602
Counties             68.446602
Extinguished         67.100506
Fatalities           95.238095
Latitude             68.446602
Longitude            68.446602
MajorIncident        68.446602
Name                 68.446602
PersonnelInvolved    43.548387
Started              68.446602
WaterTenders         41.176471
fire_duration        67.100506
AdminUnitCleaned     68.424705
dtype: float64

The four plots above offer several key insights into the dataset. Notably, they reinforce the conclusion from Observation 3 regarding the increasing severity of wildfires: about 68.4% of all incidents and about 95% of the incidents with recorded fatalities (20 of 21) have occurred since 2017, with a significant concentration in the southeastern region of California, particularly within the Riverside Unit (RRU).
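
The percentages in the cell above come from `count()`, which counts non-null records rather than adding up values. For a column such as `Fatalities`, the share of records and the share of the summed value can differ, so it may be worth computing both. The sketch below assumes the `ArchiveYear` column from the dataset; the function name `share_since` and the toy data are hypothetical.

```python
import numpy as np
import pandas as pd


def share_since(df: pd.DataFrame, column: str, year: int) -> tuple:
    """Return (share of non-null records, share of summed values) since `year`, in %."""
    recent = df[df["ArchiveYear"] >= year]
    record_share = recent[column].count() / df[column].count() * 100
    value_share = recent[column].sum() / df[column].sum() * 100
    return round(float(record_share), 2), round(float(value_share), 2)


# Toy data (not taken from the wildfire dataset).
toy_df = pd.DataFrame({
    "ArchiveYear": [2015, 2016, 2017, 2018, 2018],
    "Fatalities": [1.0, np.nan, 2.0, 6.0, 1.0],
})

record_share, value_share = share_since(toy_df, "Fatalities", 2017)
print(f"Records since 2017: {record_share}%  Fatalities since 2017: {value_share}%")
```

In the toy data, three of the four fatal incidents fall in 2017 or later, but those incidents account for nine of the ten fatalities, so the two shares differ.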

Having established that, the next step is to determine: **How many wildfire incidents in 2018 were managed by the Riverside administrative unit?**

In [72]:
unique_years = cleaned_df['ArchiveYear'].unique()
fig7 = make_subplots(
    rows=3, cols=3,
    subplot_titles=[f"Wildfires handled in year {year}" for year in unique_years],
    shared_xaxes=True
)

rows, cols = 3, 3
current_row = 1
current_col = 1

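# Note: head(100) keeps at most the first 100 rows per county, so a county with more
# than 100 incidents (the largest has 125) is undercounted in the per-year breakdown.
fire_counties_df = cleaned_df.groupby("Counties").head(100)
for idx, year in enumerate(unique_years):
    # Count rows directly; .count().values[0] would skip rows where AcresBurned is null.
    total_fire_year = (cleaned_df["ArchiveYear"] == year).sum()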
    fires_by_county = fire_counties_df[fire_counties_df["ArchiveYear"] == year]["Counties"].value_counts()
    percentages = (fires_by_county.values[0: 5] / total_fire_year) * 100
    counties = fires_by_county.index.unique()[0: 5]
    fig7.add_trace(
        go.Bar(
            x=percentages,
            y=counties,
            orientation='h',
            marker=dict(color=['red','blue','green','orange','magenta']),
            opacity=0.7
        ),
        row=current_row,
        col=current_col
    )

    current_col += 1
    if current_col > cols:
        current_row += 1
        current_col = 1

fig7.update_layout(
    height=700,
    width=1200,
    title_text="Top 5 Counties by Wildfire Incidents per Year (as % of total that year)",
    showlegend=False,
    template="plotly_white"
)

for c in range(1, cols + 1):
    fig7.update_xaxes(title_text="Percentage of wildfires handled", row=3, col=c)

# Calling update_xaxes/update_yaxes without row/col applies the style to every subplot.
fig7.update_xaxes(showgrid=False, showline=True, zeroline=True)
fig7.update_yaxes(showgrid=False, showline=True, zeroline=True)

fig7.show()

#### Observation 8

- The `San Diego` administrative unit has been among the top 5 wildfire handling units in all 7 years.
- `Riverside` appears in the top 5 only in 2013, 2017 and 2018, but the number of wildfires it handled in those years was significantly higher.
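
The per-year ranking can also be computed as a table, without capping the number of rows per county. The following is a minimal sketch, assuming the `ArchiveYear` and `Counties` columns; the function name `top_counties_by_year` and the toy data are hypothetical.

```python
import pandas as pd


def top_counties_by_year(df: pd.DataFrame, n: int = 5) -> pd.DataFrame:
    """Top-n counties per year by incident count, with their share of that year."""
    counts = (
        df.groupby(["ArchiveYear", "Counties"])
        .size()
        .reset_index(name="incidents")
    )
    yearly_total = counts.groupby("ArchiveYear")["incidents"].transform("sum")
    counts["percent"] = (counts["incidents"] / yearly_total * 100).round(2)
    counts = counts.sort_values(["ArchiveYear", "incidents"], ascending=[True, False])
    # After sorting, the first n rows of each year are that year's top n counties.
    return counts.groupby("ArchiveYear").head(n).reset_index(drop=True)


# Toy data (not taken from the wildfire dataset).
toy_df = pd.DataFrame({
    "ArchiveYear": [2017, 2017, 2017, 2017, 2017, 2017, 2018, 2018, 2018],
    "Counties": ["A", "A", "A", "B", "B", "C", "B", "B", "A"],
})

top2 = top_counties_by_year(toy_df, n=2)
print(top2)
```

A table like this also makes it possible to tell apart a county that is absent from the data in a given year and one that simply falls outside that year's top 5.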

**Worst Wildfire(s)**

In [73]:
worst_wildfires = cleaned_df[(cleaned_df.Fatalities > 20) & (cleaned_df.AcresBurned > 20000)].copy().filter(items=["AcresBurned", "Counties", "ArchiveYear", "Fatalities", "MajorIncident", "Started", "fire_duration", "PersonnelInvolved"])
worst_wildfires

Unnamed: 0,AcresBurned,Counties,ArchiveYear,Fatalities,MajorIncident,Started,fire_duration,PersonnelInvolved
466,36807.0,Napa,2017,22.0,True,2017-10-08 21:45:00+00:00,123.4875,
467,36807.0,Sonoma,2017,22.0,True,2017-10-08 21:45:00+00:00,123.4875,
891,153336.0,Butte,2018,85.0,True,2018-11-08 06:33:00+00:00,17.060417,1065.0


#### Observation 9

The worst wildfire occurred in Butte County in November 2018, with 85 fatalities.

In [74]:
longest_wildfires = cleaned_df[(cleaned_df.fire_duration > 365)].copy().filter(items=["AcresBurned", "Counties", "ArchiveYear", "Fatalities", "MajorIncident", "Started", "fire_duration", "PersonnelInvolved"])
longest_wildfires

Unnamed: 0,AcresBurned,Counties,ArchiveYear,Fatalities,MajorIncident,Started,fire_duration,PersonnelInvolved
455,281893.0,Santa Barbara,2017,,True,2017-12-04 18:28:00+00:00,464.705556,
456,281893.0,Ventura,2017,,True,2017-12-04 18:28:00+00:00,464.705556,
499,4016.0,Butte,2017,,False,2017-08-29 13:16:00+00:00,366.090972,


#### Observation 10

Fires are either contained relatively quickly (within about 20 days) or last as long as 200 days. The longest lasted about 465 days (roughly a year and a quarter).

The longest fire was the Thomas Fire in the Los Padres National Forest, which started in December 2017 and is recorded as extinguished in March 2019, a total duration of 465 days.
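
The `fire_duration` values read as days: the Butte fire of November 2018 shows 17.06 and the Thomas Fire 464.7. The cleaning step that produced the column is not shown in this notebook, so the sketch below is only an assumption about how such a value can be derived from a start and an end timestamp. The `Started` value is taken from the table above; the extinguished timestamp is hypothetical, chosen to reproduce the listed duration.

```python
from datetime import datetime

SECONDS_PER_DAY = 24 * 60 * 60


def duration_in_days(started: str, extinguished: str) -> float:
    """Elapsed time between two ISO-formatted timestamps, in fractional days."""
    start = datetime.fromisoformat(started)
    end = datetime.fromisoformat(extinguished)
    return (end - start).total_seconds() / SECONDS_PER_DAY


# Started comes from row 891 above; the end timestamp is a hypothetical value.
butte_2018_days = duration_in_days(
    "2018-11-08 06:33:00+00:00",
    "2018-11-25 08:00:00+00:00",
)
print(round(butte_2018_days, 6))
```

Seventeen days plus one hour and 27 minutes gives 17.060417 days, matching the `fire_duration` shown for that row.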

#### Saving Plotly Visualisations in HTML file

In [75]:
htmlTemplatePrefix = """
    <!DOCTYPE html>
<html lang="en">
   <head>
      <title>California Wildfire Analysis</title>
      <meta charset="UTF-8" />
      <meta name="viewport" content="width=device-width, initial-scale=1.0" />
      <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.5.0/css/all.min.css">
      <style type="text/css">
         * {
            box-sizing: border-box;
            padding: 0;
            margin: 0;
            border: none;
         }

         body {
            width: 100vw;
            height: 100vh;
            background-color: #F9FAFB;
         }

         .heading {
            width: 100%;
            height: 150px;
            background-color: tomato;
            color: white;
            display: flex;
            flex-direction: column;
            align-items: center;
            justify-content: center;
            row-gap: 10px;
         }

         .heading > h3 {
            font-family:'Gill Sans', 'Gill Sans MT', Calibri, 'Trebuchet MS', sans-serif;
            font-size: 38px;
         }

         .heading > p {
            font-size: 20px;
         }

         .heading .icon {
            color: white;
            margin-right: 3px;
         }

         .cards-container {
            width: 100%;
            height: fit-content;
            display: flex;
            flex-direction: column;
            align-items: center;
            justify-content: space-evenly;
            row-gap: 20px;
            margin-top: 15px;
         }

         .card {
            width: fit-content;
            max-width: 1300px;
            height: fit-content;
            max-height: 1100px;
            border: 1px solid #F9FAFB;
            border-radius: 5px;
            background-color: #FFFFFF;
            padding: 15px;
            box-shadow: -1px -1px 2px rgba(0, 0, 0, 0.2), 1px 1px 2px rgba(0, 0, 0, 0.2);
         }

        .key-insights {
            background-color: #FEFBEB;
            border: 1px solid #FCE68A;
            color: #923F0F;
            border-radius: 5px;
            font-size: 16px;
            padding: 10px;
         }

         .key-insights > p{
            font-weight: bold;
            margin-bottom: 20px;
            width: 100%;
         }

         .key-insights > ul {
            list-style-position: inside;
            line-height: 20px;
            width: 100%;
         }

         .summary {
            background-color: #FAF5FF;
            border: 1px solid #E9D5FF;
            color: #6B21A8;
            border-radius: 5px;
            font-size: 16px;
            padding: 10px;
            width: fit-content;
            max-width: 1300px;
            height: fit-content;
            max-height: 1100px;
         }
      </style>
   </head>
   <body>
    <div class="heading">
        <h3><span class="icon"><i class="fa-solid fa-fire"></i></span> California Wildfire Analysis</h3>
        <p>Analyze wildfire data in California from 2013 to 2019, exploring patterns, severity, and impact across different regions.</p>
    </div>
    <div class="cards-container">
"""

htmlTemplateSuffix = """
        <div class="summary">
        <p>
            <b>Summary:</b>
            <br /><br />
            The most significant insight from this analysis is that the years 2017 and 2018 experienced some of the highest numbers, most severe, and longest-lasting wildfires within the 2013–2018 period. However, it is evident that CAL FIRE recognized the shortcomings during these years and implemented corrective measures, which is reflected in the improved wildfire management and reduced incident severity observed in 2019.
        </p>
    </div>
    </div>
   </body>
</html>
"""

htmlTemplate = f"""
    {htmlTemplatePrefix}
    <div class="card">
        {fig1.to_html(full_html=False, include_plotlyjs='cdn')}
        <div class="key-insights">
            <p>Key Findings:</p>
            <ul>
                <li>During years 2017 and 2018, a large amount of land area was burned during the wildfires, compared to any other years.</li>
                <li>The total number of incidents in 2018 was 28% less than the number of incidents in 2017.</li>
                <li>The total land burned in 2018 was 97% more than the total land burned due to wildfires in 2017.</li>
            </ul>
         </div>
    </div>
    <div class="card">
        {fig2.to_html(full_html=False, include_plotlyjs='cdn')}
        <div class="key-insights">
            <p>Key Findings:</p>
            <ul>
                <li>Nearly 87% of the land burned in California in 2018 was due to major wildfire incidents</li>
                <li>From 2016 to 2018, major incidents accounted for over 60% of the land burned each year</li>
            </ul>
         </div>
    </div>
    <div class="card">
        {fig3.to_html(full_html=False, include_plotlyjs='cdn')}
        <div class="key-insights">
            <p>Key Findings:</p>
            <ul>
                <li>Advancements in wildfire handling technology from 2013 to 2018 may have allowed fewer personnel to contain larger and more expansive wildfires</li>
                <li>About twice as many wildfire incidents were handled in 2018 as in 2013, with roughly three-fourths of the personnel</li>
            </ul>
         </div>
    </div>
    <div class="card">
        {fig4.to_html(full_html=False, include_plotlyjs='cdn')}
        <div class="key-insights">
            <p>Key Findings:</p>
            <ul>
                <li>Most administrative units handle approximately 15-20 wildfires over the 7 years covered (2013-2019), with the two most wildfire-prone counties recording 83 and 125 wildfires.</li>
            </ul>
         </div>
    </div>
    <div class="card">
        {fig5.to_html(full_html=False, include_plotlyjs='cdn')}
        <div class="key-insights">
            <p>Key Findings:</p>
            <ul>
                <li>Riverside, San Luis Obispo, Butte and San Bernardino are the top four administrative zones dealing with a lot of wildfire incidents</li>
            </ul>
         </div>
    </div>
    <div class="card">
        {fig6.to_html(full_html=False, include_plotlyjs='cdn')}
        <div class="key-insights">
            <p>Key Findings:</p>
            <ul>
                <li>About 68.4% of all incidents and about 95% of the incidents with recorded fatalities have occurred since 2017, with a significant concentration in the southeastern region of California, particularly within the Riverside Unit (RRU)</li>
            </ul>
         </div>
    </div>
    <div class="card">
        {fig7.to_html(full_html=False, include_plotlyjs='cdn')}
        <div class="key-insights">
            <p>Key Findings:</p>
            <ul>
                <li>San Diego administrative unit has been in the top 5 wildfire handling units over all 7 years.</li>
                <li>Riverside has handled wildfires only in the years 2013, 2017 and 2018, but the number of wildfires handled were significantly higher.</li>
            </ul>
         </div>
    </div>
    {htmlTemplateSuffix}
"""

In [76]:
with open("index.html", "w", encoding="utf-8") as f:
    f.write(htmlTemplate)
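
The template above repeats the same card markup seven times and passes `include_plotlyjs='cdn'` for every figure, so the page references plotly.js once per chart. One possible alternative is sketched below: the cards are generated in a loop, and only the first figure requests the script, with `include_plotlyjs=False` for the rest. The helper name `render_cards` and its arguments are hypothetical; it only assumes that each figure exposes Plotly's `to_html` method.

```python
def render_cards(figures, findings):
    """Build one 'card' div per figure, loading plotly.js only with the first one.

    figures:  objects with a Plotly-style to_html() method
    findings: one list of key-finding strings per figure
    """
    cards = []
    for position, (fig, items) in enumerate(zip(figures, findings)):
        # Only the first card pulls plotly.js from the CDN; later cards reuse it.
        plotly_js = "cdn" if position == 0 else False
        chart = fig.to_html(full_html=False, include_plotlyjs=plotly_js)
        bullet_items = "\n".join(f"<li>{item}</li>" for item in items)
        cards.append(
            '<div class="card">\n'
            f"{chart}\n"
            '<div class="key-insights">\n'
            "<p>Key Findings:</p>\n"
            f"<ul>\n{bullet_items}\n</ul>\n"
            "</div>\n"
            "</div>"
        )
    return "\n".join(cards)
```

With this helper, the page body could be assembled as `htmlTemplatePrefix + render_cards([fig1, fig2], [[...], [...]]) + htmlTemplateSuffix`, keeping the findings next to the figures they describe.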

### Summary

The most significant insight from this analysis is that 2017 and 2018 saw some of the most numerous, most severe, and longest-lasting wildfires within the 2013–2018 period. The data also suggest that CAL FIRE recognized the shortcomings during these years and implemented corrective measures, which would be consistent with the reduced incident severity observed in 2019.

In summary,
- The total number of incidents in 2018 was 28% lower than in 2017, whereas the land area burned was about 97% greater. This means that, overall, the wildfires in 2018 were larger and more severe.
- (Severity) Major incidents accounted for more than 60% of the land burned in each year from 2016 to 2018 (about 64%, 72% and 87%).
- (Learning from the Past) The number of wildfires in 2019 is much lower than in 2018. A likely factor is that the severity of the 2018 wildfires led to extra funding, equipment and better handling of wildfires in 2019, as described in the California State Budget Summary.
- (Technological Progress) One plausible explanation is that advancements in wildfire handling technology allow fewer deployed personnel to handle larger and more expansive wildfires. Comparing 2013 and 2018, the number of wildfires in 2018 was about 200% of the 2013 figure, whereas the personnel deployed was lower by around 24%. A similar conclusion can be drawn for the number of water tenders involved.
- (Who does how much) A typical fire administrative unit handles around 15-20 fires over the 7 years. The most wildfire-prone regions handle 83 to 125 wildfires.
- Looking at the administrative zones for the California Fire Dept., we observe that the Riverside, San Diego, San Luis Obispo and Shasta-Trinity Units are the top four administrative zones dealing with fire.
- San Diego Fire Unit has been among the top 5 wildfire handling units throughout the years (2013-2019).
- Riverside Fire Unit appears among the top 5 only in 2013, 2017 and 2018, yet it has handled more incidents than any other unit (2013-2019).
- Severity has increased since 2017: about 68.4% of all incidents and about 95% of the incidents with recorded fatalities occurred from 2017 onwards, mostly concentrated in the south-east region of California (Riverside Unit (RRU) and San Diego Fire Unit).
- The worst wildfire occurred in Butte County in November 2018, with 85 fatalities.
- Wildfires with long durations occurred in 2017 and 2018, with an average duration of around 190 days, orders of magnitude larger than in other years.
- Fires are either contained relatively quickly (within about 20 days) or last as long as 200 days. The longest lasted about 465 days.
- The longest fire was the Thomas Fire in the Los Padres National Forest, which started in December 2017 and is recorded as extinguished in March 2019, a total duration of 465 days.