# Validating the Alaska Lightning Probability Model

In this notebook you will learn:

- how to visualize model output data
- how to perform quick statistics and exploratory data analysis
- how to get model metrics for validation

In this part of the work we look at outputs from the trained model on ALDN data and evaluate its performance:

- under general conditions
- per biome
- per severity day
- per temporal window

You will use the base code from the previous notebook to complete the last three plots in this notebook. I have provided examples below that show how to query the data and plot it.

## 1. Open database to work with

We will use a database to validate our model. This database includes observations across all dates, biomes, and temporal windows.

In [None]:
!pip install datasets geopandas

In [None]:
import os
import pandas as pd
import geopandas as gpd
from huggingface_hub import snapshot_download
from sklearn.metrics import accuracy_score, confusion_matrix, \
    classification_report, brier_score_loss, log_loss

In [None]:
DATASET_URL = 'jordancaraballo/alaska-lightning'
DATASET_FILENAME = 'validation/validation-alaska.gpkg'

In [None]:
#database_filename = '/explore/nobackup/people/jacaraba/development/wildfire-occurrence/notebooks/validation-alaska.gpkg'
alaska_dataset = snapshot_download(repo_id=DATASET_URL, allow_patterns="*.gpkg", repo_type='dataset')

In [None]:
database_filename = os.path.join(alaska_dataset, DATASET_FILENAME)
database_filename

In [None]:
validation_database = gpd.read_file(database_filename)

In [None]:
validation_database.head()
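
Before computing any metrics, it can help to check how balanced the classes are and how many points fall in each group. The sketch below does this on a small made-up table that reuses the column names `Label`, `predictions`, `BIOME`, and `Severity2` from this notebook. The values are invented for illustration and are not taken from the real database.

```python
import pandas as pd

# Toy stand-in for validation_database; every value here is made up.
toy = pd.DataFrame({
    "Label":       [1, 0, 1, 1, 0, 0],
    "predictions": [1, 0, 0, 1, 1, 0],
    "BIOME":       ["BOREAL", "BOREAL", "TUNDRA", "BOREAL", "TUNDRA", "TUNDRA"],
    "Severity2":   ["Severe", "Low", "Moderate", "Severe", "Low", "Low"],
})

# Class balance: fraction of rows in each observed class.
class_balance = toy["Label"].value_counts(normalize=True)

# Number of validation points per biome and severity level.
points_per_group = toy.groupby(["BIOME", "Severity2"]).size()

print(class_balance)
print(points_per_group)
```

The same two calls should work on `validation_database` itself, assuming its columns are named as in the cells that follow.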

## 2. Basic Accuracy Metrics - Biome

In [None]:
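def print_accuracy(df):
    # scikit-learn orders the matrix as [[TN, FP], [FN, TP]] for sorted
    # labels, so this assumes the positive (lightning) class sorts last
    conf_matrix = confusion_matrix(df['Label'], df['predictions'])
    a = conf_matrix[1][1]  # hits (true positives)
    b = conf_matrix[0][1]  # false alarms (false positives)
    c = conf_matrix[1][0]  # misses (false negatives)
    d = conf_matrix[0][0]  # correct negatives (true negatives)

    print("Accuracy: ", accuracy_score(df['Label'], df['predictions']))
    print("POD:      ", a / (a+c))    # probability of detection
    print("CSI:      ", a / (a+b+c))  # critical success index
    print("FAR:      ", b / (a+b))    # false alarm ratio
    print("F:        ", b / (b+d))    # false alarm rate (POFD)
    print("Brier:    ", brier_score_loss(df['Label'], df['predictions_proba']))
    print("Log Loss: ", log_loss(df['Label'], df['predictions_proba']))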
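
The categorical scores printed by `print_accuracy` are conventionally defined from a 2x2 contingency table of hits (a), false alarms (b), misses (c), and correct negatives (d). The sketch below works through those standard definitions on made-up counts. The function name `contingency_scores` and the counts are hypothetical and only serve as a worked example.

```python
def contingency_scores(hits, false_alarms, misses, correct_negatives):
    """Categorical verification scores from a 2x2 contingency table."""
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    return {
        "POD": a / (a + c),        # probability of detection
        "FAR": b / (a + b),        # false alarm ratio
        "CSI": a / (a + b + c),    # critical success index
        "POFD": b / (b + d),       # false alarm rate ("F")
        "Accuracy": (a + d) / (a + b + c + d),
    }

# Made-up counts: 30 hits, 10 false alarms, 20 misses, 40 correct negatives.
scores = contingency_scores(
    hits=30, false_alarms=10, misses=20, correct_negatives=40)
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

Note that FAR divides by the number of predicted events (a + b), while F divides by the number of observed non-events (b + d), so the two can differ widely when events are rare.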

In [None]:
# Overall Accuracy, Alaska
print_accuracy(validation_database)
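
The last two scores, Brier and Log Loss, evaluate the predicted probabilities rather than the hard 0/1 predictions, and lower is better for both. The sketch below computes them by hand on made-up values to show what they measure. It assumes binary 0/1 labels and probabilities for the positive class; the helper names `brier_score` and `binary_log_loss` are hypothetical, and in the notebook itself the scikit-learn functions do this work.

```python
import math

def brier_score(labels, probabilities):
    """Mean squared difference between probability and 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(labels, probabilities)) / len(labels)

def binary_log_loss(labels, probabilities, eps=1e-15):
    """Mean negative log-likelihood of the observed 0/1 outcomes."""
    total = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, eps), 1 - eps)  # keep log() away from 0 and 1
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)

# Made-up labels and positive-class probabilities.
labels = [1, 0, 1, 0]
probabilities = [0.9, 0.2, 0.6, 0.4]

brier = brier_score(labels, probabilities)
logloss = binary_log_loss(labels, probabilities)
print("Brier:   ", brier)
print("Log loss:", logloss)
```

A confident wrong probability is penalized much more heavily by log loss than by the Brier score, which is one reason to report both.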

In [None]:
# Overall Accuracy, Tundra
tundra = validation_database[validation_database['BIOME'] == 'TUNDRA']
print_accuracy(tundra)

In [None]:
# Overall Accuracy, Boreal
boreal = validation_database[validation_database['BIOME'] == 'BOREAL']
print_accuracy(boreal)

## 3. Basic Accuracy Metrics - Severity

In [None]:
# Overall Accuracy, Severe
severe = validation_database[validation_database['Severity2'] == 'Severe']
print_accuracy(severe)

In [None]:
# Overall Accuracy, Moderate
moderate = validation_database[validation_database['Severity2'] == 'Moderate']
print_accuracy(moderate)

In [None]:
# Overall Accuracy, Low
low = validation_database[validation_database['Severity2'] == 'Low']
print_accuracy(low)
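
Printing metrics one subset at a time makes them hard to compare or plot. A possible alternative, sketched below on made-up data, is to compute the accuracy for every group in one step with `groupby`. The helper `accuracy_by_group` is hypothetical, and the toy values are not taken from the real database.

```python
import pandas as pd

def accuracy_by_group(df, column):
    """Fraction of rows where the prediction matches the label, per group."""
    correct = df["Label"] == df["predictions"]
    return correct.groupby(df[column]).mean()

# Toy stand-in for validation_database; every value here is made up.
toy = pd.DataFrame({
    "Label":       [1, 0, 1, 1, 0, 0],
    "predictions": [1, 0, 0, 1, 1, 0],
    "Severity2":   ["Severe", "Low", "Moderate", "Severe", "Low", "Low"],
})

severity_accuracy = accuracy_by_group(toy, "Severity2")
print(severity_accuracy)
```

The result is a pandas Series indexed by group name, which can be passed to `.plot(kind='bar')` in the same way as the `value_counts()` output in the example plots.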

## Example plot - on which dates does the model perform worst?

In [None]:
failed_points = validation_database[validation_database['Label'] != validation_database['predictions']]
ax = failed_points.WRFDATE_STR.value_counts().sort_index().plot(
    kind='barh', title='Least Accurate Days for Model Performance')
ax.set_xlabel("Count of Misclassified Points")
ax.set_ylabel("Date")

## Example plot - on which dates does the model perform best?

In [None]:
accurate_points = validation_database[validation_database['Label'] == validation_database['predictions']]
ax = accurate_points.WRFDATE_STR.value_counts().sort_index().plot(
    kind='barh', title='Most Accurate Days for Model Performance')
ax.set_xlabel("Count of Correctly Classified Points")
ax.set_ylabel("Date")

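The two plots show a similar pattern, which suggests that the counts mostly reflect the number of points available on each date rather than a difference in model skill. It would be interesting to understand whether any climate variables drive the remaining differences between dates.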
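
One way to separate the effect of sample size from model skill is to divide the number of correct points by the total number of points on each date. The sketch below does this on made-up data, reusing the column names `WRFDATE_STR`, `Label`, and `predictions`. The date strings and values are hypothetical, and the real date format may differ.

```python
import pandas as pd

# Toy stand-in for validation_database; dates and values are made up.
toy = pd.DataFrame({
    "WRFDATE_STR": ["2016-06-01"] * 8 + ["2016-06-02"] * 2,
    "Label":       [1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
    "predictions": [1, 0, 0, 0, 1, 1, 1, 0, 0, 0],
})
toy["correct"] = toy["Label"] == toy["predictions"]

# Per date: number of points, number of correct points, and accuracy rate.
per_date = toy.groupby("WRFDATE_STR")["correct"].agg(
    points="size", correct="sum", accuracy="mean")
per_date["misclassified"] = per_date["points"] - per_date["correct"]
print(per_date)
```

In this toy table the first date has more misclassified points (2 versus 1) but a higher accuracy (0.75 versus 0.5), which is the kind of effect raw counts can hide.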

## 4. Task #1: Generate Bar Plot with Accuracy per Location (Boreal vs Tundra)

In [None]:
### Insert your code Here ###

## 5. Task #2: Generate Bar Plot with Accuracy per Severity Level (Severe, Moderate, Low)

In [None]:
### Insert your code Here ###

## 6. Task #3: Generate a Map to Illustrate Where Our Model Fails the Most

In [None]:
### Insert your code Here ###

## 7. Task #4: Write three conclusions from the results listed above

In [None]:
### Insert your text Here ###