---
title: "AICE: Probabilistic Benchmarking"
format: revealjs
execute:
  echo: true
  output: true
---

## Probabilistic Benchmarking 

This notebook gives an overview of the functions in `mae_far_mr_probabilistic.py` and what each one does.

1. `get_forecast_probabilistic_twice_weekly`

* Purpose: Open a model forecast file for a given year, select twice-weekly initializations (Mondays and Thursdays) between May and July, normalize some dimension names, and return the precipitation DataArray.
* Key behavior & assumptions:
  * Expects a file named "{yr}.nc" in model_forecast_dir and variable "tp". 
  * Builds a May-July daily date range, then filters for Mondays and Thursdays. 
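
The date selection described above can be sketched with the standard library alone. This is a minimal illustration, assuming the range runs from May 1 through July 31 inclusive; the helper name `twice_weekly_inits` is hypothetical and is not taken from the script.

```python
from datetime import date, timedelta


def twice_weekly_inits(year: int) -> list[date]:
    """Return every Monday and Thursday from May 1 to July 31 of `year`.

    Hypothetical helper: the inclusive May 1 - July 31 bounds are an assumption.
    """
    start, end = date(year, 5, 1), date(year, 7, 31)
    days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    # date.weekday() returns 0 for Monday and 3 for Thursday.
    return [d for d in days if d.weekday() in (0, 3)]


inits = twice_weekly_inits(2021)
print(inits[0], inits[-1], len(inits))
```

The real function presumably applies the same filter to the forecast file's initialization-time coordinate rather than to a plain list of dates.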


2. `load_imd_rainfall`
   
* Purpose: Load the IMD daily rainfall NetCDF for a given year and normalize dimension names to lat/lon/time. 
* Key behavior & assumptions:
  * Opens the file named "data_{year}.nc" or "{year}.nc".
  * Extracts `ds['RAINFALL']`.
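
The two accepted file names suggest a simple fallback lookup. The sketch below assumes "data_{year}.nc" is tried first and "{year}.nc" second; that order, and the helper name `find_imd_file`, are assumptions made for illustration.

```python
from pathlib import Path


def find_imd_file(imd_folder: str, year: int) -> Path:
    """Return the first existing IMD file for `year` (hypothetical helper).

    The lookup order (data_{year}.nc, then {year}.nc) is an assumption.
    """
    for name in (f"data_{year}.nc", f"{year}.nc"):
        candidate = Path(imd_folder) / name
        if candidate.exists():
            return candidate
    raise FileNotFoundError(f"No IMD rainfall file for {year} in {imd_folder}")
```

A call such as `find_imd_file("data/imd_rainfall_data/2p0", 2020)` would return the matching path, which can then be opened and its `RAINFALL` variable extracted.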

3. `detect_observed_onset`

* Purpose: Detect the first observed monsoon onset date at each grid cell for a given year, using IMD daily rainfall and a spatial threshold. Optionally enforce the "MOK" start date (June 2) so that onsets are only searched for after this date.
* Key behavior & assumptions:
  * Start-date gate: if `mok=True`, the search starts on June 2; otherwise it starts on May 1. If the chosen start date is not present in the data, it falls back to April 1 and issues a warning.
  * Checks two conditions at each day `t`: a first-day condition (rainfall on day `t` is strictly > 1) and an accumulation condition (the 5-day sum over `t...t+4` is strictly greater than the cell's `threshold_slice[lat, lon]`).
  * Uses `xr.apply_ufunc` along time to get, for each `(lat, lon)`, the first index where the combined onset condition is True, or -1 if it is never met.
  * Creates an empty `(lat, lon)` array of `datetime64[ns]` filled with `NaT` and, for each grid cell with a valid index `k >= 0`, sets `onset_data[i, j] = rain_subset.time[k]`.
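
For a single grid cell, the combined onset condition reduces to a short scan over the daily series. The sketch below is a pure-Python illustration of that per-cell logic, not the script's implementation; the function name `first_onset_index` is hypothetical, and only start days with a full 5-day window are checked, which is an assumption.

```python
def first_onset_index(rain: list[float], threshold: float, window: int = 5) -> int:
    """Return the first index t where rain[t] > 1 and the sum of
    rain[t : t + window] exceeds `threshold`; return -1 if none exists.
    """
    # Assumption: days without a complete window at the end are skipped.
    for t in range(len(rain) - window + 1):
        first_day_ok = rain[t] > 1
        accumulation_ok = sum(rain[t:t + window]) > threshold
        if first_day_ok and accumulation_ok:
            return t
    return -1


daily_rain = [0.0, 0.5, 2.0, 0.0, 0.0, 0.0, 0.0, 3.0, 6.0, 8.0, 5.0, 4.0]
print(first_onset_index(daily_rain, threshold=20.0))
```

In this toy series, day 2 passes the first-day check but fails the accumulation check, so the scan continues until both conditions hold together.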


4. `compute_mean_onset_for_all_members`

* Purpose: From an ensemble forecast (`p_model`) and observed onset dates, compute a single ensemble onset per `(init_time, lat, lon)` by:
  1. Detecting each member's onset day using a 5-day window criterion and an after-MOK gate.
  2. Requiring >= 50% of members to register an onset.
  3. Taking the rounded mean of member onset days as the ensemble onset.
* Output: a tidy DataFrame that can be fed into windowed verification.
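
The member-agreement rule and the rounded mean can be sketched for one `(init_time, lat, lon)` cell as follows. This assumes the mean is taken only over members that detected an onset and that `None` marks a member with no onset; both are assumptions, and `ensemble_onset_day` is a hypothetical name.

```python
from typing import Optional


def ensemble_onset_day(
    member_days: list[Optional[int]], min_fraction: float = 0.5
) -> Optional[int]:
    """Combine per-member onset days (None = no onset) into one ensemble day."""
    detected = [d for d in member_days if d is not None]
    # Require at least `min_fraction` of all members to register an onset.
    if not member_days or len(detected) / len(member_days) < min_fraction:
        return None
    # Note: Python's round() sends exact .5 ties to the nearest even integer.
    return round(sum(detected) / len(detected))


print(ensemble_onset_day([7, 9, None, 8]))
print(ensemble_onset_day([7, None, None, None]))
```

The first call has three of four members agreeing, so it yields an onset day; the second has only one of four and yields no ensemble onset.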
 

5. `compute_onset_metrics_with_windows`

* Purpose: Given the per-init, per-grid ensemble outcomes from the previous function and an observed onset date for each grid, compute operational contingency metrics (TP, FP, FN, TN) plus MAE, both per grid and aggregated across grids.
  * For each window, checks whether the model predicts an onset.
  * Compares the forecast date to the observed IMD-based onset date for that grid and year.
  * Calculates the absolute timing error (`|forecast - observed|`).
  * Labels the forecast as:
    * TP: if an onset is predicted and the timing error is within the tolerance (±3 days)
    * FP: if an onset is predicted but the timing error exceeds the tolerance
    * FN: if an onset occurred but the model missed it
    * TN: if an onset was neither observed nor predicted within the window
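
The labeling rules above can be written as a small decision function. This is a sketch under stated assumptions: a forecast of `None` means no onset was predicted in the window, a non-`None` forecast is taken to lie inside the window already, and the handling of edge cases in the actual script may differ. The name `classify_forecast` is hypothetical.

```python
from datetime import date
from typing import Optional


def classify_forecast(
    forecast: Optional[date],
    observed: Optional[date],
    window_start: date,
    window_end: date,
    tolerance_days: int = 3,
) -> str:
    """Label one (init, grid) forecast as TP, FP, FN, or TN."""
    if forecast is not None:
        # Onset predicted: a hit only if the timing error is within tolerance.
        if observed is not None and abs((forecast - observed).days) <= tolerance_days:
            return "TP"
        return "FP"
    # No onset predicted: a miss only if the observed onset fell in the window.
    observed_in_window = observed is not None and window_start <= observed <= window_end
    return "FN" if observed_in_window else "TN"


w0, w1 = date(2021, 6, 1), date(2021, 6, 15)
print(classify_forecast(date(2021, 6, 5), date(2021, 6, 7), w0, w1))
```

The absolute timing error `abs((forecast - observed).days)` computed for the TP check is the same quantity that feeds the MAE.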
  

6. `compute_metrics_multiple_years`

* Purpose: Run the entire verification pipeline across multiple years:
  * Load ensemble precipitation forecasts and IMD observations
  * Detect observed onset dates
  * Compute the ensemble onset per (init, grid) using the >= 50% rule and evaluate deterministic metrics
  * Returns per-year metric tables and the observed onset maps for downstream spatial summaries and plotting 

7. `create_spatial_far_mr_mae`
* Purpose: Aggregate per-grid, per-year verification results into spatial maps of:
  * False Alarm Rate 
  * Miss Rate 
  * Mean MAE 
  * Year-specific MAE layers 
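
For one grid cell, the two rates can be derived from contingency counts summed over years. The source does not give the formulas, so the sketch below assumes the common definitions FP / (TP + FP) for the false alarm rate (often called the false alarm ratio) and FN / (TP + FN) for the miss rate; the script may define them differently. The name `far_and_miss_rate` is hypothetical.

```python
import math


def far_and_miss_rate(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (false alarm rate, miss rate) from contingency counts.

    Assumed definitions: FAR = FP / (TP + FP), MR = FN / (TP + FN).
    NaN is returned when a denominator is zero.
    """
    far = fp / (tp + fp) if (tp + fp) > 0 else math.nan
    miss_rate = fn / (tp + fn) if (tp + fn) > 0 else math.nan
    return far, miss_rate


far, mr = far_and_miss_rate(tp=6, fp=2, fn=4)
print(far, mr)
```

Returning NaN for cells with no predicted or observed onsets keeps them blank on a map instead of showing a misleading zero.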

8. `get_india_outline`
* Purpose: Get India coordinates and boundaries from shapefile path passed into the function.  

9. `plot_spatial_metrics`
* Purpose: Render a 3-panel figure showing spatial maps of:
  * Mean MAE (days) across years
  * False Alarm Rate
  * Miss Rate
* Overlays: the India outline and, when supported, a Core Monsoon Zone (CMZ) with CMZ averages.

## Running Scripts

1. Standard 1-15 Day Evaluation

```bash
python reference_scripts/mae_far_mr_probablistic_models/mae_far_mr_probabilistic_models.py \
    --years 2019 2020 2021 \
    --model_forecast_dir data/model_forecast_data/ngcm51/climatology/tp_2p0 \
    --imd_folder data/imd_rainfall_data/2p0 \
    --thres_file data/imd_onset_threshold/mwset2x2.nc4 \
    --shpfile_path data/ind_map_shpfile/india_shapefile.shp \
    --tolerance_days 3 \
    --verification_window 1 \
    --forecast_days 15 \
    --mok \
    --output_file data/output/results_1-15day_MOK.nc \
    --plot_dir data/output/plots
```

![Standard 1-15 Day Evaluation](../data/output/plots/spatial_metrics_2019-2021_1-15day_MOK.png "Standard 1-15 Day Evaluation Spatial Metrics") 


* Why does this make sense?

  1. The command uses the following settings:

```bash
--verification_window 1
--forecast_days 15
--tolerance_days 3
```
  * This means that each forecast initialized on day $t_0$ is judged for onset events between $t_0 + 1$ and $t_0 + 15$, with a ±3-day allowance for timing error. This is the short-to-medium range, where the model can realistically catch the true onset about one to two weeks ahead.
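
The window arithmetic can be made concrete with a small date calculation. This is an illustrative sketch that treats both offsets as inclusive day counts after initialization; the helper name `verification_window_dates` is hypothetical.

```python
from datetime import date, timedelta


def verification_window_dates(
    init: date, verification_window: int = 1, forecast_days: int = 15
) -> tuple[date, date]:
    """Return the inclusive (start, end) dates of the verification window."""
    start = init + timedelta(days=verification_window)
    end = init + timedelta(days=forecast_days)
    return start, end


start, end = verification_window_dates(date(2021, 5, 31))
print(start, end)
```

For an initialization on May 31, the standard settings cover June 1 through June 15, while `verification_window=16` and `forecast_days=30` would cover June 16 through June 30.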

2. Extended 16-30 Day Evaluation

```bash
python reference_scripts/mae_far_mr_probablistic_models/mae_far_mr_probabilistic_models.py \
    --years 2019 2020 2021 \
    --model_forecast_dir data/model_forecast_data/ngcm51/climatology/tp_2p0 \
    --imd_folder data/imd_rainfall_data/2p0 \
    --thres_file data/imd_onset_threshold/mwset2x2.nc4 \
    --shpfile_path data/ind_map_shpfile/india_shapefile.shp \
    --tolerance_days 5 \
    --verification_window 16 \
    --forecast_days 30 \
    --mok \
    --output_file data/output/results_16-30day_MOK.nc \
    --plot_dir data/output/plots
```

![Extended 16-30 Day Evaluation](../data/output/plots/spatial_metrics_2019-2024_16-30day_MOK.png "Extended 16-30 Day Evaluation Spatial Metrics") 

* Why does this make sense?

  1. Shifting the verification window further into the future means that a forecast initialized on day $t_0$ is only evaluated for onset events occurring between $t_0 + 16$ and $t_0 + 30$.
  2. By then, most observed onsets in the IMD data have already occurred, so the model has no chance to hit them. At these lead times most models also lose deterministic skill and revert to climatology. The result is a high miss rate, because the onsets fall before or outside the 16-30 day window.
  3. MAE drops, or has no entries at all, because so few valid onsets remain.


3. IFS-S2S 2019-2020 Metrics

```bash
python reference_scripts/mae_far_mr_probablistic_models/mae_far_mr_probabilistic_models.py \
    --years 2019 2020 \
    --model_forecast_dir data/model_forecast_data/IFS-S2S/tp_2p0/ \
    --imd_folder data/imd_rainfall_data/2p0 \
    --thres_file data/imd_onset_threshold/mwset2x2.nc4 \
    --shpfile_path data/ind_map_shpfile/india_shapefile.shp \
    --tolerance_days 3 \
    --verification_window 1 \
    --forecast_days 10 \
    --mok \
    --output_file data/output/results_1-10day_MOK.nc
```

![IFS-S2S](../data/output/plots/spatial_metrics_2019-2020_1-10day_MOK.png "IFS-S2S") 

## Understanding the CLI Arguments 

Self-explanatory parameters, such as the input and output paths, are omitted here:

* `--tolerance_days 5`
  * Tolerance (± days) used to decide a hit: a forecast is considered correct if it falls within ± `tolerance_days` of the observed onset.
* `--verification_window 16`
  * Start-day offset after initialization used for evaluation. With `verification_window=16` and `forecast_days=30`, the verification window is t0+16 .. t0+30.
* `--forecast_days 30`
  * End-day offset for the verification window. Combined with `verification_window`, it defines the inclusive interval after initialization in which to check for onset.
* `--mok`
  * Flag to enforce the MOK start gate (search for observed onsets only on or after June 2). If omitted, the search starts on May 1.
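
To show how flags of this shape are typically declared, here is a minimal `argparse` sketch covering only the four arguments above plus `--years`. It is not the script's actual parser: the types and default values are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the evaluation flags (defaults are assumed)."""
    parser = argparse.ArgumentParser(description="Onset verification (sketch)")
    parser.add_argument("--years", type=int, nargs="+", required=True)
    parser.add_argument("--tolerance_days", type=int, default=3)
    parser.add_argument("--verification_window", type=int, default=1)
    parser.add_argument("--forecast_days", type=int, default=15)
    # store_true makes --mok a boolean switch that is False when omitted.
    parser.add_argument("--mok", action="store_true")
    return parser


args = build_parser().parse_args(
    "--years 2019 2020 --tolerance_days 5 "
    "--verification_window 16 --forecast_days 30 --mok".split()
)
print(args.years, args.tolerance_days, args.mok)
```

Declaring `--mok` with `store_true` matches the way it appears in the commands above, as a bare flag with no value.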

## Tips for Running the Scripts

1. Make sure that you have all required data files in your `data` directory.
2. After you run `make run-interactive`, navigate to the `src` directory.
3. Now you are free to run your scripts. Ensure that for any of the path arguments the paths are correctly typed relative to the `src` directory within the container.

## Questions 

1. What is the purpose of initializing forecasts twice weekly instead of only around observed onset dates, and how does this choice help create a more operational evaluation framework?
2. What is the Webster–Yang index (WYI) and how does it directly relate to the onset predictions?