# Tutorial: **$\delta$ HBV 2.0**

---

This notebook demonstrates how to forward a pre-trained $\delta$ HBV 2.0UH model developed by [Yalan Song et al. (2024)](https://doi.org/10.22541/essoar.172736277.74497104/v1). For an explanation of the model structure, methodologies, [data](https://mhpi.github.io/datasets/CONUS/#results), and performance metrics, please refer to the publication [below](#publication). If you find this code useful in your own work, please include the aforementioned citation.

<br>

#### Before Running:
- **Environment**: A minimal Python environment for running this code can be set up from `env/` (see `docs/getting_started.md` for more details):
    - Conda -- `deltamodel_env.yaml`
    - Pip -- `requirements.txt`


- **Model and Data**: The trained $\delta$ HBV 2.0 model and input data can be downloaded from [SharePoint](https://pennstateoffice365-my.sharepoint.com/:f:/g/personal/cxs1024_psu_edu/Eqi1NuJ3d2pMpEJpVu0EGSoBigi-VCWVHgOYIRoTeuGiOw?e=HaNNeA). After downloading:

    1. Update the `subbasin_data_path` key in data config `example/conf/observations/merit_forward.yaml` with your path to `MERIT_input_sample/71_0`.

    2. Update the `trained_model` key in the model config `example/conf/config_dhbv_2_0.yaml` with the path to your directory containing the trained model `dHBV_2_0_Ep100.pt` and the normalization file `test1980-2020_Ep100/normalization_statistics.json`.

- **Hardware**: The LSTMs used in this model require CUDA support, which is only available with Nvidia GPUs. For those without access, T4 GPUs can be used by running this notebook with dMG on [Google Colab](https://colab.research.google.com/).
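
Before editing the configs, it can help to confirm that the downloaded files sit where the steps above expect them. The following is a minimal sketch using only the standard library. `missing_model_files` is a hypothetical helper (it is not part of dMG), and the layout it checks, with both files located relative to a single model directory, is an assumption read from step 2 above.

```python
from pathlib import Path

# File names taken from step 2 above; their location relative to the
# model directory is an assumption.
EXPECTED_FILES = (
    'dHBV_2_0_Ep100.pt',
    'test1980-2020_Ep100/normalization_statistics.json',
)


def missing_model_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected files that are absent from `model_dir`."""
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]
```

Calling `missing_model_files('your/path/to/model_dir')` returns an empty list when both files are present, and otherwise lists the relative paths that still need to be downloaded or moved.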




### Publication:

*Song, Yalan, Tadd Bindas, Chaopeng Shen, Haoyu Ji, Wouter Johannes Maria Knoben, Leo Lonzarich, Martyn P. Clark et al. "High-resolution national-scale water modeling is enhanced by multiscale differentiable physics-informed machine learning." Authorea Preprints (2024). https://essopenarchive.org/doi/full/10.22541/essoar.172736277.74497104.*

<br>

### Issues:
For questions, concerns, bugs, etc., please reach out by posting an issue on the [dMG repo](https://github.com/mhpi/generic_deltaModel/issues).

---

<br>

## 1. Train/Evaluate $\delta$ HBV 2.0

*Multiscale training for dHBV2.0 is not currently enabled in dMG. Training code will be released at a later time.*

## 2. Forward $\delta$ HBV 2.0

After completing [these](#before-running) steps, forward the $\delta$ HBV 2.0 model with the code block below.

--> With default settings, expect an evaluation time of ~1 minute on an Nvidia A100.

**Note**
- The settings defined in the config file `../example/conf/config_dhbv_2_0.yaml` are set to replicate benchmark performance.
- For model evaluation, set `mode: predict` in the config, or modify after the config dict has been created (see below).
- The default inference window is set from 1 January 1980 to 31 December 2020, which should use ~70 GB of VRAM.
- The first year (`warm_up` in the config, 365 days is default) of the inference period is used for initializing HBV's internal states (water storages) and is, therefore, excluded from the model's prediction output.
- If you are new to the *dMG* framework and want a fuller explanation of the methods used below, we suggest first looking at our notebook for $\delta$ HBV 1.0: `example/hydrology/example_dhbv_1_0.ipynb`.
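
The notes on `mode` and `warm_up` can be made concrete with a small sketch. It uses a plain dictionary as a stand-in for the loaded config, mirroring only the `mode` and `warm_up` keys that appear in this notebook. It also assumes that the inference window includes both end dates and that warm-up days are dropped from the start of the window; the `Dates` utility in dMG may count days differently.

```python
from datetime import date, timedelta

# Stand-in for the dict returned by `load_config`; the real config has
# many more keys, and 'train' is only a placeholder initial value.
config = {
    'mode': 'train',
    'dpl_model': {'phy_model': {'warm_up': 365}},
}

# Override the run mode after the config dict has been created.
config['mode'] = 'predict'

# Default inference window from the notes above (assumed inclusive).
start, end = date(1980, 1, 1), date(2020, 12, 31)
warm_up = config['dpl_model']['phy_model']['warm_up']

n_days_total = (end - start).days + 1       # days in the full window
n_days_output = n_days_total - warm_up      # days left after warm-up
first_output_day = start + timedelta(days=warm_up)

print(n_days_total, n_days_output, first_output_day)
# 14976 14611 1980-12-31
```

Under these assumptions, the window spans 14,976 days, of which 14,611 remain after warm-up. Because 1980 is a leap year, a 365-day warm-up ends one day short of the calendar year, so the first output day is 31 December 1980 rather than 1 January 1981.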

In [None]:
import sys
sys.path.append('../../')
sys.path.append('../../dMG')  # Add the dMG root directory.

from example import load_config 
from models.model_handler import ModelHandler as dHBV
from core.utils import print_config
from core.utils.factory import import_data_loader, import_trainer



#------------------------------------------#
# Define model settings here.
CONFIG_PATH = '../example/conf/config_dhbv_2_0.yaml'
#------------------------------------------#



# 1. Load configuration dictionary of model parameters and options.
config = load_config(CONFIG_PATH)
print_config(config)

# 2. Setup a dataset loader to prepare NN and physics model inputs.
data_loader_cls = import_data_loader(config['data_loader'])
data_loader = data_loader_cls(config, test_split=True, overwrite=False)

# 3. Initialize the differentiable model dHBV 2.0 (LSTM + HBV 2.0).
model = dHBV(config, verbose=True)

# 4. Initialize trainer to handle forward pass.
trainer_cls = import_trainer(config['trainer'])
trainer = trainer_cls(
    config,
    model,
    inf_dataset=data_loader.dataset,
    verbose=True,
)

# 5. Forward pass through the model to get streamflow predictions.
predictions = trainer.inference()

### Visualizing Model Predictions

After running model inference, we can view the hydrograph for one of the catchments to check that the outputs look as expected.

Here we plot the target variable, streamflow, but the model also returns many other states and fluxes, which are listed in the output of the cell below.
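
When plotting several catchments, looking up each COMID by scanning the full ID array becomes repetitive. A minimal sketch of an alternative, which builds a lookup table once, is shown here. `build_comid_index` and `basin_series` are hypothetical helpers (not part of dMG), the IDs and values are made-up toy data, and the nested list stands in for a predictions array assumed to be indexed as time by catchment.

```python
def build_comid_index(comids):
    """Map each COMID to its position in the input array."""
    return {int(comid): idx for idx, comid in enumerate(comids)}


def basin_series(pred, comid_index, comid):
    """Return the time series for one catchment from a [time][catchment] grid."""
    if comid not in comid_index:
        raise ValueError(f"Catchment with ID {comid} not found.")
    col = comid_index[comid]
    return [row[col] for row in pred]


# Toy data: 3 timesteps x 3 catchments (IDs and values are made up).
comids = [101, 102, 103]
pred = [
    [0.1, 1.0, 10.0],
    [0.2, 2.0, 20.0],
    [0.3, 3.0, 30.0],
]

comid_index = build_comid_index(comids)
print(basin_series(pred, comid_index, 102))  # [1.0, 2.0, 3.0]
```

The dictionary costs one pass over the COMID array to build, after which each lookup is a constant-time operation; for a single catchment, a direct index search is just as reasonable.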

In [None]:
import zarr

from core.utils.dates import Dates
from core.post.plot_hydrograph import plot_hydrograph



#------------------------------------------#
# Choose a catchment by unit catchment ID (COMID) to plot.
COMID = 71024425
TARGET = 'flow_sim'

# Resampling frequency for the plotted series. Options: 'D', 'W', 'M', 'Y'.
RESAMPLE = 'D'

# Set the path to the zarr store of input data (containing COMIDs).
DATA_PATH = 'your/path/to/MERIT_input_sample/71_0'
#------------------------------------------#



print(f"HBV states and fluxes: {predictions.keys()} \n")


# 1. Get the streamflow predictions and daily timesteps of the prediction window.
pred = predictions[TARGET]
timesteps = Dates(config['predict'], config['dpl_model']['rho']).batch_daily_time_range

# Remove warm-up period to match model output (see Note above.)
timesteps = timesteps[config['dpl_model']['phy_model']['warm_up']:]


# 2. Load array of comids and get the index of the selected catchment.
# Open read-only; the store is only read here, so write access is not needed.
root = zarr.open_group(DATA_PATH, mode='r')
comids = root['COMID'][:]
print(f"First 20 available COMIDs: \n {comids[:20]} \n")

if COMID in comids:
    basin_idx = list(comids).index(COMID)
else:
    raise ValueError(f"Catchment with ID {COMID} not found in the MERIT dataset.")


# 3. Get the data for the chosen catchment and plot.
streamflow_pred_basin = pred[:, basin_idx].squeeze()

plot_hydrograph(
    timesteps,
    streamflow_pred_basin,
    streamflow_pred_basin,
    resample=RESAMPLE,
    title=f"Hydrograph for Catchment {COMID}",
    ylabel='Streamflow (ft$^3$/s)',
)