# Tutorial: **δHBV 2.0**

---

This notebook demonstrates forward simulation with the pre-trained model δHBV 2.0UH developed by [Yalan Song et al. (2025)](https://doi.org/10.1029/2024WR038928). For an explanation of the model structure, methodologies, [data](https://mhpi.github.io/datasets/CONUS/#results), and performance metrics, please refer to the publication [below](#publication). If you find this code useful in your own work, please include the aforementioned citation.

**Note**: If you are new to the dMG framework, we suggest first looking at our [δHBV 1.0 tutorial](./../hydrology/example_dhbv_1_0.ipynb).

<br>

### Before Running:
- **Environment**: See [setup.md](./../../docs/setup.md) for ENV setup. dMG must be installed with dependencies + hydrodl2 to run this notebook.

- **Model and Data**: The pretrained δHBV 2.0 model weights and input data can be downloaded from [AWS](https://mhpi-spatial.s3.us-east-2.amazonaws.com/mhpi-release/models/dhbv_2_trained.zip). (The full δHBV 2.0 40-year, high-resolution simulation product can be downloaded from [sharepoint/dHBV2.0_datasets/](https://pennstateoffice365-my.sharepoint.com/:f:/g/personal/cxs1024_psu_edu/Eqi1NuJ3d2pMpEJpVu0EGSoBigi-VCWVHgOYIRoTeuGiOw?e=HaNNeA); note that it is 101 GB in size.) After downloading, update the model and data paths in their respective configs:

    1. In [`./generic_deltamodel/example/conf/config_dhbv_2.yaml`](./../conf/config_dhbv_2.yaml), update *trained_model* with your path to the parent directory containing both trained model weights `dhbv_2_ep100.pt` **and** normalization file `normalization_statistics.json`.
        - **Note**: Make sure this path ends with a trailing forward slash, e.g., `./your/path/to/model/`.

    2. In [`./generic_deltamodel/example/conf/observations/merit.yaml`](./../conf/observations/merit.yaml), update *subbasin_data_path* with your path to `./merit_71_0/`.

- **Hardware**: The NNs used in this model require CUDA support, which is only available with NVIDIA GPUs. If you do not have access to one, you can run this notebook with dMG on [Google Colab](https://colab.research.google.com/) using a T4 GPU.
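
The path requirements listed under **Model and Data** can be checked before launching the notebook. The following is a minimal sketch and not part of dMG: the helper name `check_model_dir` is hypothetical, and it relies only on the two file names given above.

```python
from pathlib import Path

# File names taken from the setup step above.
REQUIRED_FILES = ('dhbv_2_ep100.pt', 'normalization_statistics.json')


def check_model_dir(trained_model: str) -> list[str]:
    """Return a list of problems found with a `trained_model` path.

    Hypothetical helper (not part of dMG); an empty list means the path
    looks usable.
    """
    problems = []
    if not trained_model.endswith('/'):
        problems.append('path is missing the trailing forward slash')
    model_dir = Path(trained_model)
    if not model_dir.is_dir():
        problems.append(f'{trained_model} is not a directory')
        return problems
    for name in REQUIRED_FILES:
        if not (model_dir / name).is_file():
            problems.append(f'missing file: {name}')
    return problems


if __name__ == '__main__':
    # Replace with the value you set for `trained_model` in the config.
    for problem in check_model_dir('./your/path/to/model/'):
        print(problem)
```

An empty result means the directory exists, ends with a slash, and contains both files; it does not verify that the files themselves are valid.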

<br>

### Publication:
*Yalan Song, Tadd Bindas, Chaopeng Shen, Haoyu Ji, Wouter Johannes Maria Knoben, Leo Lonzarich, Martyn P. Clark, et al. "High-resolution national-scale water modeling is enhanced by multiscale differentiable physics-informed machine learning." Water Resources Research (2025). https://doi.org/10.1029/2024WR038928.*

<br>

### Issues:
For questions, concerns, bugs, etc., please reach out by posting an [issue](https://github.com/mhpi/generic_deltamodel/issues) on GitHub.

---


## 1. Forward δHBV 2.0

After completing the steps in [Before Running](#before-running), run a forward simulation of δHBV 2.0 with the code block below.

**Note**
- The settings defined in the config file `../example/conf/config_dhbv_2.yaml` are set to replicate benchmark performance.
- For model evaluation, set `mode: simulation` in the config, or modify after the config dict has been created (see below).
- The first year of the inference period (`warm_up` in the config, 365 days by default) is used to initialize HBV's internal states (water storages) and is therefore excluded from the model's prediction output.
- For default settings with the inference window set from 1 January 1980 to 31 December 2020, expect ~70GB of VRAM utilization and a runtime of ~2 minutes. (VRAM use can be reduced by decreasing the `simulation` batch size in the model config.)
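
To make the warm-up rule concrete, the sketch below counts the daily timesteps in the default inference window and drops the first `warm_up` days. It uses only the Python standard library and does not call dMG; it assumes the window includes both end dates.

```python
from datetime import date, timedelta

# Default inference window and warm-up length quoted in the notes above.
START = date(1980, 1, 1)
END = date(2020, 12, 31)
WARM_UP = 365  # days

# Daily timesteps across the window, assuming both end dates are included.
timesteps = [START + timedelta(days=i) for i in range((END - START).days + 1)]

# Predictions are returned only for timesteps after the warm-up period.
output_timesteps = timesteps[WARM_UP:]

print(f'Timesteps in window: {len(timesteps)}')
print(f'Timesteps in output: {len(output_timesteps)}')
print(f'First output date:   {output_timesteps[0]}')
```

Because 1980 is a leap year, a 365-day warm-up ends one day before the calendar year does, so under these assumptions the first output date is 1980-12-31 rather than 1981-01-01.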

### 1.1 Demonstration

In [None]:
import sys
sys.path.append('../../')

from dmg import ModelHandler
from dmg.core.utils import (import_data_loader, import_trainer, print_config,
                            set_randomseed)
from example import load_config

#------------------------------------------#
# Define model settings here.
CONFIG_PATH = '../example/conf/config_dhbv_2.yaml'
#------------------------------------------#


# 1. Load configuration dictionary of model parameters and options.
config = load_config(CONFIG_PATH)
config['mode'] = 'simulation'
print_config(config)

# Set random seed for reproducibility.
set_randomseed(config['random_seed'])

# 2. Initialize the differentiable HBV 2.0 model (LSTM + HBV 2.0).
model = ModelHandler(config, verbose=True)

# 3. Load and initialize a dataset dictionary of NN and HBV model inputs.
data_loader_cls = import_data_loader(config['data_loader'])
data_loader = data_loader_cls(config, test_split=False, overwrite=False)

# 4. Initialize trainer to handle forward pass.
trainer_cls = import_trainer(config['trainer'])
trainer = trainer_cls(
    config,
    model,
    dataset=data_loader.dataset,
    verbose=True,
)

# 5. Forward pass through the model to get streamflow predictions.
predictions = trainer.inference()
print(f"Predictions saved to \n{config['out_path']}")

### 1.2 Visualizing Model Predictions

After running model inference, we can check that the outputs look as expected by viewing the hydrograph for one of the basins.

Here we plot the target variable, streamflow, though many other states and fluxes are available (see the cell output below).

In [None]:
import zarr

from dmg.core.post import plot_hydrograph
from dmg.core.utils import Dates

#------------------------------------------#
# Choose a catchment by unit catchment ID (COMID) to plot.
COMID = 71024425
TARGET = 'streamflow'

# Resampling frequency for the plot (here, 3-week bins). Base options: 'D', 'W', 'M', 'Y'.
RESAMPLE = '3W'

# Set the path to the zarr store of input data (containing COMIDs).
DATA_PATH = config['observations']['subbasin_data_path']  # e.g., './MERIT_input_sample/71_0/'
#------------------------------------------#


# 1. Get the streamflow predictions and daily timesteps of the prediction window.
print(f"HBV states and fluxes: {predictions.keys()} \n")

pred = predictions[TARGET]
timesteps = Dates(config['simulation'], config['delta_model']['rho']).batch_daily_time_range

# Remove warm-up period to match model output (see Note above).
timesteps = timesteps[config['delta_model']['phy_model']['warm_up']:]


# 2. Load array of comids and get the index of the selected catchment.
root = zarr.open_group(DATA_PATH, mode='r')  # Read-only; the store is not modified.
comids = root['COMID'][:]
print(f"First 20 available COMIDs: \n {comids[:20]} \n")

if COMID in comids:
    basin_idx = list(comids).index(COMID)
else:
    raise ValueError(f"Catchment with ID {COMID} not found in the MERIT dataset.")


# 3. Get the data for the chosen catchment and plot.
streamflow_pred_basin = pred[:, basin_idx].squeeze()

plot_hydrograph(
    timesteps,
    streamflow_pred_basin,
    resample=RESAMPLE,
    title=f"Hydrograph for Catchment {COMID}",
    ylabel='Streamflow (mm/day)',
)
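
If you want hydrographs for several catchments, the `list(comids).index(COMID)` lookup in step 2 rescans the COMID array on every call. A minimal sketch of an alternative builds the COMID-to-index mapping once. The helper name `build_comid_index` is hypothetical, the sample IDs other than 71024425 are made up for illustration, and a plain list stands in for the array read from the zarr store.

```python
def build_comid_index(comids) -> dict[int, int]:
    """Map each COMID to its position along the basin axis."""
    return {int(comid): idx for idx, comid in enumerate(comids)}


# Stand-in for `root['COMID'][:]`; these values are illustrative only.
comids = [71024425, 71000001, 71000002]
comid_to_idx = build_comid_index(comids)

selected = [71024425, 71000002, 79999999]
for comid in selected:
    basin_idx = comid_to_idx.get(comid)
    if basin_idx is None:
        print(f'Catchment {comid} not found; skipping.')
        continue
    # In the notebook, this index would select `pred[:, basin_idx]`.
    print(f'Catchment {comid} is at basin index {basin_idx}.')
```

Skipping missing IDs instead of raising is a design choice for batch plotting; keep the `ValueError` from the cell above if a missing catchment should stop the run.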

<br>

## 2. Train/Evaluate δHBV 2.0

*Multiscale training for δHBV 2.0 is not currently enabled in dMG. Training code will be released at a later time.*