# Hydrological modelling - HYDROTEL

<div class="alert alert-warning"> <b>WARNING</b>
    
`xHydro` provides tools to execute HYDROTEL, but will not prepare the model itself. This should be done beforehand.

</div>

<div class="alert alert-info"> <b>INFO</b>
    
The HYDROTEL executable can be acquired from this [GitHub repository](https://github.com/INRS-Modelisation-hydrologique/hydrotel).

</div>

`xHydro` provides a collection of functions designed to facilitate hydrological modelling, focusing on two key models: [HYDROTEL](https://github.com/INRS-Modelisation-hydrologique/hydrotel) and a suite of models emulated by the [Raven Hydrological Framework](https://raven.uwaterloo.ca/). It is important to note that Raven already possesses an extensive Python library, [RavenPy](https://github.com/CSHS-CWRA/RavenPy), which enables users to build, calibrate, and execute models. `xHydro` wraps some of these functions to support multi-model assessments with HYDROTEL, though users seeking advanced functionalities may prefer to use `RavenPy` directly. 

The primary contribution of `xHydro` to hydrological modelling is thus its support for HYDROTEL, a model that previously lacked a dedicated Python library. However, building a HYDROTEL project is best done using PHYSITEL and the HYDROTEL GUI, both of which are proprietary software. Therefore, for the time being, `xHydro` is designed to facilitate the execution and modification of an already established HYDROTEL project, rather than assist in building one from scratch.

A similar notebook covering `RavenPy` models is available [here](hydrological_modelling_raven.ipynb).

## Basic information

In [None]:
import xhydro as xh
import xhydro.modelling as xhm

In [None]:
# Workaround for determining the notebook folder within a running notebook
# This cell is not visible when the documentation is built.

from __future__ import annotations

try:
    from _finder import _find_current_folder

    notebook_folder = _find_current_folder()
except ImportError:
    from pathlib import Path

    notebook_folder = Path.cwd()

import logging

logger = logging.getLogger()
logger.setLevel(logging.CRITICAL)

The `xHydro` modelling framework is based on a `model_config` dictionary, which is meant to contain all the information necessary to execute a given hydrological model. Depending on the model, it can store meteorological datasets directly, paths to datasets (NetCDF files or other), CSV configuration files, parameters, and anything else required to configure and execute a hydrological model.

The list of required inputs for the dictionary can be obtained in one of two ways. The first is to look at the hydrological model's class, such as `xhydro.modelling.Hydrotel`. The second is to use the `xh.modelling.get_hydrological_model_inputs` function to get a list of the required keys for a given model, as well as the documentation.

In [None]:
help(xhm.get_hydrological_model_inputs)

In [None]:
import xhydro as xh
import xhydro.modelling as xhm

# This function can be called to get a list of the keys for a given model, as well as its documentation.
inputs, docs = xhm.get_hydrological_model_inputs("Hydrotel", required_only=False)
inputs

In [None]:
print(docs)

HYDROTEL and Raven vary in terms of required inputs and available functions, but an effort will be made to standardize the outputs as much as possible. Currently, all models include the following three functions:

- `.run()`: Executes the model, reformats the outputs to be compatible with analysis tools in `xHydro`, and returns the simulated streamflow as an `xarray.Dataset`.
  - The streamflow variable will be named `q` and will have units of `m3 s-1`.
  - For 1D data (such as hydrometric stations), the corresponding dimension in the dataset will be identified by the `cf_role: timeseries_id` attribute.
  
- `.get_inputs()`: Retrieves the meteorological inputs used by the model.

- `.get_streamflow()`: Retrieves the simulated streamflow output from the model.

## Initializing and running a calibrated model

A typical HYDROTEL project consists of multiple subfolders and files that describe meteorological inputs, watershed characteristics, and more. An example is given in the cell below. The model primarily relies on three key files:

- A project file located in the main directory, which may have any given name (e.g., `SLNO.csv`).
- A `simulation/simulation/simulation.csv` file that manages all the parameters for the run, including simulation dates, the path to meteorological data, and the physical processes to be used.
- A `simulation/simulation/output.csv` file that specifies which results to produce, such as which variables and river reaches to output results for.
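
As a minimal sketch of the layout described above, the snippet below checks whether the three key files are present in a project folder. The helper name `missing_hydrotel_files` is hypothetical and is not part of `xHydro`; the relative paths are simply the ones listed above.

```python
import tempfile
from pathlib import Path


def missing_hydrotel_files(project_dir, project_file):
    """Return the key files that are absent from a HYDROTEL project folder."""
    project_dir = Path(project_dir)
    expected = [
        project_dir / project_file,
        project_dir / "simulation" / "simulation" / "simulation.csv",
        project_dir / "simulation" / "simulation" / "output.csv",
    ]
    return [path for path in expected if not path.is_file()]


# Build a throwaway project that lacks 'output.csv' to see the check in action.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "simulation" / "simulation").mkdir(parents=True)
    (root / "SLNO.csv").touch()
    (root / "simulation" / "simulation" / "simulation.csv").touch()
    missing_names = [path.name for path in missing_hydrotel_files(root, "SLNO.csv")]

print(missing_names)  # ['output.csv']
```

A check of this kind is only a convenience before initializing the model; it does not validate the content of the files.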

When initializing a `Hydrotel` instance through `xHydro`, two options are available:

- `use_defaults = False` (default): This will attempt to read the three required files from the project folder.
- `use_defaults = True`: This option provides an approximation of typical parameters from a project, but it may need to be reviewed and adjusted.

In all cases, providing additional configuration options to `project_config`, `simulation_config`, or `output_config` when initializing the HYDROTEL model, or through the `.update_config()` function later, will update the corresponding CSV files accordingly. 

The following parameters must always be specified:

- `DATE DEBUT` (start date)
- `DATE FIN` (end date)
- `PAS DE TEMPS` (timestep frequency)

If these parameters are not already present in `simulation.csv`, they should be added to `simulation_config`. Additionally, either `FICHIER STATIONS METEO` (meteorological stations file) or `FICHIER GRILLE METEO` (meteorological grid file) must be specified to guide the model to the meteorological data.
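
The rule above can be sketched as a small dictionary check. This is an illustration only: `check_simulation_config` is a hypothetical helper, not an `xHydro` function, and it treats an empty value the same way as a missing key.

```python
REQUIRED_KEYS = ("DATE DEBUT", "DATE FIN", "PAS DE TEMPS")
METEO_KEYS = ("FICHIER STATIONS METEO", "FICHIER GRILLE METEO")


def check_simulation_config(simulation_config):
    """Return a list of problems; an empty list means the mandatory entries are present."""
    # Each of the three mandatory parameters must be present and non-empty.
    problems = [
        f"missing '{key}'" for key in REQUIRED_KEYS if not simulation_config.get(key)
    ]
    # At least one of the two meteorological file entries must be given.
    if not any(simulation_config.get(key) for key in METEO_KEYS):
        problems.append("missing one of " + " / ".join(f"'{key}'" for key in METEO_KEYS))
    return problems


incomplete = {"DATE DEBUT": "1981-01-01", "DATE FIN": "1981-12-31", "PAS DE TEMPS": 24}
problems = check_simulation_config(incomplete)
print(problems)
```

Here the only reported problem is the missing meteorological file entry; adding `"FICHIER STATIONS METEO": "meteo/ERA5.nc"` to the dictionary would return an empty list.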

If using the defaults, streamflow will be output for all river reaches. You can modify `output.csv` to change this behavior.


In [None]:
# This is a hidden cell. We'll create a fake Hydrotel directory for the purpose of this example.
import xhydro.testing

xhydro.testing.utils.fake_hydrotel_project(
    notebook_folder / "_data" / "example_hydrotel", meteo=False, debit_aval=True
)

In [None]:
from pathlib import Path


def print_file_structure(directory, indent=0):
    path = Path(directory)
    for item in path.iterdir():
        print(" " * indent + item.name)
        # If the item is a directory, recurse into it
        if item.is_dir():
            print_file_structure(item, indent + 2)


# Example usage
print_file_structure(notebook_folder / "_data" / "example_hydrotel")

In [None]:
model_config = {
    "model_name": "Hydrotel",
    "project_dir": notebook_folder / "_data" / "example_hydrotel",
    "project_file": "projet.csv",
    "simulation_config": {
        "DATE DEBUT": "1981-01-01",
        "DATE FIN": "1981-12-31",
        "FICHIER STATIONS METEO": "meteo/ERA5.nc",
        "PAS DE TEMPS": 24,
    },
    "output_config": {"TRONCONS": 1, "DEBITS_AVAL": 1},
    "use_defaults": True,
    "executable": "path/to/Hydrotel/executable",
}

With `model_config` on hand, an instance of the hydrological model can be initialized using `xhydro.modelling.hydrological_model` or the `xhydro.modelling.Hydrotel` class directly.

In [None]:
ht = xhm.hydrological_model(model_config)

print(f"Simulation directory, taken from the project file: '{ht.simulation_dir}'\n")
print(f"Project configuration: '{ht.project_config}'\n")
print(f"Simulation configuration: '{ht.simulation_config}'\n")
print(f"Output configuration: '{ht.output_config}'")

### Formatting meteorological data

The acquisition of raw meteorological data is covered in the [GIS](gis.ipynb) and [Use Case Example](use_case.ipynb) notebooks. Therefore, this notebook will use a test dataset.

In [None]:
import xarray as xr

from xhydro.testing.helpers import (  # In-house function to get data from the xhydro-testdata repo
    deveraux,
)

D = deveraux()

meteo_file = D.fetch("hydro_modelling/ERA5_testdata.nc")
ds = xr.open_dataset(meteo_file)
ds

Every hydrological model has different requirements for its input data. In this example, the data variables have units (temperatures in `K` and precipitation in `m`) and time units that are not compatible with the requirements of HYDROTEL. Additionally, while HYDROTEL can manage 2D grids, it is often preferable to have a 1D spatial dimension to speed up some of the manipulations done by the model.

The function `xh.modelling.format_input` can be used to reformat CF-compliant datasets for use in hydrological models.
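
To make the kind of reformatting involved more concrete, here is a rough, standard-library sketch of the two ideas mentioned above: converting units and collapsing a 2D grid into a single spatial dimension. It does not reproduce what `format_input` does internally. The target units (`°C` and `mm`) are an assumption made for the illustration, and all function names are hypothetical.

```python
from itertools import product


def kelvin_to_celsius(value_k):
    """Convert a temperature from kelvin to degrees Celsius."""
    return value_k - 273.15


def metres_to_millimetres(value_m):
    """Convert a precipitation depth from metres to millimetres."""
    return value_m * 1000.0


def flatten_grid(lats, lons):
    """Turn a 2D lat/lon grid into a 1D list of (point index, lat, lon) tuples."""
    return [(i, lat, lon) for i, (lat, lon) in enumerate(product(lats, lons))]


# A 2 x 3 grid becomes 6 points along one dimension.
stations = flatten_grid([46.0, 46.25], [-72.5, -72.25, -72.0])
print(len(stations))
print(round(kelvin_to_celsius(283.15), 2), metres_to_millimetres(0.005))
```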

In [None]:
help(xh.modelling.format_input)

In [None]:
# You can also use the 'save_as' argument to save the new file(s) in your project folder.
ds_reformatted, config = xh.modelling.format_input(
    ds,
    "Hydrotel",
    save_as=notebook_folder / "_data" / "example_hydrotel" / "meteo" / "ERA5.nc",
)

HYDROTEL requires a configuration file to accompany the meteorological file. This configuration file must have the same name as the corresponding NetCDF file, but with a `.nc.config` extension. 

If the `save_as` option is used, this configuration file will also be saved along with the meteorological data.
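
The naming rule can be illustrated with `pathlib`. The helper below is a hypothetical convenience, not part of `xHydro`; it only derives the expected file name and does not create or read the file.

```python
from pathlib import Path


def companion_config_path(netcdf_path):
    """Return the path of the '.nc.config' file expected next to a NetCDF file."""
    netcdf_path = Path(netcdf_path)
    # Append '.config' to the full file name, keeping the '.nc' suffix in place.
    return netcdf_path.with_name(netcdf_path.name + ".config")


config_path = companion_config_path("meteo/ERA5.nc")
print(config_path.name)  # ERA5.nc.config
```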


In [None]:
ds_reformatted

In [None]:
config

### Validating the Meteorological Data

Before executing hydrological models, a few basic checks will be performed automatically. However, users may want to conduct more advanced health checks on the meteorological inputs (e.g., identifying unrealistic values). This can be done using `xhydro.utils.health_checks`. For the full list of available checks, refer to [the 'xscen' documentation](https://xscen.readthedocs.io/en/latest/notebooks/3_diagnostics.html#Health-checks).

We can use `.get_inputs()` to automatically retrieve the meteorological data. In this example, we'll ensure there are no abnormal meteorological values or sequences of values.

In [None]:
health_checks = {
    "raise_on": [],  # If an entry is not here, it will warn the user instead of raising an exception.
    "flags": {
        "pr": {  # You can have specific flags per variable.
            "negative_accumulation_values": {},
            "very_large_precipitation_events": {},
            "outside_n_standard_deviations_of_climatology": {"n": 5},
            "values_repeating_for_n_or_more_days": {"n": 5},
        },
        "tasmax": {
            "tasmax_below_tasmin": {},
            "temperature_extremely_low": {},
            "temperature_extremely_high": {},
            "outside_n_standard_deviations_of_climatology": {"n": 5},
            "values_repeating_for_n_or_more_days": {"n": 5},
        },
        "tasmin": {
            "temperature_extremely_low": {},
            "temperature_extremely_high": {},
            "outside_n_standard_deviations_of_climatology": {"n": 5},
            "values_repeating_for_n_or_more_days": {"n": 5},
        },
    },
}

In [None]:
from xclim.core.units import amount2rate

ds_in = ht.get_inputs()
ds_in["pr"] = amount2rate(ds_in["pr"])  # Precipitation in xclim needs to be a flux.

xh.utils.health_checks(ds_in, **health_checks)
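
To give an intuition for what a flag such as `values_repeating_for_n_or_more_days` looks for, the sketch below detects runs of identical consecutive values in a plain list. It is a simplified illustration assuming one value per day, not the implementation used by the health checks, and the function names are hypothetical.

```python
from itertools import groupby


def longest_repeated_run(values):
    """Length of the longest run of consecutive identical values (0 for an empty series)."""
    return max((len(list(group)) for _, group in groupby(values)), default=0)


def repeats_for_n_or_more_days(values, n=5):
    """True if any value repeats for at least n consecutive entries."""
    return longest_repeated_run(values) >= n


tasmax = [21.3, 22.0, 22.0, 22.0, 22.0, 22.0, 19.8]
flagged = repeats_for_n_or_more_days(tasmax, n=5)
print(flagged)  # True: 22.0 appears five days in a row
```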

### Executing the model

A few basic checks are performed when the `.run()` function is called, before executing the model itself. For HYDROTEL, the following checks are made:

- All files mentioned in the configuration exist.
- The meteorological dataset has the dimensions, coordinates, and variables named in its configuration file (e.g. `ERA5.nc.config`, in this example).
- The dataset has a standard calendar.
- The frequency is uniform (i.e. all time steps are equally spaced).
- The start and end dates are contained in the dataset.
- The dataset is complete (i.e. no missing values).

Only if these checks pass will the function proceed to execute the model. Note that HYDROTEL itself performs its own series of checks, which is why those in `xHydro` are kept to a minimum.

Once the model is executed, `xHydro` will automatically reformat the output NetCDF file to bring it closer to CF conventions, ensuring compatibility with other `xHydro` modules. Note that, at this time, this reformatting only supports the outgoing streamflow.
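
Three of the checks listed above (uniform frequency, coverage of the simulation period, and completeness) can be approximated on a plain Python time series as follows. This is a sketch under simplifying assumptions, not the code run by `.run()`: it uses the standard library only, and `basic_time_series_checks` is a hypothetical name.

```python
import math
from datetime import datetime, timedelta


def basic_time_series_checks(times, values, start, end):
    """Return a list of failed checks for a time series; empty if all pass."""
    failures = []
    # Uniform frequency: all consecutive time steps must be identical.
    steps = {later - earlier for earlier, later in zip(times, times[1:])}
    if len(steps) > 1:
        failures.append("time steps are not equally spaced")
    # Coverage: the requested period must fall within the data.
    if not times or times[0] > start or times[-1] < end:
        failures.append("simulation period is not covered by the data")
    # Completeness: no missing values (None or NaN).
    if any(v is None or math.isnan(v) for v in values):
        failures.append("missing values found")
    return failures


times = [datetime(1981, 1, 1) + timedelta(days=i) for i in range(365)]
values = [0.0] * 365
failures = basic_time_series_checks(
    times, values, datetime(1981, 1, 1), datetime(1981, 12, 31)
)
print(failures)  # []
```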

In [None]:
# HYDROTEL has a few specific options
help(ht.run)

In [None]:
# For the purpose of this example, we'll leave 'dry_run' as True.
print("Command that would be run in the terminal:")
ht.run(check_missing=True, dry_run=True)

In [None]:
# This is what the output would look like after reformatting (which was skipped by the dry_run argument)
ht._standardise_outputs()
ht.get_streamflow()

## Model calibration

<div class="alert alert-warning"> <b>WARNING</b>
    
Only Raven-based models are currently implemented.

</div>