# Distributed hydrological modelling

## Using Ravenpy to build a distributed hydrological model

In this notebook, we demonstrate how to build a distributed hydrological model using Raven and the "Routing product" generated by BasinMaker, a database of subbasins and of how they link to one another in a river network. Currently, the Routing product is only available for North American catchments. If it becomes available for other regions, the same setup should apply to them with minimal changes.

In [1]:
# Import the list of possible model templates for distributed hydrological modelling
from ravenpy.new_config.emulators import (
    GR4JCN,
    HBVEC,
    HMETS,
    HYPR,
    SACSMA,
    Blended,
    CanadianShield,
    Mohyse,
)

In [2]:
import datetime as dt
import tempfile
from pathlib import Path

import matplotlib.pyplot as plt

from ravenpy import Emulator
from ravenpy.extractors.new_config.routing_product import (
    BasinMakerExtractor,
    GridWeightExtractor,
    open_shapefile,
    upstream_from_coords,
)
from ravenpy.new_config import commands as rc
from ravenpy.new_config.emulators import GR4JCN
from ravenpy.utilities.testdata import get_file, get_local_testdata

tmp_path = Path(tempfile.mkdtemp())

In the next step, we get the Routing product file for our catchment. Routing product files can be downloaded here: http://hydrology.uwaterloo.ca/basinmaker/download_regional.html

In [3]:
%%capture --no-display
# Get path to pre-downloaded BasinMaker Routing product database for our catchment
shp_path = get_file("basinmaker/drainage_region_0175_v2-1/finalcat_info_v2-1.zip")

# Note that for this to work, the coordinates must fall within the small
# BasinMaker example region (drainage_region_0175)
df = open_shapefile(shp_path)

# Gauge station for observations at Matapedia
# SubId: 175000128
# -67.12542 48.10417
sub = upstream_from_coords(-67.12542, 48.10417, df)

# Extract the subbasins and HRUs (one HRU per sub-basin)
bm = BasinMakerExtractor(
    df=sub,
    hru_aspect_convention="ArcGIS",
)

# Get the .rvh file that we will provide to the config and that links HRUs/subbasins to the river network
rvh = bm.extract()
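
To make the idea of a linked subbasin database concrete, here is a minimal, self-contained sketch of an upstream selection on a toy network. It does not use ravenpy and is not how `upstream_from_coords` is implemented. It assumes each subbasin record carries its own identifier and the identifier of the subbasin it drains into (named `SubId` and `DowSubId` in the comments, with `-1` marking an outlet). The table `TOY_NETWORK` and the function `upstream_subbasins` are hypothetical names used for illustration only.

```python
from collections import defaultdict

# Hypothetical toy table: each row is (SubId, DowSubId).
# A DowSubId of -1 marks a subbasin with no downstream neighbour (assumed convention).
TOY_NETWORK = [
    (1, 3),
    (2, 3),
    (3, 5),
    (4, 5),
    (5, -1),
    (6, -1),
]


def upstream_subbasins(rows, outlet_id):
    """Return the sorted ids of outlet_id and of every subbasin draining into it."""
    # Invert the "drains into" relation: downstream id -> list of direct contributors.
    contributors = defaultdict(list)
    for sub_id, down_id in rows:
        contributors[down_id].append(sub_id)

    # Depth-first walk from the outlet towards the headwaters.
    selected = set()
    stack = [outlet_id]
    while stack:
        current = stack.pop()
        if current in selected:
            continue
        selected.add(current)
        stack.extend(contributors.get(current, []))
    return sorted(selected)


print(upstream_subbasins(TOY_NETWORK, 3))
print(upstream_subbasins(TOY_NETWORK, 5))
```

Selecting subbasin 3 as the outlet keeps only the subbasins that drain to it, which is the same kind of subset that is passed to `BasinMakerExtractor` as `sub` in the cell above.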

Now that the HRUs and river network are set up, let's get the hydrometeorological data. We first get the streamflow observations and then do the same for the weather data. You can provide your own data for your own catchments; here we use our test datasets to keep things tidy.

In [4]:
# Streamflow observations file
qobs_fn = get_file("matapedia/Qobs_Matapedia_01BD009.nc")

# Make an observation gauge from the observed streamflow
qobs = rc.ObservationData.from_nc(qobs_fn, alt_names=("discharge",))

Now prepare the meteorological data using the Gauge format. The dataset combines several stations, so we iterate over them and build one Gauge object per station:

In [5]:
# TODO: SKIP THIS FOR NOW.

# Meteo observations file
meteo_grid_fn = get_local_testdata("matapedia/Matapedia_meteo_data_2Db.nc", "./")
# Dict of grid weight (GW) attributes
gw = GridWeightExtractor(
    meteo_grid_fn,
    shp_path,
    dim_names=("longitude", "latitude"),
    var_names=("longitude", "latitude"),
    gauge_ids=[
        "01BD009",
    ],
).extract()

# Write GW command to file
gw_fn = tmp_path / "gw.txt"
gw_fn.write_text(rc.GridWeights(**gw).to_rv())

alt_names = {"TEMP_MIN": "TEMP_MIN", "TEMP_MAX": "TEMP_MAX", "PRECIP": "PRECIP"}

meteo_forcing_grid = [
    rc.GriddedForcing.from_nc(
        meteo_grid_fn, dtyp, alt_names=(alias,), grid_weights=gw_fn
    )
    for (dtyp, alias) in alt_names.items()
]
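
As a rough illustration of what grid weights represent, the sketch below computes, for each HRU, the share of its area that falls in each grid cell, so that the weights of one HRU sum to 1. This is a simplified stand-in and not the `GridWeightExtractor` implementation: it assumes axis-aligned rectangles instead of real polygons, and the names `CELLS`, `HRUS`, `overlap_area` and `grid_weights` are hypothetical.

```python
def overlap_area(a, b):
    """Area of the intersection of two boxes given as (xmin, ymin, xmax, ymax)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)


def grid_weights(hrus, cells):
    """Return (hru_id, cell_id, weight) triples; weights of each HRU sum to 1."""
    weights = []
    for hru_id, hru_box in hrus.items():
        areas = {
            cell_id: overlap_area(hru_box, cell_box)
            for cell_id, cell_box in cells.items()
        }
        total = sum(areas.values())
        if total == 0:
            # This HRU does not intersect the grid at all; skip it.
            continue
        for cell_id, area in areas.items():
            if area > 0:
                weights.append((hru_id, cell_id, area / total))
    return weights


# Hypothetical 2 x 2 grid of unit cells.
CELLS = {
    0: (0.0, 0.0, 1.0, 1.0),
    1: (1.0, 0.0, 2.0, 1.0),
    2: (0.0, 1.0, 1.0, 2.0),
    3: (1.0, 1.0, 2.0, 2.0),
}

# Hypothetical HRUs, each straddling cells 0 and 1 in different proportions.
HRUS = {
    1: (0.5, 0.0, 1.5, 1.0),
    2: (0.75, 0.0, 1.75, 1.0),
}

for hru_id, cell_id, weight in grid_weights(HRUS, CELLS):
    print(hru_id, cell_id, weight)
```

With weights of this kind, a gridded forcing value for an HRU is the weighted average of the values in the cells it overlaps.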

In [6]:
# Meteo observations file
meteo_grid_fn = get_file("matapedia/Matapedia_meteo_data_stations.nc")

# Alternate names for variables in the files
alt_names = {
    "TEMP_MIN": "tmin",
    "TEMP_MAX": "tmax",
    "PRECIP": "pr",
}

# Make virtual Gauges
meteo_forcing_stations = [
    rc.Gauge.from_nc(
        meteo_grid_fn,
        data_type=alt_names.keys(),
        station_idx=i + 1,
        alt_names=alt_names,
    )
    for i in range(6)  # Since we have 6 stations
]

Now that we have the data, we can run the distributed model as usual:

In [7]:
%%capture --no-display

# Prepare the model configuration
model_config = GR4JCN(
    params=[0.529, -3.396, 407.29, 1.072, 16.9, 0.947],
    StartDate=dt.datetime(1982, 1, 1),
    Duration=15,
    ObservationData=[qobs],
    Gauge=meteo_forcing_stations,
    **rvh,
)

# Run the model with the configuration we just built
distributed_outputs = Emulator(model_config, workdir=tmp_path).run(overwrite=True)

OSError: Raven segfaulted : 
============================================================
                        RAVEN                               
 a robust semi-distributed hydrological modelling framework 
    Copyright 2008-2023, the Raven Development Team 
                    Version 3.6 w/ netCDF
                BuildDate Mar 25 2023
============================================================
Generating Master Parameter List...
Autocalculating Model Parameters...
...done Autocalculating.
Checking for Required Model Parameters...
...Done Checking


Explore the results as you would for any other model. This time, however, the output contains several gauges, because the Routing product already includes some. We want the data for the first gauge:

In [8]:
# Show the hydrographs object
display(distributed_outputs.hydrograph)

NameError: name 'distributed_outputs' is not defined

In [None]:
# Plot the resulting streamflow
distributed_outputs.hydrograph.q_sim.isel(nbasins=0).plot.line(
    x="time", label="Distributed model", color="blue", lw=1.5
)
plt.legend()