In [1]:
import earthkit.hydro as ekh
import numpy as np
import xarray as xr

network = ekh.river_network.load("efas", "5", use_cache=False)

Cache disabled.


# Gridded/raster vs masked/vector river networks

By default, earthkit-hydro returns gridded outputs when possible, since these are generally easier to interpret. However, each river network also has a masked/vector representation, which returns values only for the nodes of the river network graph. This is harder to interpret but computationally more efficient, so it is useful in performance-sensitive circumstances. The behaviour can be changed via the `return_type` function argument.

The default gridded option returns a full grid, with NaNs at masked locations.

In [2]:
ekh.upstream.sum(network, np.random.rand(*network.shape), return_type="gridded")

The masked version returns values only at the river network nodes.

In [3]:
ekh.upstream.sum(network, np.random.rand(*network.shape), return_type="masked")

The difference can be clearly seen in the array shapes:

In [4]:
field = np.random.rand(*network.shape)
gridded = ekh.upstream.array.sum(network, field, return_type="gridded")
masked = ekh.upstream.array.sum(network, field, return_type="masked")
print("gridded shape:", gridded.shape, "=", "network.shape:", network.shape)
print("masked shape:", masked.shape, "=", "network.n_nodes", network.n_nodes)

gridded shape: (2970, 4530) = network.shape: (2970, 4530)
masked shape: (7446075,) = network.n_nodes 7446075
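The relationship between the two shapes can be illustrated with plain NumPy. This is only a conceptual sketch on a toy 3x4 domain: the `mask` array and the row-major node ordering are assumptions made for illustration, and it does not describe how earthkit-hydro stores or orders its nodes internally.

```python
import numpy as np

# Hypothetical toy domain: True marks grid cells that are river network nodes.
mask = np.array([
    [True, False, True, False],
    [False, True, True, False],
    [False, False, True, True],
])

# A gridded field has one value per grid cell.
gridded = np.arange(12, dtype=float).reshape(mask.shape)

# Gridded -> masked: keep only the node values
# (row-major order here, which is an assumption of this sketch).
masked = gridded[mask]

# Masked -> gridded: scatter node values back onto the grid, NaN elsewhere.
restored = np.full(mask.shape, np.nan)
restored[mask] = masked

print("masked shape:", masked.shape)      # one value per node
print("restored shape:", restored.shape)  # full grid
```

The masked array has exactly as many entries as there are `True` cells in the mask, which mirrors the `network.n_nodes` versus `network.shape` comparison above.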


Note that the input field was gridded in both cases. The field can also be supplied in masked form, and the output can still be returned as either gridded or masked.

In [5]:
ekh.upstream.sum(network, np.random.rand(network.n_nodes), return_type="gridded")
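One way to picture accepting either input layout is to normalise the field by its shape before doing any work. The helper below is a hypothetical NumPy sketch, not earthkit-hydro code: `to_masked` and the boolean `mask` are names invented for this illustration, and the library's actual handling may differ.

```python
import numpy as np


def to_masked(field, mask):
    """Hypothetical helper: return one value per node for either input layout."""
    field = np.asarray(field)
    n_nodes = int(mask.sum())
    if field.shape == mask.shape:
        # Gridded input: pick out the node cells.
        return field[mask]
    if field.shape == (n_nodes,):
        # Already masked: nothing to do.
        return field
    raise ValueError(
        f"expected shape {mask.shape} or {(n_nodes,)}, got {field.shape}"
    )


# Toy 2x2 domain with three nodes (an assumption for this example).
mask = np.array([[True, False], [True, True]])

from_grid = to_masked(np.array([[1.0, 2.0], [3.0, 4.0]]), mask)
from_nodes = to_masked(np.array([1.0, 3.0, 4.0]), mask)

print(from_grid, from_nodes)
```

Both calls yield the same per-node values, so downstream code only needs to handle a single layout.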

This also works with xarray inputs.

In [6]:
example_arr = np.random.rand(network.n_nodes)

index = np.arange(network.n_nodes)

example_da = xr.DataArray(
    example_arr,
    dims=["index"],
    coords={"index": index},
    name="precip",
    attrs={"units": "m", "description": "Sample precipitation data"},
)

ekh.upstream.sum(network, example_da, return_type="gridded")

Changing the `return_type` in each function call can be cumbersome, so a default can be set for the river network object itself.

In [7]:
print("default is gridded: ", ekh.upstream.sum(network, np.random.rand(*network.shape)).shape)
network.set_default_return_type("masked")
print("default is now masked: ", ekh.upstream.sum(network, np.random.rand(*network.shape)).shape)

default is gridded:  (2970, 4530)
default is now masked:  (7446075,)
