In [1]:
from ugrid import UGrid
import netCDF4 as nc
from pathlib import Path
from sys import getsizeof

nc_path = Path(r"../data/dellen_map.nc")

As a user I would like to query the (max) values at a limited number of locations (< 1000) for hydrological variables such as water level and discharge. I know the node ids of these locations; later I will probably know their x and y coordinates instead. A developer can expect models to contain as many nodes as DHYDRO can compute (e.g. 1E6) and dense time series (e.g. 1000 timesteps per simulation).

I tested two approaches for reading a DHYDRO NetCDF file:
1. Using Deltares UGrid: https://github.com/Deltares/UGridpy
2. Using netCDF4: https://unidata.github.io/netcdf4-python/

For every approach I show the info I can get from each module:
* UGrid gives me a flat (1D) array of a multi-dimensional variable.
* netCDF4 gives me the "metadata" of the variable; the variable itself is lazy-loaded. This metadata contains handy info (e.g. the dimension variables) that I can use to slice and transform the data.

I checked the size and performance. With UGrid I can only read the entire array into memory, so it needs more time and memory to load one variable. For this simple model (653 nodes, 145 timesteps) I do not run into a wall. But be aware: a model with 1E6 nodes and 1000 timesteps will consume 8 GB of memory for loading one variable (1E6 x 1000 values of 8 bytes each, since the data is float64).
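The memory figures above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the variable is stored as float64 (8 bytes per value) with shape (timesteps, nodes); `estimate_bytes` is a hypothetical helper, not part of UGrid or netCDF4.

```python
def estimate_bytes(n_nodes, n_timesteps, bytes_per_value=8):
    """Raw size of a fully loaded (time, node) array, excluding object overhead."""
    return n_nodes * n_timesteps * bytes_per_value


# The test model used in this notebook: 653 nodes, 145 timesteps.
small_model_bytes = estimate_bytes(653, 145)

# A large model as described above: 1E6 nodes, 1000 timesteps.
large_model_bytes = estimate_bytes(10**6, 1000)

print(small_model_bytes)                    # 757480
print(f"{large_model_bytes / 1e9:.1f} GB")  # 8.0 GB
```

The small-model estimate of 757480 bytes is close to what `getsizeof` reports for the UGrid array in the cell below; the small remainder is the overhead of the array object itself.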

netCDF4 also outperforms UGridpy in terms of user-friendliness. As shown below, I can see the dimension variables (time and mesh1d_nNodes), which I can use to conveniently slice the data and load only what I need.

In [2]:
%%time
with UGrid(str(nc_path), "r") as ug:
    ug_level = ug.variable_get_data_double("mesh1d_s1")

print(getsizeof(ug_level))
print(ug_level)

757592
[-1.57       -2.15       -0.54       ... -3.10330762 -3.10849061
 -3.11363585]
Wall time: 10.6 ms


In [3]:
%%time
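# Opening the dataset does not load any variable data yet.
ds = nc.Dataset(nc_path)
# Indexing by name returns a lazy Variable handle, not the array itself.
ds_level = ds["mesh1d_s1"]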

print(getsizeof(ds_level))
print(ds_level)

200
<class 'netCDF4._netCDF4.Variable'>
float64 mesh1d_s1(time, mesh1d_nNodes)
    mesh: mesh1d
    location: node
    coordinates: mesh1d_node_x mesh1d_node_y
    standard_name: sea_surface_height
    long_name: Water level
    units: m
    grid_mapping: 
    _FillValue: -999.0
unlimited dimensions: time
current shape = (145, 653)
filling on
Wall time: 3.99 ms
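Because the variable is laid out as (time, mesh1d_nNodes), the use case from the introduction (max values at a limited set of known node ids) could be sketched as below. This is only an illustration under stated assumptions: `max_at_nodes` is a hypothetical helper, the node ids are taken to be zero-based positions along mesh1d_nNodes, and a small NumPy array stands in for the file so the block runs on its own. A netCDF4 variable such as `ds["mesh1d_s1"]` should accept the same `variable[:, idx]` indexing and then read only the requested columns from disk.

```python
import numpy as np


def max_at_nodes(variable, node_ids):
    """Return {node_id: maximum over time} for a (time, node) variable."""
    # Sorting and de-duplicating the indices is a precaution (assumption):
    # it keeps the index list valid for both NumPy arrays and netCDF4 variables.
    idx = sorted({int(i) for i in node_ids})
    # Take all timesteps, but only the requested node columns.
    subset = variable[:, idx]
    # Reduce over the time axis (axis 0), leaving one value per node.
    maxima = subset.max(axis=0)
    return {node: float(value) for node, value in zip(idx, maxima)}


# Stand-in data: 3 timesteps x 4 nodes, instead of the real file.
levels = np.arange(12, dtype=float).reshape(3, 4)
result = max_at_nodes(levels, [2, 0])
print(result)  # {0: 8.0, 2: 10.0}
```

With the real file, the call would look like `max_at_nodes(ds["mesh1d_s1"], my_node_ids)`, where `my_node_ids` is a placeholder for the user's list of locations. For fewer than 1000 nodes this keeps the loaded array small even when the full variable would not fit in memory.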
