support forcing without exact model grid definition #162

Many wflow applications use forcing (precipitation, potential evapotranspiration and temperature) at a (much) coarser resolution than the actual wflow model grid. Supporting regridding of forcing during a wflow run (in memory) could be a nice addition to wflow for these kinds of applications.

Comments
I would rephrase "nice addition" to something imperatively important where it concerns very large runs :-)
I suppose that if we set up the cell-to-cell mapping once at the beginning and reuse it, it doesn't have to add too much runtime overhead. Though it would be good to refine a bit what exactly is needed.
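To make the idea of a precomputed mapping concrete, here is a minimal sketch in plain Julia (not Wflow.jl code; the grids and variable names are toy assumptions): the nearest forcing cell for every model cell is found once, and each timestep then only needs a single indexing operation.

```julia
# Illustrative only: nearest-neighbour index mapping from a coarse forcing grid
# to a finer model grid, built once and reused every timestep.

# toy cell-centre coordinates; the forcing grid is 10x coarser than the model grid
x_model, y_model     = 0.005:0.01:0.995, 0.005:0.01:0.995
x_forcing, y_forcing = 0.05:0.1:0.95, 0.05:0.1:0.95

# one-off setup: for every model cell, the CartesianIndex of the nearest forcing cell
mapping = [CartesianIndex(argmin(abs.(x_forcing .- xm)), argmin(abs.(y_forcing .- ym)))
           for xm in x_model, ym in y_model]

# every timestep: a single indexing operation, no search
precip_coarse = rand(length(x_forcing), length(y_forcing))  # stand-in for forcing read from file
precip_model  = precip_coarse[mapping]
```

For realistic grids the mapping would be built from the actual cell centres of the forcing and model files, but the per-timestep cost stays a simple gather.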
I agree, some additional info about the requirements would be good. @hcwinsemius or @markhegnauer, could you please add that (see also the questions from @visr)? For now, I would say this is only for forcing; I don't see much added value in implementing this for staticmaps.
Agreed, for now this is only needed for forcing. A simple interpolation (nearest) would already be very useful. In the future, reprojection would be a nice-to-have.
And what about using a lapse rate to correct temperature with the DEM?
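For context, the lapse-rate idea boils down to a simple elevation correction. A hedged sketch (all arrays and values below are toy assumptions; in practice the forcing temperature and orography would first be mapped onto the model grid, e.g. with the nearest-neighbour mapping above):

```julia
# toy inputs, already on the model grid, purely for illustration
dem_model            = 500.0 .+ 100.0 .* rand(100, 100)  # model-resolution DEM [m]
dem_forcing_on_model = fill(450.0, 100, 100)             # forcing orography mapped to the model grid [m]
temp_on_model        = fill(288.0, 100, 100)             # forcing temperature mapped to the model grid [K]

lapse_rate = -0.0065  # K per m, the commonly used environmental lapse rate

# correct temperature for the elevation difference between the model DEM and the forcing orography
temp_model = temp_on_model .+ lapse_rate .* (dem_model .- dem_forcing_on_model)
```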
Just to butt in, uninvited (thanks to GitHub notifications): I'm rather confident that in-memory runtime regridding is a poisoned chalice. I feel confident because in-memory runtime regridding has been a staple of iMODFLOW models for as long as I've used them, and I've honestly always hated it. The primary reason is that when debugging your model, it's no longer possible to inspect your model input directly, and it gives me a distinct feeling of lacking control -- maybe this is just a personal defect. Secondly, as Martijn mentions, it can also quickly spiral out of control in terms of complexity, especially if you consider your staticmaps. In this case, the appropriate regridding method depends on the forcing or parameter in question (with many possible valid schemes: nearest, mean, median, max, min, sum, first-order conserving, etc., etc.; all area-weighted or not). Automatic reprojection is probably ten times worse due to the generally poor understanding of coordinate systems and incomplete CRS specification by users. They will get confused, and they will blame you. In comparison, if Wflow expects a single set of coordinates, a user can simply look for the odd one out and fix it. Note that this is also a form of feature creep, which has nasty implications if you want to support unstructured topologies in the longer term. Regridding rasters is quite straightforward, but doing it efficiently for triangular meshes (or other convex cell types) is not (see https://github.com/deltares/numba_celltree). I see two reasons why this is desirable (at first glance):
It's also worth noting there are a number of exciting developments in compression schemes: with sufficient CPUs, reading and decompressing compressed data can actually be faster than reading uncompressed data, because memory access is relatively slow compared to how fast CPUs have become. Zarr is an ND array storage convention using Blosc, and there is some Zarr support in the netCDF library. I'd recommend:
On a slightly longer term:
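To make the compression point a bit more tangible, here is a small sketch of writing a chunked, deflate-compressed variable with NCDatasets.jl (file name, dimension sizes, chunking and compression level are arbitrary choices for the example; a Zarr store with Blosc would be set up in a similar spirit):

```julia
using NCDatasets

# toy example: a chunked + compressed forcing-like variable
ds = NCDataset("precip_example.nc", "c")
defDim(ds, "x", 100)
defDim(ds, "y", 100)
defDim(ds, "time", 24)

# one chunk per timestep, deflate compression; shuffle usually helps for float data
precip = defVar(ds, "precip", Float32, ("x", "y", "time");
                deflatelevel = 4, shuffle = true, chunksizes = [100, 100, 1])

precip[:, :, :] = rand(Float32, 100, 100, 24)
close(ds)
```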
Hi @Huite, thanks for your input! Just to give you an idea, we're talking about 10,000 years of simulation for the whole of Europe at 1x1 (or 2x2) km2. Based on a very rough estimation using some input data for the Meuse catchment, I estimated we would need around 50+ TB of storage. An in-between solution could also be regridding using hydromt directly in the workflow. Next to the project's need, I do see added value in this as an easy step-in for new wflow users. @verseve, good point about lapse-rate correction. This could maybe be added by supplying a lapse-rate correction grid in the staticmaps, as was also available (but not working) in the Python code.
@Huite: yes, indeed, thanks for your input and recommendations. I share your concerns. In my view staticmaps is out of scope: this belongs to HydroMT, is applied mostly to higher-resolution data than a typical wflow model resolution, and makes use of parameter-specific upscale operators. For the forcing part, I am not sure about the added value (the reason I wrote "could be a nice addition"). In the process of converting Wflow Python to Julia we have stripped away most pre-processing (e.g. calculation of slope from the DEM), so there is a clear boundary between data and computations. Also, once we start with this (simple at first), we can most probably expect requests to add different regridding methods, reprojection, etc., all functionality that is already part of HydroMT. A solution where you indeed use the forcing part of HydroMT directly (in the cloud?) in a workflow is most probably more practical. You could also do this in a loop (say per 100 years) to reduce the data storage. As an easy step-in for new wflow users, I think having examples of how to run HydroMT without the Deltares project drive data is probably more beneficial.
I'm in favor of this feature request, but it is indeed good to consider its scope (e.g. only downscaling of forcing with nearest interpolation, no reprojection, etc.). Such a feature is also available in other models like SFINCS and CaMa-Flood, and I'm sure many more. Just two small remarks regarding lapse rates based on the current hydroMT workflows.
An alternative solution could be to use BMI to pass forcing to the model and downscale it on the fly (with hydroMT). This would not require additional methods in wflow and is likely slower and definitely more complex, but it does come with greater flexibility in the choice of downscaling methods. It could be implemented by making a BMI model adapter on top of an xarray dataset object.
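To sketch the general shape of such a BMI-driven loop on the Wflow.jl side (the downscaling itself would happen externally, e.g. in hydroMT): the TOML path, the exchange variable name "vertical.precipitation" and the constant stand-in forcing below are assumptions for illustration, not a tested recipe; check the Wflow.jl BMI documentation for the variables your model version actually exposes.

```julia
using Wflow
import BasicModelInterface as BMI

# initialize a Wflow model through BMI (placeholder configuration path)
model = BMI.initialize(Wflow.Model, "path/to/sbm_config.toml")

# size of the precipitation exchange item (one value per active model cell)
grid = BMI.get_var_grid(model, "vertical.precipitation")
n    = BMI.get_grid_size(model, grid)

while BMI.get_current_time(model) < BMI.get_end_time(model)
    # stand-in for forcing downscaled onto the model grid by an external tool;
    # here simply a constant precipitation rate for every active cell
    precip = fill(2.0, n)

    BMI.set_value(model, "vertical.precipitation", precip)
    BMI.update(model)
end

BMI.finalize(model)
```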
@DirkEilander: I think if you break up the problem into time slices of ~100 years, you have an alternative solution that does not require the use of BMI.
Wflow is a computational engine for hydrology and thus does not focus on pre- or post-processing of input data (forcing, parameters) and output data. We're happy to leave that part to other software developments such as HydroMT, and we want to keep that boundary very clear. Furthermore, there are alternative solutions available for this specific problem (very long runs, large data storage), mentioned in some of the comments here.
So, the conclusion is that we will not support this kind of functionality in Wflow. |
For the lazy reader, to enumerate these solutions in a single list:

1. Regrid the forcing beforehand with HydroMT as part of the workflow, possibly in time slices (say ~100 years) to limit data storage.
2. Pass the forcing to the model through BMI and downscale it on the fly (e.g. with HydroMT).
3. Write custom run logic on top of Wflow.jl and handle the forcing there.
The last option also provides a natural proving ground for new features. If after some time a significant fraction of use is via custom run logic, it stands to reason to then consider merging that logic into Wflow.jl proper.
For custom run loops, point 3, it comes down to essentially copy-pasting the run function from here and making the changes you need. It would be good to document this, and perhaps we can also make it a bit easier still. It works quite well; I've already used it, for instance, to create a Makie plot that is updated on each timestep to visually show the state of the model as it runs. That is an example of something we don't want by default in the computational core (heavy plotting dependencies), but that can be useful for some applications.
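As a rough illustration of what such custom run logic can look like, here is a sketch that drives the model through BMI instead of the internal run function and collects a basin-average state each timestep; the place where the state is read is also where e.g. a Makie plot could be updated. The TOML path and the variable name "vertical.satwaterdepth" are placeholders, and get_value_ptr is assumed to be available through Wflow's BMI implementation; the actual run function in the Wflow.jl source remains the better template if you need full control.

```julia
using Wflow
import BasicModelInterface as BMI
using Statistics: mean

model = BMI.initialize(Wflow.Model, "path/to/sbm_config.toml")  # placeholder path

mean_storage = Float64[]  # custom per-timestep diagnostic

while BMI.get_current_time(model) < BMI.get_end_time(model)
    BMI.update(model)

    # custom logic goes here: read (a view of) a model state and record its mean;
    # this is also where a plot or extra output writer could be updated
    storage = BMI.get_value_ptr(model, "vertical.satwaterdepth")
    push!(mean_storage, mean(storage))
end

BMI.finalize(model)
```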