Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Smart extraction / interpolation of points from large dfsu #482

Open
Hendrik1987 opened this issue Nov 23, 2022 · 4 comments
Open

Smart extraction / interpolation of points from large dfsu #482

Hendrik1987 opened this issue Nov 23, 2022 · 4 comments
Labels
enhancement New feature or request FAQ

Comments

@Hendrik1987
Copy link
Contributor

Hendrik1987 commented Nov 23, 2022

I still struggle to extract data from larger-then-memory .dfsu files.

Or maybe I am just not aware of the newest way to do this in mikeio?

What I have tried / done so far:

  1. read only first timestep of .dfsu
  2. identify n nearest elements
  3. read all data only for the identified elements
dfs = mikeio.open(fn)
mds0 = dfs.read(time=0)
elements = mds0.geometry.find_nearest_elements(x=pts.x, y=pts.y, n_nearest=3)
elements_load = np.unique(elements)
mds = dfs.read(elements=elements_load)
mds[0].plot()

image

  1. loop over the query points and interpolate from n nearest elements:
mdsi =  [mds.interp(x=x, y=y, n_nearest=3) for x, y in zip(pts.x, pts.y)]

image

What I think could be enhanced:

  1. Make this (or a better) workflow easily available, maybe via mikeio.read(x, y, n_nearest = 3) or dfs.extract_stations() as sister to dfs.extract_track()?
  2. Is there a way to then concat the mikeio.Datasets in the list mdsi, maybe resulting in a GeometryPoint2DCollection geometry?
  3. How do I write the extracted points to .dfs, while maintain geometry information?

Possibly related to fmskill enhancement

  • Here I also struggle to extract from large .dfsu model results. Maybe some of the same logic applies here?
  • ModelResult should be able to mikeio.Dataset/DataArray (Longlist of new ideas modelskill#132)
@Hendrik1987 Hendrik1987 added enhancement New feature or request FAQ labels Nov 23, 2022
@ecomodeller
Copy link
Member

  1. No, and I'm not sure it is a good idea to create such a geometry, since the existing geometries Grid1D, Grid2D, Grid3D, GeometryFM, GeometryFM3D, corresponds to different types of dfs files. I don't think we have an appropriate file format for a GeometryPoint2DCollection.
  2. It is possible to store the location for each item in a Dfs0 file (see mikecore example), but it is still not as flexible as a NetCDF, where you could have two dimensions (station, time) since there is no way of storing the name of the station other than in the name of the item e.g. "Buoy 2: Sign. Wave Height" or other station specific metadata.

@Hendrik1987
Copy link
Contributor Author

Hendrik1987 commented Nov 23, 2022

  1. And what about this point... ? :-)
  2. I see and understand the motivation.
  3. Yes, the "Meta_data: Parameter" is also the work around used when outputting .dfs0 directly from the MIKE model. I don't think it is a nice solution and maybe the .nc or the .dfs0 + .json or similar is the better choice.
    However, at the moment the GeometryPoint2D coordinates seem to be lost on .to_xarray(). So also after conversion I can currently not easily concat the xarray.DataSets.
    image

@ecomodeller
Copy link
Member

3, See PR #483 should take you one step closer to being able to concatenate data

@Hendrik1987
Copy link
Contributor Author

Hendrik1987 commented Nov 24, 2022

Thanks, that is a fix for 3, @ecomodeller

Taking a step back here, what I generally try to achieve is an interpolation along a new dimension. To do the same on a structured NetCDF file I would use this xarray equivalent.

In my special case I have a list of x, y coordinates and my new dimension would be points. And all the steps above are only my current solution, also handling a larger-than-memory dfsu file.

I understand the motivation to keep geometry types and .dfs files aligned.

Is the conclusion then to work on a dfsu.extract_points((x, y), n_nearest) -> xr.Dataset, where (x, y) can be a list or array of coordinates? (or maybe a df, allowing to specify further coordinates such as station_name)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request FAQ
Projects
None yet
Development

No branches or pull requests

2 participants