Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regridding with iris.analysis.UnstructuredNearest does not preserve dtype and masks #4463

Closed
schlunma opened this issue Dec 16, 2021 · 10 comments · Fixed by #5062
Closed

Regridding with iris.analysis.UnstructuredNearest does not preserve dtype and masks #4463

schlunma opened this issue Dec 16, 2021 · 10 comments · Fixed by #5062

Comments

@schlunma
Copy link
Contributor

schlunma commented Dec 16, 2021

🐛 Bug Report

Hi guys, I found two issues with regridding using iris.analysis.UnstructuredNearest:

  1. The data type of the input cube is not preserved, the output type is always float64.
  2. The output is not masked, i.e., the output data is a regular array (and not a masked array) that contains values of the fill_value.

How To Reproduce

Minimal working example using this dataset (this is an nc file, but GitHub does not allow nc files): test.nc.txt

Test dataset
netcdf test.nc {
dimensions:
        time = 1 ;
        dim1 = 10 ;
        bnds_3 = 3 ;
variables:
        float tas(time, dim1) ;
                tas:long_name = "temperature in 2m" ;
                tas:units = "K" ;
                tas:coordinates = "clat clon" ;
        double time(time) ;
                time:standard_name = "time" ;
                time:invalid_units = "day as %Y%m%d.%f" ;
        float clat(dim1) ;
                clat:bounds = "clat_bnds" ;
                clat:units = "degrees_north" ;
                clat:standard_name = "latitude" ;
                clat:long_name = "center latitude" ;
        float clat_bnds(dim1, bnds_3) ;
        float clon(dim1) ;
                clon:bounds = "clon_bnds" ;
                clon:units = "degrees_east" ;
                clon:standard_name = "longitude" ;
                clon:long_name = "center longitude" ;
        float clon_bnds(dim1, bnds_3) ;

// global attributes:
                :CDI = "Climate Data Interface version 1.8.3rc (http://mpimet.mpg.de/cdi)" ;
                :CDI_grid_type = "unstructured" ;
                :uuidOfHGrid = "e941b1d0-ab58-11e8-ba4f-bdd1e82b9e6d" ;
                :Conventions = "CF-1.7" ;
data:

 tas =
  _, 293.6598, 292.4131, _, _, _, _, _, 291.7204, 291.7908 ;

 time = 20051001 ;

 clat = 52.61851, 52.93201, 52.93201, 51.98309, 53.88348, 53.56083, 54.50582, 
    53.56083, 51.97181, 52.60703 ;

 clat_bnds =
  53.25029, 52.29887, 52.29887,
  52.29887, 53.23846, 53.25029,
  53.25029, 53.23846, 52.29887,
  52.29887, 51.34985, 52.29887,
  54.19249, 54.19249, 53.25029,
  53.25029, 53.23846, 54.19249,
  54.19249, 55.13753, 54.19249,
  54.19249, 53.23846, 53.25029,
  52.27613, 51.33891, 52.29887,
  52.29887, 53.23846, 52.27613 ;

 clon = 73, 73.91156, 72.08844, 73, 73, 73.92289, 73, 72.07711, 71.22221, 
    71.20061 ;

 clon_bnds =
  73, 72.10562, 73.89438,
  73.89438, 74.83416, 73,
  73, 71.16584, 72.10562,
  72.10562, 73, 73.89438,
  73.94114, 72.05886, 73,
  73, 74.83416, 73.94114,
  73.94114, 73, 72.05886,
  72.05886, 71.16584, 73,
  70.31771, 71.25498, 72.10562,
  72.10562, 71.16584, 70.31771 ;
}
# Regrid unstructured grid

import numpy as np
import iris
import iris.coords
import iris.cube
from iris.analysis import UnstructuredNearest

cube = iris.load_cube('test.nc')

scheme = UnstructuredNearest()
target_lat = iris.coords.DimCoord([50.0, 52.0, 54.0], standard_name='latitude', units='degrees')
target_lon = iris.coords.DimCoord([70.0, 71.0, 72.0], standard_name='longitude', units='degrees')
target_grid = iris.cube.Cube(np.full((3, 3), 0.0), dim_coords_and_dims=[(target_lat, 0), (target_lon, 1)])

regridded_cube = cube.regrid(target_grid, scheme)

print("dtype of original cube:", cube.dtype)
print("dtype of regridded cube:", regridded_cube.dtype)
print("")

print("data:")
print(regridded_cube.data)
print("data is masked?", np.ma.is_masked(regridded_cube.data))

print("")
print("Original fill value:", cube.data.fill_value)

gives

dtype of original cube: float32
dtype of regridded cube: float64

data:
[[[2.91720398e+02 2.91720398e+02 2.91720398e+02]
  [2.91720398e+02 2.91720398e+02 2.91720398e+02]
  [9.96920997e+36 9.96920997e+36 9.96920997e+36]]]
data is masked? False

Original fill value: 9.96921e+36

Expected behaviour

The dtype should be preserved, i.e., in the example both dtypes should be float32 and the data should be masked correctly, i.e., it should look like this:

[[[2.91720398e+02 2.91720398e+02 2.91720398e+02]
  [2.91720398e+02 2.91720398e+02 2.91720398e+02]
  [      --             --             --      ]]]

Environment

  • OS & Version: openSUSE Tumbleweed
  • Iris Version: 3.1.0

I know that you are currently refactoring the support of unstructured grids. I'm not entirely sure if this also includes new regridders or not; if this is the case and you are in the middle of rewriting this regridder, please ignore this issue.

Thanks for your help 👍

@bjlittle
Copy link
Member

bjlittle commented Dec 17, 2021

Hey @schlunma,

Great to hear from you and thanks for taking the time to raise this issue 👍

Note that, if you're interested in regridding with unstructured grids, I'd highly recommend that you take a look at https://github.com/SciTools-incubator/iris-esmf-regrid

We're actively developing this package in conjunction with our current UGRID work, and @stephenworsley (lead developer) would be super keen for feedback... He'd also be on hand to help you with using iris-esmf-regrid and introducing you to that world.

Note that, like all our regridders, it plums straight into iris using the regrid/interpolate scheme pattern... give it a whirl.

It's also available on both conda-forge and pypi 👍

@bjlittle bjlittle added this to Backlog in Iris v3.2.0 via automation Dec 17, 2021
@bjlittle bjlittle added this to the v3.2.0 milestone Dec 17, 2021
@schlunma
Copy link
Contributor Author

Thanks for the quick answer @bjlittle!

iris-esmf-regrid looks super interesting, I will definitely have a look at that. One question on this: is there any documentation available for it? I couldn't find it on the GitHub page. Thanks!!

@bjlittle
Copy link
Member

bjlittle commented Dec 17, 2021

No worries, glad to help.

It's a great question, thanks for asking. The simple answer is no, not yet... It's that new! 😉

Although I'm sure @stephenworsley has plans in the pipeline. Why not create an issue on iris-esmf-regrid asking for some rudimentary docs. If you want it, so will others, so let's make sure that happens.

Perhaps @stephenworsley could provide you with some code examples in the meantime.

What do you think @stephenworsley ? What's the best way for @schlunma to spin up and start playing ? 🤔

@jonseddon
Copy link
Contributor

Rudimentary docs would be really useful for me too please and so I've created SciTools-incubator/iris-esmf-regrid#139

@bjlittle
Copy link
Member

Note that, at the moment iris-esmf-regrid only supports area weighted regridding from lat/lat to unstructured, and vice versa... Although @stephenworsley is more qualified to comment on that

@schlunma
Copy link
Contributor Author

schlunma commented Dec 17, 2021

Thanks for opening the issue @jonseddon!

Note that, at the moment iris-esmf-regrid only supports area weighted regridding from lat/lat to unstructured, and vice versa... Although @stephenworsley is more qualified to comment on that

That sounds great for a start! I actually have a use case where area-weighted regridding from unstructured grids to lat/lon could come in handy 👍

@stephenworsley
Copy link
Contributor

So currently iris-esmf-regrid only handles area weighted regridding. This is not applicable in all cases since it requires bounds information to be present on the source and target cubes, though if you do have compatible cubes you should find it useful. With that said, ESMF itself does provide good support for nearest neighbour regridding and I think it would be worthwhile to eventually add a nearest neighbour regridder to iris-esmf-regrid. I imagine this may potentially be able to cover a wider range of types of nearest neighbour regridding than iris currently handles (e.g. distinguishing between nearest-source-to-destination and nearest-destination-to-source regridding) so feel free to add an issue to iris-esmf-regrid if you feel like you need this functionality.

@stephenworsley
Copy link
Contributor

Also, I'm interested to know what you believe the correct handling of masks should be. Comments in the code suggest that the current handling is not correct:

# This is **not** proper mask handling, because we cannot produce a
# masked result, but it ensures we use a "filled" version of the
# input in this case.

Though I could think of different behaviours which could claim to be "more correct".

  1. Masking the output when the nearest input is masked
  2. Taking the value of the nearst unmasked input.

I would imagine that the first of these would be easier to implement, especially when additional dimensions are involved. It may also be more appropriate in certain cases, though I could imagine there may also be cases where the second case would be more appropriate. I'd have to do some more research into existing nearest neighbour regridders like ESMF to see which is "standard".

@schlunma
Copy link
Contributor Author

Hmm...that's a tricky question. For me both options make sense, but I tend to agree more with option 1.

From what I can tell from the code (filling the data before interpolation) I guess you are kind of using option 1 at the moment without applying a mask afterwards but just leaving the fill values as is, right?

@stephenworsley
Copy link
Contributor

That appears to be right as far as I can tell, yes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Status: Done
Development

Successfully merging a pull request may close this issue.

6 participants