Keep xarray attributes and dtype after regridding #66

Open
raspstephan opened this issue Oct 3, 2019 · 3 comments

@raspstephan
Contributor

Currently, regridding seems to delete the attributes of the original dataset. I assume this happens during xr.apply_ufunc. Is there any reason not to use keep_attrs=True?

Similarly, all data is converted to 64 bit floats, even if the input data is 32 bit. Would it be reasonable to use output_dtypes=[dr_in.dtype] instead of output_dtypes=[float]?

I am happy to create a pull request for this if there are no objections to these changes.

@JiaweiZhuang
Owner

JiaweiZhuang commented Oct 3, 2019

Thanks for bringing this up. PRs are welcome!

Is there any reason not to use keep_attrs=True?

The only reason is that xr.apply_ufunc defaults to keep_attrs=False and I just kept the default. To be consistent with xr.apply_ufunc, I would suggest an optional keep_attrs kwarg that defaults to False, which you can set to True if needed.

regridder(indata)  # doesn't keep attributes
regridder(indata, keep_attrs=True)  # keeps attributes
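
As a rough sketch of how such a kwarg could be forwarded to xr.apply_ufunc: the `scale` function below is a hypothetical stand-in for a regridder call, not the xESMF implementation, and the `units` attribute is made up for illustration. The default behavior of xr.apply_ufunc may differ between xarray versions, so only the keep_attrs=True case is checked here.

```python
import numpy as np
import xarray as xr


def scale(da, keep_attrs=False):
    # Hypothetical stand-in for a regridder call; it simply forwards
    # keep_attrs to xr.apply_ufunc, as suggested in the comment above.
    return xr.apply_ufunc(lambda v: v * 2.0, da, keep_attrs=keep_attrs)


indata = xr.DataArray(
    np.array([1, 2, 3], dtype=np.float32),
    dims="x",
    attrs={"units": "K"},  # illustrative attribute
)

kept = scale(indata, keep_attrs=True)
print(kept.attrs)  # attributes of the input are carried over
```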

Regarding the data type, that's because ESMF stores regridding weights in float64. In numpy, float32 * float64 gives float64. Changing output_dtypes won't actually help in this case. Consider this example:

import numpy as np
import xarray as xr
a = np.array([1, 2, 3], dtype=np.float64)
x = np.array([1, 2, 3], dtype=np.float32)
out = a * x
out.dtype  # float64
out2 = xr.apply_ufunc(lambda x: a * x, x, output_dtypes=[np.float32])
out2.dtype  # still float64

You can cast regridder.weights to np.float32, using scipy.sparse.coo_matrix.astype(). This is actually also useful for nearest neighbor methods where the weights are just 1.0 and can be cast to integers for regridding categorical variables.
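
A small sketch of the casting idea, using toy sparse matrices in place of regridder.weights (the matrices and data below are made up for illustration and assume the weights are applied with a plain sparse matrix-vector product):

```python
import numpy as np
import scipy.sparse as sps

# Toy 2x3 weight matrix standing in for regridder.weights (float64 by default).
weights = sps.coo_matrix(np.array([[0.5, 0.5, 0.0],
                                   [0.0, 0.5, 0.5]]))
data = np.array([1, 2, 3], dtype=np.float32)

out64 = weights.dot(data)                     # float64 weights upcast the result
out32 = weights.astype(np.float32).dot(data)  # float32 weights keep float32

# Nearest-neighbor-style weights (only 0 and 1) cast to integers,
# applied to an integer-coded categorical variable.
nn_weights = sps.coo_matrix(np.array([[1.0, 0.0, 0.0],
                                      [0.0, 0.0, 1.0]])).astype(np.int32)
categories = np.array([3, 5, 7], dtype=np.int32)
out_cat = nn_weights.dot(categories)

print(out64.dtype, out32.dtype, out_cat.dtype)
```

Casting the weights once, rather than casting every output, keeps the change in one place; casting the output afterwards remains a simple alternative.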

@JiaweiZhuang
Owner

JiaweiZhuang commented Oct 3, 2019

Is it useful to have a method to set weights dtype in the Regridder class? It would be just one line:

def set_dtype(self, dtype):
    self.weights = self.weights.astype(dtype)

@raspstephan
Contributor Author

I created a pull request to implement keep_attrs.
The datatype is not such a big issue for me, since it's just as easy to convert the data afterwards.

aulemahal pushed a commit to Ouranosinc/xESMF that referenced this issue Jan 26, 2021
Regrid xarray by matching dimension names