Masking and preserving int type #3955

Closed
zxdawn opened this issue Apr 8, 2020 · 5 comments

Comments

@zxdawn

zxdawn commented Apr 8, 2020

When DataArray is masked by .where(), the type is converted to float64.

But if we need to use the DataArray output from .where() in .isel(), the dtype should be int.
(#3949)

MCVE Code Sample

import numpy as np
import xarray as xr

val_arr = xr.DataArray(np.arange(27).reshape(3, 3, 3),
                       dims=['z', 'y', 'x'])

z_indices = xr.DataArray(np.array([[1, 0, 2],
                                  [0, 0, 1],
                                  [-2222, 0, 1]]),
                         dims=['y', 'x'])

fill_value = -2222
sub = z_indices.where(z_indices != fill_value)
indexed_array = val_arr.isel(z=sub)

Expected Output

array([[ 1,  0,  2],
       [ 0,  0,  1],
       [nan,  0,  1]])

Problem Description

  File "E:\miniconda3\envs\satpy\lib\site-packages\xarray\core\indexing.py", line 446, in __init__
    f"invalid indexer array, does not have integer dtype: {k!r}"
TypeError: invalid indexer array, does not have integer dtype: array([[ 1.,  0.,  2.],
       [ 0.,  0.,  1.],
       [nan,  0.,  1.]])
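
For reference, a minimal sketch of where the float dtype comes from, reusing `z_indices` from the MCVE above. The name `masked` is only illustrative. NaN is a floating-point value, so `.where()` without a replacement value has to upcast the integer data before it can fill the masked cell.

```python
import numpy as np
import xarray as xr

z_indices = xr.DataArray(np.array([[1, 0, 2],
                                  [0, 0, 1],
                                  [-2222, 0, 1]]),
                         dims=['y', 'x'])

# Masking without a replacement value fills with NaN, which forces a float dtype.
masked = z_indices.where(z_indices != -2222)

print(z_indices.dtype.kind)  # 'i' (integer) before masking
print(masked.dtype)          # float64 after masking
```

This float array is what `.isel()` rejects in the traceback above.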

pandas currently supports missing values in integer data. Is this possible in xarray, or is there another way around it?
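
As a point of comparison, here is a small sketch of the pandas behaviour, assuming the question refers to pandas' nullable `"Int64"` extension dtype. The series `s` is made up for illustration.

```python
import pandas as pd

# "Int64" (capital I) is pandas' nullable integer dtype: a missing value
# is stored as pd.NA and the dtype stays integer.
s = pd.Series([1, 0, None], dtype="Int64")

print(s.dtype)            # Int64
print(s.isna().tolist())  # [False, False, True]
```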

@kmuehlbauer
Contributor

kmuehlbauer commented Apr 8, 2020

The int vs. NaN problem has been discussed a lot in the past; see, for example, #1194. I would also like to ask the xarray devs whether there are any plans to adopt the pandas scheme.

In the meantime, you could go the other way round (isel before where) with this little hack:

# overwrite fill_values with 0
sub = xr.where(z_indices == fill_value, 0, z_indices)
# isel with sub and mask with where
indexed_array = val_arr.isel(z=sub).where(z_indices != fill_value)

Update: Never mind, this will make the indexed_array a float. You could use the same where machinery and overwrite with a fill_value of your liking:

# overwrite fill_values with 0
sub = xr.where(z_indices == fill_value, 0, z_indices)
# isel with sub and mask with where
indexed_array = val_arr.isel(z=sub)
indexed_array = xr.where(z_indices == fill_value, fill_value, indexed_array)

I can't immediately see one, but there might be a cleaner way to achieve this.
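
Put together with the arrays from the MCVE, the updated workaround can be sketched as a self-contained script. The names `invalid` and `safe_indices` are introduced here for readability and are not part of the original snippets.

```python
import numpy as np
import xarray as xr

val_arr = xr.DataArray(np.arange(27).reshape(3, 3, 3),
                       dims=['z', 'y', 'x'])
z_indices = xr.DataArray(np.array([[1, 0, 2],
                                  [0, 0, 1],
                                  [-2222, 0, 1]]),
                         dims=['y', 'x'])
fill_value = -2222

# Remember which positions hold the fill value.
invalid = z_indices == fill_value

# Replace the fill value with a valid index (0) so the indexer stays integer.
safe_indices = xr.where(invalid, 0, z_indices)

# Index first, then write the integer fill value back at the invalid positions.
indexed_array = val_arr.isel(z=safe_indices)
indexed_array = xr.where(invalid, fill_value, indexed_array)

print(indexed_array.dtype.kind)  # 'i': the result keeps an integer dtype
```

The result marks missing cells with the integer sentinel rather than NaN, so downstream code has to check for `fill_value` explicitly.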

@zxdawn
Author

zxdawn commented Apr 8, 2020

@kmuehlbauer Thanks, Nice trick! It works well for this situation.

@shoyer
Member

shoyer commented Apr 9, 2020

I would love to have support for integer NA values in xarray, but I don't think we want to build it into xarray.

Ideally this would either be built into NumPy (i.e., with a custom dtype, which will require some work before it's possible) or someone could build an "integer with NA" duck array, which could implement the various NumPy protocols such as __array_function__. The latter is a bit less elegant but could be done today with very few changes in xarray.
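
To illustrate the general idea of integer data carrying a separate validity mask, here is a sketch using NumPy's existing `np.ma` masked arrays. This is not the duck array described above, and it makes no claim about how xarray handles masked arrays; it only shows that an integer dtype and a missing-value mask can coexist.

```python
import numpy as np

z = np.array([[1, 0, 2],
              [0, 0, 1],
              [-2222, 0, 1]])

# Mask every element equal to the sentinel; the underlying dtype stays integer.
masked = np.ma.masked_equal(z, -2222)

print(masked.dtype.kind)  # 'i'
print(masked.mask[2, 0])  # True: this element is flagged as missing
print(masked.sum())       # masked elements are skipped in reductions
print(masked.filled(0))   # plain integer ndarray with 0 in the masked slot
```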

@stale

stale bot commented May 2, 2022

In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity.

If this issue remains relevant, please comment here or remove the stale label; otherwise it will be marked as closed automatically.

@stale stale bot added the stale label May 2, 2022
@dcherian
Contributor

dcherian commented May 2, 2022

Closing as an upstream issue. We need either NumPy to add support or pandas extension arrays to support the necessary protocols (#5287).
