groupby reduce with flox - any() got an unexpected keyword argument 'skipna' #8819

Closed
5 tasks done
claytharrison opened this issue Mar 11, 2024 · 5 comments · Fixed by xarray-contrib/flox#339

claytharrison commented Mar 11, 2024

What happened?

When I call xarray.DataArray.groupby with flox installed and use any or all as the aggregator, reducing over a different dimension than the one I group by, I get the error: any() (or all()) got an unexpected keyword argument 'skipna'

It seems that flox expects every aggregator to accept a skipna argument, but any and all do not accept one yet. I'm not sure whether this is xarray's problem or flox's (or my own), but I've filed the issue here simply because this repository is more active.
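
The mismatch described above can be reproduced with NumPy alone. This is a minimal sketch, not the code path flox or xarray actually runs: it only assumes that the reduction eventually reaches np.any, as the last frame of the traceback suggests, and shows that np.any rejects a skipna keyword. The array values are made up for illustration.

```python
import numpy as np

# Illustrative data (not from the issue): two rows, reduced along axis 1.
arr = np.array([[True, False, False], [False, False, False]])

# Without the extra keyword the reduction works as usual.
result = np.any(arr, axis=1)
print(result)

# Forwarding skipna fails, because np.any has no such parameter.
try:
    np.any(arr, axis=1, skipna=True)
except TypeError as err:
    message = str(err)
    print(message)
```

Any caller that forwards skipna unconditionally to a boolean reduction like this one will hit the same TypeError.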

What did you expect to happen?

Running the same code (below) without flox installed returns an expected result:

<xarray.DataArray (x: 2)> Size: 2B
array([False,  True])
Coordinates:
  * x        (x) int64 16B 10 20

Minimal Complete Verifiable Example

# with flox installed
import xarray as xr
import numpy as np
rand_bools = np.random.choice([True, False], size=(2,3))
data = xr.DataArray(rand_bools, dims=("x", "y"), coords={"x": [10, 20], "y": [0, 1, 2]})
data.groupby("x").any(dim="y")

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[243], line 3
      1 data = np.random.choice([True, False], size=(2,3))
      2 data = xr.DataArray(data, dims=("x", "y"), coords={"x": [10, 20], "y": [0, 1, 2]})
----> 3 data.groupby("x").any(dim="y")

File ~/micromamba/envs/egu2024/lib/python3.8/site-packages/xarray/core/_aggregations.py:5393, in DataArrayGroupByAggregations.any(self, dim, keep_attrs, **kwargs)
   5333 """
   5334 Reduce this DataArray's data by applying ``any`` along some dimension(s).
   5335 
   (...)
   5386   * labels   (labels) object 'a' 'b' 'c'
   5387 """
   5388 if (
   5389     flox_available
   5390     and OPTIONS["use_flox"]
   5391     and contains_only_dask_or_numpy(self._obj)
   5392 ):
-> 5393     return self._flox_reduce(
   5394         func="any",
   5395         dim=dim,
   5396         # fill_value=fill_value,
   5397         keep_attrs=keep_attrs,
   5398         **kwargs,
   5399     )
   5400 else:
   5401     return self.reduce(
   5402         duck_array_ops.array_any,
   5403         dim=dim,
   5404         keep_attrs=keep_attrs,
   5405         **kwargs,
   5406     )

File ~/micromamba/envs/egu2024/lib/python3.8/site-packages/xarray/core/groupby.py:756, in GroupBy._flox_reduce(self, dim, keep_attrs, **kwargs)
    753     expected_groups = (self._unique_coord.values,)
    754     isbin = False
--> 756 result = xarray_reduce(
    757     self._original_obj.drop_vars(non_numeric),
    758     group,
    759     dim=parsed_dim,
    760     expected_groups=expected_groups,
    761     isbin=isbin,
    762     keep_attrs=keep_attrs,
    763     **kwargs,
    764 )
    766 # Ignore error when the groupby reduction is effectively
    767 # a reduction of the underlying dataset
    768 result = result.drop_vars(unindexed_dims, errors="ignore")

File ~/micromamba/envs/egu2024/lib/python3.8/site-packages/flox/xarray.py:306, in xarray_reduce(obj, func, expected_groups, isbin, sort, dim, fill_value, dtype, method, engine, keep_attrs, skipna, min_count, reindex, *by, **finalize_kwargs)
    302     raise NotImplementedError(
    303         "func must be a string when reducing along a dimension not present in `by`"
    304     )
    305 # TODO: skipna needs test
--> 306 result = getattr(ds_broad, dsfunc)(dim=dim_tuple, skipna=skipna)
    307 if isinstance(obj, xr.DataArray):
    308     return obj._from_temp_dataset(result)

File ~/micromamba/envs/egu2024/lib/python3.8/site-packages/xarray/core/_aggregations.py:243, in DatasetAggregations.any(self, dim, keep_attrs, **kwargs)
    179 def any(
    180     self,
    181     dim: Dims = None,
   (...)
    184     **kwargs: Any,
    185 ) -> Dataset:
    186     """
    187     Reduce this Dataset's data by applying ``any`` along some dimension(s).
    188 
   (...)
    241         da       bool True
    242     """
--> 243     return self.reduce(
    244         duck_array_ops.array_any,
    245         dim=dim,
    246         numeric_only=False,
    247         keep_attrs=keep_attrs,
    248         **kwargs,
    249     )

File ~/micromamba/envs/egu2024/lib/python3.8/site-packages/xarray/core/dataset.py:5892, in Dataset.reduce(self, func, dim, keep_attrs, keepdims, numeric_only, **kwargs)
   5875         if (
   5876             # Some reduction functions (e.g. std, var) need to run on variables
   5877             # that don't have the reduce dims: PR5393
   (...)
   5885             # the former is often more efficient
   5886             # keep single-element dims as list, to support Hashables
   5887             reduce_maybe_single = (
   5888                 None
   5889                 if len(reduce_dims) == var.ndim and var.ndim != 1
   5890                 else reduce_dims
   5891             )
-> 5892             variables[name] = var.reduce(
   5893                 func,
   5894                 dim=reduce_maybe_single,
   5895                 keep_attrs=keep_attrs,
   5896                 keepdims=keepdims,
   5897                 **kwargs,
   5898             )
   5900 coord_names = {k for k in self.coords if k in variables}
   5901 indexes = {k: v for k, v in self._indexes.items() if k in variables}

File ~/micromamba/envs/egu2024/lib/python3.8/site-packages/xarray/core/variable.py:1955, in Variable.reduce(self, func, dim, axis, keep_attrs, keepdims, **kwargs)
   1951     if isinstance(axis, tuple) and len(axis) == 1:
   1952         # unpack axis for the benefit of functions
   1953         # like np.argmin which can't handle tuple arguments
   1954         axis = axis[0]
-> 1955     data = func(self.data, axis=axis, **kwargs)
   1956 else:
   1957     data = func(self.data, **kwargs)

File <__array_function__ internals>:198, in any(*args, **kwargs)

TypeError: any() got an unexpected keyword argument 'skipna'
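
The first frame of the traceback shows that the flox path is taken only when OPTIONS["use_flox"] is true. Assuming that option can be switched off through xr.set_options(use_flox=False), a possible workaround is to disable flox around the affected call so that xarray falls back to its own reduction. The sketch below uses fixed data instead of the random array from the example so that the result is deterministic; it has not been checked against every xarray and flox combination.

```python
import numpy as np
import xarray as xr

# Fixed boolean data so the result does not depend on a random draw.
data = xr.DataArray(
    np.array([[False, False, False], [True, False, True]]),
    dims=("x", "y"),
    coords={"x": [10, 20], "y": [0, 1, 2]},
)

# Temporarily disable the flox code path for this reduction only.
with xr.set_options(use_flox=False):
    result = data.groupby("x").any(dim="y")

print(result.values)
```

The option is restored when the with block exits, so other groupby calls keep using flox.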

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS

commit: None
python: 3.12.2 | packaged by conda-forge | (main, Feb 16 2024, 20:50:58) [GCC 12.3.0]
python-bits: 64
OS: Linux
OS-release: 5.15.0-92-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: C.UTF-8
LANG: en_US.UTF-8
LOCALE: ('C', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 2024.2.0
pandas: 2.2.1
numpy: 1.26.4
scipy: 1.12.0
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: 0.9.2
numpy_groupies: 0.10.2
setuptools: 69.1.1
pip: 24.0
conda: None
pytest: None
mypy: None
IPython: None
sphinx: None

claytharrison added the bug and needs triage labels Mar 11, 2024
@max-sixty
Collaborator

Which version of flox is installed? In the Environment, I see flox: None...

@claytharrison
Author

Apologies, flox is 0.9.2. Looks like I ran show_versions after removing it to test the non-flox case. I've updated the environment details in the OP.

@max-sixty
Collaborator

Yes, I can repro. CC @dcherian

max-sixty removed the needs triage label Mar 12, 2024
dcherian added a commit to xarray-contrib/flox that referenced this issue Mar 13, 2024
* Fix direct reductions of Xarray objects

Closes pydata/xarray#8819

* Fix doctest
@dcherian
Contributor

Thanks @claytharrison! This should be fixed in flox 0.9.3.
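
To check whether an environment already has a flox release at or above 0.9.3, something like the following sketch could be used. The helper has_fix and the constant FIXED_IN are hypothetical names introduced here for illustration; only importlib.metadata from the standard library is relied on, and the version parsing is deliberately simple (it reads the leading numeric components and ignores any suffix).

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Release named in the comment above as containing the fix.
FIXED_IN = (0, 9, 3)


def has_fix(flox_version):
    """Hypothetical helper: True if the version string is >= FIXED_IN."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", flox_version)
    if match is None:
        return False  # unknown format; treat it as not fixed
    # A missing third component (e.g. "1.0") is treated as 0.
    parts = tuple(int(group or 0) for group in match.groups())
    return parts >= FIXED_IN


try:
    installed = version("flox")
except PackageNotFoundError:
    installed = None

if installed is None:
    print("flox is not installed")
else:
    print(installed, "has fix:", has_fix(installed))
```

For the environment reported in this issue, has_fix("0.9.2") returns False.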

@claytharrison
Author

Thank you for the quick response!
