Support multiple dimensions in DataArray.argmin() and DataArray.argmax() methods #3936

Merged
merged 46 commits into from Jun 29, 2020
Changes from 17 commits
Commits
8e7fb53
DataArray.indices_min() and DataArray.indices_max() methods
johnomotani Apr 5, 2020
2b06811
Update whats-new.rst and api.rst with indices_min(), indices_max()
johnomotani Apr 5, 2020
f6a966c
Fix type checking in DataArray._unravel_argminmax()
johnomotani Apr 5, 2020
4395e7a
Fix expected results for TestReduce3D.test_indices_max()
johnomotani Apr 5, 2020
deee3f8
Respect global default for keep_attrs
johnomotani Apr 6, 2020
be8b26c
Merge behaviour of indices_min/indices_max into argmin/argmax
johnomotani Apr 6, 2020
6d9d509
Basic overload of argmin() and argmax() for Dataset
johnomotani Apr 6, 2020
70aaa9d
Update Variable and dask tests with _argmin_base, _argmax_base
johnomotani Apr 6, 2020
f8952a8
Update api-hidden.rst with _argmin_base and _argmax_base
johnomotani Apr 6, 2020
8caf2b8
Explicitly defined class methods override injected methods
johnomotani Apr 7, 2020
4778cfd
Move StringAccessor back to bottom of DataArray class definition
johnomotani Apr 7, 2020
66cf085
Revert use of _argmin_base and _argmax_base
johnomotani Apr 7, 2020
c78c1fe
Move implementation of argmin, argmax from DataArray to Variable
johnomotani Apr 8, 2020
cb6742d
Update tests for change to coordinates on result of argmin, argmax
johnomotani Apr 8, 2020
ab480b5
Add 'out' keyword to argmin/argmax methods - allow numpy call signature
johnomotani Apr 10, 2020
dca8e45
Update and correct docstrings for argmin and argmax
johnomotani Apr 10, 2020
52554b6
Correct suggested replacement for da.argmin() and da.argmax()
johnomotani Apr 10, 2020
ef826f6
Remove use of _injected_ methods in argmin/argmax
shoyer Apr 21, 2020
8a7c7ad
Fix typo in name of argminmax_func
johnomotani Apr 21, 2020
e56e2e7
Mark argminmax argument to _unravel_argminmax as a string
johnomotani Apr 21, 2020
a99697a
Hidden internal methods don't need to appear in docs
johnomotani Apr 21, 2020
a785c34
Basic docstrings for Dataset.argmin() and Dataset.argmax()
johnomotani Apr 21, 2020
ac897d4
Set stacklevel for DeprecationWarning in argmin/argmax methods
johnomotani Apr 21, 2020
752518e
Revert "Explicitly defined class methods override injected methods"
johnomotani Apr 21, 2020
8b7365b
Revert "Add 'out' keyword to argmin/argmax methods - allow numpy call…
johnomotani Apr 21, 2020
46b04a6
Remove argmin and argmax from ops.py
johnomotani Apr 21, 2020
1ef3c97
Use self.reduce() in Dataset.argmin() and Dataset.argmax()
johnomotani Apr 21, 2020
65ca2ad
Whitespace after 'title' lines in docstrings
johnomotani Apr 21, 2020
1736abf
Remove tests of np.argmax() and np.argmin() functions from test_units.py
johnomotani Apr 21, 2020
d9b55ee
Clearer deprecation warnings in Dataset.argmin() and Dataset.argmax()
johnomotani Apr 21, 2020
432dfbb
Add unravel_index to duck_array_ops, use in Variable._unravel_argminmax
johnomotani Apr 21, 2020
20b448a
Filter argmin/argmax DeprecationWarnings in tests
johnomotani Apr 21, 2020
95845f9
Correct test for exception for nan in test_argmax
johnomotani Apr 21, 2020
0ee5146
Remove injected argmin and argmax methods from api-hidden.rst
johnomotani Apr 21, 2020
daa2ea5
Merge branch 'master' into indices_minmax
johnomotani Jun 24, 2020
d029183
flake8 fixes
johnomotani Jun 24, 2020
a758b0f
Tidy up argmin/argmax following code review
johnomotani Jun 26, 2020
9a54e0c
Remove filters for warnings from argmin/argmax from tests
johnomotani Jun 26, 2020
a07ce29
Swap order of reduce_dims checks in Dataset.reduce()
johnomotani Jun 26, 2020
f73e10e
Merge branch 'master' into indices_minmax
keewis Jun 26, 2020
d77fe11
revert the changes to Dataset.reduce
keewis Jun 26, 2020
308bb23
use dim instead of axis
keewis Jun 26, 2020
1b53f49
use dimension instead of Ellipsis
keewis Jun 26, 2020
5f80205
Make passing 'dim=...' to Dataset.argmin() or Dataset.argmax() an error
johnomotani Jun 26, 2020
540c281
Better docstrings for Dataset.argmin() and Dataset.argmax()
johnomotani Jun 26, 2020
4aca9d9
Update doc/whats-new.rst
johnomotani Jun 27, 2020
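Several of the commits above deal with turning one flat index into one index per dimension (for example "Add unravel_index to duck_array_ops, use in Variable._unravel_argminmax"). The following is a minimal NumPy sketch of that idea, not xarray's actual implementation. The helper name `argmin_over_axes` is hypothetical, and it works on axis numbers rather than dimension names.

```python
import numpy as np


def argmin_over_axes(values, axes):
    """Return a dict mapping each reduced axis to an array of indices.

    Hypothetical helper for illustration only.
    """
    values = np.asarray(values)
    axes = tuple(axes)
    kept = tuple(ax for ax in range(values.ndim) if ax not in axes)
    # Move the reduced axes to the end and merge them into one flat axis.
    moved = np.transpose(values, kept + axes)
    flat = moved.reshape(moved.shape[: len(kept)] + (-1,))
    flat_index = flat.argmin(axis=-1)
    # Split the flat index back into one index per reduced axis.
    reduced_shape = tuple(values.shape[ax] for ax in axes)
    return dict(zip(axes, np.unravel_index(flat_index, reduced_shape)))


# Same data as the 3D docstring example in this pull request.
data = [
    [[3, 2, 1], [3, 1, 2], [2, 1, 3]],
    [[1, 3, 2], [2, -5, 1], [2, 3, 1]],
]
indices = argmin_over_axes(data, axes=(0, 2))
```

Here `indices[0]` and `indices[2]` play the role of the `'x'` and `'z'` entries of the dict returned by `array.argmin(dim=["x", "z"])`.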
12 changes: 12 additions & 0 deletions doc/api-hidden.rst
@@ -18,6 +18,8 @@
Dataset.any
Dataset.argmax
Dataset.argmin
Dataset._injected_argmax
Dataset._injected_argmin
Dataset.max
Dataset.min
Dataset.mean
@@ -94,6 +96,8 @@
core.resample.DatasetResample.ffill
core.resample.DatasetResample.fillna
core.resample.DatasetResample.first
core.resample.DatasetResample._injected_argmax
core.resample.DatasetResample._injected_argmin
core.resample.DatasetResample.last
core.resample.DatasetResample.map
core.resample.DatasetResample.max
@@ -160,6 +164,8 @@
DataArray.any
DataArray.argmax
DataArray.argmin
DataArray._injected_argmax
DataArray._injected_argmin
DataArray.max
DataArray.min
DataArray.mean
@@ -234,6 +240,8 @@
core.resample.DataArrayResample.ffill
core.resample.DataArrayResample.fillna
core.resample.DataArrayResample.first
core.resample.DataArrayResample._injected_argmax
core.resample.DataArrayResample._injected_argmin
core.resample.DataArrayResample.last
core.resample.DataArrayResample.map
core.resample.DataArrayResample.max
@@ -369,6 +377,8 @@
Variable.fillna
Variable.get_axis_num
Variable.identical
Variable._injected_argmax
Variable._injected_argmin
Variable.isel
Variable.isnull
Variable.item
@@ -442,6 +452,8 @@
IndexVariable.get_axis_num
IndexVariable.get_level_variable
IndexVariable.identical
IndexVariable._injected_argmax
IndexVariable._injected_argmin
IndexVariable.isel
IndexVariable.isnull
IndexVariable.item
7 changes: 7 additions & 0 deletions doc/whats-new.rst
@@ -29,6 +29,13 @@ Breaking changes

New Features
~~~~~~~~~~~~
- :py:meth:`DataArray.argmin` and :py:meth:`DataArray.argmax` now support
  sequences of 'dim' arguments. If a sequence is passed, they return a dict of
  the indices of the minimum or maximum for each dimension, which can be passed
  to :py:meth:`DataArray.isel` to get the value of the minimum or maximum.
  (:pull:`3936`)
  By `John Omotani <https://github.com/johnomotani>`_, thanks to `Keisuke Fujii
  <https://github.com/fujiisoup>`_ for work in :pull:`1469`.
- Added :py:meth:`DataArray.polyfit` and :py:func:`xarray.polyval` for fitting polynomials. (:issue:`3349`)
By `Pascal Bourgault <https://github.com/aulemahal>`_.
- Control over attributes of result in :py:func:`merge`, :py:func:`concat`,
209 changes: 209 additions & 0 deletions xarray/core/dataarray.py
@@ -3722,6 +3722,215 @@ def idxmax(
            keep_attrs=keep_attrs,
        )

    def argmin(
        self,
        dim: Union[Hashable, Sequence[Hashable]] = None,
        axis: Union[int, None] = None,
        keep_attrs: bool = None,
        skipna: bool = None,
        out=None,
    ) -> Union["DataArray", Dict[Hashable, "DataArray"]]:
"""Index or indices of the minimum of the DataArray over one or more dimensions.
If a sequence is passed to 'dim', then the result is returned as a dict of DataArrays,
which can be passed directly to isel(). If a single str is passed to 'dim' then
a DataArray with dtype int is returned.

If there are multiple minima, the indices of the first one found will be
returned.

Parameters
----------
dim : hashable, sequence of hashable or ..., optional
The dimensions over which to find the minimum. By default, finds the minimum over
all dimensions. For now this returns an int for backward compatibility, but that
behaviour is deprecated; in future it will return a dict with indices for all
dimensions. To return a dict with all dimensions now, pass '...'.
axis : int, optional
Axis over which to apply `argmin`. Only one of the 'dim' and 'axis' arguments
can be supplied.
keep_attrs : bool, optional
If True, the attributes (`attrs`) will be copied from the original
object to the new one. If False (default), the new object will be
returned without attributes.
skipna : bool, optional
If True, skip missing values (as marked by NaN). By default, only
skips missing values for float dtypes; other dtypes either do not
have a sentinel missing value (int) or skipna=True has not been
implemented (object, datetime64 or timedelta64).
out : None
'out' should not be passed - provided for compatibility with numpy function
signature

Returns
-------
result : DataArray or dict of DataArray

See also
--------
Variable.argmin, DataArray.idxmin

Examples
--------
>>> array = xr.DataArray([0, 2, -1, 3], dims="x")
>>> array.min()
<xarray.DataArray ()>
array(-1)
>>> array.argmin()
<xarray.DataArray ()>
array(2)
>>> array.argmin(...)
{'x': <xarray.DataArray ()>
array(2)}
>>> array.isel(array.argmin(...))
<xarray.DataArray ()>
array(-1)

>>> array = xr.DataArray([[[3, 2, 1], [3, 1, 2], [2, 1, 3]],
... [[1, 3, 2], [2, -5, 1], [2, 3, 1]]],
... dims=("x", "y", "z"))
>>> array.min(dim="x")
<xarray.DataArray (y: 3, z: 3)>
array([[ 1, 2, 1],
[ 2, -5, 1],
[ 2, 1, 1]])
Dimensions without coordinates: y, z
>>> array.argmin(dim="x")
<xarray.DataArray (y: 3, z: 3)>
array([[1, 0, 0],
[1, 1, 1],
[0, 0, 1]])
Dimensions without coordinates: y, z
>>> array.argmin(dim=["x"])
{'x': <xarray.DataArray (y: 3, z: 3)>
array([[1, 0, 0],
[1, 1, 1],
[0, 0, 1]])
Dimensions without coordinates: y, z}
>>> array.min(dim=("x", "z"))
<xarray.DataArray (y: 3)>
array([ 1, -5, 1])
Dimensions without coordinates: y
>>> array.argmin(dim=["x", "z"])
{'x': <xarray.DataArray (y: 3)>
array([0, 1, 0])
Dimensions without coordinates: y, 'z': <xarray.DataArray (y: 3)>
array([2, 1, 1])
Dimensions without coordinates: y}
>>> array.isel(array.argmin(dim=["x", "z"]))
<xarray.DataArray (y: 3)>
array([ 1, -5, 1])
Dimensions without coordinates: y
"""
        result = self.variable.argmin(dim, axis, keep_attrs, skipna, out)
        if isinstance(result, dict):
            return {k: self._replace_maybe_drop_dims(v) for k, v in result.items()}
        else:
            return self._replace_maybe_drop_dims(result)

    def argmax(
        self,
        dim: Union[Hashable, Sequence[Hashable]] = None,
        axis: Union[int, None] = None,
        keep_attrs: bool = None,
        skipna: bool = None,
        out=None,
    ) -> Union["DataArray", Dict[Hashable, "DataArray"]]:
"""Index or indices of the maximum of the DataArray over one or more dimensions.
If a sequence is passed to 'dim', then the result is returned as a dict of DataArrays,
which can be passed directly to isel(). If a single str is passed to 'dim' then
a DataArray with dtype int is returned.

If there are multiple maxima, the indices of the first one found will be
returned.

Parameters
----------
dim : hashable, sequence of hashable or ..., optional
The dimensions over which to find the maximum. By default, finds the maximum over
all dimensions. For now this returns an int for backward compatibility, but that
behaviour is deprecated; in future it will return a dict with indices for all
dimensions. To return a dict with all dimensions now, pass '...'.
axis : int, optional
Axis over which to apply `argmax`. Only one of the 'dim' and 'axis' arguments
can be supplied.
keep_attrs : bool, optional
If True, the attributes (`attrs`) will be copied from the original
object to the new one. If False (default), the new object will be
returned without attributes.
skipna : bool, optional
If True, skip missing values (as marked by NaN). By default, only
skips missing values for float dtypes; other dtypes either do not
have a sentinel missing value (int) or skipna=True has not been
implemented (object, datetime64 or timedelta64).
out : None
'out' should not be passed - provided for compatibility with numpy function
signature

Returns
-------
result : DataArray or dict of DataArray

See also
--------
Variable.argmax, DataArray.idxmax

Examples
--------
>>> array = xr.DataArray([0, 2, -1, 3], dims="x")
>>> array.max()
<xarray.DataArray ()>
array(3)
>>> array.argmax()
<xarray.DataArray ()>
array(3)
>>> array.argmax(...)
{'x': <xarray.DataArray ()>
array(3)}
>>> array.isel(array.argmax(...))
<xarray.DataArray ()>
array(3)

>>> array = xr.DataArray([[[3, 2, 1], [3, 1, 2], [2, 1, 3]],
... [[1, 3, 2], [2, 5, 1], [2, 3, 1]]],
... dims=("x", "y", "z"))
>>> array.max(dim="x")
<xarray.DataArray (y: 3, z: 3)>
array([[3, 3, 2],
[3, 5, 2],
[2, 3, 3]])
Dimensions without coordinates: y, z
>>> array.argmax(dim="x")
<xarray.DataArray (y: 3, z: 3)>
array([[0, 1, 1],
[0, 1, 0],
[0, 1, 0]])
Dimensions without coordinates: y, z
>>> array.argmax(dim=["x"])
{'x': <xarray.DataArray (y: 3, z: 3)>
array([[0, 1, 1],
[0, 1, 0],
[0, 1, 0]])
Dimensions without coordinates: y, z}
>>> array.max(dim=("x", "z"))
<xarray.DataArray (y: 3)>
array([3, 5, 3])
Dimensions without coordinates: y
>>> array.argmax(dim=["x", "z"])
{'x': <xarray.DataArray (y: 3)>
array([0, 1, 0])
Dimensions without coordinates: y, 'z': <xarray.DataArray (y: 3)>
array([0, 1, 2])
Dimensions without coordinates: y}
>>> array.isel(array.argmax(dim=["x", "z"]))
<xarray.DataArray (y: 3)>
array([3, 5, 3])
Dimensions without coordinates: y
"""
        result = self.variable.argmax(dim, axis, keep_attrs, skipna, out)
        if isinstance(result, dict):
            return {k: self._replace_maybe_drop_dims(v) for k, v in result.items()}
        else:
            return self._replace_maybe_drop_dims(result)

    # this needs to be at the end, or mypy will confuse with `str`
    # https://mypy.readthedocs.io/en/latest/common_issues.html#dealing-with-conflicting-names
    str = property(StringAccessor)
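As a usage sketch of the behaviour documented in the docstrings above, the snippet below reuses the 3D array from the `argmin` examples. It assumes an installed xarray release that includes this change; the variable names `indices` and `minima` are only illustrative.

```python
import xarray as xr

array = xr.DataArray(
    [
        [[3, 2, 1], [3, 1, 2], [2, 1, 3]],
        [[1, 3, 2], [2, -5, 1], [2, 3, 1]],
    ],
    dims=("x", "y", "z"),
)

# Passing a sequence for `dim` gives a dict keyed by dimension name,
# with one DataArray of indices per reduced dimension.
indices = array.argmin(dim=["x", "z"])

# The dict can be passed straight to isel() to recover the minimum values.
minima = array.isel(indices)
```

The round trip through `isel()` should give the same values as `array.min(dim=("x", "z"))`, which is the main reason for returning a dict rather than a flat index.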
50 changes: 50 additions & 0 deletions xarray/core/dataset.py
@@ -6294,5 +6294,55 @@ def idxmax(
            ),
        )

    def argmin(self, dim=None, axis=None, **kwargs):
        if dim is None and axis is None:
            warnings.warn(
                "Behaviour of DataArray.argmin() with neither dim nor axis argument "
                "will change to return a dict of indices of each dimension, and then it "
                "will be an error to call Dataset.argmin() with no argument. To get a "
                "single, flat index, please use np.argmin(ds) instead of ds.argmin().",
                DeprecationWarning,
            )
        if (
            dim is None
            or axis is not None
            or not isinstance(dim, Sequence)
            or isinstance(dim, str)
        ):
            # Return int index if single dimension is passed, and is not part of a
            # sequence
            return getattr(self, "_injected_argmin")(dim=dim, axis=axis, **kwargs)
        else:
            raise ValueError(
                "When dim is a sequence, DataArray.argmin() returns a "
                "dict. dicts cannot be contained in a Dataset, so cannot "
                "call Dataset.argmin() with a sequence for dim"
            )

    def argmax(self, dim=None, axis=None, **kwargs):
        if dim is None and axis is None:
            warnings.warn(
                "Behaviour of DataArray.argmax() with neither dim nor axis argument "
                "will change to return a dict of indices of each dimension, and then it "
                "will be an error to call Dataset.argmax() with no argument. To get a "
                "single, flat index, please use np.argmax(ds) instead of ds.argmax().",
                DeprecationWarning,
            )
        if (
            dim is None
            or axis is not None
            or not isinstance(dim, Sequence)
            or isinstance(dim, str)
        ):
            # Return int index if single dimension is passed, and is not part of a
            # sequence
            return getattr(self, "_injected_argmax")(dim=dim, axis=axis, **kwargs)
        else:
            raise ValueError(
                "When dim is a sequence, DataArray.argmax() returns a "
                "dict. dicts cannot be contained in a Dataset, so cannot "
                "call Dataset.argmax() with a sequence for dim"
            )


ops.inject_all_ops_and_reduce_methods(Dataset, array_only=False)
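The `Dataset` overloads above accept a single dimension name but reject a sequence, because the dict result cannot be stored in a Dataset. A small sketch of what that looks like from the caller's side, assuming an installed xarray release that includes this change; the per-variable dict comprehension is one possible workaround, not something this pull request provides.

```python
import xarray as xr

ds = xr.Dataset(
    {
        "a": (("x", "y"), [[3, 1], [0, 2]]),
        "b": (("x", "y"), [[5, 4], [6, 7]]),
    }
)

# A single dimension name is reduced variable by variable, giving a Dataset.
per_x = ds.argmin(dim="x")

# A sequence would need a dict per variable, so Dataset.argmin() refuses it.
try:
    ds.argmin(dim=["x", "y"])
    raised = False
except ValueError:
    raised = True

# Possible workaround: call DataArray.argmin() on each data variable.
per_variable = {
    name: da.argmin(dim=["x", "y"]) for name, da in ds.data_vars.items()
}
```

Each entry of `per_variable` is itself a dict of indices that can be passed to `isel()` on the corresponding variable.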
19 changes: 18 additions & 1 deletion xarray/core/ops.py
@@ -260,6 +260,9 @@ def inject_reduce_methods(cls):
        + [("count", duck_array_ops.count, False)]
    )
    for name, f, include_skipna in methods:
        if hasattr(cls, name):
            name = "_injected_" + name

        numeric_only = getattr(f, "numeric_only", False)
        available_min_count = getattr(f, "available_min_count", False)
        min_count_docs = _MINCOUNT_DOCSTRING if available_min_count else ""
@@ -278,6 +281,9 @@ def inject_reduce_methods(cls):
def inject_cum_methods(cls):
    methods = [(name, getattr(duck_array_ops, name), True) for name in NAN_CUM_METHODS]
    for name, f, include_skipna in methods:
        if hasattr(cls, name):
            name = "_injected_" + name

        numeric_only = getattr(f, "numeric_only", False)
        func = cls._reduce_method(f, include_skipna, numeric_only)
        func.__name__ = name
@@ -325,24 +331,35 @@ def inject_all_ops_and_reduce_methods(cls, priority=50, array_only=True):

    # patch in standard special operations
    for name in UNARY_OPS:
        if hasattr(cls, op_str(name)):
            name = "_injected_" + name
        setattr(cls, op_str(name), cls._unary_op(get_op(name)))
    inject_binary_ops(cls, inplace=True)

    # patch in numpy/pandas methods
    for name in NUMPY_UNARY_METHODS:
        if hasattr(cls, op_str(name)):
            name = "_injected_" + name
        setattr(cls, name, cls._unary_op(_method_wrapper(name)))

    for name in PANDAS_UNARY_FUNCTIONS:
        f = _func_slash_method_wrapper(getattr(duck_array_ops, name), name=name)
        if hasattr(cls, op_str(name)):
            name = "_injected_" + name
        setattr(cls, name, cls._unary_op(f))

    f = _func_slash_method_wrapper(duck_array_ops.around, name="round")
-   setattr(cls, "round", cls._unary_op(f))
    if hasattr(cls, "round"):
        setattr(cls, "_injected_round", cls._unary_op(f))
    else:
        setattr(cls, "round", cls._unary_op(f))

    if array_only:
        # these methods don't return arrays of the same shape as the input, so
        # don't try to patch these in for Dataset objects
        for name in NUMPY_SAME_METHODS:
            if hasattr(cls, op_str(name)):
                name = "_injected_" + name
            setattr(cls, name, _values_method_wrapper(name))

    inject_reduce_methods(cls)
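The `hasattr` checks in this file let an explicitly defined method take precedence, with the generic version stored under an `_injected_` prefix so the explicit method can still delegate to it. A self-contained toy sketch of that pattern follows. The names `inject_methods`, `Toy`, `generic_argmin` and `generic_argmax` are hypothetical and not part of xarray. Note that the commit list shows this approach was later reverted in the same pull request.

```python
def inject_methods(cls, methods):
    """Attach functions to cls without clobbering methods it already defines."""
    for name, func in methods.items():
        if hasattr(cls, name):
            # Keep the explicit definition; park the generic one under a prefix.
            name = "_injected_" + name
        setattr(cls, name, func)


class Toy:
    def __init__(self, values):
        self.values = list(values)

    def argmin(self, as_dict=False):
        # Explicit method: adds behaviour, then delegates to the injected one.
        index = self._injected_argmin()
        return {"x": index} if as_dict else index


def generic_argmin(self):
    # Index of the first smallest value.
    return min(range(len(self.values)), key=self.values.__getitem__)


def generic_argmax(self):
    # Index of the first largest value.
    return max(range(len(self.values)), key=self.values.__getitem__)


inject_methods(Toy, {"argmin": generic_argmin, "argmax": generic_argmax})

toy = Toy([0, 2, -1, 3])
```

After injection, `Toy.argmin` is still the explicit method, `Toy._injected_argmin` is the generic one, and `Toy.argmax` is injected under its plain name because nothing was defined for it.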