8 changes: 7 additions & 1 deletion xarray/core/dataset.py
@@ -6065,6 +6065,8 @@ def drop_sel(
Data variables:
A (x, y) int64 32B 0 2 3 5
"""
+ from xarray.core.dataarray import DataArray
Contributor Author

I've imported DataArray here as that seems to fit with the style in other parts of the codebase (here); however, I'm not entirely sure why this is done.

Member

I think that dataarray.py imports dataset.py already (historically Dataset predates DataArray slightly), so this avoids a recursive import.

Contributor Author

@owena11 owena11 Oct 29, 2025

Great, thanks for the explanation👍
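
For readers unfamiliar with the pattern discussed in this thread, here is a minimal sketch of a function-level (deferred) import. The module names `demo_dataset` and `demo_dataarray` are hypothetical stand-ins, not xarray's real modules; the sketch only assumes that one module imports the other at module level, as described above.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical module: plays the role of dataset.py. It must not import the
# other module at the top level, so the import is deferred into the method.
DATASET_SRC = textwrap.dedent(
    '''
    class Dataset:
        def wrap(self, value):
            # Deferred import: resolved at call time, once both modules
            # have finished loading.
            from demo_dataarray import DataArray

            return isinstance(value, DataArray)
    '''
)

# Hypothetical module: plays the role of dataarray.py, which imports the
# dataset module at module level.
DATAARRAY_SRC = textwrap.dedent(
    '''
    from demo_dataset import Dataset


    class DataArray:
        def to_dataset(self):
            return Dataset()
    '''
)


def load_demo_modules():
    """Write both modules to a temporary directory and import them."""
    tmp = tempfile.mkdtemp()
    Path(tmp, "demo_dataset.py").write_text(DATASET_SRC)
    Path(tmp, "demo_dataarray.py").write_text(DATAARRAY_SRC)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    dataset_mod = importlib.import_module("demo_dataset")
    dataarray_mod = importlib.import_module("demo_dataarray")
    return dataset_mod, dataarray_mod


dataset_mod, dataarray_mod = load_demo_modules()
print(dataset_mod.Dataset().wrap(dataarray_mod.DataArray()))
print(dataset_mod.Dataset().wrap("not an array"))
```

If `demo_dataset` instead imported `DataArray` at the top of the file, importing it first would trigger `demo_dataarray`, which would in turn try to read `Dataset` from a module that has not finished executing.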


  if errors not in ["raise", "ignore"]:
      raise ValueError('errors must be either "raise" or "ignore"')

@@ -6076,7 +6078,11 @@ def drop_sel(
  # is a large numpy array
  if utils.is_scalar(labels_for_dim):
      labels_for_dim = [labels_for_dim]
- labels_for_dim = np.asarray(labels_for_dim)
+ # Most conversion to arrays is better handled in the indexer, however
+ # DataArrays are a special case where the underlying libraries don't provide
+ # a good conversion.
+ if isinstance(labels_for_dim, DataArray):
+     labels_for_dim = np.asarray(labels_for_dim)
Comment on lines +6081 to +6085
Member

If you wanted to make this a little safer, you could add:

Suggested change
- # Most conversion to arrays is better handled in the indexer, however
- # DataArrays are a special case where the underlying libraries don't provide
- # a good conversion.
- if isinstance(labels_for_dim, DataArray):
-     labels_for_dim = np.asarray(labels_for_dim)
+ # Most conversion to arrays is better handled in the indexer, however
+ # DataArrays are a special case where the underlying libraries don't provide
+ # a good conversion.
+ if isinstance(labels_for_dim, DataArray):
+     if labels_for_dim.dims not in ((), (dim,)):
+         raise ValueError(
+             "cannot use drop_sel() with DataArray values "
+             "along dimensions other than the dimensions being "
+             f"indexed along: {labels_for_dim}"
+         )
+     labels_for_dim = np.asarray(labels_for_dim)

But this LGTM! Definitely an incremental improvement.

Contributor Author

Committed and then reverted: the tests highlighted that it might break people's current usage where a DataArray gets assigned the default dim names (i.e. dim_0 etc.).

Also, I had thought this would nudge sel and drop_sel to be more consistent, but after checking, neither enforces alignment between dims when selecting with a DataArray:

>>> data = xr.Dataset({"x": ["a", "b"]})
>>> data.sel(x=xr.DataArray(["a",], dims=("y",)))
<xarray.Dataset> Size: 4B
Dimensions:  (y: 1)
Coordinates:
    x        (y) <U1 4B 'a'
Dimensions without coordinates: y
Data variables:
    *empty*

So I would propose leaving this check out for now.
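
To make the failure mode concrete, here is a rough pure-Python sketch of the suggested check, operating on a tuple of dimension names rather than on a real DataArray. The helper name `check_label_dims` is hypothetical and exists only for illustration; it is not part of xarray.

```python
def check_label_dims(label_dims, dim):
    """Raise if labels carry dimensions other than `dim` (0-d is allowed).

    Hypothetical helper mirroring `dims not in ((), (dim,))` from the
    suggestion above.
    """
    if tuple(label_dims) not in ((), (dim,)):
        raise ValueError(
            f"labels have dims {tuple(label_dims)!r}, "
            f"expected () or {(dim,)!r}"
        )


# Labels along the indexed dimension, or scalar labels, pass the check.
check_label_dims(("x",), "x")
check_label_dims((), "x")

# A DataArray built without explicit dims gets a default name such as
# "dim_0", so the strict check would reject labels that work today.
try:
    check_label_dims(("dim_0",), "x")
except ValueError as err:
    print(f"rejected: {err}")
```

Under this sketch, any caller passing an unnamed DataArray of labels would start seeing a ValueError, which matches the breakage the tests surfaced.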

Member

Right, sel() imposes the dimensions and coordinates of the indexer rather than checking for alignment. It is not obvious to me what the inverse of that would be!

  try:
      index = self.get_index(dim)
  except KeyError as err:
15 changes: 15 additions & 0 deletions xarray/tests/test_dataset.py
@@ -2980,6 +2980,21 @@ def test_drop_multiindex_level(self) -> None:
actual = data.drop_vars("level_1")
assert_identical(expected, actual)

+ def test_drop_multiindex_labels(self) -> None:
+     data = create_test_multiindex()
+     mindex = pd.MultiIndex.from_tuples(
+         [
+             ("a", 2),
+             ("b", 1),
+             ("b", 2),
+         ],
+         names=("level_1", "level_2"),
+     )
+     expected = Dataset({}, Coordinates.from_pandas_multiindex(mindex, "x"))
+
+     actual = data.drop_sel(x=("a", 1))
+     assert_identical(expected, actual)

  def test_drop_index_labels(self) -> None:
      data = Dataset({"A": (["x", "y"], np.random.randn(2, 3)), "x": ["a", "b"]})
