Description
What happened?
There appears to be differing behaviour between `sel` and `drop_sel` for MultiIndexes where the index contains a mix of types (i.e. str, int): `drop_sel` raises an error, while `sel` behaves as expected.
What did you expect to happen?
The DataArray to be returned with the element matching the index dropped.
Minimal Complete Verifiable Example
# /// script
# requires-python = ">=3.11"
# dependencies = [
# "xarray[complete]@git+https://github.com/pydata/xarray.git@main",
# ]
# ///
#
# This script automatically imports the development branch of xarray to check for issues.
# Please delete this header if you have _not_ tested this script with `uv run`!
import xarray as xr
xr.show_versions()
# your reproducer code ...
import numpy as np
# Create a DataArray with a Multiindex with mixed types.
arr = xr.DataArray(
    np.arange(6).reshape(2, 3),
    coords=[("x", ["a", "b"]), ("y", [0, 1, 2])],
)
stacked = arr.stack(z=("x", "y"))
# Select a single item using the multiindex
selected_stack = stacked.sel(z=("a",0))
# Try and delete an item using the multiindex
dropped_stack = stacked.drop_sel(z=("a",0))
Steps to reproduce
A run of the MCVE (`uv run issue.py`).
MVCE confirmation
- Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- Complete example — the example is self-contained, including all data and the text of any traceback.
- Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- New issue — a search of GitHub Issues suggests this is not a duplicate.
- Recent environment — the issue occurs with the latest version of xarray and its dependencies.
Relevant log output
/data/users/aled.owen/.cache/uv/environments-v2/xarray-drop-sel-7f718ae04b721fc0/lib/python3.13/site-packages/xarray/core/dataset.py:6086: PerformanceWarning: dropping on a non-lexsorted multi-index without a level parameter may impact performance.
new_index = index.drop(labels_for_dim, errors=errors)
Traceback (most recent call last):
File "/data/users/aled.owen/.cache/uv/environments-v2/xarray-drop-sel-7f718ae04b721fc0/lib/python3.13/site-packages/pandas/core/indexes/base.py", line 3812, in get_loc
return self._engine.get_loc(casted_key)
~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^
File "pandas/_libs/index.pyx", line 167, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 196, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 7088, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 7096, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: np.str_('0')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/data/users/aled.owen/repos/ao-scratch_worktrees/main/scripts/xarray_drop_sel.py", line 29, in <module>
dropped_stack = stacked.drop_sel(z=("a",0))
File "/data/users/aled.owen/.cache/uv/environments-v2/xarray-drop-sel-7f718ae04b721fc0/lib/python3.13/site-packages/xarray/core/dataarray.py", line 3364, in drop_sel
ds = self._to_temp_dataset().drop_sel(labels, errors=errors)
File "/data/users/aled.owen/.cache/uv/environments-v2/xarray-drop-sel-7f718ae04b721fc0/lib/python3.13/site-packages/xarray/core/dataset.py", line 6086, in drop_sel
new_index = index.drop(labels_for_dim, errors=errors)
File "/data/users/aled.owen/.cache/uv/environments-v2/xarray-drop-sel-7f718ae04b721fc0/lib/python3.13/site-packages/pandas/core/indexes/multi.py", line 2438, in drop
loc = self.get_loc(level_codes)
File "/data/users/aled.owen/.cache/uv/environments-v2/xarray-drop-sel-7f718ae04b721fc0/lib/python3.13/site-packages/pandas/core/indexes/multi.py", line 3059, in get_loc
loc = self._get_level_indexer(key, level=0)
File "/data/users/aled.owen/.cache/uv/environments-v2/xarray-drop-sel-7f718ae04b721fc0/lib/python3.13/site-packages/pandas/core/indexes/multi.py", line 3410, in _get_level_indexer
idx = self._get_loc_single_level_index(level_index, key)
File "/data/users/aled.owen/.cache/uv/environments-v2/xarray-drop-sel-7f718ae04b721fc0/lib/python3.13/site-packages/pandas/core/indexes/multi.py", line 2999, in _get_loc_single_level_index
return level_index.get_loc(key)
~~~~~~~~~~~~~~~~~~~^^^^^
File "/data/users/aled.owen/.cache/uv/environments-v2/xarray-drop-sel-7f718ae04b721fc0/lib/python3.13/site-packages/pandas/core/indexes/base.py", line 3819, in get_loc
raise KeyError(key) from err
KeyError: np.str_('0')
Anything else we need to know?
This looks to be linked to the array conversion here, which changes the type of the label being looked up in the index:
Line 6079 in b5e4b0e
labels_for_dim = np.asarray(labels_for_dim)
I need to look at the usage of `pandas.Index.drop` and `pandas.MultiIndex.drop` to understand the implications of this conversion before proposing a fix.
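To illustrate the suspected cause, here is a minimal NumPy-only sketch of what `np.asarray` does to a mixed-type tuple label. It shows why the traceback reports `np.str_('0')` rather than the integer `0`. The object-array variant at the end is only a hypothetical way to keep the tuple intact, not a proposed fix for xarray, and the variable names are illustrative.

```python
import numpy as np

label = ("a", 0)

# Converting a mixed (str, int) tuple coerces every element to a string,
# so the integer level value 0 becomes the string "0".
coerced = np.asarray(label)
print(coerced.dtype.kind)  # "U" (unicode string dtype)
print(coerced.tolist())    # ["a", "0"]

# Hypothetical alternative (an assumption, not xarray's code): store the
# tuple as a single element of an object array so its types are preserved.
preserved = np.empty(1, dtype=object)
preserved[0] = label
print(preserved[0])        # ("a", 0)
```

With the coerced array, each element is then looked up as a separate label, which would explain the failed lookup of `'0'` in the traceback above.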
Environment
INSTALLED VERSIONS
commit: None
python: 3.13.5 | packaged by conda-forge | (main, Jun 16 2025, 08:27:50) [GCC 13.3.0]
python-bits: 64
OS: Linux
OS-release: 5.14.0-570.46.1.el9_6.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: ('en_GB', 'UTF-8')
libhdf5: 1.14.6
libnetcdf: 4.9.3
xarray: 2025.10.2.dev13+gb5e4b0e02
pandas: 2.3.3
numpy: 2.3.4
scipy: 1.16.2
netCDF4: 1.7.3
pydap: 3.5.8
h5netcdf: 1.7.2
h5py: 3.15.1
zarr: 3.1.3
cftime: 1.6.5
nc_time_axis: 1.4.1
iris: None
bottleneck: 1.6.0
dask: 2025.10.0
distributed: 2025.10.0
matplotlib: 3.10.7
cartopy: 0.25.0
seaborn: 0.13.2
numbagg: 0.9.3
fsspec: 2025.9.0
cupy: None
pint: None
sparse: 0.17.0
flox: 0.10.7
numpy_groupies: 0.11.3
setuptools: None
pip: None
conda: None
pytest: None
mypy: None
IPython: None
sphinx: None