create_default_indexes=False behaves poorly in DataTrees #11321

@BorisTheBrave

Description

What happened?

If a dataset is missing an index, it can use the index from further up the datatree.
If a dataset and its ancestors are all missing an index, then an index will be created from the dataset's own coordinates, but not from its ancestors' coordinates.

Together, these create an inconsistency. It would be more logical that, if an index cannot be found, a coordinate is searched for using the same logic, i.e. by walking up through the ancestors.

As it is, you can get different behaviour for create_default_indexes=True/False, when I feel the intention was that the flag just disables eager loading.
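
To make the two lookup rules concrete, here is a small pure-Python sketch. It is a toy model of the behaviour described above, not xarray's actual implementation: `Node`, `find_index_reported` and `find_index_proposed` are hypothetical names invented for illustration, and an "index" is just a dict from label to position.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy stand-in for a DataTree node (hypothetical, not xarray's data model)."""

    coords: dict[str, list] = field(default_factory=dict)   # coordinate values stored on this node
    indexes: dict[str, dict] = field(default_factory=dict)  # already-built indexes: label -> position
    parent: Node | None = None


def ancestry(node):
    """Yield the node itself, then each ancestor up to the root."""
    while node is not None:
        yield node
        node = node.parent


def build_index(values):
    """Build a label -> position mapping from coordinate values."""
    return {label: pos for pos, label in enumerate(values)}


def find_index_reported(node, dim):
    """Lookup as described in this report: indexes are inherited, coordinates are not."""
    for n in ancestry(node):
        if dim in n.indexes:
            return n.indexes[dim]
    # Fallback only considers the node's own coordinates.
    if dim in node.coords:
        return build_index(node.coords[dim])
    return None


def find_index_proposed(node, dim):
    """Suggested lookup: fall back to coordinates using the same ancestor walk."""
    for n in ancestry(node):
        if dim in n.indexes:
            return n.indexes[dim]
    for n in ancestry(node):
        if dim in n.coords:
            return build_index(n.coords[dim])
    return None


# Root has an "x" coordinate but no index yet (like create_default_indexes=False).
root = Node(coords={"x": [10, 20]})
child = Node(parent=root)

print(find_index_reported(child, "x"))  # no index found for the child
print(find_index_proposed(child, "x"))  # index built lazily from the root's coordinate
```

In this model the child only sees an index for "x" under the reported rule once the root's index has been built eagerly, which mirrors the difference between the two create_default_indexes settings.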

What did you expect to happen?

I would expect open_datatree(create_default_indexes=False) and open_datatree(create_default_indexes=True) to give the same results, differing only in when an index is built and whether it is persisted.

Minimal Complete Verifiable Example

# /// script
# requires-python = ">=3.11"
# dependencies = [
#   "xarray[complete]@git+https://github.com/pydata/xarray.git@main",
# ]
# ///
#
# This script automatically imports the development branch of xarray to check for issues.
# Please delete this header if you have _not_ tested this script with `uv run`!

import xarray as xr
xr.show_versions()

child = xr.Dataset(
    data_vars={
        "a": (["x"], [0, 1]),
    },
)

dt = xr.DataTree(
    dataset=xr.Dataset(
        data_vars={
            "b": (["x"], [5, 6]),
        },
        coords={
            "x": [10, 20],
        },
    ),
    children={"child": xr.DataTree(dataset=child)}
)


dt.to_zarr("test.zarr", mode="w")


dt2 = xr.open_datatree("test.zarr", create_default_indexes=True)
dt2.b.sel(x=10).values # 5
dt2.child.a.sel(x=10).values # 0


dt2 = xr.open_datatree("test.zarr", create_default_indexes=False)
dt2.b.sel(x=10).values # 5
dt2.child.a.sel(x=10).values # BoundsCheckError

dt2.ds = dt2.ds.assign_coords(xr.Coordinates({'x': [10, 20]}))
dt2.child.a.sel(x=10).values # 0

Steps to reproduce

No response

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

Traceback (most recent call last):
  File "/root/test/test_script.py", line 48, in <module>
    dt2.child.a.sel(x=10).values  # BoundsCheckError
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.uv/cache/environments-v2/test-script-6aac3b5ed3a4de9a/lib/python3.12/site-packages/xarray/core/dataarray.py", line 803, in values
    return self.variable.values
           ^^^^^^^^^^^^^^^^^^^^
  File "/.uv/cache/environments-v2/test-script-6aac3b5ed3a4de9a/lib/python3.12/site-packages/xarray/core/variable.py", line 555, in values
    return _as_array_or_item(self._data)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.uv/cache/environments-v2/test-script-6aac3b5ed3a4de9a/lib/python3.12/site-packages/xarray/core/variable.py", line 335, in _as_array_or_item
    data = np.asarray(data)
           ^^^^^^^^^^^^^^^^
  File "/.uv/cache/environments-v2/test-script-6aac3b5ed3a4de9a/lib/python3.12/site-packages/xarray/core/indexing.py", line 604, in __array__
    return np.asarray(self.get_duck_array(), dtype=dtype, copy=copy)
                      ^^^^^^^^^^^^^^^^^^^^^
  File "/.uv/cache/environments-v2/test-script-6aac3b5ed3a4de9a/lib/python3.12/site-packages/xarray/core/indexing.py", line 970, in get_duck_array
    duck_array = self.array.get_duck_array()
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.uv/cache/environments-v2/test-script-6aac3b5ed3a4de9a/lib/python3.12/site-packages/xarray/core/indexing.py", line 924, in get_duck_array
    return self.array.get_duck_array()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.uv/cache/environments-v2/test-script-6aac3b5ed3a4de9a/lib/python3.12/site-packages/xarray/core/indexing.py", line 764, in get_duck_array
    array = self.array[self.key]
            ~~~~~~~~~~^^^^^^^^^^
  File "/.uv/cache/environments-v2/test-script-6aac3b5ed3a4de9a/lib/python3.12/site-packages/xarray/backends/zarr.py", line 316, in __getitem__
    return indexing.explicit_indexing_adapter(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.uv/cache/environments-v2/test-script-6aac3b5ed3a4de9a/lib/python3.12/site-packages/xarray/core/indexing.py", line 1156, in explicit_indexing_adapter
    result = raw_indexing_method(raw_key.tuple)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.uv/cache/environments-v2/test-script-6aac3b5ed3a4de9a/lib/python3.12/site-packages/xarray/backends/zarr.py", line 279, in _getitem
    return self._array[key]
           ~~~~~~~~~~~^^^^^
  File "/.uv/cache/environments-v2/test-script-6aac3b5ed3a4de9a/lib/python3.12/site-packages/zarr/core/array.py", line 2832, in __getitem__
    return self.get_orthogonal_selection(pure_selection, fields=fields)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.uv/cache/environments-v2/test-script-6aac3b5ed3a4de9a/lib/python3.12/site-packages/zarr/core/array.py", line 3302, in get_orthogonal_selection
    indexer = OrthogonalIndexer(selection, self.shape, self.metadata.chunk_grid)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.uv/cache/environments-v2/test-script-6aac3b5ed3a4de9a/lib/python3.12/site-packages/zarr/core/indexing.py", line 920, in __init__
    dim_indexer = IntDimIndexer(dim_sel, dim_len, dim_chunk_len)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.uv/cache/environments-v2/test-script-6aac3b5ed3a4de9a/lib/python3.12/site-packages/zarr/core/indexing.py", line 387, in __init__
    object.__setattr__(self, "dim_sel", normalize_integer_selection(dim_sel, dim_len))
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.uv/cache/environments-v2/test-script-6aac3b5ed3a4de9a/lib/python3.12/site-packages/zarr/core/indexing.py", line 355, in normalize_integer_selection
    raise BoundsCheckError(msg)
zarr.errors.BoundsCheckError: index out of bounds for dimension with length 2

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS
------------------
commit: None
python: 3.11.11 (main, Mar 17 2025, 21:02:09) [Clang 20.1.0 ]
python-bits: 64
OS: Linux
OS-release: 5.15.0-139-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.6
libnetcdf: 4.9.3

xarray: 2026.4.0
pandas: 3.0.2
numpy: 2.4.4
scipy: 1.17.1
netCDF4: 1.7.4
pydap: None
h5netcdf: None
h5py: None
zarr: 3.1.6
cftime: 1.6.5
nc_time_axis: None
iris: None
bottleneck: None
dask: 2026.3.0
distributed: 2026.3.0
matplotlib: 3.10.9
cartopy: None
seaborn: 0.13.2
numbagg: None
fsspec: 2025.3.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 82.0.1
pip: 26.1
conda: None
pytest: 7.4.3
mypy: None
IPython: 9.10.1
sphinx: 7.3.7

Metadata

Metadata

Assignees

No one assigned

Labels

bug, needs triage