Cannot reindex onto a stacked MultiIndex via indexers — only reindex_like works #11368

@FBumann

What happened?

Reindexing a DataArray whose dimension is backed by a stacked pd.MultiIndex onto a different MultiIndex (e.g. the full index, where the array covers a subset) fails for every indexer form:

  1. a raw pd.MultiIndex as indexer value → ValueError: unmatched keys found in indexes and variables: {'kind', 'city'}
  2. a Coordinates.from_pandas_multiindex(...) mapping as indexers → AlignmentError: Indexer has dimensions ('station',) that are different from that to be indexed along 'city'
  3. the MultiIndex-backed coordinate variable as indexer value → same ValueError as 1.

The only working path is reindex_like(template), where the template is a throwaway object constructed with Coordinates.from_pandas_multiindex. This shows that the alignment machinery supports the operation itself; it is just not reachable through the reindex() indexers API.

What did you expect to happen?

At least one of the indexer forms (ideally 1. and 3.) to be equivalent to the reindex_like call: a pd.MultiIndex indexer should be expanded into the dimension index plus its level coordinates internally, the same way Coordinates.from_pandas_multiindex and reindex_like do.
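To make the expected behaviour concrete, here is a minimal sketch of a user-side helper that expands a pd.MultiIndex into the dimension index plus level coordinates and then delegates to reindex_like. The name reindex_onto_multiindex and its signature are hypothetical and not part of xarray; the helper only wraps the one path reported as working in this issue.

```python
import numpy as np
import pandas as pd
import xarray as xr


def reindex_onto_multiindex(da, dim, target, fill_value=np.nan):
    """Hypothetical helper (not an xarray API): conform `da` to `target` along `dim`."""
    # Expand the MultiIndex into the dimension coordinate plus its level coordinates.
    coords = xr.Coordinates.from_pandas_multiindex(target, dim)
    # Throwaway template; its values are never read, only its index is used.
    template = xr.DataArray(np.zeros(len(target)), coords=coords, dims=dim)
    return da.reindex_like(template, fill_value=fill_value)


full = pd.MultiIndex.from_product([[1, 2], ["a", "b"]], names=("city", "kind"))
subset = pd.MultiIndex.from_tuples([(1, "a"), (2, "b")], names=("city", "kind"))
da = xr.DataArray(
    [10.0, 20.0],
    coords=xr.Coordinates.from_pandas_multiindex(subset, "station"),
    dims="station",
)

result = reindex_onto_multiindex(da, "station", full, fill_value=0.0)
print(result.to_series())
```

Ideally, da.reindex({"station": full}, fill_value=0.0) would do this expansion internally, so no such wrapper would be needed.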

Minimal Complete Verifiable Example

import numpy as np
import pandas as pd
import xarray as xr
from xarray import Coordinates, DataArray

# A DataArray indexed by a stacked MultiIndex dimension "station",
# covering a SUBSET of the full index
full = pd.MultiIndex.from_product([[1, 2], ["a", "b"]], names=("city", "kind"))
subset = pd.MultiIndex.from_tuples([(1, "a"), (2, "b")], names=("city", "kind"))

da = DataArray(
    [10.0, 20.0],
    coords=Coordinates.from_pandas_multiindex(subset, "station"),
    dims="station",
)

# GOAL: conform `da` to the full MultiIndex, filling missing entries.

# 1) raw pd.MultiIndex as indexer
try:
    da.reindex({"station": full}, fill_value=0.0)
except Exception as e:
    print(f"1) raw pd.MultiIndex indexer:\n   {type(e).__name__}: {e}\n")

# 2) Coordinates.from_pandas_multiindex as the indexers mapping
try:
    da.reindex(Coordinates.from_pandas_multiindex(full, "station"), fill_value=0.0)
except Exception as e:
    print(f"2) Coordinates as indexers:\n   {type(e).__name__}: {e}\n")

# 3) the MultiIndex-backed coordinate variable as indexer
full_coords = Coordinates.from_pandas_multiindex(full, "station")
try:
    da.reindex({"station": full_coords["station"]}, fill_value=0.0)
except Exception as e:
    print(f"3) MI-backed coord variable indexer:\n   {type(e).__name__}: {e}\n")

# 4) reindex_like with a template object carrying the same Coordinates -> WORKS
template = DataArray(np.zeros(len(full)), coords=full_coords, dims="station")
result = da.reindex_like(template, fill_value=0.0)
print("4) reindex_like(template) works:")
print(result.to_series())

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example runs when copied & pasted into an empty Python session (output included below).
  • New issue — a search of GitHub Issues suggests this is not a duplicate. #10347 ("reindex fails on Dataset from MultiIndex DataFrame with RuntimeError") is related, but it covers a level-of-MultiIndex coord as indexer and was fixed in 2025.x; this issue is about the stacked MultiIndex dimension itself.

Relevant log output

1) raw pd.MultiIndex indexer:
   ValueError: unmatched keys found in indexes and variables: {'kind', 'city'}

2) Coordinates as indexers:
   AlignmentError: Indexer has dimensions ('station',) that are different from that to be indexed along 'city'

3) MI-backed coord variable indexer:
   ValueError: unmatched keys found in indexes and variables: {'kind', 'city'}

4) reindex_like(template) works:
city  kind
1     a       10.0
      b        0.0
2     a        0.0
      b       20.0
dtype: float64

Anything else we need to know?

Use case: in linopy (and PyPSA's multi-investment models, where snapshots are a (period, timestep) MultiIndex), we conform user input to a target index that exists only as coordinates — there is no object to reindex_like against, so we have to construct a throwaway template just to perform the reindex.
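For illustration, this is roughly what the workaround looks like when the target exists only as a Coordinates object. The sketch assumes that a data-free Dataset built from those coordinates is accepted as a reindex_like template; that variant was not part of the verified repro above (which used a DataArray template), and the name conform_to_coords is hypothetical.

```python
import pandas as pd
import xarray as xr


def conform_to_coords(da, target_coords, fill_value=0.0):
    """Hypothetical helper: reindex `da` onto indexes that exist only as Coordinates."""
    # Assumption: a Dataset with coordinates but no data variables works as a
    # template, which avoids allocating a dummy data array.
    template = xr.Dataset(coords=target_coords)
    return da.reindex_like(template, fill_value=fill_value)


full = pd.MultiIndex.from_product([[1, 2], ["a", "b"]], names=("city", "kind"))
subset = pd.MultiIndex.from_tuples([(1, "a"), (2, "b")], names=("city", "kind"))
da = xr.DataArray(
    [10.0, 20.0],
    coords=xr.Coordinates.from_pandas_multiindex(subset, "station"),
    dims="station",
)

target_coords = xr.Coordinates.from_pandas_multiindex(full, "station")
conformed = conform_to_coords(da, target_coords)
print(conformed.to_series())
```

Either way, a template object has to be built purely to carry the index, which is the overhead a working reindex() indexer form would remove.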

This issue was discovered and written up with AI assistance (Claude Code); the repro and outputs are verified on a real environment.

Environment

INSTALLED VERSIONS
------------------
commit: None
python: 3.13.2 (main, Mar 17 2025, 21:26:38) [Clang 20.1.0 ]
python-bits: 64
OS: Darwin
OS-release: 25.2.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: (None, 'UTF-8')
libhdf5: 1.14.6
libnetcdf: 4.9.3
xarray: 2026.4.0
pandas: 2.3.3
numpy: 1.26.4
scipy: 1.17.1
netCDF4: 1.7.4
cftime: 1.6.5
bottleneck: 1.6.0
dask: 2026.3.0
fsspec: 2026.4.0
