Regression in DataArrays created from Pandas

### What happened?

Given:
```
index1 = np.array([1, 2, 3])
index2 = np.array([1, 2, 4])
srs = pd.Series(index=index1, data=1).convert_dtypes()
arr = srs.to_xarray()
```
Now consider:
```
>>> arr.reindex(index=index2)
```
In xarray 2023.1.0 this gave a reasonable (if weakly typed) result:
```
<xarray.DataArray (index: 3)>
array([1, 1, nan], dtype=object)
Coordinates:
  * index    (index) int64 1 2 4
```
While upgrading to xarray 2025.3.x + pandas 2.x, my colleagues found it now raises:
```
TypeError: Cannot interpret 'Int64Dtype()' as a data type
```

### What did you expect to happen?

Ideally, the result would be:
```
<xarray.DataArray (index: 3)> Size: 27B
PandasExtensionArray(array=<IntegerArray>
[1, 1, <NA>]
Length: 3, dtype: Int64)
Coordinates:
  * index    (index) int64 24B 1 2 4
```
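For comparison, here is a minimal sketch of the same reindex done in pandas alone, which keeps the nullable `Int64` dtype and fills the missing label with `<NA>`. It only shows the values I'd hope xarray would produce; it says nothing about how xarray should get there.

```python
import numpy as np
import pandas as pd

index1 = np.array([1, 2, 3])
index2 = np.array([1, 2, 4])
srs = pd.Series(index=index1, data=1).convert_dtypes()  # dtype becomes Int64

# Label 4 is absent from index1, so pandas fills it with <NA>
# and the result stays Int64 instead of falling back to object.
reindexed = srs.reindex(index2)
print(reindexed)
```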


### Minimal Complete Verifiable Example

```Python
import numpy as np
import pandas as pd
import xarray as xr
index1 = np.array([1, 2, 3])
index2 = np.array([1, 2, 4])
srs = pd.Series(index=index1, data=1).convert_dtypes()
arr = srs.to_xarray()
arr.reindex(index=index2)
```

### MVCE confirmation

- [ ] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [ ] Complete example — the example is self-contained, including all data and the text of any traceback.
- [ ] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [ ] New issue — a search of GitHub Issues suggests this is not a duplicate.
- [ ] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

### Anything else we need to know?

The difference is that `arr.dtype` is now `pd.Int64Dtype()` rather than `np.dtype("object")`, thanks to https://github.com/pydata/xarray/pull/8723. While arguably an improvement in typing, the xarray core doesn't seem ready to handle the former. In this case, `core.dtypes.maybe_promote()` passes a pandas dtype straight to `np.issubdtype`, which only understands NumPy dtypes and raises the `TypeError` shown above.
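The underlying NumPy behaviour can be reproduced without xarray. The sketch below also shows one possible guard; `is_numpy_subdtype` is a hypothetical helper of mine, not an xarray function, and it is not necessarily how `maybe_promote()` ought to be fixed.

```python
import numpy as np
import pandas as pd


def is_numpy_subdtype(dtype, kind) -> bool:
    """Hypothetical guard: like np.issubdtype, but False for pandas extension dtypes."""
    if pd.api.types.is_extension_array_dtype(dtype):
        return False
    return bool(np.issubdtype(dtype, kind))


ext_dtype = pd.Int64Dtype()

# np.issubdtype cannot convert a pandas extension dtype into a NumPy dtype.
try:
    np.issubdtype(ext_dtype, np.integer)
    raised = False
except TypeError as err:
    raised = True
    print(f"TypeError: {err}")

print(is_numpy_subdtype(ext_dtype, np.integer))          # guarded call does not raise
print(is_numpy_subdtype(np.dtype("int64"), np.integer))  # plain NumPy dtypes still work
```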

Patching this immediate issue is more revealing: `reindex` then fails when `duck_array_ops.where(condition, x, y)` tries to coerce `x` & `y` to a common dtype.  The new extension-array code in `as_shared_dtype` is not at all general: when `y` is a scalar (the `fill_value` from the reindex operation), it simply gives up.
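To illustrate the array-plus-scalar case that `where` needs here, below is a rough pandas-only sketch. `where_keep_extension_dtype` is a made-up name, and I'm assuming the fill value is something the extension dtype can hold (such as `pd.NA`); it is not the approach in my fix, just the shape of the problem.

```python
import numpy as np
import pandas as pd


def where_keep_extension_dtype(condition, x, fill_value):
    """Hypothetical helper: keep x where condition is True, else use a scalar fill_value."""
    result = x.copy()  # do not mutate the caller's array
    result[~np.asarray(condition, dtype=bool)] = fill_value
    return result


x = pd.array([1, 1, 1], dtype="Int64")
condition = np.array([True, True, False])

# The scalar fill is absorbed into the extension array, so the dtype survives.
filled = where_keep_extension_dtype(condition, x, pd.NA)
print(filled)
```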

Once I understood the cause of the `reindex` issue above, producing more failures, some of them much more worrisome, was trivial:
```
>>> arr + 5

TypeError: unsupported operand type(s) for +: 'PandasExtensionArray' and 'int'

>>> np.add(arr, 5)

TypeError: 'PandasExtensionArray' object is not callable

>>> arr.fillna(0)

AttributeError: 'int' object has no attribute 'dtype'
```
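For contrast, the equivalent operations on the original pandas Series all succeed and keep the `Int64` dtype. This is only a sketch of the pandas-side behaviour, using the same `srs` construction as in the example above:

```python
import numpy as np
import pandas as pd

srs = pd.Series(index=np.array([1, 2, 3]), data=1).convert_dtypes()

added = srs + 5                             # arithmetic operator
via_ufunc = np.add(srs, 5)                  # NumPy ufunc dispatch
filled = srs.reindex([1, 2, 4]).fillna(0)   # fill the <NA> introduced by reindexing

for result in (added, via_ufunc, filled):
    print(result.dtype, result.tolist())
```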

I'd venture to say that the pandas `df.to_xarray()` / `srs.to_xarray()` methods have become foot-guns, bordering on unusable, for any data held in the `ExtensionArray` / `ExtensionDtype` types that pandas 2.x increasingly relies on, such as the nullable dtypes returned by `convert_dtypes()`.

The good news is I have a fix.  The bad news is it's pretty invasive, needing careful oversight from someone who actually knows what they're doing.  (Before this week I'd never used xarray, nor looked at the numpy / pandas source code.)

For now I might recommend excluding ALL numeric dtypes from being promoted to duck arrays, similar to what https://github.com/pydata/xarray/pull/9042 did for datetimes.  (Basically everything except Categoricals, which seem to be the one extension type with good coverage in the xarray test suite, and which don't support the vast majority of `ufunc`s regardless.)  That would at least allow people to safely continue using `to_xarray()` on modern versions of pandas, though you'd lose all the speed & type safety that @ilan-gold worked to achieve in 2024.5 & onward.
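Until something like that lands, a user-side analogue is to cast numeric extension dtypes back to NumPy dtypes before converting. The sketch below is a workaround under my own assumptions, not the xarray-side change described above; `to_numpy_backed` is a hypothetical helper, and falling back to `float64` whenever missing values are present is my choice.

```python
import numpy as np
import pandas as pd


def to_numpy_backed(srs: pd.Series) -> pd.Series:
    """Hypothetical helper: cast numeric extension dtypes back to NumPy dtypes."""
    dtype = srs.dtype
    # Leave NumPy-backed data, categoricals and non-numeric extension types untouched.
    if not pd.api.types.is_extension_array_dtype(dtype):
        return srs
    if isinstance(dtype, pd.CategoricalDtype):
        return srs
    if not pd.api.types.is_numeric_dtype(dtype):
        return srs
    # Missing values need a dtype that can represent NaN.
    if srs.isna().any():
        return srs.astype("float64")
    # Masked dtypes such as Int64 expose their NumPy equivalent as `numpy_dtype`.
    return srs.astype(getattr(dtype, "numpy_dtype", "float64"))


srs = pd.Series(index=np.array([1, 2, 3]), data=1).convert_dtypes()
plain = to_numpy_backed(srs)
with_gap = to_numpy_backed(pd.Series([1, None], dtype="Int64"))
print(plain.dtype, with_gap.dtype)
```

Calling `.to_xarray()` on the returned Series should then give a NumPy-backed DataArray, at the cost of the nullable-integer typing.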

### Environment

<details>

INSTALLED VERSIONS
------------------
commit: None
python: 3.12.10 | packaged by conda-forge | (main, Apr 10 2025, 22:21:13) [GCC 13.3.0]
python-bits: 64
OS: Linux
OS-release: 4.18.0-372.32.1.el8_6.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.2
libnetcdf: None

xarray: 2025.3.1
pandas: 2.2.3
numpy: 1.26.4
scipy: 1.15.2
netCDF4: None
pydap: None
h5netcdf: 1.6.1
h5py: 3.9.0
zarr: 3.0.6
cftime: None
nc_time_axis: None
iris: None
bottleneck: 1.4.2
dask: 2025.3.0
distributed: 2025.3.0
matplotlib: 3.10.1
cartopy: None
seaborn: 0.13.2
numbagg: 0.9.0
fsspec: 2024.9.0
cupy: 13.4.0
pint: None
sparse: 0.16.0
flox: None
numpy_groupies: None
setuptools: 78.1.0
pip: 25.0.1
conda: None
pytest: 8.3.5
mypy: 1.15.0
IPython: 8.35.0
sphinx: None

</details>


Regression in DataArrays created from Pandas #10301


Activity

ilan-gold commented on May 9, 2025

ilan-gold commented on May 9, 2025

dcherian commented on May 9, 2025

ilan-gold commented on May 9, 2025

richard-berg commented on May 9, 2025

dcherian commented on May 9, 2025

ilan-gold commented on May 9, 2025

ilan-gold commented on May 9, 2025

ilan-gold commented on May 10, 2025

richard-berg commented on May 30, 2025

ilan-gold commented on May 30, 2025

ilan-gold commented on May 30, 2025

ilan-gold commented on Jun 11, 2025
