
Allow Dataset in numpy array with dtype=object #10044


Discussed in #10043

Originally posted by telearis February 12, 2025

Situation

xarray.Dataset explicitly refuses to be placed into a numpy.ndarray, even when dtype=object is set:

import numpy as np
import xarray as xr

a = np.array([xr.Dataset({'a': [1, 2, 3]})], dtype=object)

This code fails with the following error:
"TypeError: cannot directly convert an xarray.Dataset into a numpy array. Instead, create an xarray.DataArray first, either with indexing on the Dataset or by invoking the to_dataarray() method."

However, this works:

a = np.empty((1,), dtype=object)
a[0] = xr.Dataset({'a': [1, 2, 3]})

Proposal:

xarray.Dataset should not object to being placed into a numpy.ndarray when dtype=object.

Reason

  • np.array([<whatever>], dtype=object) should work for any object placed into a numpy array.
  • Assignment via index already works (see above), so the current behavior is inconsistent.
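The index-assignment route can be wrapped in a small helper until the constructor path behaves the same way. The following is only a sketch: `to_object_array` is a hypothetical name, not part of numpy or xarray, and it is demonstrated here with plain Python objects. Based on the working snippet above, I assume it would accept xarray.Dataset items as well.

```python
import numpy as np


def to_object_array(items):
    """Pack arbitrary Python objects into a 1-D object array, one object per slot.

    Hypothetical helper: it allocates an empty object array and fills it by
    integer index, so numpy stores each item as-is instead of trying to
    convert it into array data.
    """
    items = list(items)
    out = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        out[i] = item  # element-wise assignment, no array conversion of the item
    return out


# Demonstration with plain objects; nested lists stay intact as single elements.
packed = to_object_array([[1, 2, 3], {"a": 1}, "text"])
print(packed.shape, packed.dtype)
```

The helper fills the array element by element because a bulk constructor call would again ask each item to convert itself into array data.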

Application

I came across this when using xarray.DataArray to store results from ray tracing. The data contains (among other data) the points where rays have been reflected or diffracted. The number of such points varies from ray to ray, so I wanted to store them in separate xarray.Dataset objects and hold those in an xarray.DataArray with dtype=object (built via a numpy array).

Workaround

I currently work around this limitation by subclassing xarray.Dataset:

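class MyDataset(xr.Dataset):
    __slots__ = ()

    def __array__(self, dtype=None, copy=None):
        # Only the object-dtype case without a forced copy is handled here.
        assert dtype == object
        assert (copy is None) or (not copy)

        # Wrap the Dataset itself in a 0-d object array instead of converting its data.
        x = np.array(None, dtype=object)
        x.flat[0] = self

        return x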
