Dataset constructor with DataArray triggers computation #4529

@eric-czech

Is it intentional that creating a Dataset with a DataArray and dimension names for a single variable causes computation of that variable? In other words, why does xr.Dataset(dict(a=('d0', xr.DataArray(da.random.random(10))))) cause the dask array to compute?

A longer example:

import dask.array as da
import xarray as xr

x = da.random.randint(1, 10, size=(100, 25))
ds = xr.Dataset(dict(a=xr.DataArray(x, dims=('x', 'y'))))
type(ds.a.data)
# dask.array.core.Array

# Recreate the dataset with the same array, but also redefine the dimensions
ds2 = xr.Dataset(dict(a=(('x', 'y'), ds.a)))
type(ds2.a.data)
# numpy.ndarray
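
A possible workaround, sketched here as an assumption rather than as confirmed intended usage: pass the underlying dask array through the .data attribute instead of the DataArray itself. The (dims, data) tuple then receives a plain dask array, which should stay lazy. The is_lazy name is only for illustration.

```python
import dask.array as da
import xarray as xr

x = da.random.randint(1, 10, size=(100, 25))
ds = xr.Dataset(dict(a=xr.DataArray(x, dims=("x", "y"))))

# Hand the (dims, data) tuple the wrapped dask array, not the DataArray,
# so nothing needs to be converted to numpy during construction.
ds2 = xr.Dataset(dict(a=(("x", "y"), ds.a.data)))

# Check that the variable in the new dataset is still backed by dask.
is_lazy = isinstance(ds2.a.data, da.Array)
print(is_lazy)
```

This does not answer whether the eager behaviour with a DataArray in the tuple is intentional; it only avoids that code path.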
