What happened?
I encountered an unexpected behavior while using xarray, which I would consider to be a bug.
Similar to other types/classes, I attempted to create a copy of a dataset by using the constructor of the xarray.Dataset class.
For some time, I did not notice any issues, but just today I realized that this creates a new dataset that holds all relevant information except for the .attrs of the dataset.
Specifically, I checked that the new dataset copies over the data vars from the provided dataset and reconstructs the coordinates, but completely loses the attributes.
I am using xarray version "2025.12.0" for my tests, but the relevant code in the __init__() function of xarray.Dataset at [dataset.py:lines 378-410](https://github.com/pydata/xarray/blob/main/xarray/core/dataset.py) has not changed on the main branch.
Looking at the code in the constructor, the first argument to the constructor is used as a source for variables in the new dataset and then the list of coordinates is reconstructed.
There is special treatment for the coords parameter being a Dataset, but not for the data_vars parameter.
What did you expect to happen?
Given that the constructor, for all intents and purposes, creates a shallow copy of the Dataset, I expected it to copy the attributes as well.
I am aware of the .copy() function, but I would expect the call xr.Dataset(ds) to do one of three things: work as a copy constructor and copy the attributes from the input, fail the initialization, or emit a warning that the result may be unexpected.
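For comparison, here is a minimal sketch of the two ways I know of to keep the attributes today: calling .copy(), or passing attrs= to the constructor by hand. The variable name "a" and the dimension "x" are made up for illustration.

```python
import xarray as xr

# Small dataset with one data variable and one attribute (names are illustrative).
ds = xr.Dataset({"a": ("x", [1, 2, 3])}, attrs={"magic_key": True})

# Dataset.copy() is shallow by default and keeps the attributes.
shallow = ds.copy()

# The constructor keeps them only if they are passed explicitly.
explicit = xr.Dataset(ds, attrs=ds.attrs)

print(shallow.attrs)
print(explicit.attrs)
```

Both calls work, but the second one is easy to forget, which is the point of this report.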
Minimal Complete Verifiable Example
# /// script
# requires-python = ">=3.11"
# dependencies = [
# "xarray[complete]@git+https://github.com/pydata/xarray.git@main",
# ]
# ///
#
# This script automatically imports the development branch of xarray to check for issues.
# Please delete this header if you have _not_ tested this script with `uv run`!
import xarray as xr
xr.show_versions()
# your reproducer code ...
ds = xr.Dataset()
ds.attrs['magic_key'] = True
assert 'magic_key' in ds.attrs # This is successful
# Create a copy:
ds2 = xr.Dataset(ds)
assert 'magic_key' in ds2.attrs # This is unsuccessful, the attrs are completely empty
Steps to reproduce
- Create a dataset
ds = xr.Dataset()
- Set an attribute on the dataset
ds.attrs['magic_key']=True
- Create a new dataset from the old dataset
ds2 = xr.Dataset(ds)
- The new dataset has lost all attributes.
MVCE confirmation
Relevant log output
Anything else we need to know?
I would suggest changing the constructor so that it can effectively be used as a shallow copy of the dataset, e.g. so that it can be used in copy operations of the form type(ds)(ds), similar to other datatypes.
For this, a simple change to the constructor would suffice, at line 398 in my version or line 404 on the main branch at the time of writing:
self._attrs = dict(attrs) if attrs else None
could be replaced by
self._attrs = dict(attrs) if attrs else dict(data_vars.attrs) if isinstance(data_vars, Dataset) else None
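Python groups chained conditional expressions from the right, so the suggested line reads as "explicit attrs first, otherwise the source dataset's attrs, otherwise None". A minimal sketch of that fallback order, using a stand-in class instead of xarray (FakeDataset and resolve_attrs are hypothetical names, not part of xarray):

```python
class FakeDataset:
    """Hypothetical stand-in for xarray.Dataset; it only carries attrs."""

    def __init__(self, attrs=None):
        self.attrs = dict(attrs) if attrs else {}


def resolve_attrs(data_vars, attrs=None):
    # Same expression as the suggested replacement, with the implicit
    # right-to-left grouping made explicit by parentheses.
    return (
        dict(attrs)
        if attrs
        else (dict(data_vars.attrs) if isinstance(data_vars, FakeDataset) else None)
    )


source = FakeDataset({"magic_key": True})

print(resolve_attrs(source))                # attrs taken from the source
print(resolve_attrs(source, {"other": 1}))  # explicit attrs take precedence
print(resolve_attrs({"a": 1}))              # plain mapping, no attrs: None
```

One side effect visible in this sketch: a source with empty attrs yields an empty dict rather than None.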
If there is a reason why the constructor should not behave like a copy constructor, I would suggest emitting a warning when a dataset with non-empty ds.attrs is supplied and no attrs parameter is provided, because the current behavior may be unexpected.
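The warning variant could look roughly like the following. This is only a sketch of the check itself, written against any object that has an attrs attribute; warn_if_attrs_dropped is a hypothetical helper, and the message text and warning category are assumptions, not existing xarray behavior.

```python
import warnings
from types import SimpleNamespace


def warn_if_attrs_dropped(data_vars, attrs=None):
    """Hypothetical check: warn if the source has attrs and none were passed."""
    source_attrs = getattr(data_vars, "attrs", None)
    if attrs is None and source_attrs:
        warnings.warn(
            "source object has attrs that will not be copied; "
            "pass attrs= explicitly or use .copy()",
            UserWarning,
            stacklevel=2,
        )
        return True
    return False


# SimpleNamespace stands in for a Dataset that carries attributes.
source = SimpleNamespace(attrs={"magic_key": True})

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    dropped = warn_if_attrs_dropped(source)

print(dropped, len(caught))
```

Passing attrs explicitly, even an empty dict, silences the check in this sketch, so existing code that deliberately resets the attributes would not be affected.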
Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.12.2 (tags/v3.12.2:6abddd9, Feb 6 2024, 21:26:36) [MSC v.1937 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: AMD64 Family 25 Model 33 Stepping 2, AuthenticAMD
byteorder: little
LC_ALL: None
LANG: None
LOCALE: ('de_DE', 'cp1252')
libhdf5: 1.14.6
libnetcdf: None
xarray: 2025.12.0
pandas: 2.3.3
numpy: 2.3.5
scipy: 1.16.3
netCDF4: None
pydap: None
h5netcdf: 1.7.3
h5py: 3.15.1
zarr: None
cftime: None
nc_time_axis: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.10.8
cartopy: None
seaborn: 0.13.2
numbagg: None
fsspec: 2025.12.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 80.9.0
pip: 25.3
conda: None
pytest: 9.0.2
mypy: 1.18.2
IPython: 9.8.0
sphinx: 8.2.3