memory bug with Dataset starting in version 2025.3.0 in combination with dask #10350
Open
Labels: plan to close (may be closeable, needs more eyeballs)
What happened?
Hi,
I think I found a memory bug that occurs with xarray 2025.3.0 and later whenever dask (any version) is also installed. The memory of the very first Dataset created is never released. Datasets created later are released as expected, so a workaround for me is to initialize a small Dataset at the beginning of the code.
What did you expect to happen?
A deleted Dataset should release its memory, as it does in xarray 2025.1.2 and older.
Minimal Complete Verifiable Example
MVCE confirmation
Relevant log output
Anything else we need to know?
Initializing a tiny Dataset at the top of the code mitigates the problem:
xr.Dataset({}, coords={ "a": [1]})
Funnily, even calling xr.show_versions() has the same mitigating effect.
It feels like the very first call to Dataset leaves a reference somewhere, so that Dataset is never picked up by the garbage collector.
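One way to probe the "leftover reference" hypothesis is to hold only a weak reference to an object, drop the strong reference, force a collection, and see whether the weak reference is dead. The sketch below is not the example from this report. `is_released` and `Payload` are hypothetical names introduced for illustration, and the xarray part assumes that numpy, xarray and dask are installed and that `Dataset` objects can be weakly referenced. It only checks whether the Dataset object itself survives, not where the memory goes.

```python
import gc
import weakref


def is_released(factory):
    """Return True if the object built by `factory` is freed once dropped."""
    obj = factory()
    ref = weakref.ref(obj)  # a weak reference does not keep obj alive
    del obj                 # drop the only strong reference we hold
    gc.collect()            # also collect objects trapped in reference cycles
    return ref() is None


class Payload:
    """Hypothetical stand-in for a large object, used as a sanity check."""


if __name__ == "__main__":
    print("plain object released:", is_released(Payload))

    try:
        import numpy as np
        import xarray as xr
    except ImportError:
        pass  # xarray part is optional; assumes xarray (and dask) are installed
    else:
        def make_dataset():
            # Assumed shape and size, chosen only to make the Dataset non-trivial.
            return xr.Dataset({"x": ("a", np.zeros(1_000_000))})

        print("first Dataset released:", is_released(make_dataset))
        print("second Dataset released:", is_released(make_dataset))
```

If the first check reports `False` and the second `True`, that would be consistent with something holding on to the first Dataset. If both report `True`, the retained memory is more likely held elsewhere, for example in the underlying arrays.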
Might be related to: #9807
But here we have a much simpler minimal example.
Environment
xarray: 2025.4.0
pandas: 2.2.3
numpy: 2.2.6
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
zarr: None
cftime: None
nc_time_axis: None
iris: None
bottleneck: None
dask: 2025.5.1
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: 2025.5.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 65.5.1
pip: 23.2.1
conda: None
pytest: None
mypy: None
IPython: None
sphinx: None