Description
What happened?
Calling to_netcdf without an engine argument fails when only xarray and h5netcdf are installed. The default-engine lookup appears to check only for netCDF4 and scipy, despite the general support for h5netcdf elsewhere.
What did you expect to happen?
I expected to_netcdf to work without specifying an engine when the appropriate packages are installed.
Minimal Complete Verifiable Example
# uvx --with "xarray>=2025" --with h5netcdf ipython
import xarray as xr
da = xr.DataArray([0], dims=('x',), coords={'x': [1]})
da.to_netcdf("test.nc")
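For comparison, naming the engine explicitly should sidestep the failing default lookup, since to_netcdf only asks for a default when engine is None. A minimal sketch of that stopgap; the helper name save_with_engine is hypothetical and not part of xarray, and the demo only runs when both packages are importable:

```python
import os
import tempfile
from importlib.util import find_spec


def save_with_engine(obj, path, engine="h5netcdf"):
    """Hypothetical helper: write obj to path with an explicitly named engine."""
    # Passing engine= means to_netcdf never has to guess a default backend.
    return obj.to_netcdf(path, engine=engine)


# Only run the demo when both packages are importable in this environment.
if find_spec("xarray") is not None and find_spec("h5netcdf") is not None:
    import xarray as xr

    da = xr.DataArray([0], dims=("x",), coords={"x": [1]})
    with tempfile.TemporaryDirectory() as tmp:
        save_with_engine(da, os.path.join(tmp, "test.nc"))
```

This is only a workaround for the caller; it does not change which engine is picked when none is given.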
MVCE confirmation
- Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- Complete example — the example is self-contained, including all data and the text of any traceback.
- Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- New issue — a search of GitHub Issues suggests this is not a duplicate.
- Recent environment — the issue occurs with the latest version of xarray and its dependencies.
Relevant log output
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
File ~/.cache/uv/archive-v0/5EymGo2UVcKwq9ooGHMw4/lib/python3.13/site-packages/xarray/backends/api.py:126, in _get_default_engine_netcdf()
125 try:
--> 126 import netCDF4 # noqa: F401
128 engine = "netcdf4"
ModuleNotFoundError: No module named 'netCDF4'
During handling of the above exception, another exception occurred:
ModuleNotFoundError Traceback (most recent call last)
File ~/.cache/uv/archive-v0/5EymGo2UVcKwq9ooGHMw4/lib/python3.13/site-packages/xarray/backends/api.py:131, in _get_default_engine_netcdf()
130 try:
--> 131 import scipy.io.netcdf # noqa: F401
133 engine = "scipy"
ModuleNotFoundError: No module named 'scipy'
The above exception was the direct cause of the following exception:
ValueError Traceback (most recent call last)
Cell In[6], line 1
----> 1 da.to_netcdf("test.nc")
File ~/.cache/uv/archive-v0/5EymGo2UVcKwq9ooGHMw4/lib/python3.13/site-packages/xarray/core/dataarray.py:4204, in DataArray.to_netcdf(self, path, mode, format, group, engine, encoding, unlimited_dims, compute, invalid_netcdf, auto_complex)
4200 else:
4201 # No problems with the name - so we're fine!
4202 dataset = self.to_dataset()
-> 4204 return to_netcdf( # type: ignore[return-value] # mypy cannot resolve the overloads:(
4205 dataset,
4206 path,
4207 mode=mode,
4208 format=format,
4209 group=group,
4210 engine=engine,
4211 encoding=encoding,
4212 unlimited_dims=unlimited_dims,
4213 compute=compute,
4214 multifile=False,
4215 invalid_netcdf=invalid_netcdf,
4216 auto_complex=auto_complex,
4217 )
File ~/.cache/uv/archive-v0/5EymGo2UVcKwq9ooGHMw4/lib/python3.13/site-packages/xarray/backends/api.py:1871, in to_netcdf(dataset, path_or_file, mode, format, group, engine, encoding, unlimited_dims, compute, multifile, invalid_netcdf, auto_complex)
1869 elif isinstance(path_or_file, str):
1870 if engine is None:
-> 1871 engine = _get_default_engine(path_or_file)
1872 path_or_file = _normalize_path(path_or_file)
1873 else: # file-like object
File ~/.cache/uv/archive-v0/5EymGo2UVcKwq9ooGHMw4/lib/python3.13/site-packages/xarray/backends/api.py:148, in _get_default_engine(path, allow_remote)
146 return _get_default_engine_gz()
147 else:
--> 148 return _get_default_engine_netcdf()
File ~/.cache/uv/archive-v0/5EymGo2UVcKwq9ooGHMw4/lib/python3.13/site-packages/xarray/backends/api.py:135, in _get_default_engine_netcdf()
133 engine = "scipy"
134 except ImportError as err:
--> 135 raise ValueError(
136 "cannot read or write netCDF files without "
137 "netCDF4-python or scipy installed"
138 ) from err
139 return engine
ValueError: cannot read or write netCDF files without netCDF4-python or scipy installed
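The log contains three linked tracebacks because of how Python chains exceptions: the scipy error is raised while the netCDF4 error is still being handled, and the final ValueError is raised "from" the scipy error. A small stand-alone sketch of that pattern, with the missing imports simulated (failing_lookup is a hypothetical stand-in, not xarray code):

```python
def failing_lookup():
    """Hypothetical stand-in for the lookup in the log; both imports are simulated as missing."""
    try:
        raise ModuleNotFoundError("No module named 'netCDF4'")
    except ImportError:
        try:
            # Raised while the first error is being handled: implicit chaining.
            raise ModuleNotFoundError("No module named 'scipy'")
        except ImportError as err:
            # "from err" records the scipy error as the direct cause.
            raise ValueError(
                "cannot read or write netCDF files without "
                "netCDF4-python or scipy installed"
            ) from err


try:
    failing_lookup()
except ValueError as exc:
    caught = exc

scipy_error = caught.__cause__          # "direct cause" in the log
netcdf4_error = scipy_error.__context__  # "during handling of" in the log
```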
Anything else we need to know?
Is there any reason not to change the current default engine logic to be something like
def _get_default_engine_netcdf() -> Literal["netcdf4", "h5netcdf", "scipy"]:
    engine: Literal["netcdf4", "h5netcdf", "scipy"]
    try:
        import netCDF4  # noqa: F401
        engine = "netcdf4"
    except ImportError:  # pragma: no cover
        try:
            import h5netcdf  # noqa: F401
            engine = "h5netcdf"
        except ImportError:
            try:
                import scipy.io.netcdf  # noqa: F401
                engine = "scipy"
            except ImportError as err:
                raise ValueError(
                    "cannot read or write netCDF files without "
                    "netCDF4-python, h5netcdf, or scipy installed"
                ) from err
    return engine
(besides the triple nested-ness)
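One way to avoid the nesting might be a loop over candidate modules in the same priority order. This is only a sketch, not a drop-in patch: pick_default_engine, _CANDIDATES, and the is_installed parameter are hypothetical names, and it uses importlib.util.find_spec on the top-level packages rather than actually importing them, so a present-but-broken install would not be detected the way the try/except version detects it.

```python
from importlib.util import find_spec

# (module to look for, engine name), in the same order as the proposal above.
_CANDIDATES = (
    ("netCDF4", "netcdf4"),
    ("h5netcdf", "h5netcdf"),
    ("scipy", "scipy"),
)


def pick_default_engine(is_installed=None):
    """Return the first engine whose backing module is available."""
    if is_installed is None:
        # Default check: is the top-level module findable, without importing it?
        def is_installed(module):
            return find_spec(module) is not None

    for module, engine in _CANDIDATES:
        if is_installed(module):
            return engine
    raise ValueError(
        "cannot read or write netCDF files without "
        "netCDF4-python, h5netcdf, or scipy installed"
    )
```

With only h5netcdf available, pick_default_engine(lambda m: m == "h5netcdf") returns "h5netcdf"; the injectable check is there purely to make the ordering easy to test.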
Environment
In [2]: xr.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.13.1 (main, Jan 14 2025, 23:31:50) [Clang 19.1.6 ]
python-bits: 64
OS: Darwin
OS-release: 24.5.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.6
libnetcdf: None
xarray: 2025.4.0
pandas: 2.3.0
numpy: 2.2.6
scipy: None
netCDF4: None
pydap: None
h5netcdf: 1.6.1
h5py: 3.13.0
zarr: None
cftime: None
nc_time_axis: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: None
pip: None
conda: None
pytest: None
mypy: None
IPython: 9.3.0
sphinx: None