Description
What happened?
This issue relates to #9858, and captures the to_zarr append_dim behaviour noted by @TomNicholas in this comment.
What did you expect to happen?
The append_dim behaviour of Dataset.to_zarr should be consistent with xr.concat, which does not raise an error. For empty Dataset objects, this gives the result in the example Tom provided, with no time dimension, because no variables use that time dimension. For Dataset objects with variables, appending along a new dimension should add a new, single-element dimension to the contained variables, and the output variables should be concatenated along that new dimension.
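For reference, a minimal sketch of the concat behaviour being compared against, using the ds_one and ds_two defined in the example below (the name combined is only for illustration):

# xr.concat introduces a new single-element "time" dimension on each input
# and concatenates along it without raising, which is the behaviour expected
# from appending along a new dimension.
combined = xr.concat([ds_one, ds_two], dim="time")
print(dict(combined.sizes))  # {'time': 2, 'lat': 2, 'lon': 3} (order may vary)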
Minimal Complete Verifiable Example
import numpy as np
import xarray as xr

# Create Datasets
ds_one = xr.Dataset(
    data_vars={"temp": (["lat", "lon"], np.array([[270, 271, 270], [273, 272, 272]]))},
    coords={"lat": [10, 20], "lon": [-20, -10, 0]},
)
ds_two = xr.Dataset(
    data_vars={"temp": (["lat", "lon"], np.array([[271, 272, 271], [274, 273, 273]]))},
    coords={"lat": [10, 20], "lon": [-20, -10, 0]},
)

# Write the first Dataset, then try to append the second along a new dimension
ds_one.to_zarr("ds.zarr")
ds_two.to_zarr("ds.zarr", append_dim="time")
MVCE confirmation
- Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- Complete example — the example is self-contained, including all data and the text of any traceback.
- Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- New issue — a search of GitHub Issues suggests this is not a duplicate.
- Recent environment — the issue occurs with the latest version of xarray and its dependencies.
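A possible workaround, sketched here under the assumption that the goal is to append along a new time dimension (the store name ds_workaround.zarr and the time values [0] and [1] are placeholders), is to create the dimension explicitly with expand_dims before the first write, so that append_dim refers to a dimension that already exists in the store:

# Add a size-1 "time" dimension (with a coordinate value) to each Dataset
# before writing; the second to_zarr call then appends along an existing
# dimension and the ValueError below is not raised.
ds_one.expand_dims(time=[0]).to_zarr("ds_workaround.zarr")
ds_two.expand_dims(time=[1]).to_zarr("ds_workaround.zarr", append_dim="time")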
Relevant log output
File ~/Documents/git/pydata/xarray/xarray/core/dataset.py:2622, in Dataset.to_zarr(self, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks, storage_options, zarr_version, zarr_format, write_empty_chunks, chunkmanager_store_kwargs)
2454 """Write dataset contents to a zarr group.
2455
2456 Zarr chunks are determined in the following way:
(...)
2618 The I/O user guide, with more details and examples.
2619 """
2620 from xarray.backends.api import to_zarr
-> 2622 return to_zarr( # type: ignore[call-overload,misc]
2623 self,
2624 store=store,
2625 chunk_store=chunk_store,
2626 storage_options=storage_options,
2627 mode=mode,
2628 synchronizer=synchronizer,
2629 group=group,
2630 encoding=encoding,
2631 compute=compute,
2632 consolidated=consolidated,
2633 append_dim=append_dim,
2634 region=region,
2635 safe_chunks=safe_chunks,
2636 zarr_version=zarr_version,
2637 zarr_format=zarr_format,
2638 write_empty_chunks=write_empty_chunks,
2639 chunkmanager_store_kwargs=chunkmanager_store_kwargs,
2640 )
File ~/Documents/git/pydata/xarray/xarray/backends/api.py:2184, in to_zarr(dataset, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks, storage_options, zarr_version, zarr_format, write_empty_chunks, chunkmanager_store_kwargs)
2182 writer = ArrayWriter()
2183 # TODO: figure out how to properly handle unlimited_dims
-> 2184 dump_to_store(dataset, zstore, writer, encoding=encoding)
2185 writes = writer.sync(
2186 compute=compute, chunkmanager_store_kwargs=chunkmanager_store_kwargs
2187 )
2189 if compute:
File ~/Documents/git/pydata/xarray/xarray/backends/api.py:1920, in dump_to_store(dataset, store, writer, encoder, encoding, unlimited_dims)
1917 if encoder:
1918 variables, attrs = encoder(variables, attrs)
-> 1920 store.store(variables, attrs, check_encoding, writer, unlimited_dims=unlimited_dims)
File ~/Documents/git/pydata/xarray/xarray/backends/zarr.py:907, in ZarrStore.store(self, variables, attributes, check_encoding_set, writer, unlimited_dims)
905 existing_dims = self.get_dimensions()
906 if self._append_dim not in existing_dims:
--> 907 raise ValueError(
908 f"append_dim={self._append_dim!r} does not match any existing "
909 f"dataset dimensions {existing_dims}"
910 )
912 variables_encoded, attributes = self.encode(
913 {vn: variables[vn] for vn in new_variable_names}, attributes
914 )
916 if existing_variable_names:
917 # We make sure that values to be appended are encoded *exactly*
918 # as the current values in the store.
919 # To do so, we decode variables directly to access the proper encoding,
920 # without going via xarray.Dataset to avoid needing to load
921 # index variables into memory.
ValueError: append_dim='time' does not match any existing dataset dimensions {'lat': 2, 'lon': 3}
Anything else we need to know?
No response
Environment
Details
<function xarray.util.print_versions.show_versions(file=<_io.TextIOWrapper name='' mode='w' encoding='utf-8'>)>