Silent value assignment failure in open_zarr Dataset due to hidden mode='r' #3282

Open
jkmacc-LANL opened this issue Sep 5, 2019 · 0 comments


Hello Xarray devs,

Thanks for your work on this fantastic package. I'm a new user, and the subtleties of different data stores are unfamiliar to me. I got tripped up by the fact that Zarr stores are (silently) read-only, and I think it would be helpful if this were more prominent in the docstring or zarr section of the docs.

When I try to assign values to parts of a local Zarr-backed Dataset, I get a silent failure:

    In [142]: ds = xr.open_zarr('tmp.zarr', chunks=None)

    In [143]: selector = dict(time='2014-06-06T01:00:00', azimuth=0, frequency=0.0)

    In [144]: ds['counts'].loc[selector].values
    Out[144]: array(4294967295, dtype=uint32)

    # try to assign a value here, like the example in the docs:
    # In [55]: ds['empty'].loc[dict(lon=260, lat=30)] = 100
    In [145]: ds['counts'].loc[selector].values = 0

    # just get the same value back
    In [146]: ds['counts'].loc[selector].values
    Out[146]: array(4294967295, dtype=uint32)

The answer seems to be buried in the open_zarr source code:

    ...
    # Zarr supports a wide range of access modes, but for now xarray either
    # reads or writes from a store, never both. For open_zarr, we only read
    mode = 'r'
    zarr_store = ZarrStore.open_group(store, mode=mode,
                                      synchronizer=synchronizer,
                                      group=group, consolidated=consolidated)
    ...

Expected Output

Assignment that follows the examples in the documentation.
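
A minimal sketch of that expected behavior, using a small in-memory Dataset as a stand-in for tmp.zarr. The variable name, dimension names, and selector come from the session above; the array shape and the other coordinate values are assumptions made up for illustration, not the real data:

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for tmp.zarr: same variable and dimension names as
# in the session above, but made-up coordinate values and a tiny shape.
times = np.array(
    ["2014-06-06T00:00:00", "2014-06-06T01:00:00", "2014-06-06T02:00:00"],
    dtype="datetime64[ns]",
)
fill = np.iinfo(np.uint32).max  # 4294967295, the value seen in Out[144]
ds = xr.Dataset(
    {
        "counts": (
            ("time", "azimuth", "frequency"),
            np.full((3, 2, 2), fill, dtype=np.uint32),
        )
    },
    coords={"time": times, "azimuth": [0, 90], "frequency": [0.0, 0.5]},
)

selector = dict(time="2014-06-06T01:00:00", azimuth=0, frequency=0.0)

# Assign through .loc on the DataArray, as in the documentation example.
ds["counts"].loc[selector] = 0

# Read back the assigned element and one element that was not selected.
updated = int(ds["counts"].loc[selector].values)
untouched = int(ds["counts"].isel(time=0, azimuth=0, frequency=0).values)
print(updated, untouched)
```

For a Dataset opened with open_zarr, one possible route (not verified here) would be to call ds.load() to pull the data into memory before assigning, and then write the result to a new store with ds.to_zarr().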

  1. I think that mentioning mode='r' in the open_zarr docstring would be the most helpful.
  2. A description of this and the reasoning behind why zarr datasets are read-only would be helpful in the zarr section of the docs.
  3. Optionally, a note in the indexing and assignment section of the docs that not all store backends support assignment would also be helpful.

I'm happy to make a PR on 1 & 3, but I'm not familiar with the reasoning behind why stores are never mixed-mode.

Thanks again!

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.3 | packaged by conda-forge | (default, Mar 27 2019, 15:43:19) [Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: None
libnetcdf: None

xarray: 0.12.3
pandas: 0.24.2
numpy: 1.16.3
scipy: 1.3.0
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.3.1
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 1.2.2
distributed: 1.28.1
matplotlib: 3.1.0
cartopy: None
seaborn: None
numbagg: None
setuptools: 41.0.1
pip: 19.1
conda: None
pytest: None
IPython: 7.5.0
sphinx: None
