Add support for Zarr data I/O format #659

forman · 2018-05-18T09:21:28Z

xarray 0.10 introduced two new entry points, xr.open_zarr() and Dataset.to_zarr().

After a few first tests, I am very enthusiastic about Zarr, a new data format optimized for distributed and concurrent array I/O. It seems to offer much better I/O performance than NetCDF4, which may be due to single-threaded HDF5 decompression in Python (not checked).

As it seems to be a 1:1 representation of the NetCDF4 / HDF5 data model, Cate could use it for very efficient workspace persistence, or users could use it for intermediate computation results.

The good news is that Cate doesn't require any extra dependencies, as the zarr package is already a dependency of xarray 0.10.

JanisGailis · 2018-05-18T10:46:00Z

After quickly googling around and reading about Zarr, it really looks quite impressive.

forman · 2018-05-18T12:38:25Z

And it is lightning fast. Just ingested a Zarr data cube with dims=(time=250, lat=1000, lon=2000) (for another project) and when I time-travel through it, layers are displayed immediately.

papesci · 2018-05-31T10:57:36Z

It looks like impressive persistence support. It would be a good idea to include it, enabling users to save processed data.

forman · 2018-06-01T16:07:20Z

I'll merge my branch, so we can play with it.

forman added feature ds perf labels May 18, 2018

forman self-assigned this May 18, 2018

forman added a commit that referenced this issue May 18, 2018

first version of read_zarr() to address #659

7d159d8
