Hierarchical chunk storage #29

Closed
constantinpape opened this issue Feb 23, 2021 · 6 comments · Fixed by #40

@constantinpape
Contributor

Currently, the OME-Zarr specification requires that all chunks of an array are stored in a single directory.
This causes problems on the file system for large images / volumes with many chunks.

N5 stores chunks in a hierarchy natively and zarr also supports this with the NestedDirectoryStore (although this is currently not working according to @joshmoore).
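To make the difference concrete, here is a minimal sketch of how a chunk's grid index maps to a storage key under the two layouts. The helper `chunk_key` is a hypothetical name used only for illustration; it is not part of zarr-python or any other library.

```python
def chunk_key(index, separator="."):
    """Build the storage key for one chunk from its grid index.

    `chunk_key` is a hypothetical helper for illustration only.
    """
    return separator.join(str(i) for i in index)


# Flat layout: every chunk file sits directly in the array directory.
flat_key = chunk_key((3, 0, 12), ".")

# Nested layout: each dimension becomes one directory level.
nested_key = chunk_key((3, 0, 12), "/")

print(flat_key)    # 3.0.12
print(nested_key)  # 3/0/12
```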

@constantinpape constantinpape changed the title Allow nested chunk storage Nested chunk storage Feb 23, 2021
@constantinpape constantinpape changed the title Nested chunk storage Hierarchical chunk storage Feb 23, 2021
@joshmoore
Member

joshmoore commented Feb 23, 2021

Here's what I've been working on:

Conversations starting at https://gitter.im/zarr-developers/community?at=601d471fc83ec358be27944f and continuing during the 2021-02-10 community call suggest that, even before the V3 spec (which defaults to nested storage), it might be possible for us to transition to using nested storage more heavily with the V2 spec. A requirement would be a heuristic or a metadata entry to determine whether chunks are nested or flat. zarr-python is waiting on a PR from me.
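One possible shape for such a heuristic, sketched under the assumption that a client can list the keys stored below an array: look at the first key that unambiguously matches one layout. The function `guess_separator` and both patterns are hypothetical and not taken from any existing implementation. Note that a single-chunk-index key such as `0` looks the same in both layouts, which is one reason a metadata entry would be more robust.

```python
import re

# Hypothetical patterns: two or more integer indices joined by "." or "/".
FLAT = re.compile(r"^\d+(\.\d+)+$")
NESTED = re.compile(r"^\d+(/\d+)+$")


def guess_separator(keys):
    """Return ".", "/", or None if the layout cannot be determined."""
    for key in keys:
        if FLAT.match(key):
            return "."
        if NESTED.match(key):
            return "/"
    # Only metadata files or ambiguous keys (e.g. 1-D arrays) were seen.
    return None


print(guess_separator([".zarray", "0.0.1"]))  # .
print(guess_separator([".zarray", "0/0/1"]))  # /
print(guess_separator([".zarray", "0"]))      # None
```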

@imagesc-bot

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/next-call-on-next-gen-bioimaging-data-tools-feb-23/48386/9

@joshmoore
Member

Status update:

  • bioformats2raw is moving towards 0.3.0. There will already be some breaking layout changes.
  • zarr-python has the FSStore fix merged to the mainline. The 2.7.0 release will contain it.
  • The jzarr work will be taken over by @SabineEmbacher but is moving forward.

With all of this in mind, I'd propose that we go ahead and bump the OME-Zarr version to 0.2.0, stating in the spec that nested storage is a MUST. A client can then detect whether the fileset is 0.1.0 or 0.2.0 from the metadata and choose the chunk index separator accordingly.

cc: @manzt (since I've failed to engage on the JS front)
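A client-side check along these lines could look roughly like the sketch below. It assumes the version string is found under `multiscales[0]["version"]` in the group attributes and that a missing version means the older flat layout; both are assumptions made for illustration, and `separator_for` is a hypothetical helper name.

```python
def separator_for(attrs):
    """Pick the chunk index separator from group attributes.

    Assumption: the version lives at attrs["multiscales"][0]["version"];
    a missing version is treated as the older, flat layout.
    """
    multiscales = attrs.get("multiscales", [])
    version = multiscales[0].get("version", "0.1") if multiscales else "0.1"
    # Compare only major.minor so "0.2" and "0.2.0" behave the same.
    major_minor = tuple(int(part) for part in version.split(".")[:2])
    return "/" if major_minor >= (0, 2) else "."


print(separator_for({"multiscales": [{"version": "0.1"}]}))    # .
print(separator_for({"multiscales": [{"version": "0.2.0"}]}))  # /
```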

joshmoore added a commit to joshmoore/ngff that referenced this issue Mar 27, 2021
This defines the dimension separator (sometimes called
"key separator") for chunks in the NGFF file to be "/"
rather than ".". The benefit of storing hierarchically
is that for very large volumes the number of files in
a directory is limited to `shape / chunk` for a given
dimension.

close: ome#29
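As a rough worked example of the directory-size argument in the commit message, the sketch below compares the number of entries in a single directory for the two layouts. The shape and chunk sizes are made-up illustration values, and `chunk_grid` is a hypothetical helper.

```python
import math


def chunk_grid(shape, chunks):
    """Number of chunks along each dimension (rounded up)."""
    return tuple(math.ceil(s / c) for s, c in zip(shape, chunks))


# Made-up example volume: 1000 planes of 4096 x 4096, chunked per plane.
grid = chunk_grid((1000, 4096, 4096), (1, 256, 256))

# Flat layout ("."): all chunk files share one directory.
flat_entries = math.prod(grid)

# Nested layout ("/"): a directory holds at most one entry per chunk
# index along a single dimension.
nested_max_entries = max(grid)

print(grid)                # (1000, 16, 16)
print(flat_entries)        # 256000
print(nested_max_entries)  # 1000
```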
@joshmoore
Member

see: zarr-developers/zarr-python#715

@joshmoore
Member

It feels like most implementations have now adopted this new strategy. A solid next step would likely be to add v0.1 and v0.2 implementations to https://github.com/ome/ome_zarr_test_suite

@imagesc-bot

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/open-an-image-from-omero-in-imagej-js/47747/17
