feat: consolidating metadata for opening zarr files in read mode #82
cwognum
left a comment
To optimize performance, we don't want to consolidate the metadata every time.
From the Zarr docs:
>>> zarr.consolidate_metadata(store)
This creates a special key with a copy of all of the metadata from all of the metadata objects in the store.
The key here refers to a single file for us (by default named .zmetadata, although this can be changed with the metadata_key parameter in the consolidate methods).
I think what we would want to do is:
- Consolidate the archive locally.
- Then copy over the consolidated archive to the Hub with zarr.convenience.copy_all.
- If this doesn't work (i.e. I assume it would copy over the .zmetadata file, but maybe not?), then we should find a way to copy over this single file manually.
Could you look into the above? I would be curious to know if this is possible!
Not having to consolidate everything on the Hub would make things a lot faster!
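To make the steps listed above concrete, here is a minimal sketch that models a store as a plain key-to-bytes mapping. It is an illustration only, not Zarr's implementation: the helper names `consolidate` and `copy_key` are hypothetical, the two dicts stand in for the local archive and the Hub, and the payload layout (a `zarr_consolidated_format` field next to a `metadata` mapping) reflects my understanding of the Zarr v2 consolidated format.

```python
import json
from typing import MutableMapping

# File names that hold metadata documents in a Zarr v2 hierarchy.
METADATA_NAMES = {".zarray", ".zgroup", ".zattrs"}


def consolidate(store: MutableMapping[str, bytes], metadata_key: str = ".zmetadata") -> None:
    """Hypothetical helper: gather every metadata document under one key."""
    metadata = {
        key: json.loads(value)
        for key, value in store.items()
        if key.rsplit("/", 1)[-1] in METADATA_NAMES
    }
    payload = {"zarr_consolidated_format": 1, "metadata": metadata}
    store[metadata_key] = json.dumps(payload).encode("utf-8")


def copy_key(source: MutableMapping[str, bytes], dest: MutableMapping[str, bytes],
             key: str = ".zmetadata") -> None:
    """Hypothetical helper: copy only the single consolidated key."""
    dest[key] = source[key]


# Stand-in for the local archive: one group, one array, one chunk.
local = {
    ".zgroup": b'{"zarr_format": 2}',
    "x/.zarray": b'{"zarr_format": 2, "shape": [4]}',
    "x/0": b"\x00" * 16,
}
# Stand-in for the Hub: the same archive, uploaded before consolidation.
hub = dict(local)

consolidate(local)    # step 1: consolidate locally
copy_key(local, hub)  # fallback step: copy the single file manually
```

With this approach the Hub side never has to walk the archive; only one extra key is written there.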
Now that #83 is merged, could you actually look into adding the consolidation in the flow for using Zarr datasets from the Hub:
cwognum
left a comment
Great work!
Let's assume that any Zarr archive that is loaded for a dataset has been consolidated. This means that we should also change these lines of code to load in consolidated mode!
cwognum
left a comment
Almost there!
The test cases are failing though because the test Zarr archive is not consolidated! The formatting also fails right now.
Changelogs
Incorporated the Zarr function that consolidates metadata when opening a Zarr group in read mode. This reduces the number of `ls` calls and speeds up reading from the Zarr group.
[Figure: Profiling without consolidation]
[Figure: Profiling with consolidation]
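As a rough illustration of why consolidation cuts down on listing and read calls, the toy below counts accesses against a key-to-bytes mapping. Everything here is an assumption made for the example: `CountingStore` and the two reader functions are hypothetical, and Zarr's real access pattern is more involved. On a remote store, each counted access would correspond to a network request.

```python
import json
from collections.abc import Mapping

METADATA_NAMES = {".zarray", ".zgroup", ".zattrs"}


class CountingStore(Mapping):
    """Hypothetical read-only store that counts reads and key listings."""

    def __init__(self, data):
        self._data = dict(data)
        self.reads = 0
        self.listings = 0

    def __getitem__(self, key):
        self.reads += 1
        return self._data[key]

    def __iter__(self):
        self.listings += 1
        return iter(self._data)

    def __len__(self):
        return len(self._data)


def read_metadata_unconsolidated(store):
    # One listing to discover the keys, then one read per metadata document.
    keys = [k for k in store if k.rsplit("/", 1)[-1] in METADATA_NAMES]
    return {k: json.loads(store[k]) for k in keys}


def read_metadata_consolidated(store, metadata_key=".zmetadata"):
    # A single read, with no listing at all.
    return json.loads(store[metadata_key])["metadata"]


metadata = {
    ".zgroup": {"zarr_format": 2},
    "a/.zarray": {"zarr_format": 2, "shape": [4]},
    "b/.zarray": {"zarr_format": 2, "shape": [4]},
    "c/.zarray": {"zarr_format": 2, "shape": [4]},
}
data = {key: json.dumps(doc).encode("utf-8") for key, doc in metadata.items()}
data["a/0"] = b"\x00" * 16
data[".zmetadata"] = json.dumps(
    {"zarr_consolidated_format": 1, "metadata": metadata}
).encode("utf-8")

plain = CountingStore(data)
consolidated = CountingStore(data)
slow = read_metadata_unconsolidated(plain)
fast = read_metadata_consolidated(consolidated)

print(plain.listings, plain.reads)                # 1 4
print(consolidated.listings, consolidated.reads)  # 0 1
```

Both readers return the same metadata; the gap between the two counts grows with the number of arrays and groups in the archive.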