Store conversion methods #137

Closed
jakirkham opened this issue Mar 1, 2017 · 9 comments · Fixed by #217

@jakirkham
Member

As a follow-up/extension to the discussion in issue ( https://github.com/alimanfoo/zarr/issues/129 ), it would be pretty handy to have some methods to convert back and forth between DirectoryStore and ZipStore. Given that Zip files are best treated as a write-once medium, this would allow one to write out to a DirectoryStore and then convert to a ZipStore. Similarly, if edits are needed, one could extract the ZipStore, perform the edits on the resulting DirectoryStore, and then archive it again afterwards.
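To make the write-then-archive workflow concrete, here is a minimal sketch using only the standard library's `zipfile` module. It is not Zarr's API: `dir_to_zip` and `zip_to_dir` are hypothetical helper names, and the file layout is made up for illustration.

```python
import os
import tempfile
import zipfile


def dir_to_zip(src_dir, zip_path):
    """Archive every file under src_dir into a new Zip file.

    Member names are '/'-separated paths relative to src_dir.
    """
    # ZIP_STORED adds no extra compression (this assumes the files
    # being archived are already-compressed chunks).
    with zipfile.ZipFile(zip_path, mode="w", compression=zipfile.ZIP_STORED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                full = os.path.join(root, name)
                rel = os.path.relpath(full, src_dir).replace(os.sep, "/")
                zf.write(full, arcname=rel)


def zip_to_dir(zip_path, dest_dir):
    """Extract every member of the Zip file into dest_dir for editing."""
    with zipfile.ZipFile(zip_path, mode="r") as zf:
        zf.extractall(dest_dir)


# Round trip on throwaway data (file names are illustrative only).
work = tempfile.mkdtemp()
src = os.path.join(work, "data")
os.makedirs(os.path.join(src, "foo"))
with open(os.path.join(src, "foo", "0.0"), "wb") as f:
    f.write(b"chunk bytes")

archive = os.path.join(work, "data.zip")
dir_to_zip(src, archive)

restored = os.path.join(work, "restored")
zip_to_dir(archive, restored)
with open(os.path.join(restored, "foo", "0.0"), "rb") as f:
    roundtrip = f.read()
```

The archive is written in one pass and never modified afterwards, which matches the write-once treatment described above.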

@jakirkham
Member Author

Musing on this a bit, leveraging the update method from the MutableMapping interface might be better. One would then provide the store they would like to move their data to. For instance, converting to a DirectoryStore or a TempStore should be able to follow similar code paths, which should just work thanks to inheritance. This should also allow it to remain extensible in the future.
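A rough sketch of the `update` idea, assuming stores behave as `MutableMapping` objects from keys to bytes. `ToyDirectoryStore` is a hypothetical stand-in written for this example, not Zarr's DirectoryStore, and the keys are illustrative.

```python
import os
import tempfile
from collections.abc import MutableMapping


class ToyDirectoryStore(MutableMapping):
    """Hypothetical directory-backed store: one file per key."""

    def __init__(self, path):
        self.path = path
        os.makedirs(path, exist_ok=True)

    def _file(self, key):
        # Keys use '/' as a separator; map them onto nested directories.
        return os.path.join(self.path, *key.split("/"))

    def __getitem__(self, key):
        try:
            with open(self._file(key), "rb") as f:
                return f.read()
        except OSError:
            raise KeyError(key)

    def __setitem__(self, key, value):
        target = self._file(key)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with open(target, "wb") as f:
            f.write(value)

    def __delitem__(self, key):
        try:
            os.remove(self._file(key))
        except OSError:
            raise KeyError(key)

    def __iter__(self):
        for root, _dirs, files in os.walk(self.path):
            for name in files:
                rel = os.path.relpath(os.path.join(root, name), self.path)
                yield rel.replace(os.sep, "/")

    def __len__(self):
        return sum(1 for _ in self)


# Any mapping can serve as the source; a plain dict stands in here.
source = {"foo/.zarray": b"{}", "foo/0.0": b"\x00\x01", "foo/0.1": b"\x02\x03"}
dest = ToyDirectoryStore(tempfile.mkdtemp())

# update() is inherited from MutableMapping and copies every key/value pair.
dest.update(source)
```

Because `update` comes from the abstract base class, any subclass that implements the five core methods gets the same copy path without extra code.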

@jakirkham jakirkham changed the title from "Adding to_zipfile/to_dir methods" to "Store conversion methods" Mar 1, 2017
@jakirkham jakirkham mentioned this issue Mar 1, 2017
@alimanfoo
Member

alimanfoo commented Mar 1, 2017 via email

@alimanfoo
Member

I think it would be worth adding an example to the tutorial to show how to copy data from one store to another.

@alimanfoo alimanfoo added this to the v2.2 milestone Nov 19, 2017
@jakirkham
Member Author

jakirkham commented Nov 20, 2017

Just to clarify, are we thinking of this as a docs/examples item? If so, I think I agree. The existing interface is already sufficient, but a short example of a couple of lines would be good. Any thoughts on where this would be documented?

@alimanfoo
Member

alimanfoo commented Nov 20, 2017 via email

@jakirkham
Member Author

No complaints from me. This is somewhat similar to how I imagined the Group copy method. Does that fit the use cases you are imagining here or do you see other benefits from handling this at the Store level instead of the Group level?

@alimanfoo
Member

I was thinking there is potentially a use for both a low-level zarr.copy_store(source, dest, ...) where source and dest are store objects, and a higher-level zarr.copy(source, dest, ...) where source is array-like or group-like and dest is group-like.

The low-level copy_store() is for the case where you want to replicate data exactly, and so you just copy key/value pairs from one store to another, which is going to be faster because there is no need to decompress/recompress chunks, and because only initialized chunks will get copied.
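As an illustration of the low-level idea only, the sketch below copies raw key/value pairs between two mappings without touching the values. `copy_store_sketch` and its `prefix` parameter are hypothetical and are not the proposed `copy_store()` signature; plain dicts stand in for stores.

```python
import zlib


def copy_store_sketch(source, dest, prefix=""):
    """Copy raw key/value pairs whose key starts with prefix.

    Values are copied as opaque bytes: nothing is decompressed or
    recompressed, and only keys that exist in source are visited.
    Returns the number of pairs copied.
    """
    n_copied = 0
    for key in source:
        if key.startswith(prefix):
            dest[key] = source[key]
            n_copied += 1
    return n_copied


# Illustrative data: one metadata key and two already-compressed chunks.
source = {
    "foo/.zarray": b'{"chunks": [2]}',
    "foo/0": zlib.compress(b"\x00\x01"),
    "bar/0": zlib.compress(b"\x02\x03"),
}
dest = {}
n_copied = copy_store_sketch(source, dest, prefix="foo/")
```

Chunks that were never written have no key in the source, so they cost nothing to skip.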

The higher-level copy() would be for a more general case where you want to copy an entity (group or array), going via the create_group/create_dataset API. This could potentially mean that source and dest could be anything array- or group-like, e.g., either zarr or h5py; that is, it would provide a way to migrate data between two zarr hierarchies, or between zarr and h5py. It could also allow copying while storing the data in the destination with different compression than is used in the source.
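For contrast with a raw key/value copy, the toy sketch below shows why a copy that changes compression has to decode and re-encode every chunk. It does not use the create_group/create_dataset API; `recompress_copy` is a hypothetical helper, and `zlib` and `lzma` from the standard library stand in for the source and destination compressors.

```python
import lzma
import zlib


def recompress_copy(source, dest, decode, encode):
    """Copy chunks by decoding each value and re-encoding it for dest."""
    for key in source:
        dest[key] = encode(decode(source[key]))


# Illustrative uncompressed chunk contents.
raw_chunks = {"0.0": b"a" * 1000, "0.1": b"b" * 1000}

# Source chunks are zlib-compressed; the destination will use lzma.
source = {key: zlib.compress(data) for key, data in raw_chunks.items()}
dest = {}
recompress_copy(source, dest, decode=zlib.decompress, encode=lzma.compress)

# Decoding the destination recovers the original data.
decoded = {key: lzma.decompress(value) for key, value in dest.items()}
```

The stored bytes differ between source and destination even though the decoded data is the same, which is the extra work a raw key/value copy avoids.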

@jakirkham
Member Author

I see. Sure that seems fine.

My initial thinking with copy was that it would special-case Zarr objects and thus handle direct data replication with the performance of copy_store, but through the same easy high-level interface.

@alimanfoo alimanfoo added the enhancement New features or improvements label Nov 21, 2017
@alimanfoo
Member

alimanfoo commented Nov 21, 2017 via email

@alimanfoo alimanfoo mentioned this issue Dec 9, 2017
5 tasks