Discussion with Darr developers #8

Open
jakirkham opened this issue Oct 30, 2018 · 5 comments
Comments

@jakirkham
Member

Ran across @gbeckers's Darr library recently, which seems very similar in intent and behavior to Zarr. Would be great to compare and contrast Darr and Zarr to see where the two implementations can learn from each other and possibly work together.

@gbeckers

Darr developer here. Thanks for your interest and for reaching out. I am very much interested in Zarr, although I must confess to my shame that I haven't actually tried it out yet. Will certainly do so very soon. I have used pytables for much of my work in behavioral neuroscience, for over 10 years, and still use it. It is brilliant. Zarr seems to aim at similar use cases, so I should definitely have a closer look.

Darr really is much more modest than Zarr or PyTables. In essence it is not much more than a way to save flat binary data with a separate description of how to read it, for an audience that is as wide as possible. I wrote it because I need it for my own work, in cases where I want to quickly share data with colleagues or students who do not use Python, but R or Matlab or something else. A npy file would already be too complex. Darr does not have hierarchies, compression, etc.

I am also interested in long-term archiving of data. My data likely is longer-lived than the tools I use. In my field, especially in electrophysiology, at least some people seem to backpedal from complex formats, at least for simple binary data such as signal recordings.

I would be happy to work together in whatever way is useful. An obvious start could be to create the possibility to save a darr array as a zarr array and the other way around. For the array data this should be very simple, but attributes/metadata need attention. Since darr is very new and no one except our lab is using it yet, I am not sure how useful it is for zarr to save as a darr array. Potentially zarr could use darr, or code from darr, for writing the descriptive readme file on how to read the binary data without zarr. But only if arrays are not compressed, of course.
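
For readers unfamiliar with the idea, the "flat binary data plus a separate description" approach mentioned above can be sketched with the Python standard library alone. This is not Darr's actual on-disk format or API: the file names `values.bin` and `description.json`, the fields in the description, and the functions `save_flat` and `load_flat` are all assumptions made up for illustration.

```python
import json
import sys
from array import array
from pathlib import Path


def save_flat(dirpath, values, typecode="d"):
    """Write values as raw binary plus a JSON note on how to read them.

    Hypothetical layout for illustration only, not Darr's real format.
    """
    dirpath = Path(dirpath)
    dirpath.mkdir(parents=True, exist_ok=True)
    data = array(typecode, values)
    # The data file holds nothing but the raw numbers, back to back.
    with open(dirpath / "values.bin", "wb") as f:
        data.tofile(f)
    # Everything a reader needs to interpret the bytes lives in a text file.
    description = {
        "typecode": typecode,
        "itemsize": data.itemsize,
        "length": len(data),
        "byteorder": sys.byteorder,
    }
    (dirpath / "description.json").write_text(json.dumps(description, indent=2))


def load_flat(dirpath):
    """Read the values back using only the JSON description."""
    dirpath = Path(dirpath)
    description = json.loads((dirpath / "description.json").read_text())
    data = array(description["typecode"])
    with open(dirpath / "values.bin", "rb") as f:
        data.fromfile(f, description["length"])
    # Swap bytes if the file was written on a machine with other endianness.
    if description["byteorder"] != sys.byteorder:
        data.byteswap()
    return data
```

Because the data file contains no header, a reader in R, Matlab, or any other environment would only need the element type, byte order, and length from the description to load it.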

@alimanfoo
Member

alimanfoo commented Oct 31, 2018 via email

@gbeckers

gbeckers commented Nov 2, 2018

OK, thanks. I had a closer look at Zarr. Very impressed. I am going to try this out in some actual analyses. In the meantime I started writing some code to read and write zarr arrays in darr, which indeed turned out to be very simple. It is easy to go from one to the other, and very large arrays pose no memory problems because zarr reads in chunks. Nice!
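
The reason chunked reading keeps memory use bounded can be illustrated with a minimal standard-library sketch. It uses neither the Darr nor the Zarr API: `copy_in_chunks` is a hypothetical helper, the source is assumed to be a headerless binary file of one element type, and `dest` stands in for any preallocated object that supports slice assignment.

```python
from array import array


def copy_in_chunks(src_path, dest, typecode="d", chunk_len=1024):
    """Copy a flat binary file into dest, one block of elements at a time.

    Only chunk_len elements are held in memory at once, so the source
    file may be far larger than available RAM. Returns the element count.
    """
    itemsize = array(typecode).itemsize
    start = 0
    with open(src_path, "rb") as f:
        while True:
            raw = f.read(chunk_len * itemsize)
            if not raw:  # end of file reached
                break
            chunk = array(typecode)
            chunk.frombytes(raw)
            # Write this block into the matching region of the destination.
            dest[start:start + len(chunk)] = chunk
            start += len(chunk)
    return start
```

Choosing `chunk_len` is a trade-off: larger blocks mean fewer reads and writes, smaller blocks mean a lower peak memory footprint.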

@eddienko

Sorry, linked to the wrong ticket; it should have been 325 (fixed it now). Could someone please remove the above?

@joshmoore
Member

Hi @eddienko. I don't know of a way to remove a reference, but thanks for letting us know.

@alimanfoo alimanfoo transferred this issue from zarr-developers/zarr-python Jul 3, 2019
5 participants