Format for 3D data (because npz is not so good) #14

Closed
almarklein opened this issue Oct 30, 2014 · 25 comments · Fixed by gordon-n-stevenson/imageio#1 or #113

Comments

@almarklein
Member

NPZ is not supported on PyPy and has issues with certain combinations of Python 2.x and NumPy. We had better have a good alternative ...

@ghisvail
Contributor

Do you have a link for these claims? (I was personally not aware of this.)

Is that something that cannot be fixed upstream, instead of inventing a new format for data serialization?

@almarklein
Member Author

I tried NPZ in PyPy and it simply was not implemented. In Vispy we've had some trouble with data stored in npz files. It seemed that files written with a certain combination of Python 2 and NumPy could not be read in Python 3.

The title could also be: implement plugins for other (existing) 3D data formats.

@ghisvail
Contributor

Do you have specific formats in mind?

@almarklein
Member Author

In particular MHD (#29) and HDF5. However, the first needs 2 files to store data (bah) and I suspect HDF5 won't work without relying on another library ...

@ghisvail
Contributor

Actually, dual-file data storage is still pretty common. Most of the proprietary formats I deal with in my research are designed that way (one file for the descriptor, one for the raw data). So I'd be interested to see how you are planning to wrap that.

For HDF5, which I am also familiar with, pytables and h5py are already decent solutions around libhdf5. I don't know whether there is a niche for a pure-Python implementation of it though.

@rossant

rossant commented Dec 15, 2014

For HDF5, just stick with h5py...

@rossant

rossant commented Dec 15, 2014

About npz: it looks like there are no problems when files are created in Python 3 (see this issue). Is it conceivable to just throw a big warning when users save npz files in Python 2? Python 3 users will probably never have a problem. It might just be a matter of mentioning this potential problem in the documentation.
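
A minimal sketch of what such a warning could look like, assuming a hypothetical helper named `warn_if_saving_npz_on_py2` that a save routine would call before writing. Neither the name nor the message comes from imageio or NumPy; they are made up for illustration.

```python
import sys
import warnings


def warn_if_saving_npz_on_py2(version_info=sys.version_info):
    """Warn when an npz file is about to be written from Python 2.

    Hypothetical helper, not part of imageio or NumPy.
    Returns True if a warning was issued, False otherwise.
    """
    if version_info[0] < 3:
        warnings.warn(
            "npz files written from Python 2 may not be readable in "
            "Python 3; consider saving from Python 3 instead.",
            UserWarning,
            stacklevel=2,  # point at the caller, not at this helper
        )
        return True
    return False
```

The `version_info` parameter only exists so the behaviour can be exercised from Python 3; a real call site would use the default.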

@almarklein
Member Author

As for HDF5, imageio could provide a thin wrapper to ease the storing and retrieving of image/volume data. Not sure if this is at all useful; it might already be simple enough with pytables/h5py.

Still, I'd like to have something that is pure Python (i.e. works everywhere) and does not rely on another lib...

@rossant

rossant commented Dec 15, 2014

I agree that a pure Python HDF5 lib might be useful generally speaking, but it might be a lot of work. In my experience the HDF5 C API is horrible to work with; even just getting the data can be quite complicated...

Also, anyone willing to work with HDF5 files in Python will always have h5py or pytables installed. (they are installed by default in anaconda for example)

@almarklein
Member Author

I agree, but I was not talking of HDF5 per se.

@rossant

rossant commented Dec 15, 2014

@almarklein then I think you have two options:

  • npz/npy
  • flat binary + metadata (either in a header, or in a second file)
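
To make the second option concrete, here is a rough sketch of a flat binary file with a metadata header, using only the standard library (no NumPy). Everything about the layout is an assumption invented for this example: the `VOL0` magic bytes, the JSON header, and the names `write_volume` and `read_volume`. It is not an existing format.

```python
import io
import json
import struct
import sys
from array import array

MAGIC = b"VOL0"  # hypothetical magic bytes, invented for this sketch


def write_volume(fileobj, voxels, shape, spacing=(1.0, 1.0, 1.0)):
    """Write a JSON header followed by the flat voxel data."""
    nz, ny, nx = shape
    if len(voxels) != nz * ny * nx:
        raise ValueError("number of voxels does not match shape")
    header = json.dumps({
        "shape": list(shape),
        "typecode": voxels.typecode,   # array-module type code, e.g. "B"
        "byteorder": sys.byteorder,    # needed to read multi-byte types back
        "spacing": list(spacing),
    }).encode("utf-8")
    fileobj.write(MAGIC)
    fileobj.write(struct.pack("<I", len(header)))  # header size, little-endian
    fileobj.write(header)
    fileobj.write(voxels.tobytes())


def read_volume(fileobj):
    """Return (header dict, array of voxels) from a file written above."""
    if fileobj.read(4) != MAGIC:
        raise ValueError("not a volume file")
    (header_size,) = struct.unpack("<I", fileobj.read(4))
    header = json.loads(fileobj.read(header_size).decode("utf-8"))
    voxels = array(header["typecode"])
    voxels.frombytes(fileobj.read())
    if header["byteorder"] != sys.byteorder:
        voxels.byteswap()
    return header, voxels


# Round trip of a tiny 2x3x4 volume of unsigned bytes through memory.
volume = array("B", range(2 * 3 * 4))
buffer = io.BytesIO()
write_volume(buffer, volume, (2, 3, 4), spacing=(2.0, 0.5, 0.5))
buffer.seek(0)
header, restored = read_volume(buffer)
```

One caveat of this design: some `array` type codes have platform-dependent sizes, so a real format would want to record an explicit element size rather than the type code alone.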

@almarklein
Member Author

npz is not widely available (it depends on NumPy and is not available on PyPy). I think I can do better than a flat binary with metadata :) Would be best if there was already a format that we could use. If not, I might just implement something simple.

@rossant

rossant commented Dec 15, 2014

Oh so you really mean pure Python (no NumPy etc)? Just out of curiosity, why do you need PyPy support?

How can you do better and simpler than flat+header in pure Python?

@almarklein
Member Author

I just hate it that we cannot do volumes in PyPy :)

gordon-n-stevenson added a commit to gordon-n-stevenson/imageio that referenced this issue Jul 25, 2015
fixes imageio#14 and imageio#29

Using SimpleITK adds NIFTI and MetaImage format that is in a single file format. Adds additional functionality for Medical Imaging data
@almarklein
Member Author

Reopening. I'm thinking of something pure Python (not relying on simpleITK), or more generic (preferably both).

@almarklein almarklein reopened this Jul 26, 2015
@ghisvail
Contributor

Have you looked into the formats that OpenImageIO supports for inspiration?

@almarklein
Member Author

I did now. But their plugins that support volumetric images are either based on HDF5 or aimed very much at animation.

@dimatura

There's the nrrd format, which is quite simple (basically flat binary, optionally compressed): http://teem.sourceforge.net/nrrd/
I've used the pure python (with numpy) implementation here: https://github.com/mhe/pynrrd/blob/master/nrrd.py
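
To give an idea of how simple the format is, here is a rough sketch of a writer and reader for a small subset of it, without NumPy: `uint8` data with `raw` encoding only. The function names are made up for this example, and the header fields (`type`, `dimension`, `sizes`, `encoding`) are written as I understand them from the NRRD documentation linked above, so treat it as an illustration rather than a conforming implementation. It is not how pynrrd does it.

```python
import io


def write_nrrd_uint8(fileobj, data, sizes):
    """Write raw uint8 data as a minimal NRRD file (sizes: fastest axis first)."""
    count = 1
    for n in sizes:
        count *= n
    if len(data) != count:
        raise ValueError("data length does not match sizes")
    lines = [
        "NRRD0004",
        "type: uint8",
        "dimension: %d" % len(sizes),
        "sizes: " + " ".join(str(n) for n in sizes),
        "encoding: raw",
    ]
    # A blank line separates the text header from the binary data.
    fileobj.write(("\n".join(lines) + "\n\n").encode("ascii"))
    fileobj.write(bytes(data))


def read_nrrd_uint8(fileobj):
    """Read a file written by write_nrrd_uint8; returns (sizes, data)."""
    magic = fileobj.readline().decode("ascii").strip()
    if not magic.startswith("NRRD"):
        raise ValueError("not an NRRD file")
    fields = {}
    while True:
        line = fileobj.readline().decode("ascii").strip()
        if not line:              # blank line (or EOF) ends the header
            break
        if line.startswith("#"):  # comment line
            continue
        key, sep, value = line.partition(": ")
        if sep:
            fields[key] = value
    if fields.get("type") != "uint8" or fields.get("encoding") != "raw":
        raise ValueError("this sketch only handles raw uint8 data")
    sizes = tuple(int(n) for n in fields["sizes"].split())
    return sizes, fileobj.read()


# Round trip of a 4x3x2 volume through an in-memory buffer.
nrrd_buffer = io.BytesIO()
write_nrrd_uint8(nrrd_buffer, bytes(range(24)), (4, 3, 2))
nrrd_buffer.seek(0)
nrrd_sizes, nrrd_data = read_nrrd_uint8(nrrd_buffer)
```

Multi-byte types would additionally need an `endian` field, and compressed encodings would need decompression, both of which are left out here.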

@almarklein
Member Author

Thanks @dimatura, that sounds interesting, especially since it's pure Python.

@jni
Contributor

jni commented Jun 7, 2017

@almarklein why has TIFF not been discussed here? IIRC TiffFile has a pure Python implementation...?

@almarklein
Member Author

Would TIFF be suited for storing a 512x512x512 volume? I've never seen people do that with TIFF, perhaps because the compression is not that good?

In some cases it would also be necessary to store metadata like the spacing between voxels, the origin, or a transformation matrix.

@jni
Contributor

jni commented Jun 7, 2017

Haha I use TIFFs for such data all the time. They have some compression support, but I don't know anything about its performance in comparison to other options. TIFF also has vast metadata capabilities, although see my upcoming comment in #263 for caveats.

@almarklein
Member Author

Fair enough :) The TIFF format recently became 3D capable by supporting volread(). My impression was that this was mostly to read all channels at once, but it can indeed also be used to store actual volumetric data.

@almarklein
Member Author

Maybe this is why TIFF, although 3D capable, is a bit hard to sell as the format for 3D data: #263 (comment) :)

@jni
Contributor

jni commented Jun 8, 2017

🏳️

5 participants