reading in raw lightsheet data (CZI file) #81

Closed
pr4deepr opened this issue Apr 28, 2021 · 21 comments

Comments

@pr4deepr

System and Software

  • aicspylibczi Version: 2.8.0
  • Python Version: 3.7.10
  • Operating System: Windows 10

Description

I am trying to open a CZI file that contains raw lightsheet data, i.e., not deskewed. The deskewed data, saved as a CZI file, opens fine, but the raw (not deskewed) data throws an error. The idea is to read in the raw data and perform deskewing and deconvolution in Python.

Expected Behavior

Expected it to return the czi file as a dask array

Reproduction

This is just example code for troubleshooting. I was initially using aicsimageio directly via imread_dask and was getting the same error.

from aicsimageio.readers import czi_reader
from aicspylibczi import CziFile
img='D://Pradeep//Lightsheet//skew_deskew_example/image.czi'
czi_deskew = CziFile(img)
czi_reader.CziReader._daread(img,czi_deskew)

It throws an error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-2-ba37ba0ae147> in <module>
      1 # Read first plane for information used by dask.array.from_delayed
----> 2 sample, sample_dims = czi.read_image(**first_plane_read_dims)
      3 print(sample_dims)

~\AppData\Local\Continuum\anaconda3\envs\lightsheet\lib\site-packages\aicspylibczi\CziFile.py in read_image(self, **kwargs)
    386         #print(cores)
    387         #print(plane_constraints)
--> 388         image, shape = self.reader.read_selected(plane_constraints, m_index, cores)
    389         #print(shape)
    390         #print(image)

RuntimeError: The method or operation is not implemented.

I am not sure what this error means.

I have deskewed data generated from another source on the same data, and it works really well, giving the output:
(dask.array<concatenate, shape=(119, 3, 75, 1166, 1488), dtype=uint16, chunksize=(1, 1, 75, 1166, 1488), chunktype=numpy.ndarray>,
 'TCZYX')

The dimensions of the raw data are: (119, 3, 751, 150, 1488) in TCZYX format.
The dimensions of the deskewed data that works are: (119, 3, 75, 1166, 1488) in TCZYX format.
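
As a rough sanity check on those shapes, the in-memory size of each Dask chunk can be worked out by hand. This is only a sketch: it assumes the raw data is also uint16 (2 bytes per pixel), like the deskewed array above, and that the raw file would be chunked the same way, one (T, C) position per chunk. `chunk_bytes` is a helper name made up for this example.

```python
BYTES_PER_PIXEL = 2  # uint16; assumed for the raw data as well


def chunk_bytes(shape_tczyx, bytes_per_pixel=BYTES_PER_PIXEL):
    """Bytes in one (1, 1, Z, Y, X) chunk of a TCZYX array."""
    _, _, z, y, x = shape_tczyx
    return z * y * x * bytes_per_pixel


raw = (119, 3, 751, 150, 1488)
deskewed = (119, 3, 75, 1166, 1488)

for name, shape in (("raw", raw), ("deskewed", deskewed)):
    per_chunk = chunk_bytes(shape)
    n_chunks = shape[0] * shape[1]  # one chunk per (T, C) position
    total = per_chunk * n_chunks
    print(f"{name}: {per_chunk / 1e6:.1f} MB per chunk, "
          f"{n_chunks} chunks, {total / 1e9:.1f} GB in total")
```

Under these assumptions each raw chunk is larger than a deskewed one, so the raw file is the more demanding of the two to hold in memory.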

Environment

Anaconda Environment

Thanks
Pradeep

@heeler
Member

heeler commented Apr 28, 2021

Hi @pr4deepr,

I'll look into this as soon as I have work on the 3.0 release completed.
It would be ideal to have a test file from your system that has the problematic behavior if at all possible.
It is somewhat likely that the raw/skewed data is not supported by libCZI. I might be able to patch libCZI to make that work but that's an unknown. If you can get me a small test file that would be fantastic. I'll hope to take a look at it within a week.

Thanks
@heeler

@pr4deepr
Author

pr4deepr commented Apr 30, 2021 via email

@pr4deepr
Author

pr4deepr commented May 3, 2021

Hi @heeler

Please find the data here. It's a WeTransfer link.
I can use the czi2tif option from here: https://github.com/cgohlke/czifile
to convert small CZI files into TIFF files, and can access the metadata. But it's only sensible for small files...

Cheers
Pradeep

@evamaxfield
Collaborator

Hey @pr4deepr I just saw your talk on Dask Summit and it served as a reminder for me to check this issue 😅 (sorry for the delay)!

@heeler has unfortunately taken a new job so I will have to get myself caught up on what is going on with this issue. I am curious if you have encountered the chunking problem on other file formats. Is it just CZI or does it affect the whole aicsimageio lib?

Excited to chat at the Dask summit life sciences workshop too!

@pr4deepr
Author

Hey @JacksonMaxfield

I just saw your talk. I really enjoyed it and I think it answered some questions that I had about processing the large datasets and memory usage!!

> I just saw your talk on Dask Summit and it served as a reminder for me to check this issue 😅 (sorry for the delay)!

Apologies, didn't mean to put up the github issue like that, I just wanted to showcase my workflow and where I'm at.

Currently, I have only tried it on CZI files. I can access the CZI file and explore the metadata, subblocks using the czifile library. I get the error only when I try to read it in using aicspylibczi or aicsimageio libraries, especially as a dask array.

We mainly use Zeiss microscopes here, and particularly the Zeiss Lattice in this case. I haven't tried it on other file formats. We have a home-built lattice, so I can try it on the TIFF files that it churns out. Will that work for you?

It will be great to chat with you. Which session of the life sciences workshop will you be attending, and at what time?

@evamaxfield
Collaborator

> Apologies, didn't mean to put up the github issue like that, I just wanted to showcase my workflow and where I'm at.

No worries at all. It was a helpful reminder for me and useful to hear about the issues.

Hmmm, well, normally I would ask whether you can upgrade to aicsimageio 4.0.0.dev6, but CZI reading hasn't made it into that dev release yet. I did try to check this: we have benchmarks that show the peak memory used while reading files (and I manually ran some tests last night) to make sure that, at least for TIFFs, we aren't reading more data into memory than asked for; see the 4.0.0 benchmarks. If you click on the AICSImageIO peakmem benchmarks, you can see that cached_array and delayed_array differ considerably in MB read during the process. But I will continue to look into the memory issue.

Also, a note on your talk: the size shown in the Dask array Jupyter / HTML repr isn't the number of bytes already read. It is just the size of all the chunks combined.
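
To illustrate that point without depending on Dask itself, here is a toy stand-in. `LazyChunks` is a hypothetical class invented for this sketch and is not part of Dask or aicsimageio: its `nbytes` is computed from metadata alone, while `bytes_read` only grows when a chunk is actually loaded.

```python
class LazyChunks:
    """Toy lazy array: knows its total size without reading any data."""

    def __init__(self, n_chunks, chunk_nbytes):
        self.n_chunks = n_chunks
        self.chunk_nbytes = chunk_nbytes
        self.bytes_read = 0  # grows only when load() is called

    @property
    def nbytes(self):
        # What a repr would report: the combined size of all chunks.
        return self.n_chunks * self.chunk_nbytes

    def load(self, index):
        # Stand-in for reading one chunk from disk.
        if not 0 <= index < self.n_chunks:
            raise IndexError(index)
        self.bytes_read += self.chunk_nbytes
        return bytearray(self.chunk_nbytes)


arr = LazyChunks(n_chunks=357, chunk_nbytes=1024)
reported_before = arr.nbytes
arr.load(0)  # read a single chunk
print(reported_before, arr.nbytes, arr.bytes_read)
```

The reported size is the same before and after the read; only the separate counter reflects what was actually loaded.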


Now, on to your current issue. I will manually give your file a go on the newest release of aicspylibczi and see if I can find anything.

And lastly, I will try to go to both life science workshop sessions but will for sure be at the first one. (May 19, 16:00 PST / May 19, 23:00 UTC).

@pr4deepr
Author

Thanks a lot for looking into this..

The WeTransfer link expired, so I'm posting another link here:
DOWNLOAD

@evamaxfield
Collaborator

Had a brief moment to look at this this morning. On both the prior and new versions of aicspylibczi it produces the error you noticed. So reproducible! Yay?

What an odd error. I will try to find a time to talk to Jamie about this and see what I can do. I assume it's something to do with typing. Because the underlying reader is written in C++ I wonder if your file has a different type return for some operation which is causing it to say it has no impl for those specific types.

@pr4deepr
Author

Yea, I had a look and realised the reader is in C++, which is where I hit a wall!!!

So, I was comparing a raw data file and the corresponding deskewed/processed data. The latter opens in aicsimageio.
I have been playing around with using CziFile to explore the underlying data structure.

From what I understand about czi files, the data are in subblocks, which are in turn contained in subblock directories.
Using info from the code here:
https://github.com/cgohlke/czifile/blob/a70265fd430983875bf4c31955f2ad57f2592747/czifile/czifile.py#L644

I can access each subblock, which contains the image data, using data_segment().
This is my understanding of the CZI file layout.

So, to read the first subblock:

import numpy as np
from czifile import CziFile  # cgohlke's czifile, not aicspylibczi
from tifffile import FileHandle

img_raw = "RAW DATA.czi"
czi_raw = CziFile(img_raw)

# Locate the data segment of the first subblock.
entry = czi_raw.filtered_subblock_directory[0]
subblock = entry.data_segment()

fh_raw = FileHandle(img_raw)  # binary file handle on the CZI file itself

fh_raw.seek(subblock.data_offset)  # move the file pointer to the start of this subblock's pixel data
dtype = np.dtype(subblock.dtype)
# Read the stored bytes as-is (no decompression) and reshape to the stored shape.
data = fh_raw.read_array(dtype, subblock.data_size // dtype.itemsize)
czi_image_raw = data.reshape(entry.stored_shape)

What information would be valuable to compare the raw and deskewed data?

@pr4deepr
Author

BTW, are you comfortable with me posting this on the image.sc forum?
I am in a workshop with Sebastian Rhodes from Zeiss and he suggested posting it there.

@evamaxfield
Collaborator

Please do! More eyes the better probably.

@evamaxfield
Collaborator

Hey @pr4deepr, we will try to take a deeper look at this issue soon. In fact, @heeler may find some time to do so soon 🎉. But other than that, no real update unfortunately, just "this issue is still on our radar" :/

@pr4deepr
Author

Thanks for that @JacksonMaxfield and @heeler ! Appreciate you taking the time for this...

@evamaxfield
Collaborator

Hey @pr4deepr, just pinging again to say: don't worry, we are still tracking this issue, but unfortunately no development has occurred yet. We hope to look at it soon but, again, there is no real timeline. Apologies.

@pr4deepr
Author

Thanks!

@toloudis
Collaborator

Initial finding: the error message "The method or operation is not implemented." comes from the underlying Zeiss libCZI when it thinks there is an internal compression format it doesn't recognize. It recognizes "JpgXr" and "UnCompressed" according to the code. I am still looking deeper to see how it got there.

@toloudis
Collaborator

Looks like the file contains compression mode 1001 which the libCZI library doesn't recognize and considers "invalid".
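
One way to check a file for this before handing it to libCZI is to tally the compression code stored with each subblock directory entry. The sketch below is an illustration under stated assumptions, not aicspylibczi's API: `compression_report` is a made-up helper, it expects entries exposing a `compression` attribute (as the directory entries in cgohlke's czifile appear to), and the numeric codes treated as supported (0 for UnCompressed, 4 for JpgXr) are an assumption about libCZI's enum rather than something confirmed in this thread.

```python
from collections import Counter
from types import SimpleNamespace

# Assumed codes that libCZI could decode at the time (not confirmed here).
ASSUMED_SUPPORTED = {0: "UnCompressed", 4: "JpgXr"}


def compression_report(entries, supported=ASSUMED_SUPPORTED):
    """Count subblocks per compression code and list unsupported codes."""
    counts = Counter(entry.compression for entry in entries)
    unsupported = sorted(code for code in counts if code not in supported)
    return counts, unsupported


# Stand-in entries; with czifile these might instead come from
# CziFile(path).filtered_subblock_directory (assumption).
fake_entries = [SimpleNamespace(compression=1001) for _ in range(3)]
fake_entries.append(SimpleNamespace(compression=0))

counts, unsupported = compression_report(fake_entries)
print(dict(counts), unsupported)
```

Running the same tally on the raw and the deskewed file would show whether the two differ in compression mode.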

@pr4deepr
Author

Thanks for this update.
Glad to see that you've figured out why we're getting the error.

@toloudis
Collaborator

toloudis commented Jul 5, 2021

@pr4deepr
Author

Hi
Just updating this thread. There was a bit of delay in getting my hands on some czi files.
With files saved using the newest version of Zen software (3.4 onwards), aicsimageio reads the czi files without a problem.
For older files, I need to resave them using the Save As CZI option in Zen to be able to read them with the aicsimageio library.
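
In a processing script, this workaround can be surfaced when the read fails. The following is only a sketch: `read_with_hint` and the stand-in reader are hypothetical names, and the only details taken from this thread are the RuntimeError message and the resave advice.

```python
NOT_IMPLEMENTED_MSG = "The method or operation is not implemented."


def read_with_hint(read_func, path):
    """Call read_func(path); add resave advice to the libCZI error."""
    try:
        return read_func(path)
    except RuntimeError as err:
        if NOT_IMPLEMENTED_MSG in str(err):
            raise RuntimeError(
                f"{path}: libCZI could not decode this file. It may use a "
                "compression mode libCZI does not recognize; try resaving "
                "it with 'Save As CZI' in Zen."
            ) from err
        raise  # unrelated errors pass through unchanged


def fake_reader(path):
    # Stand-in for a real reader call; always fails like the raw file did.
    raise RuntimeError(NOT_IMPLEMENTED_MSG)


try:
    read_with_hint(fake_reader, "raw.czi")
except RuntimeError as err:
    message = str(err)
    print(message)
```

In real use, `read_func` would be whatever call reads the file, for example a small function wrapping the aicsimageio reader.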

I really appreciate the rapid response and help in this matter.

Do let me know if there is anything else I need to provide

Cheers
Pradeep

@evamaxfield
Collaborator

Well, glad it was solved. Going to close this issue for now then. If it comes up again, or if any other issues crop up, just let us know.
