imagej_metadata failed with ValueError #111

Closed
samhitech opened this issue Jan 23, 2022 · 9 comments
Labels
bug Something isn't working

Comments

@samhitech

When opening a TiffFile the method imagej_metadata raises this exception:

imagej_metadata failed with ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Line: 15365 @ tifffile.py:

if not bytecounts:

Maybe replace with something like:

if not bytecounts.size > 0:

@cgohlke
Owner

cgohlke commented Jan 23, 2022

Can you post a file that triggers the exception or post a complete traceback? bytecounts is not supposed to be a numpy array.

@samhitech
Author

samhitech commented Jan 23, 2022

Unfortunately, the file is large (several GB) and has ~10000 frames. Interestingly, when I save a 10-frame TIFF, bytecounts is a tuple, whereas with the large file it is a NumPy array:
[screenshot: image (18)]
So would not len(bytecounts) > 0 work then?

@cgohlke
Owner

cgohlke commented Jan 24, 2022

Thank you. I can see where the issue comes from. The file has a large number of items in the IJMetadataByteCounts tag, which are read as a NumPy array instead of a tuple. You can change the line to if len(bytecounts) == 0: for now, or wait for the next release (no ETA).
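
A minimal sketch of why the original check fails and why the len()-based check works for both types. The values are made-up stand-ins for the tag contents, and is_empty is a hypothetical helper name, not a tifffile function.

```python
import numpy


def is_empty(bytecounts):
    """Return True if the sequence has no items; works for tuples and arrays."""
    return len(bytecounts) == 0


small = (12, 34)                     # few items: read as a tuple
large = numpy.array([12, 34, 56])    # many items: read as a NumPy array

# Truth-testing a tuple is fine: a non-empty tuple is truthy
tuple_check = not small

# Truth-testing a multi-element array raises ValueError
try:
    not large
    array_check_failed = False
except ValueError:
    array_check_failed = True

# The .size attribute exists only on arrays, so a .size-based check
# would break for the tuple case instead
tuple_has_size = hasattr(small, 'size')
```

The len() check sidesteps both problems because tuples and arrays both define their length.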

@samhitech
Author

Okay thanks, will do so.
By the way, here is another issue, unrelated to this one (I am aware that Zarr is still experimental).
When using image_sequence.aszarr() from the example, the files need to be of equal shape; otherwise ZarrFileSequenceStore's _getitem raises an exception.

For example, I had 3 TIFF files with shapes (10000, 250, 250), (10000, 250, 250), (1000, 250, 250). To be able to open the sequence as Zarr, I added this conditional zero padding at line 9812:

chunk = self._imread(filename, **self._kwargs)  # .tobytes()
if chunk.shape != self._chunks:
    pad_width = [(0, i - j) for i, j in zip(self._chunks, chunk.shape)]
    chunk = numpy.pad(chunk, pad_width, constant_values=0)
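
The padding idea above as a self-contained sketch. pad_to_chunk is a hypothetical helper, and the small shapes stand in for the (10000, 250, 250) and (1000, 250, 250) stacks.

```python
import numpy


def pad_to_chunk(chunk, chunk_shape):
    """Zero-pad `chunk` at the end of each axis up to `chunk_shape`."""
    if chunk.shape == tuple(chunk_shape):
        return chunk
    # One (before, after) pair per axis; only the trailing side is padded
    pad_width = [(0, target - size) for target, size in zip(chunk_shape, chunk.shape)]
    return numpy.pad(chunk, pad_width, constant_values=0)


short = numpy.ones((2, 4, 4), dtype=numpy.uint8)   # stand-in for the short file
padded = pad_to_chunk(short, (5, 4, 4))            # stand-in for the chunk shape
```

Note that padding makes the short file look as long as the others, with the extra frames filled with zeros, so the series length no longer reflects the real frame count. If a file were larger than the chunk shape, the pad widths would be negative, which numpy.pad rejects.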

Also, I wonder why accessing frames from the first file was way faster than the last two.

@cgohlke
Owner

cgohlke commented Jan 24, 2022

> files need to be of equal shape

That is also true for the numpy asarray interface. Zarr chunks must have the same shape and dtype and for ZarrFileSequenceStore to work each file must contain one chunk.

> I had 3 tiff files with shapes (10000, 250, 250), (10000, 250, 250), (1000, 250, 250), to be able to open sequence as Zarr

Did you expect a series of shape (21000, 250, 250)?

The only way that can currently work is for the files to be part of an OME-TIFF multi-file series. This requires all files to be parsed before use.

Otherwise this feature would need a new implementation. I am waiting for Zarr to support "shards".

> why accessing frames from the first file was way faster

The zarr interface determines the chunkshape and dtype from the first file in the series and adds the chunk to a cache.
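
A toy illustration of how that could produce the observed timing difference. This is not tifffile's code: ToyFileSequence and its attributes are hypothetical, the dict of arrays stands in for files on disk, and the assumption that only the first file's chunk stays cached is made for the sake of the example.

```python
import numpy


class ToyFileSequence:
    """Hypothetical one-chunk-per-file store that caches the first file."""

    def __init__(self, files):
        self._files = files      # maps filename -> array (stands in for disk)
        self.reads = []          # log of simulated whole-file reads
        first = next(iter(files))
        # Shape and dtype come from the first file, whose chunk is then cached
        self._cache = {first: self._read(first)}
        self.chunk_shape = self._cache[first].shape

    def _read(self, name):
        self.reads.append(name)  # a real reader would decode the whole file here
        return self._files[name]

    def frame(self, name, index):
        chunk = self._cache.get(name)
        if chunk is None:
            chunk = self._read(name)  # whole-file read for a single frame
        return chunk[index]


files = {'a.tif': numpy.zeros((3, 2, 2)), 'b.tif': numpy.ones((3, 2, 2))}
seq = ToyFileSequence(files)
seq.frame('a.tif', 0)
seq.frame('a.tif', 1)
seq.frame('b.tif', 0)
seq.frame('b.tif', 1)
```

Under these assumptions, frames from the first file never trigger another read, while every frame from a later file pays for reading its whole chunk.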

@samhitech
Author

samhitech commented Jan 24, 2022

> Did you expect a series of shape (21000, 250, 250)?

Yes.

> The only way that can currently work is for the files to be part of an OME-TIFF multi-file series. This requires all files to be parsed before use.

Maybe you have an example?

As a temporary solution, I have written a simple class, TiffSeqHandler (link), that opens each file with chunks (1, 250, 250) and uses a custom __getitem__ to retrieve frames from each file's Zarr array. It was faster than self.tiffSequence.aszarr(axestiled={0: 0}).
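
A rough sketch of the indexing part of such a class, not the linked TiffSeqHandler itself. FrameSequence is a hypothetical name, in-memory NumPy arrays stand in for the per-file Zarr arrays, and only integer frame indices are handled.

```python
import bisect
import itertools

import numpy


class FrameSequence:
    """Index frames across several stacks as if they were one long stack."""

    def __init__(self, stacks):
        self._stacks = list(stacks)
        # Cumulative frame counts, e.g. lengths [3, 3, 1] -> [3, 6, 7]
        self._ends = list(itertools.accumulate(len(s) for s in self._stacks))

    def __len__(self):
        return self._ends[-1] if self._ends else 0

    def __getitem__(self, index):
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError('frame index out of range')
        # First stack whose cumulative end exceeds the global index
        file_index = bisect.bisect_right(self._ends, index)
        start = self._ends[file_index - 1] if file_index else 0
        return self._stacks[file_index][index - start]


# Two stacks of 3 frames and one of 1 frame, filled with 0, 1 and 2
stacks = [numpy.full((3, 2, 2), 0), numpy.full((3, 2, 2), 1), numpy.full((1, 2, 2), 2)]
seq = FrameSequence(stacks)
```

With stacks of unequal length, the sequence length is the sum of the frame counts, so no padding is needed.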

Thanks for your time!

@cgohlke
Owner

cgohlke commented Jan 24, 2022

> Maybe you have an example?

What I meant is that the files must be in multi-file OME-TIFF format, e.g. https://downloads.openmicroscopy.org/images/OME-TIFF/2016-06/tubhiswt-4D/. Those can be accessed through the ZarrTiffStore, not ZarrFileSequenceStore:

from tifffile import imread
import dask.array

with imread('tubhiswt_C0_TP0.ome.tif', aszarr=True, chunkmode=2) as store:
    da = dask.array.from_zarr(store)
    print(da)
dask.array<from-zarr, shape=(2, 43, 10, 512, 512), dtype=uint8, chunksize=(1, 1, 1, 512, 512), chunktype=numpy.ndarray>

> I have written a simple class temporarily

It can be done and is a useful feature. Still, I'll wait for sharding support in zarr and see if/how it can be leveraged. Like ZarrFileSequenceStore, the implementation should be independent of TIFF and work with other multi-frame file formats and readers.

> Which has performed relatively faster than self.tiffSequence.aszarr

Makes sense, depending on access pattern, since the chunks in the files are very large.

Just noticed that the chunk cache in ZarrFileSequenceStore is not really multi-thread safe. Something else to fix...
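
For illustration only, one common way to make a small chunk cache safe under threads is to guard the check-and-load step with a lock. LockedChunkCache is a hypothetical class, not the fix that went into tifffile, and the loader is a stand-in for reading a file.

```python
import threading


class LockedChunkCache:
    """Hypothetical single-entry chunk cache guarded by a lock."""

    _MISSING = object()  # sentinel so that any key, including None, is valid

    def __init__(self, loader):
        self._loader = loader          # callable: key -> chunk
        self._lock = threading.Lock()
        self._key = self._MISSING
        self._chunk = None

    def get(self, key):
        # Hold the lock across check-and-load so two threads cannot
        # interleave and leave the key and chunk out of sync
        with self._lock:
            if key != self._key:
                self._chunk = self._loader(key)
                self._key = key
            return self._chunk


loads = []


def load(key):
    loads.append(key)    # records each simulated file read
    return key.upper()


cache = LockedChunkCache(load)
threads = [threading.Thread(target=cache.get, args=('a',)) for _ in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

The trade-off is that loads are serialized: a thread asking for a different chunk waits until the current load finishes.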

@cgohlke
Owner

cgohlke commented Feb 3, 2022

Fixed in v2022.2.2.

cgohlke closed this as completed Feb 3, 2022
cgohlke added the bug (Something isn't working) label Feb 3, 2022

@FirefoxMetzger

FirefoxMetzger commented Feb 4, 2022

I just tried this on my end with a problematic file, and I can confirm that this fixes the problem.

Before (tifffile 2021.11.2):

>>> import tifffile
>>> foo = tifffile.TiffFile("Substack (1-1023).tif") 
<tifffile.TiffPage 0 @8> imagej_metadata failed with ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
>>> foo.imagej_metadata
{'ImageJ': '1.53e', 'images': 1023, 'slices': 1023, 'unit': 'micron', 'finterval': 0.06870408356189728, 'loop': False}

After (tifffile 2022.02.2):

>>> import tifffile
>>> foo = tifffile.TiffFile("Substack (1-1023).tif")
>>> foo.imagej_metadata
{'ImageJ': '1.53e', 'images': 1023, 'slices': 1023, 'unit': 'micron', 'finterval': 0.06870408356189728, 'loop': False, 'Info': ' BitsPerPixel = 8\n DimensionOrder = XYCZT\n IsInterleaved = false\n IsRGB = false\n LittleEndian = true\n PixelType = uint8\n Series 4 Name = Series006\n SizeC = 1\n SizeT = 1765\n SizeX = 512\n SizeY = 350\n SizeZ = 1\nImage name = Series006\nImage|ATLConfocalSettingDefinition|ActiveCS_SubModeForRLD = 1000\n [... lets avoid flooding the thread ... ]\nImage|ChannelDescription|DataType = 0\nImage|ChannelDescription|IsLUTInverted = 0\nImage|ChannelDescription|LUTName = Green\nImage|ChannelDescription|Max = 2.550000e+002\nImage|ChannelDescription|Min = 0.000000e+000\nImage|ChannelDescription|Resolution = 8\nImage|ChannelScalingInfo|Automatic = 0\nImage|ChannelScalingInfo|BlackValue = 0\nImage|ChannelScalingInfo|GammaValue = 1\nImage|ChannelScalingInfo|WhiteValue = 1\nImage|DimensionDescription|BitInc = 0\nImage|DimensionDescription|BytesInc = 179200\nImage|DimensionDescription|DimID = 4\nImage|DimensionDescription|Length = 1.211940e+002\nImage|DimensionDescription|NumberOfElements = 1765\nImage|DimensionDescription|Origin = 0.000000e+000\nImage|DimensionDescription|Unit = s\nImage|TimeStampList|NumberOfTimeStamps = 1765\nLocation = D:\\Zebrafish\\190205_nanoKTP_kif5a\\190205_nanoKTP_kif5a.lif\n', 'Labels': ['t:2/1765 - Series006', 't:3/1765 - Series006', 't:4/1765 - Series006',[... lets avoid flooding the thread ... ] 't:1023/1765 - Series006', 't:1024/1765 - Series006']}
>>> 
