Wrong page indexing for an incomplete dataset #65

ziw-liu · 2023-02-22T02:48:12Z

For certain incomplete datasets (mehta-lab/recOrder#320), the current iohub (or waveorder.io) gathers the wrong page numbers (the first page is assigned page number 511):

0    /hpc/projects/comp_micro/rawdata/falcon/zebraf...
1                                                  511
2                                                  162
Name: (0, 0, 0, 0), dtype: object

The byte offset is also wrong because MicromanagerOmeTiffReader uses hard-coded magic numbers to determine page offsets:

https://github.com/czbiohub/iohub/blob/0fdca8bad97c8a4c6a1b69e54923c576a7e6b44c/iohub/multipagetiff.py#L122-L142
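As an illustration of reading offsets from the file itself instead of relying on fixed constants, here is a minimal sketch that parses the 8-byte header of a classic (non-BigTIFF) TIFF file to locate the first IFD. The function name `first_ifd_offset` is hypothetical and this is not how iohub locates pages; it only shows that the offset is recorded in the file and can be validated while reading.

```python
import struct


def first_ifd_offset(header: bytes) -> int:
    """Return the first IFD offset stored in a classic TIFF header."""
    # Bytes 0-1 give the byte order: "II" is little-endian, "MM" is big-endian.
    byte_order = {b"II": "<", b"MM": ">"}.get(header[:2])
    if byte_order is None:
        raise ValueError("not a TIFF file: bad byte-order mark")
    # Bytes 2-3 hold the magic number 42, bytes 4-7 the first IFD offset.
    magic, offset = struct.unpack(byte_order + "HI", header[2:8])
    if magic != 42:
        raise ValueError(f"unsupported TIFF magic number: {magic}")
    return offset


# Synthetic headers built for this example (not taken from the dataset above).
little_endian = struct.pack("<2sHI", b"II", 42, 8)
big_endian = struct.pack(">2sHI", b"MM", 42, 4096)
first_little = first_ifd_offset(little_endian)
first_big = first_ifd_offset(big_endian)
```

A header that fails either check raises `ValueError` instead of yielding an offset that points at arbitrary bytes.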

This results in reading the wrong image for the first frame:

>>> reader.get_image(0,0,0,0).mean()
30643.720915794373

By contrast, TiffFile('/path/to/first/image/').asarray()[0].mean() returns an average intensity of 194, which is consistent with the MM GUI readout.


ziw-liu · 2023-02-22T17:00:01Z

> This results in reading the wrong image for the first frame

This raises a deeper concern about the OME-TIFF reader: the custom byte offset and memory mapping scheme can read any arbitrary bytes and will not give any notice even if the region on the disk does not contain valid array data.
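The concern can be reproduced on a toy file. The sketch below is not iohub code: the file layout, the helper `read_frame`, and the `known_offsets` guard are all assumptions made for illustration. It shows that `numpy.memmap` returns data for any in-bounds offset without complaint, and that checking the requested offset against a list of offsets taken from the file's own page records is one way to get an error instead.

```python
import os
import tempfile

import numpy as np

SHAPE, DTYPE = (4, 4), np.uint16
HEADER_SIZE = 16  # toy stand-in for the bytes preceding the pixel data


def read_frame(path, offset, known_offsets=None):
    """Copy one frame out of the file via a memory map.

    If `known_offsets` is given, refuse offsets that are not listed in it.
    """
    if known_offsets is not None and offset not in known_offsets:
        raise ValueError(f"offset {offset} is not a known page offset")
    mapped = np.memmap(path, dtype=DTYPE, mode="r", offset=offset, shape=SHAPE)
    return np.array(mapped)  # copy so the map can be released


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.bin")
    with open(path, "wb") as f:
        f.write(b"\x00" * HEADER_SIZE)
        f.write(np.full(SHAPE, 194, dtype=DTYPE).tobytes())
        f.write(b"\xff" * 64)  # bytes that are not part of the frame

    good = read_frame(path, offset=HEADER_SIZE)
    bad = read_frame(path, offset=HEADER_SIZE + 24)  # wrong offset, no error

    try:
        read_frame(path, offset=HEADER_SIZE + 24, known_offsets={HEADER_SIZE})
        rejected = False
    except ValueError:
        rejected = True
```

Here `good` holds the written frame, `bad` silently mixes frame bytes with unrelated bytes, and the guarded call is rejected.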

ziw-liu · 2023-02-22T23:14:33Z

Since the dataset did not finish acquiring, the Micro-Manager index map may be longer than the actual TIFF page list. For example, in file 'Pos4_39.ome.tif', the channel index map (used by iohub to determine the page count) is:

array([4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5,
       5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
       5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
       5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
       5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
       5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0], dtype=uint32)

This array has a shape of (512,).

However, there are only 130 pages in the file. Note that MM actually preallocates space for the (planned) complete acquisition, so this incomplete file has the same 4 GB size as a full TIFF file with ~500 pages.
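One way to reconcile the index map with the pages actually written is to trim it before using it for page counting. The sketch below is not the change made in iohub: it assumes that entries for unwritten pages are left zero-filled, including their byte offset, and the helper `count_written_pages` and the toy arrays are hypothetical.

```python
import numpy as np


def count_written_pages(offsets):
    """Return the number of entries before the first zero byte offset."""
    offsets = np.asarray(offsets)
    empty = np.flatnonzero(offsets == 0)
    return int(empty[0]) if empty.size else int(offsets.size)


# Toy index map planned for 12 pages, of which only 7 were written.
channel = np.array([4, 4, 4, 5, 5, 5, 5, 0, 0, 0, 0, 0], dtype=np.uint32)
offset = np.array([8, 108, 208, 308, 408, 508, 608, 0, 0, 0, 0, 0], dtype=np.uint32)

n_pages = count_written_pages(offset)
channel_valid = channel[:n_pages]  # drop the preallocated, unwritten tail
```

The count is taken from the offsets rather than from the channel indices because 0 is also a legitimate channel index, so a zero in the channel map alone does not show whether a page was written.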

ziw-liu added the "bug" label Feb 22, 2023

ziw-liu self-assigned this Feb 22, 2023

ziw-liu mentioned this issue Feb 22, 2023

Performance comparison between tifffile and iohub's custom OME-TIFF implementation #66

Open

ziw-liu mentioned this issue Feb 22, 2023

Count valid TIFF pages when gathering index map #67

Merged

ziw-liu added this to the 0.1.0 milestone Feb 23, 2023

edyoshikun closed this as completed in #67 Feb 25, 2023
