Update the tif reading testing and implementation #170

swkeemink · 2021-06-07T08:04:30Z

In this pull request I have updated our previous tif reading test, and added two new ones.

One reproduces exactly the way Suite2P saves their tifs (which is breaking our current tif loading, see #148), and the other uses a more general way of saving tif files. All three can show inconsistent behaviour under different ways of loading tifs (depending on whether the file is loaded as a single 3D frame or every frame is seen as a separate series), and our tif reading should be updated to pass these tests.

@scottclowe will update our current tif-reading method, and add that to this PR.

scottclowe · 2021-06-07T08:41:59Z

Great, thanks Sander! Also, thanks for linking to a stable copy of suite2p's save_tiff function. :)

scottclowe · 2021-06-07T08:54:51Z

@swkeemink

On Python 3.5, the test_tiffwriter_tiff fails while trying to write the tif file. It is not clear that there would be an issue reading the contents of that file after it was saved. On Python 3.5, this is the only unit test which fails. That means test_suite2p_tiff passes on Python 3.5.

>               tif.write(self.expected[i, :, :], contiguous=True)
E               AttributeError: 'TiffWriter' object has no attribute 'write'

On Python 3.9, test_tiffwriter_tiff passes. Meanwhile test_suite2p_tiff fails, which is currently expected.

        # assert equality
>       self.assert_equal(actual, self.expected)

Clearly at some point the write method was added to tifffile.TiffWriter, and I guess the behaviour of save was changed. tifffile has also dropped py3.5 support, so we get an old version of tifffile when we unit test on py3.5 and a new version on py3.9.

I think the two different tiff files need to be committed to the test resources directory so we aren't relying on being able to write the tiffs at test time. Could you make this change - generate the tiff files and commit them, instead of writing and tearing down during the unit tests?
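For the script that generates the committed files, a minimal sketch of a helper that tolerates both method names could look like the following. It assumes only what the tracebacks above show: some writer objects expose `write` and others only `save`. The helper name `write_page` and the two stand-in writer classes are hypothetical, and are used so the sketch runs without tifffile installed.

```python
import numpy as np


def write_page(writer, page, **kwargs):
    """Write one page using whichever method the writer object provides."""
    # Prefer the newer `write` method; fall back to the older `save`.
    method = getattr(writer, "write", None) or getattr(writer, "save", None)
    if method is None:
        raise AttributeError("writer has neither a 'write' nor a 'save' method")
    return method(page, **kwargs)


class OldStyleWriter:
    """Hypothetical stand-in for a writer that only has `save`."""

    def __init__(self):
        self.pages = []

    def save(self, page, **kwargs):
        self.pages.append(np.asarray(page))


class NewStyleWriter:
    """Hypothetical stand-in for a writer that only has `write`."""

    def __init__(self):
        self.pages = []

    def write(self, page, **kwargs):
        self.pages.append(np.asarray(page))


frames = np.arange(24, dtype=np.int16).reshape(4, 3, 2)
old_writer, new_writer = OldStyleWriter(), NewStyleWriter()
for frame in frames:
    write_page(old_writer, frame)
    write_page(new_writer, frame, contiguous=True)
```

Committing pre-built TIFFs removes the need for this at test time; a shim like this would only matter when regenerating the resources.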

swkeemink · 2021-06-07T09:14:26Z

Now added to the PR!

scottclowe · 2021-06-07T09:25:33Z

Great thanks. I'll move the files to the right place.

Not sure what is meant by

using tifffile.__version__ = tifffile.imsave('test_imsave.tif', data).

etc. in the docstrings. Did you copy and paste the wrong thing? It looks like you meant to write the tifffile version numbers but copied code instead.

scottclowe · 2021-06-07T09:35:13Z

@swkeemink I'll hold off on implementing this until #156 is done. Both PRs touch the same parts of the code base, so otherwise whichever PR is finished second will have to deal with merge conflicts.

- Add test tiff files to tests/resources/tiffs
- Use these pre-made tiffs for tif-loading tests, instead of building them on the fly.
- This is important because the tiff writing functions have changed their behaviour across tifffile versions, and we want stable tif files to test on.


codecov · 2021-06-10T08:24:10Z

Codecov Report

Merging #170 (4283130) into master (73cee3b) will decrease coverage by 0.37%.
The diff coverage is 76.47%.

@@            Coverage Diff             @@
##           master     #170      +/-   ##
==========================================
- Coverage   93.03%   92.65%   -0.38%     
==========================================
  Files           8        8              
  Lines         818      831      +13     
  Branches      161      165       +4     
==========================================
+ Hits          761      770       +9     
- Misses         29       31       +2     
- Partials       28       30       +2

Flag	Coverage Δ
unittests	`92.53% <76.47%> (-0.38%)`	⬇️


Impacted Files	Coverage Δ
fissa/extraction.py	`95.87% <76.47%> (-4.13%)`	⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

scottclowe

I expanded on the unit tests by programmatically generating tiffs in the following ways:

- dtype varied over uint8, uint16, uint64, int16, int64, float16, float32, float64
- writer parameters varied over tifffile.imsave with bigtiff=True/False, imagej=True/False; tifffile.TiffWriter().save; tifffile.TiffWriter().write with contiguous=False, contiguous=True, and a mixture of contiguous=False/True; tifffile.TiffWriter() with a mixture of write and save
- shapes (6, 3, 2), (3, 2, 3, 2), (2, 1, 3, 3, 2), and a series of pages which do not stack [(4, 3, 2), (2, 3, 2)]

The code to generate the tiffs is added as fissa/tests/generate_tiffs.py, and was executed using imageio 2.9.0 and tifffile 2021.6.6 on Python 3.8.
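As a rough illustration of how such a matrix can be enumerated, here is a minimal sketch. It is not the contents of generate_tiffs.py: the dtypes and stackable shapes are taken from the list above, while the `make_data` helper and the naming scheme are hypothetical, and the actual TIFF writing step is left out.

```python
import itertools

import numpy as np

# Values taken from the description above.
DTYPES = ["uint8", "uint16", "uint64", "int16", "int64", "float16", "float32", "float64"]
SHAPES = [(6, 3, 2), (3, 2, 3, 2), (2, 1, 3, 3, 2)]


def make_data(shape, dtype):
    """Build deterministic test data of the requested shape and dtype."""
    return np.arange(np.prod(shape)).reshape(shape).astype(dtype)


# One (name, array) pair per combination; the naming scheme is hypothetical.
cases = {}
for dtype, shape in itertools.product(DTYPES, SHAPES):
    name = "{}_{}d".format(dtype, len(shape))
    cases[name] = make_data(shape, dtype)
```

Each array in `cases` would then be passed to one of the writer variants listed above and kept as the expected value for the corresponding loading test.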

Since the tifffile documentation describes ImageJ hyperstacks as being able to store up to 6-dimensional image data, I have made our DataHandlerTifffile loading method able to handle arbitrarily sized TIFFs of at least 2 dimensions.

Note that the way we handle higher-dimensional image data is to flatten all dimensions beyond the two innermost dimensions (height and width) together. This assumes all higher dimensions are time-like, and do not include spatial changes. This means that (for instance) FISSA will try to run on 4d TIFF stacks which have both time and depth dimensions in addition to height and width (TDHW) without raising an error, and will obviously do a really bad job with them, because the origin of signals is not shaped like a cylinder across the depth dimension.
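The flattening can be sketched with a plain NumPy reshape. This is a minimal illustration of the idea rather than the DataHandlerTifffile code itself; the function name `flatten_leading_dims` is hypothetical.

```python
import numpy as np


def flatten_leading_dims(stack):
    """Collapse every axis before the last two into a single frame axis."""
    stack = np.asarray(stack)
    if stack.ndim < 2:
        raise ValueError("Expected at least 2 dimensions, got {}".format(stack.ndim))
    # Keep height and width; let NumPy infer the combined frame count.
    return stack.reshape([-1] + list(stack.shape[-2:]))


tdhw = np.zeros((5, 4, 3, 2))  # time, depth, height, width
frames = flatten_leading_dims(tdhw)
single = flatten_leading_dims(np.zeros((3, 2)))
```

Here the 5 time points and 4 depth planes become 20 frames of shape (3, 2), which is exactly why a TDHW stack is treated as if every depth plane were another time point.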

Since we can't really inspect the DataHandlerPillow data variable and see if it is correct (the return is a PIL.Image handle with lazy loading, and not an ndarray), I have added tests which check that the mean over all frames is accurate using both image2array and getmean as part of the same test.

We find that DataHandlerTifffile (with the new changes) can handle all of the formats tested against, but DataHandlerPillow cannot. DataHandlerPillow does not support:

uint64
int64
float16
float64
bigtiff
unstackable shapes [(4, 3, 2), (2, 3, 2)]

@swkeemink, could you review the changes I have made?

swkeemink · 2021-06-10T09:27:53Z

Question: do the expanded tests catch the bug thrown by using the suite2p tests? (only a single frame read)
Their way of saving tiffs is now not included explicitly but I'm guessing one of the tifffile.imread's functionally does the same.

swkeemink

Overall great changes and additions. See above for my question about Suite2p output.

About Pillow not passing all the tests, that might just be what it is. It does make the 'low-memory mode' use case a bit confusing. I would actually consider taking it out, given that it apparently has such poor support for different tiff files.

swkeemink · 2021-06-10T09:30:19Z

fissa/tests/generate_tiffs.py

+
+
+if __name__ == "__main__":
+    print("Using imageio {}".format(imageio.__version__))


We should print the used package versions into a text-file so it's clear which version was used for the actual tiffs stored on the repo now.

If you look at the commit (120784d) that introduced/last modified these test resources (which is an obvious thing to do if you are trying to check the provenance of the files), it says the version numbers in the commit message. I don't think we need it in a text file. To me at least, it's not obvious to expect that there should be a text file with this content to look for, and it will easily be lost in amongst a directory of 92 tiff files.

Fair enough, happy to not have that.

swkeemink · 2021-06-10T09:33:38Z

fissa/extraction.py

+                            image, page.ndim, page.shape
+                        )
+                    )
+                shp = [-1] + list(page.shape[-2:])


We should perhaps throw a warning at least when the dimensionality is higher than 3, and state that we concatenate all dimensions > 3.

Probably a good idea, yes.

I've added the warnings, and tests that make sure the warnings are raised.

However, it turns out most higher-dimensional TIFF stacks really are loaded frame-by-frame when loaded page-by-page with TiffReader, so we can't always tell that the structure it was saved as had more than 3 dimensions. How I've done it is the best we can do though.
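A minimal sketch of this kind of warning, and of a test that checks it fires, is shown below. It assumes a warning is issued whenever a page has more than 3 dimensions; the function name `reshape_page` and the message wording are hypothetical and are not copied from fissa/extraction.py.

```python
import warnings

import numpy as np


def reshape_page(page):
    """Flatten a page to (frames, height, width), warning if it had more than 3 axes."""
    page = np.asarray(page)
    if page.ndim > 3:
        warnings.warn(
            "Page has {} dimensions with shape {}; all leading dimensions will be "
            "flattened into a single frame axis.".format(page.ndim, page.shape),
            UserWarning,
        )
    return page.reshape([-1] + list(page.shape[-2:]))


# Record warnings so that a test can inspect them.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    out = reshape_page(np.zeros((2, 3, 4, 5)))
```

Recording with `warnings.catch_warnings(record=True)` also works on interpreters where `assertWarns` is unavailable.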

swkeemink · 2021-06-10T09:42:48Z

fissa/extraction.py

-            return tifffile.imread(image)
-        return np.array(image)
+        if not isinstance(image, basestring):
+            return np.array(image)


We could throw a TypeError() if the input is neither an array nor a basestring, perhaps even with the suggestion to write a new datahandling method. We ask for array-like inputs so it can also handle lists I guess, so perhaps we can have:

Suggested change

-            return np.array(image)
+            if isinstance(image, (np.ndarray, list)):
+                return np.array(image)
+            else:
+                raise TypeError('Wrong image input format: unfamiliar shape. Expected array_like or string.')

Although lists of things can be lots of things and probably not good to expect.

I think it is better to do duck typing - anything which is array-like should be handled in this way. It could be a list of lists. It could be a tuple instead of a list. Or it could be a pandas.DataFrame. Or any arbitrary array-like data structure.

Either way, I've not altered this bit of the code (the behaviour w.r.t. non-strings is the same as it was before), so changing it is off-topic for this PR.

Incidentally, I think it would be better to use np.asarray instead of np.array, since np.array will make a copy of the memory if the input is already an ndarray and np.asarray doesn't make a copy.
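A short sketch of both points, using only NumPy: np.asarray accepts any array-like input without an explicit type check, and it returns an existing ndarray unchanged instead of copying it. The variable names are illustrative only.

```python
import numpy as np

existing = np.zeros((4, 3, 2))

copied = np.array(existing)    # makes a new copy by default
viewed = np.asarray(existing)  # returns the input itself, no copy

# Duck typing: other array-like inputs are converted without type checks.
from_list = np.asarray([[1, 2], [3, 4]])    # list of lists
from_tuple = np.asarray(((1, 2), (3, 4)))   # tuple of tuples
```

For a large image stack that is already an ndarray, avoiding the copy saves both time and memory.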

scottclowe · 2021-06-10T10:19:45Z

Question: do the expanded tests catch the bug thrown by using the suite2p tests? (only a single frame read)
Their way of saving tiffs is now not included explicitly but I'm guessing one of the tifffile.imread's functionally does the same.

Yes, of course. It is included, it's just not called "suite2p" any more. It is the "TiffWriter.save_int16.tif" test resource. I named it after the save method used instead, to fit in with the programmatically generated suite. Also, suite2p might change their save behaviour in the future.


swkeemink requested a review from scottclowe June 7, 2021 08:04

swkeemink mentioned this pull request Jun 7, 2021

Update Suite2P implementation #150

Closed

swkeemink mentioned this pull request Jun 7, 2021

API: Datahandler now functions as a class #156

Merged

swkeemink and others added 8 commits June 9, 2021 12:57

TST: Updated generated test tif to have several frames

79a33b5

TST: Renamed current tif reading test to be about one loading method

779f916

TST: Added two tif-loading-tests with different tif generation methods

b9be1bb

DOC: Fix test docstrings

c34790c

TST:RF: Refactor extraction tests

bf8decb

TST:ENH: Expose custom assertions as functions

e5e0b8b

Not just methods attached to the BaseTestCase class.

TST:MNT: Fix shebang

bd27576

scottclowe force-pushed the tiff_read_unit_test branch 3 times, most recently from 5c530de to c3184c1 Compare June 10, 2021 08:22

TST:ENH: Add generate_tiffs.py

b5b74b8

scottclowe force-pushed the tiff_read_unit_test branch from c3184c1 to 819bd66 Compare June 10, 2021 08:52

scottclowe approved these changes Jun 10, 2021

scottclowe force-pushed the tiff_read_unit_test branch from 819bd66 to 67c9634 Compare June 10, 2021 09:04

swkeemink commented Jun 10, 2021

scottclowe added 3 commits June 12, 2021 13:05

TST: Change out tiff resources and expand extraction tests

a60d15f

TIFF test resources, in fissa/tests/resources/tiffs, were built using imageio 2.9.0 and tifffile 2021.6.6.

TST: Add tests for datahandler.getmean

a42035f

BUG: Handling multipage tiffs with >3 dims

93a0843

scottclowe added 3 commits June 12, 2021 14:22

TST:MNT: Overwrite setup_class without super, for py2.7

3caef72

REL: Remove imageio from dev dependencies

aaf854a

REL: Bump minimum Pillow to 4.3.0

8c2869f

scottclowe force-pushed the tiff_read_unit_test branch 4 times, most recently from 3629d45 to fc81ebc Compare June 13, 2021 07:39

scottclowe added 8 commits June 13, 2021 09:02

TST:DOC: Add extraction test docstrings

fe59507

TST:MNT: Determine generate_tiffs output dir relative to file

5b1b949

TST: Test on 6d TIFFs

563f11f

ENH: Warn about multipage TIFFs

e25cf77

TST: Test page reshape warnings are generated

92f88a0

TST:BUG: assertWarns requires py3.2+

786a63a

TST: Comment out some formats known unsupported by Pillow

d7d6384

MNT: No needless copy; np.array -> np.asarray

1752e7e

scottclowe force-pushed the tiff_read_unit_test branch from fc81ebc to 1752e7e Compare June 13, 2021 08:03

Merge remote-tracking branch 'upstream/master' into bug_tiff-multipage3

4283130

scottclowe merged commit 3b1ba55 into rochefort-lab:master Jun 13, 2021

swkeemink deleted the tiff_read_unit_test branch June 29, 2021 15:34

swkeemink mentioned this pull request Jul 12, 2021

ValueError: Wrong ROIs input format: unfamiliar shape. #148

Closed
