New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support multiple time points per image file #89
support multiple time points per image file #89
Conversation
Initially with nplanes specified, keys for image rdds were (fileidx, pointInFile) integer pairs. However this causes problems downstream in blocks conversion. This commit changes that back to straight integer indices. The downside is that for tifs, the entire tif must be loaded into memory in order to count the number of planes, whereas previously this could be pipelined lazily.
also fixes a bug with Images properties not being set correctly when nplanes is passed
Meh. This has some overlap with PR #87, which changes the expected dimensions of images with only a single z-plane from |
This is consistent with the new behavior defined in PR 87, in the tifs-and-tif-stacks branch. New imagesloader code in this branch already anticipates this new behavior, this commit just updates seriesloader for consistency with imagesloader and changes test expectations accordingly.
@industrial-sloth you were right, merging the other |
Conflicts: python/test/test_context.py python/test/test_imagesloader.py python/thunder/rdds/fileio/imagesloader.py python/thunder/utils/context.py
Whoops, missed changing some function names in the merge, fixing. |
@@ -75,34 +76,82 @@ def fromStack(self, dataPath, dims, dtype='int16', ext='stack', startIdx=None, s | |||
If true, will recursively descend directories rooted at datapath, loading all files in the tree that | |||
have an extension matching 'ext'. Recursive loading is currently only implemented for local filesystems | |||
(not s3). | |||
|
|||
nplanes: positive integer, default None | |||
If passed, will cause a single binary stack file to be subdivided into multiple time points. Every |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's avoid the explicit references to time
. I just checked and we're generally good about this, though in fixing this would be great if you caught other users. There's certainly a notion of the images being /ordered/ both here and elsewhere, but that ordering need not relate to time specifically.
@industrial-sloth this is looking good, left a couple minor notes. One other question: how does the ordering work? Am I right in assuming that, given three (X,Y,Z) image files, call them 1, 2, 3, and setting nplanes=Z/2, you would end up with an ordering of 1_1, 1_2, 2_1, 2_2, 3_1, 3_2, where 1_1 and 1_2 are the first and second chunks of the first image, etc.. In other words, it'll be ordered first by the image files, then by the chunks. That seems like what we want, just making sure. |
@freeman-lab yes, that's the behavior. For concreteness, if we had N 10-page tif files and set |
Previously docs referred to 'time points', now describe more general 'records'.
Ok, I've changed the docstrings to talk about generic "records" rather than "time points". Looks like there's a merge conflict to resolve, will fix that momentarily. |
Conflicts: python/thunder/rdds/fileio/imagesloader.py python/thunder/utils/context.py
Have added checks that The flat binary stack file case is just fine, since there the driver knows what the dimensions are ahead of time and can do the check before launching jobs. |
Thanks @industrial-sloth , agreed that doing the check on the executors for the |
support multiple time points per image file
This PR adds an
nplanes
option to the main Images-loading methods. Ifnplanes
is specified, then a single input file will be interpreted as containing multiple image volumes, each with sizenplanes
in its final dimension. For instance, a single binary stack file loaded with argumentsdims=(x, y, 8), nplanes=2
would turn into 4 separate records in an Images RDD, each with size(x, y, 2)
. In general, images that are loaded withz
planes and a positivenplanes
argument will result inz / nplanes
time points, each withnplanes
planes.