More flexible collections with custom load_func. #6276
Small change to `ImageCollection` to make it accept non-file-name sequences as the `load_pattern`, as long as a custom `load_func` is given (to which the items in the sequence will be passed in lieu of the image numbers).
Thank you very much: This is a nice addition (extension)! Can you please update the docstring accordingly?
Included an example for using ImageCollection objects with custom `load_func` and a Sequence object as the `load_pattern`.
@mkcor I updated the docstring accordingly. Happy that you like the small change.
Thank you so much, @thvoigtmann! Could you please add a test (or a doctest, i.e., an example under Examples in the docstring), so CI gets to run this new functionality? We have the following image sequences available in the current data registry:
Co-authored-by: Marianne Corvellec <marianne.corvellec@ens-lyon.org>
The added docstring example tests the loading of multiple images from one file using a custom `load_func`.
Hello @thvoigtmann! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
Comment last updated at 2022-04-30 18:57:12 UTC
Thanks for the suggestion @mkcor! I found it easier to use the multi-frame GIF file from the test data. This is my first time working with CI and docstring examples, so I hope it's looking good. Having looked at the code again, I am now confused by Specifically I think that the desired effect of the code in the example there (load a multipage TIFF file, return an object derived from
with
This works because To me this looks like one could remove I'd be happy to fix this if you have some guidance. (Fix it? Something I'm overlooking? New pull request or keep adding here? Bug report first?)
Dear @thvoigtmann, Thank you very much for your responsiveness! 🙂 The
That's great (even better for a toy example actually). PS: You can apply (code review) suggestions as a batch: https://docs.github.com/en/github/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/incorporating-feedback-in-your-pull-request
For now at least, `collection = imageio.mvolread(filename, format='pillow')`
Yes @mkcor, I fully agree with your points. I was just wondering whether it's still worth fixing the discrepancy between current
Oh, sure, that wouldn't hurt. 🙏 Continuing on `collection = imageio.mvolread(filename, format='pillow')` I noticed that
@scikit-image/core the CI failure is unrelated to this PR (docs building timed out because
Thanks for approving!
I can reproduce that. But I noted that `collection = imageio.mimread(filename)` again gives a list of length 24 whose elements have shape The real case for For example, I have a 51 GB binary file from a high-speed camera that consists of raw sensor data (8 bpp, rows x cols per frame), and with this PR merged I can use the following rough piece of code (not optimized for file-handle issues etc.):

    import os
    import numpy as np
    import skimage.io

    class BinLoader:
        """Load single frames from a raw file of fixed-size frames."""
        def __init__(self, filename, rows, cols, dtype=np.uint8):
            self.filename = filename
            self.rows = rows
            self.cols = cols
            self.dtype = np.dtype(dtype)
            self.fd = open(filename, 'rb')
            # Derive the number of frames from the file size.
            self.framebytes = rows * cols * self.dtype.itemsize
            self.fd.seek(0, os.SEEK_END)
            self.numframes = self.fd.tell() // self.framebytes
            assert self.numframes * self.framebytes == self.fd.tell()
            self.fd.seek(0)

        def __call__(self, frame):
            cnt = self.rows * self.cols
            # `offset` is counted in bytes from the current file position.
            fpos = self.framebytes * int(frame)
            self.fd.seek(0)
            return np.fromfile(self.fd, dtype=self.dtype, count=cnt,
                               offset=fpos).reshape((self.rows, self.cols))

        def range(self):
            return range(0, self.numframes)

and then do

    loader = BinLoader(filename, rows, cols)
    ic = skimage.io.ImageCollection(loader.range(), load_func=loader)

I could look into how this can be done using
FWIW, if things like MultiImage and friends are useful, we can keep them and maintain them; no one is forcing us to offload to imageio. I like the ImageCollection example in your last comment, @thvoigtmann: exactly what it was designed for!
Thanks, @stefanv, for the comment.
Based on my example, I would argue that As I discussed with @mkcor, I have a small patch ready that updates the docstring for
That sounds like a good plan, thanks @thvoigtmann. And, yes, a new branch for that doc fix will make it easy to merge.
This adds a test for `ImageCollection` called with a custom `load_func` and a sequence as the `load_pattern`.
Thank you @thvoigtmann! Just some small suggestions around, and then I'll merge it.
skimage/io/collection.py
sequence, an ImageCollection of corresponding length will be created,
and the individual images will be loaded by calling `load_func` with the
matching element of the `load_pattern` as its first argument. In this
case, the elements of the sequence do not need to be resolvable file
I didn't understand "resolvable" here. Could we use a nice, simpler synonym? 🙂
Hi @alexdesiqueira, sorry for the late response. "resolvable" here means that it is a file name that exists on the local file system. I'll propose a changed wording in the next commit.
Co-authored-by: Alexandre de Siqueira <alex.desiqueira@igdore.org>
Changed wording of docstring to be simpler.
Thank you, @thvoigtmann! Looks good to me. Merging 🙂
Description
This tiny patch allows `ImageCollection` to handle the loading of images with a custom `load_func` more flexibly. If the latter argument is set, `load_pattern` is allowed to be any Python sequence (rather than just a list of file names or file-name patterns), the elements of which are to be interpreted by the user-supplied loader.
This allows for a modification of the code given in the documentation (reading images from a video file):
(In contrast to the documentation code, this will not need to read the entire video into memory at once.)
Another use case is (random) access to images stored sequentially in one big binary file, for example in applications using high-speed cameras.
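As a rough illustration of this use case, the sketch below reads single frames from a raw binary file by byte offset. The helper name `make_frame_loader`, the frame layout (frames stored back to back, no header), and the tiny synthetic file are all assumptions; a real camera format may differ.

```python
import os
import tempfile

import numpy as np


def make_frame_loader(path, rows, cols, dtype=np.uint8):
    """Return (loader, numframes) for a raw file of back-to-back frames."""
    dtype = np.dtype(dtype)
    frame_bytes = rows * cols * dtype.itemsize
    numframes, remainder = divmod(os.path.getsize(path), frame_bytes)
    if remainder:
        raise ValueError("file size is not a whole number of frames")

    def loader(frame):
        # Read only the requested frame; `offset` is given in bytes.
        data = np.fromfile(path, dtype=dtype, count=rows * cols,
                           offset=int(frame) * frame_bytes)
        return data.reshape(rows, cols)

    return loader, numframes


# Tiny synthetic file: 5 frames of 2 x 3 pixels, frame k filled with the value k.
rows, cols = 2, 3
stack = np.repeat(np.arange(5, dtype=np.uint8), rows * cols).reshape(5, rows, cols)
with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as tmp:
    tmp.write(stack.tobytes())

loader, numframes = make_frame_loader(tmp.name, rows, cols)
frame3 = loader(3)  # random access: reads only the bytes of frame 3
os.remove(tmp.name)
```

With the behaviour this PR describes, such a loader could be passed as `ImageCollection(range(numframes), load_func=loader)` so that frames are read from disk only when accessed.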