pims.open should handle lists of URLs #310

Open
mrocklin opened this issue Oct 8, 2018 · 11 comments · May be fixed by #316
@mrocklin

mrocklin commented Oct 8, 2018

I have several images on a cloud object store. I can get normal HTTP routes to these images and I'm happy to see that pims can read these individually. However, my guess is that normal glob processing won't successfully find all of the images on the cloud object store. I'm happy to provide this list myself, but it looks like I'm unable to pass a list of filenames into pims.open.

In [1]: import pims

In [2]: url = 'https://pydata.org/images/logo.png'

In [3]: pims.open(url)
Out[3]: 
<Frames>
Length: 1 frames
Frame Shape: 53 x 126
Pixel Datatype: uint8

In [4]: pims.open([url, url, url])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-4b4693a3872c> in <module>()
----> 1 pims.open([url, url, url])

~/Software/anaconda/lib/python3.6/site-packages/pims/api.py in open(sequence, **kwargs)
    147     >>> frame_shape = video.frame_shape # Pixel dimensions of video
    148     """
--> 149     files = glob.glob(sequence)
    150     if len(files) > 1:
    151         # todo: test if ImageSequence can read the image type,

~/Software/anaconda/lib/python3.6/glob.py in glob(pathname, recursive)
     18     zero or more directories and subdirectories.
     19     """
---> 20     return list(iglob(pathname, recursive=recursive))
     21 
     22 def iglob(pathname, *, recursive=False):

~/Software/anaconda/lib/python3.6/glob.py in _iglob(pathname, recursive, dironly)
     38 
     39 def _iglob(pathname, recursive, dironly):
---> 40     dirname, basename = os.path.split(pathname)
     41     if not has_magic(pathname):
     42         assert not dironly

~/Software/anaconda/lib/python3.6/posixpath.py in split(p)
    103     """Split a pathname.  Returns tuple "(head, tail)" where "tail" is
    104     everything after the final slash.  Either part may be empty."""
--> 105     p = os.fspath(p)
    106     sep = _get_sep(p)
    107     i = p.rfind(sep) + 1

TypeError: expected str, bytes or os.PathLike object, not list

Note that this behavior is currently advertised in the docstring

In [5]: pims.open?
Signature: pims.open(sequence, **kwargs)
Docstring:
Read a filename, list of filenames, or directory of image files into an
iterable that returns images as numpy arrays.

Parameters
----------
sequence : string, list of strings, or glob
    The sequence you want to load. This can be a directory containing
    images, a glob ('/path/foo*.png') pattern of images,
    a video file, or a tiff stack
kwargs :
    All keyword arguments will be passed to the reader.
@mrocklin
Author

mrocklin commented Oct 8, 2018

cc @jakirkham

@nkeim
Contributor

nkeim commented Oct 8, 2018

Hmm. Does the pims.ImageSequence() constructor work in this case?

@mrocklin
Author

mrocklin commented Oct 8, 2018

In [1]: import pims

In [2]: pims.ImageSequence(['https://pydata.org/images/logo.png'])
Out[2]: 
<Frames>
Source: (list of images)
Length: 1 frames
Frame Shape: (53, 126, 4)
Pixel Datatype: uint8

nkeim added the bug label Oct 8, 2018
nkeim changed the title from "Can I provide a list of addresses to pims.open?" to "pims.open should handle lists of URLs" Oct 8, 2018
@nkeim
Contributor

nkeim commented Oct 8, 2018

Great! I just confirmed that pims.ImageSequence(['https://pydata.org/images/logo.png'] * 3) also works as intended. So we need to either handle lists as another special case in pims.open(), or clarify the docstring.

@nkeim
Contributor

nkeim commented Oct 8, 2018

(In the meantime we have ImageSequence as a workaround.)
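
A minimal sketch of that workaround for downstream code, assuming only what this thread shows: pims.open accepts a single string, and pims.ImageSequence accepts a list. The helper names `is_path_list` and `open_frames` are hypothetical and not part of pims.

```python
def is_path_list(source):
    """Return True for a non-string iterable of paths or URLs."""
    if isinstance(source, (str, bytes)):
        return False
    return hasattr(source, "__iter__")


def open_frames(source):
    """Hypothetical downstream helper: route lists around pims.open."""
    import pims  # imported lazily so is_path_list works without pims installed

    if is_path_list(source):
        # Lists are read as-is by ImageSequence (confirmed in this thread).
        return pims.ImageSequence(list(source))
    # Single filenames, globs, and single URLs go through pims.open.
    return pims.open(source)
```

With this, `open_frames([url, url, url])` should take the ImageSequence path, while `open_frames(url)` behaves like the `pims.open(url)` call in the original report.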

@mrocklin
Author

mrocklin commented Oct 9, 2018

> So we need to either handle lists as another special case in pims.open(), or clarify the docstring.

From the perspective of someone who makes tools downstream of this package, the former would be preferred. It would save me from having to add branching downstream.

@jakirkham
Contributor

Looks like a duplicate of issue ( #295 ).

@jakirkham
Contributor

Maybe PR ( #316 ) helps?

@jakirkham
Contributor

This may take some more work.

The issue is that pims.open supports a few different things, and some of them conflict with each other. For instance, running glob on a local path is fine, but running glob on a URL is not going to work: glob only searches the local filesystem, so a URL never matches anything.

>>> import glob
>>> glob.glob("about.png")
['about.png']
>>> glob.glob("http://www.imagexd.org/images/about.png")
[]
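
One possible way around that conflict, sketched here as an assumption rather than anything pims does: check whether a string looks like a remote URL before handing it to glob. The names `REMOTE_SCHEMES`, `is_url`, and `expand` are hypothetical, and the scheme set is an illustrative guess.

```python
import glob
from urllib.parse import urlparse

# Assumed set of remote schemes; an explicit set avoids treating a
# Windows drive letter such as "C:" as a URL scheme.
REMOTE_SCHEMES = {"http", "https", "ftp"}


def is_url(text):
    """Return True if text starts with one of the assumed remote schemes."""
    return urlparse(text).scheme in REMOTE_SCHEMES


def expand(pattern):
    """Return URLs unchanged; expand everything else with glob."""
    if is_url(pattern):
        return [pattern]
    return sorted(glob.glob(pattern))
```

Here `expand("http://www.imagexd.org/images/about.png")` returns the URL in a one-element list instead of the empty list that glob produces.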

@nkeim
Contributor

nkeim commented Jan 21, 2019

Thanks for identifying this issue and proposing a (starting) solution!

I agree that #316 leads to more complication. The behavior of ImageSequence has been to assume that if the argument is not a string, the user must have already used glob or whatever, so the filenames will be taken literally. This saves the user from having to worry about the presence of ?*[] in the filenames. I think this makes sense: stringing together glob expressions is just one of many ways to construct an ordered list of files, so we let the user do that themselves.

One solution is to just replicate this behavior in pims.open(), using the same type test as in ImageSequence. If the argument is any non-string iterable (including a list of URLs), it will be passed to ImageSequence directly.
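
A rough sketch of that type test, covering only the file-resolution step and not the reader selection that pims.open also performs. `resolve_files` is a hypothetical name, and the `isinstance(..., str)` check is an assumption about how the test would be written.

```python
import glob


def resolve_files(sequence):
    """Expand a string with glob; take any other iterable literally."""
    if isinstance(sequence, str):
        return sorted(glob.glob(sequence))
    # Non-string iterable: the caller already built the list, so
    # characters such as ? * [ ] are not treated as glob patterns.
    return list(sequence)
```

Under this scheme a list of URLs never reaches glob, and a filename that happens to contain `*` or `?` survives unchanged when passed inside a list.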

@mrocklin would this address the issue?

@mrocklin
Author

Yes, that sounds good to me
