Glob sort helper doesn't work when numbers in filenames are padded with spaces #309

Open
prashnts opened this issue Sep 1, 2018 · 3 comments

prashnts commented Sep 1, 2018

Hey, thanks for this library! There's an issue with the implementation of pims.utils.sort.natural_keys: path names sorted with it as the key come out in the wrong order.

Reproducing

Use a glob pathspec that matches filenames whose numbers are padded with spaces.
Example: /path/to/files/img 1.tif, /path/to/files/img 42.tif, etc.

You can use the following snippet to reproduce this:

>>> import pims.utils.sort
>>> fake_names = [f'/data/meh/img-{x:5d}.tiff' for x in range(20)]
>>> sorted(fake_names, key=pims.utils.sort.natural_keys)
['/data/meh/img-   10.tiff',
 '/data/meh/img-   11.tiff',
 '/data/meh/img-   12.tiff',
 '/data/meh/img-   13.tiff',
 '/data/meh/img-   14.tiff',
 '/data/meh/img-   15.tiff',
 '/data/meh/img-   16.tiff',
 '/data/meh/img-   17.tiff',
 '/data/meh/img-   18.tiff',
 '/data/meh/img-   19.tiff',
 '/data/meh/img-    0.tiff',
 '/data/meh/img-    1.tiff',
 '/data/meh/img-    2.tiff',
 '/data/meh/img-    3.tiff',
 '/data/meh/img-    4.tiff',
 '/data/meh/img-    5.tiff',
 '/data/meh/img-    6.tiff',
 '/data/meh/img-    7.tiff',
 '/data/meh/img-    8.tiff',
 '/data/meh/img-    9.tiff']

Expected result:

['/data/meh/img-    0.tiff',
 '/data/meh/img-    1.tiff',
 '/data/meh/img-    2.tiff',
 '/data/meh/img-    3.tiff',
 '/data/meh/img-    4.tiff',
 '/data/meh/img-    5.tiff',
 '/data/meh/img-    6.tiff',
 '/data/meh/img-    7.tiff',
 '/data/meh/img-    8.tiff',
 '/data/meh/img-    9.tiff',
 '/data/meh/img-   10.tiff',
 '/data/meh/img-   11.tiff',
 '/data/meh/img-   12.tiff',
 '/data/meh/img-   13.tiff',
 '/data/meh/img-   14.tiff',
 '/data/meh/img-   15.tiff',
 '/data/meh/img-   16.tiff',
 '/data/meh/img-   17.tiff',
 '/data/meh/img-   18.tiff',
 '/data/meh/img-   19.tiff']

I can make a pull-request with a fix implemented if you're interested.

Note that a plain sorted call, without this key function, orders these names correctly. Tested with Python 3.6 on macOS.
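A likely explanation, sketched under an assumption: if the key function splits each name on runs of digits (as common natural-sort recipes do), the spaces stay in the text piece before the number. That piece is shorter for two-digit numbers, and a string that is a prefix of another sorts first, so 10 to 19 land ahead of 0 to 9. The helper `digit_split_key` below is a hypothetical stand-in, not the library's code.

```python
import re

def digit_split_key(text):
    # Hypothetical stand-in for a natural-sort key: split on runs of digits
    # and turn each run into an int, leaving the other pieces as strings.
    return [int(part) if part.isdigit() else part
            for part in re.split(r'(\d+)', text)]

fake_names = [f'/data/meh/img-{x:5d}.tiff' for x in range(20)]

# The text before the number keeps the padding: four spaces for one-digit
# numbers, three spaces for two-digit numbers.
print(digit_split_key(fake_names[0]))   # ['/data/meh/img-    ', 0, '.tiff']
print(digit_split_key(fake_names[10]))  # ['/data/meh/img-   ', 10, '.tiff']

# The shorter text piece is a prefix of the longer one, so it compares
# smaller and the numbers are never reached when ordering 10 against 0.
by_key = sorted(fake_names, key=digit_split_key)

# Plain string comparison works here because a space sorts before any digit
# and all names have the same width.
plain = sorted(fake_names)
```

Here `by_key` starts with the name for 10 and ends with the name for 9, while `plain` matches the original order.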

Thanks!

@nkeim
Contributor

nkeim commented Sep 2, 2018 via email

@tacaswell
Member

While I agree we should fix this bug, I strongly encourage you to switch to zero-padded file names (or at least filenames without spaces in them).

@prashnts
Author

prashnts commented Sep 2, 2018

I agree @tacaswell -- however, this dataset was obtained by a colleague. @nkeim I'll send a PR asap, and will also add a few test cases for this. Thank you both!
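For illustration only, here is one shape such a fix and its tests could take. It is a sketch, not the change that was actually submitted: the name `padded_natural_key` is hypothetical, and it assumes the padding consists of plain spaces directly before the digits.

```python
import re

def padded_natural_key(text):
    # Hypothetical variant: fold spaces directly before a digit run into
    # the numeric token so the padding never reaches the string pieces.
    parts = re.split(r'( *\d+)', text)
    # re.split with one capturing group alternates text, match, text, ...
    # so odd positions always hold the (possibly space-padded) numbers.
    # int() ignores surrounding whitespace, so '   10' converts to 10.
    return [int(part) if i % 2 else part for i, part in enumerate(parts)]

# Space-padded names from the report, fed in reverse order.
fake_names = [f'/data/meh/img-{x:5d}.tiff' for x in range(20)]
resorted = sorted(reversed(fake_names), key=padded_natural_key)

# Names with a single separating space, as in the first example.
loose = ['/path/to/files/img 42.tif',
         '/path/to/files/img 1.tif',
         '/path/to/files/img 7.tif']
loose_sorted = sorted(loose, key=padded_natural_key)
```

With this key, `resorted` equals `fake_names` and `loose_sorted` runs 1, 7, 42. One consequence of the design: names that differ only in the amount of padding get equal keys, so they keep their input order because `sorted` is stable.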

@nkeim nkeim added the bug label Apr 30, 2020
EitanHemed added a commit to EitanHemed/pims that referenced this issue May 11, 2023
… docstring for a more formal description; added type hints.
EitanHemed added a commit to EitanHemed/pims that referenced this issue May 11, 2023