misleading error message #54

Closed
antiphasis opened this issue Aug 31, 2015 · 6 comments

@antiphasis commented Aug 31, 2015

Setup to reproduce: Windows, Python 2.7 and the current version of scandir.
Create a few directories, one of them with a German umlaut in its name, e.g. aufträge.

mkdir c:\devel\playground\auftraege
mkdir c:\devel\playground\aufträge

Now let scandir walk the directories:

C:\>python -c "import scandir;list(scandir.walk('c:/devel/playground'))"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Python27\lib\site-packages\scandir.py", line 654, in walk
    for entry in walk(new_path, topdown, onerror, followlinks):
  File "C:\Python27\lib\site-packages\scandir.py", line 594, in walk
    scandir_it = scandir(top)
TypeError: os.scandir() doesn't support bytes path on Windows, use Unicode instead

Searching scandir.py for the error message shows that it is raised in two places:

  • line 147, inside the check for PY3 on Win32:
if IS_PY3 and sys.platform == 'win32':
    def scandir_generic(path=u'.'):
        if isinstance(path, bytes):
            raise TypeError("os.scandir() doesn't support bytes path on Windows, use Unicode instead")
        return _scandir_generic(path)
    scandir_generic.__doc__ = _scandir_generic.__doc__
else:
    scandir_generic = _scandir_generic
  • line 381, inside the IS_PY3 check that is nested in the Win32 branch:
        if IS_PY3:
            def scandir_python(path=u'.'):
                if isinstance(path, bytes):
                    raise TypeError("os.scandir() doesn't support bytes path on Windows, use Unicode instead")
                return _scandir_python(path)
            scandir_python.__doc__ = _scandir_python.__doc__
        else:
            scandir_python = _scandir_python

@benhoyt (Owner) commented Sep 1, 2015

Good find. I believe this is being raised by the code in _scandir.c, however. I think that on Python 2.x on Windows this should just silently fail and convert the special characters to ? characters. This is kind of broken behaviour, but it's what os.listdir() does. I'll look into this next time I work on scandir. What you want to do is pass in a unicode string for the scandir path argument.
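
The "?" substitution mentioned above can be illustrated with a small sketch. It does not reproduce what os.listdir() does internally; it only shows the effect of a lossy encode, with ASCII as a stand-in for the Windows ANSI code page so that it behaves the same on any platform. The helper name lossy_name is made up for this example.

```python
# Illustration only: ASCII stands in for the ANSI code page, and
# lossy_name is a hypothetical helper, not part of scandir or os.
def lossy_name(name, codepage='ascii'):
    """Encode a unicode name, replacing unencodable characters with '?'."""
    return name.encode(codepage, 'replace')


# u'auftr\xe4ge' is the directory name from the report ("aufträge").
print(lossy_name(u'auftr\xe4ge'))
```

Once the name has been mangled like this, it no longer matches the real directory on disk, which is why this behaviour is described as broken.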

@antiphasis (Author) commented Sep 1, 2015

Darn, I hadn't looked at the C code...
Yep, it works fine when the path is passed as unicode.
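
For anyone else hitting this, here is a minimal sketch of the workaround. The playground layout mirrors the one from the report but lives in a temporary directory, and the fallback to os.walk is only there so the sketch runs where the scandir package is not installed; both are assumptions of this example.

```python
import os
import tempfile

try:
    from scandir import walk  # the backport from this issue, if installed
except ImportError:
    from os import walk       # fallback so the sketch runs without it

base = tempfile.mkdtemp()
# Joining with a u'' literal gives a text (unicode) path. On Python 2 this
# relies on the temp directory path itself being plain ASCII.
top = os.path.join(base, u'playground')
for name in (u'auftraege', u'auftr\xe4ge'):
    os.makedirs(os.path.join(top, name))

# Walk with the unicode path and collect every directory name found.
dirnames = []
for _root, dirs, _files in walk(top):
    dirnames.extend(dirs)

print(sorted(dirnames))
```

Because the top path is unicode, the names that come back are unicode as well.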

@antiphasis (Author) commented Sep 1, 2015

I almost forgot: I didn't have these issues with older versions (from approx. a year ago).

benhoyt added a commit that referenced this issue Jan 2, 2016
I could have fixed this to exactly mimic os.walk()'s behaviour on Windows Python 2.x, but that's really broken: ASCII directories get treated as directories, some unicode directories go in the "dirs" list, some unicode directory names go into the "files" list and appear with non-ASCII chars replaced with ASCII '?'. So I decided to fix it to what you want 99% of the time: if you pass a bytes path into walk() on Windows Python 2.x, it'll convert it to unicode so the walk does what you expect. This does mean that the names returned are all unicode strings, but that's almost always fine and what you want too.

Also remove unused and confusing "str" assignment

@benhoyt (Owner) commented Jan 2, 2016

I finally fixed this in 489cfa7. From the commit message:

I could have fixed this to exactly mimic os.walk()'s behaviour on Windows Python 2.x, but that's really broken: ASCII directories get treated as directories, some unicode directories go in the "dirs" list, some unicode directory names go into the "files" list and appear with non-ASCII chars replaced with ASCII '?'. So I decided to fix it to what you want 99% of the time: if you pass a bytes path into walk() on Windows Python 2.x, it'll convert it to unicode so the walk does what you expect. This does mean that the names returned are all unicode strings, but that's almost always fine and what you want too.
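
The idea can be sketched roughly as follows. This is not the code from the commit: the names to_unicode_path and walk_unicode are hypothetical, os.walk stands in for scandir.walk, and decoding with the filesystem encoding is an assumption of this sketch.

```python
import os
import sys


def to_unicode_path(path, encoding=None):
    """Return `path` as a text string, decoding it first if it is bytes."""
    if isinstance(path, bytes):
        # Assumption: the filesystem encoding is the right one to decode
        # with; fall back to UTF-8 if the interpreter reports none.
        encoding = encoding or sys.getfilesystemencoding() or 'utf-8'
        return path.decode(encoding)
    return path


def walk_unicode(top, **kwargs):
    """Walk `top` like os.walk(), but always starting from a text path."""
    return os.walk(to_unicode_path(top), **kwargs)


print(repr(to_unicode_path(b'c:/devel/playground')))
```

A caller that passes a bytes path to walk_unicode() gets unicode names back, which matches the trade-off described in the commit message.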

@benhoyt closed this Jan 2, 2016

@benhoyt (Owner) commented Jan 2, 2016

FYI, I'll release a new version on PyPI shortly.

@benhoyt (Owner) commented Jan 3, 2016

This is fixed now in version 1.2.
