Windows: support path longer than 260 bytes using "\\?\" prefix #62399

Closed
Voo mannequin opened this issue Jun 12, 2013 · 13 comments
Labels
OS-windows topic-IO type-feature A feature request or enhancement

Comments

@Voo
Mannequin

Voo mannequin commented Jun 12, 2013

BPO 18199
Nosy @loewis, @pitrou, @vstinner, @ezio-melotti, @asmeurer, @zware, @serhiy-storchaka, @eryksun, @zooba
Files
  • filename_bug.py: minimal example
  • test.py: tests long path with \?\ notation
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2016-09-10.01:31:15.164>
    created_at = <Date 2013-06-12.15:24:41.080>
    labels = ['type-feature', 'expert-IO', 'OS-windows']
    title = 'Windows: support path longer than 260 bytes using "\\\\?\\" prefix'
    updated_at = <Date 2016-09-10.01:31:15.163>
    user = 'https://bugs.python.org/Voo'

    bugs.python.org fields:

    activity = <Date 2016-09-10.01:31:15.163>
    actor = 'steve.dower'
    assignee = 'none'
    closed = True
    closed_date = <Date 2016-09-10.01:31:15.164>
    closer = 'steve.dower'
    components = ['Windows', 'IO']
    creation = <Date 2013-06-12.15:24:41.080>
    creator = 'Voo'
    dependencies = []
    files = ['30563', '41820']
    hgrepos = []
    issue_num = 18199
    keywords = []
    message_count = 13.0
    messages = ['191033', '191035', '191036', '191037', '191043', '191121', '259662', '260068', '260076', '260122', '271913', '271922', '275530']
    nosy_count = 13.0
    nosy_names = ['loewis', 'pitrou', 'vstinner', 'ezio.melotti', 'daniel.ugra', 'Aaron.Meurer', 'santoso.wijaya', 'Voo', 'jens', 'zach.ware', 'serhiy.storchaka', 'eryksun', 'steve.dower']
    pr_nums = []
    priority = 'normal'
    resolution = 'rejected'
    stage = None
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue18199'
    versions = ['Python 3.5']

    @Voo
    Mannequin Author

    Voo mannequin commented Jun 12, 2013

    Python at the moment does not handle paths with more than MAX_PATH characters well under Windows.

    With Windows 7 x64, Python 3.3 32bit, the attached file fails with:
    Traceback (most recent call last):
      File ".\filename_bug.py", line 4, in <module>
        os.makedirs(dir)
      File "C:\Python33\lib\os.py", line 269, in makedirs
        mkdir(name, mode)
    FileNotFoundError: [WinError 3] The system cannot find the path specified: './aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
    bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb'

    The same applies to os.rmdir and probably other functions.

    The problem is that in posixmodule.c:path_converter (which is used to get the wchar_t* pathname that is expected by the Win32 API) we do have the following check:
            length = PyUnicode_GET_SIZE(unicode);
            if (length > 32767) {
                FORMAT_EXCEPTION(PyExc_ValueError, "%s too long for Windows");
                Py_DECREF(unicode);
                return 0;
            }
            wide = PyUnicode_AsUnicode(unicode);
    but the documentation states:
    "The Windows API has many functions that also have Unicode versions to permit an extended-length path for a maximum total path length of 32,767 characters. This type of path is composed of components separated by backslashes, each up to the value returned in the lpMaximumComponentLength parameter of the GetVolumeInformation function (this value is commonly 255 characters). To specify an extended-length path, use the "\\?\" prefix. For example, "\\?\D:\very long path"."

    Source: http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx

    The problem is that we never prepend "\\?\" to the pathname hence getting the old MAX_PATH limit.

    The easiest fix would be to change the Unicode code path of the function to always return an absolute path prefixed with \\?\ (relative paths are always limited by MAX_PATH). As an optimization, we could do this only if the path is longer than 248 characters (CreateDirectory has its own, lower limit there) or MAX_PATH characters, respectively.
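A minimal sketch of that proposal at the Python level, not the actual change to posixmodule.c. The helper name `maybe_extend` and the constants are hypothetical, the 248-character threshold is the one quoted above, and the input is assumed to be fully qualified already. It uses `ntpath` so the string handling can be tried on any platform.

```python
import ntpath

PREFIX = "\\\\?\\"           # the literal prefix \\?\
UNC_PREFIX = "\\\\?\\UNC\\"  # extended form of \\server\share paths
CREATE_DIR_LIMIT = 248       # threshold quoted in the comment above


def maybe_extend(absolute_path, limit=CREATE_DIR_LIMIT):
    # Hypothetical helper: prefix only paths that exceed `limit` characters.
    if absolute_path.startswith(PREFIX):
        return absolute_path  # already in extended form
    drive, rest = ntpath.splitdrive(absolute_path)
    if not drive or not rest.startswith(("\\", "/")):
        raise ValueError("expected a fully qualified path: %r" % absolute_path)
    # Collapse ".", ".." and "/" first; prefixed paths skip Win32 normalization.
    normalized = ntpath.normpath(absolute_path)
    if len(normalized) <= limit:
        return normalized
    if normalized.startswith("\\\\"):
        return UNC_PREFIX + normalized[2:]
    return PREFIX + normalized
```

Short paths pass through untouched, so existing callers would only see the prefix once a path crosses the threshold.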

    @Voo Voo mannequin added topic-unicode OS-windows topic-IO type-bug An unexpected behavior, bug, or error labels Jun 12, 2013
    @vstinner
    Member

    Using extended path ("\\?\" prefix) causes new issues.

    @vstinner
    Member

    The problem is that we never prepend "\\?\" to the pathname hence getting the old MAX_PATH limit.

    I would not call this a "problem". In my opinion, it is a bug in Windows: I don't understand why we should have to prepend something to support longer paths.

    I'm not sure that low-level APIs (functions of the os module) should work around this Windows limitation. A higher-level API like pathlib may prepend the "\\?\" prefix to support longer paths.

    pathlib: PEP-428 and https://pypi.python.org/pypi/pathlib/

    @vstinner vstinner added type-feature A feature request or enhancement and removed type-bug An unexpected behavior, bug, or error labels Jun 12, 2013
    @Voo
    Mannequin Author

    Voo mannequin commented Jun 12, 2013

    In my opinion, it is a bug in Windows
    I don't think calling every complicated API a "bug" is useful. Is the Win32 API exceedingly annoying? I think everybody agrees on that, but IMO it's better to fix this once in Python itself than to force all developers to think about those details.

    Fixing this in pathlib is better than not doing anything, although with the large amounts of code out there that use os, it'd be nice if we could fix it at the source (also since I assume pathlib internally is going to call the os module, it's still the same amount of work).

    If I provide unit tests for all the involved file system functions and fix the issues (more work than expected looking at the linked issues, but doesn't seem too hard), would such a patch have chances to be included?

    @vstinner
    Member

    it'd be nice if we could fix it at the source

    Yes, that is what we are trying to do. But it's not so simple. That's why the issue has been split into more specific issues.

    If I provide unit tests for all the involved file system
    functions and fix the issues (more work than expected looking
    at the linked issues, but doesn't seem too hard),
    would such a patch have chances to be included?

    Sure! If adding the \\?\ prefix causes new issues, you have to fix those issues first.

    @vstinner vstinner changed the title No long filename support for Windows Windows: support path longer than 260 bytes using "\\?\" prefix Jun 13, 2013
    @pitrou
    Member

    pitrou commented Jun 14, 2013

    Well, the problem, as you point out, is that "\\?\" only works with absolute paths, but the stdlib currently works with both absolute and relative paths.
    The only reasonable solution right now is to prepend the "\\?\" prefix yourself (after having resolved the path to absolute).
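A rough sketch of that manual approach: resolve to an absolute path first, then prepend the prefix. The helper names `with_long_prefix` and `makedirs_long` are hypothetical and not part of the stdlib. On non-Windows platforms the sketch returns the absolute path unchanged so the same code can run everywhere.

```python
import os

LONG_PREFIX = "\\\\?\\"           # the literal prefix \\?\
UNC_LONG_PREFIX = "\\\\?\\UNC\\"  # extended form of \\server\share paths


def with_long_prefix(path):
    # Hypothetical helper: absolute path, prefixed on Windows only.
    absolute = os.path.abspath(path)  # resolves relative paths against the cwd
    if os.name != "nt" or absolute.startswith((LONG_PREFIX, "\\\\.\\")):
        return absolute  # non-Windows, already prefixed, or a device path
    if absolute.startswith("\\\\"):
        return UNC_LONG_PREFIX + absolute[2:]
    return LONG_PREFIX + absolute


def makedirs_long(path):
    # Create the directory tree through the prefixed path and return it.
    target = with_long_prefix(path)
    os.makedirs(target, exist_ok=True)
    return target
```

The caller has to keep using the returned path for later operations, because any function that receives the unprefixed name is subject to the MAX_PATH limit again.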

    @jens
    Mannequin

    jens mannequin commented Feb 5, 2016

    I have also run into this problem.

    I have made a test script.

    There is a problem with os.chdir(): it doesn't work with the \\?\ notation.
    There is also a problem if you run the following:

    import os
    import pathlib
    import tempfile
    
    with tempfile.TemporaryDirectory(prefix="path_test_") as path:
        new_path = pathlib.Path(path, "A"*255, "B"*255)
        extended_path = "\\\\?\\%s" % new_path
        os.makedirs(extended_path)
        print(len(extended_path), extended_path)
    

    os.makedirs() works, but the cleanup fails.
    Output is:

    567 \\?\C:\Users\jens\AppData\Local\Temp\path_test_8fe6utdz\AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
    Traceback (most recent call last):
      File "D:\test2.py", line 18, in <module>
        print(len(extended_path), extended_path)
      File "C:\Program Files (x86)\Python35-32\lib\tempfile.py", line 807, in __exit__
        self.cleanup()
      File "C:\Program Files (x86)\Python35-32\lib\tempfile.py", line 811, in cleanup
        _shutil.rmtree(self.name)
      File "C:\Program Files (x86)\Python35-32\lib\shutil.py", line 488, in rmtree
        return _rmtree_unsafe(path, onerror)
      File "C:\Program Files (x86)\Python35-32\lib\shutil.py", line 383, in _rmtree_unsafe
        onerror(os.unlink, fullname, sys.exc_info())
      File "C:\Program Files (x86)\Python35-32\lib\shutil.py", line 381, in _rmtree_unsafe
        os.unlink(fullname)
    FileNotFoundError: [WinError 3] Das System kann den angegebenen Pfad nicht finden: 'C:\\Users\\jens\\AppData\\Local\\Temp\\path_test_8fe6utdz\\AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'
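
The German message in the traceback above is WinError 3, "The system cannot find the path specified": the cleanup calls shutil.rmtree on the unprefixed temporary directory name. One possible workaround, sketched under the assumption that shutil.rmtree keeps the prefix when it joins child names onto a prefixed root (not verified here against every Python version), is to delete the over-long subtree through the prefixed path before the context manager exits. The helper names are hypothetical.

```python
import os
import shutil
import tempfile


def long_prefixed(path):
    # Hypothetical helper: absolute path, prefixed with \\?\ on Windows only.
    absolute = os.path.abspath(path)
    if os.name != "nt" or absolute.startswith("\\\\"):
        return absolute  # non-Windows, already prefixed, or UNC (not handled)
    return "\\\\?\\" + absolute


def create_and_clean(component_length=255):
    with tempfile.TemporaryDirectory(prefix="path_test_") as path:
        top = os.path.join(path, "A" * component_length)
        os.makedirs(long_prefixed(os.path.join(top, "B" * component_length)))
        # Remove the long subtree via the prefixed path, so the context
        # manager's own cleanup only has to delete the short root directory.
        shutil.rmtree(long_prefixed(top))
        remaining = os.listdir(path)
    return remaining, os.path.exists(path)
```

The temporary directory itself is left for the normal cleanup, which only sees a short path by then.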
    

    @eryksun
    Contributor

    eryksun commented Feb 11, 2016

    There is a problem with os.chdir(): It doesn't work with
    \\?\ notation.

    The process current directory is part of the Windows API, so it's subject to the MAX_PATH limit [1]. See SetCurrentDirectory [2]. Python can't do anything about this.

    As to shutil.rmtree, I agree it's an example of why the Windows path-length problem needs to be addressed more generally. Maybe there could be a __path__ special method supported by pathlib paths. On Windows this could resolve to an absolute path prefixed with "\\?\".

    [1]: Native NT relative paths are relative to a handle in the
    OBJECT_ATTRIBUTES record that's used to create or open an
    object. This isn't generally exposed in the Windows API,
    except in the registry API.
    [2]: https://msdn.microsoft.com/en-us/library/aa365530
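
The `__path__` method suggested above is a proposal, not an existing protocol. As an illustration only, the same idea can be sketched with the `__fspath__` protocol that Python 3.6 and later provide through os.PathLike. The class name `LongPath` is hypothetical, and UNC paths are deliberately left out.

```python
import os


class LongPath:
    # Hypothetical wrapper: resolves to an absolute path and, on Windows,
    # adds the \\?\ prefix whenever the object is passed to an os function.

    def __init__(self, path):
        self._path = os.fspath(path)

    def __fspath__(self):
        absolute = os.path.abspath(self._path)
        if os.name != "nt" or absolute.startswith("\\\\"):
            return absolute  # non-Windows, already prefixed, or UNC
        return "\\\\?\\" + absolute
```

Functions that accept path-like objects, such as os.path.isdir, call `__fspath__` themselves, so callers would not need to add the prefix by hand. As noted above, this would not help os.chdir().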

    @jens
    Mannequin

    jens mannequin commented Feb 11, 2016

    I have made https://github.com/jedie/pathlib_revised to address this, see: https://github.com/jedie/pathlib_revised#windows-max_path

    The idea is to add a property (I call it 'extended_path') that adds the \\?\ prefix to all absolute paths on Windows. On non-Windows platforms the property returns ".path".
    I also patch methods like 'chmod', 'unlink', 'rename' etc. to use the 'extended_path' property, and I add more methods like 'link', 'listdir', 'copyfile' etc.

    The source code is here: https://github.com/jedie/pathlib_revised/blob/master/pathlib_revised/pathlib.py

    Another thing: why aren't all filesystem access methods implemented in pathlib?
    For example, there is Path.unlink() but no Path.link().
    There is Path.rmdir() but no Path.chdir(),
    and many more.

    And the last thing: why is pathlib so badly designed? It's ugly to extend. But that is addressed in https://bugs.python.org/issue24132

    @zooba
    Member

    zooba commented Feb 11, 2016

    Paths prefixed with "\\?\" also need to be normalized, not just absolute. AFAIK there are no official docs on what normalization is required, but it includes at least trimming trailing dots on directory names, resolving "." and ".." segments, collapsing adjacent backslashes, and removing trailing spaces on any segment.

    Without this, you will access/create/etc. files that users cannot otherwise see or modify.

    I don't disagree that we should add the prefix for long paths, but we need to at least get most of the normalization correct so that cases like this work:

    >>> open('C:\\Dir \\file.txt.', 'r').read()
    "Content"
    >>> open('\\\\?\\C:\\Dir \\file.txt.', 'r').read()
    FileNotFoundError: [Errno 2] No such file or directory: '\\\\?\\C:\\Dir \\file.txt.'
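
A rough approximation of the normalization listed above, written with `ntpath` so it can be tried on any platform. Since there are no official docs, this is an assumption about what Win32 does rather than a faithful reimplementation, and the function name is hypothetical.

```python
import ntpath


def normalize_segments(path):
    # Hypothetical helper approximating Win32 normalization of a DOS path.
    # normpath resolves "." and "..", collapses adjacent backslashes and
    # converts forward slashes to backslashes.
    normalized = ntpath.normpath(path)
    drive, rest = ntpath.splitdrive(normalized)
    # Trim trailing dots and spaces from every segment. Segments made up
    # only of dots and spaces would become empty; that case is not handled.
    segments = [segment.rstrip(" .") for segment in rest.split("\\")]
    return drive + "\\".join(segments)
```

With this, the path from the example above, 'C:\\Dir \\file.txt.', becomes 'C:\\Dir\\file.txt', which is the name a prefixed path would have to use to reach the same file.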

    @zooba
    Member

    zooba commented Aug 3, 2016

    Just as a data point, the .NET Framework's latest version removes all of their extra path processing and lets Win32 do the validation/normalization (that is, they used to do what we're considering, but now match our behaviour).

    https://blogs.msdn.microsoft.com/dotnet/2016/08/02/announcing-net-framework-4-6-2/

    @eryksun
    Contributor

    eryksun commented Aug 3, 2016

    Apparently CoreFX adds the \\?\ prefix automatically:

    https://blogs.msdn.microsoft.com/jeremykuhne/2016/06/21/more-on-new-net-path-handling

    It's great that the Windows 10 Anniversary Update will be getting long path support without requiring the extended path prefix, at least for NTFS volumes. I assume that includes slash-to-backslash normalization, relative paths, and drive-relative paths. I wonder about long drive-relative paths since they depend on hidden environment variables such as "=D:". The default environment block isn't that big.

    Python 3.5 supports back to Vista, so I still think automatically handling long Unicode paths, like how CoreFX reportedly works, makes for a more friendly cross-platform development experience. There are too many pitfalls with \\?\ paths -- they have to be Unicode (except that limitation is removed in Windows 10), fully qualified, use backslash only, and UNC paths have to be translated to use the \\?\UNC DOS device.
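
The pitfalls listed above can be turned into a small checker. This is an illustrative sketch only: the function name and the wording of the messages are hypothetical, and it inspects the string without touching the file system.

```python
import ntpath


def extended_path_pitfalls(path):
    # Hypothetical checker: returns reasons why `path` cannot simply be
    # prefixed with \\?\ as it stands. An empty list means no known pitfall.
    if isinstance(path, bytes):
        return ["bytes path: must be Unicode (str) before Windows 10"]
    if path.startswith("\\\\?\\"):
        return []  # already in extended form
    problems = []
    drive, rest = ntpath.splitdrive(path)
    if not drive or not rest.startswith(("\\", "/")):
        problems.append("not fully qualified")
    if "/" in path:
        problems.append("contains forward slashes")
    if path.startswith("\\\\"):
        problems.append("UNC path: must be rewritten to use the \\\\?\\UNC device")
    return problems
```

For example, a drive-relative path such as "C:file.txt" is reported as not fully qualified, because it depends on the per-drive current directory.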

    @zooba
    Member

    zooba commented Sep 10, 2016

    Given that Windows 10 already supports this without us having to do the processing ourselves (see bpo-27731), I don't see us implementing this.

    @zooba zooba closed this as completed Sep 10, 2016
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022