Windows: support path longer than 260 bytes using "\\?\" prefix #62399
Python currently does not handle paths longer than MAX_PATH characters well on Windows. With Windows 7 x64 and 32-bit Python 3.3, the attached file fails with:
Traceback (most recent call last):
File ".\filename_bug.py", line 4, in <module>
os.makedirs(dir)
File "C:\Python33\lib\os.py", line 269, in makedirs
mkdir(name, mode)
FileNotFoundError: [WinError 3] The system cannot find the path specified: './aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb'
The same applies to os.rmdir and probably other functions. The problem is that posixmodule.c:path_converter (which produces the wchar_t* pathname expected by the Win32 API) contains the following check:
length = PyUnicode_GET_SIZE(unicode);
if (length > 32767) {
    FORMAT_EXCEPTION(PyExc_ValueError, "%s too long for Windows");
    Py_DECREF(unicode);
    return 0;
}
wide = PyUnicode_AsUnicode(unicode);
but the documentation states:
"The Windows API has many functions that also have Unicode versions to permit an extended-length path for a maximum total path length of 32,767 characters. This type of path is composed of components separated by backslashes, each up to the value returned in the lpMaximumComponentLength parameter of the GetVolumeInformation function (this value is commonly 255 characters). To specify an extended-length path, use the "\\?\" prefix. For example, "\\?\D:\very long path"."
Source: http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx
The problem is that we never prepend "\\?\" to the pathname, so we are stuck with the old MAX_PATH limit. The easiest fix would be to change the Unicode code path of the function to always return an absolute path (relative paths are always limited by MAX_PATH) prefixed with \\?\. As an optimization, we could do this only if the path is longer than 248 characters (CreateDirectory has its own exception there) or MAX_PATH characters, respectively.
I would not call this a "problem". In my opinion, it is a bug in Windows: I don't understand why we should prepend something to support longer paths. I'm not sure that low-level APIs (functions of the os module) should work around this Windows limitation. A higher-level API like pathlib may prepend the "\\?\" prefix to support longer paths. pathlib: PEP-428 and https://pypi.python.org/pypi/pathlib/
Fixing this in pathlib is better than not doing anything, although with the large amount of code out there that uses os, it would be nice if we could fix it at the source (also, since I assume pathlib is going to call the os module internally, it's still the same amount of work). If I provide unit tests for all the involved file system functions and fix the issues (more work than expected looking at the linked issues, but it doesn't seem too hard), would such a patch have a chance of being included?
Yes, it is what we are trying to do. But it's not so simple. That's why the issue is split into more specific issues.
Sure! If adding the \\?\ prefix causes new issues, those issues have to be fixed first.
Well, the problem, as you point out, is that "\\?\" only works with absolute paths, but the stdlib currently works with both absolute and relative paths.
I also ran into this problem. I have made a test script. There is a problem with os.chdir(): it doesn't work with the \\?\ notation.
os.makedirs() will work, but the cleanup will fail.
The process current directory is part of the Windows API, so it's subject to the MAX_PATH limit [1]. See SetCurrentDirectory [2]. Python can't do anything about this. As to shutil.rmtree, I agree it's an example of why the Windows path-length problem needs to be addressed more generally. Maybe there could be a __path__ special method supported by pathlib paths. On Windows this could resolve to an absolute path prefixed with "\\?\".
[1]: Native NT relative paths are relative to a handle in the
I have made https://github.com/jedie/pathlib_revised to address this, see: https://github.com/jedie/pathlib_revised#windows-max_path The idea is to add a property (I call it 'extended_path') that adds the \\?\ prefix to all absolute paths on Windows. On non-Windows systems the property returns ".path". The source code is here: https://github.com/jedie/pathlib_revised/blob/master/pathlib_revised/pathlib.py Another thing: why are not all filesystem access methods implemented in pathlib?!? And the last thing: why is pathlib so badly designed? It's ugly to extend. But this is addressed in: https://bugs.python.org/issue24132
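To illustrate the kind of property described above, here is a minimal sketch. It is not the pathlib_revised implementation: the class name `ExtendedWindowsPath` is hypothetical, it subclasses `PureWindowsPath` so it only does string manipulation, and it leaves UNC and relative paths unchanged.

```python
from pathlib import PureWindowsPath


class ExtendedWindowsPath(PureWindowsPath):
    """Hypothetical subclass; not the pathlib_revised implementation."""

    @property
    def extended_path(self):
        text = str(self)
        # Only absolute drive-letter paths (drive such as "C:") get the
        # prefix in this sketch; everything else is returned unchanged.
        if self.is_absolute() and len(self.drive) == 2:
            return "\\\\?\\" + text
        return text
```

For example, `ExtendedWindowsPath("C:/Users/x").extended_path` yields the prefixed, backslash-separated form, while a relative path such as `foo/bar` is returned without a prefix.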
Paths prefixed with "\\?\" also need to be normalized, not just absolute. AFAIK there are no official docs on what normalization is required, but it includes at least trimming trailing dots on directory names, removing "." and ".." segments, collapsing adjacent backslashes, and removing trailing spaces on any segment. Without this, you will access/create/etc. files that users cannot otherwise see or modify. I don't disagree that we should add the prefix for long paths, but we need to at least get most of the normalization correct so that cases like this work:
>>> open('C:\\Dir \\file.txt.', 'r').read()
"Content"
>>> open('\\\\?\\C:\\Dir \\file.txt.', 'r').read()
FileNotFoundError: [Errno 2] No such file or directory: '\\\\?\\C:\\Dir \\file.txt.'
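The normalization steps listed in this comment could be approximated as below. Since, as noted, the exact rules are not officially documented, this is only a sketch of the cases named above; `normalize_segments` is a hypothetical helper name, and the sketch assumes an absolute drive-letter path.

```python
import ntpath


def normalize_segments(path):
    """Approximate Win32 normalization for an absolute drive-letter path.

    Hypothetical helper; covers only the cases named in the comment above.
    """
    # normpath converts slashes to backslashes, collapses repeated
    # backslashes, and resolves "." and ".." segments.
    drive, rest = ntpath.splitdrive(ntpath.normpath(path))
    # Trim trailing dots and spaces from every segment.
    segments = [seg.rstrip(". ") for seg in rest.split("\\")]
    return drive + "\\".join(segments)
```

With this, the path from the example, `'C:\\Dir \\file.txt.'`, becomes `'C:\\Dir\\file.txt'` before any prefix is added, so the prefixed and unprefixed forms would refer to the same file.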
Just as a data point, the .NET Framework's latest version removes all of their extra path processing and lets Win32 do the validation/normalization (that is, they used to do what we're considering, but now match our behaviour). https://blogs.msdn.microsoft.com/dotnet/2016/08/02/announcing-net-framework-4-6-2/
Apparently CoreFX adds the \\?\ prefix automatically: https://blogs.msdn.microsoft.com/jeremykuhne/2016/06/21/more-on-new-net-path-handling It's great that the Windows 10 Anniversary Update will be getting long path support without requiring the extended path prefix, at least for NTFS volumes. I assume that includes slash-to-backslash normalization, relative paths, and drive-relative paths. I wonder about long drive-relative paths since they depend on hidden environment variables such as "=D:". The default environment block isn't that big. Python 3.5 supports back to Vista, so I still think automatically handling long Unicode paths, like how CoreFX reportedly works, makes for a more friendly cross-platform development experience. There are too many pitfalls with \\?\ paths: they have to be Unicode (except that limitation is removed in Windows 10), fully qualified, use backslash only, and UNC paths have to be translated to use the \\?\UNC DOS device.
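Two of the pitfalls listed here, backslash-only separators and the UNC translation, can be sketched as follows. The helper name `to_extended_form` is hypothetical; the sketch assumes the input is already fully qualified and rejects anything else rather than trying to resolve it.

```python
import ntpath


def to_extended_form(path):
    """Map a fully qualified Windows path to its extended-length form.

    Hypothetical helper for illustration only.
    """
    path = path.replace("/", "\\")  # extended paths use backslash only
    if path.startswith(("\\\\?\\", "\\\\.\\")):
        return path  # already a device path; leave it alone
    drive, rest = ntpath.splitdrive(path)
    if drive.startswith("\\\\"):
        # \\server\share\... becomes \\?\UNC\server\share\...
        return "\\\\?\\UNC\\" + path[2:]
    if len(drive) == 2 and rest.startswith("\\"):
        return "\\\\?\\" + path
    # Relative and drive-relative paths are not fully qualified.
    raise ValueError("path is not fully qualified: %r" % path)
```

Drive-relative paths such as `D:file` are rejected on purpose, since resolving them depends on per-drive state like the hidden "=D:" variables mentioned above.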
Given that Windows 10 already supports this without us having to do the processing ourselves (see bpo-27731), I don't see us implementing this.