urllib's request.pathname2url not compatible with extended-length Windows file paths #87773
Comments
Windows file paths are limited to 256 characters, and one of Windows's prescribed methods to address this is to prepend "\\?\" to a Windows absolute path (see: https://docs.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation). urllib.request.pathname2url raises an error on such paths because it calls nturl2path.py's pathname2url function, which explicitly checks that exactly one character precedes the ":" in a Windows path. That is, of course, not the case for an extended-length path (e.g. "\\?\C:\Python39"). As a result, urllib cannot handle pathname2url conversion for some valid Windows paths.
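To make the failure concrete, here is a minimal sketch of the kind of check described above. It is a simplified paraphrase, not the actual nturl2path source, and the function name check_drive_prefix is hypothetical.

```python
def check_drive_prefix(path):
    """Simplified paraphrase of the drive-letter check described in the report.

    Hypothetical helper for illustration only; not the nturl2path source.
    """
    if ":" not in path:
        # No drive specifier, so there is nothing to check.
        return None
    before_colon = path.split(":", 1)[0]
    if len(before_colon) != 1:
        # r"\\?\C:\Python39" lands here: r"\\?\C" precedes the colon.
        raise OSError("Bad path: " + path)
    return before_colon


if __name__ == "__main__":
    print(check_drive_prefix(r"C:\Python39"))
    try:
        check_drive_prefix(r"\\?\C:\Python39")
    except OSError as exc:
        print("rejected:", exc)
```

With an extended-length path, everything up to the drive letter counts as the text before the colon, so a one-character test rejects it.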
RFC8089 doesn't specify "a mechanism for translating namespaced paths ["\\?\" and "\\.\"] to or from file URIs", and the Windows shell doesn't support them. So what's the practical benefit of supporting them in nturl2path?
Classically, normal filepaths are limited to MAX_PATH - 1 (259) characters in most cases, with a few cases in which the limit is even smaller. For a normal filepath, the API replaces slashes with backslashes; resolves relative paths; resolves "." and ".." components; strips trailing dots and spaces from the final path component; and, for relative paths and DOS drive-letter paths, reserves DOS device names in the final path component (e.g. CON, NUL). The kernel supports filepaths with up to 32,767 characters, but classically this was only accessible by using a verbatim \\?\ filepath, or by using workarounds based on substitute drives or filesystem mountpoints and symlinks. With Python 3.6+ in Windows 10, if long paths are enabled in the system, normal filepaths support up to the full 32,767 characters in most cases. The need for the \\?\ prefix is thus limited to the rare case when a verbatim path is required, or when a filepath has to be passed to a legacy application that doesn't support long paths.
I really meant 255 characters, not 256, because I was leaving three for "<drive name>:/". I suppose the most reasonable behavior is to strip out the "\\?\" before attempting the conversion, since the path is sensible and parsable without it, as opposed to the current behavior, which is to crash. The practical benefit is to permit the function to work on a wider range of inputs than is currently possible, for essentially no cost.
Okay, so you're not looking to preserve the fact that it's a \\?\ verbatim path in the URI. You just want to automatically convert from verbatim \\?\X: or \\?\UNC\server\share to normal form. Devices other than drive letters and "UNC" wouldn't be supported.
I think that would make the most sense, yes.