gh-88569: add ntpath.isreserved()
#95486
In
Also, a name that ends with a space or dot should be reserved. The system's path normalization strips trailing spaces and dots from the final path component (e.g. "spam. . ." -> "spam"). Path normalization applies to all cases except opening "\\?\" extended paths. Also, the characters In file systems that support file streams, |
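For illustration, here is a minimal sketch of the trailing dot/space rule described above. The helper names are hypothetical and this only approximates what the system's path normalization does to the final component; it does not model "\\?\" extended paths, which skip normalization.

```python
def normalize_final_component(name: str) -> str:
    """Approximate how Windows strips trailing dots and spaces.

    Hypothetical helper for illustration only; the real normalization
    happens inside the Windows API, not in Python.
    """
    # "." and ".." are handled as special components, not as names to strip.
    if name in (".", ".."):
        return name
    return name.rstrip(". ")


def name_changes_on_access(name: str) -> bool:
    """Return True if the name would not round-trip through normalization."""
    return normalize_final_component(name) != name


print(normalize_final_component("spam. . ."))
print(name_changes_on_access("spam"))
```

A name for which `name_changes_on_access()` returns True is one that a caller cannot reliably create and then reopen under the same spelling, which is the argument for treating it as reserved.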
Not sure if it's relevant to this function but Windows 11 has greatly simplified how reserved names work. Now |
In Windows 11, path normalization no longer special cases a DOS device name if it has an extension (e.g. "con.txt"). Also a DOS device name isn't special cased if it's the leaf component of a path -- except for the "NUL" device. For example:
However, since DOS device names are still special cased as unqualified names and still reserved by the SMB server, they should still be avoided as file names. For example, creating a file named "con" in the current working directory would have to use the path "./con". |
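A rough sketch of the difference between the older rule and the Windows 11 rule, as described in this comment. The function name and the `ignore_extension` flag are assumptions made for the example, and the name list only covers the devices mentioned in this thread. It does not model the leaf-component exception or the special handling of "NUL".

```python
# Legacy DOS device names mentioned in this thread (assumed list, not exhaustive).
DOS_DEVICE_NAMES = frozenset(
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def is_dos_device_name(name: str, *, ignore_extension: bool = True) -> bool:
    """Check a single path component against the DOS device names.

    ignore_extension=True approximates the older behaviour, where
    "con.txt" is still treated as the "CON" device.
    ignore_extension=False approximates the Windows 11 behaviour, where
    only the bare name is special cased.
    """
    if ignore_extension:
        # Drop everything from the first dot, then any spaces before it.
        name = name.partition(".")[0].rstrip(" ")
    return name.upper() in DOS_DEVICE_NAMES


print(is_dos_device_name("con.txt"))
print(is_dos_device_name("con.txt", ignore_extension=False))
```

Given that older systems and the SMB server still reserve these names, a conservative check would keep `ignore_extension=True` as the default.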
@eryksun what should |
Isn't that a cross-platform question? The names "." and ".." are reserved in POSIX and Windows. For ".." it always has to be exact. Otherwise trailing dots and spaces are stripped in Windows. For example:
|
That would make |
I'm referring to the base name, i.e. |
I like the idea, but I think there's an extraneous function call to take out. I would also like @eryksun to make sure we are copying over the right implementation details, or make any tweaks as necessary now if possible.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
Thinking out loud: There's probably a distinction between reserved names and invalid names, analogous to the distinction between reserved Python keywords (
|
Does this mean that you don't want to reserve names that contain the wildcard characters

What about names that are illegal to create because they're reserved in other contexts? That's the case when creating a file with a DOS device name on an SMB share. SMB disallows creating a file named "nul", for example, because it causes problems with accessing the file directly on the server (e.g. as "C:\share\nul").

What about base names that contain colons? For example, in an NTFS filesystem, "spam:00" creates a file named "spam" that contains a data stream named "00". That can surprise even experienced developers, particularly if they come from a POSIX background. In a FAT filesystem, "spam:00" is an invalid name. In a VboxSharedFolderFS (VirtualBox) filesystem with a Linux host, "spam:00" is allowed as the literal name.

What about a name that changes when accessed because trailing dots and spaces are stripped? For example, "spam ." -> "spam".

What about "." and ".." components in "\\?\" extended paths? In Windows, "." and ".." components are handled in the user-mode API. However, normalization is skipped for a "\\?\" extended path, so the filesystem is passed a path that contains "." and ".." components. The results can be dysfunctional and surprising. For example, FAT filesystems (volume "E:" in this example) allow creating regular files named "." and "..":
The first two are the required "." and ".." entries, which are directories (16) and have no short name. The last two are regular files with the archive attribute set (32) and legacy short names. Otherwise, FAT filesystems don't support opening paths with literal "." and ".." entries. The open fails with
NTFS fails all three of the above cases with |
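To make the colon case raised above concrete, here is a small sketch of how a base name such as "spam:00" could be read on a filesystem that supports file streams. `split_stream` is a hypothetical helper; it only splits at the first colon of a base name and does not handle drive prefixes such as "C:" or a stream type suffix.

```python
from typing import Optional, Tuple


def split_stream(base_name: str) -> Tuple[str, Optional[str]]:
    """Split a base name into (file name, stream name).

    Illustration only: assumes the argument is a single path component
    with no drive letter, and treats everything after the first colon
    as the stream name.
    """
    name, sep, stream = base_name.partition(":")
    return name, (stream if sep else None)


print(split_stream("spam:00"))
print(split_stream("spam"))
```

On NTFS the first result corresponds to a file "spam" with a data stream "00", whereas FAT would reject the name and a VirtualBox shared folder with a Linux host would keep the literal name, so a colon in a base name is at least a candidate for being flagged.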
@eryksun what's your view on the existing |
The As to reserved characters and "." and ".." components in extended paths, I'm just putting it out there for discussion. I'm less concerned about the five wildcard characters. They're always disallowed in base file names (but not stream names). A filesystem that didn't exclude them would be broken. Creating a file named "spam?.txt" isn't going to magically succeed in a surprising way. But a filesystem might allow |
Co-authored-by: Eryk Sun <eryksun@gmail.com>
The following check would need to be removed from
# UNC paths are never reserved.
self.assertIs(False, P('//my/share/nul/con/aux').is_reserved())
Though this entire pathlib test is redundant now. As you consolidate the implementations, you might want to remove redundant tests, and only keep tests that are specific to how |
Considering the rules about which paths are reserved have changed in recent Windows releases (e.g. With a suitable warning in the docs (e.g. "this is an approximation of Windows rules as of this Python release; actual paths may differ, and this function may be updated as changed rules become more widely used"), I think it's okay to have in ntpath, and would deprecate PurePath. |
Co-authored-by: Steve Dower <steve.dower@microsoft.com>
Even in Windows 11, the SMB redirector still reserves names of legacy DOS devices in the filename part of an open/create (e.g. NUL, CON, PRN, AUX, LPT1-LPT9, COM1-COM9). At least SMB has never reserved DOS device names with a file extension (e.g. "con.txt"). And at least it just denies access instead of trying to access a remote device. It would be nice if the next version of the SMB protocol allowed legacy DOS device names in the filename part of an open/create. |
I don't have any other concerns with this. However, I think we should refer to |
@brettcannon Your objection was a while ago and I believe has been addressed. You don't need to re-review unless you want to, but I think we'd like an ack that you aren't blocking the PR anymore |
@barneygale I refreshed my review and I'm not blocking this. 🙂 |
os.path.isreserved()
ntpath.isreserved()
@zooba just checking you're happy for me to merge? |
Yeah, go ahead |
Thanks everyone for the help with this |
Add `ntpath.isreserved()`, which identifies reserved pathnames such as "NUL", "AUX" and "CON". Deprecate `pathlib.PurePath.is_reserved()`. --------- Co-authored-by: Eryk Sun <eryksun@gmail.com> Co-authored-by: Brett Cannon <brett@python.org> Co-authored-by: Steve Dower <steve.dower@microsoft.com>
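A short usage sketch for the merged function. `ntpath.isreserved()` is only available in Python versions that include this change (3.13 and later), so the hypothetical wrapper below falls back to returning `None` on older interpreters instead of failing.

```python
import ntpath
from typing import Optional


def check_reserved(path: str) -> Optional[bool]:
    """Call ntpath.isreserved() if this Python version provides it.

    Returns None when the function is not available, so callers can
    tell "not reserved" apart from "cannot check".
    """
    isreserved = getattr(ntpath, "isreserved", None)
    if isreserved is None:
        return None
    return isreserved(path)


for candidate in ("NUL", "AUX", "CON", "spam.txt"):
    print(candidate, check_reserved(candidate))
```

Because `ntpath` can be imported on any platform, this check can run on a POSIX host that is preparing file names destined for Windows.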