proposal: path/filepath: add Resolve, replacing EvalSymlinks #42201
Comments
I have a couple of other reasons to vote against this proposal:
|
To me, this is probably the most important drawback of the proposal. |
I can add to @rasky's comment that some modes do not produce any result, depending on the link type and mode used, e.g. see #39786. I also have objections, which I have documented in #40180 and those objections carry over to this proposal for most part because For these reasons I would like to see a proposal which allows application code to determine which links in a path get resolved, and for that to be ported to Unix instead of attempting to sort of port |
We need a new API because EvalSymlinks is broken on Windows (how is described in #37113 & #40180), and cannot be fixed without breaking some existing callers. EvalSymlinks would be deprecated. I agree with Eric that it's not correct to always resolve all symlinks in a path, but I also think we must rely on the WinApi to implement this on Windows. I don't see a way to do both. We should include in the Resolve docs Eric's suggestions from #40180 re best practices for handling paths not created by the application which resolves them. As to the https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-getfinalpathnamebyhandlew |
There are other ways besides
I didn't get the prefix |
What code did you find that doesn't prepend Are you saying that some WinApis can't accept paths starting I assume a Volume GUID prefix wouldn't be helpful since that differs on two hosts with otherwise identical configuration. I think we've concluded that there's no such thing as a "canonical" path. |
I note that this concern could be prevented if the proposal would also include a plan to eventually deprecate EvalSymlinks. I now notice that the issue title mentions "replacing EvalSymlinks", but there is no mention of "replacing" in the proposal text. @ianlancetaylor, can you please clarify this, by making the title and the description match one way or the other? |
They are not filesystem paths in the first place. They work with the API's used in the example - volume management API's, perhaps they also work with other kernel API's that deal with devices or with namespaces, but you cannot use them with regular file system API's like
On GPT disks they are the unique identifier listed in the partition entry. They should be stable and unique unless we're talking about cloned disks. But there are paths for which asking for the volume GUID does not yield a result. The one case I am aware of is accessing files over SMB. I don't have a computer around that doesn't have/use UEFI/GPT and I haven't tested with external disks which are usually not GPT, or any disks formatted with the old school MBR partitioning scheme. If Windows offers a volume GUID at all, it'll be a constructed one which isn't unique and maybe not stable. |
So are you suggesting |
No, I agree with your previous statement.
|
Well if |
And if that returns an error too, we fall back to - what? |
Sure there will certainly be ways to make the implementation complex enough such that it can resolve any and all links that it can encounter. I would say write it down, see how it can be fitted in a universally useful API. I haven't seen anyone comment on how the Unix API's behave in the presence of Samba or other file systems - would be interesting to see a discussion about that. Whether to cater to the objections listed in #40180 is a mere choice imho. I am obviously in favour of doing that. Canonicalization will not be the end result either way however. Not on Windows. |
Hm, can we use EDIT: Or try Again, we're not concerned with "canonical" names, there's no such thing. |
I think it depends on the use case how links should be resolved. I get a different drive letter using Apps shouldn't elevate and terminate the non-elevated copy, even if it is technically possible and very handy. This is a Certification requirement for Windows Desktop Apps: 9.2 Your app's main process must be run as a standard user (asInvoker). |
I don't think we'll get agreement here on a more sophisticated API, but feel free to suggest something. Otherwise... Try If the app wants to separately lookup the volume GUID for a drive letter, so be it. |
Well, without having done the research or testing for it, nor considering all situations where there might be different requirements for resolving links, I just make up these formal declarations as I go right now, with whatever names are descriptive, regardless of whether they are suitable or consistent:
// Returns the longest part of path that is a link, if there are links on path,
// or a suitable error if path is not valid, or does not contain any links.
func GetDeepestLinkPath(path string) (result string, err error)
// Returns the target of the link pointed to by path, if it points directly at a link,
// subject to the mode requested, or a suitable error if path is not valid, or is not itself a link.
func ResolveExactLinkPath(path string, mode ResolveMode) (result string, err error)
These could be primitives upon which an easier API could be built, something like:
// A callback function that allows applications to determine whether
// the target of the link pointed to by path should be resolved or not
// and if so, how.
type ShouldResolveFunc func(path string) bool/ResolveMode
// Resolves links on path if there exist links on path below root,
// subject to the mode requested and the return value of
// shouldResolve for each link that is encountered on path below root.
// Returns the target path obtained, or a suitable error if either root
// or path are invalid, path is not within root, or link resolution fails.
// If the callback function shouldResolve is nil, path is returned.
// Applications should conservatively and consciously decide for each link
// whether it is to be resolved and implement a suitable shouldResolve
// callback accordingly.
// The callback will receive the exact path to links on path
// in the order GetDeepestLinkPath encounters them, recursively.
// Its return value will determine whether and how links are resolved.
// Links may have been created to solve administrative problems
// of which most applications should remain unaware.
// Most applications should only resolve specific links that they
// require to resolve, use the result immediately, forget the result
// and never show the result to the user, unless they have specific
// information about the link that was resolved and whether their
// resolved target is stable and can be cached.
func Resolve(root, path string, mode ResolveMode, shouldResolve ShouldResolveFunc) (result string, err error)
Obviously, where I am not sure in which declaration something belongs, I list it everywhere it could make sense instead of making any choices. |
Can you add comments describing the behaviors of your API concept? And suggestions for implementing them on Windows? |
It gets to be a bit wordy, but yeah, the world is a messy place. It doesn't give any ins and outs just yet even. My earlier comment about performance applies here - strings don't carry (much) context or (any) proof of work done. Perhaps other types of arguments allow better performance and IDE's to help write code faster at the cost of (some) API 'complexity'. Types are good for both. Suggestions for how to implement them are exact Windows API's to resolve links. The How precisely this all works would have to be written down in code and tested, preferably reviewed by Windows experts, before deciding what the API actually will exactly have to look like. |
I would even say that before it is actually included in any standard library, it'll have to be battle tested, especially the part that tries to summarize application scenarios for resolving links and makes the right choices for each of those. |
The Re a simple filepath.Resolve API, what do you think of my last suggestion for |
Good point about looking at other languages. Python has Java has C++ has What do those functions do on Windows? |
I think those have been ported from *nix, or in the case of C++ have taken POSIX semantics (hearsay), without the considerations we are having here, and will have the issues we identified. Certainly the C++ version, which has been linked before, with the implementation actually here. It is pretty mangled code, but appears to use
If any attempt at canonicalizing is deemed needed for code compatibility reasons, I would port the C++ version of it. It won't work to canonicalize files accessed over SMB, nor will it yield afaik paths that can be used with other file system API's, but it might about always return some result that can at least be compared between local files and between remote files, but not the remote with the same file accessed locally and it won't substitute drive letters for the imho better alternative of volume GUID paths. So I would actually have the implementation also try
|
Perhaps I would be able to create a C# version, even to propose such a thing to become part of .NET, which would be awesome because it'd have to be reviewed by Microsoft themselves. If such a proposal would be deemed useful or recommended to expose to the general public in the first place. And if it is accepted, the cadence is one release per year in November - it could take 1 or 2 years before they get to it. ADDITION: It would also have to be portable to a wide array of Linux versions, MacOS, probably Android, with that work also reviewed by Microsoft. |
Actually, it prepends I believe we can yield a consistent, general-purpose result via BTW hardlinks are not at issue here (but are one reason the term "canonical" isn't appropriate). |
To use (Re: #17835) |
Well the STL strips it. It means 'do not parse' which means most API's shouldn't perform normalization or use forward slashes etc - which go is adept at ignoring, but it is what it is defined to mean. So unless the prefix is required it is prudent to remove them, such that a) go flakes less often and b) paths will be processed more smoothly and look nicer. |
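To make the prefix discussion concrete, here is a small sketch of what stripping the `\\?\` prefix could look like as a pure string operation. The function name `stripExtendedPrefix` is hypothetical, and the choice to leave volume GUID paths untouched is an assumption based on the fact that `\\?\Volume{...}\` is itself the syntax of such paths; this is not a description of what the STL actually does.

```go
package main

import (
	"fmt"
	"strings"
)

// stripExtendedPrefix removes the `\\?\` prefix where an equivalent
// drive-letter or UNC form exists, and returns other paths unchanged.
func stripExtendedPrefix(p string) string {
	const (
		prefix    = `\\?\`
		uncPrefix = `\\?\UNC\`
	)
	switch {
	case strings.HasPrefix(p, uncPrefix):
		// \\?\UNC\server\share\... becomes \\server\share\...
		return `\\` + p[len(uncPrefix):]
	case strings.HasPrefix(p, prefix):
		rest := p[len(prefix):]
		// \\?\C:\... becomes C:\...
		if len(rest) >= 2 && rest[1] == ':' {
			return rest
		}
	}
	// Assumption: anything else (e.g. \\?\Volume{GUID}\...) keeps its prefix.
	return p
}

func main() {
	for _, p := range []string{
		`\\?\UNC\Z68\UncShare\Source.ico`,
		`\\?\M:\path\file.ext`,
		`\\?\Volume{607932d3-78af-41c0-9786-dd0177e78a39}\Source\Source.ico`,
	} {
		fmt.Println(stripExtendedPrefix(p))
	}
}
```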
I just don't know. I would prefer volume GUID paths, technically and esthetically. Wrt involving Microsoft... YEAH. Hire them if that's what it takes. Lots of work to be done. |
Maybe of some help - https://github.com/golang/go/blob/master/src/os/path_windows.go#L131 has a very interesting function fixLongPath (that I had to reexport for our code using //go:linkname trickery...) that is kind of trying to do some (minor) long path normalization. |
Eric and I looked at the MS C++ STL We agree that returning a path starting with a drive letter isn't at all "canonical" as a storage volume may not be mapped to one, and if it is, that can change anytime. The other options are...
Eric and I agree that the Volume GUID is the most "canonical" for resources that have one, and that a UNC path should be returned otherwise. The implementation entails first trying He has also discovered that some path/filepath APIs choke on paths prefixed with |
I'm happy to try to fix them, please ping me on that. |
|
Path | Flag | Result |
---|---|---|
n:\SOURCE.ICO | VOLUME_NAME_DOS | \\?\UNC\Z68\UncShare\Source.ico |
n:\SOURCE.ICO | VOLUME_NAME_GUID | The system cannot find the path specified. (0x80070003) |
n:\SOURCE.ICO | VOLUME_NAME_NONE | \Z68\UncShare\Source.ico |
n:\SOURCE.ICO | VOLUME_NAME_NT | \Device\Mup\Z68\UncShare\Source.ico |
c:\USERS\ERIC\SOURCE\SOURCE.ICO | VOLUME_NAME_DOS | The system cannot find the path specified. (0x80070003) |
c:\USERS\ERIC\SOURCE\SOURCE.ICO | VOLUME_NAME_GUID | \\?\Volume{607932d3-78af-41c0-9786-dd0177e78a39}\Source\Source.ico |
c:\USERS\ERIC\SOURCE\SOURCE.ICO | VOLUME_NAME_NONE | \Source\Source.ico |
c:\USERS\ERIC\SOURCE\SOURCE.ICO | VOLUME_NAME_NT | \Device\HarddiskVolume15\Source\Source.ico |
m:\PATH\FILE.EXT | VOLUME_NAME_DOS | \\?\M:\path\file.ext |
m:\PATH\FILE.EXT | VOLUME_NAME_GUID | \\?\Volume{ac96f27a-0000-0000-0000-010000000000}\path\file.ext |
m:\PATH\FILE.EXT | VOLUME_NAME_NONE | \path\file.ext |
m:\PATH\FILE.EXT | VOLUME_NAME_NT | \Device\HarddiskVolume21\path\file.ext |
Now I add S:\ to the volume.
Path | Flag | Result |
---|---|---|
c:\USERS\ERIC\SOURCE\SOURCE.ICO | VOLUME_NAME_DOS | \\?\S:\Source\Source.ico |
c:\USERS\ERIC\SOURCE\SOURCE.ICO | VOLUME_NAME_GUID | \\?\Volume{607932d3-78af-41c0-9786-dd0177e78a39}\Source\Source.ico |
c:\USERS\ERIC\SOURCE\SOURCE.ICO | VOLUME_NAME_NONE | \Source\Source.ico |
c:\USERS\ERIC\SOURCE\SOURCE.ICO | VOLUME_NAME_NT | \Device\HarddiskVolume15\Source\Source.ico |
Using a mounted folder instead of a junction.
C:\Users\Eric\Source is a mounted folder to \\?\Volume{607932d3-78af-41c0-9786-dd0177e78a39}\
Note the additional \SOURCE segment I now need since the mounted folder points to the root of the volume, not to the Source folder in the root of it.
Path | Flag | Result |
---|---|---|
c:\USERS\ERIC\SOURCE\SOURCE\SOURCE.ICO | VOLUME_NAME_DOS | \\?\C:\Users\Eric\Source\Source\Source.ico |
c:\USERS\ERIC\SOURCE\SOURCE\SOURCE.ICO | VOLUME_NAME_GUID | \\?\Volume{607932d3-78af-41c0-9786-dd0177e78a39}\Source\Source.ico |
c:\USERS\ERIC\SOURCE\SOURCE\SOURCE.ICO | VOLUME_NAME_NONE | \Source\Source.ico |
c:\USERS\ERIC\SOURCE\SOURCE\SOURCE.ICO | VOLUME_NAME_NT | \Device\HarddiskVolume15\Source\Source.ico |
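The results above suggest one possible ordering: ask for the volume GUID form first and fall back to the DOS form, which yields a UNC path for the network drive. The sketch below illustrates only that ordering. The names `resolveWithFallback`, `finalPathFunc` and `tableQuery` are hypothetical; `tableQuery` merely replays two rows from the first table so the example runs on any platform. On Windows the query function would presumably wrap `GetFinalPathNameByHandle` (available in `golang.org/x/sys/windows`) on an opened handle. The two flag values are the ones documented for `GetFinalPathNameByHandleW`.

```go
package main

import (
	"errors"
	"fmt"
)

// Flag values as documented for GetFinalPathNameByHandleW.
const (
	volumeNameDOS  uint32 = 0x0
	volumeNameGUID uint32 = 0x1
)

// finalPathFunc is a hypothetical stand-in for a call that opens path and
// queries its final name with the given volume-name flag.
type finalPathFunc func(path string, flag uint32) (string, error)

// resolveWithFallback prefers the volume GUID form and falls back to the
// DOS form (a UNC path for remote files) when no GUID path is available.
func resolveWithFallback(path string, query finalPathFunc) (string, error) {
	guidPath, guidErr := query(path, volumeNameGUID)
	if guidErr == nil {
		return guidPath, nil
	}
	dosPath, dosErr := query(path, volumeNameDOS)
	if dosErr != nil {
		return "", fmt.Errorf("GUID query: %v; DOS query: %w", guidErr, dosErr)
	}
	return dosPath, nil
}

// tableQuery replays a few results from the table in this thread.
func tableQuery(path string, flag uint32) (string, error) {
	results := map[string]map[uint32]string{
		`n:\SOURCE.ICO`: {
			volumeNameDOS: `\\?\UNC\Z68\UncShare\Source.ico`,
		},
		`c:\USERS\ERIC\SOURCE\SOURCE.ICO`: {
			volumeNameGUID: `\\?\Volume{607932d3-78af-41c0-9786-dd0177e78a39}\Source\Source.ico`,
		},
	}
	if p, ok := results[path][flag]; ok {
		return p, nil
	}
	return "", errors.New("the system cannot find the path specified")
}

func main() {
	for _, p := range []string{`n:\SOURCE.ICO`, `c:\USERS\ERIC\SOURCE\SOURCE.ICO`} {
		resolved, err := resolveWithFallback(p, tableQuery)
		fmt.Println(p, "=>", resolved, err)
	}
}
```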
A summary of our analysis in addition to @networkimprov's comment above and the list of results for each mode:
|
Canonicalization will work just fine for files on the local machine and links pointing at the local machine, but for remote paths, there is as far as I have been able to determine, no way around getting just errors trying to translate paths that contain links. The only ways around this are:
In both latter cases, canonicalization will not yield a path usable with normal file API, but must either have a syntax that includes the machine name but also a volume GUID, or be a pair of values - a path and a machine name or even some unique, canonical machine identifier. Since this conclusion is largely based on behavior of SMB, I don't think this is unique to Windows, but similar problems as described will also affect other operating systems that use SMB, or other remote file systems that are designed similarly.
Symlink evaluation policy
There is a global system policy in Windows called
$P$G> fsutil behavior query SymlinkEvaluation
Local to local symbolic links are enabled.
Local to remote symbolic links are enabled.
Remote to local symbolic links are disabled.
Remote to remote symbolic links are disabled.
This means that for paths that resolve to a UNC location on a remote machine that contain links declared on the remote machine,
I have found no mention of any way in which this policy can be circumvented by a particular process. The global policy should not be modified by any individual application, since that might break other apps. Good citizenship even mandates adhering to it. Even enabling the policy for all link types still doesn't quite offer canonicalization. Links might still not resolve with access denied or path not found, depending on the path declared as target in the reparse point:
Going around the Symlink evaluation policy
The only way around this policy as far as I know is to use This is one of the bugs with As far as I know, SMB does not provide a way to translate local paths on a remote machine to volume GUID's or to expose whether or not and which share on the remote machine might host the translated path. Obviously, such a translated path could not have been shared over SMB at all and be inaccessible completely from a remote machine, except through the path that was translated. I don't think this is any different on *nix with SMB. I have no idea whether alternatives like NFS are designed any different.
SMB server and share names are not normalized
Also, SMB server and share names are never normalized, but returned as opened (if present in the requested path), or in some random casing which appears to be stable but does not match the conventions for computer names (uppercase for NetBIOS names, lowercase for DNS) or the share name as it was defined on the remote machine (if the requested path is a drive mapping, for example), even if the |
It seems like we are not really headed for a consensus. Maybe this would be better to do in an external package to start? |
It seems clear there is no consensus here. This seems like a likely decline. |
Let's put this on hold, as more ppl are likely to raise this, and the only answer at present is, "Sorry, filepath.EvalSymlinks is broken beyond repair on Windows; call it on unix, but call x/sys/windows.GetFinalPathNameByHandle on windows" |
Yes, for local files. Combine that with #42202 for remote files. The hard part is determining |
No change in consensus, so declined. |
This is a new proposal to replace #37113, which was closed for non-technical reasons.
Paraphrasing @rsc, the proposal is a new function in the path/filepath package:
The expectation is that on Unix systems this will be essentially filepath.Abs(filepath.EvalSymlinks(path)) and on Windows it will essentially acquire a handle for the path and call GetFinalPathNameByHandle.
Objections to this approach (in my own words, apologies if I misrepresent some position):
EvalSymlinks work better on Windows, such that filepath.Abs(filepath.EvalSymlinks(path)) will suffice on both Unix and Windows systems. This may involve changing EvalSymlinks to call GetFinalPathNameByHandle. However, any such change to EvalSymlinks on Windows may break programs that currently work on Windows.
Resolve function will return a canonical path, but it will not, neither on Unix nor Windows (on Unix it will not be canonical due to hard links and multiple mounts). Therefore this function will mislead people into writing buggy programs. In particular, os.SameFile can return true for two different paths returned by Resolve.