std::fs::canonicalize returns UNC paths on Windows, and a lot of software doesn't support UNC paths #42869
Comments
|
In reference to your commit which referenced this PR, normalization is not the same as merely joining the path onto the current directory due to drive relative paths being relative to the current directory on the given drive. For example given a drive relative path of |
thanks, @retep998 :) it's just a hacked-together build tool that probably will eventually be replaced with something else, and I didn't intend to notify this ticket about my commit. but I guess it goes to show that a good way to get an absolute path in std would be really helpful. |
Note, the i-wrong tag is only for the |
Quick testing on Windows 10.0.15063 indicates that both |
Technically, AFAIK it is safe to strip the prefix in common simple cases (absolute path with a drive letter, no reserved names, shorter than max_path), and leave it otherwise. So I think there's no need to compromise on correctness as far as stdlib goes. The trade-off is between failing early and exposing other software that doesn't support UNC paths vs maximizing interoperability with non-UNC software. In an ideal world, I would prefer the "fail early" approach, so that limitations are quickly found and removed. However, Windows/DOS path handling has exceptionally long and messy history and decades of Microsoft bending over backwards to let old software not upgrade its path handling. If Microsoft can't push developers towards UNC, and fails to enforce this even in their own products, I have no hope of Rust shifting the Windows ecosystem to UNC. It will rather just frustrate Rust users and make Rust seem less reliable on Windows. So in this case I suggest trying to maximize interoperability instead, and canonicalize to regular paths whenever possible (using UNC only for paths that can't be handled otherwise). Also, careful stripping of the prefix done in stdlib will be much safer than other crates stripping it unconditionally (because realistically whenever someone runs into this problem, they'll just strip it unconditionally) |
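A minimal sketch of the conservative stripping described above, written over plain strings so it can be tried on any platform. The function name `simplify_verbatim`, the helper `is_legacy_safe_component`, and the exact rule set are assumptions for illustration; this is not what std or any particular crate does, and a real implementation would likely need more checks (for example the shorter length limit for directories).

```rust
/// Legacy DOS device names; a component whose stem matches one of these
/// (case-insensitively) is only addressable through the `\\?\` form.
const RESERVED: [&str; 22] = [
    "CON", "PRN", "AUX", "NUL",
    "COM1", "COM2", "COM3", "COM4", "COM5", "COM6", "COM7", "COM8", "COM9",
    "LPT1", "LPT2", "LPT3", "LPT4", "LPT5", "LPT6", "LPT7", "LPT8", "LPT9",
];

/// Classic MAX_PATH limit, counted in UTF-16 units including the terminating NUL.
const MAX_PATH: usize = 260;

/// Hypothetical helper: true if a legacy (non-verbatim) API would interpret
/// this component the same way the verbatim form does.
fn is_legacy_safe_component(c: &str) -> bool {
    // Empty, `.` and `..` components are taken literally in verbatim paths
    // but normalized away by legacy path parsing.
    if c.is_empty() || c == "." || c == ".." {
        return false;
    }
    // Legacy APIs silently trim trailing dots and spaces.
    if c.ends_with('.') || c.ends_with(' ') {
        return false;
    }
    // Control characters and characters that legacy parsing treats specially.
    if c.chars().any(|ch| ch < ' ' || "<>:\"/|?*".contains(ch)) {
        return false;
    }
    // Reserved names apply even with an extension, e.g. `NUL.txt`.
    let stem = c.split('.').next().unwrap_or(c).trim_end_matches(' ');
    !RESERVED.iter().any(|r| r.eq_ignore_ascii_case(stem))
}

/// Strips the `\\?\` prefix only when the remainder is a plain drive-letter
/// path that legacy software can handle; otherwise returns the input unchanged.
pub fn simplify_verbatim(path: &str) -> &str {
    let rest = match path.strip_prefix(r"\\?\") {
        Some(rest) => rest,
        None => return path, // not a verbatim path, nothing to do
    };
    let b = rest.as_bytes();
    let has_drive =
        b.len() >= 3 && b[0].is_ascii_alphabetic() && b[1] == b':' && b[2] == b'\\';
    if !has_drive {
        return path; // e.g. `\\?\UNC\server\share` or a volume GUID
    }
    if rest.encode_utf16().count() >= MAX_PATH {
        return path; // too long for legacy APIs
    }
    let tail = &rest[3..];
    if tail.is_empty() || tail.split('\\').all(is_legacy_safe_component) {
        rest
    } else {
        path
    }
}
```

The design choice here is to fall back to the verbatim form whenever any rule is in doubt, so a UNC-aware program never loses access to a path that only the `\\?\` form can express.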
@kornelski I completely agree. The current behavior is unexpected in my opinion. |
I hope this is helpful… According to Microsoft:
Source: https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx And the Ruby language uses forward slashes for File paths and that works on Windows. |
I've looked at this problem in detail. There are a few rules which need to be checked to safely strip the UNC prefix. It can be implemented as a simple state machine. I've implemented that using public APIs, but because So I'm still hoping canonicalize would do it automatically, because if it's done only for legacy-compatible paths there's no downside: all paths work for UNC-aware programs, and all paths that can work for legacy programs work too. |
Another example of this issue that I encountered in alexcrichton/cargo-vendor#71: url::URL.to_file_path() returns a non-UNC path (even if the URL was initialized with a UNC path). And std::path::Path.starts_with() doesn't normalize its arguments to UNC paths. So calling

extern crate url;
use std::path::Path;
use url::Url;

fn main() {
    // Path.canonicalize() returns a UNC path.
    let unc_path_buf = Path::new(r"C:\Windows\System").canonicalize().expect("path");
    let unc_path = unc_path_buf.as_path();

    // Meanwhile, Url.to_file_path() returns a non-UNC path,
    // even when initialized from a UNC path.
    let file_url = Url::from_file_path(unc_path).expect("url");
    let abs_path_buf = file_url.to_file_path().expect("path");
    let abs_path = abs_path_buf.as_path();

    // unc_path and abs_path refer to the same resource,
    // and they both "start with" themselves.
    assert!(unc_path.starts_with(unc_path));
    assert!(abs_path.starts_with(abs_path));

    // But they don't "start with" each other, so these fail.
    assert!(unc_path.starts_with(abs_path));
    assert!(abs_path.starts_with(unc_path));
}

Arguably, Nevertheless, it does feel like something of a footgun, so it's worth at least documenting how it differs from that of some other APIs on Windows. |
Comparing canonical paths is a footgun in general because it is the wrong thing to do! Things like hard links and so on mean that such comparisons will never be entirely accurate. Please don't abuse canonicalization for this use case. If you want to tell whether two paths point to the same file, compare their file IDs! That's what |
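A minimal sketch of comparing file IDs instead of canonical paths. It is Unix-only, where std exposes the device and inode numbers through `MetadataExt`; the function name `is_same_file` is hypothetical. On Windows the equivalent identifiers (volume serial number plus file index) are not available through a stable std API as far as I know, so a real cross-platform check would need a crate or direct Win32 calls.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns true if both paths resolve to the same underlying file,
/// judged by (device, inode) rather than by comparing path strings.
/// Hard links and differently spelled paths to one file compare equal.
#[cfg(unix)]
pub fn is_same_file(a: &Path, b: &Path) -> io::Result<bool> {
    use std::os::unix::fs::MetadataExt;

    // fs::metadata follows symlinks, so a symlink and its target match too.
    let meta_a = fs::metadata(a)?;
    let meta_b = fs::metadata(b)?;
    Ok(meta_a.dev() == meta_b.dev() && meta_a.ino() == meta_b.ino())
}
```

Note that this answers "are these the same file?", not "is this file inside that directory?", which is the question `starts_with` is used for in the example above.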
but |
There are more ways than just |
bind mounts are equivalent to directory hardlinks.
…On Wed, May 9, 2018, 16:20 Kornel ***@***.***> wrote:
but starts_with is not for is-file-a-file comparison, but
is-file-in-a-directory check. There are no hardlinks involved (and AFAIK
apart from private implementation detail of macOS time machine, no OS
supports directory hardlinks).
|
Reported in rust-lang#838 Also see rust-lang/rust#42869
Turns out `std::fs::canonicalize(path)` returns a special path with a '//?/' prefix on Windows, which basically throws it into a compatibility mode where long paths are allowed and '/' for entering directories is not. This caused huge issues where we were unable to read any directories properly, namely `starpkg build` without a `-d` argument. Additionally, package paths on all targets are now made relative to the current working directory (not `-d`!). This resolves the above issue and gives nicer messages when we need to show a path to the user (e.g. when displaying an error or warning).
I've been using conditional compilation when canonicalizing, and skipping it on Windows. |
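A rough sketch of what that workaround might look like; the function name `canonicalize_unless_windows` is made up for illustration, and the commenter's actual code is not shown in the thread. On Windows the path is returned untouched, so it may still be relative and symlinks are not resolved, which is the trade-off of this approach.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// On Windows: return the path unchanged, avoiding the `\\?\` prefix
/// that `canonicalize` would add (at the cost of not normalizing at all).
#[cfg(windows)]
pub fn canonicalize_unless_windows(path: &Path) -> io::Result<PathBuf> {
    Ok(path.to_path_buf())
}

/// Everywhere else: canonicalize as usual (absolute, symlinks resolved).
#[cfg(not(windows))]
pub fn canonicalize_unless_windows(path: &Path) -> io::Result<PathBuf> {
    path.canonicalize()
}
```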
Every time I get back into Rust I hit this, forget I've been here before, Google why canonicalize behaves differently on Windows than all the other languages' stdlib I've used, then return to this issue... Is there an official way/well known crate to:
It's extremely painful to write cross-platform code due to this issue. You can even see the steady stream of bugs linking back to here. I found: but no progress. Perhaps I should just look at how |
It looks like it should be an innocent path-cleaning fn, but it prepends combos of This feels like something where doing that's the right answer in some cases, but wrong in others. I think the answer is to have it take an enum where you have to explicitly specify whether you want the |
This continues to be both an annoyance, and a source of bugs from oversimplified workarounds. |
2337: Use dunce::canonicalize instead of fs::canonicalize r=AlveLarsson a=yancouto

This will make `application_root_dir` work properly on Windows when running the exe directly.

## Description

When building the release mode for Windows, reading file locations relative to the application root directory does not work properly, as `fs::canonicalize` adds a `\\?` prefix to it (UNC paths on Windows, see rust-lang/rust#42869) that is unsupported on most places (including internal stuff like reading RON config files). To make this work and be transparent to users, I'm changing `application_root_dir` to use `dunce::canonicalize`, which should have exactly the same behaviour except it doesn't use UNC paths on Windows when it doesn't need to. Otherwise every game that needs to be released on Windows will do a hack like yancouto/psycho_rust@ee72e21. Unless I'm doing something wrong?

## PR Checklist

By placing an x in the boxes I certify that I have:

- [x] Updated the content of the book if this PR would make the book outdated.
- [ ] Added a changelog entry if this will impact users, or modified more than 5 lines of Rust that wasn't a doc comment. (Should I add a changelog entry?)
- [ ] Added unit tests for new code added in this PR.
- [x] Acknowledged that by making this pull request I release this code under an MIT/Apache 2.0 dual licensing scheme.

If this modified or created any rs files:

- [x] Ran `cargo +stable fmt --all`
- [x] Ran `cargo clippy --all --features "empty"`
- [x] Ran `cargo test --all --features "empty"`

Co-authored-by: Yan Couto <yancouto@gmail.com>
Unless I'm somehow mistaken, it seems that one tool that doesn't support UNC paths is rustc itself: I tried to pass a UNC path to |
This improves the handling when cargo is run on windows using a UNC path as its working directory. See also: - rust-lang#7986 - rust-lang/rust#42869
Hi, I hope this is the right forum/format to register this problem, let me know if it's not.

Today I tried to use `std::fs::canonicalize` to make a path absolute so that I could execute it with `std::process::Command`. `canonicalize` returns so-called "UNC paths", which look like this: `\\?\C:\foo\bar\...` (sometimes the `?` can be a hostname).

It turns out you can't pass a UNC path as the current directory when starting a process (i.e., `Command::new(...).current_dir(unc_path)`). In fact, a lot of other apps will blow up if you pass them a UNC path: for example, Microsoft's own `cl.exe` compiler doesn't support it: alexcrichton/cc-rs#169

It feels to me that maybe returning UNC paths from canonicalize is the wrong choice, given that they don't work in so many places. It'd probably be better to return a simple "absolute path", which begins with the drive letter, instead of returning a UNC path, and instead provide a separate function specifically for generating UNC paths for people who need them.

Maybe if this is too much of an incompatible change, a new function for creating absolute paths should be added to std? I'd bet, however, that making the change to `canonicalize` itself would suddenly make more software suddenly start working rather than suddenly break.
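For the use case described here, a sketch that avoids `canonicalize` entirely, assuming a toolchain with `std::path::absolute` (stable since Rust 1.79, well after this issue was filed). That function makes a path absolute without touching the filesystem or resolving symlinks, and on Windows it does not add a `\\?\` prefix to an ordinary path. The helper name `command_in_dir` is hypothetical.

```rust
use std::io;
use std::path::Path;
use std::process::Command;

/// Builds a `Command` whose working directory is `dir` made absolute.
/// Unlike `canonicalize`, `std::path::absolute` does not require the path
/// to exist and does not resolve symlinks.
pub fn command_in_dir(program: &str, dir: &Path) -> io::Result<Command> {
    let dir = std::path::absolute(dir)?;
    let mut cmd = Command::new(program);
    cmd.current_dir(dir);
    Ok(cmd)
}
```

Because symlinks are left unresolved, this is not a drop-in replacement when true canonical form is needed; it only covers the "I just want an absolute path" case raised in the issue.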