RFC: path reform #474

aturon · 2014-11-19T06:34:25Z

This RFC reforms the design of the std::path module in preparation for API
stabilization. The path API must deal with many competing demands, and the
current design handles many of them, but suffers from some significant problems
given in "Motivation" below. The RFC proposes a redesign modeled loosely on the
current API that addresses these problems while maintaining the advantages of
the current design.

Thanks to @kballard, who helped spark some of the initial ideas over Ike's sandwiches!

Rendered

aturon · 2014-11-19T06:34:53Z

cc @kballard @wycats @carllerche @alexcrichton @brson @SimonSapin

huonw · 2014-11-19T06:39:33Z

text/0000-path-reform.md

+impl<Sized? P> FromIterator<P> for PathBuf where P: AsPath { .. }
+impl<Sized? P> Extend<P> for PathBuf where P: AsPath { .. }
+
+impl<Sized? P> Path where P: AsPath {


Nit/implementation detail: will putting the type parameter here cause inference problems with e.g. Path::new?

@huonw Given #447, this will probably have to move inward, but at the moment this seemed the most concise way to present the API.

nikomatsakis · 2014-11-19T10:47:11Z

I don't work much with paths, but I figure if the API itself gets a sign-off from @alexcrichton, @wycats, and @kballard, it's got to be fine. And we can always grow it over time if needed.

The general approach of having a borrowed type (Path) and a mutable, owning variant (PathBuf) seems like a great idea and I have a feeling this is going to become "a thing". One thing I was curious about was the decision to make Path unsized. Is this necessary for some reason? (One reason I can imagine might be BorrowFrom, if we were to change that to use associated types.)

In general, I imagine that many of the "view" types we will create will not be unsized, so it'd be good to know where unsizedness is helpful. One could imagine allowing types to "opt in" to being unsized, perhaps using a negative impl. (And the more unsized types we have, the more Sized? annotations are required on traits and so forth.)

Anyway, this is not criticism of this RFC, which looks great, more just thinking aloud about the trend that it signals.
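For readers less familiar with the borrowed/owned split discussed above, here is a minimal sketch using `Path` and `PathBuf` as they exist in today's `std::path`, which may differ in detail from the RFC text at the time of this comment. The helper names `with_backup_extension` and `lookup` are made up for illustration.

```rust
use std::collections::HashMap;
use std::mem::size_of;
use std::path::{Path, PathBuf};

/// Takes a borrowed view and returns a new owned, mutable path.
/// (Hypothetical helper, not part of std.)
fn with_backup_extension(path: &Path) -> PathBuf {
    let mut owned = path.to_path_buf(); // allocate an owned copy
    owned.set_extension("bak"); // mutation needs the owning type
    owned
}

/// A map keyed by `PathBuf` can be queried with a plain `&Path`,
/// because `PathBuf` implements `Borrow<Path>`.
/// (Hypothetical helper, not part of std.)
fn lookup<'a>(sizes: &'a HashMap<PathBuf, u64>, key: &Path) -> Option<&'a u64> {
    sizes.get(key)
}

fn main() {
    // `Path::new` only reinterprets the string slice; nothing is copied.
    let borrowed: &Path = Path::new("logs/app.txt");
    let owned = with_backup_extension(borrowed);
    println!("{} -> {}", borrowed.display(), owned.display());

    let mut sizes = HashMap::new();
    sizes.insert(owned.clone(), 42u64);
    println!("{:?}", lookup(&sizes, &owned));

    // `Path` is unsized, so `&Path` is a fat pointer (address + length).
    println!("{}", size_of::<&Path>() == 2 * size_of::<usize>());
}
```

The `lookup` helper is where the unsized `Path` pays off: the caller never has to allocate a `PathBuf` just to perform a map query.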

japaric · 2014-11-19T12:45:01Z

text/0000-path-reform.md

+impl<Sized? P> Extend<P> for PathBuf where P: AsPath { .. }
+
+impl<Sized? P> Path where P: AsPath {
+    pub fn new(path: &str) -> &Path;


The current API has a new_opt constructor that returns Option<Path>. It seems this constructor has been removed in the new API; is that intentional? See rust-lang/rust#14048

michaelsproul · 2014-11-19T16:36:58Z

Some of the bare functions in std::io::fs look very C-like. Would there be any interest in extending the PathExtensions API to include the same functionality while providing more Rustic names? We could have .contents instead of readdir, .make_dir instead of mkdir, etc. The C-style names could still be accessible through libc.

SimonSapin · 2014-11-19T19:16:01Z

text/0000-path-reform.md

+    pub fn new(path: &str) -> &Path;
+
+    pub fn as_str(&self) -> Option<&str>
+    pub fn as_str_lossy(&self) -> Cow<String, str>; // Cow will replace MaybeOwned


Should this be to_str_lossy? Not sure which of the as_ or to_ naming conventions should “win” for Cow.
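As a point of reference, in today's standard library the strict conversion is `Path::to_str` and the lossy one is `Path::to_string_lossy`; the names in the diff above are the RFC draft's. A minimal sketch of how the two fit together (the helper `display_name` is hypothetical):

```rust
use std::borrow::Cow;
use std::path::Path;

/// Returns the path as text, plus a flag saying whether the conversion
/// was lossless. (Hypothetical helper, not part of std.)
fn display_name(path: &Path) -> (Cow<'_, str>, bool) {
    match path.to_str() {
        // Valid Unicode: borrow the existing bytes, no allocation.
        Some(s) => (Cow::Borrowed(s), true),
        // Otherwise fall back to the lossy form, which substitutes
        // U+FFFD for anything that cannot be represented.
        None => (path.to_string_lossy(), false),
    }
}

fn main() {
    let (name, lossless) = display_name(Path::new("notes/todo.txt"));
    println!("{} (lossless: {})", name, lossless);
}
```

The `Cow` return type is what makes the naming question awkward: the result is a cheap borrow in the common case and an owned `String` only when replacement is needed.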

SimonSapin · 2014-11-19T20:24:45Z

text/0000-path-reform.md

+
+It is not clear how best to incorporate the
+[WTF-8 implementation](https://github.com/SimonSapin/rust-wtf8) (or how much to
+incorporate) into `libstd`.


rust-wtf8 currently duplicates some standard library code, changing the char, str, and String types to their respective supersets CodePoint, Wtf8, and Wtf8Buf. To avoid the duplication, libstd could maybe have private functions that are generic over these types, so that optimization on monomorphized functions could still take advantage e.g. of the LLVM range asserts on char.

Valloric · 2014-11-19T21:17:49Z

It might be useful to look at pathlib from Python 3 as a source of inspiration; it also has to deal with posix/windows paths etc.

mahkoh · 2014-11-19T21:18:52Z

text/0000-path-reform.md

+
+* **Where did `push_many` and friends go?** They're replaced by implementing
+  `FromIterator` and `Extend`, following a similar pattern with the `Vec`
+  type. (Some work will be needed to retain full efficiency when doing so.)


Just a reminder that Vec::push_all has not been deprecated, because extend still has ergonomics and performance problems, and that, at this point, using extend is much, much slower than using push_all.

    test extend   ... bench:      2868 ns/iter (+/- 9)
    test push_all ... bench:        76 ns/iter (+/- 0)

Have any concrete plans been made to fix this?
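For context on the pattern the RFC text describes, here is a small sketch of building paths through `FromIterator` and `Extend`, written against today's `std::path::PathBuf` (which implements both for any item that is `AsRef<Path>`). The helper names `join_all` and `append_all` are hypothetical, and the sketch says nothing about the performance question raised above.

```rust
use std::path::{Path, PathBuf};

/// Builds a path from components via `FromIterator` (i.e. `collect`).
/// (Hypothetical helper, not part of std.)
fn join_all<I, P>(parts: I) -> PathBuf
where
    I: IntoIterator<Item = P>,
    P: AsRef<Path>,
{
    parts.into_iter().collect()
}

/// Appends components to an existing path via `Extend`.
/// (Hypothetical helper, not part of std.)
fn append_all<I, P>(base: &mut PathBuf, parts: I)
where
    I: IntoIterator<Item = P>,
    P: AsRef<Path>,
{
    base.extend(parts);
}

fn main() {
    let mut path = join_all(vec!["usr", "local"]);
    append_all(&mut path, vec!["share", "doc"]);
    println!("{}", path.display());
}
```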

    test extend   ... bench:        43 ns/iter (+/- 4)
    test push_all ... bench:        33 ns/iter (+/- 4)

    extern crate test;
    use test::Bencher as B;

    #[bench]
    fn push_all(b: &mut B) {
        let mut vec = vec![];
        let values = [1i, .. 10];
        b.iter(|| {
            vec.push_all(&values)
        })
    }

    #[bench]
    fn extend(b: &mut B) {
        let mut vec = vec![];
        let values = [1i, .. 10];
        b.iter(|| {
            vec.extend(values.iter().map(|x| *x))
        })
    }

The memcpy overhead will be significant at N=10.

    extern crate test;

    #[inline(never)]
    fn prepare() -> (Vec<u8>, Vec<u8>) {
        (vec!(), Vec::from_elem(1024, 0))
    }

    #[bench]
    pub fn extend(b: &mut test::Bencher) {
        let (mut dst, src) = prepare();
        b.iter(|| {
            dst.clear();
            dst.extend(src.iter().map(|v| *v));
            test::black_box(&dst);
        });
    }

    #[bench]
    pub fn push_all(b: &mut test::Bencher) {
        let (mut dst, src) = prepare();
        b.iter(|| {
            dst.clear();
            dst.push_all(src.as_slice());
            test::black_box(&dst);
        });
    }
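The benchmarks above use 2014-era syntax and will not compile today. For anyone wanting to repeat the comparison, the stable successor of `push_all` is `Vec::extend_from_slice`. The sketch below only shows that the two styles are functionally equivalent; the function names are hypothetical, and no timing claim is implied, so measure on your own toolchain.

```rust
/// Iterator-based append, the style the RFC favours.
/// (Hypothetical helper for illustration.)
fn append_by_iter(dst: &mut Vec<u8>, src: &[u8]) {
    dst.extend(src.iter().copied());
}

/// Slice-based append, the modern equivalent of `push_all`.
/// (Hypothetical helper for illustration.)
fn append_by_slice(dst: &mut Vec<u8>, src: &[u8]) {
    dst.extend_from_slice(src);
}

fn main() {
    let src = vec![0u8; 1024];

    let mut a = Vec::new();
    append_by_iter(&mut a, &src);

    let mut b = Vec::new();
    append_by_slice(&mut b, &src);

    // Both approaches produce the same contents.
    println!("{}", a == b);
}
```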

alexcrichton · 2014-12-19T17:16:47Z

Tracking issue

l0kod · 2014-12-24T14:16:48Z

A bit late, but Path comparison should take into account whether the file system is case-sensitive.

This should prevent security bugs like CVE-2014-9390.

SimonSapin · 2014-12-24T15:33:43Z

A bit late, but Path comparison should take into account whether the file system is case-sensitive.

If so, should it also account for Apple’s very own modified variant of Unicode Normalization Form D, on HFS+?

Regardless of HFS+ details, determining which filesystem a given path is on requires some system calls, which might be more expensive than a simple string comparison.

l0kod · 2014-12-24T16:27:10Z

Regardless of HFS+ details, determining which filesystem a given path is on requires some system calls, which might be more expensive than a simple string comparison.

Yes, if it's automatic, but an optional flag on Path could indicate whether the path should be treated as case-sensitive or not.
This flag could be automatically set with an exists()-like method call (cf. PathExtensions trait), which should have a negligible impact.

retep998 · 2014-12-24T18:52:42Z

Regarding case sensitivity: on Windows, normal paths are case-insensitive, but if you use a \\?\ path there is no normalization and everything is case-sensitive.
Perhaps we could have two forms of comparison, one that compares the strings, the other that compares the real paths and does system calls?
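To make the "two forms of comparison" idea concrete, here is a rough sketch against today's `std`. The built-in `==` on `Path` compares components textually and is case-sensitive. The two helpers below are hypothetical: `eq_ignore_ascii_case` is a purely textual, ASCII-only fold (it does not attempt Unicode case folding or HFS+ normalization), and `same_target` asks the operating system via `fs::canonicalize`, which resolves symlinks and relative components; whether it also reconciles case depends on the platform and filesystem.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Textual comparison that ignores ASCII case, component by component.
/// No system calls are made. (Hypothetical helper, not part of std.)
fn eq_ignore_ascii_case(a: &Path, b: &Path) -> bool {
    let mut left = a.components();
    let mut right = b.components();
    loop {
        match (left.next(), right.next()) {
            (None, None) => return true,
            (Some(x), Some(y)) => {
                if !x.as_os_str().eq_ignore_ascii_case(y.as_os_str()) {
                    return false;
                }
            }
            // One path has more components than the other.
            _ => return false,
        }
    }
}

/// "Real path" comparison: both paths must exist, and the OS is consulted.
/// (Hypothetical helper, not part of std.)
fn same_target(a: &Path, b: &Path) -> io::Result<bool> {
    Ok(fs::canonicalize(a)? == fs::canonicalize(b)?)
}

fn main() {
    let a = Path::new("Docs/README");
    let b = Path::new("docs/readme");
    println!("std ==            : {}", a == b);
    println!("ascii-insensitive : {}", eq_ignore_ascii_case(a, b));
    println!("same target       : {:?}", same_target(Path::new("."), Path::new("./.")));
}
```

Keeping the cheap textual comparison as the default and making the system-call version an explicit, fallible call mirrors the cost concern raised earlier in the thread.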

erickt · 2015-01-03T12:24:53Z

For reference, the C++ standards committee just accepted a filesystem RFC.

blaenk · 2015-01-03T20:19:56Z

Yeah, I think it would be very useful and informative to look into the combined efforts of the committee to see the motivations behind the choices they made. It could allow us to recognize something we may have overlooked in this paths reform RFC.

aturon · 2015-01-03T20:31:17Z

FWIW as I've been working on the implementation I've been moving steadily closer to what Boost did, which is presumably close to this. I will take a look ASAP.

retep998 · 2015-01-26T04:27:15Z

So I think what should happen is that whenever we call a system function with a path, we should check the length of the path. If it is safely within MAX_PATH, then we go ahead and call the function. If the path is too long, then we first check whether it is absolute or relative. If the path is relative, we return an error. If the path is absolute, then we check whether it is a \\?\ path. If it isn't, then we standardize the path by prepending \\?\ (if it is a UNC path that starts with \\ but not \\?\, then the path should be prefixed by \\?\UNC\), converting / to \, and accounting for . and .. components. We can then pass this path to the function.

EDIT: We should also probably convert all absolute paths to \\?\ paths when turning a relative path into an absolute path or normalizing a path. Also when appending a path onto a \\?\ path, we should normalize the appended path.
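The length-check procedure described above can be sketched as a pure string transformation, so that it runs on any platform. Everything here is an assumption made for illustration: the names `prepare_for_syscall` and `LongPathError`, measuring length in UTF-16 units against the classic Windows `MAX_PATH` value of 260, treating only `C:\`-style and `\\server\share` forms as absolute, and ignoring `\\.\` device paths. It is not what `std` does.

```rust
/// Classic Windows limit; compared here against the UTF-16 length.
const MAX_PATH: usize = 260;

#[derive(Debug, PartialEq)]
enum LongPathError {
    /// A relative path was too long to pass through unchanged.
    RelativeTooLong,
}

/// True for paths of the form `C:\...` (hypothetical helper).
fn is_drive_absolute(p: &str) -> bool {
    let b = p.as_bytes();
    b.len() >= 3 && b[0].is_ascii_alphabetic() && b[1] == b':' && b[2] == b'\\'
}

/// Returns the string that would be handed to the system call.
fn prepare_for_syscall(path: &str) -> Result<String, LongPathError> {
    // Short paths, and paths that are already verbatim, pass through untouched.
    if path.encode_utf16().count() < MAX_PATH || path.starts_with(r"\\?\") {
        return Ok(path.to_string());
    }
    // Verbatim paths do not interpret `/`, so convert separators first.
    let path = path.replace('/', r"\");
    // `floor` is the number of leading components that `..` must not remove
    // (the server and share names of a UNC path).
    let (prefix, rest, floor) = if let Some(unc) = path.strip_prefix(r"\\") {
        (r"\\?\UNC\".to_string(), unc, 2)
    } else if is_drive_absolute(&path) {
        (format!(r"\\?\{}", &path[..3]), &path[3..], 0)
    } else {
        return Err(LongPathError::RelativeTooLong);
    };
    // Resolve `.` and `..` by hand, since verbatim paths are taken literally.
    let mut stack: Vec<&str> = Vec::new();
    for component in rest.split('\\') {
        match component {
            "" | "." => {}
            ".." => {
                if stack.len() > floor {
                    stack.pop();
                }
            }
            other => stack.push(other),
        }
    }
    Ok(format!("{}{}", prefix, stack.join(r"\")))
}

fn main() {
    let long = format!(r"C:\{}\..\x/./y", "a".repeat(300));
    println!("{:?}", prepare_for_syscall(r"C:\tmp"));
    println!("{:?}", prepare_for_syscall(&long));
    println!("{:?}", prepare_for_syscall(&"a".repeat(300)));
}
```

Resolving `..` textually is a deliberate simplification: it ignores symlinks, which is one reason a real implementation might prefer to ask the OS for the full path instead.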

As part of [RFC 474](rust-lang/rfcs#474), this commit renames `std::path` to `std::old_path`, leaving the existing path API in place to ease migration to the new one. Updating should be as simple as adjusting imports, and the prelude still maps to the old path APIs for now. [breaking-change]

Implements [RFC 474](rust-lang/rfcs#474); see that RFC for details/motivation for this change. This initial commit does not include additional normalization or platform-specific path extensions. These will be done in follow up commits or PRs.

This PR implements [path reform](rust-lang/rfcs#474), and motivation and details for the change can be found there. For convenience, the old path API is being kept as `old_path` for the time being. Updating after this PR is just a matter of changing imports to `old_path` (which is likely not needed, since the prelude entries still export the old path API). This initial PR does not include additional normalization or platform-specific path extensions. These will be done in follow up commits or PRs. [breaking-change] Closes rust-lang#20034 Closes rust-lang#12056 Closes rust-lang#11594 Closes rust-lang#14028 Closes rust-lang#14049 Closes rust-lang#10035

RFC: path reform

4a2b120

huonw reviewed Nov 19, 2014
Fix typos

77810af

japaric reviewed Nov 19, 2014
alexcrichton mentioned this pull request Nov 19, 2014

Rename some methods in the os module rust-lang/rust#19110

Closed

SimonSapin reviewed Nov 19, 2014
aturon added 2 commits November 19, 2014 12:12

as_str_lossy -> to_str_lossy

fa0f04e

enums are now namespaced

a2a1b7b

SimonSapin added a commit to SimonSapin/rust-wtf8 that referenced this pull request Nov 19, 2014

Rename Wtf8Slice to Wtf8 and Wtf8String to Wtf8Buf

a643e7a

... per Path reform RFC convention. rust-lang/rfcs#474

SimonSapin reviewed Nov 19, 2014
mahkoh reviewed Nov 19, 2014
jgallagher mentioned this pull request Jan 13, 2015

Non-idiomatic constructor rusqlite/rusqlite#13

Closed

This was referenced Jan 23, 2015

Path::new should return Option<Path> vs. failing (if there is a null byte) rust-lang/rust#14048

Closed

Path::dirname should return Path instead of [u8] rust-lang/rust#14049

Closed

aturon mentioned this pull request Jan 28, 2015

Stabilization for 1.0-alpha2 rust-lang/rust#20761

Closed

38 tasks

aturon mentioned this pull request Jan 29, 2015

Rename std::path to std::old_path; introduce new std::path rust-lang/rust#21759

Merged

SimonSapin mentioned this pull request May 4, 2015

Windows build regularly gets broken servo/rust-url#102

Closed

aturon mentioned this pull request Sep 2, 2015

RFC: Stabilize catch_panic #1236

Merged

aturon mentioned this pull request Feb 3, 2016

Path equality isn't correct rust-lang/rust#31374

Closed

Centril added the A-file label ("Proposals relating to file systems.") Nov 23, 2018
