Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revisit passing of strings: redesign and replace NixPath #221

Open
kamalmarhubi opened this issue Dec 25, 2015 · 30 comments
Open

Revisit passing of strings: redesign and replace NixPath #221

kamalmarhubi opened this issue Dec 25, 2015 · 30 comments

Comments

@kamalmarhubi
Copy link
Member

https://github.com/carllerche/nix-rust/blob/00c7905a8a85668d0993420e356a56002ab23905/src/lib.rs#L115

I'm trying to bring back mount (#85) and while I can make it work, I'm not 100% sure of what the purpose of NixPath is. Could do with a 1-2 sentence explanation of why and when to use it. Its name is confusing for mount because the filesystem type is passed as a NixPath, though it isn't really a path.

@arcnmx
Copy link
Contributor

arcnmx commented Dec 28, 2015

NixPath needs a major rehaul...

  1. It's a weird Into<CString> hybrid-like thing that also happens to use a large part of the stack and checks len against MAXPATHLEN, which isn't really helpful anyway.
  2. It doesn't have an implementation for CStr/CString/etc. so you can't even avoid the stack and memcpy cost if you wanted to.
  3. The closure with_nix_path usage is super awkward (just look at mount's current commented-out implementation!).
  4. And then there are parts of nix that use &CString!!! These need to be killed and changed to &CStr or AsRef as appropriate. AND the linked execve implementation does behind-the-scenes allocations to generate the null-terminated list of strings. That should be its own type instead so the caller can pay that cost up-front! Otherwise you're calling collect::<Vec<_>>() after a fork/clone/etc!

We should basically just use AsRef<CStr> instead. For easy integration with nix, maybe provide a impl From<std::ffi::NulError> for nix::Error when using try!(CString::new(...)). Nix can also provide a new explicit NixPath to use when you actually want the stack allocation and MAXPATHLEN check:

struct NixPath([u8; MAX_PATH]);

impl NixPath {
    fn new<T: AsRef<[u8]>>(t: T) -> Result<Self> { ... }
}

impl AsRef<CStr> for NixPath { ... }

@kamalmarhubi
Copy link
Member Author

It's a weird Into<CString> hybrid-like thing that also happens to use a large part of the stack

The stack usage gave me a lot of confusion with "disappearing" strings! (This was "my fault" for returning the result of as_ptr from the passed in closure, but not quite intuitive...)

We should basically just use AsRef<CStr> instead.

Using AsRef<CStr> seems limited. Not even CString implements it. Perhaps AsRef<OsStr> would be better. This would also allow passing Path and PathBuf, str, and string. This would require a copy to ensure null termination though... hrm.

@arcnmx
Copy link
Contributor

arcnmx commented Dec 30, 2015

The stack usage gave me a lot of confusion with "disappearing" strings! (This was "my fault" for returning the result of as_ptr from the passed in closure, but not quite intuitive...)

Mm, that's the purpose of using the closure. It's not a rare Rust idiom, but shouldn't really be used unless strictly necessary. Lifetimes would normally catch that sort of misuse, but oop pointers!

Using AsRef seems limited. Not even CString implements it. Perhaps AsRef would be better. This would also allow passing Path and PathBuf, str, and string. This would require a copy to ensure null termination though... hrm.

Yeah as soon as I realised that I submitted rust-lang/rust#30616 because that's a huge missing trait impl for libstd. It's silly, it basically means we're stuck with using our own trait instead that's just a clone of AsRef<CStr>. I'd suggest naming it something more descriptive than NixPath this time around, and when the AsRef<CStr> for CString hits upstream we can always just switch over to it.

Alternatively, the trait could look more like so:

trait ToCStr {
    type Target: AsRef<CStr>;
    fn to_cstr(&self) -> Result<Self::Target> { ... }
}

... which would retain the current behaviour of being able to pass OsStrs, Paths, etc. without the closure weirdness. I'm personally not a fan of the hidden conversions though, it feels a bit too magical for a Rust API vs just using AsRef<CStr> on its own (IMO the InvalidPath variant probably shouldn't be part of nix::Error at all, which should only cover errno).

@kamalmarhubi
Copy link
Member Author

Mm, that's the purpose of using the closure. It's not a rare Rust idiom, but shouldn't really be used unless strictly necessary. Lifetimes would normally catch that sort of misuse, but oop pointers!

Yeah! Working on mount is part of my first attempt at a Rust project, and I had a bit too much faith in the borrow checker. Took me several runs under gdb to see that I was clobbering the buffer, and to understand that the pointer doesn't have a tracked lifetime. Learning!

I'd suggest naming it something more descriptive than NixPath this time around, and when the AsRef<CStr> for CString hits upstream we can always just switch over to it.

This sounds like a good plan.

Alternatively, the trait could look more like so: [...]

Assuming this was the desired path, would Into<T: AsRef<CStr>> work? I'm still learning how stuff works, but t'd be nice to exploit std traits as much as possible.

I'm personally not a fan of the hidden conversions though, it feels a bit too magical for a Rust API vs just using AsRef<CStr> on its own

I see where this is coming from. Rust definitely favours explicitness! In #190, @carllerche's stated goal of nix is "*I do know that the goal of nix is comprehensive, safe, zero cost bindings to system APIs on nix OSes", so your feelings are in line with that.

@kamalmarhubi
Copy link
Member Author

Given the direction of discussion here, I'm changing the title of the issue :-)

@kamalmarhubi kamalmarhubi changed the title Document NixPath trait Revisit passing of strings: redesign and replace NixPath Dec 31, 2015
@kamalmarhubi
Copy link
Member Author

One additional niggle of using AsRef<CStr> is that there's no way to safely have a null CStr for functions that allow null char* args. This is already a problem with NixPath, just another thing to be aware of. Taking Option<&T> where T: AsRef<CStr> kind of works, but I can't figure out how to use None without needing to be explicit about the concrete type T.

@arcnmx
Copy link
Contributor

arcnmx commented Dec 31, 2015

Yeah, that's a pain... Option<T> with a default type would be the way to do it, but
default types influencing inference doesn't seem to be (fully?) implemented yet... Need to use Option<&CStr> instead until then. Not too big a pain when your type impls Deref<Target=CStr> (as CString does) at least. At the worst it means you have to do an .as_ref() in the call to the method.

For the most part, the advantage of AsRef<CStr> over &CStr is that it allows you to call the method with arguments without being required to & borrow all of them :P

@arcnmx
Copy link
Contributor

arcnmx commented Dec 31, 2015

In fact, until impl AsRef<CStr> for CString lands in libstd, we could really just use &CStr everywhere instead. The one thing the trait buys you is being able to move types into the call instead of requiring them to be borrowed... Making a nix-specific trait isn't a huge deal, but may not even be worth it when &CStr works almost as well thanks to deref coercion.

Assuming this was the desired path, would Into<T: AsRef> work? I'm still learning how stuff works, but t'd be nice to exploit std traits as much as possible.

The difference is that Into<T> is generic over T, while my example uses an associated type. It's a strange difference but I don't think the type inference would be able to work out the Into<T> approach without explicitly annotating parts of the function calls.

I see where this is coming from. Rust definitely favours explicitness!

I think it makes for a cleaner API, nix::Error having its InvalidPath variant is just kind of a weird hidden feature... Result<T, Errno> makes more sense to me, and these APIs take CStrings so that's what they should be passed! It makes things more verbose when your data is coming from a str, OsStr, or Path though... Which is a fairly common scenario so Idunno what other peoples' opinions are...

@carllerche: opinions?

@kamalmarhubi
Copy link
Member Author

Yeah, that's a pain... Option<T> with a default type would be the way to do it, but
default types influencing inference doesn't seem to be (fully?) implemented yet...

Yeah, as one of my multiple attempts at a semi-ergonomic mount in the current framework, I had this monstrosity:

pub fn mount<P1: ?Sized=[u8], P2: ?Sized=[u8], P3: ?Sized=[u8], P4: ?Sized=[u8]>(
        source: Option<&P1>,
        target: &P2,
        fstype: Option<&P3>,
        flags: MsFlags,
        data: Option<&P4>) -> Result<()>
where  P1: NixPath, P2: NixPath, P3: NixPath, P4: NixPath {

It didn't work.

@kamalmarhubi
Copy link
Member Author

I think it makes for a cleaner API, nix::Error having its InvalidPath variant is just kind of a weird hidden feature...

Especially for the fstype and data arguments of mount!

These APIs take CStrings, and that's what they should be passed!

This comes back to functions that allow NULL to be passed in. NULL is not a CString or CStr, and that makes things a bit difficult. Do nix maintainers want to be responsible for reading all platforms' manpages to be sure every nullable string argument is passed as an Option<&CStr>?

@arcnmx
Copy link
Contributor

arcnmx commented Dec 31, 2015

This comes back to functions that allow NULL to be passed in. NULL is not a CString or CStr, and that makes things a bit difficult. Do nix maintainers want to be responsible for reading all platforms' manpages to be sure every nullable string argument is passed as an Option<&CStr>?

Yes, that's exactly what Option<&> is for! Whether NULL is valid or not is usually well specified, though I've never actually seen an explanation for when NULL is okay in the mount parameters >_>

@kamalmarhubi
Copy link
Member Author

I've never actually seen an explanation for when NULL is okay in the mount parameters >_>

My justification for the existing (commented out) set of NULL-allowing parameters:

    // Supplying null should be fine for these arguments: source can be a dummy
    // for, eg, tmpfs; fstype and data can be ignored for bind mounts.

Still, requiring careful reading of manpages to get the nix bindings right seems a bit error prone. It's probably fine, but we'll end up with a few breaking changes as things get fixed. Platform-specific differences can be handled by conditional compilation.

@arcnmx
Copy link
Contributor

arcnmx commented Dec 31, 2015

Error prone maybe, but very necessary. In Rust, nullability is part of the type system. In C, it's just part of an API contract. It's usually pretty obvious when something can be NULL or not, but yes, it's up to nix to figure out what the proper argument types to any function should be.

My justification for the existing (commented out) set of NULL-allowing parameters

Well, there needs to be a guarantee that it won't try and read from the pointer... What if someone supplies source = None with a fstype that's not tmpfs? What if fstype = None when flags doesn't specify MS_BIND or MS_MOVE etc? Will implementations just spit out EINVAL? Will they segfault? You generally have these options:

  1. Get a guarantee from the specification that NULL is a valid argument.
  2. Pass an empty string instead, something like source.unwrap_or(CStr::from_bytes(b"\0"))
  3. Reflect these invariants in the type system itself. Instead of fsType: Option<&CStr>, make it an enum with variants for known cases where some arguments are optional.
    • You still may need to do 2. behind the scenes in the cases where you don't have the guarantee of 1.

kamalmarhubi added a commit to kamalmarhubi/nix-rust that referenced this issue Jan 1, 2016
This makes NixPath more versatile while waiting to see if there will be
a NixPath overhaul.

Refs nix-rust#221
@carllerche
Copy link
Contributor

Sorry for the delay! I need remove myself as a blocker (#236 assemble a team of maintainers). @arcnmx Interested? :) If so, please comment there and / or join the IRC channel!

Everything discussed in this thread is pretty much spot on. A lot in nix needs an overhaul. The lib started mostly as a dumping ground for bindings vs. reimplementing them in each lib that needed them. I would like to get a document that covers these kinds of questions and do a pass throughout the lib to unify API styles (see #190).

As @arcnmx mentioned, if a system API accepts NULL to represent not passing a path, then the nix version should take Option<_>. If a path is required, it should take the type directly.

Specifically regarding NixPath and AsRef<CStr>... I agree that NixPath needs to improve and the name is not ideal... AsRef<CStr> seems plausible. Could somebody summarize the pros / cons of it?

@carllerche
Copy link
Contributor

Thinking more about it, IMO it may be worth creating a separate type for path arguments vs. string arguments.... to get some type safety. Thoughts?

@arcnmx
Copy link
Contributor

arcnmx commented Jan 20, 2016

IMO it may be worth creating a separate type for path arguments vs. string arguments.... to get some type safety

Well, as far as type invariants are concerned, a path is equivalent to a C string. There's not any advantage in making a wrapper type that asserts you meant it to be a path rather than a CStr. libstd has a Path type but that's mostly so that it can provide helper methods for dealing with path components - it doesn't "protect" you from using a &str or &OsStr as a parameter to a method that takes a AsRef<Path>.

Could somebody summarize the pros / cons of it?

The main con is we move the responsibility of converting a string to a null-terminated string from nix to the caller. The nix error type no longer would have an InvalidPath variant, and would simply represent an Errno. It can make code more verbose depending on where your path strings come from. It may force you to add an enum variant to your error types where you could previously springboard off of nix's.

Another minor con is that AsRef<CStr> only landed recently in libstd but we can use our own trait in place of it for now. The main disadvantage is that external libraries need to impl our trait instead of AsRef<CStr> if they want to interop with nix methods without an explicit conversion.

The pro is that it makes for a simpler and more transparent nix design that more closely represents the underlying APIs that it's wrapping. A higher level version that accepts and deals with rust strings can be done as part of #190 discussions.


As a random data point, I've been using the new API in my own project without issue. I prefer having the cost of conversion to CString be up-front and visible and I haven't found it to be a burden for error handling. My project does the usual fork/exec/etc. dance so it's important that these conversions are done in advance in order to avoid allocations (dangerous if the process is multithreaded) or blowing up a small stack after the fork.

@kamalmarhubi
Copy link
Member Author

If we're dropping the InvalidPath variant, can we just use std::io::Error::from_raw_os_error() instead? Or even last_os_error()?

@arcnmx
Copy link
Contributor

arcnmx commented Jan 20, 2016

We could! I think nix::Errno fits well though, it has a From<Errno> for io::Error impl so that try!(errno) "just works". This way you can inspect the errno if you like as opposed to using error.raw_os_error().unwrap() or whatever, and there's not much friction with functions that return io::Error.

It's kind of tangential to dropping InvalidPath though, since that already could've been expressed through io::Error as well (and is already done so in libstd). I'd almost say io::Error would've made a bit more sense before this change than after.

The question really becomes whether we want to drop the Errno type entirely or not, which is a whole other discussion. Considering the discussions in #190 about higher-level abstractions over nix, it may make more sense to do this there, especially when errno may not be the only way for a method to fail (like in, say, string conversions).

@kamalmarhubi
Copy link
Member Author

Considering the discussions in #190 about higher-level abstractions over nix, it may make more sense to do this there

Sounds good.

kamalmarhubi added a commit that referenced this issue Jan 25, 2016
This makes NixPath more versatile while waiting to see if there will be
a NixPath overhaul.

Refs #221
@kamalmarhubi
Copy link
Member Author

A relevant comment from @mbr on #245

That currently seems a little messy and missing an implementation for &str? Interface names are not Paths and I don't want to require passing in CStrings, because at that point it stops being a Rust-Interface, I think.

kamalmarhubi added a commit to kamalmarhubi/nix-rust that referenced this issue Jan 27, 2016
This is a stop gap improvement until the NixPath reform is figured out.

refs nix-rust#221
@kamalmarhubi
Copy link
Member Author

The comment I linked makes me think we should have an opt-in module that provides an additional NixString impl that increase ergonomics:

impl<T: AsRef<OsStr>> NixString for T { ... }

AsRef<OsStr> picks up str, String, Path, PathBuf, OsString. We'd need a way to report NUL related errors back though, so I'm not sure about this yet.

kamalmarhubi added a commit to kamalmarhubi/nix-rust that referenced this issue Jan 27, 2016
This is a stop gap improvement until the NixPath reform is figured out.

refs nix-rust#221
@kamalmarhubi
Copy link
Member Author

Thinking a bit more, this complicates the plan to replace NixPath with a NixString that is equivalent to AsRef<CStr>. Two options I can think of:

  • allow the method on NixString to return a Result
  • have a parallel module hierarchy that allows using the more Rusty types

@arcnmx
Copy link
Contributor

arcnmx commented Jan 27, 2016

at that point it stops being a Rust-Interface, I think.

This is related again to #190: just how low-level is nix, how much behind-the-scenes magic will there be, etc? I'm not sure I consider nix to be a Rusty API so much as just a more ergonomic and type-safe liblibc. Abstractions can then be built on top of it.

These methods simply take a C string, it's their definition. Making it more vague is similar to say mount()'s flags argument taking an i32 instead of MsFlags and calling from_bits(etc) internally, setting aside an error variant for when it fails. It's a question of "do we take a more liberal set of types and allocate extra error variants for when they fail to convert" vs. "do we only take the infallible types and push the burden of conversion to the caller". The reason to do the former is mostly for convenience. I favour the latter as it's more KISS and composable. I'm also just not a fan in general of hidden functionality when it involves allocations.

One could also argue that encouraging people to write higher level abstractions over nix methods for convenience and create a more "rusty" API is advantageous vs providing something almost good enough to be used ad-hoc in rust applications!

AsRef picks up str, String, Path, PathBuf, OsString. We'd need a way to report NUL related errors back though, so I'm not sure about this yet.

  • allow the method on NixString to return a Result

This is what NixPath currently is, modulo length checks and perhaps the stack allocations would become heap allocations.

  • have a parallel module hierarchy that allows using the more Rusty types

This would be a layer on top of the current nix, I guess. Could have fancy things like open() returning an Fd type with read/write member methods, etc! I certainly think there's a place for this.

kamalmarhubi added a commit to kamalmarhubi/nix-rust that referenced this issue Jan 27, 2016
This is a stop gap improvement until the NixPath reform is figured out.

refs nix-rust#221
@kamalmarhubi
Copy link
Member Author

Ok, I want to get down some of my thoughts on this issue.

There are two conflicting desires I'd have for strings and nix:

  1. no allocation required if I want to do null-termination myself. Eg, if I have a String, I can turn it into a CString and use that
  2. passing Rusty string types to system libraries, eg passing a &str or &OsStr or &Path

@arcnmx is prioritizing (1) in #230 which I think is the right choice for zero-cost Rustification of system APIs.

At the same time, I want to be able to just write mount(None, "somedir", "tmpfs", MS_NOSUID, None) without having to create the CString or CStr myself. I think that is a good aim for Rustification, but it is not zero-cost.

One way to make this available in nix would be to have a conveniences* module that has a convenience impl NixString for T where T: AsRef<OsStr>** which do allocate and copy, to nul-terminate, and ensure no interior nuls. This would allow uses like (2).


* name can be bikeshedded later
** this may need some indirection through an extension trait to get around coherence and ensure it's opt-in

@kamalmarhubi
Copy link
Member Author

Wait! @arcnmx, did #230 always have ToCString?

@kamalmarhubi
Copy link
Member Author

Writing up a couple of ideas that cam up in chat with @arcnmx today (transcript).

There's a tension between ergonomics and minimizing overhead and cruft. We came up with two ideas to allow the the convenient and ergonomic impl NixString for &str etc:

  • return EINVAL if there are interior nuls or if there is no nul terminator, and log at debug level that the cause was some nul error
    • pros:
      • we don't clutter up the error type with string a string related case
    • cons:
      • we're returning and EINVAL that didn't actually come from the system
      • the user has to build with env_logger and run with RUST_LOG=nix=debug to find out what went wrong
  • add an associated error type to the NixString trait. For CStr it can be Errno, but for &str it can be an enum like Error is currently, with a case for invalid nul termination
    • pros:
      • user can opt into ergonomic impls which will be handy for prototyping, but can still switch to explicit nul handling
    • cons:
      • the error type depends on the concrete string type passed in. This can be made nicer with From impls that change the string error to EINVAL, for example.

I quite like the second idea, which @arcnmx suggested. It feels like a very clean solution to the problem.

@fiveop
Copy link
Contributor

fiveop commented Mar 15, 2016

I do not like the first idea, mainly do to the first counter argument listed.

@kamalmarhubi
Copy link
Member Author

I've rebased #230 on to current master and will experiment with the associated type approach.

@kamalmarhubi
Copy link
Member Author

In #365 we're discussing mkstemp(3), which takes and mutates a (conceptual) &mut CStr. See #365 (comment)

@kamalmarhubi
Copy link
Member Author

kamalmarhubi commented Oct 25, 2016

#444 drops support for Rust < 1.7, which means we can get the AsRef<Cstr> impls that @arcnmx added many moons ago in rust-lang/rust#30616. I will give it some thought after Rust Belt Rust, and aim to put up an RFC draft in the next few weeks. My recent revisiting of (mount #446) reminded me how horrible NixPath is...

    let res = try!(try!(try!(try!(
        source.with_nix_path(|source| {
            target.with_nix_path(|target| {
                fstype.with_nix_path(|fstype| {
                    data.with_nix_path(|data| {
                        unsafe {
                            libc::mount(source.as_ptr(),
                                       target.as_ptr(),
                                       fstype.as_ptr(),
                                       flags.bits,
                                       data.as_ptr() as *const libc::c_void)
                        }
                    })
                })
            })
    })))));

https://github.com/nix-rust/nix/blob/master/src/mount.rs#L60-L75

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants