Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

liblibc: Fix prototype of functions taking char *const argv[] #25641

Merged
merged 1 commit into from
Jun 21, 2015

Conversation

geofft
Copy link
Contributor

@geofft geofft commented May 20, 2015

The execv family of functions and getopt are prototyped somewhat strangely in C: they take a char *const argv[] (and envp, for execve), an immutable array of mutable C strings -- in other words, a char *const *argv or argv: *const *mut c_char. The current Rust binding uses *mut *const c_char, which is backwards (a mutable array of constant C strings).

That said, these functions do not actually modify their arguments. Once upon a time, C didn't have const, and to this day, string literals in C have type char * (*mut c_char). So an array of string literals has type char * [], equivalent to char ** in a function parameter (Rust *mut *mut c_char). C allows an implicit cast from T ** to T * const * (*const *mut T) but not to const T * const * (*const *const T). Therefore, prototyping execv as taking const char * const argv[] would have broken existing code (by requiring an explicit cast), despite being more correct. So, even though these functions don't need mutable data, they're prototyped as if they do.

While it's theoretically possible that an implementation could choose to use its freedom to modify the mutable data, such an implementation would break the innumerable users of execv-family functions that call them with string literals. Such an implementation would also break std::process, which currently works around this with an unsafe as *mut _ cast, and assumes that execvp secretly does not modify its argument. Furthermore, there's nothing useful to be gained by being able to write to the strings in argv themselves but not being able to write to the array containing those strings (you can't reorder arguments, add arguments, increase the length of any parameter, etc.).

So, despite the C prototype with its legacy C problems, it's simpler for everyone for Rust to consider these functions as taking *const *const c_char, and it's also safe to do so. Rust does not need to expose the mistakes of C here. This patch makes that change, and drops the unsafe cast in std::process since it's now unnecessary.

@rust-highfive
Copy link
Collaborator

r? @alexcrichton

(rust_highfive has picked a reviewer for you, use r? to override)

@geofft geofft changed the title liblibc: Fix prototype of execve() and friends liblibc: Fix prototype of functions taking char *const argv[] May 20, 2015
@geofft
Copy link
Contributor Author

geofft commented May 20, 2015

Edited to cover getopt as well as the execvs, since it also takes a char *const argv[]. Unfortunately, the argument about the function not wanting to modify its arguments is not actually true for getopt, and in particular GNU libc violates the prototype of getopt by permuting the array despite the array being declared const. PR #25642 is a followup to this one which tries to deal with that. 😞

@alexcrichton alexcrichton added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label May 27, 2015
@alexcrichton
Copy link
Member

Thanks for the PR @geofft! I think we may want to go ahead and lump in #25642 here as well. I think it's fine to make these changes in terms of an API backwards-compatible sense because *mut will coerce to *const, and your rationale makes sense to me in terms of what signatures these should indeed have.

I wonder, perhaps, though if getopt would make more sense to just always take *mut to make it easier to pass arguments to?

@geofft
Copy link
Contributor Author

geofft commented Jun 1, 2015

I'm fine with making getopt take *mut *mut c_char on all platforms for consistency. I can push that here and close #25642.

That said, is there even a reason to keep it in liblibc? It's not used in-tree, and there's no stable way to get the process arguments in the form they showed up to libc's main: it's much easier to use the getopts crate. (And there's this curious text in POSIX and echoed in Linux's man page: "The parameters argc and argv are the argument count and argument array as passed to main()" -- that is, it might not even be valid to build your own argv out of std::env::args_os().) Given that liblibc's unstable, could we just get rid of it and avoid the question?

@alexcrichton
Copy link
Member

Unfortunately all of liblibc is exported as a stable API under the libc crate, so we can't just remove it :(

@geofft
Copy link
Contributor Author

geofft commented Jun 1, 2015

It's 0.x, right? (I vaguely assumed that was on purpose.) I'd be happy with leaving getopt as is for now and removing it when you have an excuse to do a 0.2, if you want to do it that way.

@alexcrichton
Copy link
Member

It is, yeah, but this doesn't seem like enough reason to bump to 0.2 just yet

@geofft
Copy link
Contributor Author

geofft commented Jun 20, 2015

OK, I've squashed the two together and rebased on current HEAD, and changed getopt to unconditionally take *mut *mut c_char so you don't have unexpected portability problems.

Can we stick a #[deprecated] or something on getopt? Does that work on liblibc?

The execv family of functions do not modify their arguments, so they do
not need mutable pointers. The C prototypes take a constant array of
mutable C-strings, but that's a legacy quirk from before C had const
(since C string literals have type `char *`). The Rust prototypes had
`*mut` in the wrong place, anyway: to match the C prototypes, it should
have been `*const *mut c_char`. But it is safe to pass constant strings
(like string literals) to these functions.

getopt is a special case, since GNU getopt modifies its arguments
despite the `const` claim in the prototype. It is apparently only
well-defined to call getopt on the actual argc and argv parameters
passed to main, anyway. Change it to take `*mut *mut c_char` for an
attempt at safety, but probably nobody should be using it from Rust,
since there's no great way to get at the parameters as passed to main.

Also fix the one caller of execvp in libstd, which now no longer needs
an unsafe cast.

Fixes rust-lang#16290.
@alexcrichton
Copy link
Member

Nah unfortunately #[deprecated] won't wok for the liblibc on crates.io, but otherwise this looks fine to me, thanks @geofft!

@bors: r+ 058a0f0

bors added a commit that referenced this pull request Jun 21, 2015
The `execv` family of functions and `getopt` are prototyped somewhat strangely in C: they take a `char *const argv[]` (and `envp`, for `execve`), an immutable array of mutable C strings -- in other words, a `char *const *argv` or `argv: *const *mut c_char`. The current Rust binding uses `*mut *const c_char`, which is backwards (a mutable array of constant C strings).

That said, these functions do not actually modify their arguments. Once upon a time, C didn't have `const`, and to this day, string literals in C have type `char *` (`*mut c_char`). So an array of string literals has type `char * []`, equivalent to `char **` in a function parameter (Rust `*mut *mut c_char`). C allows an implicit cast from `T **` to `T * const *` (`*const *mut T`) but not to `const T * const *` (`*const *const T`). Therefore, prototyping `execv` as taking `const char * const argv[]` would have broken existing code (by requiring an explicit cast), despite being more correct. So, even though these functions don't need mutable data, they're prototyped as if they do.

While it's theoretically possible that an implementation could choose to use its freedom to modify the mutable data, such an implementation would break the innumerable users of `execv`-family functions that call them with string literals. Such an implementation would also break `std::process`, which currently works around this with an unsafe `as *mut _` cast, and assumes that `execvp` secretly does not modify its argument. Furthermore, there's nothing useful to be gained by being able to write to the strings in `argv` themselves but not being able to write to the array containing those strings (you can't reorder arguments, add arguments, increase the length of any parameter, etc.).

So, despite the C prototype with its legacy C problems, it's simpler for everyone for Rust to consider these functions as taking `*const *const c_char`, and it's also safe to do so. Rust does not need to expose the mistakes of C here. This patch makes that change, and drops the unsafe cast in `std::process` since it's now unnecessary.
@bors
Copy link
Contributor

bors commented Jun 21, 2015

⌛ Testing commit 058a0f0 with merge dedd430...

@bors bors merged commit 058a0f0 into rust-lang:master Jun 21, 2015
@geofft geofft deleted the execve-const branch June 22, 2015 20:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants