Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Add size-limited command interface #74549

Closed
wants to merge 20 commits into from

Conversation

Artoria2e5
Copy link
Contributor

@Artoria2e5 Artoria2e5 commented Jul 20, 2020

As described in #40384, most systems put an upper limit to the size of commands. The std::process::Command interface should therefore be updated to handle this problem and detect when the OS finds the command-line too big.

Progress on this stuff:

  • learn to rust
  • calculate size
  • try_add_arg
    • Unix: cache envp size, invalidate on env_mut()
  • make a CommandSized trait
  • add an xargs like interface
    • get size of item (no adding)
    • something like xargs(program: S, stuff: impl IntoIterator<Item=S>, before: Vec[S], after: Vec[S]) -> Vec[Command]
  • Windows CommandExt for unescaped arg (Command does not escape arguments as expected on windows #29494)
    • get xargs to allow it
  • make sense

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @LukasKalbertodt (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 20, 2020
@bors
Copy link
Contributor

bors commented Jul 23, 2020

☔ The latest upstream changes (presumably #74667) made this pull request unmergeable. Please resolve the merge conflicts.

@LukasKalbertodt
Copy link
Member

Hi @Artoria2e5 and thanks for the PR so far.

Would you like to have a review/feedback of the state so far? Or are you still actively working on it? In any case, just let me know!

@bors
Copy link
Contributor

bors commented Jul 28, 2020

☔ The latest upstream changes (presumably #73265) made this pull request unmergeable. Please resolve the merge conflicts.

@Artoria2e5
Copy link
Contributor Author

A few rebases may happen given the recent change in file layout, so I think I am not ready yet.

@LukasKalbertodt
Copy link
Member

Sure. Just let me know if you need any help or if I should take a look at the changes.

I still need to come up witth a place for the trait to go. Honestly
I don't know where to put the file yet.

This thing being quadratic is a problem too. I guess I will make a
little struct with the current length and argc in it to keep track.
(And yeah it only works as long as envp is unchanged.)
It looks like I really can't get away with the whole storing error
thing as they don't have Copy. Maybe I will end up using the ugly
problem enum thing like I did with Unix too. Grrr...
@Artoria2e5
Copy link
Contributor Author

Artoria2e5 commented Jul 29, 2020

@LukasKalbertodt I think I can ask for a very early review now. Most of my questions are in the commit messages, but generic pointers hints with Rust will help a lot too.

The main problem I have seems to be I really don't know where to put the trait.

I am bad at this...

pub fn get_size(&mut self) -> io::Result<usize> {
use crate::mem;
let argv = &self.argv.0;
let argv_size: usize = argv.iter().map(|x| unsafe { strlen(*x) + 1 }).sum::<usize>()
Copy link
Contributor Author

@Artoria2e5 Artoria2e5 Jul 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh great I forgot that the last element is a nullptr!

But seriously I should this without strlen... like accumulating a length in add when it's still an OsStr.

let program = rprogram.as_ref().unwrap_or(&self.program);

// Prepare and terminate the application name and the cmdline
// XXX: this won't work for 16-bit, might be preferable to do a extend_from_slice
Copy link
Contributor Author

@Artoria2e5 Artoria2e5 Jul 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better do an append to the cmdline in new(). Oh I forgot it won't work because of the PATH resolving above. Gotta leave it for now...

sys::os::set_errno(0);
let limit = libc::sysconf(libc::_SC_ARG_MAX);
let errno = sys::os::errno();
sys::os::set_errno(old_errno);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not necessary, right? I think it is expected that any function in libstd that interacts with the OS in some way can clobber errno.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I didn't know that!

@jyn514 jyn514 added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Aug 30, 2020
@@ -75,11 +76,14 @@ pub struct Command {
args: Vec<CString>,
argv: Argv,
env: CommandEnv,
env_size: Option<usize>,
arg_max: Option<isize>,
Copy link
Contributor

@pickfire pickfire Aug 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can arg_max possibly be Some(0)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wouldn't make sense to be that. Are you suggesting a naked isize?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either that or NonZero types.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks cool. Let me get it changed.

self.arg_max = limit.try_into().ok();
}
}
Ok(self.arg_max.unwrap() < 0 || self.get_size()? < (self.arg_max.unwrap() as usize))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't arg_max be usize so it won't be less than zero?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, a negative value indicates that the arg size is unlimited. If my reading of POSIX is right, that is.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC -0 of usize is still -1 for unlimited.

Comment on lines +267 to +268
let old_errno = sys::os::errno();
sys::os::set_errno(0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should we clear existing errno?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the no-error case automatically clears errno. Does it?

@Dylan-DPC-zz
Copy link

r? @Amanieu

@bors
Copy link
Contributor

bors commented Oct 2, 2020

☔ The latest upstream changes (presumably #77029) made this pull request unmergeable. Please resolve the merge conflicts.

Note that reviewers usually do not review pull requests until merge conflicts are resolved! Once you resolve the conflicts, you should change the labels applied by bors to indicate that your PR is ready for review. Post this as a comment to change the labels:

@rustbot modify labels: +S-waiting-on-review -S-waiting-on-author

@Artoria2e5
Copy link
Contributor Author

I don't really understand why it's trying to compile a Windows thing for Unix…

@Dylan-DPC-zz
Copy link

@Artoria2e5 any updates on this?

@bors
Copy link
Contributor

bors commented Oct 16, 2020

☔ The latest upstream changes (presumably #77666) made this pull request unmergeable. Please resolve the merge conflicts.

Note that reviewers usually do not review pull requests until merge conflicts are resolved! Once you resolve the conflicts, you should change the labels applied by bors to indicate that your PR is ready for review. Post this as a comment to change the labels:

@rustbot modify labels: +S-waiting-on-review -S-waiting-on-author

Copy link
Member

@jyn514 jyn514 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

error[E0432]: unresolved imports `crate::sys::process::Arg`, `crate::sys::process::Problem`, `crate::sys::process::RawArg`
  --> library/std/src/sys/windows/ext/process.rs:11:31
   |
11 | pub use crate::sys::process::{Arg, Problem, RawArg};
   |                               ^^^  ^^^^^^^  ^^^^^^ no `RawArg` in `sys::unix::process`
   |                               |    |
   |                               |    no `Problem` in `sys::unix::process`
   |                               no `Arg` in `sys::unix::process`

I don't really understand why it's trying to compile a Windows thing for Unix…

@Artoria2e5 the documentation for the standard library has a very cursed build system and shows all platforms even though the host is linux: #43348.

// FIXME: This really should be somewhere else, since it will be duplicated for unix. sys_common? I have no idea.
// The implementations should apply to unix, but describing it to the type system is another thing.

That sounds about right. The approach I would take is to have CommandSized in sys_common, but make the trait implementations specific to the OS.

library/std/src/sys/windows/ext/process.rs Outdated Show resolved Hide resolved
library/std/src/sys/windows/ext/process.rs Outdated Show resolved Hide resolved
Comment on lines 149 to 154
/// Build many commands.
fn xargs<I, S, A>(program: S, args: I, before: Vec<A>, after: Vec<A>) -> io::Result<Vec<Self>>
where
I: IntoIterator<Item = A>,
S: AsRef<OsStr> + Copy,
A: Arg;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this do that maybe_args doesn't?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like its namesake, xargs is supposed to create a new command to fit in additional parameters when there's too many of them. As a result it returns possibly many commands.

It can still fail from a single oversized parameter being impossible to fit.

Comment on lines +101 to +104
pub enum Problem {
SawNul,
Oversized,
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd make the name of the error more descriptive.

Suggested change
pub enum Problem {
SawNul,
Oversized,
}
pub enum ArgumentError {
SawNul,
Oversized,
}

Comment on lines +110 to +113
fn append_to(&self, cmd: &mut Vec<u16>, force_quotes: bool) -> Result<usize, Problem>;
#[unstable(feature = "command_sized", issue = "74549")]
fn arg_size(&self, force_quotes: bool) -> Result<usize, Problem>;
fn to_os_string(&self) -> OsString;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other functions need annotations too, I think.

@@ -133,6 +190,13 @@ impl Command {
pub fn creation_flags(&mut self, flags: u32) {
self.flags = flags;
}
#[allow(dead_code)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this for?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seem to be a lot of these, some pre-existing 🤷

@jyn514 jyn514 added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Oct 30, 2020
@Artoria2e5
Copy link
Contributor Author

I can't seem to get the nonzero stuff working: I got hit with E0423 where I want to construct it.

That sounds about right. The approach I would take is to have CommandSized in sys_common, but make the trait implementations specific to the OS.

I think I can get away with that. The unix version can call the normal maybe_arg() while the windows can go on the ext functions. The outward signature is going to stay as the Arg trait, I think?

@Dylan-DPC-zz Dylan-DPC-zz removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 21, 2020
@jyn514
Copy link
Member

jyn514 commented Nov 25, 2020

@Artoria2e5 yes, the Arg trait is going to be the same, but moved into one place so we can be sure it's not different between platforms.

@Dylan-DPC-zz
Copy link

@Artoria2e5 thanks for taking the time to contribute. I have to close this due to inactivity. If you wish and you have the time you can open a new PR with these changes and we'll take it from there. Thanks

@camelid camelid added the S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. label Jan 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants