Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle special characters in commands - maybe skip the shell #158

Closed
SeanTater opened this issue Dec 21, 2020 · 8 comments · Fixed by #159
Closed

Handle special characters in commands - maybe skip the shell #158

SeanTater opened this issue Dec 21, 2020 · 8 comments · Fixed by #159

Comments

@SeanTater
Copy link

I'm using this command to do some video transcoding, and many of the filenames include parenthesis, single and double quotes and spaces.
Right now, I have to pipe them through sed because all of those are interpreted by the shell when pueue calls it. I wouldn't think much of it except that if the shell weren't involved at all, this wouldn't be a problem.
Is it reasonable to opt-out of a shell for new commands, or is the shell too central to pueue for that?

This is the specific case I'm encountering at the moment, to give an idea.

sean@bravo ~> pueue log 1515
Task 1515 failed with exit code 2
Command: av1-2 h264/Season 06/Home Improvement S06 E17 Wilson's World.m4v
Path: /home/sean
Start: Mon, 21 Dec 2020 07:33:27 -0500
End: Mon, 21 Dec 2020 07:33:27 -0500

stderr:
sh: -c: line 1: unexpected EOF while looking for matching `''
sh: -c: line 2: syntax error: unexpected end of file

This is an example how I add tasks now:

find h264 -print0 -type f | sed 's/ /\\\\ /g' | xargs -0 -n 1 pueue add av1-2

But I think it would make significantly more sense if I could skip the shell and run something more like

find h264 -print0 -type f | xargs -0 -n 1 pueue add --nosh av1-2
@Nukesor
Copy link
Owner

Nukesor commented Dec 21, 2020

The problem is, that passing arguments to commands on a shell as if it isn't on a shell can be tricky.

Right now, we simply do this:

pub fn compile_shell_command(command_string: &str) -> Command {
    let mut command = Command::new("sh");
    command.arg("-c").arg(command_string);

    command
}

This is fairly straight forward, since we don't have to do any additional argument parsing.
This is all done by the shell.

Let's take for instance this command:
pueue add --no-shell "ls /path/containing/some\ whitespaces.txt /some/other/path.txt"

In this case, we would get this command string:

ls /path/containing/some\ whitespaces.txt /some/other/path.txt

Now we have to do manual parsing of this argument-string similar to the shell's argument parsing.
That means to respect escaped characters and to somehow find out which part of the string is an actual argument

The end result should look something like this:

let mut command = Command::new("ls");
command.arg("/path/containing/some whitespaces.txt")
    .arg("/some/other/path.txt");

To be honest, I already thought about implementing something like this, but I never felt confident enough about any parsing approach.
I probably don't know enough about this topic either.

But I imagine that there are quite a few weird edge-cases, where argument parsing can become tricky, such as -l=1,2,3 parameters or other stuff I cannot even think of right now.

Anyway, I think that this is something that would be nice to have,
but we (or at least I) would definitely need to do some research on how to do this properly.

@Nukesor
Copy link
Owner

Nukesor commented Dec 21, 2020

Ah there's one thing we would have to test with your example mentioned above.

Pueue get's the command that's passed to it as a Vec<String>.
It would be super interesting, if h264/Season 06/Home Improvement S06 E17 Wilson's World.m4v is passed in as a single string of if it's split into several strings.

The --no-shell flag would then probably only work, when not passing the actual command as a single String, but rather multiple strings.

That would then look like this:

pueue add --no-shell -- ls /path/containing/some\ whitespaces.txt /some/other/path.txt

Which should result in this Vec<String>:

vec![
    "ls",
    "/path/containing/some whitespaces.txt",
    "/some/other/path.txt",
]

It feels like I'm missing something. Is it really that easy?

@Nukesor
Copy link
Owner

Nukesor commented Dec 21, 2020

There's a few other things that make this really tricky though.

Whenever we insert, for instance, an alias or whenever we edit the command, we lose any information on which part of the concatenated string belongs to which argument.

This would mean, that we wouldn't be able to use aliases or edit commands for tasks with no-shell.

Furthermore, this makes internal handling of a command pretty ugly.
Right now, it's a simple Vec<String>, which get's joined to a String pretty much at the very beginning and any other logic is then done on that one String.

If we decided to add this additional logic, we would have to change that to Vec<String>, which is actually multiple strings in case of --no-shell and a vec with only a single string in the default case. This feels kind of hacky.

@Nukesor
Copy link
Owner

Nukesor commented Dec 21, 2020

Another solution for your problem might be to offer shell-escape on given parameters for the user.

I.e. you get this String input vector:

vec![
    "ls",
    "/path/containing/some whitespaces.txt",
    "/some/other/path.txt",
]

When providing a flag --shell-escape, every special character is then escaped for each string resulting in

vec![
    "ls",
    "/path/containing/some\ whitespaces.txt",
    "/some/other/path.txt",
]

Which could then be concatenated to a working shell command:
ls /path/containing/some\ whitespaces.txt /some/other/path.txt

To be honest, I really like this approach. Are there any good shell escaping libraries out there?

@SeanTater
Copy link
Author

SeanTater commented Dec 21, 2020

Quite frankly, I think the only sane way is to require the end user to pass exactly as many strings to pueue as they would want passed to the underlying executable. So it's up to me to pass multiple strings if that's what I want to be run. That would fit your ls example from 10:13 AM (EST).

I honestly don't understand what you mean by aliases - I suppose this is a feature of pueue I have never used?
As for editing - that does depend on the idea of a single string, and wouldn't be compatible otherwise. You would need a different command line format (like pueue replace <task_id> <executable> <arg1> <argn>...). I doubt it's important though, because the major use case for --no-shell would be with xargs and it would be automatically formatted in large quantities anyway

I don't see anything that sounds like it would be very handy for shell escaping in crates.io - only for splitting into "shell words" which I suppose is unnecessary.

@Nukesor
Copy link
Owner

Nukesor commented Dec 22, 2020

For now, I feel like doing an internal rework on how commands and processes are handled is just to much work.

The escape-shell crate isn't super big in terms of "stars", but the code looks solid, simple and it should do the job quite well.

I created a draft and tested it with various scenarios. It should do the job for your use-case :)
It would be awesome, if you could also give it a try.

@SeanTater
Copy link
Author

Looks good to me, it compiles and runs but I'll wait for my current queue to finish to try it on my current use case. Thanks Arne!

@Nukesor
Copy link
Owner

Nukesor commented Dec 22, 2020

You're welcome :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants