Add some sort of exec command. #84

Closed
fiveNinePlusR opened this issue Oct 9, 2017 · 19 comments

@fiveNinePlusR

I am sure this utility could improve upon find's exec command to make a common thing easily accessible and powerful. xargs is not an awesome way to do it because of limitations with spaces in filenames.

Interactive exec would be nice to have too. Run fd --interactive-exec some_query and you are prompted for a command that doesn't need any special delimiters; placeholders like %file, %path, %filename etc. in the command get parsed and replaced with a properly escaped and quoted string.

Just spitballing ideas here as this is not a fully baked idea.

@Detegr
Contributor

Detegr commented Oct 9, 2017

xargs is not an awesome way to do it because of limitations with spaces in filenames

fd supports -0 to use with xargs to support spaces in filenames:

$ fd foo -0 | xargs -0 ls
'foo bar.txt'
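
For readers wondering why -0 helps: NUL can never appear inside a filename, so splitting on it is unambiguous even when names contain spaces or newlines. A minimal sketch of the consuming side, assuming a hypothetical helper split_nul (not part of fd or xargs):

```rust
/// Split a NUL-delimited byte buffer (the shape of `fd -0` output) into entries.
/// `split_nul` is a hypothetical helper used only for illustration.
fn split_nul(input: &[u8]) -> Vec<String> {
    input
        .split(|&b| b == 0) // NUL is the only separator; spaces are left alone
        .filter(|chunk| !chunk.is_empty()) // skip the empty piece after a trailing NUL
        .map(|chunk| String::from_utf8_lossy(chunk).into_owned())
        .collect()
}
```

Calling split_nul(b"foo bar.txt\0baz.txt\0") yields two entries, with the space in the first name preserved.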

@indeyets

indeyets commented Oct 9, 2017

The main point of -exec is not the escaping, but the shell's limit on the number of arguments. xargs cannot execute a command if there are too many of those. -exec, on the other hand, is not limited by shell as it executes commands sequentially one at a time

@Ziggoto

Ziggoto commented Oct 9, 2017

Up

@sharkdp
Owner

sharkdp commented Oct 9, 2017

It would be great if someone could come up with a specific plan on how this would work. Do we want to clone find's -exec behavior? find's -execdir behavior? Is there anything that we want to do differently? Anything we could improve?

@gsquire
Contributor

gsquire commented Oct 9, 2017

I reach for -exec the majority of the time and would think that is most popular. I know {} is useful for explicitly saying the command uses the match as an argument so it would be great to support that. Perhaps instead of passing a semicolon to terminate the command, fd could just use all arguments after the -exec flag?

@sharkdp
Owner

sharkdp commented Oct 10, 2017

-exec, on the other hand, is not limited by shell as it executes commands sequentially one at a time

That's actually one thing that we have to think about when writing up a proposal on how this should be implemented.

Since fd searches files in parallel, we could (in principle) execute the necessary child processes in parallel. This would potentially lead to a significant speed-up compared to a sequential -exec but could also complicate things a lot. That would be somewhat similar to fd -0 .. | parallel -0 ....

@indeyets

It was brought to my attention that, actually, xargs -n 100 -P 4 will execute commands in batches of 100 arguments using up to 4 parallel processes. So, according to the unix-way, there's no problem and the case can be closed. Sorry for my ignorance.
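
The batching half of that (the -n 100 part) can be sketched roughly as follows. This is only an illustration of the idea, not how xargs is implemented; the helper name batches is made up, and the -P parallelism is not modeled here:

```rust
/// Group `items` into batches of at most `batch_size` entries, similar in
/// spirit to `xargs -n`. `batches` is a hypothetical name for illustration.
fn batches<'a>(items: &'a [&'a str], batch_size: usize) -> Vec<Vec<&'a str>> {
    // `chunks` panics on a size of 0, so clamp to at least 1.
    items
        .chunks(batch_size.max(1))
        .map(|chunk| chunk.to_vec())
        .collect()
}
```

Each inner vector would become the argument list of one command invocation, so no single invocation exceeds the chosen size.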

It might be an opportunity for optimization, but, probably, it would be enough to document why -exec is not needed

@sharkdp
Owner

sharkdp commented Oct 10, 2017

@indeyets Thank you, I also didn't know that.

I'm also starting to think that it might be better to not include -exec in fd. And I agree, it would be in line with the "unix-way" and also the goals of fd (to be a simple and easily understandable tool).

I don't want to make any decision yet, though. Please keep the discussion going 👍

@fiveNinePlusR
Author

Perhaps it would be sufficient to add in -exec with a small help description of how to use it in conjunction with xargs? I don't know enough about every single idiosyncrasy of xargs and find to know if there are other issues.

@mmstick
Contributor

mmstick commented Oct 13, 2017

I could easily implement this within a day, and possibly parallel command executions via a job pool, too. I've a lot of experience with this kind of software in Rust. I would recommend following GNU Parallel's syntax for command generation though. It's much more flexible than find's limited command generation capabilities. Syntax is a good fit with file-based command generation, too.

{} - simple placeholder token
{.} - remove the extension
{/} - basename
{/.} - basename without extension
{//} - parent path

And bonus token:

{^abc...} - remove a custom suffix
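
As a rough illustration of what the five basic tokens would expand to, here is a sketch built on std::path::Path. The function names expand and lossy are hypothetical, the custom-suffix token is left out, and edge cases (for example, {//} on a bare filename gives an empty string here, where GNU Parallel would print ".") are not handled:

```rust
use std::path::Path;

/// Convert a path to an owned String, replacing invalid UTF-8 if necessary.
fn lossy(p: &Path) -> String {
    p.to_string_lossy().into_owned()
}

/// Expand one GNU Parallel-style token for `input`.
/// Returns None for tokens this sketch does not know about.
fn expand(token: &str, input: &str) -> Option<String> {
    let path = Path::new(input);
    match token {
        "{}" => Some(input.to_string()),                 // the input as-is
        "{.}" => Some(lossy(&path.with_extension(""))), // remove the extension
        "{/}" => path.file_name().map(|n| n.to_string_lossy().into_owned()), // basename
        "{/.}" => path.file_stem().map(|n| n.to_string_lossy().into_owned()), // basename without extension
        "{//}" => path.parent().map(lossy),              // parent path
        _ => None,
    }
}
```

For the input music/song.flac, the tokens expand to music/song.flac, music/song, song.flac, song and music respectively.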

@sharkdp
Owner

sharkdp commented Oct 13, 2017

@mmstick That sounds good 😄

Given that you wrote a parallel-clone in Rust, why do you think it would be beneficial to add --exec to fd (instead of just piping to parallel)?

Before we implement a feature like this, I would like to see at least a short outline on how this feature would work exactly (which command-line options would be added? what would be the syntax of the --exec argument? which new dependencies would we have to pull in? how would this interfere with other features of fd?).

I would recommend following GNU Parallel's syntax for command generation though. It's much more flexible than find's limited command generation capabilities.

I should learn more about parallel...

@fiveNinePlusR
Author

fiveNinePlusR commented Oct 13, 2017

Interesting syntax for that... If you did add this to the utility, would it also be prudent to add in something like the following?

{basename}
{no_extension}
{basename_no_extension}

or you could do this to make parsing easier:

{{basename}}
{{no_extension}}
etc. 

This would make it more explicit. The other tokens would still be matched as well.

The idea is analogous to -v and --verbose: one is short and cryptic, the other is long and explicit to the reader.
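
One way such long names could coexist with the short ones is a simple alias table that maps both spellings to a single canonical token. This is only a sketch of the suggestion; the function name canonical is made up, and the long names are the ones proposed in this comment, not an existing syntax:

```rust
/// Map a long or short placeholder spelling to its canonical short token.
/// Returns None for anything that is not a known placeholder.
fn canonical(token: &str) -> Option<&'static str> {
    match token {
        "{}" => Some("{}"),
        "{no_extension}" | "{.}" => Some("{.}"),
        "{basename}" | "{/}" => Some("{/}"),
        "{basename_no_extension}" | "{/.}" => Some("{/.}"),
        _ => None,
    }
}
```

With this shape, the rest of the command generation only ever has to deal with the short tokens.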

@mmstick
Contributor

mmstick commented Oct 13, 2017

It's just a rather simple feature that can easily exist within its own standalone module. The main benefit would just be cutting out the middle man. I would think that just the --exec flag would be good enough, and syntax would look similar to GNU Parallel (minus the manual supplying of arguments and permuting inputs).

fd *.flac -type f -exec 'ffmpeg -i {} -c:a libopus {.}.opus'
fd *.flac -type f -exec ffmpeg -i {} -c:a libopus {.}.opus

How It Would Work

You could simply make it so that all arguments following -exec are treated as the command to use, and if no placeholder tokens are used, then simply append the input arguments to the end of each generated command.

When parsing the arguments and seeing a command, you'd parse the command into a vector of string references and tokens. Something like an Option<Vec<Token<'a>>> field which, if set to Some, tells the program to use the contents of that field to generate and execute commands.
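
A minimal sketch of that parsing step, together with the append-when-no-placeholder rule from the previous paragraph. The Token variants and the tokenize/generate names are assumptions made for illustration, only {} and {.} are handled, and the output is a plain string with no quoting or escaping, which a real implementation would need to take care of:

```rust
use std::path::Path;

/// One piece of a parsed command template (hypothetical shape).
#[derive(Debug, PartialEq)]
enum Token<'a> {
    Text(&'a str), // literal text, borrowed from the template
    Placeholder,   // `{}`: the matched path
    NoExt,         // `{.}`: the matched path without its extension
}

/// Split `template` into literal text and placeholder tokens.
fn tokenize(template: &str) -> Vec<Token<'_>> {
    let mut tokens = Vec::new();
    let mut text_start = 0;
    let mut i = 0;
    while i < template.len() {
        let rest = &template[i..];
        let (token, len) = if rest.starts_with("{}") {
            (Token::Placeholder, 2)
        } else if rest.starts_with("{.}") {
            (Token::NoExt, 3)
        } else {
            // Not a placeholder: step over one character (UTF-8 aware).
            i += rest.chars().next().map_or(1, char::len_utf8);
            continue;
        };
        if text_start < i {
            tokens.push(Token::Text(&template[text_start..i]));
        }
        tokens.push(token);
        i += len;
        text_start = i;
    }
    if text_start < template.len() {
        tokens.push(Token::Text(&template[text_start..]));
    }
    tokens
}

/// Build the command line for one matched `path`.
/// If the template has no placeholder, the path is appended at the end.
fn generate(tokens: &[Token<'_>], path: &str) -> String {
    let mut out = String::new();
    let mut used_placeholder = false;
    for token in tokens {
        match token {
            Token::Text(text) => out.push_str(text),
            Token::Placeholder => {
                out.push_str(path);
                used_placeholder = true;
            }
            Token::NoExt => {
                out.push_str(&Path::new(path).with_extension("").to_string_lossy());
                used_placeholder = true;
            }
        }
    }
    if !used_placeholder {
        out.push(' ');
        out.push_str(path);
    }
    out
}
```

Because the template is tokenized once up front, generating a command per search result is just a walk over the token vector.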

The default could be to just execute commands serially. A job pool can easily work if we string together an Arc<Mutex<VecDeque<T>>> to share across threads. If we want to capture the results and have them printed serially, we could just create an Arc<Mutex<IntMap<usize, File>>> to store the FDs of the executed commands for the main thread to grab from and print in a serial fashion.
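
A rough sketch of such a job pool, using only the standard library. The function name run_pool is made up, and instead of the IntMap of file descriptors mentioned above, this version simply keeps each result in memory tagged with its input index and sorts at the end, which is a simplification rather than the proposed design:

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;

/// Run `job` on every item using `workers` threads that pull work from a
/// shared queue. Results come back in input order so they can be printed
/// serially by the caller.
fn run_pool<T, R, F>(items: Vec<T>, workers: usize, job: F) -> Vec<R>
where
    T: Send + 'static,
    R: Send + 'static,
    F: Fn(T) -> R + Send + Sync + 'static,
{
    // Tag every item with its position so the order can be restored later.
    let queue: Arc<Mutex<VecDeque<(usize, T)>>> =
        Arc::new(Mutex::new(items.into_iter().enumerate().collect()));
    let job = Arc::new(job);

    let handles: Vec<_> = (0..workers.max(1))
        .map(|_| {
            let queue = Arc::clone(&queue);
            let job = Arc::clone(&job);
            thread::spawn(move || {
                let mut done = Vec::new();
                loop {
                    // The lock is held only while popping, not while the job runs.
                    let next = queue.lock().unwrap().pop_front();
                    match next {
                        Some((index, item)) => done.push((index, (*job)(item))),
                        None => break, // queue drained, this worker is finished
                    }
                }
                done
            })
        })
        .collect();

    // Collect every worker's results and restore the original order.
    let mut results: Vec<(usize, R)> = handles
        .into_iter()
        .flat_map(|handle| handle.join().unwrap())
        .collect();
    results.sort_by_key(|(index, _)| *index);
    results.into_iter().map(|(_, result)| result).collect()
}
```

In an --exec setting, job would spawn the generated command for one path and return its captured output; here it can be any closure, for example run_pool(vec![1, 2, 3], 2, |n| n * 10).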

@mmstick
Contributor

mmstick commented Oct 13, 2017

If anyone is against it, I could also implement it as an optional feature, gated behind conditional compilation.

@sharkdp
Owner

sharkdp commented Oct 13, 2017

@mmstick Sounds good, thank you very much for writing this up!

I'm certainly not against this feature, but I'm curious what the advantage is over using parallel/xargs?

@mmstick
Contributor

mmstick commented Oct 13, 2017

Having to bring out parallel / xargs results in having to execute a command to execute commands. So instead of fd -> parallel -> shell procs -> commands, you'd just have fd -> shell procs -> commands. If you wanted to go a step further, you could also directly embed the Ion shell as a library, and then you'd just have fd -> commands. Put simply, you'd win benchmarks. No real advantages other than that.

@sharkdp
Owner

sharkdp commented Oct 13, 2017

Fair enough, in this case, let's go for it 😄. The feature has been asked for by a lot of people. Your help/contribution would be very much appreciated.

fd *.flac -type f -exec 'ffmpeg -i {} -c:a libopus {.}.opus'
fd *.flac -type f -exec ffmpeg -i {} -c:a libopus {.}.opus

I think I would prefer to only support the first variant of this, i.e. just a normal command line option --exec/-e that takes a single argument. This option could appear anywhere on the command line, before or after the pattern, just like any other option and flag.

It seems to me that the other variant would be a possible source of confusion/errors.

If anyone is against it, I could also implement it as an optional feature, gated behind conditional compilation.

We could still do this afterwards if it turns out we want this to be configurable. Right now, I don't see any need for this - but thanks for the suggestion.

Another thing that just comes to my mind is platform-independence. Are there any complications that we could run into?

@mmstick
Contributor

mmstick commented Oct 13, 2017

Another thing that just comes to my mind is platform-independence. Are there any complications that we could run into?

This can be implemented in a platform-independent manner.

@mmstick
Contributor

mmstick commented Oct 14, 2017

I'll have a PR submitted later today, once I've documented and refactored what I've written so far.

mmstick added a commit to mmstick/fd that referenced this issue Oct 14, 2017
@sharkdp sharkdp added this to the v5.0 milestone Oct 19, 2017

7 participants