TODO #1

Closed
9 tasks done
shenwei356 opened this issue Jan 6, 2017 · 11 comments

Comments

@shenwei356
Owner

shenwei356 commented Jan 6, 2017

  • add example of -v
  • implement retry interval
  • add more examples on bioinformatics
  • do not send empty data
  • support continue
  • test more on Windows
  • avoid mixed lines from multiple processes, e.g. the first half of a line comes from one process and the second half from another.
  • replacement string {^suffix} for removing suffix
  • add flag --eta
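
One way to approach the mixed-line item is to buffer each process's output and forward only complete lines while holding a shared lock. The following is a minimal Go sketch of that idea, not rush's actual implementation; the names lineWriter, Flush and demo are hypothetical, and it assumes each writer is fed by a single goroutine (as happens when it is assigned to one command's Stdout).

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"sync"
)

// lineWriter (hypothetical name) buffers the bytes of one process and
// forwards only complete lines to the shared destination.
type lineWriter struct {
	mu  *sync.Mutex // shared by every writer that targets the same dst
	dst io.Writer
	buf bytes.Buffer
}

// Write keeps a partial trailing line in the buffer and passes everything
// up to the last newline to dst in a single locked write.
func (w *lineWriter) Write(p []byte) (int, error) {
	w.buf.Write(p)
	data := w.buf.Bytes()
	i := bytes.LastIndexByte(data, '\n')
	if i < 0 {
		return len(p), nil // no complete line yet
	}
	w.mu.Lock()
	_, err := w.dst.Write(data[:i+1])
	w.mu.Unlock()
	w.buf.Next(i + 1) // drop what was forwarded, keep the partial tail
	return len(p), err
}

// Flush forwards a trailing line without a newline, e.g. at process exit.
func (w *lineWriter) Flush() error {
	if w.buf.Len() == 0 {
		return nil
	}
	w.mu.Lock()
	defer w.mu.Unlock()
	_, err := w.dst.Write(w.buf.Bytes())
	w.buf.Reset()
	return err
}

// demo simulates two processes whose writes interleave in time.
func demo() string {
	var mu sync.Mutex
	var out bytes.Buffer
	a := &lineWriter{mu: &mu, dst: &out}
	b := &lineWriter{mu: &mu, dst: &out}
	a.Write([]byte("first half of A, ")) // held back: no newline yet
	b.Write([]byte("whole line from B\n"))
	a.Write([]byte("second half of A\n"))
	a.Flush()
	b.Flush()
	return out.String()
}

func main() {
	fmt.Print(demo())
}
```

The trade-off is that a line is delayed until its newline arrives, so a process that prints progress without newlines shows nothing until Flush is called.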
@mattn

mattn commented Jan 23, 2017

Please add automatic detection of whether a shell is needed (like this: https://github.com/mmstick/parallel/blob/0dd48100e9a29d9a023826c778a5c7e70f9bf464/src/execute/exec_inputs.rs#L40-L45).
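
A rough Go sketch of such a check, under stated assumptions: it treats a command as needing a shell whenever it contains a character that a POSIX shell interprets specially. The character set and the name needsShell are assumptions made for illustration; this is not taken from the linked Rust code or from rush, and it should run after replacement strings such as {} have been substituted.

```go
package main

import (
	"fmt"
	"strings"
)

// shellMeta is an assumed, conservative set of characters that a POSIX
// shell treats specially (separators, redirection, expansion, quoting).
const shellMeta = ";&|<>()$`\\\"'*?[#~\n"

// needsShell (hypothetical helper) reports whether cmd contains any
// character that requires a shell to interpret. Commands without such
// characters could be split on whitespace and executed directly.
func needsShell(cmd string) bool {
	return strings.ContainsAny(cmd, shellMeta)
}

func main() {
	for _, c := range []string{"echo hello", "foo; bar", "gzip -d a.gz > a"} {
		fmt.Printf("%-20q needs shell: %v\n", c, needsShell(c))
	}
}
```

Erring on the side of spawning a shell keeps behavior correct: a false positive only costs speed, while a false negative would run the wrong command.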

@shenwei356
Owner Author

shenwei356 commented Jan 23, 2017

Please add automatic detection of whether a shell is needed.

OK. I'll use mattn/go-shellwords

@mattn

mattn commented Jan 23, 2017

Sorry, go-shellwords doesn't detect multiple commands like foo; bar. BTW, my guess is that whether a shell is spawned explains why Go is faster than Rust in this result.

https://www.reddit.com/r/rust/comments/5penft/parallelizing_enjarify_in_go_and_rust/dcr4y7f/

@shenwei356
Owner Author

I think it is fine to run all commands through the shell ($SHELL -c on *nix and %COMSPEC% /c on Windows), both single commands and multiple commands like foo; bar.
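
The always-use-a-shell approach could look roughly like the Go sketch below. The helper name shellArgv and the fallbacks to sh and cmd when the environment variables are unset are assumptions for illustration, not rush's actual code.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"runtime"
)

// shellArgv (hypothetical helper) returns the argument vector that runs
// command through the platform shell: "%COMSPEC% /c command" on Windows
// and "$SHELL -c command" elsewhere. The "cmd" and "sh" fallbacks for
// unset variables are assumptions of this sketch.
func shellArgv(goos, shell, comspec, command string) []string {
	if goos == "windows" {
		if comspec == "" {
			comspec = "cmd"
		}
		return []string{comspec, "/c", command}
	}
	if shell == "" {
		shell = "sh"
	}
	return []string{shell, "-c", command}
}

func main() {
	// "echo foo; echo bar" is a POSIX-shell example; cmd.exe separates
	// commands with & instead of ;.
	argv := shellArgv(runtime.GOOS, os.Getenv("SHELL"), os.Getenv("COMSPEC"), "echo foo; echo bar")
	out, err := exec.Command(argv[0], argv[1:]...).CombinedOutput()
	if err != nil {
		fmt.Fprintln(os.Stderr, "run failed:", err)
		return
	}
	fmt.Print(string(out))
}
```

Passing the whole command as one argument leaves all parsing to the shell, which is why single and multiple commands need no separate handling.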

@mattn

mattn commented Jan 23, 2017

What I mean is why Rust is always faster. :)
If rush can avoid spawning a shell, it will be faster, I guess.

@shenwei356
Owner Author

I get it. Thank you.

@mmstick

mmstick commented Jan 25, 2017

@mattn Running commands within a shell has very little overhead for my Rust implementation when you follow the recommendation to install dash. Here's a comparison of times with and without the shell:

Without Shell

seq 1 10000 | time -v target/x86_64-unknown-linux-musl/release/parallel 'echo {}' > /dev/null

User time (seconds): 0.40
System time (seconds): 2.68
Percent of CPU this job got: 93%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:03.29

5489.640372 task-clock:u (msec)

With Shell

These are times when the shell is enabled (with dash-static-musl installed)

seq 1 10000 | time -v target/x86_64-unknown-linux-musl/release/parallel 'echo {}; echo {}' > /dev/null

User time (seconds): 0.35
System time (seconds): 2.56
Percent of CPU this job got: 128%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:02.27

4593.366103 task-clock:u (msec)

Believe it or not, the shell path with dash is actually faster than the no-shell path (2.27 s versus 3.29 s wall clock in the runs above). I will be investigating that to see where the bottleneck is in the no-shell code path.

@shenwei356
Owner Author

@mmstick The Rust implementation is indeed faster for this test. And the Go API for running a process needs to call $SHELL -c, so I did not compare the case without a shell.

What confused me was why rush_linux_amd64 performed so badly on your two computers. On my laptop, for the test seq 1 10000 | time -v $CMD 'echo {}' > /dev/null, rust-parallel is ~4X faster than rush, but it was >100X faster on your computers.

Here's a fresh result:

$ for cmd in parallel rust-parallel rush; do echo $cmd; seq 1 10000 | time -v $cmd 'echo {}' > /dev/null; done
parallel
        Command being timed: "parallel echo {}"
        User time (seconds): 28.73
        System time (seconds): 30.66
        Percent of CPU this job got: 185%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:32.04

rust-parallel
        Command being timed: "rust-parallel echo {}"
        User time (seconds): 3.13
        System time (seconds): 4.82
        Percent of CPU this job got: 312%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:02.54

rush
        Command being timed: "rush echo {}"
        User time (seconds): 12.81
        System time (seconds): 24.45
        Percent of CPU this job got: 274%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:13.57

Besides, speed is not the No. 1 goal for rush right now, especially for long-running processes. I use it every day in my bioinformatics analyses and keep trying to improve its usability and stability.

@mmstick

mmstick commented Jan 25, 2017

Do you have any AMD hardware? Both of my systems are powered by AMD, so that could be one reason. It could also be the Intel CPU governor failing to hold its max frequency long enough.

Basically, before I perform my benchmarks, I ensure that all software is closed, that the CPU governor is set to performance via sudo cpupower frequency-set -g performance, and that transparent_hugepages is set to madvise via sudo sh -c "echo madvise > /sys/kernel/mm/transparent_hugepage/enabled". The Linux distribution that I am running is Arch Linux, and I have dash-static-musl installed because of its high performance.

@mfasold

mfasold commented Feb 14, 2018

Would it be possible to process a set of commands that is specified in a file, for example like the "::::" argument in GNU parallel?

@shenwei356
Owner Author

@mfasold -i file.txt
