New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ncpus argument and install2.r #62

Closed
csgillespie opened this Issue Dec 30, 2018 · 14 comments

Comments

Projects
None yet
2 participants
@csgillespie
Copy link
Contributor

csgillespie commented Dec 30, 2018

I don't think it's possible to use Ncpus with your neat install2.r script. Would you be open to a PR to include this functionality?

@eddelbuettel

This comment has been minimized.

Copy link
Owner

eddelbuettel commented Dec 30, 2018

That sounds like a great idea, and should be pretty simple. Both -n are -p are open. Preferences?

@csgillespie

This comment has been minimized.

Copy link
Contributor Author

csgillespie commented Dec 30, 2018

My preference would be for -n. What do you think about having a slight tweak to the input? I quite often set Ncpus equal to the number of cores. This would translate to:

  • Default: n = 1 -> (Ncpus = 1)
  • n = X -> (Ncpus = X)
  • n = 0 -> Ncpus = max(1, parallel::detectCores())
@eddelbuettel

This comment has been minimized.

Copy link
Owner

eddelbuettel commented Dec 30, 2018

I like the option of 'maxing out' but I am not sure I like zero for that special value. Hm. Minus one? NA? NULL?

I am indifferent between -n or -p (for "proceses") or even -c (for "cores"). All suitable.

@csgillespie

This comment has been minimized.

Copy link
Contributor Author

csgillespie commented Dec 30, 2018

I quite like the -c and -1 combination.

Do you have a preference for the default? Either Ncpus = 1 or Ncpus = getOption("Ncpus", 1L) to match install.packages()

@eddelbuettel

This comment has been minimized.

Copy link
Owner

eddelbuettel commented Dec 30, 2018

I like Ncpus = getOption("Ncpus", 1L) as we can sail with the wind of that existing option.

@eddelbuettel

This comment has been minimized.

Copy link
Owner

eddelbuettel commented Dec 30, 2018

Main question now if we call it -c for cores or cpus when we both know it processes. Maybe coroutines ;-)

@csgillespie

This comment has been minimized.

Copy link
Contributor Author

csgillespie commented Dec 30, 2018

That just occurred to me as I wrote the doc string. Back to -n ?

Also, I'll go for NULL as that matches (sort of) the repo logic

@eddelbuettel

This comment has been minimized.

Copy link
Owner

eddelbuettel commented Dec 30, 2018

Sure.

@eddelbuettel

This comment has been minimized.

Copy link
Owner

eddelbuettel commented Dec 30, 2018

Looks good. Thanks for catching the rogue = which is definitely bad style :)

I am wondering if we should fold the two branches into one. Ie detect "getOption" as you do but then default to detect the system cores as a fallback, ie

if (opt$ncpus == "getOption") {
  # parallel comes with R 2.14+
  opt$ncpus <- getOption("Ncpus", max(1L, parallel::detectCores()))
}

Thoughts? It may well be too aggressive as it would hog all cores unless told otherwise...

Similarly, two lines in the --help text is one more than every other option gets and something I'll probably clean up. Installing Rcpp as an illustration is also weird as that one does not have that many dependencies (ie: none) to benefit from Ncpus which after all only benefits multiple packages at once. Maybe ggplot2 or lme4 or ... ?

@csgillespie

This comment has been minimized.

Copy link
Contributor Author

csgillespie commented Dec 30, 2018

  • Rcpp: I didn't check the number of dep'. Just assumed it was > 1. Changing to ggplot2 makes sense.

  • If we fold the branches, we will change the default behaviour of install.packages() and also the current behaviour of install2.r (as you note). I'm happy with this change, and can't imagine why it's bad. To make it less aggressive, we could use max(1, parallel::detectCores() - 1)

  • Two lines in help: Yep I didn't like that either, but it's a side effect of having a "special" option.

@eddelbuettel

This comment has been minimized.

Copy link
Owner

eddelbuettel commented Dec 30, 2018

I think that is still too much. We share a big box at work with 36 hyperthreaded cores. If someone uses it there... For example mclapply() does a simple getOption("mc.cores", 2L) giving us 2. install.packages() defaults to 1 unless told otherwise.

We'll make it one line in --help but describe it in the main help body.

@csgillespie

This comment has been minimized.

Copy link
Contributor Author

csgillespie commented Dec 31, 2018

Updated the PR:

  • Rcpp -> ggplot2
  • Removed one of the examples in help
@eddelbuettel

This comment has been minimized.

Copy link
Owner

eddelbuettel commented Dec 31, 2018

It's good. And the double branch is probably also what we need to set options properly. No need to get too cute.

One type as you use two hyphens: --n where it should be one -n. I will clean that up.

@eddelbuettel

This comment has been minimized.

Copy link
Owner

eddelbuettel commented Dec 31, 2018

Also four space indent not two but all minor -- taken care of now.

Thanks again for this. Very nice enhancement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment