add possibility to **ply to pass options for backend when working in parallel #84

Closed
ruderphilipp opened this Issue Jun 3, 2012 · 3 comments


Hi,
I currently use plyr to run large parallel jobs on our university's HPC cluster. Unfortunately, some entries of the input data set take much longer to process than others. Since each CPU gets the same number of input values to work on, in bad cases some batches do not finish before the timeout.

I did some research, and at least the multicore package provides an option to disable pre-scheduling, e.g.:

a <- foreach(i=1:10, .options.multicore=list(preschedule=FALSE), .combine=c) %dopar% #...

Would it be possible to add some additional parameter to pass through an optional list of options for foreach?

For a code walkthrough of the callstack and what I want to do see https://gist.github.com/2863086

Originally, I wanted to provide a fix via pull request, since it did not seem difficult to add. But so far I have not managed to pass a list of options directly into the foreach() call.

What I tried so far:

# correct call that I want to reproduce with a dynamic list of options
a <- foreach(i=1:10, .options.multicore=list(preschedule=FALSE), .combine=c)
# always returns NULL when used in a %dopar% call
a <- foreach(i=1:10, NULL, .combine=c)
# foreach sees options vector as input (like i)
o <- c(.options.multicore=list(preschedule=FALSE), .options.abcdef=list(aaa=TRUE))
a <- foreach(i=1:10, o)
a <- foreach(i=1:10, .options=o)
a <- foreach(list(i=1:10, .options=o))
# foreach sees options list as input (like i)
o <- list(multicore=list(preschedule=FALSE), abcdef=list(aaa=TRUE))
a <- foreach(i=1:10, .options=o)
# foreach recognizes options but puts them under "test"
# This is useless for the purpose of providing options for other backends
a <- foreach(i=1:10, .options.test=o)

Any idea how to "flatten" the options list in the method call so that it appears to R as if I entered the parameters directly (like in the first call in the code)?

Owner

hadley commented Oct 8, 2012

What if I made .parallel that option? So if !identical(.parallel, FALSE) then parallel would be turned on.

I think the easiest way to put the options in the right place is to construct the call by hand:

i <- seq_len(10)
options <- list(.options.multicore=list(preschedule=FALSE), .combine=c)
call <- as.call(c(list(as.name("foreach"), i = i), options))
eval(call)
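The same splicing can also be written with `do.call()`, which builds a call from a function and an argument list and should behave the same as constructing the call by hand. This is a minimal sketch, assuming the foreach package is installed. `make_foreach` and `paropts` are hypothetical names used only for illustration, and `%do%` is used so the example runs without a parallel backend. The multicore options only take effect under a backend that reads them, such as doMC with `%dopar%`.

```r
library(foreach)

# Hypothetical helper: splice a named list of options into a foreach() call.
# Each element of `paropts` becomes a separate named argument of foreach().
make_foreach <- function(n, paropts = list()) {
  do.call(foreach, c(list(i = seq_len(n)), paropts))
}

paropts <- list(.options.multicore = list(preschedule = FALSE), .combine = c)
fe <- make_foreach(10, paropts)

# Run sequentially; .combine = c collapses the results into one vector.
res <- fe %do% (i * 2)
print(res)
```

Because the options live in an ordinary list, a wrapper function can accept them as a single argument and forward them without knowing which backend will consume them.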
Owner

hadley commented Oct 8, 2012

Now I'm thinking .paropts would be a better name.
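For illustration, a call using such a `.paropts` argument might look like the sketch below. It assumes a plyr version whose `llply()` accepts `.paropts`, and it registers foreach's sequential backend with `registerDoSEQ()` only so the snippet runs anywhere. Under that backend plyr warns that no parallel backend is registered and the multicore options are ignored; with doMC registered instead, `preschedule = FALSE` would be expected to reach the backend.

```r
library(plyr)
library(foreach)

# Sequential stand-in backend so the example runs without doMC or a cluster.
registerDoSEQ()

# Options forwarded to foreach(); names must match foreach's own arguments.
paropts <- list(.options.multicore = list(preschedule = FALSE))

# suppressWarnings() hides plyr's warning about the missing parallel backend.
res <- suppressWarnings(
  llply(1:3, function(x) x^2, .parallel = TRUE, .paropts = paropts)
)
str(res)
```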

hadley added a commit that referenced this issue Oct 11, 2012

Add .paropts parameter to llply. #84 eb432b6

hadley closed this in a3c8618 Oct 11, 2012
