Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

batchMapQuick chunks not working (on LSF) #10

Closed
mschubert opened this issue Dec 5, 2013 · 5 comments
Closed

batchMapQuick chunks not working (on LSF) #10

mschubert opened this issue Dec 5, 2013 · 5 comments
Assignees

Comments

@mschubert
Copy link

Consider the function below. Each time I have got 4 jobs that should be put in one chunk, but LSF always submits 4 individual jobs disregarding the chunks I specify.

Note: this works fine when specifying the chunks manually (last example).

square = function(x) { x*x }
library(BatchJobs)

reg = batchMapQuick(square, c(1:4), chunk.size=10)
#Saving conf: /some/dir/bmq_341e69aa7610/conf.RData
#Submitting 4 chunks / 4 jobs.
#Cluster functions: LSF.

reg = batchMapQuick(square, c(1:4), n.chunks=1)
#Saving conf: /some/dir/bmq_3c8c7b37b8cb/conf.RData
#Submitting 4 chunks / 4 jobs.
#Cluster functions: LSF.

reg = makeRegistry(id="BatchJobsExample", file.dir=tempfile(), seed=123)
batchMap(reg, square, c(1:4))
chunked = chunk(getJobIds(reg), n.chunks=1, shuffle=TRUE)
submitJobs(reg, chunked)
#Saving conf: /tmp/RtmpsWXKsa/file6bec917d96c/conf.RData
#Submitting 1 chunks / 4 jobs.
#Cluster functions: LSF.

The problem seems to be lines 55/56 in R/batchMapQuick.R:

if (!missing(chunk.size) && !missing(n.chunks))
    ids = chunk(ids, chunk.size=chunk.size, n.chunks=n.chunks, shuffle=TRUE)

Here, the conditions are only true if both chunk.size and n.chunks are given. It should rather be something like:

if (!missing(chunk.size) && !missing(n.chunks))
    stop("Providing both chunk.size and n.chunks makes no sense")
if (!missing(chunk.size))
    ids = chunk(ids, chunk.size=chunk.size, shuffle=TRUE)
if (!missing(n.chunks))
    ids = chunk(ids, n.chunks=n.chunks, shuffle=TRUE)
@ghost ghost assigned berndbischl Dec 5, 2013
@berndbischl
Copy link
Contributor

Micheal,

thank you for the detailed report. Looking into it right now.

@mschubert
Copy link
Author

I'm trying to make your life easier because I will submit at least one more ;-)

Looking at this again, it might be sufficient to change line 55 to:

if (!missing(chunk.size) || !missing(n.chunks))

The chunk error handling seems to already be provided by the chunk() function.

@berndbischl
Copy link
Contributor

Yup, I already did that.

Trying to figure out how to add a unit test for that, but will push soon.

And keep the feedback coming, it's appreciated.

@berndbischl
Copy link
Contributor

It is unit-tested and pushed.
You can close if it works for you.

@mschubert
Copy link
Author

Yep, that seems to work. Cool, thanks for fixing it so quickly! :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants