job.delay on torque clusters #121

Closed
sorhawell opened this issue Dec 2, 2016 · 3 comments

Comments

@sorhawell

Hi, we're a couple of people using BatchJobs on our university's Torque cluster. Thanks a lot for the package.
Now the support department would like us to reduce the rate of job submissions to about one every 2 seconds. In submitJobs(reg, ids, job.delay), I can change the job.delay function, and it works well in an interactive session. However, when the cluster function is Torque, all jobs are submitted at a constant rate of 10 jobs per second, and the job.delay function seems to be ignored completely.

How can I make BatchJobs delay the job submission?

Output from an R session on Linux on the Torque cluster: job.delay is ignored and jobs are submitted really fast

> library(BatchJobs)
Loading required package: BBmisc
Sourcing configuration file: '/zhome/c7/0/66069/R/x86_64-pc-linux-gnu-library/3.2/BatchJobs/etc/BatchJobs_global_config.R'
Sourcing configuration file: '/zhome/c7/0/66069/.BatchJobs.R'
BatchJobs configuration:
  cluster functions: Torque
  mail.from:
  mail.to:
  mail.start: none
  mail.done: none
  mail.error: none
  default.resources:
  debug: FALSE
  raise.warnings: FALSE
  staged.queries: TRUE
  max.concurrent.jobs: Inf
  fs.timeout: NA

> reg = makeRegistry("blop")
Creating dir: /zhome/c7/0/66069/tmp/JnWaL2CbhKJ2/blop-files
Saving registry: /zhome/c7/0/66069/tmp/JnWaL2CbhKJ2/blop-files/registry.RData
> ids = batchMap(reg,function(x) x+1, 1:24)
Adding 24 jobs to DB.
> submitJobs(reg,1:24,job.delay = function(n,i) 2)
Saving conf: /zhome/c7/0/66069/tmp/JnWaL2CbhKJ2/blop-files/conf.RData
Submitting 24 chunks / 24 jobs.
Cluster functions: Torque.
Auto-mailer settings: start=none, done=none, error=none.
Writing 24 R scripts...
SubmitJobs |+++++++++++++++++++++++++++++++++++++++++++++++++| 100% (00:00:00)
Sending 24 submit messages...
Might take some time, do not interrupt this!



Output from an interactive session (Windows PC): there is a delay of 2 seconds before each submission (and execution)

> install.packages("BatchJobs")
also installing the dependency ‘RSQLite’

trying URL 'https://cran.rstudio.com/bin/windows/contrib/3.3/RSQLite_1.1.zip'
Content type 'application/zip' length 1966046 bytes (1.9 MB)
downloaded 1.9 MB

trying URL 'https://cran.rstudio.com/bin/windows/contrib/3.3/BatchJobs_1.6.zip'
Content type 'application/zip' length 416942 bytes (407 KB)
downloaded 407 KB

package ‘RSQLite’ successfully unpacked and MD5 sums checked
package ‘BatchJobs’ successfully unpacked and MD5 sums checked

The downloaded binary packages are in
	C:\Users\sowe\AppData\Local\Temp\RtmpsPkzxB\downloaded_packages
> library(BatchJobs)
Loading required package: BBmisc
Sourcing configuration file: 'C:/R/R-3.3.1/library/BatchJobs/etc/BatchJobs_global_config.R'
BatchJobs configuration:
  cluster functions: Interactive
  mail.from: 
  mail.to: 
  mail.start: none
  mail.done: none
  mail.error: none
  default.resources: 
  debug: FALSE
  raise.warnings: FALSE
  staged.queries: TRUE
  max.concurrent.jobs: Inf
  fs.timeout: NA

Warning message:
package ‘BatchJobs’ was built under R version 3.3.2 
> reg = makeRegistry("blop")
Loading registry: C:/Users/sowe/Documents/GitHub/fastRditijuu/blop-files/registry.RData
> ids = batchMap(reg,function(x) x+1, 1:24)
Error in batchMap(reg, function(x) x + 1, 1:24) : Registry is not empty!
> library(BatchJobs)
> reg = makeRegistry("blop")
Creating dir: C:/Users/sowe/Documents/GitHub/fastRditijuu/blop-files
Saving registry: C:/Users/sowe/Documents/GitHub/fastRditijuu/blop-files/registry.RData
> ids = batchMap(reg,function(x) x+1, 1:24)
Adding 24 jobs to DB.
Warning messages:
1: RSQLite::dbGetPreparedQuery() is deprecated, please switch to DBI::dbGetQuery(params = bind.data). 
2: Named parameters not used in query: fun_id, pars, jobname 
3: Named parameters not used in query: job_def_id, seed 
> submitJobs(reg,1:24,job.delay = function(n,i) 3)
Saving conf: C:/Users/sowe/Documents/GitHub/fastRditijuu/blop-files/conf.RData
Submitting 24 chunks / 24 jobs.
Cluster functions: Interactive.
Auto-mailer settings: start=none, done=none, error=none.
Writing 24 R scripts...
SubmitJobs |+                                                                     |   0% (00:00:00)
SubmitJobs |+++                                                                   |   4% (00:01:32)
SubmitJobs |++++++                                                                |   8% (00:01:28)
SubmitJobs |+++++++++                                                             |  12% (00:01:24)
SubmitJobs |++++++++++++                                                          |  17% (00:01:20)
SubmitJobs |+++++++++++++++                                                       |  21% (00:01:16)
SubmitJobs |++++++++++++++++++                                                    |  25% (00:01:12)
SubmitJobs |++++++++++++++++++++                                                  |  29% (00:01:08)
SubmitJobs |+++++++++++++++++++++++                                               |  33% (00:01:04)
SubmitJobs |++++++++++++++++++++++++++                                            |  38% (00:01:00)
SubmitJobs |+++++++++++++++++++++++++++++                                         |  42% (00:00:56)
SubmitJobs |++++++++++++++++++++++++++++++++                                      |  46% (00:00:52)
SubmitJobs |+++++++++++++++++++++++++++++++++++                                   |  50% (00:00:48)
SubmitJobs |++++++++++++++++++++++++++++++++++++++                                |  54% (00:00:43)
SubmitJobs |+++++++++++++++++++++++++++++++++++++++++                             |  58% (00:00:39)
SubmitJobs |++++++++++++++++++++++++++++++++++++++++++++                          |  62% (00:00:35)
SubmitJobs |+++++++++++++++++++++++++++++++++++++++++++++++                       |  67% (00:00:31)
SubmitJobs |++++++++++++++++++++++++++++++++++++++++++++++++++                    |  71% (00:00:27)
SubmitJobs |++++++++++++++++++++++++++++++++++++++++++++++++++++                  |  75% (00:00:24)
SubmitJobs |+++++++++++++++++++++++++++++++++++++++++++++++++++++++               |  79% (00:00:20)
SubmitJobs |++++++++++++++++++++++++++++++++++++++++++++++++++++++++++            |  83% (00:00:16)
SubmitJobs |+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++         |  88% (00:00:12)
SubmitJobs |++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++      |  92% (00:00:07)
SubmitJobs |+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++   |  96% (00:00:03)
SubmitJobs |++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++| 100% (00:00:00)
Sending 24 submit messages...
Might take some time, do not interrupt this!



@mllg
Member

mllg commented Dec 2, 2016

The delay triggers a call to Sys.sleep() on the nodes before the job gets executed. What you want is a delay on the master. In interactive sessions, the master that submits the jobs is also the node that runs them, which is why this argument appears to work in that setup.

If you are able to switch over to batchtools, you can achieve a delay before each submit by setting a hook in your cluster function constructor. Your config would then include something like

cluster.functions = makeClusterFunctionsTorque([your stuff here], hooks = list(pre.submit = function(...) Sys.sleep(2)))

Hope this helps.
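
For anyone who has to stay on BatchJobs, a delay on the master can also be approximated by submitting the jobs one at a time and sleeping between the calls. The following is only a minimal sketch: `submit_throttled` and its arguments are hypothetical names, not part of BatchJobs, and the commented usage assumes a registry `reg` created as in the logs above.

```r
# Hypothetical helper (not part of BatchJobs): call `submit_one(id)` once per
# id and pause `delay` seconds on the master between consecutive calls.
submit_throttled = function(ids, submit_one, delay = 2) {
  for (i in seq_along(ids)) {
    # No pause before the first submission, `delay` seconds before each later one.
    if (i > 1L) Sys.sleep(delay)
    submit_one(ids[i])
  }
  invisible(ids)
}

# Assumed usage with BatchJobs, one submitJobs() call per job id:
# submit_throttled(1:24, function(id) submitJobs(reg, id), delay = 2)
```

Each separate submitJobs() call carries some overhead of its own, so the effective spacing between submissions will be somewhat larger than `delay`.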

@sorhawell
Author

sorhawell commented Dec 5, 2016

OK, thanks a lot.

makeClusterFunctionsTorque does not have hooks; however, I can change that afterwards.
I inspected makeClusterFunctionsTorque and I cannot understand how you handle the template. It seems you read the template file once, template = cfReadBrewTemplate(template, "##"), and then drop it, as it is not returned by the function. How is the template then added to the registry? How is the template used when submitting jobs? Does the current version of batchtools instead rely on batchtools:::findTemplateFile always recovering some template file later?

... and I can then post that @ batchtools...
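
Changing the hooks afterwards could look roughly like the sketch below. It assumes the object returned by the constructor is a list with a `hooks` element; `with_submit_delay` is a hypothetical helper, not a batchtools function, and which hook name fires before every single submission may depend on the batchtools version.

```r
# Hypothetical helper (not part of batchtools): return a copy of an already
# constructed cluster functions object with a sleeping hook attached.
# Assumption: `cf` is a list that has (or may get) a `hooks` element.
with_submit_delay = function(cf, seconds = 2, hook = "pre.submit") {
  hooks = as.list(cf$hooks)                          # empty list if no hooks yet
  hooks[[hook]] = function(...) Sys.sleep(seconds)   # ignores all hook arguments
  cf$hooks = hooks
  cf
}

# Assumed usage in the batchtools configuration file:
# cf = makeClusterFunctionsTorque(template = "torque")
# cluster.functions = with_submit_delay(cf, seconds = 2)
```

The hook name is a parameter because the batchtools hooks documentation should be checked for the event that fires per job in the installed version.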

@mllg
Member

mllg commented Dec 5, 2016

I'll answer this in the batchtools repo.

@mllg mllg closed this as completed Dec 5, 2016