nested remote Biocparallel #1

Open
hummuscience opened this issue Feb 6, 2018 · 2 comments

@hummuscience

It is me again :)

Does nesting of futures also work with BiocParallel.FutureParam, as it does with doFuture and future?

Does this locally:

library("BiocParallel")
register(MulticoreParam(36))

Translate to this when working with a remote machine:

library("BiocParallel.FutureParam")
register(FutureParam())
plan(list(tweak(remote, workers = "monster"), multicore))

With the same syntax of %->% to "peel off" futures?

@HenrikBengtsson
Owner

Hey, and congrats on the first issue posted here :)

Correct, the problem is the same as you observed with doFuture; the %<-% / future() functions of the future package do not know about the doFuture+foreach and the BiocParallel.FutureParam+BiocParallel frameworks. They only pass down the plan() stack; they do not re-register the backends with foreach and BiocParallel per se. The workaround for now is to do this manually, which is less than ideal.
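The "passes down the plan() stack" part can be seen with the future package alone. This is only a minimal sketch: the local two-worker multisession outer layer and the sequential inner layer are assumptions chosen for illustration, not the remote setup from the question.

```r
library("future")

# Two-layer plan: outer futures run in background R sessions,
# inner futures (created inside a worker) run sequentially.
plan(list(tweak(multisession, workers = 2), sequential))

# In the master session, the first layer of the stack is active
outer_is_multisession <- inherits(plan(), "multisession")

# Inside a future, the worker sees the remaining layer of the stack
inner_is_sequential %<-% inherits(plan(), "sequential")

print(outer_is_multisession)
print(inner_is_sequential)

# Reset to the default strategy (shuts down the background sessions)
plan(sequential)
```

If the stack is passed down as described, both values are TRUE. Nothing in this mechanism touches what foreach or BiocParallel have registered, which is the gap discussed here.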

Example

Here is a minimal example showing what the problem currently is and the manual workaround:

> library("BiocParallel.FutureParam")
> register(FutureParam())
> plan(multisession)

# The BiocParallel back-end that the master will use
> bpparam()
class: FutureParam
  bpisup: TRUE; bpnworkers: 4; bptasks: 0; bpjobname: BPJOB
  bplog: FALSE; bpthreshold: INFO; bpstopOnError: TRUE
  bptimeout: 2592000; bpprogressbar: FALSE
  bplogdir: NA


# The BiocParallel back-end that a worker will use
> bp %<-% bpparam()
> bp
class: MulticoreParam
  bpisup: FALSE; bpnworkers: 1; bptasks: 0; bpjobname: BPJOB
  bplog: FALSE; bpthreshold: INFO; bpstopOnError: TRUE
  bptimeout: 2592000; bpprogressbar: FALSE
  bpRNGseed: 
  bplogdir: NA
  bpresultdir: NA
  cluster type: FORK

Manual workaround:

> bp %<-% { register(FutureParam()); bpparam() }
> bp
class: FutureParam
  bpisup: TRUE; bpnworkers: 4; bptasks: 0; bpjobname: BPJOB
  bplog: FALSE; bpthreshold: INFO; bpstopOnError: TRUE
  bptimeout: 2592000; bpprogressbar: FALSE
  bplogdir: NA

Action

My plan is to add a mechanism to the future package that allows doFuture, BiocParallel.FutureParam, (your favorite higher-level future orchestration API here) to add an "onEntry" hook function that will be called whenever %<-% / future() happens. With this, doFuture and BiocParallel.FutureParam can automatically get what they need in nested calls.

@hummuscience
Author

Just to confirm, after fiddling around a little to understand what you meant, this code worked :)

This is an example using DESeq2, a very widely used RNA-seq analysis package. Sweeet!

library("BiocParallel.FutureParam")

plan(list(tweak(remote, workers = "monster"), multicore))

{ register(FutureParam()); MulticoreParam() }

dds.animals %<-% DESeq(dds.animals, parallel = TRUE)
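
For comparison, here is a sketch of how the manual workaround from the previous comment might be applied to this call, with register() placed inside the future expression so that it runs on the remote worker. In the snippet above, the standalone `{ register(FutureParam()); MulticoreParam() }` line is evaluated in the local session. This assumes that "monster" is a reachable host with the same packages installed and that dds.animals is an existing DESeqDataSet. It has not been verified on that setup.

```r
library("future")
library("BiocParallel.FutureParam")
library("DESeq2")

# Outer layer: the remote machine; inner layer: forked processes on it
plan(list(tweak(remote, workers = "monster"), multicore))

dds.animals %<-% {
  # Runs on the worker: make BiocParallel use the next plan() layer
  BiocParallel::register(BiocParallel.FutureParam::FutureParam())
  # DESeq() takes its default BPPARAM from bpparam(), i.e. the
  # back-end registered on the line above
  DESeq2::DESeq(dds.animals, parallel = TRUE)
}
```

Calling bpparam() inside the same braces, as in the maintainer's example, is one way to check which back-end the worker actually ends up with.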
