
Different plan()s for different futures #181

Open
wlandau-lilly opened this issue Nov 30, 2017 · 5 comments
@wlandau-lilly

Related: ropensci/drake#169. It would be amazing if a single call to future_lapply() could distribute simultaneous futures over a list of alternative pre-built plan()s. I am not quite sure about the interface, but I can picture how 5 futures might run on a local machine while another 5 simultaneously go to SLURM.

library(future.batchtools)

## Hypothetical interface: `plans` and `plan_map` are not actual arguments of
## future_lapply(); this just sketches the proposed API.  Note the strategy
## functions are used directly, since plan(multicore) would *set* the global
## plan and return the previous one rather than build a list entry.
plans <- list(plan1 = multicore, plan2 = tweak(batchtools_slurm, ...))
plan_map <- rep(c("plan1", "plan2"), each = 5)
future_lapply(
  X = 1:10,
  FUN = sqrt,
  plans = plans,        # pool of pre-built backends
  plan_map = plan_map   # which backend evaluates each element of X
)

It may seem silly to juggle plans in a single call to future_lapply(), but it would be a huge help for drake.

@HenrikBengtsson
Owner

This is an interesting idea. If I understand you correctly, you're looking for a "super" backend where you can throw in all your available future backends and treat them as one big pool of compute resources. I'm happy that you've allowed yourself to even consider the possibility of such a setup - I take it as a sign that the Future API has lots of potential, much of it yet to be discovered :)

This is somewhat related to (non-official) ideas I have where the type of future to be used is not fixed when the Future object is created, but when it is launched. If that were in place, one could imagine initiating a set of (lazy) futures that are ready to be launched. Only at the time of launch would the plan() settings come into play. It should then also be possible to switch between different backends as futures are launched one after the other.
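
To make the idea concrete, here is a minimal sketch of what such a workflow might look like. This is hypothetical: launch() does not exist, and in the current package the evaluator is bound when a future is created, even with lazy = TRUE.

library(future)
library(future.batchtools)

## Create lazy futures; in this hypothetical design, no backend is bound yet
fs <- lapply(1:10, function(i) future(sqrt(i), lazy = TRUE))

## Launch the first five locally ...
plan(multicore)
for (f in fs[1:5]) launch(f)    ## launch() is hypothetical

## ... and the remaining five on SLURM
plan(batchtools_slurm)
for (f in fs[6:10]) launch(f)

vs <- lapply(fs, value)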

The closest we get to this today is a parallel-package cluster (plan(cluster, ...)), which I think could consist of a heterogeneous set of local and remote workers of different types (e.g. PSOCK, FORK, ...) - though I don't think many have used them that way. A poor man's version of the above could be to leverage such clusters. The idea would be to create another type of cluster node, e.g. batchtools_node. Such cluster nodes could even be used in calls such as parallel::parLapply(cl, ...). ... and soon we're about to reinvent the original train of thought behind the Future API. A bit inception-ish.
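
As a runnable approximation with today's tools, a single PSOCK cluster can already mix local and remote workers (the remote hostname below is a placeholder):

library(future)

## One cluster whose nodes are a mix of local and remote PSOCK workers
cl <- parallel::makeCluster(c("localhost", "localhost", "remote1.example.org"))
plan(cluster, workers = cl)

## Each future is dispatched to whichever node is free
f <- future(Sys.info()[["nodename"]])
value(f)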

@wlandau-lilly
Author

Yes, that is the gist. I would like to have several simultaneous plans and send any future to any plan, in any order, without duplicated overhead. I am a bit wary of lumping all the plans together into a single overarching batchtools_node()-style plan because I want to micromanage which futures go to which plans.

What about evaluators? What is the relationship between evaluators and plans? I tried bypassing future_lapply(), but it is not quite working. Going forward, what would need to happen in development to extend future_lapply() this way?
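
For context, here is my current (possibly mistaken) understanding from poking at the package: plan() with no arguments returns the current evaluator, and strategies such as sequential or multicore appear to be evaluator functions themselves.

library(future)

ev <- plan()      ## the current evaluator function
is.function(ev)   ## TRUE
class(ev)         ## e.g. "sequential" "uniprocess" "future" "function"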

@HenrikBengtsson
Owner

I'm quite swamped right now, so unfortunately I don't have much time to dive into your code, but is your goal to be able to distribute work/tasks to different types of compute resources? For instance, some tasks (= futures) you'd like to run on a local machine, some on high-memory machines, and others on a small set of machines that have a certain NFS folder mounted? If so, I'm considering an Extended Future API (#172) that will support optional and/or mandatory resource requests in some standardized fashion, e.g.

f <- future({ ... }, requires = c("mount:/data/folder/", "R (>= 3.3.0)"))

and then there will be a generic underlying framework that makes sure the future is launched on a backend worker that meets those requirements. Obviously, there's lots of work to get there. Before that, I am prioritizing formalizing the Core Future API and providing a generic conformance test framework so that any/all future backends can be validated against this Core Future API. I anticipate that this work will help define and explore what the Extended Future API could look like.

About future_lapply(): any improvements will be done in a new future.apply package (Issue #159), because it is actually above and beyond what the future package should provide (which is the core Future API). I'm not sure when I'll have time to launch future.apply. It might be that what you're looking for fits in such a package, but it might also be something different, just as the foreach and BiocParallel frameworks differ from future.apply yet are related.

@wlandau-lilly
Author

Distributing different workers/tasks to different types of compute resources is exactly what I am looking for. I had hoped to accomplish this by assigning different evaluators to different futures. Whether through future_lapply() or not, is this possible given the current state of the future package, or does it need to wait for the possible Extended Future API? Just knowing that much will help me in the short term.

@wlandau-lilly wlandau-lilly changed the title Multiple plans in future_lapply() Different plan()s for different futures. Dec 7, 2017
@wlandau-lilly wlandau-lilly changed the title Different plan()s for different futures. Different plan()s for different futures Dec 7, 2017
@wlandau

wlandau commented Feb 9, 2018

Update: going forward, I think I will focus more on individual futures than on future_lapply(). I plan to supply non-default values to the evaluator argument of future(). Have you known many people to deploy futures this way? Should the evaluator always be an object returned by plan(...)?
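
Concretely, here is a minimal sketch of the pattern I have in mind, assuming an evaluator can be captured by setting and then restoring the plan (plan(...) invisibly returns the previous plan, and plan() with no arguments returns the current evaluator):

library(future)

## Capture a multisession evaluator without leaving it as the global plan
old <- plan(multisession)   ## switch plans; returns the previous plan
ev  <- plan()               ## the multisession evaluator
plan(old)                   ## restore the original plan

## Route one specific future through that evaluator
f <- future(sqrt(2), evaluator = ev)
value(f)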
