worker "pool" for nested paralellization #361

Open
epruesse opened this issue Feb 21, 2020 · 2 comments
Labels: Backend API (part of the Future API that only backend package developers rely on), feature request

Comments

@epruesse
Contributor

If I understand correctly, plan(tweak(multicore, workers=8)) means that the first nesting level gets 8 parallel threads and the second nesting level gets no parallelism. I could hard-allocate threads to each level, but that's hard to do since it means I have to know all thread usages down the tree of packages.
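To make the cost of hard-allocating threads per level concrete, here is a minimal sketch in base R. The helper `split_workers()` is hypothetical and not part of future; it only does the arithmetic for a fixed core budget, assuming every outer task gets the same inner share.

```r
# Hypothetical helper (not part of future): split a fixed core budget
# between an outer loop and whatever runs inside each outer task.
split_workers <- function(total_cores, outer_tasks) {
  outer <- min(outer_tasks, total_cores)   # workers for the first level
  inner <- total_cores %/% outer           # workers per outer task
  list(outer = outer, inner = inner, idle = total_cores - outer * inner)
}

# 8 cores, an outer loop of three.
alloc <- split_workers(total_cores = 8L, outer_tasks = 3L)
str(alloc)
```

With these numbers the static split leaves some cores unused for the whole run, and the per-level counts would still have to be passed by hand to one tweak(..., workers = ...) entry per level, which is the part that requires knowing the whole package tree.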

What I'm looking for is a "worker pool"-like implementation. A naive greedy allocation using a semaphore that decrements every time a thread is forked off would be a good start. That way, if I have a loop of three calling a package that uses future.apply on a huge vector but takes very long to even get there, the NN workers can be kept busy for as much of the time as possible.
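A rough sketch of the greedy counter idea, in base R only. Everything here (`make_worker_pool()`, `acquire`, `release`, `available`) is a hypothetical name, and the counter lives in a single R process; a real version would need a semaphore shared across processes, which this does not attempt.

```r
# Hypothetical helper (not part of future): a budget of worker slots
# that is decremented on every grant and incremented on release.
make_worker_pool <- function(total) {
  free <- total
  list(
    # Greedily take up to `wanted` slots; returns 0 if all are busy.
    acquire = function(wanted) {
      granted <- min(wanted, free)
      free <<- free - granted
      granted
    },
    # Hand slots back once the forked work has finished.
    release = function(n) {
      free <<- min(total, free + n)
      invisible(free)
    },
    available = function() free
  )
}

pool <- make_worker_pool(8)
outer_slots <- pool$acquire(3)     # the loop of three takes 3 slots
inner_slots <- pool$acquire(100)   # the huge inner vector gets the rest
starved     <- pool$acquire(2)     # nothing left: caller runs sequentially
pool$release(inner_slots)          # inner work done, slots are free again
```

The point of the greedy grant is that a caller never blocks: it gets whatever is free, possibly nothing, and falls back to sequential work in that case.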

Interaction with OMP in particular is a problem, of course; a lot of things seem to use it. IIRC, Intel TBB auto-detects the number of "useful" threads and adjusts this value as it goes based on system load. Something like this would need extra housekeeping, but the concept of "don't start more threads if all my workers/CPUs are busy", or even "don't start more threads if we are at XY% memory", would be very useful for robustly running things in parallel.

@HenrikBengtsson
Owner

> If I understand correctly, plan(tweak(multicore, workers=8)) means that the first nesting level gets 8 parallel threads and the second nesting level gets no parallelism. I could hard-allocate threads to each level, but that's hard to do since it means I have to know all thread usages down the tree of packages.

Correct x 2.

> What I'm looking for is a "worker pool"-like implementation. So that if I have a loop of three calling a package that uses future.apply on a huge vector but takes very long to even get there, the NN workers can be busy for as much of the time as possible.

I'm not sure I fully understand, but I can guess what you're after. Basically, if you do:

library(future.apply)

a <- future_lapply(x, function(y) {
   future_lapply(y, function(z) {
      ...
   })
})

you want the inner and the outer "loops" to be able to pull from the same pool of "workers", correct?

This is available if you use an external job scheduler, such as those available in HPC environments. Then you could use:

library(future.batchtools)
plan(list(outer = batchtools_slurm, inner = batchtools_slurm))

Both layers will submit their jobs (=futures) to the same job queue, and it's up to the job scheduler to allocate resources as they become available.
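As a toy illustration of "both layers submit to the same queue", here is a single-process sketch in base R. The names `new_queue()`, `submit` and `run_all` are made up for this example; nothing runs in parallel, and it only shows that outer jobs can feed their inner jobs into the one queue that is drained in order.

```r
# Hypothetical single-process stand-in for a scheduler's job queue.
new_queue <- function() {
  jobs <- list()
  list(
    submit = function(job) jobs[[length(jobs) + 1L]] <<- job,
    run_all = function() {
      # Drain in FIFO order; jobs may submit more jobs while running.
      while (length(jobs) > 0L) {
        job <- jobs[[1L]]
        jobs[[1L]] <<- NULL
        job()
      }
    }
  )
}

jobq <- new_queue()
done <- character()
x <- list(a = 1:2, b = 3:4)

for (name in names(x)) local({
  nm <- name
  # Outer job: does no real work itself, it only submits its inner jobs.
  jobq$submit(function() {
    for (z in x[[nm]]) local({
      zz <- z
      jobq$submit(function() done <<- c(done, paste0(nm, zz)))
    })
  })
})

jobq$run_all()
print(done)
```

In a real scheduler the draining loop would be replaced by however many workers are free, so inner jobs from a slow outer task can use capacity that a static per-level split would leave idle.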

Trying to implement something similar in R is tedious but should be doable. Maybe one could build upon Gábor Csárdi's work in "Multi Process Task Queue in 100 Lines of R Code" (2019-09-09). But the point is, this is not really something that should be implemented in the future package. Instead, it should/could be added as a new type of backend that futures can rely on. Think:

library(future.taskqueue)
plan(list(outer=taskqueue, inner=taskqueue))
...

The future.tests package can be used to validate that it is properly implemented and meets the requirements of the future framework.

@HenrikBengtsson HenrikBengtsson added the Backend API Part of the Future API that only backend package developers rely on label Mar 22, 2020
@epruesse
Contributor Author

Yes, that's what I meant. Though I was thinking less about nested loops in client code, which are known to the user and easily configured with plan(list(...)), and more about the levels hidden in library code. The docs tell package authors to stay away from plan, so I was initially assuming that there would be some kind of queue dealing with levels of nesting hidden from me.

That would be my main argument for allowing a simple queue scheduler into future - it's the simplest approach to arrive at "least surprising" behavior. A fully featured scheduler is clearly out of scope. The more packages use future themselves, though, the more complicated it becomes for the end user to set the right plan everywhere.

Another argument might be that future would be the natural home for a call that says something like "use up to 4 threads here". The knowledge of what degree of parallelism is beneficial sits within the package (and preferably not in the vignette), and would ideally be hidden from consuming client code.

(I wish I could promise a PR, but it would be easier to promise that I'll never find the time...).

@HenrikBengtsson HenrikBengtsson changed the title worker "pool" for nested paralelization worker "pool" for nested paralellization Dec 26, 2020