This isn't strictly a K8sClusterManagers.jl issue, but @omus pointed me here :).
I was running hyperparameter optimization on a model using `@phyperopt` from Hyperopt.jl with `pmap=Parallelism.robust_pmap` from Parallelism.jl. I would spin up the desired number of workers with `addprocs`, then essentially call `pmap` via these abstractions, and that's it. When the `pmap` is done, the manager writes out a summary and exits, and all the workers are released.
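For context, the setup looked roughly like this. This is an illustrative sketch only: `train_model` and the search space are invented, and I'm paraphrasing the macro invocation from memory rather than quoting my actual code.

```julia
using Distributed
addprocs(20)               # in my case, k8s pods via K8sClusterManagers.jl
@everywhere using Hyperopt
using Parallelism: robust_pmap

# Illustrative only: `train_model` and the learning-rate grid are made up.
ho = @phyperopt for i = 20, lr = exp10.(LinRange(-4, -1, 20))
    train_model(lr)        # hypothetical: trains one model, returns its loss
end
# (configured so the trials are distributed with robust_pmap rather than
#  the default Distributed.pmap)
```

Once the last trial finishes, the driver writes its summary and exits, but until then every worker that has already finished its trial just sits idle.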
I wanted to train 20 models this way quickly, so I spun up 20 workers and left them to train. However, some finished much faster than others, and those workers were left idling. Since this is running on k8s, if we had killed them as they went idle, we could have scaled in and saved a lot of resources.
It would be great to have something like `pmap` that could automatically remove workers when they are no longer needed.
Thinking about this slightly more: a nice "inversion of control" here would be for `pmap` to return workers to the pool (in fact, I think it already does), and for the pool to decide when to remove idle workers. (Perhaps the pool would wait a minute or two, and if a worker is still idle, `rmprocs` it.)
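The simpler, non-pool variant of the idea can be sketched directly with `Distributed`. All names below are mine, not a proposed API: a `pmap`-like function where each driver task removes its worker as soon as the shared job queue is drained, instead of returning it to the pool to idle.

```julia
using Distributed

# Sketch of a "shrinking" pmap: one local feeder task per worker pulls jobs
# from a shared channel; when the channel is drained, the worker is removed
# via rmprocs so that (on k8s) its pod can be scaled in.
function shrinking_pmap(f, xs; pids = workers())
    jobs = Channel{Tuple{Int,Any}}(length(xs))
    for (i, x) in enumerate(xs)
        put!(jobs, (i, x))
    end
    close(jobs)  # channel iteration below terminates once it is drained

    results = Vector{Any}(undef, length(xs))
    @sync for w in pids
        @async begin
            for (i, x) in jobs   # take jobs until the channel is empty
                results[i] = remotecall_fetch(f, w, x)
            end
            # No work left for this worker: release it immediately.
            w == myid() || rmprocs(w)
        end
    end
    return results
end
```

The pool-owned version would move the `rmprocs` decision into a custom `AbstractWorkerPool` with an idle timer, so that any `pmap` using the pool benefits; the sketch above just shows the eager end of the spectrum.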