consider queueing to multiple partitions where possible #36

rueberger · 2022-08-31T02:53:08Z

cc: brainsss1

I've noticed that jobs queue either to trc or normal. You might consider queueing to both normal and trc where possible (i.e. for time limits <= 2 days), and to owners for short-running or checkpointable jobs (although requeuing may require some modification to the control flow logic in preprocess.py).

In general this would just be a convenience to shorten queue times and load-balance, aside from one scenario: submissions to normal are limited when the global number of CPUs in use for a group exceeds 512. Group partitions and owners are unaffected by CPU limits.

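For instance, in this scenario yandan's moco jobs won't execute until a number of other jobs finish, but could execute immediately on trc.

[image: Screen Shot 2022-08-30 at 4 59 18 PM] <https://user-images.githubusercontent.com/8816362/187581030-a4392e52-5f04-4e5d-a721-382da5211391.png>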

Weird policy, but according to Killian "Owner groups are expected to mainly submit jobs to their own partition, as well as to the owners partition that offers them a very large pool of resources for free."
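A rough sketch of the partition choice suggested above, not taken from the brainsss code. The function names, the `preemptible` flag (standing in for "short-running or checkpointable"), and the assumption that trc is always eligible are all hypothetical. The only Slurm behaviour relied on is that `--partition` accepts a comma-separated list and the job starts in whichever listed partition can run it first.

```python
from datetime import timedelta

# Wall-time cap for `normal`, taken from the "<= 2 days" note above.
NORMAL_MAX_TIME = timedelta(days=2)


def choose_partitions(time_limit, preemptible=False):
    """Return the partitions a job could be queued to.

    time_limit  -- requested wall time as a timedelta
    preemptible -- hypothetical flag: True for short-running or
                   checkpointable jobs that tolerate being requeued
    """
    # Assumption: the group's own partition (trc) is always eligible.
    partitions = ["trc"]
    if time_limit <= NORMAL_MAX_TIME:
        partitions.append("normal")
    if preemptible:
        partitions.append("owners")
    return partitions


def sbatch_partition_flag(partitions):
    """Format the partition list as a single sbatch option."""
    return "--partition=" + ",".join(partitions)


if __name__ == "__main__":
    print(sbatch_partition_flag(choose_partitions(timedelta(hours=12))))
```

Jobs that land on owners can be preempted, so the `preemptible` flag would only be safe to set once the requeue handling in preprocess.py mentioned above is in place.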

poldrack · 2022-09-01T16:13:44Z

this does sound like a good idea, though it will make testing rather more complex...


