I've noticed that jobs are queued to either trc or normal. You might consider queueing to both normal and trc where possible (i.e. for time limits <= 2 days), and to owners for short-running or checkpointable jobs (although requeuing may require some modification to the control flow logic in preprocess.py).
In general this would just be a convenience to shorten queue times and load-balance, with one exception: submissions to normal are limited when the global number of CPUs in use by a group exceeds 512. Group partitions and owners are unaffected by CPU limits.
For instance, in this scenario yandan's moco jobs won't execute until a number of other jobs finish, but could execute immediately on trc.
Weird policy, but according to Killian: "Owner groups are expected to mainly submit jobs to their own partition, as well as to the owners partition that offers them a very large pool of resources for free."
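To make the suggestion concrete, here is a minimal sketch of how the partition list could be chosen before calling sbatch. It is not taken from the repo: `pick_partitions`, `sbatch_args`, the `preemption_ok` flag and the `moco.sh` script name are all hypothetical, and the 2-day cutoff is just the number quoted above. It relies on Slurm accepting a comma-separated list for `--partition` (the job starts in whichever listed partition can run it first) and on `--requeue` for jobs that may be preempted on owners.

```python
from datetime import timedelta

# Assumption: the 2-day cap for normal comes from this issue, not from a config file.
MAX_NORMAL_TIME = timedelta(days=2)


def pick_partitions(time_limit, preemption_ok=False, group_partition="trc"):
    """Hypothetical helper: list every partition a job could reasonably run on."""
    partitions = [group_partition]
    if time_limit <= MAX_NORMAL_TIME:
        # Short enough to also be eligible for normal.
        partitions.append("normal")
    if preemption_ok:
        # Short-running or checkpointable jobs can tolerate preemption on owners.
        partitions.append("owners")
    return partitions


def sbatch_args(script, time_limit, preemption_ok=False):
    """Hypothetical helper: build the sbatch argument list for one job."""
    partitions = pick_partitions(time_limit, preemption_ok)
    minutes = int(time_limit.total_seconds() // 60)
    args = [
        "sbatch",
        f"--partition={','.join(partitions)}",  # Slurm accepts a comma-separated list
        f"--time={minutes}",  # a bare integer is interpreted as minutes
    ]
    if "owners" in partitions:
        # Let Slurm put the job back in the queue if it gets preempted.
        args.append("--requeue")
    args.append(script)
    return args


if __name__ == "__main__":
    print(sbatch_args("moco.sh", timedelta(hours=4)))
    print(sbatch_args("moco.sh", timedelta(days=3)))
    print(sbatch_args("moco.sh", timedelta(hours=1), preemption_ok=True))
```

Returning an argument list rather than a command string means it can be passed straight to `subprocess.run` without shell quoting. The requeue case would still need the control flow in preprocess.py to cope with a job restarting.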
On Tue, Aug 30, 2022 at 7:53 PM Andrew Berger ***@***.***> wrote:
cc: brainsss1
I've noticed that jobs queue either to trc or normal. You might consider
queueing both to normal and trc where possible (ie for time limits <= 2
days) and to owners for short running or checkpointable jobs (although
requeuing may require some modification to the control flow logic in
preprocess.py).
In general this would just be a convenience to shorten queue times and
load-balance, aside from one scenario: submissions to normal are limited
when the *global* number of cpus in use for a group exceeds 512. group
partitions and owners are unaffected by cpu limits.
For instance, in this scenario yandan's moco jobs won't execute until a
number of other jobs finish, but could execute immediately on trc.
[image: Screen Shot 2022-08-30 at 4 59 18 PM]
<https://user-images.githubusercontent.com/8816362/187581030-a4392e52-5f04-4e5d-a721-382da5211391.png>
Weird policy, but according to Killian "Owner groups are expected to
mainly submit jobs to their own partition, as well as to the owners
partition that offers them a very large pool of resources for free."
--
Russell A. Poldrack
Albert Ray Lang Professor of Psychology
Associate Director, Stanford Data Science
Director, SDS Center for Open and Reproducible Science
Building 420
Stanford University
Stanford, CA 94305
***@***.*** ***@***.***>
http://www.poldracklab.org/