Handle walltime units uniformly. #209

Open
gdevenyi opened this issue May 16, 2020 · 1 comment

@gdevenyi

SLURM decided to be brain-dead and default to minutes for an unformatted numerical --time value, instead of seconds like SGE/PBS/Torque.

This means we need to be explicit about the supported input formats for walltime.

Torque says: http://docs.adaptivecomputing.com/torque/3-0-5/2.1jobsubmission.php

walltime: seconds, or [[HH:]MM:]SS. Maximum amount of real time during which the job can be in the running state.

SGE says: https://linux.die.net/man/1/sge_types

time_specifier

A time specifier either consists of a positive decimal, hexadecimal or octal integer constant, in which case the value is interpreted to be in seconds, or is built by 3 decimal integer numbers separated by colon signs where the first number counts the hours, the second the minutes and the third the seconds. If a number would be zero it can be left out but the separating colon must remain (e.g. 1:0:1 = 1::1 means 1 hours and 1 second).
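
To make the SGE rule above concrete, here is a minimal Python sketch that converts a time_specifier to seconds. The function names `sge_time_to_seconds` and `_c_style_int` are hypothetical, and the sketch assumes the hexadecimal and octal constants use C-style prefixes (`0x…` and a leading `0`), which the man page excerpt does not spell out.

```python
def _c_style_int(text):
    """Parse a decimal, hex (0x...), or octal (leading 0) integer constant.

    Assumption: SGE follows C-style prefixes for hex and octal.
    """
    lowered = text.lower()
    if lowered.startswith("0x"):
        return int(lowered, 16)
    if len(lowered) > 1 and lowered.startswith("0"):
        return int(lowered, 8)
    return int(lowered, 10)


def sge_time_to_seconds(spec):
    """Convert an SGE-style time_specifier to a number of seconds."""
    if ":" not in spec:
        # A bare integer constant is interpreted as seconds.
        return _c_style_int(spec)
    parts = spec.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected h:m:s with two colons, got {spec!r}")
    # An omitted field counts as zero, but the colons must remain.
    hours, minutes, seconds = (int(p, 10) if p else 0 for p in parts)
    return hours * 3600 + minutes * 60 + seconds
```

With this sketch, `sge_time_to_seconds("1:0:1")` and `sge_time_to_seconds("1::1")` both give 3601, matching the man page's example of 1 hour and 1 second.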

SLURM says: https://slurm.schedmd.com/sbatch.html

-t, --time=<time>
Set a limit on the total run time of the job allocation. If the requested time limit exceeds the partition's time limit, the job will be left in a PENDING state (possibly indefinitely). The default time limit is the partition's default time limit. When the time limit is reached, each task in each job step is sent SIGTERM followed by SIGKILL. The interval between signals is specified by the Slurm configuration parameter KillWait. The OverTimeLimit configuration parameter may permit the job to run longer than scheduled. Time resolution is one minute and second values are rounded up to the next minute.
A time limit of zero requests that no time limit be imposed. Acceptable time formats include "minutes", "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes" and "days-hours:minutes:seconds".
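
For comparison, the six SLURM formats listed above could be read roughly as follows. This is only a sketch in Python: `slurm_time_to_seconds` is a hypothetical helper, not part of SLURM or any library, and it does not model SLURM's rounding up to whole minutes. Note that a bare number falls into the "minutes" case, which is exactly the difference from SGE/PBS/Torque.

```python
def slurm_time_to_seconds(spec):
    """Convert a SLURM --time string to seconds (hypothetical helper)."""
    if "-" in spec:
        # days-hours, days-hours:minutes, days-hours:minutes:seconds
        day_text, rest = spec.split("-", 1)
        days = int(day_text)
        fields = [int(p) for p in rest.split(":")]
        if len(fields) > 3:
            raise ValueError(f"too many fields in {spec!r}")
        # After a day prefix the fields read hours[:minutes[:seconds]].
        fields += [0] * (3 - len(fields))
        hours, minutes, seconds = fields
    else:
        days = 0
        fields = [int(p) for p in spec.split(":")]
        if len(fields) == 1:
            # A bare number means minutes, not seconds.
            hours, minutes, seconds = 0, fields[0], 0
        elif len(fields) == 2:
            # minutes:seconds
            hours = 0
            minutes, seconds = fields
        elif len(fields) == 3:
            # hours:minutes:seconds
            hours, minutes, seconds = fields
        else:
            raise ValueError(f"too many fields in {spec!r}")
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds
```

Under this reading, `slurm_time_to_seconds("90")` is 5400 seconds, whereas the same string would mean 90 seconds on SGE or Torque.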
@gdevenyi

Proposed solution

  1. Take the input and check whether it is a valid format for the specified job submission system; if so, pass it through unchanged.
  2. Also support individual "h", "m", and "s" suffixes on numbers.
  3. If a bare number is provided, assume seconds. For SLURM, convert it to minutes and print a warning. Document this behaviour more explicitly.
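
A rough Python sketch of how these three steps might fit together is below. Everything in it is an assumption for illustration: the function name `normalize_walltime`, the system keys `"pbs"`, `"sge"` and `"slurm"`, the simplified regular expressions for the native formats, and the choice to accept combined suffixes such as `1h30m`. Bare numbers are handled before the pass-through check, because a bare number is also a valid native SLURM value (minutes) and would otherwise slip through step 1 with the wrong unit. Rounding up to whole minutes is likewise an assumption, chosen to match SLURM's documented one-minute resolution.

```python
import math
import re
import warnings

# Simplified, hypothetical patterns for the colon/dash formats of each system.
NATIVE_FORMATS = {
    "pbs": re.compile(r"([0-9]+:){1,2}[0-9]+"),
    "sge": re.compile(r"[0-9]*:[0-9]*:[0-9]*"),
    "slurm": re.compile(r"[0-9]+-[0-9]+(:[0-9]+){0,2}|[0-9]+(:[0-9]+){1,2}"),
}
SUFFIX_SECONDS = {"h": 3600, "m": 60, "s": 1}


def normalize_walltime(value, system):
    """Return a walltime string suitable for the given submission system."""
    text = str(value).strip()
    native = NATIVE_FORMATS[system]  # KeyError for an unknown system

    # Step 3: a bare number always means seconds.
    if re.fullmatch(r"[0-9]+", text):
        if system == "slurm":
            minutes = math.ceil(int(text) / 60)
            warnings.warn(
                f"walltime {text} read as seconds; passing {minutes} minutes to SLURM"
            )
            return str(minutes)
        return text

    # Step 2: numbers with h/m/s suffixes, e.g. "2h" or "1h30m".
    if re.fullmatch(r"([0-9]+[hms])+", text):
        total = sum(
            int(number) * SUFFIX_SECONDS[unit]
            for number, unit in re.findall(r"([0-9]+)([hms])", text)
        )
        hours, remainder = divmod(total, 3600)
        minutes, seconds = divmod(remainder, 60)
        # hours:minutes:seconds is accepted by all three systems.
        return f"{hours}:{minutes:02d}:{seconds:02d}"

    # Step 1: anything already in the system's native format passes through.
    if native.fullmatch(text):
        return text

    raise ValueError(f"unsupported walltime {text!r} for {system}")
```

For example, `normalize_walltime("3600", "slurm")` returns `"60"` and emits a warning, while `normalize_walltime("3600", "sge")` returns `"3600"` unchanged.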
