Description
Playing around a bit with Torque, one thing that initially bothered me was that there is no equivalent to sacct, which we use for reliable job monitoring with Slurm. There is a tracejob command that provides the job history, but it is meant for admin usage only. However, Torque has a useful property that may allow us to have reliable job monitoring using the qstat command (the equivalent of Slurm's squeue): the job's standard output/error files are not created immediately in the directory the job was launched from; they are copied there after the job has finished. In fact, there is even a special command (qpeek) for looking at the output of a running job. So this feature (lazy creation of the job output) could be used as the regression's synchronisation point.
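To make the idea concrete, here is a minimal sketch of how the lazy creation of the output files could serve as a synchronisation point. It is not an existing implementation: the function name `wait_for_job_output`, its parameters, and the polling approach are all assumptions made for illustration. It simply treats the job as finished once every expected output file has appeared in the launch directory.

```python
import os
import time


def wait_for_job_output(paths, timeout=60.0, poll_interval=1.0):
    """Wait until all files in `paths` exist (hypothetical helper).

    Assumption: the scheduler copies the job's stdout/stderr files to
    these paths only after the job has finished, so their existence
    marks job completion.

    Returns True if all files appeared before `timeout` seconds elapsed,
    False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        # The job counts as finished only when every output file is present
        if all(os.path.exists(p) for p in paths):
            return True

        if time.monotonic() >= deadline:
            return False

        time.sleep(poll_interval)
```

A caller would pass the expected stdout and stderr paths of the submitted job, e.g. `wait_for_job_output(['job.out', 'job.err'], timeout=3600)`. This could be combined with a qstat check, so that the files are only waited for once the job has left the queue, but that part is left out here since it depends on how the site is configured.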
Internal issue: https://madra.cscs.ch/scs/reframe/issues/215