This repository has been archived by the owner on Nov 13, 2019. It is now read-only.
[efranz@ada7 ood_core]$ bsub -m curie < hello.sh
Verifying job submission parameters...
Verifying project account...
Account to charge: 082810573256
Balance (SUs): 4999.8694
SUs to charge: 5.3333
Job <7274791> is submitted to default queue <curie_devel>.
[efranz@ada7 ood_core]$ bjobs -m curie 7274791
Job <7274791> is not found on host/group <curie>
[efranz@ada7 ood_core]$ bjobs -m curie 7274791
Job <7274791> is not found on host/group <curie>
[efranz@ada7 ood_core]$ bjobs -m curie 7274791
JOBID STAT USER QUEUE JOB_NAME NEXEC_HOST SLOTS RUN_TIME TIME_LEFT
7274791 RUN efranz curie_deve helloWorld 1 16 0 second(s) 0:20 L
[efranz@ada7 ood_core]$
How do we reliably tell the difference between a job that has not yet appeared in the queue and a job that has failed or completed and has therefore left the queue? This affects how we submit jobs and how we report the status of every job.
A solution:
The LSF adapter's "id" could be a string with metadata attached to it, including the submission date. The adapter could then enforce the above state machine by expanding the else block in this code. The problem is that any app that displays the "id" would need to "pretty print" it. Do we add a ppid method to the base adapter?
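As a rough illustration of this first option, the sketch below packs the job number and the submission time into a single string id and uses the age of the job to interpret a "not found" answer from bjobs. Everything here is hypothetical: the `LsfId` module, the `@` separator, the 60-second grace period, and the `:queued`/`:completed` symbols are assumptions made for the example, not part of ood_core.

```ruby
require "time"

# Hypothetical helpers; none of these names come from ood_core.
module LsfId
  SEP = "@" # assumed separator between job number and timestamp

  # Pack the LSF job number and the submission time into one string id.
  def self.encode(job_number, submitted_at)
    "#{job_number}#{SEP}#{submitted_at.getutc.iso8601}"
  end

  # Split an encoded id back into [job_number, submitted_at].
  def self.decode(id)
    job_number, stamp = id.split(SEP, 2)
    [job_number, Time.iso8601(stamp)]
  end

  # What a "pretty print" method might return for display in an app.
  def self.pretty(id)
    decode(id).first
  end

  # Interpret "job not found": a job younger than the grace period is
  # assumed to be still on its way into the queue; an older one is assumed
  # to have left the queue (failed or completed).
  def self.status_when_missing(id, now: Time.now, grace: 60)
    _, submitted_at = decode(id)
    (now - submitted_at) < grace ? :queued : :completed
  end
end
```

An app would then show `LsfId.pretty(id)` to the user while the adapter keeps working with the full encoded string.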
Following the goals of the first option, we could change "id" from a string to a value object that implements to_s, to_str, etc. The value object can store the extra information the adapter needs to track state. We would, however, need to design how this information is serialized when the id is stored in the database.
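A minimal sketch of such a value object might look like the following. The class name `JobId`, its fields, and the choice of JSON for serialization are all assumptions for illustration; the real adapter may need different state and a different storage format.

```ruby
require "json"
require "time"

# Hypothetical value object; the class name and fields are illustrative only.
class JobId
  attr_reader :number, :submitted_at

  def initialize(number, submitted_at)
    @number = number.to_s
    @submitted_at = submitted_at
  end

  # Anything that treats the id as a string sees only the job number.
  def to_s
    number
  end
  alias to_str to_s

  def ==(other)
    other.is_a?(JobId) &&
      number == other.number &&
      submitted_at == other.submitted_at
  end

  # One possible serialization for a database column: a small JSON document.
  def dump
    JSON.generate("number" => number,
                  "submitted_at" => submitted_at.getutc.iso8601)
  end

  # Rebuild the value object from the stored JSON text.
  def self.load(text)
    data = JSON.parse(text)
    new(data["number"], Time.iso8601(data["submitted_at"]))
  end
end
```

Because `to_str` is defined, existing code that interpolates or concatenates the id keeps working, while the adapter can still read `submitted_at` from the object.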
Another option: when submitting a job, after calling bsub, the adapter itself calls bjobs to verify that the job is in the system. If it is not, the adapter waits and then checks again. The check exists only to delay the return of the submit method until the job is visible.
Of course, if this took too long it could time out the request. We should find out what kind of delay to expect.
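The wait-and-recheck idea could be sketched roughly as below. The helper name `wait_until_visible`, its keyword arguments, and the default timeout and interval are hypothetical. The `visible` callable stands in for whatever the adapter would do to ask bjobs about the job, which is deliberately left out here.

```ruby
# Hypothetical helper: poll until the job is visible or a deadline passes.
# `visible` is any callable that returns true once the job shows up in bjobs.
# `sleeper` is injectable so the wait can be skipped in tests.
def wait_until_visible(job_id, visible:, timeout: 10, interval: 1,
                       sleeper: ->(seconds) { sleep(seconds) })
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if visible.call(job_id)
    # Give up rather than hold the web request open indefinitely.
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleeper.call(interval)
  end
end
```

The submit method would call this after bsub returns and before handing the id back. A `false` result means the job never appeared within the timeout, and the caller has to decide whether to report an error or return the id anyway.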