Fatal if unable to load script or env when building a launch job msg.
If a file that slurmctld wrote disappears (job script or job environment),
then the StateSaveLocation is in an unexpected state and Slurm should fatal out
immediately, before corrupting anything else. Give the user the option to
explicitly ignore the error at startup, at the cost of losing the job.
Bug 7783.
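
The policy described above (stop immediately unless the operator opted in to ignoring state errors, in which case only the affected job is sacrificed) can be sketched in isolation. This is a minimal illustration, not Slurm code: `struct job`, `enum reason`, `enum outcome` and `on_missing_state_file()` are hypothetical stand-ins, and the sketch returns an outcome where the real daemon would call `fatal()`.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; these are not Slurm's real types. */
enum reason { REASON_NONE, REASON_FAIL_SYSTEM };
enum outcome { OUTCOME_ABORT_DAEMON, OUTCOME_JOB_KILLED };

struct job {
	unsigned id;
	enum reason reason;
	enum reason reason_prev;
	char *desc;	/* heap-allocated, owned by the job */
};

/*
 * Decide what to do when a saved job file (script or environment) is gone.
 * Without the ignore flag the caller is told to abort; with it, the job is
 * marked as failed due to a system error and the caller keeps running.
 */
enum outcome on_missing_state_file(struct job *job, const char *why,
				   bool ignore_errors)
{
	char *copy;

	if (!ignore_errors) {
		fprintf(stderr, "job %u: %s; refusing to continue\n",
			job->id, why);
		return OUTCOME_ABORT_DAEMON;
	}

	fprintf(stderr, "job %u: %s; job will be killed\n", job->id, why);

	/* Copy first so that 'why' may safely alias job->desc. */
	copy = malloc(strlen(why) + 1);
	if (!copy)
		return OUTCOME_ABORT_DAEMON;
	strcpy(copy, why);

	free(job->desc);
	job->desc = copy;
	job->reason_prev = job->reason;	/* remember the earlier reason */
	job->reason = REASON_FAIL_SYSTEM;
	return OUTCOME_JOB_KILLED;
}
```

Returning an outcome instead of exiting keeps the decision testable; a caller that mirrors the commit would terminate the process on `OUTCOME_ABORT_DAEMON`.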
 	/* fatal or kill the job as it can never be recovered */
+	if (!ignore_state_errors)
+		fatal("%s: %s for %pJ. Check file system serving StateSaveLocation as that directory may be missing or corrupted. Start with '-i' to ignore this error and kill the afflicted jobs.",
+		      __func__, fail_why, job_ptr);
+
+	error("%s: %s for %pJ. %pJ will be killed due to system error.",
+	      __func__, fail_why, job_ptr, job_ptr);
+	xfree(job_ptr->state_desc);
+	job_ptr->state_desc = xstrdup(fail_why);
+	job_ptr->state_reason_prev = job_ptr->state_reason;
+	job_ptr->state_reason = FAIL_SYSTEM;
+	slurm_free_job_launch_msg(launch_msg_ptr);
+	/* ignore the return as job is in an unknown state anyway */