Commit

Only trigger job failed to start once
Trigger the "job failed to start" state only when the
first process to do so reports. This avoids a "bounce"
effect that causes the job object to be multiply
released.

Signed-off-by: Ralph Castain <rhc@pmix.org>
rhc54 committed Feb 16, 2024
1 parent c4b95f3 commit a386514
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion src/mca/errmgr/dvm/errmgr_dvm.c
@@ -486,14 +486,14 @@ static void proc_errors(int fd, short args, void *cbdata)
         PRTE_FLAG_SET(jdata, PRTE_JOB_FLAG_ABORTED);
         /* kill the job */
         _terminate_job(jdata->nspace);
+        PRTE_ACTIVATE_JOB_STATE(jdata, PRTE_JOB_STATE_FAILED_TO_START);
     }
     /* if this was a daemon, report it */
     if (PMIX_CHECK_NSPACE(jdata->nspace, PRTE_PROC_MY_NAME->nspace)) {
         /* output a message indicating we failed to launch a daemon */
         pmix_show_help("help-errmgr-base.txt", "failed-daemon-launch",
                        true, prte_tool_basename);
     }
-    PRTE_ACTIVATE_JOB_STATE(jdata, PRTE_JOB_STATE_FAILED_TO_START);
     break;
 
 case PRTE_PROC_STATE_CALLED_ABORT:
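
The "bounce" described in the commit message can be illustrated with a small standalone sketch. This is not PRRTE code: `toy_job_t` and the `toy_*` functions are hypothetical stand-ins, and the sketch assumes that the block closed by the `}` in the hunk is entered only when the aborted flag is not yet set (that guard lies outside the lines shown). Under that assumption, moving the activation inside the block means that only the first reporting process triggers the state, so the job object is released once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a job object; not a PRRTE type. */
typedef struct {
    bool aborted;      /* plays the role of PRTE_JOB_FLAG_ABORTED */
    int  activations;  /* times the "failed to start" state was triggered */
    int  refcount;     /* references still held on the job object */
} toy_job_t;

/* Start with one reference held and nothing reported yet. */
void toy_job_init(toy_job_t *job)
{
    assert(job != NULL);
    job->aborted = false;
    job->activations = 0;
    job->refcount = 1;
}

/* Toy state handler: every activation drops one reference. */
void toy_activate_failed_to_start(toy_job_t *job)
{
    assert(job != NULL);
    job->activations++;
    job->refcount--;
}

/* Called once for each process that reports it failed to start.
 * Only the first report sets the flag and activates the state;
 * later reports fall through without touching the job. */
void toy_report_failed_to_start(toy_job_t *job)
{
    assert(job != NULL);
    if (!job->aborted) {
        job->aborted = true;
        toy_activate_failed_to_start(job);
    }
}
```

Calling `toy_report_failed_to_start` several times on the same job leaves `activations` at 1 and `refcount` at 0. With the activation placed after the guarded block instead, each report would decrement `refcount` again, which is the multiple release the commit avoids.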