Fix a segfault when starting under a debugger by setting the personality field in the orte_job_t. #3323
@lee218llnl This should at least fix the segfault - please check and let us know
blocker to get in to 2.1.1
    orte_schizo_base_active_module_t *mod;

    if (NULL == personality) {
        opal_output(0, "NULL PERSONALITY");
Is this a debugging output that was accidentally left in here? (I see 2 more similar calls to opal_output below, too)
I can make it an error log - I just wanted something to tell us that a condition which should never occur happened so we can fix it.
Ok, thanks.
General question: It looks like these are assert-like things, right? I.e., these should never happen / are only there as a failsafe, right?
yes - would you prefer they be an assert? Might help debugging, I suppose...
Nah -- you can do it however you want. 😄 I only used "assert" to convey the spirit of what I was thinking this was. @bosilca would probably hate us putting an assert() in the middle of the code. 😉
But then again, if this case happens, will returning an error code up the stack make it exit cleanly? Or will it just cause a segv later, anyway?
Updated to just error_log it and return the error. Since we return an error if no module can support that operation, this is no worse than the normal error path.
👍
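For readers following the thread, here is a minimal, self-contained sketch of the pattern agreed on above: the stub checks for a NULL personality, logs the error, and returns it, which is the same kind of outcome as when no module supports the operation. Everything prefixed `sketch_` or `SKETCH_` is hypothetical and stands in for the real ORTE types, return codes, and error-log macro; the personality is also simplified to a single string. This is not the code from the PR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for ORTE return codes; the real values differ. */
#define SKETCH_SUCCESS             0
#define SKETCH_ERR_BAD_PARAM      -1
#define SKETCH_ERR_NOT_SUPPORTED  -2

/* Hypothetical stand-in for an error-log macro: report the code and location. */
#define SKETCH_ERROR_LOG(rc) \
    fprintf(stderr, "error %d at %s:%d\n", (rc), __FILE__, __LINE__)

/* Simplified module: a personality name plus one optional operation. */
typedef struct {
    const char *name;
    int (*setup_app)(const char *app);  /* NULL if the module lacks this op */
} sketch_module_t;

static int sketch_ompi_setup_app(const char *app)
{
    (void)app;  /* nothing to do in this sketch */
    return SKETCH_SUCCESS;
}

static const sketch_module_t sketch_modules[] = {
    { "ompi",  sketch_ompi_setup_app },
    { "other", NULL },
};

/* Stub: dispatch to the module that matches the personality. */
int sketch_schizo_setup_app(const char *personality, const char *app)
{
    size_t i;

    /* Should never happen; log it and fail instead of dereferencing NULL. */
    if (NULL == personality) {
        SKETCH_ERROR_LOG(SKETCH_ERR_BAD_PARAM);
        return SKETCH_ERR_BAD_PARAM;
    }

    for (i = 0; i < sizeof(sketch_modules) / sizeof(sketch_modules[0]); i++) {
        const sketch_module_t *mod = &sketch_modules[i];
        if (0 == strcmp(mod->name, personality) && NULL != mod->setup_app) {
            return mod->setup_app(app);
        }
    }

    /* Normal error path: no module supports this operation. */
    return SKETCH_ERR_NOT_SUPPORTED;
}
```

In this sketch a caller that passes a NULL personality gets an error code back through the same return path as an unsupported operation, so the caller's existing error handling covers both cases without an assert() in the middle of the code.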
@hppritcha Good to go when CI finishes.
…ity field in the orte_job_t. Also, harden the schizo stubs by checking for NULL in that field and returning an error as this should never happen. Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Also, harden the schizo stubs by checking for NULL in that field and returning an error as this should never happen.
Refs #3247
Signed-off-by: Ralph Castain <rhc@open-mpi.org>