Skip to content

Conversation

@rhc54
Copy link
Contributor

@rhc54 rhc54 commented Apr 11, 2017

Also, harden the schizo stubs by checking for NULL in that field and returning an error as this should never happen.

Refs #3247

Signed-off-by: Ralph Castain rhc@open-mpi.org

@rhc54 rhc54 added the bug label Apr 11, 2017
@rhc54 rhc54 added this to the v2.1.1 milestone Apr 11, 2017
@rhc54 rhc54 requested a review from jsquyres April 11, 2017 14:59
@rhc54
Copy link
Contributor Author

rhc54 commented Apr 11, 2017

@lee218llnl This should at least fix the segfault - please check and let us know

@lee218llnl
Copy link

@rhc54 I can confirm that this does appear to fix the seg fault in #3247. The issue in #3243 still persists. Thanks.

@hppritcha
Copy link
Member

blocker to get in to 2.1.1

orte_schizo_base_active_module_t *mod;

if (NULL == personality) {
opal_output(0, "NULL PERSONALITY");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a debugging output that was accidentally left in here? (I see 2 more similar calls to opal_output below, too)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can make it an error log - I just wanted something to tell us that a condition which should never occur happened so we can fix it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, thanks.

General question: It looks like these are assert-like things, right? I.e., these should never happen / are only there as a failsafe, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes - would you prefer they be an assert? Might help debugging, I suppose...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah -- you can do it however you want. 😄 I only used "assert" to convey the spirit of what I was thinking this was. @bosilca would probably hate us putting an assert() in the middle of the code. 😉

But then again, if this case happens, will returning an error code up the stack make it exit cleanly? Or will it just cause a segv later, anyway?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to just error_log it and return the error. Since we return an error if no module can support that operation, this is no worse than the normal error path.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@jsquyres
Copy link
Member

jsquyres commented Apr 21, 2017

@hppritcha Good to go when CI finishes.

…ity field in the orte_job_t. Also, harden the schizo stubs by checking for NULL in that field and returning an error as this should never happen.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
@hppritcha hppritcha merged commit 3a0171d into open-mpi:v2.x Apr 21, 2017
@rhc54 rhc54 deleted the cmr2x/dbgr branch May 31, 2017 14:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants