@ggouaillardet
Contributor

…and cbfunc.buffer

@ggouaillardet
Contributor Author

@rhc54
An rml_send_t created in mca_oob_tcp_recv_handler has both a NULL iov and a NULL cbfunc.buffer, and that caused a crash when ORTE_RML_SEND_COMPLETE was invoked.

I am not sure whether this is the right fix, or whether it fixes a case that should never happen.
FWIW, I was only able to run into this when invoking mpirun on an x86_64 host and running the ibm/dynamic/spawn program on a sparcv9 host (!)
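
To illustrate the kind of protection described above, here is a minimal sketch and not the actual ORTE macro. The type `sketch_send_t`, the function `sketch_send_complete`, and the simplified callback signatures are hypothetical stand-ins that model only the `iov` and `cbfunc.buffer` fields mentioned in this thread. It assumes completion dispatches on `iov` and otherwise falls back to the buffer callback.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified callback types (the real ones take more arguments). */
typedef void (*sketch_iov_cb_t)(int status, void *cbdata);
typedef void (*sketch_buffer_cb_t)(int status, void *cbdata);

/* Hypothetical stand-in for rml_send_t: only the fields discussed here. */
typedef struct {
    void *iov;      /* NULL when the message is not an iovec send */
    int status;
    void *cbdata;
    union {
        sketch_iov_cb_t iov;
        sketch_buffer_cb_t buffer;
    } cbfunc;
} sketch_send_t;

/* Completes a send. Returns 1 if a callback ran, 0 if there was none to run. */
int sketch_send_complete(sketch_send_t *m)
{
    assert(NULL != m);
    if (NULL != m->iov) {
        if (NULL != m->cbfunc.iov) {
            m->cbfunc.iov(m->status, m->cbdata);
            return 1;
        }
    } else if (NULL != m->cbfunc.buffer) {
        /* Guard: without this check, a message with a NULL iov and a
         * NULL buffer callback would call through a NULL pointer. */
        m->cbfunc.buffer(m->status, m->cbdata);
        return 1;
    }
    return 0;
}

/* Example callback: counts how many times it was invoked. */
void sketch_count_cb(int status, void *cbdata)
{
    (void)status;
    ++*(int *)cbdata;
}
```

With a zero-initialized `sketch_send_t` (NULL `iov`, NULL `cbfunc.buffer`), `sketch_send_complete` returns 0 instead of crashing. Once `cbfunc.buffer` is set, the callback runs as before.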

@rhc54
Contributor

rhc54 commented Apr 26, 2016

It shouldn't happen, but that is a very unusual use-case, so maybe it somehow gets invoked in a strange way. Regardless, there is no harm in this protection.

@rhc54 rhc54 merged commit 9511e38 into open-mpi:master Apr 26, 2016
@ggouaillardet
Contributor Author

I think the root cause is that mca_oob_{usock,tcp}_send_handler do not correctly handle zero-size messages.
I think we should test
if (NULL != msg->msg->iov)
instead of
if (NULL != msg->msg->data)
I will test that tomorrow.
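
The send handler itself is not shown in this thread, so the following is only a toy model of why the two tests can differ. It assumes, hypothetically, that the handler picks between a raw-data path and an iovec path, and that a zero-size message has a NULL `data` pointer as well as a NULL `iov`. The names `sketch_msg_t`, `sketch_path_by_data`, and `sketch_path_by_iov` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a queued message: only the fields needed here. */
typedef struct {
    void *iov;     /* non-NULL only for iovec sends */
    char *data;    /* raw bytes; assumed NULL when count == 0 */
    size_t count;  /* number of raw bytes */
} sketch_msg_t;

typedef enum { SKETCH_PATH_DATA, SKETCH_PATH_IOV } sketch_path_t;

/* Test on data: anything without data is treated as an iovec send. */
sketch_path_t sketch_path_by_data(const sketch_msg_t *m)
{
    assert(NULL != m);
    return (NULL != m->data) ? SKETCH_PATH_DATA : SKETCH_PATH_IOV;
}

/* Test on iov: only messages that really carry an iovec take that path. */
sketch_path_t sketch_path_by_iov(const sketch_msg_t *m)
{
    assert(NULL != m);
    return (NULL != m->iov) ? SKETCH_PATH_IOV : SKETCH_PATH_DATA;
}
```

Under these assumptions, a zero-size message (`iov`, `data`, and `count` all zero) is routed to the iovec path by the `data` test even though it has no iovec. The `iov` test sends it down the data path with zero bytes. For non-empty data messages and for real iovec messages the two tests agree.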
