rml/oob: check peer param in send methods #357

Merged 1 commit on Jan 22, 2015

hppritcha
Member

The rml/oob component was not sanity-checking the input peer
parameter in orte_rml_oob_send_nb and orte_rml_oob_send_buffer_nb.
Because there are places in the ompi/orte stack where things like
orte_show_help_norender are called well before ORTE_PROC_MY_HNP, etc.
are set up properly, all kinds of odd startup failures can occur when
rml/oob tries to process send requests whose peer is junk.

Rather than try to expand this kind of thing:

/* if we are the HNP, or the RML has not yet been setup,
 * or ROUTED has not been setup,
 * or we weren't given an HNP, or we are running in standalone
 * mode, then all we can do is process this locally
 */
if (ORTE_PROC_IS_HNP || orte_standalone_operation ||
    NULL == orte_rml.send_buffer_nb ||
    NULL == orte_routed.get_route ||
    NULL == orte_process_info.my_hnp_uri) {
    rc = show_help(filename, topic, output, ORTE_PROC_MY_NAME);
}

do the right thing at the rml level and return an error up front, rather
than eventually failing in the send because the peer is not valid.

@jsquyres please review
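
For readers who want the idea in code, here is a minimal sketch of an up-front peer check in a send entry point. It is not the code from this commit. Every name in it (`demo_name_t`, `demo_name_invalid`, `demo_send_nb`, `DEMO_ERR_BAD_PARAM`, and so on) is a hypothetical stand-in for the corresponding ORTE type, constant, or function, and the "invalid name" sentinel values are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an ORTE process name; not the real definition. */
typedef struct {
    uint32_t jobid;
    uint32_t vpid;
} demo_name_t;

/* Assumed sentinel values marking a name that was never filled in. */
#define DEMO_JOBID_INVALID 0xFFFFFFFFu
#define DEMO_VPID_INVALID  0xFFFFFFFFu

/* Hypothetical return codes. */
enum { DEMO_SUCCESS = 0, DEMO_ERR_BAD_PARAM = -1 };

/* The "invalid name" constant that an uninitialized peer would match. */
const demo_name_t demo_name_invalid = { DEMO_JOBID_INVALID, DEMO_VPID_INVALID };

/* Return 1 when both names have identical fields, 0 otherwise. */
static int demo_name_equal(const demo_name_t *a, const demo_name_t *b)
{
    return a->jobid == b->jobid && a->vpid == b->vpid;
}

/*
 * Hypothetical non-blocking send entry point.
 * The peer is validated before any routing or queuing is attempted, so a
 * caller that runs too early in startup gets an error code back instead of
 * triggering a failure deep inside the send path.
 */
int demo_send_nb(const demo_name_t *peer, const void *buf, size_t len)
{
    if (NULL == peer || demo_name_equal(peer, &demo_name_invalid)) {
        /* cannot send to a missing or invalid peer */
        return DEMO_ERR_BAD_PARAM;
    }

    /* A real implementation would queue the message here. */
    (void)buf;
    (void)len;
    return DEMO_SUCCESS;
}
```

With a check like this, an early caller can test the return value and fall back to local processing, instead of each caller having to enumerate every "not yet set up" condition itself.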

@mellanox-github
Refer to this link for build results (access rights to CI server needed):
http://bgate.mellanox.com/jenkins/job/gh-ompi-master-pr/185/
Test PASSed.

rhc54 pushed a commit that referenced this pull request Jan 22, 2015
rml/oob: check peer param in send methods
@rhc54 rhc54 merged commit ccb0374 into open-mpi:master Jan 22, 2015
@hppritcha hppritcha deleted the topic/oob_send_params_check branch May 15, 2015 03:37
jsquyres pushed a commit to jsquyres/ompi that referenced this pull request Sep 21, 2016
dong0321 pushed a commit to dong0321/ompi that referenced this pull request Feb 19, 2020
Ensure the spawning process always is released