hjelmn commented Oct 21, 2015

This commit fixes the following bugs:

  • On send failure, release the newly allocated message.
  • In the destructor for udcm_message_sent_t, always remove the send
    timeout event from the event base. Failure to do this can lead to
    memory corruption since the destructor may be called from an event
    callback.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
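
The two fixes described above can be illustrated with a minimal, self-contained sketch. This is not the udcm code from this pull request, and it does not use the real event library: every `toy_*` name, the linked-list event base, and the `post` callback are hypothetical stand-ins chosen only to show the pattern. The sketch assumes the timeout event is embedded in the message, so freeing the message while the event is still registered would leave the event base holding a dangling pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for an event and an event base.
 * These are NOT the Open MPI or libevent types. */
typedef struct toy_event {
    struct toy_event *next;
    bool armed;                 /* true while linked into a base */
} toy_event_t;

typedef struct {
    toy_event_t *pending;       /* singly linked list of armed events */
    size_t count;               /* number of armed events */
} toy_event_base_t;

/* Hypothetical analogue of a "sent message" that owns a send timeout. */
typedef struct {
    toy_event_base_t *base;
    toy_event_t timeout;        /* embedded: freed together with the message */
    int payload;
} toy_message_sent_t;

size_t toy_live_messages = 0;   /* allocation counter used to detect leaks */

void toy_event_add(toy_event_base_t *base, toy_event_t *ev)
{
    if (ev->armed) {
        return;
    }
    ev->next = base->pending;
    base->pending = ev;
    ev->armed = true;
    base->count++;
}

/* Safe to call whether or not the event is currently armed. */
void toy_event_del(toy_event_base_t *base, toy_event_t *ev)
{
    if (!ev->armed) {
        return;
    }
    for (toy_event_t **p = &base->pending; *p != NULL; p = &(*p)->next) {
        if (*p == ev) {
            *p = ev->next;      /* unlink from the pending list */
            break;
        }
    }
    ev->next = NULL;
    ev->armed = false;
    base->count--;
}

toy_message_sent_t *toy_message_new(toy_event_base_t *base, int payload)
{
    toy_message_sent_t *msg = calloc(1, sizeof(*msg));
    if (msg == NULL) {
        return NULL;
    }
    msg->base = base;
    msg->payload = payload;
    toy_live_messages++;
    return msg;
}

/* Destructor: always unregister the timeout before freeing, so the
 * base never keeps a pointer into freed memory. */
void toy_message_destroy(toy_message_sent_t *msg)
{
    toy_event_del(msg->base, &msg->timeout);
    toy_live_messages--;
    free(msg);
}

/* Send path: 'post' is a hypothetical transport hook returning 0 on success. */
int toy_send(toy_event_base_t *base, int payload, int (*post)(int),
             toy_message_sent_t **out)
{
    toy_message_sent_t *msg = toy_message_new(base, payload);
    *out = NULL;
    if (msg == NULL) {
        return -1;
    }
    toy_event_add(base, &msg->timeout);
    int rc = post(payload);
    if (rc != 0) {
        toy_message_destroy(msg);   /* on send failure, release the message */
        return rc;
    }
    *out = msg;
    return 0;
}

/* Trivial transport hooks for exercising both paths. */
int toy_post_ok(int payload)   { (void) payload; return 0; }
int toy_post_fail(int payload) { (void) payload; return -1; }
```

Calling `toy_send` with `toy_post_fail` should leave both `toy_live_messages` and the base's `count` at zero, and destroying a successfully sent message should likewise empty the pending list. Putting the unregister step in the destructor rather than at each call site means every release path, including one triggered from inside an event callback, goes through the same cleanup.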

hjelmn commented Oct 21, 2015

@miked-mellanox This fixes a long-standing issue and is targeted for 1.10.1 and 2.0.0. Without this fix, I see random udcm crashes when I hit the CM with lots of requests.

hjelmn commented Oct 21, 2015

I should also add that this is a quick fix for a real issue. I am rewriting the connection part of the openib btl to be truly asynchronous. This is needed to support truly passive one-sided communication.

hjelmn added a commit that referenced this pull request Oct 22, 2015
hjelmn merged commit 0b9a0c2 into open-mpi:master Oct 22, 2015
jsquyres pushed a commit to jsquyres/ompi that referenced this pull request Aug 23, 2016
Fix the 2.0 branch segfaults on finalize - we need to be in the same …