btl/openib: fix rdmacm hang #1251
This commit is an attempt to fix a hang in finalize of rdmacm. This fixes a path where no rdmacm client is found for an endpoint.
Fixes open-mpi/ompi#1829
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
(cherry picked from commit open-mpi/ompi@960fcd2)
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
|
:bot:label:bug |
|
OMPIBot error: User bharatpotnuri is not valid for issue 1251. |
|
@bharatpotnuri Please verify the fix and reply with |
|
bah, the bot pulled the +1 out of that :bot:nolabel:reviewed |
|
:bot:nolabel:pushed-back |
|
Test FAILed. |
|
Jenkins found a bug that looks like it has been in rdmacm for some time. It was hidden by us disabling the openib btl when thread multiple was in use. Should pass Jenkins now. |
|
👍 |
|
Test FAILed. |
|
Test FAILed. |
|
Build Failed with GNU compiler! Please review the log, and get in touch if you have questions. |
|
Build Failed with XL compiler! Please review the log, and get in touch if you have questions. |
|
I don't know why the IBM CI tests are not showing up in the list, but here are the failure logs: |
|
@jjhursey There was a typo that I corrected before the IBM results came back. Probably why they are not showing up. |
|
Test FAILed. |
|
Test FAILed. |
|
Mellanox failure is fixed by #1249. Both together should make Jenkins happy. |
|
Per discussion on the webex today, @hjelmn will separate this into two PRs:
|
|
@jsquyres Removed the threading bug fix. Now only has the regression fix. |
|
@hjelmn Thanks |
|
Test FAILed. |