@bwbarrett
Member

We almost had it. I have great faith we won't need an rc7.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>

@angainor

angainor commented May 3, 2018

@bwbarrett @xinzhao3 @jladd-mlnx I looked at rc6, and it's fine on both our systems when it comes to OSC.

However, on the FDR system there is still a problem with OpenSHMEM, as briefly discussed with @jsquyres in #5094. In 3.1.0 the ucx SPML has the top priority of 21 and is therefore selected by default. This causes an abort on the ConnectX-3, so you might want to lower the ucx priority there as well. I can file a separate bug report if you want, but it's probably not worth it.

$ shmemrun -np 2 ./a.out 
[login-0-0.local:24269] Error spml_ucx.c:293 - mca_spml_ucx_add_procs() ucp_ep_create failed: Destination is unreachable
[login-0-0:24269] *** Process received signal ***
[login-0-0:24269] Signal: Aborted (6)
[login-0-0:24269] Signal code:  (-6)
[login-0-0.local:24265] Error spml_ucx.c:293 - mca_spml_ucx_add_procs() ucp_ep_create failed: Destination is unreachable
[login-0-0:24265] *** Process received signal ***
[login-0-0:24265] Signal: Aborted (6)
[login-0-0:24265] Signal code:  (-6)
[1525339056.999016] [login-0-0:24269:0]         select.c:312  UCX  ERROR no atomic operations on registered memory transport to <no debug data>: Unsupported operation
[1525339056.999016] [login-0-0:24265:0]         select.c:312  UCX  ERROR no atomic operations on registered memory transport to <no debug data>: Unsupported operation

@bwbarrett
Member Author

@angainor Can you file an issue rather than comment on this PR? This PR doesn't really have anything to do with release status, so your comment is likely to get lost.

@bwbarrett bwbarrett merged commit 85653cf into open-mpi:v3.1.x May 3, 2018