@kawashima-fj

@hjelmn Could you review?

Corresponding master PR: #3489.

(cherry picked from commit e453e42)

`ompi_group_t::grp_proc_pointers[i]` may hold a sentinel value even
for a process that resides on the local node. The array for
`MPI_COMM_WORLD` is set up before `MPI_INIT` calls
`ompi_proc_complete_init`, which allocates the `ompi_proc_t` objects
for processes that reside on the local node. So applying
`ompi_proc_is_sentinel` to `ompi_group_t::grp_proc_pointers[i]` to
determine whether the process resides on a remote node is not
appropriate.

This bug sometimes causes an `MPI_ERR_RMA_SHARED` error when
`MPI_WIN_ALLOCATE_SHARED` is called, because the sm OSC component
calls `ompi_group_have_remote_peers`.
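
For readers unfamiliar with the sentinel mechanism, the following is a minimal, self-contained sketch of the failure mode described above. It is not the patch in this PR and does not use Open MPI's real types. `mock_proc_t`, the `mock_*` helpers, the lookup table, and the low-bit tagging scheme are all hypothetical stand-ins, assumed only for illustration. The buggy variant treats every sentinel as a remote peer; the other variant resolves the sentinel to a process object first and only then looks at locality.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ompi_proc_t; NOT the real structure. */
typedef struct {
    int  vpid;           /* rank-like identifier (assumed) */
    bool on_local_node;  /* locality flag (assumed) */
} mock_proc_t;

/* Assumption: a sentinel is a tagged integer stored in a pointer slot,
 * with the low bit set and the identifier in the remaining bits. */
static inline mock_proc_t *mock_make_sentinel(int vpid)
{
    return (mock_proc_t *)(((uintptr_t)vpid << 1) | (uintptr_t)1);
}

static inline bool mock_is_sentinel(const mock_proc_t *p)
{
    return ((uintptr_t)p & (uintptr_t)1) != 0;
}

static inline size_t mock_sentinel_vpid(const mock_proc_t *p)
{
    return (size_t)((uintptr_t)p >> 1);
}

/* Buggy variant: a sentinel is taken to mean "remote", although a
 * local process may still be stored as a sentinel in the group. */
static bool have_remote_peers_buggy(mock_proc_t **procs, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (mock_is_sentinel(procs[i])) {
            return true;
        }
        if (!procs[i]->on_local_node) {
            return true;
        }
    }
    return false;
}

/* Illustrative variant: resolve a sentinel through a (hypothetical)
 * table of process objects before checking locality. */
static bool have_remote_peers_resolved(mock_proc_t **procs, size_t n,
                                       mock_proc_t *table, size_t table_len)
{
    for (size_t i = 0; i < n; ++i) {
        mock_proc_t *p = procs[i];
        if (mock_is_sentinel(p)) {
            size_t vpid = mock_sentinel_vpid(p);
            if (vpid >= table_len) {
                return true;  /* unknown process: assume remote */
            }
            p = &table[vpid];
        }
        if (!p->on_local_node) {
            return true;
        }
    }
    return false;
}
```

With a group of two local processes where the second slot still holds a sentinel, `have_remote_peers_buggy` reports remote peers while `have_remote_peers_resolved` does not, which mirrors the spurious `MPI_ERR_RMA_SHARED` described above.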

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
(cherry picked from commit e453e42)
@kawashima-fj kawashima-fj added this to the v2.0.3 milestone May 16, 2017
@kawashima-fj kawashima-fj requested a review from hjelmn May 16, 2017 02:30
@jsquyres

@bwbarrett @hppritcha Merging this PR means that you also need to merge the v3.0.x equivalent (#3530).

But otherwise: @hppritcha good to go

@jsquyres jsquyres merged commit 7a18858 into open-mpi:v2.0.x May 18, 2017
@kawashima-fj kawashima-fj deleted the pr/v2.0.x/group-remote-peers branch May 22, 2017 04:00