@kawashima-fj
Member

@bosilca Please review.

ompi/mca/pml/bfo is not included in this commit because it is dropped in v2.0.x.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
(back-ported from commit 6510800)

According to MPI-3.1 p.52 and p.53 (cited below), a request
created by `MPI_*_INIT` but not yet started by `MPI_START` or
`MPI_STARTALL` is inactive; therefore, `MPI_WAIT` and its friends
must return immediately if such a request is passed.

The current implementation hangs in `MPI_WAIT` and its friends
in such a case because a persistent request is initialized with
`req_complete = REQUEST_PENDING`. This commit fixes the
initialization.

This commit also fixes the internal requests used in `MPI_PROBE`
and `MPI_IPROBE`, which were wrongly marked as persistent.

MPI-3.1 p.52:

We shall use the following terminology: A null handle is a handle
with value MPI_REQUEST_NULL. A persistent request and the handle
to it are inactive if the request is not associated with any ongoing
communication (see Section 3.9). A handle is active if it is neither
null nor inactive. An empty status is a status which is set to return
tag = MPI_ANY_TAG, source = MPI_ANY_SOURCE, error = MPI_SUCCESS, and
is also internally configured so that calls to MPI_GET_COUNT,
MPI_GET_ELEMENTS, and MPI_GET_ELEMENTS_X return count = 0 and
MPI_TEST_CANCELLED returns false. We set a status variable to empty
when the value returned by it is not significant. Status is set in
this way so as to prevent errors due to accesses of stale information.

MPI-3.1 p.53:

One is allowed to call MPI_WAIT with a null or inactive request
argument. In this case the operation returns immediately with empty
status.
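
To make the cited rule concrete, here is a minimal toy model in C of the
behavior described above. It is a sketch under assumptions, not Open MPI's
code: every `toy_*` / `TOY_*` name is hypothetical, the wait is modeled as a
non-blocking check (returning false where a real `MPI_WAIT` would keep
spinning), and it assumes the fix amounts to initializing a persistent
request as already complete rather than pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not Open MPI's real definitions. */
enum toy_complete { TOY_REQUEST_PENDING, TOY_REQUEST_COMPLETED };
enum { TOY_ANY_SOURCE = -1, TOY_ANY_TAG = -1, TOY_SUCCESS = 0 };

struct toy_status {
    int source;
    int tag;
    int error;
    int count; /* what MPI_GET_COUNT would report */
};

struct toy_request {
    bool persistent;
    enum toy_complete req_complete;
    struct toy_status status;
};

/* The "empty status" of MPI-3.1 p.52. */
struct toy_status toy_empty_status(void)
{
    struct toy_status s = { TOY_ANY_SOURCE, TOY_ANY_TAG, TOY_SUCCESS, 0 };
    return s;
}

/* Models MPI_*_INIT. `initial` is the value stored in req_complete:
 * TOY_REQUEST_PENDING reproduces the hang, TOY_REQUEST_COMPLETED is the
 * assumed fix. */
void toy_persistent_init(struct toy_request *req, enum toy_complete initial)
{
    assert(req != NULL);
    req->persistent = true;
    req->req_complete = initial;
    req->status = toy_empty_status();
}

/* Models MPI_START: the request becomes active, hence pending. */
void toy_start(struct toy_request *req)
{
    assert(req != NULL && req->persistent);
    req->req_complete = TOY_REQUEST_PENDING;
}

/* Models the transfer finishing inside the progress engine. */
void toy_complete_transfer(struct toy_request *req, int source, int tag, int count)
{
    assert(req != NULL);
    req->status.source = source;
    req->status.tag = tag;
    req->status.error = TOY_SUCCESS;
    req->status.count = count;
    req->req_complete = TOY_REQUEST_COMPLETED;
}

/* Models MPI_WAIT without blocking: returns false where a real wait would
 * keep waiting, true (and fills *status) where it would return. */
bool toy_wait(struct toy_request *req, struct toy_status *status)
{
    assert(req != NULL && status != NULL);
    if (req->req_complete != TOY_REQUEST_COMPLETED) {
        return false;
    }
    *status = req->status;
    /* The request is inactive again: it stays "complete" and its status is
     * reset, so a further wait returns at once with an empty status. */
    req->status = toy_empty_status();
    return true;
}
```

In this model, a request set up with `toy_persistent_init(&req, TOY_REQUEST_PENDING)`
and never started makes `toy_wait` report that it would wait forever, which
mirrors the hang. With `TOY_REQUEST_COMPLETED` the same call returns at once
with an empty status, and a started request still waits until its transfer
completes.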

@kawashima-fj kawashima-fj added this to the v2.0.2 milestone Dec 9, 2016
@kawashima-fj kawashima-fj requested a review from bosilca December 9, 2016 06:46
@kawashima-fj
Member Author

bot:ibm:retest

@jjhursey
Member

jjhursey commented Dec 9, 2016

IBM CI failures are due to an unexpected networking issue on the relay system. Please disregard for now.

@hppritcha
Member

bot:mlnx:retest

@jjhursey
Member

jjhursey commented Dec 9, 2016

bot:ibm:retest

@hppritcha
Member

bot:mlnx:retest

@hppritcha
Member

Mellanox CI is unresponsive. Going ahead and merging.

@hppritcha hppritcha merged commit 497ad83 into open-mpi:v2.0.x Dec 12, 2016
@kawashima-fj kawashima-fj deleted the pr/v2.0.x/inactive-persistent-request branch December 12, 2016 03:39