Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consistent use: Section 3.8.4 and 3.9. - p. 79 - Rationale for Start and Cancel #270

Open
wesbland opened this issue Feb 19, 2020 · 6 comments
Assignees
Labels
errata Errata items for the previous MPI Standard mpi-5 For inclusion in the MPI 5.0 standard wg-p2p Point-to-Point Working Group
Projects
Milestone

Comments

@wesbland
Copy link
Member

wesbland commented Feb 19, 2020

Problem

There is a comment in MPI_Cancel about why a pointer to MPI_Request is passed. This same logic applies to MPI_Start, but is not mentioned there. Should it be for consistency (and they could cross-reference each other)?

Edit:
During the virtual meeting of 4th Nov 2020, @RolfRabenseifner correctly pointed out that the request parameter for the MPI_Start procedure is described as INOUT in the language independent specification, which means that the rationale that applies to MPI_Cancel cannot be applied verbatim to MPI_Start.

It is, arguably, an error that the request parameter for the MPI_Start procedure is described as INOUT in the language independent specification.

Suggested Fix

First, change the LIS description to specify the request as IN instead of INOUT.

Then,
copy the rationale from MPI_Cancel to the description of MPI_Start.

References

First attempted fix: https://github.com/mpi-forum/mpi-standard/pull/301 (closed)
Latest attempted fix: https://github.com/mpi-forum/mpi-standard/pull/619

@wesbland wesbland added editor pass wg-p2p Point-to-Point Working Group and removed mpi-4.x labels Feb 19, 2020
@wesbland
Copy link
Member Author

wesbland commented Apr 8, 2020

Probably an errata. Confirmation needed:

  • Editor - 0
  • Chapter committee - 3
  • Errata - 3
  • Full Proposal - 1

@wesbland wesbland added this to To Be Classified in Editor Pass Status Apr 13, 2020
@wesbland wesbland added the Chapter Committee Change Changes to be made by the respective chapter committee(s) label Apr 22, 2020
@wesbland wesbland moved this from To Be Classified to Awaiting Implementation in Editor Pass Status Apr 22, 2020
@wesbland wesbland added this to To Do in MPI 4.0 Ratification Oct 21, 2020
@dholmes-epcc-ed-ac-uk dholmes-epcc-ed-ac-uk moved this from Awaiting Implementation to Awaiting Informal Reading in Editor Pass Status Oct 21, 2020
@dholmes-epcc-ed-ac-uk dholmes-epcc-ed-ac-uk moved this from To Do to In Progress (Chapter Committee Changes) in MPI 4.0 Ratification Oct 21, 2020
@wesbland wesbland added mpi-4.1 For inclusion in the MPI 4.1 standard and removed mpi-4.0 labels Nov 4, 2020
@wesbland wesbland removed this from In Progress (Chapter Committee Changes) in MPI 4.0 Ratification Nov 4, 2020
@dholmes-epcc-ed-ac-uk dholmes-epcc-ed-ac-uk added errata Errata items for the previous MPI Standard and removed Chapter Committee Change Changes to be made by the respective chapter committee(s) editor pass labels Nov 5, 2020
@dholmes-epcc-ed-ac-uk dholmes-epcc-ed-ac-uk removed this from Awaiting Informal Reading in Editor Pass Status Nov 5, 2020
@wesbland wesbland added this to To Do in MPI 4.1 Jun 9, 2021
@wesbland wesbland moved this from To Do to In Progress in MPI 4.1 Jul 21, 2021
@Wee-Free-Scot Wee-Free-Scot added this to the September 2021 milestone Aug 18, 2021
@abouteiller
Copy link
Member

After reviewing the code, we have found that Open MPI actually uses request substituting during start, and has been doing so for many years without any user complaining. See https://github.com/open-mpi/ompi/blob/master/ompi/mca/pml/ob1/pml_ob1_start.c

The scenario where this is used is the following:

  • when starting a SEND, buffered mode sends are immediately complete (from the MPI layer perspective), while they continue to remain active from the MPI engine perspective.
  • This is achieved by substituting the original send request with a new (inactive) copy of the request, so that it can report MPI complete in wait, and be started again immediately if persistent;
  • Meanwhile the original send-request continues to be used internally to track internal completion of message fragments by the engine

Obviously this could be implemented some other way, but this is beside the point. The fox is out of the bag, and has been for many years. This code has been present in Open MPI for at least 5 years (possibly many more but I'm not going to use SVN).

Thus, the proposed change is incompatible with existing state of the practice. It would be a very bad idea to make it an errata, and we should maybe even reconsider altogether under the new information that this has been common practice in one of the major MPI implementation to use that feature for years.

@Wee-Free-Scot
Copy link

Oh.

@Wee-Free-Scot
Copy link

@abouteiller thanks for taking the time to find this implementation example. I agree that this code in Open MPI changes our perspective and restricts the scope of appropriate responses to the discrepancy between the standardised definitions of MPI_Start and MPI_Cancel.

@tonyskjellum
Copy link

tonyskjellum commented Sep 12, 2021 via email

@wesbland wesbland removed this from the September 2021 milestone Sep 13, 2021
@wesbland wesbland added this to the December 2021 milestone Sep 13, 2021
@Wee-Free-Scot
Copy link

@jdinan I'd like to review this issue in the HACC WG before the next voting meeting.

I just asked myself the question "will a replacement request handle need to be re-exported for use on a device?"

MPI_Request req;
MPI_Psend_init(..., &req); // MPI creates request A and returns a handle to it
MPI_Prequest_create(req, &preq); // MPI exports request A for use on a device
loop {
    MPI_Start(&req); // MPI creates request B and returns a handle to it (request A is consumed by MPI, preq is stale)
    :
    <<<kernel code>>>{ MPI_Pready(..., preq); } // error or no-op or UB? use of stale handle to request A
    :
    MPI_Wait(&req); // deadlock because user is waiting for request B to complete but there are no call to MPI_Pready yet
}
MPI_Request_free(&req); // only executes when loop doesn't

I'm thinking the HACC WG needs to get behind the side of this argument that prevents replacement of the request during MPI_Start or needs to point out that the request must be re-exported after every call to MPI_Start.

@Wee-Free-Scot Wee-Free-Scot added mpi-5 For inclusion in the MPI 5.0 standard and removed mpi-4.1 For inclusion in the MPI 4.1 standard labels Jul 20, 2022
@wesbland wesbland removed this from In Progress in MPI 4.1 Oct 19, 2022
@wesbland wesbland added this to To Do in MPI 5.0 via automation Oct 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
errata Errata items for the previous MPI Standard mpi-5 For inclusion in the MPI 5.0 standard wg-p2p Point-to-Point Working Group
Projects
Status: To Do
MPI 5.0
To Do
Development

No branches or pull requests

5 participants