MPI_Request_free bad advice to users #143

Closed

mpiforumbot opened this issue Jul 24, 2016 · 20 comments

Comments


mpiforumbot commented Jul 24, 2016

Originally by erezh on 2009-03-25 12:37:01 -0500


Modify MPI_Request_free advice to users

Background

The MPI_Request_free mechanism was provided for reasons of performance and convenience on the sending side (e.g., the user does not have to remember to free the request later). However, the advice to users is erroneous and conflicts with implementations, leading to bad memory accesses.

Advice to users quote (MPI 2.1, page 55):

*Advice to users.* Once a request is freed by a call to MPI_REQUEST_FREE, it is not possible to check for the successful completion of the associated communication with calls to MPI_WAIT or MPI_TEST. Also, if an error occurs subsequently during the communication, an error code cannot be returned to the user — such an error must be treated as fatal. _Questions arise as to how one knows when the operations have completed when using MPI_REQUEST_FREE. Depending on the program logic, there may be other ways in which the program knows that certain operations have completed and this makes usage of MPI_REQUEST_FREE practical. For example, an active send request could be freed when the logic of the program is such that the receiver sends a reply to the message sent — the arrival of the reply informs the sender that the send has completed and the send buffer can be reused._ An active receive request should never be freed as the receiver will have no way to verify that the receive has completed and the receive buffer can be reused. (End of advice to users.)

The suggestion to reuse (free) the buffer once a reply has arrived seems straightforward; however, it is a naïve one that might lead to an access violation at best or, worse, to data corruption. When zero-copy is being used, the local interconnect resource manager might still be using the send buffer even though a reply message has been received. One alternative that would enable this programming paradigm is to always copy the buffer (or the tail of the buffer) when using MPI_Isend; this would prevent any misuse of the user buffer. Consider the following examples:

*Example 1: TCP interconnect*
Rank 0 on node A sends message x to rank 1 on node B using the following sequence:

```c
/* Arguments other than buffer and request are illustrative. */
MPI_Isend(buffer, count, MPI_BYTE, 1, TAG, MPI_COMM_WORLD, &request);
MPI_Request_free(&request);
```

Upon receiving message x, rank 1 sends message y back to rank 0; when rank 0 receives message y, it frees the buffer with the following sequence:

```c
/* Receive the reply message y; arguments are illustrative. */
MPI_Recv(reply, reply_count, MPI_BYTE, 1, REPLY_TAG, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
free(buffer);
```

This would result in an access violation (segfault), as the TCP stack still tries to touch the buffer after it was freed. It happens because node B, although it sent back message y, did not piggyback the TCP acknowledgment sequence numbers for message x on it. As a result, message y was consumed by the application and the buffer was freed, so if the TCP stack on node A then tries to resend the tail of the buffer, the result is an access violation (or a memory leak, depending on the TCP stack implementation). (Note that node A sent message x using zero-copy.)

*Example 2: TCP interconnect (two connections)*
To make this easier to understand, consider the same problem as above, but now with two TCP connections, each delivering messages in only one direction and carrying the TCP acknowledgments in the other. This setting decouples the reply message from the TCP acknowledgment and makes the previous example easier to comprehend.

*Example 3: RMA interconnect (using RMA write)*
In this case, rank 0 on node A issues its MPI_Isend using an RDMA write. The receiver polls on memory, detects that the write has completed, and sends a reply back. The reply message bypasses the hardware acknowledgment that the write completed successfully. Rank 0 processes the reply and the application frees the memory, which disrupts the DMA on node A and causes the send to fail.
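
For contrast, a minimal sketch of a safer pattern on rank 0, assuming the same reply-based protocol as Example 1: the request is kept and completed with MPI_Wait before the buffer is released. Counts, tags, and peer ranks below are illustrative.

```c
#include <mpi.h>
#include <stdlib.h>

/* Sketch only: counts, tags, and peer ranks are illustrative. */
void send_and_free_safely(char *buffer, int count, char *reply, int reply_count)
{
    MPI_Request request;

    /* Start the send, but keep the request instead of freeing it. */
    MPI_Isend(buffer, count, MPI_BYTE, 1 /* dest */, 0 /* tag */,
              MPI_COMM_WORLD, &request);

    /* Wait for the application-level reply (message y). */
    MPI_Recv(reply, reply_count, MPI_BYTE, 1 /* source */, 1 /* tag */,
             MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    /* Completing the request is what tells the MPI library (and the
       transport underneath it) that the send buffer is no longer needed. */
    MPI_Wait(&request, MPI_STATUS_IGNORE);

    /* Only now is it safe to free or reuse the buffer. */
    free(buffer);
}
```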

Proposal

Remove the advice to users to reuse the buffer once a reply has arrived. There is no safe way to reuse (free) the buffer; overwriting it is somewhat safer.

Specifically, on page 55, lines 12-25, replace the entire advice to users with the following:

*Advice to users.* Once a request is freed by a call to MPI_REQUEST_FREE, it is not possible to check for the successful completion of the associated communication with calls to MPI_WAIT or MPI_TEST. Also, if an error occurs subsequently during the communication, an error code cannot be returned to the user — such an error must be treated as fatal. An active receive request should never be freed as the receiver will have no way to verify that the receive has completed and the receive buffer can be reused. (End of advice to users.)
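
To illustrate the last sentence of that advice, a minimal receive-side sketch of the forbidden pattern (counts, tags, and peer ranks are illustrative); once the active receive request is freed, nothing can tell the caller when the buffer is valid:

```c
#include <mpi.h>

/* Sketch of the pattern the advice forbids; do not use this.
   Counts, tags, and peer ranks are illustrative. */
void broken_receive(char *recv_buffer, int count)
{
    MPI_Request request;

    MPI_Irecv(recv_buffer, count, MPI_BYTE, 0 /* source */, 0 /* tag */,
              MPI_COMM_WORLD, &request);

    /* Freeing the active receive request discards the only handle that
       could report when the data has actually arrived in recv_buffer. */
    MPI_Request_free(&request);

    /* The library may still write into recv_buffer at any later time,
       so there is no correct point at which to read or reuse it. */
}
```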

Impact on existing implementations

None

Impact on applications

None; however, applications are discouraged from freeing an active send request.

Entry for the Change Log

Section 3.7.3 on page 55.
The advice to free an active request was removed in the Advice to users for MPI_REQUEST_FREE.


Originally by htor on 2009-03-29 17:42:44 -0500


Reviewed - ok (there are some non-critical typos like "discureged" and "depricated"). Adding Hubert to CC as he might have an opinion on this.

Best,
Torsten


Originally by erezh on 2009-03-29 19:58:05 -0500


fixing typos


Originally by rsthakur on 2009-03-30 20:58:45 -0500


Looks ok to me. Adding Pavan to the CC list as he may know better.


Originally by balaji on 2009-03-30 21:07:43 -0500


Looks good to me.


Originally by hubertritzdorf on 2009-04-01 14:05:54 -0500


The proposal looks good to me. But I think that the change log must contain the section and page number.

Should the "deprecated" in the change log mean that MPI_Request_free should be removed in future MPI versions? In that case, it should also be mentioned in the corresponding section.


Originally by rsthakur on 2009-04-01 14:15:04 -0500


Maybe it doesn't need a change log because nothing has changed other than deleting a few sentences in the advice to users? The text of the proposal doesn't say anything is deprecated.


Originally by rlgraham on 2009-04-04 17:38:42 -0500


This change is correct, and looks good to me.


Originally by gropp on 2009-04-05 17:56:18 -0500


Reviewed and ok. I added the page/line numbers of the change.


Originally by erezh on 2009-04-06 15:37:14 -0500


have 4 reviewers


Originally by rsthakur on 2009-06-09 11:52:47 -0500


Updated Change Log entry to more accurately reflect the change.


Originally by RolfRabenseifner on 2009-06-12 10:09:18 -0500


Erez and Rajeev,

your change-log entry does not fit the required form.

I propose:

Section 3.7.3 on page 52.
The Advice to Users for MPI_REQUEST_FREE was modified; in particular, the advice about freeing an active request was deleted.


Originally by erezh on 2009-06-12 10:47:27 -0500


updating the log entry per Rolf's comment.


Originally by RolfRabenseifner on 2009-08-03 12:35:14 -0500


Attachment added: change-log_ticket143_item7.pdf (424.5 KiB)


Originally by rlgraham on 2009-08-10 19:03:46 -0500


Changes implemented


Originally by traff on 2009-08-21 09:52:21 -0500


The PDF for this ticket is missing. The change seems to have been implemented correctly.


Originally by RolfRabenseifner on 2009-08-29 18:41:16 -0500


PDF review: okay.


Originally by erezh on 2009-08-29 20:42:43 -0500


I can't find the PDF implementing this ticket.


Originally by hubertritzdorf on 2009-09-01 11:37:36 -0500


PDF Review in master document: OK


Originally by gropp on 2009-09-02 14:33:08 -0500


Reviewed in master copy and ok.


Originally by jsquyres on 2010-09-18 05:08:05 -0500


This ticket is (long-since) complete; marking it resolved/text committed.
