Chap.2 errata: Correcting description of nonblocking #451

Open
mpiforumbot opened this issue Jul 24, 2016 · 21 comments

Comments

@mpiforumbot
Collaborator

mpiforumbot commented Jul 24, 2016

Originally by RolfRabenseifner on 2014-08-13 10:57:48 -0500


Description

The current wording about nonblocking is wrong.
"A request is completed by a call to wait" is wrong for cases such as
one-sided communication, split-collective I/O, or nonblocking probe.
This proposal tries to resolve the problem of the wrong description.

In other locations of the standard it is important that the wording
"all nonblocking operations" can be used instead of listing the special cases.
For the current and future members of the MPI forum, for the implementors
and the users, it is helpful to find a list of the different types of
nonblocking routines.

Therefore, the wrong wording is resolved by describing nonblocking in
its different ways that exist within the MPI standard.

Level is errata quality.

History

This small inconsistency was overlooked when finishing MPI-3.0.
Ticket #422 was split into its parts. This ticket is part B of #422.

Extended Scope

None. (No need to add these changes to the errata.)

Proposed Solution

MPI-3.0 page 11 lines 25-35 read:

nonblocking
A procedure is nonblocking if the procedure may return before the operation
completes, and before the user is allowed to reuse resources (such as buffers)
specified in the call. A nonblocking request is started by the call that initiates it,
e.g., MPI_ISEND.
The word complete is used with respect to operations, requests,
and communications.
An operation completes when the user is allowed to reuse
resources, and any output buffers have been updated;
i.e., a call to MPI_TEST will
return flag = true. A request is completed by a call to wait, which returns, or
a test or get status call which returns flag = true. This completing call has two effects:
the status is extracted from the request; in the case of test and wait, if the
request was nonpersistent, it is freed, and becomes inactive if it was persistent.
A communication completes when all participating operations complete.

but should read:

nonblocking
A procedure is nonblocking if it may return before the associated operation
completes, and before the user is allowed to reuse resources (such as buffers)
specified in the call.
The word complete is used with respect to operations and any associated requests
and/or communications.
An operation completes when the user is allowed to reuse
resources, and any output buffers have been updated.
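
To make the proposed definition concrete, here is a toy model in plain Python. It is not MPI code and does not use any MPI binding; `ToyRequest`, `toy_isend`, `mailbox`, and `release` are hypothetical names invented for this sketch. It only illustrates the two properties the definition names: the procedure may return before the associated operation completes, and the user's buffer may be reused only after completion.

```python
import threading


class ToyRequest:
    """Hypothetical stand-in for a request handle; not an MPI object."""

    def __init__(self, work):
        self._done = threading.Event()
        self._thread = threading.Thread(target=self._run, args=(work,))
        self._thread.start()

    def _run(self, work):
        work()
        self._done.set()  # operation complete: resources may be reused

    def test(self):
        """Return True once the operation has completed; never blocks."""
        return self._done.is_set()

    def wait(self):
        """Block until the operation has completed."""
        self._done.wait()
        self._thread.join()


def toy_isend(buffer, mailbox, release):
    """Nonblocking: returns a request before `buffer` has been read."""

    def work():
        release.wait()          # simulate a transfer that has not started yet
        mailbox.extend(buffer)  # the operation still reads the user's buffer here

    return ToyRequest(work)


buffer = [1, 2, 3]
mailbox = []
release = threading.Event()

req = toy_isend(buffer, mailbox, release)  # returns immediately
completed_at_return = req.test()           # operation is still pending here
# Modifying `buffer` at this point would be an error: it has not been read yet.
release.set()                              # let the simulated transfer proceed
req.wait()                                 # completion: buffer may be reused
buffer[:] = [0, 0, 0]                      # safe now; mailbox keeps the old values
```

The `release` event exists only to make the ordering deterministic in this sketch; in a real nonblocking call the user has no such control and must treat the buffer as in use until completion is reported.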

Remark: The following textblock shows the same solution, but all changes are highlighted:

nonblocking
A procedure is nonblocking if ~~the procedure~~ __it__ may return before the __associated__ operation
completes, and before the user is allowed to reuse resources (such as buffers)
specified in the call. ~~A nonblocking request is started by the call that initiates it,
e.g., MPI_ISEND.~~

The word complete is used with respect to operations~~,~~ __and any associated__ requests~~,~~
and__/or__ communications.
An operation completes when the user is allowed to reuse
resources, and any output buffers have been updated~~;
i.e., a call to MPI_TEST will
return flag = true. A request is completed by a call to wait, which returns, or
a test or get status call which returns flag = true. This completing call has two effects:
the status is extracted from the request; in the case of test and wait, if the
request was nonpersistent, it is freed, and becomes inactive if it was persistent.
A communication completes when all participating operations complete~~.

Alternative Solutions

None.

Impact on Implementations

None required.

Impact on Applications / Users

None.

Entry for the Change Log

None.

Voting category

Single-vote in category MPI-3.0-errata.

Although this is an MPI-3.0 erratum, i.e., a correction of an inconsistency
in the MPI-3.0 document, it is not necessary to publish this item in
the MPI-3.0 errata document due to the low priority of this item.
It is enough to have it in the MPI-3.1 document.

@mpiforumbot
Collaborator Author

Originally by RolfRabenseifner on 2014-08-13 18:32:34 -0500


Formatting corrected.

@mpiforumbot
Collaborator Author

Originally by gropp on 2014-08-14 09:44:37 -0500


There is no need to describe 4 categories of nonblocking routines here - it just makes it harder to maintain the document and the description itself can be misleading. For example, into which of those 4 groups does MPI_Rput belong?

A better fix might be to separate out the text that talks about requests and move that somewhere else, since it isn't relevant to this discussion of nonblocking. That will need to be done in a way that is consistent with the chapter, and should not be done with a tiny snippet of text with no context.

@mpiforumbot
Collaborator Author

Originally by RolfRabenseifner on 2014-08-14 10:29:49 -0500


Replying to gropp:

There is no need to describe 4 categories of nonblocking routines here - it just makes it harder to maintain the document and the description itself can be misleading. For example, into which of those 4 groups does MPI_Rput belong?

I expect that MPI_Rput is a request based routine, isn't it?

For me, the existing text does not define "nonblocking" in a way that it is general enough to be correct for all 4 types of nonblocking methods in MPI.

A better fix might be to separate out the text that talks about requests and move that somewhere else, since it isn't relevant to this discussion of nonblocking. That will need to be done in a way that is consistent with the chapter,

Please, can you make an alternative proposal - I was not able to do it in such a simple & perfect & consistent & general way. You may include it in the alternative solution section
and remove my comment there.

and should not be done with a tiny snippet of text with no context.

The context of my changes is always whole MPI-3.0.
I copied the whole nonblocking section. I did not want to modify other chapters.
Your alternative solution may span more than one section.

For example, the standard has no wording about the nonblocking semantics
of MPI_IMPROBE (or I did not find it), except that the flag may imply that the routine should not wait until the source processes call MPI_Finalize before reporting flag=0, i.e., that there is no message.

@mpiforumbot
Collaborator Author

Originally by jdinan on 2014-09-05 10:32:50 -0500


", i.e., locally and remote finished." should be removed from the RMA text. This is not guaranteed by all synchronization operations (e.g. MPI_WIN_FLUSH_LOCAL).

@mpiforumbot
Collaborator Author

Originally by jdinan on 2014-09-05 10:37:48 -0500


It seems like a little bit of a stretch to state that Iprobe and Improbe are procedures that "may return before the operation completes". Is the idea that one has to call these routines repeatedly in order to "complete" the equivalent blocking operation? If that is the case, we could add a sentence to IPROBE/IMPROBE like: "Repeated calls must be made to these functions in order to achieve the same functionality as their blocking counterparts."
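
The distinction can be sketched with a toy model in plain Python. This is not MPI code; `toy_iprobe`, `toy_probe_by_polling`, `inbox`, and `progress` are hypothetical names made up for illustration. Each call to the immediate variant is already finished when it returns, so no operation is left pending; only a loop of repeated calls reproduces the behavior of a blocking probe.

```python
from collections import deque


def toy_iprobe(inbox):
    """Return (flag, envelope) at once; flag is False when nothing is pending."""
    if inbox:
        return True, inbox[0]  # peek only: a probe does not receive the message
    return False, None


def toy_probe_by_polling(inbox, progress):
    """Emulate a blocking probe by calling the immediate variant repeatedly."""
    calls = 0
    while True:
        calls += 1
        flag, envelope = toy_iprobe(inbox)
        if flag:
            return envelope, calls
        progress()  # stand-in for other work during which a message may arrive


inbox = deque()
arrivals = iter([None, None, "msg from rank 1"])  # simulated arrival schedule


def progress():
    item = next(arrivals)
    if item is not None:
        inbox.append(item)


first_flag, _ = toy_iprobe(inbox)  # returns immediately with nothing pending
envelope, calls = toy_probe_by_polling(inbox, progress)
```

In this sketch no resource stays in use after `toy_iprobe` returns, which is the sense in which such a routine is "not blocking" without being nonblocking under the definition discussed in this ticket.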

@mpiforumbot
Collaborator Author

Originally by jsquyres on 2014-09-05 11:03:17 -0500


I agree that the current text is not correct, but I am against enumerating all the possibilities here. This seems like the wrong place for a forward reference to every single possible non-blocking operation that will be described in great detail later.

@mpiforumbot
Collaborator Author

Originally by RolfRabenseifner on 2014-09-06 00:56:45 -0500


Replying to jdinan:

", i.e., locally and remote finished." should be removed from the RMA text. This is not guaranteed by all synchronization operations (e.g. MPI_WIN_FLUSH_LOCAL).

Yes, I agree and removed these words from the RMA text (item 2 of the list).

@mpiforumbot
Collaborator Author

Originally by RolfRabenseifner on 2014-09-06 01:13:30 -0500


Replying to jdinan:

It seems like a little bit of a stretch to state that Iprobe and Improbe are procedures that "may return before the operation completes". ...

Yes, you are right. I removed Iprobe and Improbe from the list and put them in a note:

Note that MPI_IPROBE and MPI_IMPROBE return immediately, i.e., they are not blocking,
but these routines are not nonblocking routines in the sense of this definition.

@mpiforumbot
Collaborator Author

Originally by RolfRabenseifner on 2014-09-06 01:32:09 -0500


Replying to jsquyres:

I agree that the current text is not correct, but I am against enumerating all the possibilities here. This seems like the wrong place for a forward reference to every single possible non-blocking operation that will be described in great detail later.

The problem is that often in the standard, we need to refer to "nonblocking routines".
Should this mean that the reader of the standard always scans the whole standard to find all locations where a routine has the characteristics of being nonblocking? Often this information is hidden, i.e., without "I" in the name. In case of Iprobe, the I may give a wrong impression. All these nonblocking routines are allowed to modify the application's data while the application is running. We had a lot of Fortran trouble therefore.

A wrong or incomplete definition of nonblocking does not help in these cases. Forum members and implementors must be aware of all these locations before they add new functionality (e.g. with fault tolerance) that has a special relation to nonblocking routines. The Fortran subcommittee has overlooked several locations and needed several errata to fix it later.

I did not want to remove the existing text (for backward compatibility reasons).
Therefore the only way of correcting was

  • to correct the existing text, and
  • to add similar quality for the other two areas, and
  • to show the difference with Iprobe and Improbe.

@mpiforumbot
Collaborator Author

Originally by gropp on 2014-09-06 03:36:29 -0500


In response to

The problem is that often in the standard, we need to refer to
"nonblocking routines".
Should this mean that the reader of the standard always scans the whole
standard to find all locations where a routine has the characteristics of
being nonblocking?

The answer is yes. This document is a standard, and the reader is expected to be familiar with the entire document. Further, for most users, the question is not "what other nonblocking routines are there" but "what does nonblocking mean?". Finally, scanning the electronic form of the document for nonblocking is easy.

@mpiforumbot
Collaborator Author

Originally by RolfRabenseifner on 2014-09-06 08:26:37 -0500


Replying to jsquyres:

I agree that the current text is not correct, but I am against enumerating all the possibilities here. This seems like the wrong place for a forward reference to every single possible non-blocking operation that will be described in great detail later.

and replying to gropp:

In response to

The problem is that often in the standard, we need to refer to
"nonblocking routines".
Should this mean that the reader of the standard always scans the whole
standard to find all locations where a routine has the characteristics of
being nonblocking?

The answer is yes. This document is a standard, and the reader is expected to be familiar with the entire document. Further, for most users, the question is not "what other nonblocking routines are there" but "what does nonblocking mean?". Finally, scanning the electronic form of the document for nonblocking is easy.

Yes, scanning the more than 320 locations is easy, but the scanning cannot catch that the contrary of blocking and nonblocking are not identical.

Yes, the current text is not correct. I hope to see a proposal that is

  • backward compatible with the correct information of the current text
  • and totally correct.

I would hope that this proposal also may tell in a condensed form that the 320 locations of nonblocking are mainly in nonblocking communication (pt-to-pt and collective) and handling (Idup) routines with persistent and nonpersistent request handles, RMA, and split-collective MPI-I/O. The text may also clarify that it is not valid for the not blocking versions of probe.

You may use the section Alternative Solutions.

@mpiforumbot
Collaborator Author

Originally by gropp on 2014-09-06 18:20:15 -0500


You are missing the point, Rolf. We agree that the statement needs to be corrected, but several (most?) of us strongly disagree that it should attempt to enumerate all routines to which it applies. I still see no value in doing that, and much unnecessary added overhead in maintaining the document.

@mpiforumbot
Collaborator Author

Originally by gropp on 2014-09-07 16:42:39 -0500


Here is my proposed alternative solution. This eliminates the enumeration of nonblocking types, which caused the problem in the first place (i.e., the original definition was correct for MPI-1, but became out of date as more nonblocking routines were added to MPI). This is a clear case where the correct fix is to remove the unnecessary parts of the definition that can go out of date as MPI evolves.

Replace the current description with

nonblocking

    A procedure is nonblocking if the procedure may return before the operation completes, and before the user is allowed to reuse resources (such as buffers) specified in the call. A nonblocking request is started by the call that initiates it. The word complete is used with respect to operations, requests, and communications. An operation completes when the user is allowed to reuse resources, and any output buffers have been updated.

@mpiforumbot
Collaborator Author

Originally by RolfRabenseifner on 2014-09-08 02:48:37 -0500


Bill Gropp's proposal included as Solution A. The existing text is now Solution B.

@mpiforumbot
Collaborator Author

Originally by schulzm on 2014-09-15 00:39:48 -0500


Discussion at MPI Forum Meeting in Kobe, Sep. 2014

Majority prefers something in the direction of solution A

Text variant suggested:

A procedure is nonblocking if it may return before the associated operation completes, and before the user is allowed to reuse resources (such as buffers) specified in the call. The word complete is used with respect to operations and any associated requests and/or communications. An operation completes when the user is allowed to reuse resources, and any output buffers have been updated.

Or even shorter to get rid of redundancy (preference of the forum at the end of the discussion):

A procedure is nonblocking if it may return before the associated operation completes, and before the user is allowed to reuse resources (such as buffers) specified in the call.

Reason: requests were only mentioned in one sentence, which stood alone

@mpiforumbot
Collaborator Author

Originally by RolfRabenseifner on 2014-09-16 11:42:02 -0500


A procedure is nonblocking if it may return before the associated operation completes, and before the user is allowed to reuse resources (such as buffers) specified in the call. The word complete is used with respect to operations and any associated requests and/or communications. An operation completes when the user is allowed to reuse resources, and any output buffers have been updated.
I took this version because this is an erratum intended to remove and correct the wrong parts and to keep the correct parts.

@mpiforumbot
Collaborator Author

Originally by RolfRabenseifner on 2014-09-16 11:43:39 -0500


Additional typo correction

@mpiforumbot
Collaborator Author

Originally by RolfRabenseifner on 2014-11-29 11:18:59 -0600


To show all the changes, I added a textblock with all changes highlighted.

@mpiforumbot
Collaborator Author

Originally by RolfRabenseifner on 2015-02-04 08:54:55 -0600


Attachment added: ticket451_diff_r1921.txt (1.7 KiB)
For pdf review

@mpiforumbot
Collaborator Author

Originally by RolfRabenseifner on 2015-02-04 08:55:19 -0600


Attachment added: ticket451_terms-2.pdf (149.1 KiB)
For pdf review

@mpiforumbot
Collaborator Author

Originally by RolfRabenseifner on 2015-02-04 08:56:03 -0600


svn revision 1921, see attached files.
