Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify semantics of one-sided semantics when changing synchronization mode #37

Closed
mpiforumbot opened this issue Jul 24, 2016 · 33 comments

Comments

@mpiforumbot
Copy link
Collaborator

mpiforumbot commented Jul 24, 2016

Originally by traff on 2008-10-14 06:54:19 -0500


Description

The specification in the one-sided semantics of when updates become visible in private and public copies of a window is complicated, and it is not entirely clear how they allow to switch between the various active and passive synchronization modes.

History

Goes back to 2003, see

http://www.cs.uiuc.edu/homes/wgropp/projects/parallel/MPI/mpi-errata/discuss/waitpost/

Proposed Solution

Add to the standard the following statement.

p. 352, line 22 (before (End of advice to users.):

"The RMA synchronization operations define when updates are guaranteed to become visible in public and private windows.
Updates may become visible earlier, but such behavior is implementation dependent."

p.352, after line 23, add:

"The semantics are illustrated by the following examples:

Example 1 (rule 5).

#!c
Process A:              Process B:
                        window location X

                        MPI_Win_lock(EXCLUSIVE,B)
                        store X /* local update to private copy of B */
                        MPI_Win_unlock(B) 
                        /* now visible in public window copy */

MPI_Barrier             MPI_Barrier

MPI_Win_lock(EXCLUSIVE,B)
MPI_Get(X) /* ok, read from public window */
MPI_Win_unlock(B)

Example 2 (rule 6).

#!c
Process A:              Process B:
                        window location X

MPI_Win_lock(EXCLUSIVE,B)
MPI_Put(X) /* update to public window */
MPI_Win_unlock(B)

MPI_Barrier             MPI_Barrier

                        MPI_Win_lock(EXCLUSIVE,B) 
                        /* now visible in private copy of B */
                        load X
                        MPI_Win_unlock(B)

Note that the private copy of X has not necessarily been updated
after the barrier, so omitting the lock-unlock at process B may lead to
the load returning an obsolete value.

Example 3.
The rules do not guarantee that process A in the following sequence will
see the value of X as updated by the local store by B before the lock.

#!c
Process A:              Process B:
                        window location X

                        store X /* update to private copy of B */
                        MPI_Win_lock(SHARED,B)
MPI_Barrier             MPI_Barrier

MPI_Win_lock(SHARED,B)
MPI_Get(X) /* X may not be in public window copy */
MPI_Win_unlock(B)
                        MPI_Win_unlock(B) 
                        /* update on X now visible in public window */

Example 4.
In the following sequence

#!c
Process A:              Process B:
window location X
window location Y

store Y
MPI_Win_post(A,B) /* Y visible in public window */
MPI_Win_start(A)        MPI_Win_start(A)

store X /* update to private window */

MPI_Win_complete        MPI_Win_complete
MPI_Win_wait
/* update on X may not yet visible in public window */

MPI_Barrier             MPI_Barrier

                        MPI_Win_lock(EXCLUSIVE,A)
                        MPI_Get(X) /* may return an obsolete value */ 
                        MPI_Get(Y)
                        MPI_Win_unlock(A)

it is not guaranteed that process B reads the value of X as per the local
update by process A, because neither MPI_Win_wait nor
MPI_Win_complete calls by process A ensure visibility in the public window copy. To allow B to read the value of X stored by A the local store must be replaced by a local MPI_Put that updates the public window copy. Note that by this replacement X
may become visible in the private copy in process memory of A
only after the MPI_Win_wait call in process A.
The update on Y made before the MPI_Win_post
call is visible in the public window after the MPI_Win_post call and therefore correctly gotten
by process B. The MPI_Get(Y) call could be moved to the epoch started by the MPI_Win_start operation, and process B would still get the value stored by A.

Example 5.
Finally, in the following sequence

#!c
Process A:              Process B:
                        window location X

MPI_Win_lock(EXCLUSIVE,B)
MPI_Put(X) /* update to public window */
MPI_Win_unlock(B)

MPI_Barrier             MPI_Barrier

                        MPI_Win_post(B)
                        MPI_Win_start(B)

                        load X /* access to private window */
                               /* may return an obsolete value */

                        MPI_Win_complete
                        MPI_Win_wait

rules (5,6) do not guarantee that the private copy of X at B has been
updated before the load takes place. To ensure that the value
put by process A is read, the local load must be replaced with a local MPI_Get operation, or must be
placed after the call to MPI_Win_wait.

Impact on Implementations

None. It is only a clarification

@mpiforumbot
Copy link
Collaborator Author

Originally by rsthakur on 2008-12-08 13:41:17 -0600


It would be good to have these examples in the text. These things are not obvious from simply reading the rules.

@mpiforumbot
Copy link
Collaborator Author

Originally by rsthakur on 2009-02-03 16:44:34 -0600


Looks good to me.

@mpiforumbot
Copy link
Collaborator Author

Originally by traff on 2009-02-10 08:19:49 -0600


Reviewers - reviews are needed!! I imagine some discussion may still be needed on this

Jesper

@mpiforumbot
Copy link
Collaborator Author

Originally by tipparajuv on 2009-02-10 11:41:43 -0600


would it make sense to merge this with advise to users on p.351

@mpiforumbot
Copy link
Collaborator Author

Originally by brbarret on 2009-02-10 16:43:02 -0600


looks good to me.

@mpiforumbot
Copy link
Collaborator Author

Originally by traff on 2009-03-04 03:15:50 -0600


Reviewer's (Rolf, Bill): please think about this, and add a review statement.

To Vinod: the examples come right after the advice to user's, so I see no need to merge, I
think this would also not be in the style of the MPI document.

Jesper

@mpiforumbot
Copy link
Collaborator Author

Originally by RolfRabenseifner on 2009-04-04 06:08:44 -0500


  1. Your sentence is clearly wrong. MPI_PUT after a barrier-like fence implementation is allowed to update private copies everywhere, but it need not!

I propose that you substitute

Add to the standard the following statement. 

{{{}}}p. 350, after line 47: 

"Only the RMA synchronization operations have an effect on when updates become 
visible in public and private windows." 

Possibly add the following examples to the standard, after p.352, line 22: 

by

Add a new paragraph before the *(End of advice to users)* on p350.22:

    Only the RMA synchronization operations define when updates 
    become necessarily visible in public and private windows.
    Whether updates are earlier visible is implementation dependent.

Add the following examples to the standard, after p352.23: 

Reasons:

  • You may never get 4 reviews for a new text in the official part of 1-sided.
    It is okay to have such new text in an advice to users.
  • The new text is a good intro for the following examples.
  1. Examples 1 and 2:[[BR]]
    • Please substitute 4x lock(B) by lock(EXCLUSIVE, B)
    • Please make linebreaks in all examples that no line is longer than
      the "store X /* ... */" line.
      This would be very helpful because the examples are then printable with
      the Internet Explorer.
  2. Example 3:[[BR]]
    • Please substitute "(B, SHARED)" by "(SHARED, B)

    • Please add on the MPI_Get(X) line the comment:

      /* wrong, i.e., this call may not get the value that was stored in process B */

  3. Examples 4+5 are wrong.
    • See MPI-2.1 p340.29-30
    • If A makes a Win_Post with a group (A,B) then A and B must call
      Win_start(A) and Win_complete.

@mpiforumbot
Copy link
Collaborator Author

Originally by gropp on 2009-04-05 16:57:40 -0500


The key point here is that the updates happen at unlock/fence for those two modes but the "scalable sync" splits the update to post and wait. This should be described first, then (fixed) examples can illustrate the point (these are rules 5 and 6 on page 350).

@mpiforumbot
Copy link
Collaborator Author

Originally by traff on 2009-04-06 03:51:04 -0500


Rolf, thanks for the comments. I've tried to take them into account.

I hope the ticket can be rereviewed, and still be in the MPI 2.2 process

@mpiforumbot
Copy link
Collaborator Author

Originally by traff on 2009-04-07 02:43:09 -0500


Reviewers - please have a look at this, last chance (if not already passed) for MPI 2.2

Jesper

@mpiforumbot
Copy link
Collaborator Author

Originally by rsthakur on 2009-04-07 16:51:37 -0500


In this proposed text:

"The RMA synchronization operations define when updates must have become visible in public and private windows. Updates may become visible earlier, but such behavior is implementation dependent."

I would suggest a slight rephrase of the first sentence to
"The RMA synchronization operations define when updates are guaranteed to become visible in public and private windows."

In Example 4, in the sentence "To allow B to read the value of X stored by A the local store must be replaced by an MPI_Put that updates the public window copy." I would replace "an MPI_Put" by "a local MPI_Put" just to avoid any confusion.

Similarly, in Example 5, in the sentence "To ensure that the value put by process A is read, the local load must be replaced with an MPI_Get operation." I would replace "an MPI_Get" with "a local MPI_Get" just to avoid any confusion that you get it remotely.

@mpiforumbot
Copy link
Collaborator Author

Originally by RolfRabenseifner on 2009-04-08 09:26:51 -0500


First about Rajeevs review: I'm in favor for these changes.

My review:

  • text-addition on p352.22: okay.

  • example 1: okay

  • example 2: okay

  • example 3: okay

  • example 4: The example is now okay, but it would be good
    if the following minor changes would be done:

    Example codes, that are intended to be wrong, should indicate
    this clearly within the example itself to prohibit
    copying of the example before reading the surrounding description.
    Therefore, I propose to add on the MPI_Get(X) line the following
    comment:[[BR]]
    MPI_Get(X) */* may return an obsolete value /

    In the text, after the sentence with the proposal to replace
    "store X" by "MPI_Put", I propose to comment also the local
    behavior of this change. Please add therefore after this sentence
    and before "The update on Y...":[[BR]]
    '''Such a MPI_Put(X) will become visible also in the process memory of A
    after the MPI_Win_wait in process A.'''

    With this, it is clear, that the code modification does not
    only solve the problem in B, but also does not make the code
    incorrect in A.

    Additionally, I have a minor editing proposal:
    The one group argument in MPI_Win_post should be indicated
    not by two rank arguments. It would be better to write[[BR]]
    MPI_Win_post((A,B),...)[[BR]]

    I made all changes bold for better reading of this review.
    In the solution text, they should not be bold.

  • example 5: not yet fully reviewed

    Same with MPI_Win_post(B,A,...). Additionally,
    the sequence in the group is
    irrelevant and therefore misleading
    (reader asks him/herself, why must I invert the sequence)
    Therefore, it should read[[BR]]
    MPI_Win_post((A,B),...)

Sentences about fence instead of barrier:

  • Example 4: Okay.

  • Example 5: Okay.

    Please add after the sentence
    "In examples 4 and 5, replacing the barrier with a fence would lead
    to the desired behavior.":

    '''In this case all synchronization after the fence can be removed.
    If the examples would be additonally started with a fence,
    and the barrier is substituted by a fence,
    then also the synchronizations between the two fences could be
    removed.'''

  • Example 2 - please change to

    In example 2, a fence instead of the barrier would alleviate the
    need for lock-unlock at process B to access the process local
    location X''', and together with an additional fence at the beginning,
    both lock-unlock can be removed'''.

  • Example 3: The sentence that the fence is forbidden, seems
    to be okay based on p333.35.

REVIEW on EXAMPLE 5: I try to do it later.

@mpiforumbot
Copy link
Collaborator Author

Originally by RolfRabenseifner on 2009-04-08 11:00:41 -0500


Decision at the meeting: I should add the review comments now, to get it to the first Reading before end of the meeting. --> Write-toke at me.

@mpiforumbot
Copy link
Collaborator Author

Originally by RolfRabenseifner on 2009-04-08 11:10:15 -0500


I included last two reviews.

@mpiforumbot
Copy link
Collaborator Author

Originally by RolfRabenseifner on 2009-04-24 15:38:41 -0500


Late review for example 5:

  1. In process A, the Win_start and Win_complete is not necessary.
    Therefore, it is missleading for the reader, because he/she may believe
    that these calls are necessary for the corrected case, i.e., for the case that
    "load X" is substituted by "MPI_Get(X)" in Process B.

Because the example is still correct (and only a little bit misleading),
I propose to remove these calls, together with the "A" in the group-argument
of MPI_Win_post.

I would treat this modification as a minor change, provided ticket author and
reviewers (Jesper Traeff, Rajeev Thakur and Bill Gropp)
agree with this modification.

  1. There are two ways to heal the example:

    • Already mentioned: Tosubstitute "load X" by "MPI_Get(X)"
    • To move "load X" after MPI_Win_Wait (based on rule 6).

    Therefore, I would add

    "or must be moved after the call to MPI_Win_wait"

This is again a minor modification provided ticket author and
reviewers (Jesper Traeff, Rajeev Thakur and Bill Gropp)
agree with this modification.

The modified example would be:

Example 5.
Finally, in the following sequence

#!c
Process A:              Process B:
                        location X

MPI_Win_lock(EXCLUSIVE,B)
MPI_Put(X) /* update to public window */
MPI_Win_unlock(B)

MPI_Barrier             MPI_Barrier

                        MPI_Win_post((B),...)
                        MPI_Win_start(B)

                        load X /* access to private window */
                               /* may return an obsolete value */

                        MPI_Win_complete
                        MPI_Win_wait

rules (5,6) do not guarantee that the private copy of X at B has been
updated before the load takes place. To ensure that the value
put by process A is read, the local load must be replaced with a local
MPI_Get operation or must be moved after the call to MPI_Win_wait.

@mpiforumbot
Copy link
Collaborator Author

Originally by gropp on 2009-04-26 21:21:09 -0500


I agree with Rolf's minor change to example 5. (I note that the start/complete are still present in the example - is that what you intended?)

@mpiforumbot
Copy link
Collaborator Author

Originally by traff on 2009-05-07 13:52:56 -0500


I have done the changes to example 5 suggested by Rolf, and also reviewed the changes introduced during the last forum meeting.

  • there were cases where the right (typewriter) font was not used
  • for clarity I have changed "location X" to "window location X" throughout
  • in Example 1 a comment said "private copy of A" which should be "private copy of B"
  • I took out the ... from the MPI_Win_start(,...) calls; this was not used consistently. Now the calls only give the ranks, therefore I think also the extra brackets in (A,B) can also safely be eliminated; this I did
  • Bill, in Example 5 process B posts the window, therefore process B must also start and complete it
  • I changed a little bit the wording for example 4 and introduced one clarifying sentence
  • I changed a little bit the wording of the text after the examples (I think this was mostly not my text), and hope it is clearer now

The spirit of the ticket has not been changed, therefore I think the status of the ticket can still count as "has 1st reading"

Reviewers, please have a look; I hope you agree - such that this can progress for 2.2

Best,

Jesper

@mpiforumbot
Copy link
Collaborator Author

Originally by rsthakur on 2009-05-07 14:18:17 -0500


I don't think this sentence is right because fence is not synchronizing

In examples 4 and 5, replacing the barrier with a fence operation would lead to the desired behavior.

In example 4, the fence on process B (in place of barrier) could return even before store X was called on Process A. Similar situation in Example 5.

@mpiforumbot
Copy link
Collaborator Author

Originally by traff on 2009-05-07 14:33:52 -0500


Actually, I'd be most happy with removing the whole last paragraph, which I think might create more confusion than clarity.

And, yes, you're right (unless there have been some previous fence calls, etc.)

So - shall I drop this whole discussion?

Jesper

@mpiforumbot
Copy link
Collaborator Author

Originally by rsthakur on 2009-05-07 14:36:27 -0500


Yes, I have been uneasy about that paragraph for a while. I wouldn't mind if it is not there.

@mpiforumbot
Copy link
Collaborator Author

Originally by traff on 2009-05-07 14:46:56 -0500


Let's wait a few days and see what the others say. But I'd also be for dropping this paragraph.

@mpiforumbot
Copy link
Collaborator Author

Originally by traff on 2009-06-01 17:18:14 -0500


Following Rajeevs comment and suggestion, I have removed the following, last paragraph from the proposal. It is partly wrong, and potentially confuses more than it clarifies. Since this change does not add anything new to the ticket, I think/hope it can still count as having been read.

Jesper

Removed:

In examples 4 and 5, replacing the barrier with a fence operation would lead to the desired behavior.
In this case all synchronization after the fence could be removed.
If the two examples would in addition be started with a fence then also the synchronizations between the two fences could be removed.
In Example 2, a fence instead of the
barrier would alleviate the need for the lock-unlock calls at process B to access
the process local location X.
With an additional fence at the beginning of the example
also the lock-unlock calls at process A could be removed.
In Example 3, a fence instead of the barrier
would be illegal, because epochs must be disjoint.

@mpiforumbot
Copy link
Collaborator Author

Originally by RolfRabenseifner on 2009-06-08 09:19:10 -0500


I checked the discussion and all modifications since my last review:

  • All late reviews are implemented.
  • All modifications in the ticket's description are helpful and correct.
  • Especially the the removal of the last paragraph.
  • I agree that the modifications are minor and therefore, the status should be kept on "Had 1st reading".

@mpiforumbot
Copy link
Collaborator Author

Originally by traff on 2009-08-11 02:56:41 -0500


Attachment added: ticket37.pdf (43.2 KiB)

@mpiforumbot
Copy link
Collaborator Author

Originally by traff on 2009-08-11 03:01:09 -0500


PDF ready, please review

Editorial change: "neither the MPI_Win_wait nor MPI_Win_complete calls" to
"neither MPI_Win_wait nor MPI_Win_complete calls" in Example 4. Numbering of examples follows numbering of other examples in chapter.

I'm not entirely happy with the typography (verbatim environment); this could be improved

@mpiforumbot
Copy link
Collaborator Author

Originally by RolfRabenseifner on 2009-08-29 17:42:36 -0500


PDF review: okay.

@mpiforumbot
Copy link
Collaborator Author

Originally by traff on 2009-08-30 05:46:00 -0500


Dear reviewers - please do check this (complex) ticket! So far it
has only one review (thanks, Rolf!)

Jesper

@mpiforumbot
Copy link
Collaborator Author

Originally by htor on 2009-08-30 08:57:53 -0500


Review ok, some minor comments though (NO show-stoppers!):

  1. I would not \emph{} obsolete (I would not set it in quotes either ;)
  2. beginning of p403 is not in red (?)

@mpiforumbot
Copy link
Collaborator Author

Originally by rsthakur on 2009-08-30 17:25:49 -0500


pdf review ok.

@mpiforumbot
Copy link
Collaborator Author

Originally by traff on 2009-08-31 03:06:05 -0500


I implemented Torsten's suggestion, and asked explicitly for more reviews - one or two are still needed

@mpiforumbot
Copy link
Collaborator Author

Originally by tipparajuv on 2009-08-31 23:37:03 -0500


review ok.

@mpiforumbot
Copy link
Collaborator Author

Originally by traff on 2009-09-01 02:27:48 -0500


Reluctantly setting status to "complete" (Brian, review?)

@mpiforumbot
Copy link
Collaborator Author

Originally by jsquyres on 2010-09-18 05:00:29 -0500


This ticket is (long-since) complete; marking it resolved/text committed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant