Non-blocking communicator/file constructors/destructors/etc, respectively to "Groups, Contexts, Communicators, Caching," "Topology," and "I/O" chapters #78

Open
tonyskjellum opened this Issue Feb 3, 2018 · 40 comments

Comments

@tonyskjellum

tonyskjellum commented Feb 3, 2018

Problem

The standard's functionality is incomplete: some constructors, destructors, and related collective operations have non-blocking variants, while others do not.

Proposal

I. As a baseline goal, these functions will be proposed for addition to MPI-3.x:
Groups, Contexts, Communicators, and Caching Chapter:
Non-blocking:
1. both intra-comm and inter-comm
* MPI_COMM_ISPLIT
* MPI_COMM_ISPLIT_TYPE
* MPI_INTERCOMM_IMERGE
* MPI_INTERCOMM_ICREATE
* MPI_COMM_ICREATE
2. intra-comm only:
* MPI_COMM_ICREATE_GROUP
3. Destructor category:
* MPI_COMM_IFREE (see also MPI_COMM_IDISCONNECT)
4. Other:
* MPI_COMM_ISET_INFO

II. Topology chapter variants to be added:
* MPI_CART_ICREATE
* MPI_GRAPH_ICREATE
* MPI_DIST_GRAPH_ICREATE
* MPI_DIST_GRAPH_ICREATE_ADJACENT

III. I/O Chapter

  1. Non-blocking constructor:
  • MPI_FILE_IOPEN
  2. Non-blocking destructor:
  • MPI_FILE_ICLOSE
  3. Other file-level operations:
  • MPI_FILE_IDELETE
  • MPI_FILE_ISETSIZE
  • MPI_FILE_IPREALLOCATE
  4. Maybe (because they are collective):
  • MPI_FILE_ISET_INFO
  • MPI_FILE_ISET_VIEW

IV. Add non-blocking destructor to the "Process Creation and Management Chapter" (DPM)

  • MPI_COMM_IDISCONNECT
    and clarify that both MPI_COMM_DISCONNECT and MPI_COMM_IDISCONNECT work on intracommunicators as well as intercommunicators.

Note: Nonblocking versions of the dynamic process management functions (MPI_COMM_ACCEPT, MPI_COMM_CONNECT, MPI_COMM_SPAWN, MPI_COMM_SPAWN_MULTIPLE) are defined in the related Ticket #81. There is no proposed nonblocking equivalent of MPI_COMM_JOIN, however: #13 proposes to deprecate MPI_COMM_JOIN, which is why Ticket #81 does not offer a nonblocking version.

Other than the intentional duplication of MPI_COMM_IDISCONNECT with Ticket #81, this ticket is complementary to the remainder of Tickets #81 and #82.

Changes to the Text

The chapters will be modified to provide explanations, definitions, and rationale for these added functions.

Impact on Implementations

This will require implementations to add these new functions; they are analogous to other functions already in the standard, and each should be incremental work.

Note: other proposals, such as those for fault tolerance, are exploring analogs of functions that this proposal would add to this chapter but that are missing from current MPI. A recent Fault Tolerance WG discussion revealed these omissions, along with opportunities to make MPI support fully nonblocking libraries and components more completely.

Impact on Users

Users will be able to write better and more completely nonblocking MPI programs and libraries. The design of the MPI standard will be more "orthogonal."

References

https://github.com/mpi-forum/mpi-standard

The associated PR is at: mpi-forum/mpi-standard#48

The RMA form of this proposal is Ticket #82

@tonyskjellum tonyskjellum changed the title from Add missing communicator operations (mostly non-blocking) to "Groups, Contexts, Communicators, and Caching" chapter to Add missing communicator operations (mostly non-blocking) to "Groups, Contexts, Communicators, and Caching" and "Topology" chapters Feb 3, 2018

@dholmes-epcc-ed-ac-uk


Member

dholmes-epcc-ed-ac-uk commented Feb 4, 2018

The MPI_COMM_IDUP_WITH_INFO function was added by issue #53 (pull request mpi-forum/mpi-standard#14).

@tonyskjellum


tonyskjellum commented Feb 4, 2018

Removing MPI_COMM_IDUP_WITH_INFO from this set to remove duplication. We will combine issue #53 and this issue, as appropriate, in discussions and presentations.

@tonyskjellum


tonyskjellum commented Feb 4, 2018

Here is a preview of all the functionality proposed. It is not a pull request yet because we have not discussed it in either the Collective WG or a plenary. Comments are most welcome.

mpi32-report-04feb18.pdf

@htorst


htorst commented Feb 4, 2018

I would like to remind us that the MPI-3 forum consciously decided not to standardize nonblocking versions of all these functions, even though doing so is straightforward and the gap is obvious. The reasoning was to lower the barrier for implementers of MPI libraries and foster MPI-3's adoption. So we only standardized the most crucial function, comm_dup, which makes it possible to implement nonblocking libraries on top of MPI. All others can build on this and thus seem to be syntactic sugar. Or can you elaborate on what is different from MPI-3?

Also, some of your function names exceed F77's maximum character limit ;-).

@jeffhammond


Member

jeffhammond commented Feb 4, 2018

@jeffhammond


Member

jeffhammond commented Feb 4, 2018

@tonyskjellum


tonyskjellum commented Feb 4, 2018

@tonyskjellum tonyskjellum changed the title from Add missing communicator operations (mostly non-blocking) to "Groups, Contexts, Communicators, and Caching" and "Topology" chapters to Add missing communicator operations (mostly non-blocking) to "Groups, Contexts, Communicators, Caching," "Topology", and "One-Sided Communication" chapters Feb 10, 2018

@tonyskjellum


tonyskjellum commented Feb 10, 2018

1-sided Windows non-blocking constructors and destructor functionality now added as well.

mpi32-report-10feb18.pdf

@dholmes-epcc-ed-ac-uk


Member

dholmes-epcc-ed-ac-uk commented Feb 13, 2018

What about MPI_Files? MPI_File_iopen, MPI_File_iclose, and so on?

@dholmes-epcc-ed-ac-uk


Member

dholmes-epcc-ed-ac-uk commented Feb 13, 2018

The PDF that includes nonblocking window creation functions is a bit messed up - there are two chapters (11 & 12) entitled "One-sided Communication" with similar but different content.

The text for MPI_WIN_ICREATE in chapter 12 describes the MPI_WIN_ICREATE_DYNAMIC operation.

The text for MPI_WIN_ICREATE in chapter 11 includes:

It is erroneous to use the window win as an input argument to other MPI functions before the MPI_WIN_ICREATE operation completes.

This is a little ambiguous - it could be taken to mean "before the MPI_WIN_ICREATE function returns", which is too early. I guess this is supposed to mean "before the request is completed by a successful call to MPI_WAIT[_ALL] or MPI_TEST[_ALL] that returned flag = true" or some such similar wording. Counter-argument: this ambiguous wording is copy-pasted from MPI_COMM_IDUP.
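The distinction between "the initiating call has returned" and "the operation has completed" can be sketched with a toy, MPI-free model in C. Every name below (`toy_handle`, `toy_request`, `toy_icreate`, `toy_test`, `toy_wait`, `toy_handle_usable`) is hypothetical and invented for illustration; none is an MPI function, and the fixed step counter merely stands in for background progress.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these types and functions are hypothetical, not MPI. */

typedef enum { HANDLE_INVALID, HANDLE_READY } handle_state;

typedef struct {
    handle_state state;   /* may the handle be passed to other calls yet? */
} toy_handle;

typedef struct {
    toy_handle *target;   /* handle that becomes valid on completion */
    int steps_left;       /* pretend amount of progress still required */
    bool active;          /* false once the request has been completed */
} toy_request;

/* Initiate construction. Returns immediately; the handle is NOT usable yet. */
void toy_icreate(toy_handle *h, toy_request *req, int steps)
{
    assert(h != NULL && req != NULL && steps >= 0);
    h->state = HANDLE_INVALID;
    req->target = h;
    req->steps_left = steps;
    req->active = true;
}

/* Analogue of a test call: makes one step of progress and returns true
   exactly when the operation has completed (flag = true). */
bool toy_test(toy_request *req)
{
    if (!req->active)
        return true;                 /* already completed earlier */
    if (req->steps_left > 0)
        req->steps_left--;
    if (req->steps_left == 0) {
        req->target->state = HANDLE_READY;
        req->active = false;
        return true;
    }
    return false;
}

/* Analogue of a wait call: keeps testing until completion. */
void toy_wait(toy_request *req)
{
    while (!toy_test(req)) {
        /* a real caller could overlap useful work here */
    }
}

/* Would using the handle as an input argument be legal right now? */
bool toy_handle_usable(const toy_handle *h)
{
    return h->state == HANDLE_READY;
}
```

In this model, `toy_handle_usable` stays false after `toy_icreate` returns and only becomes true once `toy_wait` returns or `toy_test` reports completion. With the existing MPI-3 API, the analogous sequence is MPI_COMM_IDUP followed by MPI_WAIT on the returned request.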

There are a lot of cross-references that appear as ?? throughout.

@dholmes-epcc-ed-ac-uk


Member

dholmes-epcc-ed-ac-uk commented Feb 13, 2018

Should there also be MPI_WIN_IFENCE (etc, for other synchronisation methods)?
Same reason as MPI_IBARRIER?

@jeffhammond


Member

jeffhammond commented Feb 13, 2018

@dholmes-epcc-ed-ac-uk "completed" means "waited upon (or equivalent)" in many other places in the standard, e.g.:

These operations are nonblocking: the call initiates the transfer, but the transfer may continue after the call returns. The transfer is completed, at the origin or both the origin and the target, when a subsequent synchronization call is issued by the caller on the involved window object.

Please don't try to add nonblocking RMA synchronization here. That needs to be handled by the RMA WG. We've been down that path before. Pavan and friends wrote a paper on it. I support the features, but the RMA WG needs to drive it after thinking through all the details.

@tonyskjellum


tonyskjellum commented Feb 13, 2018

@tonyskjellum


tonyskjellum commented Feb 13, 2018

@tonyskjellum


tonyskjellum commented Feb 13, 2018

@jeffhammond


Member

jeffhammond commented Feb 13, 2018

@tonyskjellum I asked for "RMA equivalents" to communicator constructors and destructors, not synchronization operations, meaning MPI_Win_{create,allocate,free}*.

@jeffhammond


@tonyskjellum tonyskjellum changed the title from Add missing communicator operations (mostly non-blocking) to "Groups, Contexts, Communicators, Caching," "Topology", and "One-Sided Communication" chapters to Add missing communicator operations (mostly non-blocking) to "Groups, Contexts, Communicators, Caching," "Topology," and "I/O" chapters Feb 13, 2018

@tonyskjellum


tonyskjellum commented Feb 13, 2018

After discussions with @dholmes-epcc-ed-ac-uk and Puri Bangalore, we have added the prospective non-blocking I/O constructor, destructor, and collective operations that appear most in need of non-blocking variants. The proposed document will soon be updated with those APIs (as well as re-adding the parts of the 1-sided API that we removed this morning due to a LaTeX bug).

@tonyskjellum tonyskjellum changed the title from Add missing communicator operations (mostly non-blocking) to "Groups, Contexts, Communicators, Caching," "Topology," and "I/O" chapters to Add missing communicator operations (mostly non-blocking) to "Groups, Contexts, Communicators, Caching," "Topology," "One-Sided," and "I/O" chapters Feb 13, 2018

@tonyskjellum


tonyskjellum commented Feb 14, 2018

Please see this update, which:

i) reincorporates the 1-sided proposed functions
ii) has the proposed I/O operations for the first time
iii) amends the change log description for all operations in this ticket/issue

mpi32-report-13feb18-2144.pdf

@tonyskjellum tonyskjellum changed the title from Add missing communicator operations (mostly non-blocking) to "Groups, Contexts, Communicators, Caching," "Topology," "One-Sided," and "I/O" chapters to Add missing, mostly non-blocking communicator/file/win operations, respectively to "Groups, Contexts, Communicators, Caching," "Topology," "One-Sided," and "I/O" chapters Feb 14, 2018

@dholmes-epcc-ed-ac-uk


Member

dholmes-epcc-ed-ac-uk commented Feb 14, 2018

Title of issue should now be more general than "Add missing [communicator] operations ..." - perhaps just "Add missing object constructors and destructors" with the chapter references moved into the "Changes to the text" section of the description?

@dholmes-epcc-ed-ac-uk


Member

dholmes-epcc-ed-ac-uk commented Feb 14, 2018

The only new blocking routine MPI_SPLIT_WITH_INFO is an odd-one-out on this issue because it adds new functionality (the possibility of supplying an MPI_INFO to the split operation). Additional justification is needed for adding this new function (strictly function-pair, because you propose a nonblocking version too). Should that be a different issue? I can see why it is included in this issue - it is a missing communicator constructor. Either way, I feel it should be called out as different to the others and separately justified - what can be done with MPI_INFO that cannot be done with "color" and "key"? Why are you not proposing full orthogonality, i.e. a "_WITH_INFO" version of all communicator constructors? All Window and File constructors already have MPI_INFO arguments - why is MPI_COMM_SPLIT as special as MPI_COMM_DUP whereas the others are not?

@tonyskjellum


tonyskjellum commented Feb 14, 2018

@tonyskjellum tonyskjellum changed the title from Add missing, mostly non-blocking communicator/file/win operations, respectively to "Groups, Contexts, Communicators, Caching," "Topology," "One-Sided," and "I/O" chapters to Non-blocking communicator/file/win constructors/destructors, respectively to "Groups, Contexts, Communicators, Caching," "Topology," "One-Sided," and "I/O" chapters Feb 14, 2018

@tonyskjellum


tonyskjellum commented Feb 28, 2018

Specific feedback came from the plenary presentation of Ticket #78, and one-on-one discussions thereafter:

  1. Split the RMA functions to a separate ticket so that the RMA working group can weigh in. We will make the separate ticket and hand it off for their study.

  2. Put MPI_COMM_IDISCONNECT into Ticket #81. [We decided to duplicate it there.]

  3. Consider again whether we need stronger MPI_COMM_DESTROY/IDESTROY functions for an even stronger statement of resource recovery, etc. This would be a separate ticket if so.

  4. Additionally, during the reading of Ticket #25, it was pointed out that MPI_COMM_SET_INFO is collective. Therefore, we need to consider adding MPI_COMM_ISET_INFO to this ticket.

Our goal is to review Ticket #78 at a virtual meeting (and allow Ticket #81 to trail along), and to read Ticket #78 (and possibly #81) at the June 2018 meeting in Austin. The others (not yet named) would follow in due course if they gain adherents and proponents.

@tonyskjellum tonyskjellum changed the title from Non-blocking communicator/file/win constructors/destructors, respectively to "Groups, Contexts, Communicators, Caching," "Topology," "One-Sided," "I/O", and "Process Creation and Management" chapters to Non-blocking communicator/file/win constructors/destructors, respectively to "Groups, Contexts, Communicators, Caching," "Topology," "I/O", and "Process Creation and Management" chapters Mar 1, 2018

@tonyskjellum


tonyskjellum commented Mar 1, 2018

The latest draft is attached:

  1. MPI_COMM_ISET_INFO added
  2. DPM functions removed
  3. Updated and corrected the changes chapter

mpi32-report.pdf

@tonyskjellum tonyskjellum changed the title from Non-blocking communicator/file/win constructors/destructors, respectively to "Groups, Contexts, Communicators, Caching," "Topology," "I/O", and "Process Creation and Management" chapters to Non-blocking communicator/file/win constructors/destructors/etc, respectively to "Groups, Contexts, Communicators, Caching," "Topology," "I/O", and "Process Creation and Management" chapters Mar 1, 2018

@dholmes-epcc-ed-ac-uk dholmes-epcc-ed-ac-uk self-assigned this Mar 1, 2018

@tonyskjellum tonyskjellum changed the title from Non-blocking communicator/file/win constructors/destructors/etc, respectively to "Groups, Contexts, Communicators, Caching," "Topology," "I/O", and "Process Creation and Management" chapters to Non-blocking communicator/file/win constructors/destructors/etc, respectively to "Groups, Contexts, Communicators, Caching," "Topology," and "I/O" chapters Mar 7, 2018

@tonyskjellum


tonyskjellum commented May 23, 2018

Today, the associated pull request was added in preparation for a reading at the Austin meeting, ahead of the two-week deadline. We may still update the PR further before that deadline.

@tonyskjellum


tonyskjellum commented Jun 6, 2018

mpi32-report-ticket78.pdf

This is the public copy.

@tonyskjellum


tonyskjellum commented Jun 14, 2018

We did the first reading attempt in Austin on June 13, 2018. There were a few issues raised, and quality improvements needed. We will fix these and present a new reading in Barcelona in September.

@tonyskjellum tonyskjellum added the wg-io label Sep 24, 2018

@tonyskjellum tonyskjellum changed the title from Non-blocking communicator/file/win constructors/destructors/etc, respectively to "Groups, Contexts, Communicators, Caching," "Topology," and "I/O" chapters to Non-blocking communicator/file constructors/destructors/etc, respectively to "Groups, Contexts, Communicators, Caching," "Topology," and "I/O" chapters Sep 24, 2018

@tonyskjellum tonyskjellum changed the title from Non-blocking communicator/file constructors/destructors/etc, respectively to "Groups, Contexts, Communicators, Caching," "Topology," and "I/O" chapters to Non-blocking communicator/file/win constructors/destructors/etc, respectively to "Groups, Contexts, Communicators, Caching," "Topology," and "I/O" chapters Sep 24, 2018

@tonyskjellum tonyskjellum changed the title from Non-blocking communicator/file/win constructors/destructors/etc, respectively to "Groups, Contexts, Communicators, Caching," "Topology," and "I/O" chapters to Non-blocking communicator/file constructors/destructors/etc, respectively to "Groups, Contexts, Communicators, Caching," "Topology," and "I/O" chapters Sep 24, 2018

@tonyskjellum


tonyskjellum commented Dec 12, 2018

At the Persistence/Collective joint meeting today, we agreed that we need to justify the utility of these functions and establish community best practices.
