Non-blocking communicator/file constructors/destructors/etc, respectively to "Groups, Contexts, Communicators, Caching," "Topology," and "I/O" chapters #78

Open
tonyskjellum opened this issue Feb 3, 2018 · 44 comments
Labels
mpi-5 For inclusion in the MPI 5.0 standard wg-collectives Collectives Working Group wg-io I/O Working Group

@tonyskjellum

tonyskjellum commented Feb 3, 2018

Problem

The standard's functionality is incomplete: some of these operations have non-blocking variants today, and others do not.

Proposal

I. As a baseline goal, these functions will be proposed for addition to MPI-3.x:
Groups, Contexts, Communicators, and Caching Chapter:
Non-blocking:
1. both intra-comm and inter-comm
* MPI_COMM_ISPLIT
* MPI_COMM_ISPLIT_TYPE
* MPI_INTERCOMM_IMERGE
* MPI_INTERCOMM_ICREATE
* MPI_COMM_ICREATE
2. intra-comm only:
* MPI_COMM_ICREATE_GROUP
3. Destructor category:
* MPI_COMM_IFREE (see also MPI_COMM_IDISCONNECT)
4. Other:
* MPI_COMM_ISET_INFO

II. Topology chapter variants to be added to comment.
* MPI_CART_ICREATE
* MPI_GRAPH_ICREATE
* MPI_DIST_GRAPH_ICREATE
* MPI_DIST_GRAPH_ICREATE_ADJACENT

III. I/O Chapter

  1. Non-blocking constructor:
  • MPI_FILE_IOPEN
  2. Non-blocking destructor:
  • MPI_FILE_ICLOSE
  3. Other file-level operations:
  • MPI_FILE_IDELETE
  • MPI_FILE_ISETSIZE
  • MPI_FILE_IPREALLOCATE
  4. Maybe (because they are collective):
  • MPI_FILE_ISET_INFO
  • MPI_FILE_ISET_VIEW

IV. Add non-blocking destructor to the "Process Creation and Management Chapter" (DPM)

  • MPI_COMM_IDISCONNECT
    and clarify that both MPI_COMM_DISCONNECT and MPI_COMM_IDISCONNECT work on intracommunicators as well as intercommunicators.

Note: Nonblocking variants of the dynamic process management functions (MPI_COMM_ACCEPT, MPI_COMM_CONNECT, MPI_COMM_SPAWN, MPI_COMM_SPAWN_MULTIPLE) are defined in the related Ticket #81. There is no proposed nonblocking equivalent of MPI_COMM_JOIN, however; #13 proposes deprecating MPI_COMM_JOIN, which is why Ticket #81 does not offer a nonblocking version.

Other than the intentional duplication of MPI_COMM_IDISCONNECT with Ticket #81, this ticket is complementary to the remainder of Ticket #81 and to Ticket #82.

Changes to the Text

The chapters will be modified to provide explanations, definitions, and rationale for these added functions.

Impact on Implementations

This will require implementations to add these new functions; they are analogous to other functions already in the standard, and each should be incremental work.

Note: other proposals, such as those for fault tolerance, are exploring analogs of functions in this chapter that this proposal would add but that are missing from current MPI. A recent Fault Tolerance WG discussion revealed these omissions, and with them opportunities for MPI to better support fully nonblocking libraries and components.

Impact on Users

Users will be able to write better and more completely nonblocking MPI programs and libraries. The design of the MPI standard will be more "orthogonal."

References

https://github.com/mpi-forum/mpi-standard

The associated PR is at: mpi-forum/mpi-standard#48

The RMA form of this proposal is Ticket #82

@tonyskjellum tonyskjellum changed the title Add missing communicator operations (mostly non-blocking) to "Groups, Contexts, Communicators, and Caching" chapter Add missing communicator operations (mostly non-blocking) to "Groups, Contexts, Communicators, and Caching" and "Topology" chapters Feb 3, 2018
@dholmes-epcc-ed-ac-uk
Member

The MPI_COMM_IDUP_WITH_INFO function was added by issue #53 (pull request https://github.com/mpi-forum/mpi-standard/pull/14).

@tonyskjellum
Author

Removing MPI_COMM_IDUP_WITH_INFO from this set to remove duplication. We will combine issue #53 and this issue, as appropriate, in discussions and presentations.

@tonyskjellum
Author

Here is a preview of all the functionality proposed. It is not a pull request yet because we have not discussed it in either the Collective WG or a plenary. Comments are most welcome.

mpi32-report-04feb18.pdf

@htorst
Member

htorst commented Feb 4, 2018

I would like to remind us that the MPI-3 forum consciously decided not to standardize nonblocking versions of all these functions, even though doing so is straightforward and the gap is obvious. The reasoning was to lower the barrier for implementers of MPI libraries and to foster MPI-3's adoption. So we standardized only the most crucial function, comm_dup, which makes it possible to implement nonblocking libraries on top of MPI. All the others can build on this and thus seem to be syntactic sugar. Or can you elaborate on what is different from MPI-3?

Also, some of your function names exceed F77's maximum character limit ;-).
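To make the name-length remark concrete, here is a small sketch that checks the proposed topology names against a length limit. It assumes the 30-character limit on MPI identifiers from the standard's language-binding conventions (the strict Fortran 77 limit is six characters, which every MPI name exceeds). The helper names `excess_length`, `count_too_long`, and `NAME_LIMIT` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: MPI identifiers are limited to 30 characters. */
enum { NAME_LIMIT = 30 };

/* Topology constructors proposed in section II of this ticket. */
const char *const topology_names[] = {
    "MPI_CART_ICREATE",
    "MPI_GRAPH_ICREATE",
    "MPI_DIST_GRAPH_ICREATE",
    "MPI_DIST_GRAPH_ICREATE_ADJACENT",
};
const size_t topology_name_count =
    sizeof topology_names / sizeof topology_names[0];

/* Number of characters by which `name` exceeds `limit`, or 0 if it fits. */
size_t excess_length(const char *name, size_t limit) {
    assert(name != NULL);
    size_t len = strlen(name);
    return len > limit ? len - limit : 0;
}

/* How many of the `n` names are longer than `limit`. */
size_t count_too_long(const char *const *names, size_t n, size_t limit) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (excess_length(names[i], limit) > 0) {
            count++;
        }
    }
    return count;
}
```

Under that assumed limit, the existing MPI_DIST_GRAPH_CREATE_ADJACENT sits exactly at 30 characters, so inserting the "I" pushes MPI_DIST_GRAPH_ICREATE_ADJACENT to 31; the other names listed in this ticket are shorter.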

@jeffhammond
Member

jeffhammond commented Feb 4, 2018 via email

@jeffhammond
Member

jeffhammond commented Feb 4, 2018 via email

@tonyskjellum
Author

tonyskjellum commented Feb 4, 2018 via email

@tonyskjellum tonyskjellum changed the title Add missing communicator operations (mostly non-blocking) to "Groups, Contexts, Communicators, and Caching" and "Topology" chapters Add missing communicator operations (mostly non-blocking) to "Groups, Contexts, Communicators, Caching," "Topology", and "One-Sided Communication" chapters Feb 10, 2018
@tonyskjellum
Author

1-sided Windows non-blocking constructors and destructor functionality now added as well.

mpi32-report-10feb18.pdf

@dholmes-epcc-ed-ac-uk
Member

What about MPI_Files? MPI_File_iopen, MPI_File_iclose, and so on?

@dholmes-epcc-ed-ac-uk
Member

The PDF that includes nonblocking window creation functions is a bit messed up - there are two chapters (11 & 12) entitled "One-sided Communication" with similar but different content.

The text for MPI_WIN_ICREATE in chapter 12 describes the MPI_WIN_ICREATE_DYNAMIC operation.

The text for MPI_WIN_ICREATE in chapter 11 includes:

It is erroneous to use the window win as an input argument to other MPI functions before the MPI_WIN_ICREATE operation completes.

This is a little ambiguous - it could be taken to mean "before the MPI_WIN_ICREATE function returns", which is too early. I guess this is supposed to mean "before the request is completed by a successful call to MPI_WAIT[_ALL] or MPI_TEST[_ALL] that returned flag = true" or some such similar wording. Counter-argument: this ambiguous wording is copy-pasted from MPI_COMM_IDUP.
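The distinction between "the initiating call has returned" and "the request has been completed" can be sketched with a toy model. This is plain C with no MPI calls; every name in it (`toy_win`, `toy_request`, `toy_win_icreate`, `toy_test`, `toy_wait`, `toy_win_use`) is hypothetical, and the poll counter merely stands in for whatever progress a real implementation would need to make.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these types do not correspond to any real MPI object. */
typedef enum { TOY_UNSET, TOY_PENDING, TOY_READY } toy_state;

typedef struct { toy_state state; } toy_win;

typedef struct {
    toy_win *target;   /* handle that becomes usable on completion */
    int polls_left;    /* stand-in for outstanding background work */
} toy_request;

/* Initiation: returns immediately; the handle is NOT usable yet. */
void toy_win_icreate(toy_win *win, toy_request *req, int polls_needed) {
    win->state = TOY_PENDING;
    req->target = win;
    req->polls_left = polls_needed;
}

/* Analog of a test call: true only once the operation has completed. */
bool toy_test(toy_request *req) {
    if (req->polls_left > 0) {
        req->polls_left--;
    }
    if (req->polls_left <= 0) {
        req->target->state = TOY_READY;
        return true;
    }
    return false;
}

/* Analog of a wait call: polls until completion. */
void toy_wait(toy_request *req) {
    while (!toy_test(req)) {
        /* keep polling */
    }
}

bool toy_win_usable(const toy_win *win) {
    return win->state == TOY_READY;
}

/* Using the handle before completion is the "erroneous" case. */
void toy_win_use(const toy_win *win) {
    assert(toy_win_usable(win));
}
```

In this model, calling `toy_win_use` right after `toy_win_icreate` returns would trip the assertion, as would calling it after a `toy_test` that returned false. Only after `toy_wait`, or a `toy_test` that returned true, is the handle usable, which is the reading suggested above.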

There are a lot of cross-references that appear as ?? throughout.

@dholmes-epcc-ed-ac-uk
Member

Should there also be MPI_WIN_IFENCE (etc, for other synchronisation methods)?
Same reason as MPI_IBARRIER?

@jeffhammond
Member

@dholmes-epcc-ed-ac-uk "completed" means "waited upon (or equivalent)" in many other places in the standard, e.g.:

These operations are nonblocking: the call initiates the transfer, but the transfer may continue after the call returns. The transfer is completed, at the origin or both the origin and the target, when a subsequent synchronization call is issued by the caller on the involved window object.

Please don't try to add nonblocking RMA synchronization here. That needs to be handled by the RMA WG. We've been down that path before. Pavan and friends wrote a paper on it. I support the features but RMA needs to drive it after thinking through all the details.

@tonyskjellum
Author

tonyskjellum commented Feb 13, 2018 via email

@tonyskjellum
Author

tonyskjellum commented Feb 13, 2018 via email

@tonyskjellum
Author

tonyskjellum commented Feb 13, 2018 via email

@jeffhammond
Member

@tonyskjellum I asked for "RMA equivalents" to communicator constructors and destructors, not synchronization operations, meaning MPI_Win_{create,allocate,free}*.

@jeffhammond
Member

@tonyskjellum tonyskjellum changed the title Add missing communicator operations (mostly non-blocking) to "Groups, Contexts, Communicators, Caching," "Topology", and "One-Sided Communication" chapters Add missing communicator operations (mostly non-blocking) to "Groups, Contexts, Communicators, Caching," "Topology," and "I/O" chapters Feb 13, 2018
@tonyskjellum
Author

tonyskjellum commented Feb 13, 2018

After discussions with @dholmes-epcc-ed-ac-uk and Puri Bangalore, we have added the prospective non-blocking constructor, destructor, and collective operations that appear most in need of non-blocking variants. The proposed document will soon be updated with those APIs (as well as re-adding the parts of the 1-sided API that we removed this morning due to a LaTeX bug).

@tonyskjellum tonyskjellum changed the title Add missing communicator operations (mostly non-blocking) to "Groups, Contexts, Communicators, Caching," "Topology," and "I/O" chapters Add missing communicator operations (mostly non-blocking) to "Groups, Contexts, Communicators, Caching," "Topology," "One-Sided," and "I/O" chapters Feb 13, 2018
@tonyskjellum
Author

tonyskjellum commented Feb 14, 2018

Please see this update:

i) Reincorporates the 1-sided proposed functions
ii) Includes the proposed I/O operations for the first time
iii) Amends the change log description for all operations in this ticket/issue

mpi32-report-13feb18-2144.pdf

@tonyskjellum tonyskjellum changed the title Add missing communicator operations (mostly non-blocking) to "Groups, Contexts, Communicators, Caching," "Topology," "One-Sided," and "I/O" chapters Add missing, mostly non-blocking communicator/file/win operations, respectively to "Groups, Contexts, Communicators, Caching," "Topology," "One-Sided," and "I/O" chapters Feb 14, 2018
@dholmes-epcc-ed-ac-uk
Member

Title of issue should now be more general than "Add missing [communicator] operations ..." - perhaps just "Add missing object constructors and destructors" with the chapter references moved into the "Changes to the text" section of the description?

@dholmes-epcc-ed-ac-uk
Member

The only new blocking routine MPI_SPLIT_WITH_INFO is an odd-one-out on this issue because it adds new functionality (the possibility of supplying an MPI_INFO to the split operation). Additional justification is needed for adding this new function (strictly function-pair, because you propose a nonblocking version too). Should that be a different issue? I can see why it is included in this issue - it is a missing communicator constructor. Either way, I feel it should be called out as different to the others and separately justified - what can be done with MPI_INFO that cannot be done with "color" and "key"? Why are you not proposing full orthogonality, i.e. a "_WITH_INFO" version of all communicator constructors? All Window and File constructors already have MPI_INFO arguments - why is MPI_COMM_SPLIT as special as MPI_COMM_DUP whereas the others are not?

@tonyskjellum
Author

tonyskjellum commented Feb 14, 2018 via email

@tonyskjellum tonyskjellum changed the title Add missing, mostly non-blocking communicator/file/win operations, respectively to "Groups, Contexts, Communicators, Caching," "Topology," "One-Sided," and "I/O" chapters Non-blocking communicator/file/win constructors/destructors, respectively to "Groups, Contexts, Communicators, Caching," "Topology," "One-Sided," and "I/O" chapters Feb 14, 2018
@tonyskjellum tonyskjellum changed the title Non-blocking communicator/file/win constructors/destructors/etc, respectively to "Groups, Contexts, Communicators, Caching," "Topology," "I/O", and "Process Creation and Management" chapters Non-blocking communicator/file/win constructors/destructors/etc, respectively to "Groups, Contexts, Communicators, Caching," "Topology," and "I/O" chapters Mar 7, 2018
@tonyskjellum tonyskjellum added scheduled reading Reading is scheduled for the next meeting and removed not ready labels May 23, 2018
@tonyskjellum
Author

Today, the associated pull request was added, with a plan for a reading at the Austin meeting; the pull request is in ahead of the two-week deadline. We may still update the PR further before that deadline.

@tonyskjellum
Author

mpi32-report-ticket78.pdf

This is the public copy.

@tonyskjellum
Author

We did the first reading attempt in Austin on June 13, 2018. There were a few issues raised, and quality improvements needed. We will fix these and present a new reading in Barcelona in September.

@tonyskjellum tonyskjellum added not ready and removed scheduled reading Reading is scheduled for the next meeting labels Sep 5, 2018
@tonyskjellum tonyskjellum added the wg-io I/O Working Group label Sep 24, 2018
@tonyskjellum tonyskjellum changed the title Non-blocking communicator/file/win constructors/destructors/etc, respectively to "Groups, Contexts, Communicators, Caching," "Topology," and "I/O" chapters Non-blocking communicator/file constructors/destructors/etc, respectively to "Groups, Contexts, Communicators, Caching," "Topology," and "I/O" chapters Sep 24, 2018
@tonyskjellum tonyskjellum changed the title Non-blocking communicator/file constructors/destructors/etc, respectively to "Groups, Contexts, Communicators, Caching," "Topology," and "I/O" chapters Non-blocking communicator/file/win constructors/destructors/etc, respectively to "Groups, Contexts, Communicators, Caching," "Topology," and "I/O" chapters Sep 24, 2018
@tonyskjellum tonyskjellum changed the title Non-blocking communicator/file/win constructors/destructors/etc, respectively to "Groups, Contexts, Communicators, Caching," "Topology," and "I/O" chapters Non-blocking communicator/file constructors/destructors/etc, respectively to "Groups, Contexts, Communicators, Caching," "Topology," and "I/O" chapters Sep 24, 2018
@tonyskjellum
Author

At the Persistence/Collective joint meeting today, we agreed that we need to justify the utility of these functions and establish community best practices.

@tonyskjellum
Author

We will need to retarget for MPI-4.x

@tonyskjellum tonyskjellum added mpi-5 For inclusion in the MPI 5.0 standard and removed mpi-4.0 labels Mar 25, 2020
@tonyskjellum
Author

Decided to focus this for MPI-5.0.

This orthogonalizes the standard more, and provides a path to useful new functionality. So, we want to bring this back.

@dholmes-epcc-ed-ac-uk : when we split between orthogonalization and key new functionality, we will explore splitting ticket, but not before.

@tonyskjellum
Author

@wesbland @Wee-Free-Scot Hi, I am interested in pursuing this with Dan for MPI-5.0. We are re-reading the comments, and I will close the loop with him to see how to proceed. Can we make a ticket like this that cross-cuts many chapters, or should we do piecemeal?

@wesbland
Member

> @wesbland @Wee-Free-Scot Hi, I am interested in pursuing this with Dan for MPI-5.0. We are re-reading the comments, and I will close the loop with him to see how to proceed. Can we make a ticket like this that cross-cuts many chapters, or should we do piecemeal?

If you're the one planning to do the work across many chapters, then one ticket is great. If you are planning to rally the CCs from each chapter and have each one do its own PR, then one ticket per chapter/PR is best.

Projects
Status: To Do
MPI 5.0
To Do
Development

No branches or pull requests

8 participants