Meeting Notes
We went over the proposal document and started discussions on topics like window allocation and thread-support. Particularly, we discussed whether we should include an explicit handle for network resources (like OSHMEM contexts) or allow for window duplication, with multiple duplicated windows pointing to the same memory region. No decision has been made yet. The discussion also centered around the question of dynamic windows. One solution to the current problem would be an explicit network handle that allows direct access to the remote memory.
We talked about a way forward for changes in RMA beyond 4.1. We will use a shared document to structure the document text and continue collecting change ideas on GitHub. The process will likely be virtual, given the ongoing travel restrictions. A discussion ensued about the value of PSCW. Eventually, it became clear that we need efficient P2P signalling, which PSCW was meant to provide but did not deliver. Send/recv suffers from a disconnect from RMA communication. An RMA-based signalling mechanism should be included in the future.
We quickly reviewed the readings at the MPI Forum (mostly uneventful, smaller changes to https://github.com/mpi-forum/mpi-standard/pull/708) and discussed how to fix MPI_MODE_NOSTORE in the face of shared memory and the extension of shared memory to regular windows. No decision has been made yet.
We discussed the question of using attributes to query support for strong progress (https://github.com/mpi-forum/mpi-issues/issues/635) and found that there are too many dimensions to map to a single attribute. This may be a good use case for the MPI tools interface to provide a playground for exposing such information to the application. We also talked about collecting applications that use MPI RMA in the wiki (https://github.com/mpiwg-rma/mpi-standard/wiki/Application-Overview, restricted access) to build a base of features used and challenges encountered, as input for future directions. Everyone with access is encouraged to add applications worth investigating further.
We discussed the possibility of adding a new function MPI_WIN_SHARED_QUERY_AT used to query shared memory from dynamic windows. However, we decided that we should not add new features for dynamic windows in MPI 4.1. We also quickly reviewed action items from the progress discussion last week. Querying progress guarantees from MPI is tricky because some operations may progress (put/get) while others may not (floating point accumulates) without a software agent. Also, future extensions of MPI may provide dynamic progress APIs, for which an attribute would be too static. We need to clarify progress in MPI_WIN_SYNC though (https://github.com/mpi-forum/mpi-issues/issues/634).
We discussed the changes to allow calling MPI_WIN_SHARED_QUERY on windows other than those created through MPI_WIN_ALLOCATE_SHARED. Joseph will move the PR to the Forum for 4.1. We also discussed deprecating MPI_PROD and MPI_M**LOC in RMA. It appears that there is no good use case for MPI_PROD, so it may be sensible to deprecate it in 4.1 (and potentially undeprecate it if users complain). There appear to be users of MPI_M**LOC in RMA. However, it is unlikely that this feature will ever be available in network hardware. It could be maintained for usability reasons.
Note: we moved from a biweekly to a weekly schedule.
We discussed some of the open tickets for MPI 4.1. It is unclear whether we have sufficient time to find consensus on some of the issues (atomic operations, progress semantics) and who would drive them. It seems that there is a general lack of bandwidth for the (rather short) 4.1 cycles. A few points on the semantic terms ticket were also discussed.
We discussed the question of whether communication calls may block waiting for preceding synchronization to complete if the synchronization call returns before the synchronization is complete. This is currently allowed for PSCW. We will try to disallow this in the PSCW section through a separate ticket to ensure that calls to communication operations never block.
We discussed the upcoming FoRMA workshop and a possible solution for the discrepancy between communication operations being nonblocking and being allowed to block waiting for deferred synchronization: we will explicitly mention that these calls are generally nonblocking and mention the exception of deferred synchronization (e.g., waiting for the post to occur), which may make one call blocking and incomplete.
We discussed the organization of the upcoming workshop in June. Tentative dates set are the week of June 13-16. We also discussed proposed changes to the RMA progress section.
We discussed (and finalized) a proposal to state that MPI_Win_free is generally synchronizing in normative text, not in an AtoI.
The main topic was the organization of a small workshop/symposium meant to gather input from users, implementors, and vendors on the future of MPI RMA. A proposal to hold such a workshop at ISC'22 was rejected, so we will organize it on our own, likely virtually due to persisting travel restrictions.
Further notes:
- 2 contiguous days preferable over 2 days in separate weeks
- Maybe co-located with the MPI Forum in March?
- Otherwise, week after ISC (week before ISC collides with IPDPS)
- Joseph will create a doodle poll and set up a dedicated wiki page (link to come)
We discussed some of the changes proposed after Dan's Semantic Terms review.
- PR: https://github.com/mpiwg-rma/mpi-standard/pull/1
- The synchronization semantics of MPI_Win_free not only depend on whether the no_locks info key was set but also on whether the last call to MPI_Win_fence was made using MPI_MODE_NOSUCCEED. The PR needs updating.
- There seems to be some agreement that MPI_Win_fence is bad but that a collective synchronization that works with passive target would be nice to have (similar to shmem_fence)
- PR: https://github.com/mpiwg-rma/mpi-standard/pull/2
- The current standard expresses the RMA transfer semantics of put/get in terms of send and receive
- The PR describes the semantics with reference to the data types chapter and spells out over- and underflow semantics. It needs the addition of the displacement unit.
- The description in terms of send and receive might be easier to understand for MPI users as it uses familiar terms.
- We need to make sure nothing is lost in the rewrite.
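
For reference on the displacement unit point above: the standard computes the start of a put/get access at the target as target_addr = window_base + target_disp * disp_unit. The following is a minimal sketch in plain C (no MPI calls) of that arithmetic and of a bounds check against the window size. Both helper names are hypothetical and chosen for illustration only; they are not part of MPI or of the PR.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an MPI function): byte offset into the target
 * window at which an RMA access starts, following
 * target_addr = window_base + target_disp * disp_unit. */
static size_t rma_target_byte_offset(size_t target_disp, size_t disp_unit)
{
    return target_disp * disp_unit;
}

/* Hypothetical helper: returns 1 if an access of access_bytes bytes starting
 * at target_disp (counted in units of disp_unit) lies entirely within a
 * window of win_size bytes, and 0 otherwise. */
static int rma_access_in_bounds(size_t win_size, size_t disp_unit,
                                size_t target_disp, size_t access_bytes)
{
    /* Reject displacements whose byte offset would overflow size_t. */
    if (disp_unit != 0 && target_disp > SIZE_MAX / disp_unit) {
        return 0;
    }
    size_t offset = rma_target_byte_offset(target_disp, disp_unit);
    return offset <= win_size && access_bytes <= win_size - offset;
}
```

With a 64-byte window and a displacement unit of 8, a target displacement of 7 addresses the last 8 bytes of the window, while a displacement of 8 leaves no room for a non-empty access.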