Description
(This problem is version-agnostic: it is present in 4.1.0, for example, and also in the current Git version.)
The function `MPI_Sendrecv_replace` first creates a temporary contiguous buffer and moves the local send data into it. This decouples the send-receive operation: the data are sent from the temporary buffer and received into the original one.
This requires that the data to be sent are first converted to the `MPI_PACKED` (1-byte) datatype and that the associated `count` (or, more specifically, `packed_size`) parameter is increased accordingly. To perform the send-receive, `MPI_Sendrecv_replace` then simply calls `MPI_Sendrecv` on the two buffers.
However, this call to `MPI_Sendrecv` goes through the 32-bit integer API, which puts a needless internal limit on the actual byte size (`packed_size`) of the message.
Given how simple the code of `MPI_Sendrecv` is, particularly once the parameter checking is omitted (since this would be an internal call), couldn't that code be duplicated in `MPI_Sendrecv_replace` to avoid this limit?
Such a limit is not present in MPICH, which apparently uses an internal 64-bit integer API in such places.