Implement hierarchical MPI_Gatherv and MPI_Scatterv #12376

Merged: 3 commits, Mar 23, 2024

Commits on Mar 22, 2024

  1. coll/han: refactor mca_coll_han_get_ranks function

    Relax the function requirements to allow NULL low/up_rank output
    pointers, and rename the arguments because the function works for
    non-root ranks as well. (A minimal sketch of the relaxed contract
    follows the commit list below.)
    
    Signed-off-by: Wenduo Wang <wenduwan@amazon.com>
    wenduwan committed Mar 22, 2024 (commit b22e5fa)
  2. coll/han: implement hierarchical gatherv

    Add a gatherv implementation to optimize large-scale communication
    across multiple nodes with multiple processes per node, by avoiding
    high-incast traffic on the root process.
    
    Because *V collectives do not have equal datatype/count on every
    process, they do not natively support message-size-based tuning
    without an additional global communication.
    
    Similar to gather and allgather, the hierarchical gatherv requires a
    temporary buffer and an extra memory copy to handle out-of-order data
    or non-contiguous placement in the output buffer, which results in
    worse performance for large messages than the linear implementation.
    (See the gatherv sketch after the commit list below.)
    
    Signed-off-by: Wenduo Wang <wenduwan@amazon.com>
    wenduwan committed Mar 22, 2024 (commit 48c125e)
  3. coll/han: implement hierarchical scatterv

    Add a scatterv implementation to optimize large-scale communication
    across multiple nodes with multiple processes per node, by avoiding
    high-incast traffic on the root process.
    
    Because *V collectives do not have equal datatype/count on every
    process, they do not natively support message-size-based tuning
    without an additional global communication.
    
    Similar to scatter, the hierarchical scatterv requires a temporary
    buffer and an extra memory copy to handle out-of-order data or
    non-contiguous placement in the send buffer, which results in worse
    performance for large messages than the linear implementation. (See
    the scatterv sketch after the commit list below.)
    
    Signed-off-by: Jessie Yang <jiaxiyan@amazon.com>
    jiaxiyan authored and wenduwan committed Mar 22, 2024 (commit 2152b61)
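
A minimal sketch of the relaxed mca_coll_han_get_ranks contract described in commit 1. The prototype and the vrank encoding shown here are assumptions based on the names in the commit message, not a copy of the coll/han source:

```c
/* Assumed shape of the relaxed helper (argument names taken from the commit
 * message; the real prototype in coll/han may differ).  Callers that need
 * only one of the two coordinates can now pass NULL for the other output. */
static inline void
mca_coll_han_get_ranks(const int *vranks, int w_rank, int low_size,
                       int *low_rank, int *up_rank)
{
    /* Assumption: vranks[w_rank] encodes a rank's node-local (low) and
     * inter-node (up) coordinates as up_rank * low_size + low_rank. */
    if (NULL != low_rank) {
        *low_rank = vranks[w_rank] % low_size;
    }
    if (NULL != up_rank) {
        *up_rank = vranks[w_rank] / low_size;
    }
}
```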
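A self-contained sketch of the hierarchical gatherv idea from commit 2, written against plain MPI for MPI_INT data rather than the coll/han internals. The function name hier_gatherv_int and its structure are hypothetical: it forms a node-level ("low") communicator with MPI_COMM_TYPE_SHARED and a leader-only ("up") communicator, and it assumes the root is its node's leader. The final copy out of the temporary buffer is the large-message overhead the commit message mentions:

```c
#include <mpi.h>
#include <stdlib.h>
#include <string.h>

static void hier_gatherv_int(const int *sendbuf, int sendcount,
                             int *recvbuf, const int *recvcounts,
                             const int *displs, int root, MPI_Comm comm)
{
    int wrank;
    MPI_Comm_rank(comm, &wrank);

    /* Low level: processes sharing a node; up level: one leader per node. */
    MPI_Comm low, up;
    MPI_Comm_split_type(comm, MPI_COMM_TYPE_SHARED, wrank, MPI_INFO_NULL, &low);
    int lrank, lsize;
    MPI_Comm_rank(low, &lrank);
    MPI_Comm_size(low, &lsize);
    MPI_Comm_split(comm, lrank == 0 ? 0 : MPI_UNDEFINED, wrank, &up);

    /* Leaders learn each local peer's count and global rank. */
    int *lcounts = NULL, *lranks = NULL, *ldispls = NULL;
    int ltotal = 0;
    if (0 == lrank) {
        lcounts = malloc(lsize * sizeof(int));
        lranks  = malloc(lsize * sizeof(int));
        ldispls = malloc(lsize * sizeof(int));
    }
    MPI_Gather(&sendcount, 1, MPI_INT, lcounts, 1, MPI_INT, 0, low);
    MPI_Gather(&wrank, 1, MPI_INT, lranks, 1, MPI_INT, 0, low);

    /* Step 1: gatherv inside the node into a temporary buffer. */
    int *ltmp = NULL;
    if (0 == lrank) {
        for (int i = 0; i < lsize; i++) { ldispls[i] = ltotal; ltotal += lcounts[i]; }
        ltmp = malloc((ltotal ? ltotal : 1) * sizeof(int));
    }
    MPI_Gatherv(sendbuf, sendcount, MPI_INT,
                ltmp, lcounts, ldispls, MPI_INT, 0, low);

    if (0 == lrank) {
        int urank, usize, is_root = (wrank == root), cand, uroot;
        MPI_Comm_rank(up, &urank);
        MPI_Comm_size(up, &usize);
        cand = is_root ? urank : -1;                /* locate the root's leader */
        MPI_Allreduce(&cand, &uroot, 1, MPI_INT, MPI_MAX, up);

        /* Step 2: gatherv the per-node blocks, plus the contributing global
         * ranks, from all leaders to the root. */
        int *usizes = NULL, *utotals = NULL, *rdispls = NULL, *ddispls = NULL;
        int *allranks = NULL, *alldata = NULL;
        int nranks = 0, ndata = 0;
        if (is_root) {
            usizes  = malloc(usize * sizeof(int));
            utotals = malloc(usize * sizeof(int));
            rdispls = malloc(usize * sizeof(int));
            ddispls = malloc(usize * sizeof(int));
        }
        MPI_Gather(&lsize, 1, MPI_INT, usizes, 1, MPI_INT, uroot, up);
        MPI_Gather(&ltotal, 1, MPI_INT, utotals, 1, MPI_INT, uroot, up);
        if (is_root) {
            for (int j = 0; j < usize; j++) {
                rdispls[j] = nranks; nranks += usizes[j];
                ddispls[j] = ndata;  ndata  += utotals[j];
            }
            allranks = malloc(nranks * sizeof(int));
            alldata  = malloc((ndata ? ndata : 1) * sizeof(int));
        }
        MPI_Gatherv(lranks, lsize, MPI_INT, allranks, usizes, rdispls, MPI_INT, uroot, up);
        MPI_Gatherv(ltmp, ltotal, MPI_INT, alldata, utotals, ddispls, MPI_INT, uroot, up);

        /* Step 3: the gathered data arrives in node order, not rank order,
         * so copy each contribution to its displacement in recvbuf.  This
         * temporary buffer and extra memcpy is the large-message cost the
         * commit message points out. */
        if (is_root) {
            int off = 0;
            for (int i = 0; i < nranks; i++) {
                int r = allranks[i];
                memcpy(recvbuf + displs[r], alldata + off, recvcounts[r] * sizeof(int));
                off += recvcounts[r];
            }
            free(usizes); free(utotals); free(rdispls); free(ddispls);
            free(allranks); free(alldata);
        }
        free(lcounts); free(lranks); free(ldispls); free(ltmp);
        MPI_Comm_free(&up);
    }
    MPI_Comm_free(&low);
}
```

Compared with a linear gatherv, the root receives one message per node instead of one per process, which is what reduces the incast pressure on it.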
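A matching sketch for the hierarchical scatterv from commit 3, again against plain MPI with hypothetical names. To keep it short it assumes the root is global rank 0, which then also becomes rank 0 of both the node-level and leader-level communicators; the temporary buffer and copy now appear on the send side, where the root packs contributions into node order before scattering:

```c
#include <mpi.h>
#include <stdlib.h>
#include <string.h>

static void hier_scatterv_int(const int *sendbuf, const int *sendcounts,
                              const int *displs, int *recvbuf, int recvcount,
                              MPI_Comm comm)
{
    int wrank;
    MPI_Comm_rank(comm, &wrank);

    /* Same two-level communicator split as in the gatherv sketch. */
    MPI_Comm low, up;
    MPI_Comm_split_type(comm, MPI_COMM_TYPE_SHARED, wrank, MPI_INFO_NULL, &low);
    int lrank, lsize;
    MPI_Comm_rank(low, &lrank);
    MPI_Comm_size(low, &lsize);
    MPI_Comm_split(comm, lrank == 0 ? 0 : MPI_UNDEFINED, wrank, &up);

    /* Leaders learn each local peer's global rank and receive count. */
    int *lranks = NULL, *lcounts = NULL, *ldispls = NULL, *ltmp = NULL;
    int ltotal = 0;
    if (0 == lrank) {
        lranks  = malloc(lsize * sizeof(int));
        lcounts = malloc(lsize * sizeof(int));
        ldispls = malloc(lsize * sizeof(int));
    }
    MPI_Gather(&wrank, 1, MPI_INT, lranks, 1, MPI_INT, 0, low);
    MPI_Gather(&recvcount, 1, MPI_INT, lcounts, 1, MPI_INT, 0, low);

    if (0 == lrank) {
        int urank, usize;
        MPI_Comm_rank(up, &urank);
        MPI_Comm_size(up, &usize);
        for (int i = 0; i < lsize; i++) { ldispls[i] = ltotal; ltotal += lcounts[i]; }
        ltmp = malloc((ltotal ? ltotal : 1) * sizeof(int));

        /* The root learns which global ranks live on each node and how much
         * data each node needs in total. */
        int *usizes = NULL, *utotals = NULL, *rdispls = NULL, *ddispls = NULL;
        int *allranks = NULL, *packed = NULL;
        int nranks = 0, ndata = 0;
        if (0 == urank) {
            usizes  = malloc(usize * sizeof(int));
            utotals = malloc(usize * sizeof(int));
            rdispls = malloc(usize * sizeof(int));
            ddispls = malloc(usize * sizeof(int));
        }
        MPI_Gather(&lsize, 1, MPI_INT, usizes, 1, MPI_INT, 0, up);
        MPI_Gather(&ltotal, 1, MPI_INT, utotals, 1, MPI_INT, 0, up);
        if (0 == urank) {
            for (int j = 0; j < usize; j++) {
                rdispls[j] = nranks; nranks += usizes[j];
                ddispls[j] = ndata;  ndata  += utotals[j];
            }
            allranks = malloc(nranks * sizeof(int));
            packed   = malloc((ndata ? ndata : 1) * sizeof(int));
        }
        MPI_Gatherv(lranks, lsize, MPI_INT, allranks, usizes, rdispls, MPI_INT, 0, up);

        /* Pack the send buffer into node order.  This temporary buffer and
         * extra memcpy is the large-message cost the commit message notes. */
        if (0 == urank) {
            int off = 0;
            for (int i = 0; i < nranks; i++) {
                int r = allranks[i];
                memcpy(packed + off, sendbuf + displs[r], sendcounts[r] * sizeof(int));
                off += sendcounts[r];
            }
        }

        /* Step 1: scatter one packed block per node to its leader. */
        MPI_Scatterv(packed, utotals, ddispls, MPI_INT, ltmp, ltotal, MPI_INT, 0, up);

        if (0 == urank) {
            free(usizes); free(utotals); free(rdispls); free(ddispls);
            free(allranks); free(packed);
        }
        MPI_Comm_free(&up);
    }

    /* Step 2: each leader scatters its block to the processes on its node. */
    MPI_Scatterv(ltmp, lcounts, ldispls, MPI_INT, recvbuf, recvcount, MPI_INT, 0, low);

    if (0 == lrank) { free(lranks); free(lcounts); free(ldispls); free(ltmp); }
    MPI_Comm_free(&low);
}
```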