MPI_PROC_NULL in Topology Creation #4675

Open
omor1 opened this issue Dec 29, 2017 · 7 comments

@omor1
Contributor

omor1 commented Dec 29, 2017

I'm trying to create a binary tree topology using MPI_Dist_graph_create_adjacent(), simplifying the handling of the graph boundaries by using MPI_PROC_NULL. This allows every node to be specified in a consistent way.
I'm not sure this is allowed by the specification; I could find no information either way.
However, the neighborhood collectives specify that processes at the borders of a non-periodic Cartesian topology act as though they send to and receive from MPI_PROC_NULL. It could be useful to be able to obtain similar behavior in a generic graph.

Example code is given below:

int world_rank;
int world_size;
int neighbor[3];
MPI_Comm CommTree;

MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
MPI_Comm_size(MPI_COMM_WORLD, &world_size);

/* Parent and children in a binary tree rooted at rank 0; neighbors that
   fall outside the tree are replaced with MPI_PROC_NULL. */
neighbor[0] = world_rank > 0 ? (world_rank-1)/2 : MPI_PROC_NULL;
neighbor[1] = 2*world_rank+1 < world_size ? 2*world_rank+1 : MPI_PROC_NULL;
neighbor[2] = 2*world_rank+2 < world_size ? 2*world_rank+2 : MPI_PROC_NULL;

MPI_Dist_graph_create_adjacent(MPI_COMM_WORLD,
                               3, neighbor, MPI_UNWEIGHTED,
                               3, neighbor, MPI_UNWEIGHTED,
                               MPI_INFO_NULL, 1 /* reorder */, &CommTree);

Currently this code results in the following error:

*** An error occurred in MPI_Dist_graph_create_adjacent invalid sources
*** reported by process [3896508417,2]
*** on communicator MPI_COMM_WORLD
*** MPI_ERR_ARG: invalid argument of some other kind
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
@bosilca
Member

bosilca commented Jan 2, 2018

My understanding of the MPI standard's description of the topology creation functions is that the list of neighbors is rank-based and should only contain the meaningful neighbors. Thus, you cannot have MPI_PROC_NULL as a neighbor in any type of topology.
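For contrast, here is a minimal sketch of what listing only the existing neighbors would look like for the same binary tree (nbr, degree, and CommTree are illustrative names; world_rank and world_size are as in the report above); the degree then varies between 0 and 3 across ranks:

int degree = 0;
int nbr[3];
MPI_Comm CommTree;

/* List only the neighbors that actually exist: parent, left child, right child. */
if (world_rank > 0)                  nbr[degree++] = (world_rank - 1) / 2;
if (2 * world_rank + 1 < world_size) nbr[degree++] = 2 * world_rank + 1;
if (2 * world_rank + 2 < world_size) nbr[degree++] = 2 * world_rank + 2;

MPI_Dist_graph_create_adjacent(MPI_COMM_WORLD,
                               degree, nbr, MPI_UNWEIGHTED,
                               degree, nbr, MPI_UNWEIGHTED,
                               MPI_INFO_NULL, 1 /* reorder */, &CommTree);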

@omor1
Contributor Author

omor1 commented Jan 2, 2018

It is true that the distributed graph constructors specify sources and destinations as "array[s] of non-negative integers" describing "ranks of processes for which the calling process is a destination / source".

MPI_PROC_NULL is defined as -2 in Open MPI, which would make it invalid according to the specification. However, the regular graph constructor MPI_Graph_create() specifies edges as "array of integers describing graph edges", which could mean that perhaps MPI_PROC_NULL is legal there.

I'm not quite sure what you mean by meaningful neighbors; that term isn't used in the specification and is somewhat vague, since what is 'meaningful' can vary between implementations. The specification in fact explicitly allows edges to be defined multiple times for the same (source, dest) pair, but leaves the meaning up to the implementation. Similarly, at least for the non-distributed graph constructor, a process can be its own neighbor; although the specification doesn't explicitly state that this also holds for the distributed graph constructors, I can't see a reason why it wouldn't be allowed as well.

Regarding MPI_PROC_NULL, the specification states:

The special value MPI_PROC_NULL can be used instead of a rank wherever a source or a destination argument is required in a call.

This would appear to imply that MPI_PROC_NULL should be legal in topology creation functions for specifying ranks of processes, though the standard is vague on this point.
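As a short illustration of the point-to-point rule that sentence covers (buf and status are illustrative locals), a send or receive involving MPI_PROC_NULL completes immediately as a no-op; whether the same rank value may also appear in the adjacency arrays of the topology constructors is the question here:

int buf = 0;
MPI_Status status;

/* Both calls return immediately without transferring any data. */
MPI_Send(&buf, 1, MPI_INT, MPI_PROC_NULL, 0, MPI_COMM_WORLD);
MPI_Recv(&buf, 1, MPI_INT, MPI_PROC_NULL, 0, MPI_COMM_WORLD, &status);
/* After the receive, buf is unchanged and status reports
   MPI_PROC_NULL as the source and an element count of 0. */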

@bosilca
Member

bosilca commented Jan 2, 2018

A meaningful neighbor is a neighbor that defines an edge over which communication will take place. This does not prevent two ranks from being neighbors multiple times, nor a rank from being its own neighbor. Communications with MPI_PROC_NULL are meaningless, and except for adding gaps into the communication buffers I can't see why you would want to specify them.

This discussion pertains to the MPI standardization effort; I would suggest asking the question on the MPI Forum mailing list (mpi-forum@lists.mpi-forum.org).

@omor1
Contributor Author

omor1 commented Mar 8, 2018

I ended up posting this on the MPI Comments mailing list: Behavior of [Distributed] Graph Topology Constructors when a neighbor is MPI_PROC_NULL

Adding gaps in communication buffers is exactly the point here; it allows one to say "every process has n neighbors, but some of them happen to be null", analogous to how the neighborhood collectives behave with non-periodic Cartesian topologies:

For a Cartesian topology, created with MPI_Cart_create, the sequence of neighbors in the send and receive buffers at each process is defined by order of the dimensions, first the neighbor in the negative direction and then in the positive direction with displacement 1. The numbers of sources and destinations in the communication routines are 2*ndims with ndims defined in MPI_Cart_create. If a neighbor does not exist, i.e., at the border of a Cartesian topology in the case of a non-periodic virtual grid dimension (i.e., periods[...]==false), then this neighbor is defined to be MPI_PROC_NULL.

If a neighbor in any of the functions is MPI_PROC_NULL, then the neighborhood collective communication behaves like a point-to-point communication with MPI_PROC_NULL in this direction. That is, the buffer is still part of the sequence of neighbors but it is neither communicated nor updated.

This is useful for e.g. binary tree topologies (as described above).
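As a concrete sketch of the behavior being asked for, assuming MPI_PROC_NULL neighbors are accepted when CommTree is created (as in the code from the original report): each rank has three fixed neighbor slots, parent, left child, right child, and the slots that correspond to MPI_PROC_NULL are simply left untouched by the neighborhood collective:

int sendval = world_rank;
int recvval[3] = { -1, -1, -1 };   /* sentinels remain in the "gap" slots */

/* One int contributed per rank, one int received per neighbor slot. */
MPI_Neighbor_allgather(&sendval, 1, MPI_INT,
                       recvval,  1, MPI_INT, CommTree);

/* recvval[0]: parent's value      (or -1 at the root)
   recvval[1]: left child's value  (or -1 if absent)
   recvval[2]: right child's value (or -1 if absent) */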

bosilca added a commit to bosilca/ompi that referenced this issue Mar 9, 2018
Allowing MPI_PROC_NULL as a neighbor in any topology allows us to add
gaps in the send and recv buffers. This makes the traditional
neighbor collectives behave similarly to the V versions, but at the
same time it allows users to skip the step where they prepare the
counts and the displacement arrays.

For more info please take a look at issue open-mpi#4675.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
@bosilca
Member

bosilca commented Mar 9, 2018

From a practical perspective I see your point. You want the same level of flexibility as the V version of the neighbor collective calls, but without having to provide an array of counts or compute the neighbor displacements locally.

I made a PR, #4898. Give it a try and let us know. Meanwhile I will try to get confirmation from the MPI Forum that this is the right approach.
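For comparison, here is a minimal sketch of the V-version workaround this refers to, assuming CommTree was created listing only the neighbors that exist, in the order parent, left child, right child: the caller has to assemble the counts and displacements by hand to keep fixed slots in the receive buffer.

int recvcounts[3], displs[3];
int k = 0;

/* One element expected from each existing neighbor, steered into a fixed slot. */
if (world_rank > 0)                  { recvcounts[k] = 1; displs[k] = 0; k++; } /* parent      -> slot 0 */
if (2 * world_rank + 1 < world_size) { recvcounts[k] = 1; displs[k] = 1; k++; } /* left child  -> slot 1 */
if (2 * world_rank + 2 < world_size) { recvcounts[k] = 1; displs[k] = 2; k++; } /* right child -> slot 2 */

int sendval = world_rank;
int recvval[3] = { -1, -1, -1 };
MPI_Neighbor_allgatherv(&sendval, 1, MPI_INT,
                        recvval, recvcounts, displs, MPI_INT, CommTree);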

@omor1
Contributor Author

omor1 commented Mar 10, 2018

I'll take a look at the PR. I'll also post on the MPI issues repository, as it seems more active than the mailing lists.

@omor1
Contributor Author

omor1 commented Mar 16, 2018

It appears that MPICH actually supports this behavior. I've submitted an issue to the MPI Issues repository asking for the wording to be clarified so that this is explicit.
