New MPI_SCATTER bug in OpenMPI5 #12383
Comments
Thanks for the report! Meanwhile, you can work around the issue by disabling the han module (for example with --mca coll ^han).
Please drop a note here on whether you can confirm the bug or not. My understanding is that for the non-root MPI procs, the send buffer, count, and datatype arguments should be ignored.
The issue can be evidenced with 2 nodes and 4 MPI tasks. The inline patch below is just enough to fix this issue, but similar ones likely occur in other parts of the module.

diff --git a/ompi/mca/coll/han/coll_han_scatter.c b/ompi/mca/coll/han/coll_han_scatter.c
index 6f65d7d427..ce180d6ff9 100644
--- a/ompi/mca/coll/han/coll_han_scatter.c
+++ b/ompi/mca/coll/han/coll_han_scatter.c
@@ -242,9 +242,16 @@ int mca_coll_han_scatter_ls_task(void *task_args)
                      t->w_rank));
     OBJ_RELEASE(t->cur_task);
-    t->low_comm->c_coll->coll_scatter((char *) t->sbuf, t->scount, t->sdtype, (char *) t->rbuf,
-                                      t->rcount, t->rdtype, t->root_low_rank, t->low_comm,
-                                      t->low_comm->c_coll->coll_scatter_module);
+    if (t->root == t->w_rank) {
+        t->low_comm->c_coll->coll_scatter((char *) t->sbuf, t->scount, t->sdtype, (char *) t->rbuf,
+                                          t->rcount, t->rdtype, t->root_low_rank, t->low_comm,
+                                          t->low_comm->c_coll->coll_scatter_module);
+    } else {
+        t->low_comm->c_coll->coll_scatter((char *) t->sbuf, t->rcount, t->rdtype, (char *) t->rbuf,
+                                          t->rcount, t->rdtype, t->root_low_rank, t->low_comm,
+                                          t->low_comm->c_coll->coll_scatter_module);
+    }
+
     if (t->sbuf_inter_free != NULL && t->noop != true) {
         free(t->sbuf_inter_free);
@dl1ycf yes, I was able to reproduce the issue. I share your analysis: non-root ranks should ignore the send buffer/count/type, so this is a bug in the han module.
Do you assume that recvcount on the non-root ranks equals sendcount on the root rank? I read the manual as allowing recvcount to be larger (if recvbuf is longer than needed). But I can confirm that coll=^han saves me.
No, and that would be an incorrect assumption. The correct one is that the type signature given by recvcount/recvtype on each receiving rank must match the type signature given by sendcount/sendtype on the root.
@ggouaillardet Since you have already figured out the issue, would it be possible for you to open a patch for this? That would be wonderful :) |
@ggouaillardet nailed it. The issue here is that leaf ranks promoted to leaders in the node-level communicators should receive the data as rdtype/rcount and then propagate it as rdtype/rcount, with the exception of the original root, which uses the original buffer and thus must use sdtype/scount.
Fixes open-mpi#12383 Thanks to Christoph van Wüllen for reporting the issue. Signed-off-by: Wenduo Wang <wenduwan@amazon.com>
Fixes open-mpi#12383 Thanks to Christoph van Wüllen for reporting the issue. Signed-off-by: Wenduo Wang <wenduwan@amazon.com> (cherry picked from commit a58e884)
OpenMPI 5.0.2
An MPI_SCATTER fails if we have more than one node, each running more than one process, and recvcount on a non-root rank does not match sendcount on the root. The following Fortran program demonstrates the error: if run as follows
rank 0 on node1
rank 1 on node1
rank 2 on node2
rank 3 on node2
rank 4 on node3
rank 5 on node3
the result on ranks 2,3,4,5 is wrong. If run with one rank per node it is OK.
When changing the second parameter (the sendcount) of the second (client) MPI_SCATTER call from 0 to 1000, it works in either case.
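The Fortran reproducer referenced above is not present in this copy of the report. The C sketch below is a reconstruction of the described call pattern, not the original program: the per-rank count, the rank-per-node layout, and the sendcount of 0 on non-root ranks are taken from the description; everything else is an assumption. It requires an MPI installation and a multi-node launch (for example `mpirun -np 4 --map-by ppr:2:node ./scatter_repro`) to trigger the bug.

```
/* Hypothetical C reconstruction of the lost Fortran test: rank 0 scatters
 * COUNT integers to every rank; non-root ranks pass sendcount = 0, which
 * MPI_Scatter must ignore off-root. With the han bug, ranks on remote
 * nodes receive wrong data. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define COUNT 1000

int main(int argc, char **argv)
{
    int rank, size, ok = 1;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int *recvbuf = malloc(COUNT * sizeof(int));
    if (rank == 0) {
        int *sendbuf = malloc((size_t)size * COUNT * sizeof(int));
        for (int i = 0; i < size * COUNT; i++) sendbuf[i] = i;
        MPI_Scatter(sendbuf, COUNT, MPI_INT,
                    recvbuf, COUNT, MPI_INT, 0, MPI_COMM_WORLD);
        free(sendbuf);
    } else {
        /* sendcount = 0 is legal here: send args are ignored off-root.
         * Changing this 0 to COUNT masks the han bug, as the reporter saw. */
        MPI_Scatter(NULL, 0, MPI_INT,
                    recvbuf, COUNT, MPI_INT, 0, MPI_COMM_WORLD);
    }

    for (int i = 0; i < COUNT; i++)
        if (recvbuf[i] != rank * COUNT + i) ok = 0;
    printf("rank %d: %s\n", rank, ok ? "OK" : "WRONG DATA");

    free(recvbuf);
    MPI_Finalize();
    return 0;
}
```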
P.S.: this was not a problem with OpenMPI 4.
Yours,
Christoph van Wüllen