[Q] srun + OpenMPI MPI_Comm_split_type #8299

Closed
angainor opened this issue Dec 18, 2020 · 7 comments

Comments

@angainor

I'm having trouble using MPI_Comm_split_type for in-node splits with the custom OMPI_COMM_TYPE_* values. Everything works fine when I run with mpirun, but the same code doesn't work with srun. Is this supposed to work, or are there some limitations I'm not aware of? I'm trying with OpenMPI 4.0.3 and 4.0.5 on CentOS 7.7 with Slurm 19.05, the stock hwloc 1.11 (but I also tried compiling OpenMPI against hwloc 2.4.0), and PMIx 3.1.5.

This is a simple test app:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
  MPI_Comm split_comm;
  int split_rank = -1, split_size = -1;

  MPI_Init(&argc, &argv);

  /* split by NUMA domain, using Open MPI's extension to the standard split types */
  MPI_Comm_split_type(MPI_COMM_WORLD, OMPI_COMM_TYPE_NUMA, 0, MPI_INFO_NULL, &split_comm);
  MPI_Comm_rank(split_comm, &split_rank);
  MPI_Comm_size(split_comm, &split_size);

  fprintf(stderr, "rank %d >>> split size %d\n", split_rank, split_size);

  MPI_Barrier(MPI_COMM_WORLD);
  MPI_Finalize();
}

On our EPYC 7742 system I get this with mpirun:

mpirun -np 128 ./splittest
rank 0 >>> split size 16

and this with srun:

srun -n 128 ./splittest
rank 0 >>> split size 1

Essentially, I get a split size of 1 whatever split type I use.
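
To separate a placement problem from a split problem, a slightly extended reproducer (a sanity-check sketch, not from the original report) can print the host name next to the split size; several ranks reporting the same host but a split size of 1 points at the locality-based split rather than at one-rank-per-node placement:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
  MPI_Comm split_comm;
  int world_rank = -1, split_size = -1, hostlen = 0;
  char host[MPI_MAX_PROCESSOR_NAME];

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
  MPI_Get_processor_name(host, &hostlen);

  MPI_Comm_split_type(MPI_COMM_WORLD, OMPI_COMM_TYPE_NUMA, 0, MPI_INFO_NULL, &split_comm);
  MPI_Comm_size(split_comm, &split_size);

  /* many ranks on the same host, each with split size 1, means the split
     had no locality information to work with */
  fprintf(stderr, "world rank %d on %s >>> split size %d\n", world_rank, host, split_size);

  MPI_Comm_free(&split_comm);
  MPI_Finalize();
  return 0;
}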

@jsquyres jsquyres added this to the v4.0.6 milestone Jan 12, 2021
@jsquyres
Member

Apologies for the delay here; we missed this issue as it was filed over the holidays. The issue was reported against the v4.0.x series, but I pre-emptively applied the v4.1.x tag as well.

@rhc54
Contributor

rhc54 commented Jan 12, 2021

@bosilca pointed me to the right place in the code. The problem here is that the split code requests the process locality for each proc participating in the split; that locality is then used to perform the split. OMPI's runtime provides that locality, but Slurm does not. At the moment, the code doesn't return an error if OMPI is unable to get the locality of each process; it just assumes that every proc is in its own region.

PMIx v4 added support for computing locality to make it easier for RMs to implement it, but Slurm hasn't been updated yet to take advantage of it. That might be one solution, if someone wants to update Slurm to support the PMIx v4 features.
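
A quick way to check whether the launcher published any locality information at all is to ask PMIx for the locality string of the calling process. A rough diagnostic sketch, assuming the PMIX_LOCALITY_STRING attribute and the standard PMIx client calls (verify the names against the installed pmix.h), launched with a PMIx-enabled starter such as srun --mpi=pmix:

#include <pmix.h>
#include <stdio.h>

int main(void)
{
  pmix_proc_t myproc;
  pmix_value_t *val = NULL;

  if (PMIX_SUCCESS != PMIx_Init(&myproc, NULL, 0)) {
    fprintf(stderr, "PMIx_Init failed (not launched under a PMIx-enabled starter?)\n");
    return 1;
  }

  /* PMIX_LOCALITY_STRING is the relative-locality info the MPI layer needs
     for the OMPI_COMM_TYPE_* splits */
  if (PMIX_SUCCESS == PMIx_Get(&myproc, PMIX_LOCALITY_STRING, NULL, 0, &val) &&
      NULL != val && PMIX_STRING == val->type) {
    fprintf(stderr, "rank %u: locality = %s\n", myproc.rank, val->data.string);
    PMIX_VALUE_RELEASE(val);
  } else {
    fprintf(stderr, "rank %u: no locality string provided by the RM\n", myproc.rank);
  }

  PMIx_Finalize(NULL, 0);
  return 0;
}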

@bosilca
Member

bosilca commented Jan 12, 2021

During the discussion I proposed to throw an error if we are missing the information for the requested split type. I should have known better: one doesn't just raise errors in MPI, as that would automatically trigger the default error handler and abort the application. Thus, with the current MPI version there is little we can do except return a duplicate of MPI_COMM_SELF to put each process in its own communicator.
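
For context on why raising an error is off the table by default: MPI communicators start with the MPI_ERRORS_ARE_FATAL handler, so an error returned by the library aborts the job unless the application installs MPI_ERRORS_RETURN first. A minimal application-side sketch (this does not change the current silent fallback; it only shows how an error would be caught if one were ever reported):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
  MPI_Comm split_comm;
  int rc;

  MPI_Init(&argc, &argv);

  /* replace the default MPI_ERRORS_ARE_FATAL handler so a failing call
     returns an error code instead of aborting the whole job */
  MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

  rc = MPI_Comm_split_type(MPI_COMM_WORLD, OMPI_COMM_TYPE_NUMA, 0,
                           MPI_INFO_NULL, &split_comm);
  if (MPI_SUCCESS != rc) {
    fprintf(stderr, "MPI_Comm_split_type failed with rc=%d\n", rc);
  } else {
    MPI_Comm_free(&split_comm);
  }

  MPI_Finalize();
  return 0;
}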

@rhc54
Contributor

rhc54 commented Jan 12, 2021

Ah, good point! I had totally missed that one too. FWIW: it would be pretty simple to update Slurm to provide the required info, if someone is interested in doing so. Basically just one function call.
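
A very rough sketch of what that call could look like on the Slurm/plugin side, assuming the PMIx v4 helpers PMIx_Parse_cpuset_string() and PMIx_server_generate_locality_string() (names and signatures should be checked against the installed pmix.h/pmix_server.h; this is not existing Slurm code):

#include <string.h>
#include <pmix_server.h>

/* hypothetical helper for an RM: turn the hwloc cpuset string it already
   knows for a local process into the locality string the MPI layer needs,
   which it would then publish as PMIX_LOCALITY_STRING for that proc */
static char *locality_for(const char *cpuset_str)
{
  pmix_cpuset_t cpuset;
  char *locality = NULL;

  memset(&cpuset, 0, sizeof(cpuset));
  if (PMIX_SUCCESS != PMIx_Parse_cpuset_string(cpuset_str, &cpuset)) {
    return NULL;
  }
  if (PMIX_SUCCESS != PMIx_server_generate_locality_string(&cpuset, &locality)) {
    locality = NULL;
  }
  /* cleanup of the parsed cpuset omitted; use the matching destruct/free
     macro from the installed PMIx version */
  return locality;
}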

@angainor
Author

> Apologies for the delay here; we missed this issue as it was filed over the holidays. The issue was reported against the v4.0.x series, but I pre-emptively applied the v4.1.x tag as well.

NP, we use mpirun for now, so it's not critical.

@jsquyres
Member

jsquyres commented Feb 1, 2021

@artpol84 Since NVIDIA is the maintainer of the SLURM plugin, is there any chance you guys will add support for this? 😄

@rhc54
Contributor

rhc54 commented Mar 1, 2021

The Slurm folks have shown no interest in supporting it, and there has been no answer from NVIDIA.

@rhc54 rhc54 closed this as completed Mar 1, 2021