Tests seemingly fail if MPI runs on login node are not allowed/configured #211
galexv changed the title from "Tests seemingly fail when MPI runs on login node are not allowed/configured" to "Tests seemingly fail if MPI runs on login node are not allowed/configured" on May 5, 2016
galexv added a commit that referenced this issue on Mar 29, 2019:
This commit introduces the following environment variables that affect `make test` (or `ctest`) behavior of ALPSCore:

| Variable | Default | Usual value | Meaning |
|----------|---------|-------------|---------|
| `ALPS_TEST_MPIEXEC` | `${MPIEXEC}` | `mpiexec` | MPI launcher |
| `ALPS_TEST_MPI_NPROC_FLAG` | `${MPIEXEC_NUMPROC_FLAG}` | `-n` | Flag to specify the number of MPI processes |
| `ALPS_TEST_MPI_NPROC` | 1 | 1 | How many MPI processes to launch in MPI-enabled tests |
| `ALPS_TEST_MPIEXEC_PREFLAGS` | `${MPIEXEC_PREFLAGS}` | (empty string) | MPI launcher arguments preceding the executable name |
| `ALPS_TEST_MPIEXEC_POSTFLAGS` | `${MPIEXEC_POSTFLAGS}` | (empty string) | MPI launcher arguments preceding the arguments for the executable |

The `${...}` above are CMake variables, normally set by the `FindMPI` module. Related: issue #211. This should close #296.

Intended use:

**Case 1: Vanilla MPI-enabled environment.**

The command to run an MPI program using 2 processes: `mpiexec -n 2 some_test`

Setting the variables to run each MPI-enabled test on 2 processes: `ALPS_TEST_MPI_NPROC=2 make test`

**Case 2: NERSC Cori** (*Disclaimer:* not tested with an actual Cori run)

Users are not supposed to run `mpiexec` directly; one has to allocate interactive nodes first.

Allocating 2 Haswell nodes for 30 minutes: `salloc -N 2 -C haswell -q interactive -t 0:30:00`

Command to run on the allocated nodes: `srun some_test`

Setting the variables to run each MPI-enabled test on the allocated nodes: `ALPS_TEST_MPIEXEC=srun ALPS_TEST_MPI_NPROC=' ' ALPS_TEST_MPI_NPROC_FLAG=' ' make test` (note that the variables are assigned spaces, not empty strings!)

**Case 3: Blue Waters** (*Disclaimer:* not tested on an actual Blue Waters machine)

The `aprun` command is supposed to be used to launch parallel processes from an interactive node (see https://bluewaters.ncsa.illinois.edu/using-aprun ).

Command to run on 16 cores, using 8 cores per node (that is, 2 nodes), placing the processes on adjacent cores: `aprun -N 8 -d 1 -n 16 some_test`

Setting the variables to run each MPI-enabled test with this configuration: `ALPS_TEST_MPIEXEC=aprun ALPS_TEST_MPI_NPROC=16 ALPS_TEST_MPI_NPROC_FLAG='-N 8 -d 1 -n' make test`
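Taken together, the variables compose a single launch line of the form `launcher nproc-flag nproc preflags executable postflags`. The following is a minimal shell sketch of that composition with the defaults from the table; it is an illustration only, not the actual ALPSCore CMake logic, and the local variable names are assumptions:

```shell
#!/bin/sh
# Sketch: compose an MPI launch line from the ALPS_TEST_* variables,
# falling back to their documented defaults when unset.
mpiexec_cmd="${ALPS_TEST_MPIEXEC:-mpiexec}"
nproc_flag="${ALPS_TEST_MPI_NPROC_FLAG:--n}"
nproc="${ALPS_TEST_MPI_NPROC:-1}"
preflags="${ALPS_TEST_MPIEXEC_PREFLAGS:-}"
postflags="${ALPS_TEST_MPIEXEC_POSTFLAGS:-}"
# The variables are left unquoted so that empty pre/post flags vanish
# via word splitting instead of producing empty arguments:
echo $mpiexec_cmd $nproc_flag $nproc $preflags some_test $postflags
```

With all variables unset this prints `mpiexec -n 1 some_test`; exporting, e.g., `ALPS_TEST_MPI_NPROC=2` before running changes it accordingly.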
Running `make test` reports failure on Stampede. The reasons are: `mpd` is not running, and `mpiexec` is anyway not allowed [although it can be run, if needed. ;) ].

We may want to check if a dummy MPI program runs successfully, and if not, run the tests without `mpiexec` (they should still pass). Also, we may want to have an `mpitest` make-target to run MPI-related tests only, and on more than 1 core.
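The dummy-run check suggested above could be sketched as a shell probe that launches a trivial command under the MPI launcher and falls back when it fails. This is a hypothetical illustration, not ALPSCore code; the fallback message and the use of `ALPS_TEST_MPIEXEC` as the override knob are assumptions:

```shell
#!/bin/sh
# Sketch: probe whether the MPI launcher works at all on this node by
# launching a trivial command ('true'). If the probe fails (e.g. on a
# login node where MPI runs are not allowed), the test harness could
# run the tests without the launcher instead.
launcher="${ALPS_TEST_MPIEXEC:-mpiexec}"
if "$launcher" -n 1 true >/dev/null 2>&1; then
    echo "launcher usable: tests can run under $launcher"
else
    echo "launcher unusable: run tests without mpiexec"
fi
```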