use PRRTE MCA environment variable for oversubscription in OpenMPI easyblock #3360

Merged


geimer
Contributor

@geimer geimer commented Jun 13, 2024

Open MPI 5.x uses PRRTE as its run-time environment. Allowing node oversubscription when running tests therefore requires setting a different MCA environment variable than with earlier versions. See https://docs.open-mpi.org/en/v5.0.x/mca.html#converting-mapping-parameters

@bedroge
Contributor

bedroge commented Jun 13, 2024

Thanks for this fix! I actually ran into that issue recently when testing the OpenMPI 5.0.3 easyconfig, but I hadn't looked into the cause. This seems to work fine for me (test report coming soon...).

@bedroge
Contributor

bedroge commented Jun 13, 2024

@boegelbot please test @ jsc-zen3
EB_ARGS="OpenMPI-5.0.3-GCC-13.3.0.eb"

@boegelbot

@bedroge: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de

PR test command 'if [[ develop != 'develop' ]]; then EB_BRANCH=develop ./easybuild_develop.sh 2> /dev/null 1>&2; EB_PREFIX=/home/boegelbot/easybuild/develop source init_env_easybuild_develop.sh; fi; EB_PR=3360 EB_ARGS="OpenMPI-5.0.3-GCC-13.3.0.eb" EB_REPO=easybuild-easyblocks EB_BRANCH=develop /opt/software/slurm/bin/sbatch --job-name test_PR_3360 --ntasks=8 ~/boegelbot/eb_from_pr_upload_jsc-zen3.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 4372

Test results coming soon (I hope)...

- notification for comment with ID 2165750313 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@bedroge
Contributor

bedroge commented Jun 13, 2024

Test report by @bedroge

Overview of tested easyconfigs (in order)

  • SUCCESS OpenMPI-5.0.3-GCC-13.3.0.eb

Build succeeded for 1 out of 1 (1 easyconfigs in total)
bob-Latitude-5300 - Linux Ubuntu 22.04, x86_64, Intel(R) Core(TM) i7-8665U CPU @ 1.90GHz, Python 3.10.12
See https://gist.github.com/bedroge/6337c328ce45a7b00660acd808f7e50a for a full test report.

edit: log shows that it was using the correct environment variable, e.g.:

== 2024-06-13 16:16:52,116 easyblock.py:3634 INFO sanity check command PRTE_MCA_rmaps_default_mapping_policy=:oversubscribe mpirun -n 6 /data/eb/build/OpenMPI/5.0.3/GCC-13.3.0/mpi_test_ring_usempi ran successfully! (output: Process 0 sending 10 to  1 tag 201 ( 6 processes in ring)

@boegelbot

Test report by @boegelbot

Overview of tested easyconfigs (in order)

  • SUCCESS OpenMPI-5.0.3-GCC-13.3.0.eb

Build succeeded for 1 out of 1 (1 easyconfigs in total)
jsczen3c1.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.4, x86_64, AMD EPYC-Milan Processor (zen3), Python 3.9.18
See https://gist.github.com/boegelbot/2bd90138e57fb9e7da88bffce7f80f82 for a full test report.

@bedroge
Contributor

bedroge commented Jun 13, 2024

Test report by @bedroge

Overview of tested easyconfigs (in order)

  • SUCCESS OpenMPI-4.1.5-GCC-12.2.0.eb

Build succeeded for 1 out of 1 (1 easyconfigs in total)
bob-Latitude-5300 - Linux Ubuntu 22.04, x86_64, Intel(R) Core(TM) i7-8665U CPU @ 1.90GHz, Python 3.10.12
See https://gist.github.com/bedroge/c9e5cce9dcb327a055620a818a44d4c8 for a full test report.

edit: for this version it also picks up the right environment variable:

== 2024-06-13 17:05:05,445 easyblock.py:3634 INFO sanity check command OMPI_MCA_rmaps_base_oversubscribe=1 mpirun -n 6 /data/eb/build/OpenMPI/4.1.5/GCC-12.2.0/mpi_test_ring_usempi ran successfully!
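
For readers skimming the two log excerpts above, the version-dependent choice of environment variable could be sketched roughly as follows. This is a minimal illustration and not the easyblock's actual code: the function names `oversubscribe_env_var` and `sanity_check_command` are hypothetical, and treating every release before 5 like the 4.1.x case is an assumption based only on the logs in this thread.

```python
def oversubscribe_env_var(ompi_version):
    """Return (name, value) of the MCA environment variable allowing oversubscription.

    Hypothetical helper for illustration; not taken from the OpenMPI easyblock.
    """
    # Only the major version matters for this sketch.
    major = int(ompi_version.split('.')[0])
    if major >= 5:
        # Open MPI 5.x uses PRRTE as run-time environment (see the 5.0.3 log above).
        return ('PRTE_MCA_rmaps_default_mapping_policy', ':oversubscribe')
    # Assumption: releases before 5 behave like 4.1.x (see the 4.1.5 log above).
    return ('OMPI_MCA_rmaps_base_oversubscribe', '1')


def sanity_check_command(ompi_version, test_binary, nprocs=6):
    """Build an mpirun command line prefixed with the matching environment variable."""
    name, value = oversubscribe_env_var(ompi_version)
    return "%s=%s mpirun -n %d %s" % (name, value, nprocs, test_binary)


if __name__ == '__main__':
    for version in ('5.0.3', '4.1.5'):
        print(sanity_check_command(version, './mpi_test_ring_usempi'))
```

The resulting strings have the same shape as the sanity check commands in the logs: the variable assignment, followed by `mpirun -n 6` and the test binary.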

@bedroge
Contributor

bedroge commented Jun 13, 2024

Test report by @bedroge

Overview of tested easyconfigs (in order)

  • SUCCESS OpenMPI-5.0.3-GCC-13.3.0.eb
  • SUCCESS OpenMPI-4.1.6-GCC-13.2.0.eb

Build succeeded for 2 out of 2 (2 easyconfigs in total)
interactive2 - Linux Rocky Linux 8.9, x86_64, AMD EPYC-Milan Processor (zen2), Python 3.6.8
See https://gist.github.com/bedroge/f349b8b84ec078c6e63591ccd6e20804 for a full test report.

@bedroge bedroge changed the title from "OpenMPI: Support PRRTE MCA env var for oversubscription" to "support PRRTE MCA environment variable for oversubscription in OpenMPI easyblock" Jun 13, 2024
@bedroge bedroge added this to the release after 4.9.2 milestone Jun 13, 2024
@bedroge bedroge merged commit 10e9a62 into easybuilders:develop Jun 13, 2024
41 checks passed
@geimer geimer deleted the fix-openmpi5-oversubscribe-envvar branch June 14, 2024 07:30
@boegel boegel changed the title from "support PRRTE MCA environment variable for oversubscription in OpenMPI easyblock" to "use PRRTE MCA environment variable for oversubscription in OpenMPI easyblock" Jun 19, 2024
3 participants