{mpi}[GCC/13.2.0] OpenMPI v5.0.3, PMIx v5.0.2 #17561
base: develop
I don't think there is really anything new to do with regard to CUDA. Just continue to patch in support for the internal header.
easybuild/easyconfigs/o/OpenMPI/OpenMPI-5.0.0rc10-GCC-12.2.0.eb
Is this PR going to be merged soon? I would be interested in using this version of OpenMPI.
…penMPI500rc10 bump PMIx and OpenMPI to 5.0.1 and use GCC 13.2.0
My remaining question here is whether we want to add the CUDA-related patches first, or merge this PR as is and add the CUDA-related patches in a follow-up PR.
@boegelbot please test @ jsc-zen3
@SebastianAchilles: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de PR test command '
Test results coming soon (I hope)... - notification for comment with ID 1904058315 processed Message to humans: this is just bookkeeping information for me,
Test report by @boegelbot
I can have a look this week to see how hard it is to port over the internal CUDA patches...
@boegelbot please test @ jsc-zen3
@SebastianAchilles: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de PR test command '
Test results coming soon (I hope)... - notification for comment with ID 1904168663 processed Message to humans: this is just bookkeeping information for me,
Test report by @boegelbot
This patch has changed since libcuda is no longer dlopen()'ed by Open MPI. Instead we can generate a stub library, and at runtime the CUDA-dependent DSOs (but not the main libmpi.so library) load libcuda.so. This is then consistent with https://docs.open-mpi.org/en/v5.0.x/tuning-apps/networking/cuda.html (although `--enable-mca-dso=<comma-delimited-list-of-cuda-components>` is already the default).
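For illustration only, here is how the configure option from the linked documentation could be assembled in an easyconfig if it ever had to be set explicitly. This is a sketch under assumptions: the helper `mca_dso_configopt` is hypothetical, and the component names in `cuda_components` are assumed example values, not a list confirmed in this PR. Since the option is already the default, nothing like this should be needed here.

```python
# Hypothetical list of CUDA-dependent MCA components.
# ASSUMPTION: these names are examples only and are not taken from this PR.
cuda_components = ['accelerator-cuda', 'btl-smcuda', 'rcache-gpusm', 'rcache-rgpusm']


def mca_dso_configopt(components):
    """Return the --enable-mca-dso configure option for the given components.

    A trailing space is kept so the result can be appended to an existing
    configopts string, as is common in easyconfigs.
    """
    return '--enable-mca-dso=%s ' % ','.join(components)


# Build the components listed above as DSOs, so that only they
# (and not libmpi.so itself) depend on libcuda.so at runtime.
configopts = mca_dso_configopt(cuda_components)
```

Keeping the component list in a single variable would make it easy to adjust if a newer Open MPI release adds or renames CUDA-dependent components.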
@boegelbot please test @ jsc-zen3
@boegel: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de PR test command '
Test results coming soon (I hope)... - notification for comment with ID 1946547876 processed Message to humans: this is just bookkeeping information for me,
Test report by @boegelbot
I also had a look at
# disable MPI1 compatibility for now, see what breaks...
# configopts += '--enable-mpi1-compatibility '
This is commented out in all easyconfigs in EB5. I suggest we drop the comment for OpenMPI 5.
Test report by @branfosj
Test report by @branfosj
Test report by @branfosj
@boegel I guess we should also include the Edit: done in boegel#94
Add write memory barrier patch for `smcuda` to OpenMPI 5.0.2 easyconfig
@bartoldeman how did you decide which functions to provide stubs for in your patch? I guess those may need to be updated for newer OpenMPI versions that call additional functions?
@casparvl this is just the result of grepping for them; alternatively, if you compile and a function is not in the header file, you get an error message. Indeed, newer Open MPI versions may use more or different CUDA functions, which would necessitate changing the header file.
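The grepping approach described above could be roughly automated along these lines. This is a minimal sketch, not the procedure actually used for the patch: the regular expression (names starting with `cu` followed by an uppercase letter and an opening parenthesis) and the helper names `cuda_functions_used` and `missing_from_header` are assumptions made for illustration.

```python
import re

# ASSUMPTION: CUDA driver API calls look like cuSomething(...).
# This pattern is a heuristic and may miss calls or match unrelated names.
CUDA_CALL = re.compile(r'\b(cu[A-Z][A-Za-z0-9_]*)\s*\(')


def cuda_functions_used(text):
    """Return the set of CUDA-style function names followed by '(' in text."""
    return set(CUDA_CALL.findall(text))


def missing_from_header(source_text, header_text):
    """Return the functions called in the source but not declared in the header."""
    used = cuda_functions_used(source_text)
    declared = cuda_functions_used(header_text)
    return sorted(used - declared)


# Tiny made-up example inputs (not taken from Open MPI or the patch).
source = "res = cuMemcpy(dst, src, n); cuStreamSynchronize(stream);"
header = "CUresult cuMemcpy(CUdeviceptr dst, CUdeviceptr src, size_t n);"

missing = missing_from_header(source, header)
print(missing)  # ['cuStreamSynchronize']
```

Running something like this against a newer Open MPI source tree and the patched-in header might flag functions that need to be added, although a compile error remains the definitive check.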
@boegelbot please test @ generoso
@boegel: Request for testing this PR well received on login1 PR test command '
Test results coming soon (I hope)... - notification for comment with ID 1995061997 processed Message to humans: this is just bookkeeping information for me,
Test report by @boegelbot
@boegelbot please test @ jsc-zen3
ofi with psm3 caused issues on Generoso. I think we've seen something like this before...
@bartoldeman: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de PR test command '
Test results coming soon (I hope)... - notification for comment with ID 1996249416 processed Message to humans: this is just bookkeeping information for me,
Test report by @boegelbot
`OpenMPI-5.0.x_add_atomic_wmb.patch` is obsolete now
Bump OpenMPI to 5.0.3, PMIx to 5.0.2
@boegelbot please test @ jsc-zen3
@bartoldeman: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de PR test command '
Test results coming soon (I hope)... - notification for comment with ID 2086458131 processed Message to humans: this is just bookkeeping information for me,
Test report by @boegelbot
(created using `eb --new-pr`)
WIP since we're using release candidates here, not final releases.
I had to strip out the CUDA-related patches we are using for OpenMPI 4.1.5 to get the build working; we'll need to figure out how to move forward there (cc @Micket, @bartoldeman)