Closed
Description
OpenMPI can fail to find libcuda.so, in which case it builds without the opal accelerator cuda component even when --with-cuda/--with-cuda-libdir is specified.
This was already reported as a bug in #12264 and fixed in #12382, but the bug persists.
I noticed it with a v5.0.3 tarball, and have been able to reproduce it on master.
Details of the problem
user@bigtwin1d:~/bkitor/bk_share/ompi_builds/ompi[master]$ ./build/bin/ompi_info | grep 'Configure command'
Configure command line: '--prefix=/home/user/bkitor/bk_share/ompi_builds/ompi/build' '--with-cuda=/usr/local/cuda' '--with-ofi=/usr/local'
user@bigtwin1d:~/bkitor/bk_share/ompi_builds/ompi[master]$ ./build/bin/ompi_info | grep 'MCA accelerator'
MCA accelerator: null (MCA v2.1.0, API v1.0.0, Component v5.1.0)
user@bigtwin1d:~/bkitor/bk_share/ompi_builds/ompi[master]$ ./build/bin/ompi_info | grep 'Configure command'
Configure command line: '--prefix=/home/user/bkitor/bk_share/ompi_builds/ompi/build' '--with-cuda=/usr/local/cuda' '--with-cuda-libdir=/usr/local/cuda' '--with-ofi=/usr/local'
user@bigtwin1d:~/bkitor/bk_share/ompi_builds/ompi[master]$ ./build/bin/ompi_info | grep 'MCA accelerator'
MCA accelerator: null (MCA v2.1.0, API v1.0.0, Component v5.1.0)
The crux of the issue is that /usr/local/cuda is a symlink, and the find command in opal_check_cuda.m4 won't follow it by default.
Adding the -H flag should fix the issue.
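
The symlink behaviour can be reproduced outside of configure with a small sketch. This is not the exact find invocation from opal_check_cuda.m4; the directory names (cuda-12.0, lib64/stubs) are a made-up layout created in a temporary directory, chosen only to mimic a /usr/local/cuda symlink pointing at a versioned install.

```shell
#!/bin/sh
# Hypothetical layout, built in a temp dir so nothing on the system is touched.
tmp=$(mktemp -d)
mkdir -p "$tmp/cuda-12.0/lib64/stubs"
touch "$tmp/cuda-12.0/lib64/stubs/libcuda.so"
ln -s "$tmp/cuda-12.0" "$tmp/cuda"   # stands in for the /usr/local/cuda symlink

# Default behaviour (-P): a symlink given as the starting point is not
# followed, so find never descends into the real directory.
without_H=$(find "$tmp/cuda" -name libcuda.so -print)

# With -H: symlinks named on the command line are followed.
with_H=$(find -H "$tmp/cuda" -name libcuda.so -print)

echo "without -H: ${without_H:-<nothing found>}"
echo "with -H:    $with_H"

rm -rf "$tmp"
```

Note that -H only follows symlinks given as starting points on the command line, not ones encountered during the traversal, so it is a narrower change than -L.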