Setting CXX/CC confuses CMake MPI detection #11478

Closed
tjhei opened this issue Jan 5, 2021 · 20 comments

Comments

@tjhei
Member

tjhei commented Jan 5, 2021

Setting CC and CXX to the MPI compiler wrappers (mpicc and mpicxx) confuses our MPI detection. While the library compiles correctly, the MPI include directories are not picked up in downstream projects, leading to weird parsing errors inside IDEs.

Here is the test:

cd build-a
CC=mpicc CXX=mpicxx cmake -D DEAL_II_WITH_MPI=ON ..
cd ..

cd build-b
CXX=mpicxx cmake -D DEAL_II_WITH_MPI=ON ..
cd ..

cd build-c
cmake -D DEAL_II_WITH_MPI=ON ..
cd ..

cd build-d
CC=mpicc cmake -D DEAL_II_WITH_MPI=ON ..

Here is the result:

$ grep MPI_INCLUDE_DIR build-*/detailed.log
build-a/detailed.log:#            MPI_INCLUDE_DIRS = 
build-b/detailed.log:#            MPI_INCLUDE_DIRS = /usr/lib/openmpi/include/openmpi/opal/mca/event/libevent2021/libevent;/usr/lib/openmpi/include/openmpi/opal/mca/event/libevent2021/libevent/include;/usr/lib/openmpi/include;/usr/lib/openmpi/include/openmpi
build-c/detailed.log:#            MPI_INCLUDE_DIRS = /usr/lib/openmpi/include/openmpi/opal/mca/event/libevent2021/libevent;/usr/lib/openmpi/include/openmpi/opal/mca/event/libevent2021/libevent/include;/usr/lib/openmpi/include;/usr/lib/openmpi/include/openmpi
build-d/detailed.log:#            MPI_INCLUDE_DIRS = /usr/lib/openmpi/include/openmpi/opal/mca/event/libevent2021/libevent;/usr/lib/openmpi/include/openmpi/opal/mca/event/libevent2021/libevent/include;/usr/lib/openmpi/include;/usr/lib/openmpi/include/openmpi

and

$ grep CMAKE_CXX_COMPILER:F build-*/CMakeCache.txt
build-a/CMakeCache.txt:CMAKE_CXX_COMPILER:FILEPATH=/usr/bin/mpicxx
build-b/CMakeCache.txt:CMAKE_CXX_COMPILER:FILEPATH=/usr/bin/mpicxx
build-c/CMakeCache.txt:CMAKE_CXX_COMPILER:FILEPATH=/usr/bin/c++
build-d/CMakeCache.txt:CMAKE_CXX_COMPILER:FILEPATH=/usr/bin/c++

Finally, ld=gold is also affected:

$ diff build-a/detailed.log build-c/detailed.log 
9c9
< #        CMAKE_BINARY_DIR:       /ssd/deal-gite/build-a
---
> #        CMAKE_BINARY_DIR:       /ssd/deal-gite/build-c
11,12c11,12
< #                                /usr/bin/mpicxx
< #        CMAKE_C_COMPILER:       /usr/bin/mpicc
---
> #                                /usr/bin/c++
> #        CMAKE_C_COMPILER:       /usr/bin/cc
20c20
< #        DEAL_II_LINKER_FLAGS:         -Wl,--as-needed -rdynamic -fuse-ld=gold
---
> #        DEAL_II_LINKER_FLAGS:         -Wl,--as-needed -rdynamic  -lpthread
72,76c72,76
< #            MPI_CXX_FLAGS = 
< #            MPI_LINKER_FLAGS = 
< #            MPI_INCLUDE_DIRS = 
< #            MPI_USER_INCLUDE_DIRS = 
< #            MPI_LIBRARIES = /usr/lib/openmpi/lib/libmpi_usempif08.so;/usr/lib/openmpi/lib/libmpi_usempi_ignore_tkr.so;/usr/lib/openmpi/lib/libmpi_mpifh.so;/usr/lib/openmpi/lib/libmpi.so
---
> #            MPI_CXX_FLAGS = -pthread
> #            MPI_LINKER_FLAGS = -Wl,-rpath -Wl,/usr/lib/openmpi/lib -Wl,--enable-new-dtags -pthread
> #            MPI_INCLUDE_DIRS = /usr/lib/openmpi/include/openmpi/opal/mca/event/libevent2021/libevent;/usr/lib/openmpi/include/openmpi/opal/mca/event/libevent2021/libevent/include;/usr/lib/openmpi/include;/usr/lib/openmpi/include/openmpi
> #            MPI_USER_INCLUDE_DIRS = /usr/lib/openmpi/include/openmpi/opal/mca/event/libevent2021/libevent;/usr/lib/openmpi/include/openmpi/opal/mca/event/libevent2021/libevent/include;/usr/lib/openmpi/include;/usr/lib/openmpi/include/openmpi
> #            MPI_LIBRARIES = /usr/lib/openmpi/lib/libmpi_cxx.so;/usr/lib/openmpi/lib/libmpi_usempif08.so;/usr/lib/openmpi/lib/libmpi_usempi_ignore_tkr.so;/usr/lib/openmpi/lib/libmpi_mpifh.so;/usr/lib/openmpi/lib/libmpi.so
$ cmake --version
cmake version 3.17.3
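
The per-directory comparison above can also be scripted. The following is only a minimal sketch and not part of deal.II: the helper name check_mpi_includes is made up for illustration, and it assumes that detailed.log keeps the "#   MPI_INCLUDE_DIRS = ..." layout shown in the grep output.

```shell
# Report, for each build directory given, whether MPI_INCLUDE_DIRS in
# detailed.log is empty or populated.
# Assumption: lines look like "#      MPI_INCLUDE_DIRS = <value>".
# A log without such a line is also reported as "empty".
check_mpi_includes() {
  for dir in "$@"; do
    log="$dir/detailed.log"
    if [ ! -f "$log" ]; then
      echo "$dir: no detailed.log"
      continue
    fi
    # Keep only the text after "MPI_INCLUDE_DIRS =" and drop trailing blanks.
    # MPI_USER_INCLUDE_DIRS does not match this pattern.
    value=$(sed -n 's/^#[[:space:]]*MPI_INCLUDE_DIRS[[:space:]]*=[[:space:]]*//p' "$log" \
      | head -n 1 | sed 's/[[:space:]]*$//')
    if [ -z "$value" ]; then
      echo "$dir: MPI_INCLUDE_DIRS is empty"
    else
      echo "$dir: MPI_INCLUDE_DIRS is set"
    fi
  done
}

# Possible usage from the directory that contains the build trees:
#   check_mpi_includes build-a build-b build-c build-d
```

With the four configurations listed above, only build-a would be expected to show up as empty.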

Any help here is appreciated. @tamiko can you reproduce this?

@masterleinad
Member

#5510 (comment) and https://gitlab.kitware.com/cmake/cmake/-/issues/17538 are related.
In general, CMAKE_CXX_COMPILER should just not be set to an MPI compiler wrapper (but MPI_CXX_COMPILER can be set additionally).

@tjhei
Member Author

tjhei commented Jan 6, 2021

Fascinating. Thanks for digging this up. Something else must be going wrong with ld-gold here.

@tjhei
Member Author

tjhei commented Jan 6, 2021

Well, I am running into #2820, so I cannot use the gold linker when I don't specify the compiler wrappers (Ubuntu 16.04 and 20.04 do the same thing). Sigh.

@tjhei
Member Author

tjhei commented Jan 7, 2021

A suggestion for anyone running Ubuntu (and for myself in the future) unless we change anything in the deal.II MPI detection:

@tjhei
Member Author

tjhei commented Jan 9, 2021

In general, CMAKE_CXX_COMPILER should just not be set to an MPI compiler wrapper (but MPI_CXX_COMPILER can be set additionally).

I have one additional observation:
If I want to use a non-default MPI+compiler installation, I am used to a) setting the compiler used by the wrapper (for example with MPICH_CXX=clang++) and b) configuring deal.II with CXX=mpicxx. This works as expected.

Without setting CXX=mpicxx and just using MPI_CXX_COMPILER instead, this won't work, because the C++ compiler is auto-detected (something like /usr/bin/c++).

Does this mean that I have to do something like:

export MPICH_CXX=clang++
CXX=clang++ cmake -D MPI_CXX_COMPILER=mpicxx -D DEAL_II_WITH_MPI=ON ..

Am I right? This is the first time I am seeing this. I don't think we document this or do we?
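
If the two settings do have to agree, a quick consistency check before configuring might help. This is only a sketch under assumptions: the helper names wrapped_compiler and same_compiler are invented here, and the input is expected to be the command line that MPICH's wrapper prints with "mpicxx -show" (Open MPI's wrapper uses "--showme" instead).

```shell
# Print the compiler an MPI wrapper would invoke, given the command line
# the wrapper reports (for MPICH: the output of "mpicxx -show").
wrapped_compiler() {
  # The first whitespace-separated word of the first non-empty line
  # is taken to be the compiler.
  printf '%s\n' "$1" | awk 'NF { print $1; exit }'
}

# Succeed if the wrapped compiler and the given compiler share a base name,
# so that /usr/bin/clang++ and clang++ count as the same compiler.
same_compiler() {
  [ "$(basename "$(wrapped_compiler "$1")")" = "$(basename "$2")" ]
}

# Possible usage (assumes an MPICH-style wrapper on the PATH):
#   same_compiler "$(mpicxx -show)" "$CXX" || echo "CXX and mpicxx disagree"
```

Comparing base names is a deliberate simplification; it does not detect two different installations of the same compiler.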

@tamiko
Member

tamiko commented Jan 10, 2021

@tjhei I will try to reproduce again tomorrow - sorry that I was a bit slow on this one.

@tjhei
Member Author

tjhei commented Jan 26, 2021

So my conclusion is that the current situation makes OpenMPI unusable on Ubuntu 16.04/18.04/20.04 for me, because I can either:
a) use CXX=mpicxx and not have the MPI includes set correctly in my IDE, or
b) not set the compiler wrappers and be unable to use gold, which makes development a pain (#2820).

:-(

@tjhei tjhei added this to the Release 9.3 milestone Apr 13, 2021
@tjhei
Member Author

tjhei commented Apr 13, 2021

It would be great to figure this out before the release, as Ubuntu+OpenMPI is a common setup.

@peterrum
Member

peterrum commented May 1, 2021

I don't think I can help here. Are there realistic chances that we find a solution in the next two weeks?

@tjhei
Member Author

tjhei commented May 1, 2021

I think this is a major issue, but I have no idea how to proceed and I am hoping @tamiko finds some time before the release.

@tamiko tamiko self-assigned this May 1, 2021
@tamiko
Member

tamiko commented May 1, 2021

Let me have a look before the release.
But this is pretty much a CMake internal issue - we rely internally on what the FindMPI.cmake provides us.

But maybe we can work around that.

@tamiko
Member

tamiko commented May 22, 2021

Ah, shoot - I promised this one. Let us update the milestone to 10.0, and if I have a solution in the next 7 days that is not too scary, we can cherry-pick it.

@tamiko
Member

tamiko commented May 30, 2021

@tjhei I was able to reproduce this. The issue is an apparent underlinkage with OpenMPI on Ubuntu (and Debian) systems, where libopen-pal is missing on the link line. See https://bugs.launchpad.net/ubuntu/+source/deal.ii/+bug/1841577

I will try to come up with a heuristic to (somehow robustly) detect this.
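
As a rough illustration of what such a check could look at (this is not the heuristic used in deal.II, and the helper name links_open_pal is hypothetical), one can test whether a link line or library list mentions libopen-pal at all:

```shell
# Succeed if the given link line or library list already mentions
# libopen-pal, either as "-lopen-pal" or as a path to libopen-pal.*.
links_open_pal() {
  case "$1" in
    *-lopen-pal*|*libopen-pal.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Possible usage against the MPI_LIBRARIES entry of a configured build:
#   links_open_pal "$(grep 'MPI_LIBRARIES =' build-c/detailed.log)" \
#     || echo "libopen-pal is not on the link line"
```

A string match like this only shows whether the library is listed; it says nothing about whether linking without it actually fails with a given linker.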

@tamiko
Member

tamiko commented May 30, 2021

@tjhei For a quick confirmation: If you apply the following patch:

Description: Explicitly link libopen-pal to avoid apparent underlinking in openmpi
 This allows deal.ii to be linked with the gold linker,
 avoiding a FTBFS on arm64 and ppc64el.
Bug-Ubuntu: https://bugs.launchpad.net/bugs/1841577
Author: Graham Inggs <ginggs@debian.org>
Last-Update: 2019-08-28

--- a/cmake/modules/FindMPI.cmake
+++ b/cmake/modules/FindMPI.cmake
@@ -124,6 +124,8 @@
   SET(MPI_VERSION_MINOR "0")
 ENDIF()
 
+LIST(APPEND MPI_CXX_LIBRARIES -lopen-pal)
+
 DEAL_II_PACKAGE_HANDLE(MPI
   LIBRARIES
     OPTIONAL MPI_CXX_LIBRARIES MPI_Fortran_LIBRARIES MPI_C_LIBRARIES

does CC=clang CXX=clang++ cmake -DWITH_MPI=ON [...] work as intended and pick up ld.gold?

@tjhei
Member Author

tjhei commented May 30, 2021

Clang has the same issue for me.

@tamiko
Member

tamiko commented May 30, 2021

@tjhei The above patch resolves the underlinkage issue for me on Ubuntu 20.04.2, and ld.gold shows up in the linker flags (for both gcc and clang). Does it work for you?

@tjhei
Member Author

tjhei commented May 30, 2021

That fixes it with both clang and gcc!

@tjhei
Member Author

tjhei commented May 31, 2021

To conclude:
The applied patch allows using gold without having to specify CXX=mpicxx.

The following issue remains:
a) CC=mpicc CXX=mpicxx cmake -D DEAL_II_WITH_MPI=ON .. gives
build-a/detailed.log:# MPI_INCLUDE_DIRS =

b) cmake -D DEAL_II_WITH_MPI=ON .. gives
build-c/detailed.log:# MPI_INCLUDE_DIRS = /usr/lib/openmpi/include/openmpi/opal/mca/event/libevent2021/libevent;/usr/lib/openmpi/include/openmpi/opal/mca/event/libevent2021/libevent/include;/usr/lib/openmpi/include;/usr/lib/openmpi/include/openmpi

c) cmake -D MPI_Fortran_COMPILER=mpif90 -D MPI_C_COMPILER=mpicc -D MPI_CXX_COMPILER=mpicxx -D DEAL_II_WITH_MPI=ON /src/ gives
# MPI_INCLUDE_DIRS = /usr/include/x86_64-linux-gnu/mpich

So, setting CXX=mpicxx is indeed not working correctly.

@tamiko tamiko removed this from the Release 9.3 milestone May 31, 2021
@tamiko
Member

tamiko commented May 31, 2021

Ugh. We have to fix the wrong path.

@tjhei I am afraid that there is not much we can do here except opening an upstream bug report with CMake. We rely on the FindMPI.cmake module shipped with CMake to set up all MPI_* variables correctly. In summary:

  • case (a) CC=mpicc CXX=mpicxx cmake [...] works as intended by upstream, i.e., FindMPI.cmake does not populate any MPI_* variables when that is not necessary. The only way to alter this behavior would be to either play games with how we set compilers or to reimplement FindMPI.cmake. I would like to avoid both.

  • case (b) cmake -DWITH_MPI=ON [...] works as intended

  • case (c) I don't know what is going wrong here. Are mpif90, mpicc, and mpicxx the right compiler wrappers? CMake has some odd behavior eating environment variables / aliases.

For me, the most reliable way of setting the desired MPI library is as follows (tested on Debian-11):

cmake -DMPI_Fortran_COMPILER=mpif90.mpich -DMPI_C_COMPILER=mpicc.mpich -DMPI_CXX_COMPILER=mpicxx.mpich -DWITH_MPI=ON ..
MPI_USER_INCLUDE_DIRS = /usr/include/x86_64-linux-gnu/mpich
cmake -DMPI_Fortran_COMPILER=mpif90.openmpi -DMPI_C_COMPILER=mpicc.openmpi -DMPI_CXX_COMPILER=mpicxx.openmpi -DWITH_MPI=ON ..
MPI_INCLUDE_DIRS = /usr/lib/x86_64-linux-gnu/openmpi/include/openmpi;/usr/lib/x86_64-linux-gnu/openmpi/include
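
The two invocations differ only in the wrapper suffix, so they can be generated from one place. The following is a small sketch, assuming the Debian-style suffixed wrappers (mpicc.mpich, mpicc.openmpi, and so on) named above are installed; the function name mpi_cmake_args is made up for illustration.

```shell
# Print the cmake options that select one MPI flavor through the
# suffixed wrappers, one option per line.
mpi_cmake_args() {
  flavor=$1
  case "$flavor" in
    mpich|openmpi) ;;
    *) echo "unknown MPI flavor: $flavor" >&2; return 1 ;;
  esac
  printf '%s\n' \
    "-DMPI_Fortran_COMPILER=mpif90.$flavor" \
    "-DMPI_C_COMPILER=mpicc.$flavor" \
    "-DMPI_CXX_COMPILER=mpicxx.$flavor" \
    "-DWITH_MPI=ON"
}

# Possible usage from a build directory; the substitution is left unquoted
# on purpose so that each option becomes its own argument:
#   cmake $(mpi_cmake_args openmpi) ..
```

Rejecting unknown flavors up front avoids handing cmake a wrapper name that does not exist.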

@tamiko
Member

tamiko commented Nov 30, 2022

I will close this issue for now.

@tamiko tamiko closed this as completed Nov 30, 2022