Compile successfully but import error #109

Closed
helsmy opened this issue Nov 17, 2021 · 21 comments

Comments

@helsmy

helsmy commented Nov 17, 2021

I compiled this project with python setup.py install --mfem-branch=master on CentOS successfully.
But when I try to import mfem.ser, I get:

    from . import _cpointers
    ImportError: libmpi.so.12: cannot open shared object file: No such file or directory

My environment:

  • CentOS 7.9.2009 (Core)
  • Conda 4.10.3
  • Python 3.8.12
    • six 1.16.0
    • numpy 1.21.2
  • GCC 8.1.0
  • CMake 3.17.0

Is there any possible solution for this? Thanks in advance.

@sshiraiwa
Member

@helsmy Thank you for reporting this issue. This looks mysterious, since the serial install is not supposed to be linked with libmpi.so. When I try it from a clean new Docker image of CentOS 7 with conda 4.3.21 (not 4.10.3, admittedly), I don't get this error. Do you see anything suspicious in your install log? If the entire install log is available, we may be able to diagnose the issue further.
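
One way to see which shared libraries an extension module actually depends on is to run ldd on the .so file and look for entries marked "not found". The sketch below only parses such output; the helper name missing_libraries and the sample text are illustrative assumptions, not part of PyMFEM.

```python
def missing_libraries(ldd_output: str) -> list[str]:
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Unresolved entries look like: "libmpi.so.12 => not found"
        name, sep, target = line.strip().partition(" => ")
        if sep and target.strip() == "not found":
            missing.append(name)
    return missing


# Hypothetical ldd output for an extension module such as _cpointers*.so
sample = """\
\tlinux-vdso.so.1 (0x00007ffc0a5f2000)
\tlibmpi.so.12 => not found
\tlibc.so.6 => /lib64/libc.so.6 (0x00007f1c2a000000)
"""
print(missing_libraries(sample))
```

If a serial module lists an MPI library here, that would suggest it was linked against a parallel build left over from an earlier install.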

@helsmy
Author

helsmy commented Nov 17, 2021

I uninstalled the previous mfem package and recompiled it. The error changed to:

ImportError                               Traceback (most recent call last)
<ipython-input-1-bda064bf5c1a> in <module>
----> 1 import mfem.ser as mfem

~/miniconda3/envs/FEM/lib/python3.8/site-packages/mfem/ser.py in <module>
----> 1 from  mfem._ser.cpointers import *
      2 from  mfem._ser.globals import *
      3 from  mfem._ser.mem_manager import *
      4 from  mfem._ser.device import *
      5 from  mfem._ser.hash import *

~/miniconda3/envs/FEM/lib/python3.8/site-packages/mfem/_ser/cpointers.py in <module>
     11 # Import the low-level C/C++ module
     12 if __package__ or "." in __name__:
---> 13     from . import _cpointers
     14 else:
     15     import _cpointers

ImportError: libmfem.so.4.3: cannot open shared object file: No such file or directory

I logged stdout to install.log and stderr to errors.log.

@sshiraiwa
Member

Thanks. PyMFEM cloned from git master needs to be linked with MFEM master. According to the log, MFEM master was built successfully, but PyMFEM was not rebuilt, perhaps because an old build is still there. Can you try

python setup.py clean
python setup.py install --mfem-branch=master --skip-ext

--skip-ext is an option that skips compiling MFEM again.

@helsmy
Author

helsmy commented Nov 17, 2021

I deleted everything from the previous PyMFEM build, cloned it again, and rebuilt it. cpointers is okay now, but I got a new error:

ImportError                               Traceback (most recent call last)
<ipython-input-1-bda064bf5c1a> in <module>
----> 1 import mfem.ser as mfem

~/miniconda3/envs/FEM/lib/python3.8/site-packages/mfem/ser.py in <module>
      1 from  mfem._ser.cpointers import *
----> 2 from  mfem._ser.globals import *
      3 from  mfem._ser.mem_manager import *
      4 from  mfem._ser.device import *
      5 from  mfem._ser.hash import *

~/miniconda3/envs/FEM/lib/python3.8/site-packages/mfem/_ser/globals.py in <module>
     11 # Import the low-level C/C++ module
     12 if __package__ or "." in __name__:
---> 13     from . import _globals
     14 else:
     15     import _globals

ImportError: /public3/home/sc52474/miniconda3/envs/FEM/lib/python3.8/site-packages/mfem/_ser/_globals.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZN4mfem15MakeParFilenameERKSsiSsi

Here are the logs for this run:
install.log
errors.log

@sshiraiwa
Member

I think what is happening is that MFEM and PyMFEM are compiled with different compilers. Internally, MFEM is built by CMake, and CMake finds a compiler on its own, which can differ from the cc and c++ that setup.py uses.
After cleaning, can you try specifying the compilers on the command line, like

python setup.py install --CXX=/public3/soft/gcc/8.1.0/bin/c++ --CC==/public3/soft/gcc/8.1.0/bin/cc
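
If the undefined symbol really comes from a compiler mismatch, the mangled name itself may offer a hint. In the Itanium C++ ABI, `Ss` abbreviates std::string, while libstdc++'s newer string ABI (the default since GCC 5) puts the class in the `std::__cxx11` namespace, so its mangled names contain `__cxx11`. The symbol `_ZN4mfem15MakeParFilenameERKSsiSsi` demangles to `mfem::MakeParFilename(std::string const&, int, std::string, int)` in the older form. The sketch below is a rough heuristic under those assumptions; the function name string_abi_of is hypothetical and it does not fully parse the mangling grammar.

```python
import re


def string_abi_of(symbol: str) -> str:
    """Guess which libstdc++ std::string ABI a mangled C++ name refers to.

    Returns "cxx11", "pre-cxx11", or "unknown" (heuristic only).
    """
    if "__cxx11" in symbol:
        return "cxx11"
    if not symbol.startswith("_Z"):
        return "unknown"
    rest = symbol[2:]
    if rest.startswith("N"):
        rest = rest[1:]
    # Skip leading length-prefixed identifiers such as "4mfem" so that an
    # "Ss" inside a namespace or function name is not miscounted.
    while True:
        m = re.match(r"(\d+)", rest)
        if not m:
            break
        rest = rest[m.end() + int(m.group(1)):]
    return "pre-cxx11" if "Ss" in rest else "unknown"


print(string_abi_of("_ZN4mfem15MakeParFilenameERKSsiSsi"))
```

An extension that asks for the older form while the library exports only the `__cxx11` form (or the reverse) would be consistent with the two parts having been built by different compilers or ABI settings.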

@helsmy
Author

helsmy commented Nov 18, 2021

It works, thanks. By the way, there is a duplicated = in the command; it should be --CC=/public3/soft/gcc/8.1.0/bin/cc

@helsmy helsmy closed this as completed Nov 18, 2021
@sshiraiwa
Member

sshiraiwa commented Nov 18, 2021 via email

@helsmy
Author

helsmy commented Nov 23, 2021

Hi, I tried to compile the parallel version with conda, but I got /home/sence/miniconda3/envs/fem/bin/../lib/gcc/x86_64-conda-linux-gnu/9.3.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find -lz
It seems that conda's ld cannot find conda's zlib. Is there any way to solve this?
Here are the logs:
install.log
error.log

@sshiraiwa
Member

sshiraiwa commented Nov 23, 2021

Maybe it is necessary to specify the compilers for the parallel build too. Can you try the following? I suppose you are using an MPI compiler wrapper from conda. Please give the full path to each compiler.

$ python setup.py install --with-parallel --CC=<compiler for serial> --CXX=<c++ compiler for serial> --MPICC=<compiler for parallel, such as mpicc> --MPICXX=<c++ compiler for parallel, such as mpicxx> 
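
As a minimal sketch of "give the full path", the helper below resolves each compiler name on PATH and assembles the command shown above. The function name install_command and the fake paths in the demonstration are illustrative assumptions; only the setup.py flags come from this thread.

```python
import shutil


def install_command(cc, cxx, mpicc, mpicxx, which=shutil.which):
    """Assemble the parallel install command with full compiler paths."""
    args = ["python", "setup.py", "install", "--with-parallel"]
    for flag, tool in (("--CC", cc), ("--CXX", cxx),
                       ("--MPICC", mpicc), ("--MPICXX", mpicxx)):
        path = which(tool)  # returns None when the tool is not on PATH
        if path is None:
            raise FileNotFoundError(f"{tool} not found on PATH")
        args.append(f"{flag}={path}")
    return args


# Demonstration with a fake lookup table so the example runs anywhere;
# these paths are hypothetical.
fake_paths = {
    "gcc": "/opt/gcc/bin/gcc",
    "g++": "/opt/gcc/bin/g++",
    "mpicc": "/opt/mpi/bin/mpicc",
    "mpicxx": "/opt/mpi/bin/mpicxx",
}
print(" ".join(install_command("gcc", "g++", "mpicc", "mpicxx",
                               which=fake_paths.get)))
```

Dropping the which= argument makes the helper use the real PATH, which also shows at a glance whether the wrappers come from conda or from the system.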

@helsmy
Author

helsmy commented Nov 23, 2021

I used $ python setup.py install --with-parallel --CC=$CC --CXX=$CXX --MPICC=mpicc --MPICXX=mpicxx
with the conda compiler suite (GCC 9.3, CMake 3.19.6), but I still got the same error.

@sshiraiwa
Member

Thanks. This seems to be because conda uses LDFLAGS to set the compiler's library path. Setup.py uses CMAKE_SHARED_LINKER_FLAGS to give an additional search path, but it is initialized as empty. I made PR110 to set it to LDFLAGS, which I suppose is how it should be. Please try this PR. Note that this is a PR to master, so --mfem-branch=master is necessary.
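
The idea of forwarding LDFLAGS to CMake can be sketched roughly as follows. This is not the code from PR110; the helper name cmake_linker_flag_args is hypothetical, and the only assumptions are that CMake accepts -D definitions for its standard CMAKE_SHARED_LINKER_FLAGS and CMAKE_EXE_LINKER_FLAGS variables and that conda exports LDFLAGS in the environment.

```python
import os


def cmake_linker_flag_args(environ=os.environ):
    """Turn $LDFLAGS into CMake -D options for shared and executable links."""
    ldflags = environ.get("LDFLAGS", "").strip()
    if not ldflags:
        # Nothing to forward: leave CMake's own defaults in place.
        return []
    return [
        f"-DCMAKE_SHARED_LINKER_FLAGS={ldflags}",
        f"-DCMAKE_EXE_LINKER_FLAGS={ldflags}",
    ]


# Hypothetical environment resembling what an activated conda env exports.
fake_env = {"LDFLAGS": "-Wl,-rpath,/opt/env/lib -L/opt/env/lib"}
for arg in cmake_linker_flag_args(fake_env):
    print(arg)
```

Passing each option as a single list element (rather than one shell string) keeps the spaces inside LDFLAGS from being split when the list is handed to subprocess.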

@helsmy
Author

helsmy commented Nov 24, 2021

I added the lines from pull 110 to setup.py. It successfully links zlib, but I got errors when compiling the examples in the serial build:
/public3/home/sc52474/miniconda3/envs/FEM2/bin/../lib/gcc/x86_64-conda-linux-gnu/9.3.0/../../../../x86_64-conda-linux-gnu/bin/ld: /public3/home/sc52474/miniconda3/envs/FEM2/lib/libstdc++.so: undefined reference to `aligned_alloc@GLIBC_2.16'
It seems I am missing some package here, but conda does not seem to provide glibc. How can I solve this? Thanks.
Here are the logs:
error.log
install.log

@sshiraiwa
Member

Can you try a change I just made in PR111? It adds one line, which sets -DCMAKE_EXE_LINKER_FLAGS.

@helsmy
Author

helsmy commented Nov 24, 2021

It seems that it cannot find something in libstdc++ when compiling the tests in the serial build.
CMakeError.log:

Compiling the CXX compiler identification source file "CMakeCXXCompilerId.cpp" failed.
Compiler: /public3/home/sc52474/miniconda3/envs/FEM2/bin/x86_64-conda-linux-gnu-c++ 
Build flags: -std=c++11
Id flags:  

The output was:
1
/public3/home/sc52474/miniconda3/envs/FEM2/bin/../lib/gcc/x86_64-conda-linux-gnu/9.3.0/../../../../x86_64-conda-linux-gnu/bin/ld: /public3/home/sc52474/miniconda3/envs/FEM2/bin/../lib/gcc/x86_64-conda-linux-gnu/9.3.0/../../../../x86_64-conda-linux-gnu/lib/../lib/libstdc++.so: undefined reference to `aligned_alloc@GLIBC_2.16'
/public3/home/sc52474/miniconda3/envs/FEM2/bin/../lib/gcc/x86_64-conda-linux-gnu/9.3.0/../../../../x86_64-conda-linux-gnu/bin/ld: /public3/home/sc52474/miniconda3/envs/FEM2/bin/../lib/gcc/x86_64-conda-linux-gnu/9.3.0/../../../../x86_64-conda-linux-gnu/lib/../lib/libstdc++.so: undefined reference to `clock_gettime@GLIBC_2.17'
collect2: error: ld returned 1 exit status


Performing C++ SOURCE FILE Test POSIXCLOCKS_BUILD failed with the following output:
Change Dir: /public3/home/sc52474/FEMProject/PyMFEM/external/mfem/cmbuild_ser/CMakeFiles/CMakeTmp

Run Build Command(s):/usr/bin/gmake cmTC_52549/fast && /usr/bin/gmake -f CMakeFiles/cmTC_52549.dir/build.make CMakeFiles/cmTC_52549.dir/build
gmake[1]: Entering directory `/public3/home/sc52474/FEMProject/PyMFEM/external/mfem/cmbuild_ser/CMakeFiles/CMakeTmp'
Building CXX object CMakeFiles/cmTC_52549.dir/src.cxx.o
/public3/home/sc52474/miniconda3/envs/FEM2/bin/x86_64-conda-linux-gnu-c++    -std=c++11 -DPOSIXCLOCKS_BUILD   -o CMakeFiles/cmTC_52549.dir/src.cxx.o -c /public3/home/sc52474/FEMProject/PyMFEM/external/mfem/cmbuild_ser/CMakeFiles/CMakeTmp/src.cxx
Linking CXX executable cmTC_52549
/public3/soft/cmake/3.17.0/bin/cmake -E cmake_link_script CMakeFiles/cmTC_52549.dir/link.txt --verbose=1
/public3/home/sc52474/miniconda3/envs/FEM2/bin/x86_64-conda-linux-gnu-c++  -std=c++11 -DPOSIXCLOCKS_BUILD  -Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,--disable-new-dtags -Wl,--gc-sections -Wl,-rpath,/public3/home/sc52474/miniconda3/envs/FEM2/lib -Wl,-rpath-link,/public3/home/sc52474/miniconda3/envs/FEM2/lib -L/public3/home/sc52474/miniconda3/envs/FEM2/lib  -rdynamic CMakeFiles/cmTC_52549.dir/src.cxx.o  -o cmTC_52549 
/public3/home/sc52474/miniconda3/envs/FEM2/bin/../lib/gcc/x86_64-conda-linux-gnu/9.3.0/../../../../x86_64-conda-linux-gnu/bin/ld: CMakeFiles/cmTC_52549.dir/src.cxx.o: in function `main':
src.cxx:(.text+0x15): undefined reference to `clock_gettime'
collect2: error: ld returned 1 exit status
gmake[1]: *** [cmTC_52549] Error 1
gmake[1]: Leaving directory `/public3/home/sc52474/FEMProject/PyMFEM/external/mfem/cmbuild_ser/CMakeFiles/CMakeTmp'
gmake: *** [cmTC_52549/fast] Error 2


Source file was:

#include <time.h>
int main()
{
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return 0;
}

My full conda package list:
https://pastebin.com/dhwDaiME

@sshiraiwa
Member

Not sure what is going on. The serial build goes through with the centos7.9.2009 docker image + Anaconda3-2021.11-linux-x86_64.sh. There are people reporting a similar issue related to the GLIBC version (data61/MP-SPDZ#1).

@v-dobrev
Member

The above source is from the CMake module https://github.com/mfem/mfem/blob/master/config/cmake/modules/FindPOSIXClocks.cmake and it should be fine to fail -- it should continue and test the same source with the -lrt flag (where it should find clock_gettime). And even if these tests fail, that should be fine too and the MFEM CMake build system should just pick standard C++ functions for its timing routines, see: https://github.com/mfem/mfem/blob/0843a87d7953cf23e556dcfd426d27bd9cfb3e21/CMakeLists.txt#L437-L442.

For example, if I replace clock_gettime with clock_gettime1 in FindPOSIXClocks.cmake (thus forcing it to fail on my machine), I get this output

...
-- Looking for POSIXClocks ...
--    checking library: <standard c/c++>
--    checking library: rt
--  *** POSIXClocks not found. (missing: POSIXCLOCKS_LIBRARIES) 
-- MFEM build type: CMAKE_BUILD_TYPE = Release
...

and the cmake command completes without failure.

One thing that I find strange in the above is that the compiler is x86_64-conda-linux-gnu-c++ -- shouldn't that be just c++?

@sshiraiwa
Member

sshiraiwa commented Nov 24, 2021

@v-dobrev Thank you for the comment. Setup.py uses whatever $CC and $CXX specify as the compilers to build MFEM and then PyMFEM.
In Anaconda, one can install an additional compiler (although I never did before...) and, for example,

$ conda install -c conda-forge/label/gcc9 gcc_linux-64
$ conda install -c conda-forge/label/gcc9 gxx_linux-64

will set $CXX to x86_64-conda-linux-gnu-c++, and $CC to the corresponding C compiler. However, x86_64-conda-linux-gnu-c++ itself seems okay, as I reported above.

@v-dobrev
Member

Hmm, looking at the lines (from the above):

/public3/home/sc52474/miniconda3/envs/FEM2/bin/../lib/gcc/x86_64-conda-linux-gnu/9.3.0/../../../../x86_64-conda-linux-gnu/bin/ld: /public3/home/sc52474/miniconda3/envs/FEM2/bin/../lib/gcc/x86_64-conda-linux-gnu/9.3.0/../../../../x86_64-conda-linux-gnu/lib/../lib/libstdc++.so: undefined reference to `aligned_alloc@GLIBC_2.16'
/public3/home/sc52474/miniconda3/envs/FEM2/bin/../lib/gcc/x86_64-conda-linux-gnu/9.3.0/../../../../x86_64-conda-linux-gnu/bin/ld: /public3/home/sc52474/miniconda3/envs/FEM2/bin/../lib/gcc/x86_64-conda-linux-gnu/9.3.0/../../../../x86_64-conda-linux-gnu/lib/../lib/libstdc++.so: undefined reference to `clock_gettime@GLIBC_2.17'

it looks like either libstdc++.so is not linked properly, or the C++ compiler is not adding the right link flag (when invoking ld) for things like aligned_alloc@GLIBC_2.16 and clock_gettime@GLIBC_2.17.
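
To make the versioned references easier to compare, a small sketch like the one below pulls the GLIBC_x.y versions out of the linker output and prints the C library version of the running Python next to them. The helper name required_glibc_versions is hypothetical, and the shortened log text is adapted from the errors quoted above. Note that platform.libc_ver() describes the interpreter's own C library, which is not necessarily the one the conda linker resolves against.

```python
import platform
import re


def required_glibc_versions(linker_output: str) -> list[tuple[int, int]]:
    """Collect the GLIBC_x.y symbol versions named in linker errors."""
    found = {
        (int(major), int(minor))
        for major, minor in re.findall(r"@GLIBC_(\d+)\.(\d+)", linker_output)
    }
    return sorted(found)


log = """\
libstdc++.so: undefined reference to `aligned_alloc@GLIBC_2.16'
libstdc++.so: undefined reference to `clock_gettime@GLIBC_2.17'
"""
needed = required_glibc_versions(log)
print("versions referenced:", needed)
# Reports the C library the running Python is linked against, if detectable.
print("running against:", platform.libc_ver())
```

If the system C library is already new enough for every version listed, the failure is more likely a matter of which libraries the linker is searching than of a missing package.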

@helsmy
Author

helsmy commented Nov 26, 2021

I tried to compile with the gcc and MPI provided by the system (gcc 9.1.0 and Intel MPI 17). Everything but the wrappers goes fine. I got this:
https://pastebin.com/2DHrehFz

@sshiraiwa
Member

Hi @helsmy. The error happens when the compiler looks for inline Array(int asize, MemoryType mt). I am not sure why it is happening, but this method was added only after the 4.3 release (blame says it was added 2 months ago). Thus, I am guessing that you might have MFEM 4.3, although at some point above you had 4.3.1 installed. One thing to try is to clean all external build directories as follows.

python setup.py clean --all-exts

And, then rebuild it.

python setup.py install --mfem-branch=master (--with-parallel if you prefer)

Please let us know.

@helsmy
Author

helsmy commented Dec 1, 2021

It's not caused by PyMFEM. mpi4py requires LD_LIBRARY_PATH to be set to the MPI library path. Everything works fine after I cleaned everything and recompiled. Thanks, everyone.
In a nutshell: use the gcc and cmake provided by the system, and set LD_LIBRARY_PATH.
For how to set LD_LIBRARY_PATH, see https://mpi4py.readthedocs.io/en/stable/appendix.html#building-mpi
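
A quick way to confirm that LD_LIBRARY_PATH points at the MPI libraries is to look for the library file in each listed directory. This is a minimal sketch; the helper name find_in_ld_library_path is hypothetical, and the dynamic loader also consults other locations (rpath entries, the ld.so cache), so an empty result here does not prove the library is unreachable.

```python
import os


def find_in_ld_library_path(libname, ld_library_path=None):
    """Return the directories on LD_LIBRARY_PATH that contain libname."""
    if ld_library_path is None:
        ld_library_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for directory in ld_library_path.split(os.pathsep):
        # Empty entries appear when the variable is unset or has stray colons.
        if directory and os.path.isfile(os.path.join(directory, libname)):
            hits.append(directory)
    return hits


# libmpi.so.12 is the library named in the first ImportError of this thread.
print(find_in_ld_library_path("libmpi.so.12"))
```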

@helsmy helsmy closed this as completed Dec 2, 2021