Error compiling Tests from source in Linux #92

Closed
RaulPPelaez opened this issue Jan 9, 2023 · 4 comments

@RaulPPelaez
Contributor

Hi, I am having trouble compiling openmm-torch.
After installing the necessary dependencies I run CMake with:

(openmm) raul:build$ cmake -DCMAKE_INSTALL_PREFIX=~/.usr -DPYTORCH_DIR=~/anaconda3/envs/openmm/lib/python3.10/site-packages/torch  -DOPENMM_DIR=~/.usr  ..
CMake Warning (dev) in CMakeLists.txt:
  No project() command is present.  The top-level CMakeLists.txt file must
  contain a literal, direct call to the project() command.  Add a line of
  code such as

    project(ProjectName)

  near the top of the file, but after cmake_minimum_required().

  CMake is pretending there is a "project(Project)" command on the first
  line.
This warning is for project developers.  Use -Wno-dev to suppress it.

-- The CUDA compiler identification is NVIDIA 11.7.99
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Caffe2: CUDA detected: 11.7
-- Caffe2: CUDA nvcc is: /usr/local/cuda/bin/nvcc
-- Caffe2: CUDA toolkit directory: /usr/local/cuda
-- Caffe2: Header version is: 11.7
-- Found CUDNN: /usr/lib64/libcudnn.so  
-- Found cuDNN: v8.0.4  (include: /usr/include, library: /usr/lib64/libcudnn.so)
-- /usr/local/cuda/lib64/libnvrtc.so shorthash is 581f1f99
-- Automatic GPU detection failed. Building for common architectures.
-- Autodetected CUDA architecture(s): 3.5;5.0;5.2;6.0;6.1;7.0;7.5;8.0;8.6;8.6+PTX
-- Added CUDA NVCC flags for: -gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_86,code=compute_86
CMake Warning at /home/raul/anaconda3/envs/openmm/lib/python3.10/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:22 (message):
  static library kineto_LIBRARY-NOTFOUND not found.
Call Stack (most recent call first):
  /home/raul/anaconda3/envs/openmm/lib/python3.10/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:127 (append_torchlib_if_found)
  CMakeLists.txt:15 (FIND_PACKAGE)


-- Found Torch: /home/raul/anaconda3/envs/openmm/lib/python3.10/site-packages/torch/lib/libtorch.so  
CMake Warning (dev) at /usr/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:438 (message):
  The package name passed to `find_package_handle_standard_args` (OPENCL)
  does not match the name of the calling package (OpenCL).  This can lead to
  problems in calling code that expects `find_package` result variables
  (e.g., `_FOUND`) to follow a certain pattern.
Call Stack (most recent call first):
  FindOpenCL.cmake:85 (find_package_handle_standard_args)
  CMakeLists.txt:125 (FIND_PACKAGE)
This warning is for project developers.  Use -Wno-dev to suppress it.

-- Found OPENCL: /usr/local/cuda/lib64/libOpenCL.so  
-- Configuring done
-- Generating done
-- Build files have been written to: /home/raul/openmm-torch/build

Then I compile, which goes fine until the tests are linked:

(openmm) raul:build$ make
[  3%] Building CXX object CMakeFiles/OpenMMTorch.dir/openmmapi/src/TorchForce.cpp.o
[  6%] Building CXX object CMakeFiles/OpenMMTorch.dir/openmmapi/src/TorchForceImpl.cpp.o
[ 10%] Building CXX object CMakeFiles/OpenMMTorch.dir/serialization/src/TorchForceProxy.cpp.o
[ 13%] Building CXX object CMakeFiles/OpenMMTorch.dir/serialization/src/TorchSerializationProxyRegistration.cpp.o
[ 16%] Linking CXX shared library libOpenMMTorch.so
[ 16%] Built target OpenMMTorch
[ 20%] CMake-copying file /home/raul/openmm-torch/tests/central.pt to /home/raul/openmm-torch/build/tests/central.pt
[ 23%] CMake-copying file /home/raul/openmm-torch/tests/forces.pt to /home/raul/openmm-torch/build/tests/forces.pt
[ 26%] CMake-copying file /home/raul/openmm-torch/tests/global.pt to /home/raul/openmm-torch/build/tests/global.pt
[ 30%] CMake-copying file /home/raul/openmm-torch/tests/periodic.pt to /home/raul/openmm-torch/build/tests/periodic.pt
[ 30%] Built target CopyTestFiles
[ 33%] Building CXX object serialization/tests/CMakeFiles/TestSerializeTorchForce.dir/TestSerializeTorchForce.cpp.o
[ 36%] Linking CXX executable ../../TestSerializeTorchForce
/usr/bin/ld: CMakeFiles/TestSerializeTorchForce.dir/TestSerializeTorchForce.cpp.o: in function `testSerialization()':
TestSerializeTorchForce.cpp:(.text+0x34a): undefined reference to `OpenMM::throwException(char const*, int, std::string const&)'
/usr/bin/ld: TestSerializeTorchForce.cpp:(.text+0x424): undefined reference to `OpenMM::throwException(char const*, int, std::string const&)'
/usr/bin/ld: TestSerializeTorchForce.cpp:(.text+0x4fe): undefined reference to `OpenMM::throwException(char const*, int, std::string const&)'
/usr/bin/ld: TestSerializeTorchForce.cpp:(.text+0x604): undefined reference to `OpenMM::throwException(char const*, int, std::string const&)'
/usr/bin/ld: TestSerializeTorchForce.cpp:(.text+0x71d): undefined reference to `OpenMM::throwException(char const*, int, std::string const&)'
/usr/bin/ld: CMakeFiles/TestSerializeTorchForce.dir/TestSerializeTorchForce.cpp.o:TestSerializeTorchForce.cpp:(.text+0x834): more undefined references to `OpenMM::throwException(char const*, int, std::string const&)' follow
/usr/bin/ld: CMakeFiles/TestSerializeTorchForce.dir/TestSerializeTorchForce.cpp.o: in function `void OpenMM::XmlSerializer::serialize<TorchPlugin::TorchForce>(TorchPlugin::TorchForce const*, std::string const&, std::ostream&)':
TestSerializeTorchForce.cpp:(.text._ZN6OpenMM13XmlSerializer9serializeIN11TorchPlugin10TorchForceEEEvPKT_RKSsRSo[_ZN6OpenMM13XmlSerializer9serializeIN11TorchPlugin10TorchForceEEEvPKT_RKSsRSo]+0x66): undefined reference to `OpenMM::SerializationNode::setName(std::string const&)'
/usr/bin/ld: TestSerializeTorchForce.cpp:(.text._ZN6OpenMM13XmlSerializer9serializeIN11TorchPlugin10TorchForceEEEvPKT_RKSsRSo[_ZN6OpenMM13XmlSerializer9serializeIN11TorchPlugin10TorchForceEEEvPKT_RKSsRSo]+0xcd): undefined reference to `OpenMM::SerializationNode::hasProperty(std::string const&) const'
/usr/bin/ld: TestSerializeTorchForce.cpp:(.text._ZN6OpenMM13XmlSerializer9serializeIN11TorchPlugin10TorchForceEEEvPKT_RKSsRSo[_ZN6OpenMM13XmlSerializer9serializeIN11TorchPlugin10TorchForceEEEvPKT_RKSsRSo]+0x104): undefined reference to `OpenMM::SerializationProxy::getTypeName() const'
/usr/bin/ld: TestSerializeTorchForce.cpp:(.text._ZN6OpenMM13XmlSerializer9serializeIN11TorchPlugin10TorchForceEEEvPKT_RKSsRSo[_ZN6OpenMM13XmlSerializer9serializeIN11TorchPlugin10TorchForceEEEvPKT_RKSsRSo]+0x154): undefined reference to `OpenMM::SerializationProxy::getTypeName() const'
/usr/bin/ld: TestSerializeTorchForce.cpp:(.text._ZN6OpenMM13XmlSerializer9serializeIN11TorchPlugin10TorchForceEEEvPKT_RKSsRSo[_ZN6OpenMM13XmlSerializer9serializeIN11TorchPlugin10TorchForceEEEvPKT_RKSsRSo]+0x191): undefined reference to `OpenMM::SerializationNode::setStringProperty(std::string const&, std::string const&)'
/usr/bin/ld: ../../libOpenMMTorch.so: undefined reference to `OpenMM::Platform::createKernel(std::string const&, OpenMM::ContextImpl&) const'
/usr/bin/ld: ../../libOpenMMTorch.so: undefined reference to `OpenMM::SerializationNode::setBoolProperty(std::string const&, bool)'
/usr/bin/ld: ../../libOpenMMTorch.so: undefined reference to `OpenMM::SerializationNode::getDoubleProperty(std::string const&) const'
/usr/bin/ld: ../../libOpenMMTorch.so: undefined reference to `OpenMM::SerializationNode::setIntProperty(std::string const&, int)'
/usr/bin/ld: ../../libOpenMMTorch.so: undefined reference to `OpenMM::SerializationNode::getIntProperty(std::string const&) const'
/usr/bin/ld: ../../libOpenMMTorch.so: undefined reference to `OpenMM::SerializationProxy::SerializationProxy(std::string const&)'
/usr/bin/ld: ../../libOpenMMTorch.so: undefined reference to `OpenMM::SerializationNode::getBoolProperty(std::string const&) const'
/usr/bin/ld: ../../libOpenMMTorch.so: undefined reference to `OpenMM::SerializationNode::getStringProperty(std::string const&) const'
/usr/bin/ld: ../../libOpenMMTorch.so: undefined reference to `OpenMM::SerializationNode::getName() const'
/usr/bin/ld: ../../libOpenMMTorch.so: undefined reference to `OpenMM::SerializationNode::createChildNode(std::string const&)'
/usr/bin/ld: ../../libOpenMMTorch.so: undefined reference to `OpenMM::SerializationNode::setDoubleProperty(std::string const&, double)'
/usr/bin/ld: ../../libOpenMMTorch.so: undefined reference to `OpenMM::SerializationNode::getIntProperty(std::string const&, int) const'
collect2: error: ld returned 1 exit status
make[2]: *** [serialization/tests/CMakeFiles/TestSerializeTorchForce.dir/build.make:110: TestSerializeTorchForce] Error 1
make[1]: *** [CMakeFiles/Makefile2:277: serialization/tests/CMakeFiles/TestSerializeTorchForce.dir/all] Error 2
make: *** [Makefile:146: all] Error 2

The function OpenMM::throwException does exist in libOpenMM.so, and the link line generated by CMake includes -lOpenMM, so the linker should be able to find it.
I am using the master branch of both OpenMM (openmm/openmm@d1678fb) and openmm-torch (994f92f).
The link line for this particular file is:

/usr/bin/c++ CMakeFiles/TestSerializeTorchForce.dir/TestSerializeTorchForce.cpp.o -o ../../TestSerializeTorchForce   -L/home/raul/.usr/lib/plugins  -Wl,-rpath,/home/raul/.usr/lib/plugins:/home/raul/openmm-torch/build:/home/raul/anaconda3/envs/openmm/lib/python3.10/site-packages/torch/lib:/usr/local/cuda/lib64/stubs ../../libOpenMMTorch.so -lOpenMM /home/raul/anaconda3/envs/openmm/lib/python3.10/site-packages/torch/lib/libtorch.so -Wl,--no-as-needed,"/home/raul/anaconda3/envs/openmm/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so" -Wl,--as-needed -Wl,--no-as-needed,"/home/raul/anaconda3/envs/openmm/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cpp.so" -Wl,--as-needed -Wl,--no-as-needed,"/home/raul/anaconda3/envs/openmm/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so" -Wl,--as-needed /home/raul/anaconda3/envs/openmm/lib/python3.10/site-packages/torch/lib/libc10_cuda.so /home/raul/anaconda3/envs/openmm/lib/python3.10/site-packages/torch/lib/libc10.so /usr/local/cuda/lib64/libcufft.so /usr/local/cuda/lib64/libcurand.so /usr/local/cuda/lib64/libcublas.so -Wl,--no-as-needed,"/home/raul/anaconda3/envs/openmm/lib/python3.10/site-packages/torch/lib/libtorch_cuda_cu.so" -Wl,--as-needed -Wl,--no-as-needed,"/home/raul/anaconda3/envs/openmm/lib/python3.10/site-packages/torch/lib/libtorch.so" -Wl,--as-needed /home/raul/anaconda3/envs/openmm/lib/python3.10/site-packages/torch/lib/libc10.so /usr/local/cuda/lib64/stubs/libcuda.so /usr/local/cuda/lib64/libnvrtc.so /usr/local/cuda/lib64/libnvToolsExt.so /usr/local/cuda/lib64/libcudart.so /home/raul/anaconda3/envs/openmm/lib/python3.10/site-packages/torch/lib/libc10_cuda.so

My suspicion is that openmm-torch is only compatible with a certain range of OpenMM versions, but I did not see any mention of this in the documentation.

Any clues are appreciated.

@RaulPPelaez
Contributor Author

I get the same error when using OpenMM 7.5 (openmm/openmm@a9cfd7f).

@peastman
Member

peastman commented Jan 9, 2023

> My suspicion is that openmm-torch is only compatible with a certain range of openmm versions

That's correct. If you're compiling the latest openmm-torch code from source, you should also have compiled the most recent OpenMM code from source. You didn't say what version of either you were using, but you did later mention OpenMM 7.5 which is over three years old.

@RaulPPelaez
Contributor Author

I am trying the latest commits from the master branch of both openmm and openmm-torch. Should I expect that to work?

@RaulPPelaez
Contributor Author

I could not pinpoint the exact source of the error, but it was an environment issue.
The following conda environment (I took the versions from the Ubuntu 18 CI) allowed me to compile the master branch of OpenMM and then use it to compile the master branch of openmm-torch:

conda install -c conda-forge cudatoolkit-dev=11.2 cmake make cython swig fftw doxygen numpy cudatoolkit=11.2 gxx_linux-64=10.3 sysroot_linux-64=2.17 pytorch-gpu=1.12

I will leave this for reference, but I am closing the issue.
Thanks.
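
A note for readers who hit the same linker errors: the thread does not confirm the cause, but one thing worth checking is a libstdc++ string ABI mismatch. In the log above, the missing symbols are demangled with plain std::string and the mangled name contains the `Ss` abbreviation, which is how symbols look when code is compiled with _GLIBCXX_USE_CXX11_ABI=0. A libOpenMM.so built with the newer ABI would export the same functions under names containing `__cxx11` instead, so the linker would not match them even with -lOpenMM on the link line. The sketch below is only a rough heuristic for telling the two apart from a mangled name; the helper `string_abi` and the second example symbol are hypothetical and written by hand for illustration.

```python
def string_abi(mangled: str) -> str:
    """Guess which libstdc++ std::string ABI a mangled symbol name uses.

    Heuristic only: "__cxx11" marks the newer ABI, while the "Ss"
    abbreviation marks the older one. An identifier that merely contains
    the letters "Ss" would be misclassified.
    """
    if "__cxx11" in mangled:
        return "cxx11"
    if "Ss" in mangled:
        return "pre-cxx11"
    return "no std::string"


# Mangled name copied from the linker output in this issue.
wanted = "_ZN6OpenMM13XmlSerializer9serializeIN11TorchPlugin10TorchForceEEEvPKT_RKSsRSo"

# Hand-written illustration of how OpenMM::throwException would be mangled
# in a library built with the newer ABI; NOT taken from a real libOpenMM.so.
provided = "_ZN6OpenMM14throwExceptionEPKciRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE"

print("test object expects:", string_abi(wanted))
print("library would export:", string_abi(provided))
```

On a real system, the names to feed in would come from something like `nm -D libOpenMM.so | grep throwException`, and `torch.compiled_with_cxx11_abi()` reports which ABI the installed PyTorch was built with. If the two disagree, that may explain why the conda-forge environment above, where everything comes from one toolchain, made the error go away.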
