{bio}[foss/2022a] AlphaFold v2.3.1, HH-suite v3.3.0, Kalign v3.3.5, OpenMM 8.0.0 w/ Python 3.10.4 #17604
Conversation
@maxim-masterov Unless there's a strong reason not to, can we bump the AlphaFold version to the current release?
@orbsmiv I've updated it to v2.3.1, and also added a patch.
Review thread on easybuild/easyconfigs/a/AlphaFold/AlphaFold-2.3.0-foss-2022a-CUDA-11.7.0.eb (outdated, resolved)
Test report by @orbsmiv
```python
('UCX-CUDA', '1.12.1', versionsuffix),
('cuDNN', '8.4.1.50', versionsuffix, SYSTEM),
('NCCL', '2.12.12', versionsuffix),
('OpenMM', '8.0.0'),
```
@maxim-masterov Doesn't OpenMM need to be GPU-capable for AlphaFold?
@boegel I'm checking it. I didn't include OpenMM compiled with CUDA support because previous easyconfigs didn't use it. I'm trying to build it now, but as you mentioned earlier, I hit some errors. At the moment it looks like CMake picks up `nvcc` from `/usr/bin` instead of `$EBROOTCUDA/bin`.
@boegel I've added CUDA to OpenMM. It builds fine and passes all tests after specifying the `OPENMM_CUDA_COMPILER` variable; without it, OpenMM tries to use `/usr/local/cuda/bin/nvcc` instead of `${EBROOTCUDA}/bin/nvcc`. The only test that fails with this variable set is `CudaCompiler`, which requires `OPENMM_CUDA_COMPILER` not to be set, so I've excluded it.
I gave this a try, as we also had some requests for this new AlphaFold version, but I do get an internal compiler error as well:
```
/dev/shm/f115372/OpenMM/8.0.0/foss-2022a/openmm-8.0.0/platforms/common/src/CommonKernels.cpp: In member function 'void OpenMM::CommonCalcGayBerneForceKernel::sortAtoms()':
/dev/shm/f115372/OpenMM/8.0.0/foss-2022a/openmm-8.0.0/platforms/common/src/CommonKernels.cpp:5055:6: internal compiler error: in vect_get_vec_defs_for_operand, at tree-vect-stmts.c:1450
 5055 | void CommonCalcGayBerneForceKernel::sortAtoms() {
      |      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0x69b359 vect_get_vec_defs_for_operand(vec_info*, _stmt_vec_info*, unsigned int, tree_node*, vec<tree_node*, va_heap, vl_ptr>*, tree_node*)
	../../gcc/tree-vect-stmts.c:1450
0xf42df4 vect_build_gather_load_calls
	../../gcc/tree-vect-stmts.c:2728
0xf42df4 vectorizable_load
	../../gcc/tree-vect-stmts.c:8718
0xf4bca0 vect_transform_stmt(vec_info*, _stmt_vec_info*, gimple_stmt_iterator*, _slp_tree*, _slp_instance*)
	../../gcc/tree-vect-stmts.c:10922
0xf4fa41 vect_transform_loop_stmt
	../../gcc/tree-vect-loop.c:9254
0xf6740d vect_transform_loop(_loop_vec_info*, gimple*)
	../../gcc/tree-vect-loop.c:9690
0xf9059c try_vectorize_loop_1
	../../gcc/tree-vectorizer.c:1104
0xf91181 vectorize_loops()
	../../gcc/tree-vectorizer.c:1243
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
```
Odd, what `optarch` flags do you use? And what machine are you building on?
I didn't set any `optarch`, so it was using `-march=native`, and this was on an AMD EPYC 7763. I also tried OpenMM 7.7.0 with CUDA and foss 2022a, but that resulted in the same error.
I built successfully on an Intel Platinum 8360Y with `optarch=Intel:O2 -march=core-avx2;GCC:O2 -mavx2 -mfma` and an NVIDIA A100. And commenting out `CUDA` allowed me to build on an AMD EPYC 7H12 without GPUs on board.
I was experimenting with compiler flags and managed to reproduce the same "internal compiler error". It appears when `-march` is used, which switches on tree vectorization. Surprisingly, with `-march=znver2` on our AMD CPUs everything works fine, but with, e.g., `-march=skylake-avx512` on our Intel CPUs I get this error.

There are two ways to solve it, apart from patching the compiler (which would require shrinking the code down to a small reproducible example and posting an issue on Bugzilla). The first is to compile OpenMM with the `-fno-tree-vectorize` flag. The second is to patch the `platforms/common/src/CommonKernels.cpp` file and add `__attribute__((optimize("no-tree-vectorize")))` in front of the `CommonCalcGayBerneForceKernel::sortAtoms()` function definition. IMO, the second is better, as it affects only one function instead of the whole source code.
I confirm that just removing `-ftree-vectorize` is enough to work around this ICE. Using `-march=native` will only be an issue on those architectures affected by this bug.
On my side, I hit this ICE on our old Intel Broadwells with just AVX2. However, the build on our AMD EPYC 7282 (`znver2`) worked fine, and those also only support AVX2.
So maybe we can collect a few known systems where this ICE triggers and update the comment in `OpenMM-8.0.0_add_no_tree_vectorize.patch` to note that this patch is not always needed.
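A possible complement to the patch: EasyBuild can drop `-ftree-vectorize` for a single easyconfig via the `vectorize` toolchain option (assuming a framework version recent enough to support it). This is only a sketch, not part of this PR:

```python
# Hypothetical line for the OpenMM easyconfig: build this one package with
# -fno-tree-vectorize while leaving the rest of the toolchain defaults alone.
toolchainopts = {'vectorize': False}
```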
Test report by @bedroge
One other question: how certain are you / can we be about the compatibility of OpenMM 8 with AlphaFold? I read in the issue that @boegel referred to in this PR that the developer of OpenMM confirms that 7.7.0 is backwards compatible and should work fine, but is this also true for 8.0.0?
@bedroge According to the OpenMM developers, the only reason AlphaFold doesn't work with OpenMM >= 7.7.0 is some refactoring they did to the Python wrappers; all the functionality is backward-compatible. I've asked one of our users to test the installation built from the easyconfigs in this PR, and I'll update this thread as soon as he confirms that all works well (or not :) )
Review thread on easybuild/easyconfigs/a/AlphaFold/AlphaFold-2.3.1-foss-2022a-CUDA-11.7.0.eb (outdated, resolved)
Test report by @boegel
Test report by @bedroge Can be ignored, I accidentally ran it on the wrong machine without a GPU...
@bedroge did it fail because it was building OpenMM on a node without a GPU?
Ah, yes, sorry, I ran it in the wrong tab. Will try again on a GPU node.
Test report by @bedroge
Test report by @lexming
Test report by @boegel
lgtm
Going in, thanks @maxim-masterov!
Test report by @boegel
(created using `eb --new-pr`)