AlphaFold 2.3.1 #118

boegel · 2023-03-21T19:31:16Z

link to support ticket: #2023031760000665
website: https://github.com/deepmind/alphafold
installation docs: https://github.com/deepmind/alphafold
toolchain: foss/2022a
easyblock to use: (see existing AlphaFold easyconfigs)
required dependencies:
- (see existing AlphaFold easyconfigs)
notes:
- ...
effort: (TBD)


boegel · 2023-03-21T19:36:05Z

WIP easyconfigs added in 118_AlphaFold

I ran into a problem with the OpenMM dependency: it triggers a compiler crash (ICE, Internal Compiler Error) that seems to boil down to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99746 or https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106841. We'll need to backport a patch for GCCcore 11.2.0 (if we go with foss/2021b, as with AlphaFold 2.3.0) or GCCcore 11.3.0 (if we go with foss/2022a, which should be feasible now that there's a TensorFlow with foss/2022a; see easybuilders/easybuild-easyconfigs#17241).

boegel · 2023-03-22T09:36:17Z

Maybe we should update to OpenMM 7.7.0, see also discussion in google-deepmind/alphafold#404

lexming · 2023-04-11T22:28:34Z

I just finished the installation of AlphaFold 2.3.1 in Hydra with foss/2022a. Relevant easyconfigs are in 118_AlphaFold

I also hit that ICE with OpenMM v8.0.0 in GCC 11.3.0, but:

- only in the build with CUDA, because the failing code path links to CUDA
- only on Intel AVX2 architectures (i.e. Broadwell); AMD Zen2 (also AVX2) is fine

The ICE is the following:

during GIMPLE pass: vect
/theia/scratch/brussel/vo/000/bvo00005/vsc10122/easybuild/install/broadwell/build/OpenMM/8.0.0/foss-2022a-CUDA-11.7.0/openmm-8.0.0/platforms/common/src/CommonKernels.cpp: In member function void OpenMM::CommonCalcGayBerneForceKernel::sortAtoms():
/theia/scratch/brussel/vo/000/bvo00005/vsc10122/easybuild/install/broadwell/build/OpenMM/8.0.0/foss-2022a-CUDA-11.7.0/openmm-8.0.0/platforms/common/src/CommonKernels.cpp:5055:6: internal compiler error: in vect_get_vec_defs_for_operand, at tree-vect-stmts.c:1450
 5055 | void CommonCalcGayBerneForceKernel::sortAtoms() {
      |      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0x69b399 vect_get_vec_defs_for_operand(vec_info*, _stmt_vec_info*, unsigned int, tree_node*, vec<tree_node*, va_heap, vl_ptr>*, tree_node*)
        ../../gcc/tree-vect-stmts.c:1450
0xf42e34 vect_build_gather_load_calls
        ../../gcc/tree-vect-stmts.c:2728
0xf42e34 vectorizable_load
        ../../gcc/tree-vect-stmts.c:8718
0xf4bce0 vect_transform_stmt(vec_info*, _stmt_vec_info*, gimple_stmt_iterator*, _slp_tree*, _slp_instance*)
        ../../gcc/tree-vect-stmts.c:10922
0xf4fa81 vect_transform_loop_stmt
        ../../gcc/tree-vect-loop.c:9254
0xf6744d vect_transform_loop(_loop_vec_info*, gimple*)
        ../../gcc/tree-vect-loop.c:9690
0xf905dc try_vectorize_loop_1
        ../../gcc/tree-vectorizer.c:1104
0xf911c1 vectorize_loops()
        ../../gcc/tree-vectorizer.c:1243
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
make[2]: *** [platforms/cuda/sharedTarget/CMakeFiles/OpenMMCUDA.dir/__/__/common/src/CommonKernels.cpp.o] Error 1

This raises several questions:

- It looks very much like GCC bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99746. However, that bug is already fixed in GCC v11.3.0, so why is it still happening?
- The backtrace shows calls from files that belong to the GCC source code, but with relative paths such as ../../gcc/tree-vect-stmts.c:1450. Those paths should be absolute. What are they relative to? OpenMM does not bundle any of this.
- The line numbers do not match the calling functions shown in the backtrace for GCC 11.3.0. I am not sure whether those line numbers refer to the stripped source code instead.

I suspect that the backtrace shown in the ICE does not refer to the GCC compiler used by EasyBuild, but to the compiler Nvidia used for the pre-built CUDA binaries.
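A suspicion like this can be partly checked from inside the build itself, by having a translation unit report which compiler processed it and whether AVX2 code generation was enabled. The following is only a minimal sketch: `compilerDescription` is a hypothetical helper that is not part of OpenMM or EasyBuild, and it relies solely on the predefined macros `__clang__`, `__GNUC__`, `__VERSION__` and `__AVX2__`.

```cpp
#include <string>

// Hypothetical helper (not from OpenMM or EasyBuild): describes the compiler
// that built this translation unit, using predefined macros only.
std::string compilerDescription() {
#if defined(__clang__)
    std::string description = "clang ";
#elif defined(__GNUC__)
    std::string description = "gcc ";
#else
    std::string description = "unknown compiler ";
#endif
#if defined(__VERSION__)
    // Full version string as reported by GCC-compatible compilers.
    description += __VERSION__;
#endif
#if defined(__AVX2__)
    // Defined when the compiler is allowed to emit AVX2 instructions.
    description += " (AVX2 enabled)";
#else
    description += " (AVX2 not enabled)";
#endif
    return description;
}
```

Printing the returned string from a small test program built with the same toolchain and flags as the failing build would show whether the expected GCC version is the one doing the compiling.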

Nevertheless, since this ICE only affects our older systems, it is not worth the effort on our side to dig any deeper. We just disabled vectorization for the affected installation.

update: I just saw in the EasyBuild PR that a patch was recently added to the OpenMM easyconfig to disable vectorization on the single function that fails. That's a better solution.
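For readers unfamiliar with the technique, disabling vectorization for a single function with GCC can be sketched as follows. This is an illustration only, not the actual patch from the EasyBuild PR: the macro `NO_TREE_VECTORIZE` and the function `gatherValues` are hypothetical names, and the loop is merely a stand-in for code with an indexed ("gather") load like the one the backtrace points at. The only real compiler feature used is GCC's `optimize` function attribute.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical macro: expands to the GCC-specific attribute on GCC only,
// and to nothing on other compilers (which lack the 'optimize' attribute).
#if defined(__GNUC__) && !defined(__clang__)
#define NO_TREE_VECTORIZE __attribute__((optimize("no-tree-vectorize")))
#else
#define NO_TREE_VECTORIZE
#endif

// Illustrative stand-in for a function with an indexed ("gather") load.
// With the attribute, GCC compiles this one function as if
// -fno-tree-vectorize had been passed; the rest of the file is unaffected.
NO_TREE_VECTORIZE
std::vector<float> gatherValues(const std::vector<float>& values,
                                const std::vector<int>& indices) {
    std::vector<float> out(indices.size());
    for (std::size_t i = 0; i < indices.size(); ++i) {
        // Guard against out-of-range indices in debug builds.
        assert(indices[i] >= 0 &&
               static_cast<std::size_t>(indices[i]) < values.size());
        out[i] = values[indices[i]];
    }
    return out;
}
```

Compared with turning off vectorization for the whole installation, this keeps the performance cost limited to the one function that triggers the ICE.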

boegel · 2023-04-14T16:26:11Z

@lexming This issue can be closed, and the 118_AlphaFold directory can be removed, since easybuilders/easybuild-easyconfigs#17604 is merged?

lexming · 2023-04-17T07:33:02Z

Last PR:

{bio}[foss/2022a] AlphaFold v2.3.1, OpenMM v8.0.0 w/ Python 3.10.4 easybuilders/easybuild-easyconfigs#17740

and closing this issue.

boegel added difficulty: easy software that should be easy to support priority: high Python update site:ugent Software installation request for UGent Tier-2 GPU labels Mar 21, 2023

boegel added a commit that referenced this issue Mar 21, 2023

WIP easyconfigs for AlphaFold 2.3.1 (#118)

7294be4

boegel mentioned this issue Mar 27, 2023

{bio}[foss/2022a] AlphaFold v2.3.1, HH-suite v3.3.0, Kalign v3.3.5, OpenMM 8.0.0 w/ Python 3.10.4 easybuilders/easybuild-easyconfigs#17604

Merged

lexming closed this as completed Apr 17, 2023
