Issues building CP2K #27

ocaisa opened this issue Feb 27, 2022 · 24 comments

ocaisa commented Feb 27, 2022

I thought I would give this a full test with Fortran, and CP2K is a good benchmark for that. The build (v8.2) is failing with:

/project/60005/easybuild/build/CP2K/8.2/gmtfbf-2021a/cp2k-8.2/exts/dbcsr/src/mpi/dbcsr_mpiwrap.F:1669:21:

 1669 |       CALL mpi_bcast(msg, msglen, MPI_LOGICAL, source, gid, ierr)
      |                     1
......
 3160 |       CALL mpi_bcast(msg, msglen, ${mpi_type1}$, source, gid, ierr)
      |                     2
Error: Type mismatch between actual argument at (1) and actual argument at (2) (LOGICAL(4)/COMPLEX(4)).
@eschnett

You are certainly giving MPItrampoline a good workout. Thank you for your patience.


ocaisa commented Feb 28, 2022

I tried again this morning with the updated PR, but it is still failing. I realised my snippet above doesn't really show all the issues the compilation was running into, so I've uploaded a complete gist of the build. The Makefile used for the build is shown at L324 of the gist; the compiler errors begin at L430.

Sorry for barraging you over the last week, but I would really like to do some performance verification checks for MPItrampoline with real applications (and ultimately support it as part of a toolchain in EasyBuild).

@eschnett

I'm now using Spack (sorry!) to build CP2K against MPItrampoline, and I can reproduce your errors. They appear because the MPI standard technically violates the Fortran standard, and newer GNU Fortran compilers flag this. There are, of course, command-line flags or function attributes that one can use to circumvent these errors. I am now looking into ways to automate this, so that people using MPItrampoline don't have to specify these flags manually.
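
For reference, a minimal sketch of the two workaround styles mentioned above (illustrative only; this is not MPItrampoline's actual mpi module source). Compiling with gfortran's -fallow-argument-mismatch (GCC 10+) demotes the mismatch errors back to warnings, while an explicit interface using gfortran's NO_ARG_CHECK attribute disables type/kind/rank checking for the choice buffer entirely:

! Hypothetical module name, for illustration only.
module mpi_bcast_iface
   interface
      subroutine mpi_bcast(buffer, count, datatype, root, comm, ierror)
         !GCC$ ATTRIBUTES NO_ARG_CHECK :: buffer
         type(*), dimension(*) :: buffer              ! accepts any type, kind and rank
         integer :: count, datatype, root, comm, ierror
      end subroutine mpi_bcast
   end interface
end module mpi_bcast_iface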


eschnett commented Mar 2, 2022

Here is a patch to make CP2K build with MPItrampoline. In some cases CP2K violates the MPI standard (in a way that is harmless for other MPI implementations); other changes are only necessary for MPItrampoline's Fortran interface.

I have also released a new version of MPItrampoline that has some missing features added.


ocaisa commented Mar 2, 2022

Confirmed, that worked for me. I had to make a tiny change to the patch for version 8.2. I also had to enable -fallow-argument-mismatch to get a successful compilation; otherwise I ran into errors like:

/project/60005/easybuild/build/CP2K/8.2/gmtfbf-2021a/cp2k-8.2/exts/dbcsr/src/mpi/dbcsr_mpiwrap.F:5413:47:

 5413 |          CALL MPI_FILE_READ_AT_ALL(fh, offset, msg, msg_len, ${mpi_type1}$, MPI_STATUS_IGNORE, ierr)
      |                                               1
......
 5435 |       CALL MPI_FILE_READ_AT_ALL(fh, offset, msg, 1, ${mpi_type1}$, MPI_STATUS_IGNORE, ierr)
      |                                            2   
Error: Rank mismatch between actual argument at (1) and actual argument at (2) (scalar and rank-1)
/project/60005/easybuild/build/CP2K/8.2/gmtfbf-2021a/cp2k-8.2/exts/dbcsr/src/mpi/dbcsr_mpiwrap.F:5358:48:

 5358 |          CALL MPI_FILE_WRITE_AT_ALL(fh, offset, msg, msg_len, ${mpi_type1}$, MPI_STATUS_IGNORE, ierr)
      |                                                1
......
 5380 |       CALL MPI_FILE_WRITE_AT_ALL(fh, offset, msg, 1, ${mpi_type1}$, MPI_STATUS_IGNORE, ierr)
      |                                             2   
Error: Rank mismatch between actual argument at (1) and actual argument at (2) (scalar and rank-1)

(This might already be fixed in the most recent release, 9.1.)


eschnett commented Mar 2, 2022

Yes, I forgot about -fallow-argument-mismatch. The Spack recipe for CP2K already uses this flag for MPICH, so I added it there for MPItrampoline as well. It would be convenient if CP2K did this automatically.


ocaisa commented Mar 2, 2022

I can add it automatically in EasyBuild; it is already triggered in some scenarios (GCC 10+, CP2K < 7.1).


ocaisa commented Mar 2, 2022

Unfortunately the test suite is segfaulting on every execution; here is an example:

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
/project/60005/easybuild/build/CP2K/8.2/gmtfbf-2021a/TEST-Linux-x86-64-gmtfbf-psmp-2022-03-02_14-46-04/UNIT/libcp2k_unittest.out
 **           ##                                                    ##        **
 **                                                                           **
 **                                                ... make the atoms dance   **
 **                                                                           **
 **            Copyright (C) by CP2K developers group (2000-2021)             **
 **                      J. Chem. Phys. 152, 194103 (2020)                    **
 **                                                                           **
 *******************************************************************************

 *******************************************************************************
 *   ___                                                                       *
 *  /   \                                                                      *
 * [ABORT]                                                                     *
 *  \___/                             CPASSERT failed                          *
 *    |                                                                        *
 *  O/|                                                                        *
 * /| |                                                                        *
 * / \                                                      pw/pw_grids.F:1601 *
 *******************************************************************************


 ===== Routine Calling Stack ===== 

            8 pw_grid_assign
            7 pw_grid_setup_internal
            6 pw_grid_setup
            5 pw_env_rebuild
            4 qs_env_rebuild_pw_env
            3 qs_env_setup
            2 qs_init_subsys
            1 CP2K
[xlnode1:2056906:0:2056906] Caught signal 11 (Segmentation fault: Sent by the kernel at address (nil))
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
EXIT CODE:  1  MEANING:  RUNTIME FAIL
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
/project/60005/easybuild/build/CP2K/8.2/gmtfbf-2021a/TEST-Linux-x86-64-gmtfbf-psmp-2022-03-02_14-46-04/QS/regtest-grid/simple_non-ortho_grid_auto.inp.out
       Description:                       Goedecker-Teter-Hutter pseudopotential
                                           Goedecker et al., PRB 54, 1703 (1996)
                                          Hartwigsen et al., PRB 58, 3641 (1998)
                                                      Krack, TCA 114, 145 (2005)

       Gaussian exponent of the core charge distribution:               4.109048
       Electronic configuration (s p d ...):                               2   2


ocaisa commented Mar 2, 2022

The test suite is somewhat notorious, but I'd expect us to be able to match the results obtained with OpenMPI.


eschnett commented Mar 2, 2022

I will have a look. How do you run the test suite?


ocaisa commented Mar 2, 2022

To be honest I'm not sure; it runs automatically with EasyBuild. Here are the steps:

== 2022-03-02 14:46:04,950 build_log.py:265 INFO testing...
== 2022-03-02 14:46:04,952 easyblock.py:3711 INFO Starting test step
== 2022-03-02 14:46:04,952 easyconfig.py:1686 INFO Generating template values...
== 2022-03-02 14:46:04,952 mpi.py:120 INFO Using template MPI command 'mpiexec -n %(nr_ranks)s %(cmd)s' for MPI family 'MPItrampoline'
== 2022-03-02 14:46:04,952 mpi.py:305 INFO Using MPI command template 'mpiexec -n %(nr_ranks)s %(cmd)s' (params: {'nr_ranks': 1, 'cmd': 'xxx_command_xxx'})
== 2022-03-02 14:46:04,953 easyconfig.py:1705 INFO Template values: arch='x86_64', bitbucket_account='cp2k', builddir='/project/def-sponsor00/easybuild/build/CP2K/8.2/gmtfbf-2021a', github_account='cp2k', installdir='/project/def-sponsor00/easybuild/software/CP2K/8.2-gmtfbf-2021a', module_name='CP2K/8.2-gmtfbf-2021a', mpi_cmd_prefix='mpiexec -n 1', name='CP2K', nameletter='C', nameletterlower='c', namelower='cp2k', parallel='16', toolchain_name='gmtfbf', toolchain_version='2021a', version='8.2', version_major='8', version_major_minor='8.2', version_minor='2', versionprefix='', versionsuffix=''
== 2022-03-02 14:46:04,953 easyblock.py:3719 INFO Running method test_step part of step test
== 2022-03-02 14:46:04,953 environment.py:91 INFO Environment variable CP2K_DATA_DIR set to /project/60005/easybuild/build/CP2K/8.2/gmtfbf-2021a/cp2k-8.2/data (previously undefined)
== 2022-03-02 14:46:04,955 cp2k.py:746 INFO No reference output found for regression test, just continuing without it...
== 2022-03-02 14:46:04,960 cp2k.py:753 INFO Using 4 cores for the MPI tests
== 2022-03-02 14:46:04,960 mpi.py:120 INFO Using template MPI command 'mpiexec -n %(nr_ranks)s %(cmd)s' for MPI family 'MPItrampoline'
== 2022-03-02 14:46:04,960 mpi.py:305 INFO Using MPI command template 'mpiexec -n %(nr_ranks)s %(cmd)s' (params: {'nr_ranks': 4, 'cmd': ''})
== 2022-03-02 14:46:04,963 run.py:233 INFO running cmd: /project/60005/easybuild/build/CP2K/8.2/gmtfbf-2021a/cp2k-8.2/tools/regtesting/do_regtest -nobuild -config cp2k_regtest.cfg

and cp2k_regtest.cfg contains

[ocaisa@xlnode1 gmtfbf-2021a]$ cat cp2k_regtest.cfg 
FORT_C_NAME="gfortran"
dir_base=/project/60005/easybuild/build/CP2K/8.2/gmtfbf-2021a
cp2k_version=psmp
dir_triplet=Linux-x86-64-gmtfbf
export ARCH=${dir_triplet}
cp2k_dir=cp2k-8.2
leakcheck="YES"
maxtasks=4


ocaisa commented Mar 2, 2022

Not all of the tests are failing straight away; I see some are running for quite some time. I'll need to do a comparison build with OpenMPI to fully check. Unfortunately the test suite takes ages, so I won't be able to report back for quite a while.


ocaisa commented Mar 2, 2022

The regression tests with OpenMPI took 3 hours (and had 10 failures). The regression tests with MPItrampoline are still running (over 5 hours) and look to have more than 1000 failures.

Here's a backtrace on one of the errors:

==== backtrace (tid:3398606) ====
 0  /cvmfs/pilot.eessi-hpc.org/versions/2021.12/software/linux/x86_64/amd/zen2/software/UCX/1.10.0-GCCcore-10.3.0/lib64/libucs.so.0(ucs_handle_error+0x254) [0x14f2a7b5f474]
 1  /cvmfs/pilot.eessi-hpc.org/versions/2021.12/software/linux/x86_64/amd/zen2/software/UCX/1.10.0-GCCcore-10.3.0/lib64/libucs.so.0(+0x21657) [0x14f2a7b5f657]
 2  /cvmfs/pilot.eessi-hpc.org/versions/2021.12/software/linux/x86_64/amd/zen2/software/UCX/1.10.0-GCCcore-10.3.0/lib64/libucs.so.0(+0x2180a) [0x14f2a7b5f80a]
 3  /cvmfs/pilot.eessi-hpc.org/versions/2021.12/compat/linux/x86_64/lib/../lib64/libpthread.so.0(+0x120f0) [0x14f2c322d0f0]
 4  /cvmfs/pilot.eessi-hpc.org/versions/2021.12/compat/linux/x86_64/lib/../lib64/libc.so.6(+0x157c77) [0x14f2c1530c77]
 5  /project/def-sponsor00/easybuild/software/MPItrampoline/3.8.0-GCC-10.3.0/mpiwrapper/lib/libopen-pal.so.40(+0x41798) [0x14f2adecb798]
 6  /project/def-sponsor00/easybuild/software/MPItrampoline/3.8.0-GCC-10.3.0/mpiwrapper/lib/libmpi.so.40(ompi_coll_base_allreduce_intra_redscat_allgather+0x1a9) [0x14f2ae09cd49]
 7  /project/def-sponsor00/easybuild/software/MPItrampoline/3.8.0-GCC-10.3.0/mpiwrapper/lib/openmpi/mca_coll_tuned.so(ompi_coll_tuned_allreduce_intra_dec_fixed+0x4a) [0x14f2a799062a]
 8  /project/def-sponsor00/easybuild/software/MPItrampoline/3.8.0-GCC-10.3.0/mpiwrapper/lib/libmpi.so.40(PMPI_Allreduce+0xf0) [0x14f2ae054090]
 9  /project/def-sponsor00/easybuild/software/MPItrampoline/3.8.0-GCC-10.3.0/mpiwrapper/lib/libmpi_mpifh.so.40(mpi_allreduce_+0x79) [0x14f2ae1715b9]
10  /project/60005/easybuild/build/CP2K/8.2/gmtfbf-2021a/cp2k-8.2/exe/Linux-x86-64-gmtfbf/cp2k.psmp() [0x1d7d70a]
11  /project/60005/easybuild/build/CP2K/8.2/gmtfbf-2021a/cp2k-8.2/exe/Linux-x86-64-gmtfbf/cp2k.psmp() [0x1b0c09c]
12  /project/60005/easybuild/build/CP2K/8.2/gmtfbf-2021a/cp2k-8.2/exe/Linux-x86-64-gmtfbf/cp2k.psmp() [0x9fbbe6]
13  /project/60005/easybuild/build/CP2K/8.2/gmtfbf-2021a/cp2k-8.2/exe/Linux-x86-64-gmtfbf/cp2k.psmp() [0xa7091b]
14  /project/60005/easybuild/build/CP2K/8.2/gmtfbf-2021a/cp2k-8.2/exe/Linux-x86-64-gmtfbf/cp2k.psmp() [0xa733ad]
15  /project/60005/easybuild/build/CP2K/8.2/gmtfbf-2021a/cp2k-8.2/exe/Linux-x86-64-gmtfbf/cp2k.psmp() [0xa693e1]
16  /project/60005/easybuild/build/CP2K/8.2/gmtfbf-2021a/cp2k-8.2/exe/Linux-x86-64-gmtfbf/cp2k.psmp() [0x831661]
17  /project/60005/easybuild/build/CP2K/8.2/gmtfbf-2021a/cp2k-8.2/exe/Linux-x86-64-gmtfbf/cp2k.psmp() [0x47524e]
18  /project/60005/easybuild/build/CP2K/8.2/gmtfbf-2021a/cp2k-8.2/exe/Linux-x86-64-gmtfbf/cp2k.psmp() [0x478df1]
19  /project/60005/easybuild/build/CP2K/8.2/gmtfbf-2021a/cp2k-8.2/exe/Linux-x86-64-gmtfbf/cp2k.psmp() [0x474139]
20  /project/60005/easybuild/build/CP2K/8.2/gmtfbf-2021a/cp2k-8.2/exe/Linux-x86-64-gmtfbf/cp2k.psmp() [0x40edaf]
21  /cvmfs/pilot.eessi-hpc.org/versions/2021.12/compat/linux/x86_64/lib/../lib64/libc.so.6(__libc_start_main+0xce) [0x14f2c13fc7fe]
22  /project/60005/easybuild/build/CP2K/8.2/gmtfbf-2021a/cp2k-8.2/exe/Linux-x86-64-gmtfbf/cp2k.psmp() [0x4725ca]

I might bump everything to later compilers and OpenMPI next week to see if the problems still exist with our latest toolchains.


eschnett commented Mar 2, 2022

I would expect that the regression test failures are either errors in the MPItrampoline Fortran bindings or errors in my changes to CP2K. So far, I have built the OpenMPI CP2K tests in a Docker container; my next step would be to convert these to MPItrampoline tests so that I can run them locally.

If you have a reproducible setup that I can use, then that could save me some time.


ocaisa commented Mar 3, 2022

What I have is reproducible but, I suspect, tedious for you (it would involve building everything down to the compiler, and it requires customisations for MPItrampoline that are not yet merged into an EasyBuild release).

My job with the MPItrampoline tests was killed after 13 hours of testing :(


ocaisa commented Mar 3, 2022

You can find the docs on the tests at https://www.cp2k.org/dev:regtesting . If you have an existing build, you should be able to run the tests using it (after getting the sources, start from Step 2 and use the -nobuild option).

There is a section there also about the directory structure you need.


eschnett commented Mar 3, 2022

I tried a -nobuild test, following your instructions. This leads to the error

make: *** No rule to make target 'realclean'.  Stop.

Apparently a makefile needs to be present somewhere.

I also tried the Docker container I mentioned earlier (with OpenMPI, no changes, straight from the checkout). This led to many failures (55 out of 60).

Building everything locally wouldn't be a problem; the Docker containers I mentioned, for example, start by building GCC. But I am looking for step-by-step instructions (someone "holding my hand") to reproduce the issue. Either a Dockerfile or a shell script for macOS or Linux would work.


ocaisa commented Mar 3, 2022

Assuming that you already have the sources and a build of CP2K, here are the steps I took:

mkdir CP2K_testing
tar -jxvf cp2k-8.2.tar.bz2 
cd cp2k-8.2
mv data/ ../CP2K_testing/
mv tests/ ../CP2K_testing/
mv tools ../CP2K_testing/
cd  ../CP2K_testing/
mkdir -p exe
module load CP2K/8.2-gmtfbf-2021a 
cd exe/
ln -s $EBROOTCP2K/bin prebuilt  # $EBROOTCP2K points to the installation directory

I could then run the tests on the cluster with

#!/bin/bash -l
#SBATCH --time=01:00:00
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=2
#SBATCH --ntasks-per-core=1
 
# More SBATCH options:
# If you need 512GB memory nodes (otherwise only 256GB guaranteed):
#    #SBATCH --mem=497G
# To run on the debug queue (max 10 nodes, 30 min):
#    #SBATCH --partition=debug
 
set -o errexit
set -o nounset
set -o pipefail
 
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
export OMP_PROC_BIND=close
export OMP_PLACES=cores
 
module load CP2K/8.2-gmtfbf-2021a 

# Let the user see the currently loaded modules in the slurm log for completeness:
module list
 
CP2K_BASE_DIR="/home/ocaisa/CP2K_testing"
CP2K_TEST_DIR="/scratch/ocaisa/cp2k_regtesting"
CP2K_REGTEST_SCRIPT_DIR="/home/ocaisa/CP2K_testing/tools/regtesting"

CP2K_ARCH="prebuilt"
CP2K_VERSION="psmp"
 
NTASKS_SINGLE_TEST=2
NNODES_SINGLE_TEST=1
SRUN_CMD="mpiexec"
 
# to run tests across nodes (to check for communication effects), use:
# NNODES_SINGLE_TEST=4
# SRUN_CMD="srun --cpu-bind=verbose,cores --ntasks-per-node 2"
 
# the following should be sufficiently generic:
 
mkdir -p "${CP2K_TEST_DIR}"
cd "${CP2K_TEST_DIR}"
 
cp2k_rel_dir=$(realpath --relative-to="${CP2K_TEST_DIR}" "${CP2K_BASE_DIR}")
# srun does not like `-np`, override the complete command instead:
export cp2k_run_prefix="${SRUN_CMD} -N ${NNODES_SINGLE_TEST} -n ${NTASKS_SINGLE_TEST}"
 
"${CP2K_REGEST_SCRIPT_DIR:-${CP2K_BASE_DIR}/tools/regtesting}/do_regtest" \
  -arch "${CP2K_ARCH}" \
  -version "${CP2K_VERSION}" \
  -nobuild \
  -mpiranks ${NTASKS_SINGLE_TEST} \
  -ompthreads ${OMP_NUM_THREADS} \
  -maxtasks ${SLURM_NTASKS} \
  -cp2kdir "${cp2k_rel_dir}" \
 |& tee "${CP2K_TEST_DIR}/${CP2K_ARCH}.${CP2K_VERSION}.log"


ocaisa commented Mar 4, 2022

OK, I think this may be simpler than running the full test suite. I downloaded an H.inp sample input file from the CP2K repo. It takes a second to run with OpenMPI:

OMP_NUM_THREADS=2 mpiexec --oversubscribe -n 8 cp2k.psmp ./H.inp

but with the MPItrampoline version it hangs when gathering statistics at the end:


  **** **** ******  **  PROGRAM STARTED AT               2022-03-04 12:47:00.962
 ***** ** ***  *** **   PROGRAM STARTED ON        xlnode1.int.eessi-gpu.learnhpc
 **    ****   ******    PROGRAM STARTED BY                                ocaisa
 ***** **    ** ** **   PROGRAM PROCESS ID                                532650
  **** **  *******  **  PROGRAM STARTED IN                          /home/ocaisa

 CP2K| version string:                                          CP2K version 8.2
 CP2K| source code revision number:                                  git:310b7ab
 CP2K| cp2kflags: omp libint fftw3 libxc parallel mpi3 scalapack xsmm plumed2   
 CP2K| is freely available from                            https://www.cp2k.org/
 CP2K| Program compiled at                        Thu Mar 3 11:47:21 AM UTC 2022
 CP2K| Program compiled on                     xlnode1.int.eessi-gpu.learnhpc.eu
 CP2K| Program compiled for                                  Linux-x86-64-gmtfbf
 CP2K| Data directory path    /project/def-sponsor00/easybuild/software/CP2K/8.2
 CP2K| Input file name                                                   ./H.inp

 GLOBAL| Method name                                                        ATOM
 GLOBAL| Project name                                                          H
 GLOBAL| Run type                                                   ENERGY_FORCE
 GLOBAL| FFT library                                                       FFTW3
 GLOBAL| Diagonalization library                                       ScaLAPACK
 GLOBAL| Orthonormality check for eigenvectors                          DISABLED
 GLOBAL| Matrix multiplication library                                    SCALAP
 GLOBAL| All-to-all communication in single precision                          F
 GLOBAL| FFTs using library dependent lengths                                  F
 GLOBAL| Grid backend                                                       AUTO
 GLOBAL| Global print level                                               MEDIUM
 GLOBAL| MPI I/O enabled                                                       T
 GLOBAL| Total number of message passing processes                             8
 GLOBAL| Number of threads for this process                                    2
 GLOBAL| This output is from process                                           0
 GLOBAL| CPU model name                          AMD EPYC 7742 64-Core Processor
 GLOBAL| CPUID                                                              1002

 MEMORY| system memory details [Kb]
 MEMORY|                        rank 0           min           max       average
 MEMORY| MemTotal             65951236             0             0             0
 MEMORY| MemFree              56722588             0             0             0
 MEMORY| Buffers                  2104             0             0             0
 MEMORY| Cached                7199448             0             0             0
 MEMORY| Slab                   537888             0             0             0
 MEMORY| SReclaimable           300676             0             0             0
 MEMORY| MemLikelyFree        64224816             0             0             0


 *** Fundamental physical constants (SI units) ***

 *** Literature: B. J. Mohr and B. N. Taylor,
 ***             CODATA recommended values of the fundamental physical
 ***             constants: 2006, Web Version 5.1
 ***             http://physics.nist.gov/constants

 Speed of light in vacuum [m/s]                             2.99792458000000E+08
 Magnetic constant or permeability of vacuum [N/A**2]       1.25663706143592E-06
 Electric constant or permittivity of vacuum [F/m]          8.85418781762039E-12
 Planck constant (h) [J*s]                                  6.62606896000000E-34
 Planck constant (h-bar) [J*s]                              1.05457162825177E-34
 Elementary charge [C]                                      1.60217648700000E-19
 Electron mass [kg]                                         9.10938215000000E-31
 Electron g factor [ ]                                     -2.00231930436220E+00
 Proton mass [kg]                                           1.67262163700000E-27
 Fine-structure constant                                    7.29735253760000E-03
 Rydberg constant [1/m]                                     1.09737315685270E+07
 Avogadro constant [1/mol]                                  6.02214179000000E+23
 Boltzmann constant [J/K]                                   1.38065040000000E-23
 Atomic mass unit [kg]                                      1.66053878200000E-27
 Bohr radius [m]                                            5.29177208590000E-11

 *** Conversion factors ***

 [u] -> [a.u.]                                              1.82288848426455E+03
 [Angstrom] -> [Bohr] = [a.u.]                              1.88972613288564E+00
 [a.u.] = [Bohr] -> [Angstrom]                              5.29177208590000E-01
 [a.u.] -> [s]                                              2.41888432650478E-17
 [a.u.] -> [fs]                                             2.41888432650478E-02
 [a.u.] -> [J]                                              4.35974393937059E-18
 [a.u.] -> [N]                                              8.23872205491840E-08
 [a.u.] -> [K]                                              3.15774647902944E+05
 [a.u.] -> [kJ/mol]                                         2.62549961709828E+03
 [a.u.] -> [kcal/mol]                                       6.27509468713739E+02
 [a.u.] -> [Pa]                                             2.94210107994716E+13
 [a.u.] -> [bar]                                            2.94210107994716E+08
 [a.u.] -> [atm]                                            2.90362800883016E+08
 [a.u.] -> [eV]                                             2.72113838565563E+01
 [a.u.] -> [Hz]                                             6.57968392072181E+15
 [a.u.] -> [1/cm] (wave numbers)                            2.19474631370540E+05
 [a.u./Bohr**2] -> [1/cm]                                   5.14048714338585E+03
 
 DBCSR| CPU Multiplication driver                                           XSMM
 DBCSR| Multrec recursion limit                                              512
 DBCSR| Multiplication stack size                                           1000
 DBCSR| Maximum elements for images                                    UNLIMITED
 DBCSR| Multiplicative factor virtual images                                   1
 DBCSR| Use multiplication densification                                       T
 DBCSR| Multiplication size stacks                                             3
 DBCSR| Use memory pool for CPU allocation                                     F
 DBCSR| Number of 3D layers                                               SINGLE
 DBCSR| Use MPI memory allocation                                              F
 DBCSR| Use RMA algorithm                                                      F
 DBCSR| Use Communication thread                                               T
 DBCSR| Communication thread load                                             83
 DBCSR| MPI: My node id                                                        0
 DBCSR| MPI: Number of nodes                                                   8
 DBCSR| OMP: Current number of threads                                         2
 DBCSR| OMP: Max number of threads                                             2
 DBCSR| Split modifier for TAS multiplication algorithm                  1.0E+00



                           ****  ******  ****   ****   
                          **  ** ****** **  ** ******  
                          ******   **   **  ** **  **  
                          **  **   **    ****  **  **  
                                                       
                             University of Zurich      
                                 2009 - 2015           
                                                       
                                 Version 0.0           
                                                                   


 Atomic Energy Calculation             Hydrogen [H]          Atomic number:    1


 METHOD    | Restricted Kohn-Sham Calculation
 METHOD    | Nonrelativistic Calculation
 FUNCTIONAL| ROUTINE=NEW
 FUNCTIONAL| BECKE88:
 FUNCTIONAL| A. Becke, Phys. Rev. A 38, 3098 (1988) {LDA version}               
 FUNCTIONAL| LYP:
 FUNCTIONAL| C. Lee, W. Yang, R.G. Parr, Phys. Rev. B, 37, 785 (1988) {LDA versi
 FUNCTIONAL| on}                                                                

 Electronic structure
    Total number of core electrons                                          0.00
    Total number of valence electrons                                       1.00
    Total number of electrons                                               1.00
    Multiplicity                                                   not specified
    S      1.00


 *******************************************************************************
                  Iteration          Convergence                     Energy [au]
 *******************************************************************************
                          1        0.320749E-01                  -0.456955069647
                          2        0.324918E-02                  -0.457634427736
                          3        0.262900E-03                  -0.457648540569
                          4        0.494227E-06                  -0.457648648451

 Energy components [Hartree]           Total Energy ::           -0.457648648451
                                        Band Energy ::           -0.222846251753
                                     Kinetic Energy ::            0.482431430167
                                   Potential Energy ::           -0.940080078618
                                      Virial (-V/T) ::            1.948629421373
                                        Core Energy ::           -0.496696135527
                                          XC Energy ::           -0.266586511371
                                     Coulomb Energy ::            0.305633998447

 Orbital energies  State     L     Occupation   Energy[a.u.]          Energy[eV]

                       1     0          1.000      -0.222846           -6.063955


 Total Electron Density at R=0:                                         0.288108


                             NORMAL TERMINATION OF     
                                                       
                           ****  ******  ****   ****   
                          **  ** ****** **  ** ******  
                          ******   **   **  ** **  **  
                          **  **   **    ****  **  **  


 -------------------------------------------------------------------------------
 -                                                                             -
 -                                DBCSR STATISTICS                             -
 -                                                                             -
 -------------------------------------------------------------------------------

Given the backtrace above, I wonder if this is a specific problem with MPI_Allreduce?

I also found an issue in the CP2K repo about the clash between the Fortran and MPI standards: cp2k/cp2k#1019
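
As a side note, a tiny standalone reproducer along these lines might help confirm whether MPI_Allreduce itself misbehaves through MPItrampoline, independently of CP2K (a hypothetical test program, not taken from CP2K):

program allreduce_check
   use mpi
   implicit none
   integer :: ierr, rank, nranks
   integer :: local(4), global(4)
   call MPI_Init(ierr)
   call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
   call MPI_Comm_size(MPI_COMM_WORLD, nranks, ierr)
   ! Each element of the result should be 1 + 2 + ... + nranks.
   local = rank + 1
   call MPI_Allreduce(local, global, size(local), MPI_INTEGER, MPI_SUM, MPI_COMM_WORLD, ierr)
   if (rank == 0) print *, 'allreduce result:', global(1), 'expected:', nranks*(nranks+1)/2
   call MPI_Finalize(ierr)
end program allreduce_check

Running it with the same launcher settings as the hanging case (e.g. OMP_NUM_THREADS=2 mpiexec --oversubscribe -n 8 ./allreduce_check) would show whether the collective completes outside CP2K.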


eschnett commented Mar 4, 2022

I'll have a look.

What architecture are you using (x86_64?), and what MPI implementation (MPICH?)?


ocaisa commented Mar 4, 2022

Yes, x86_64 (AMD Rome). I'm using Open MPI as the default backend for MPItrampoline and comparing that to the same toolchain with vanilla Open MPI.


ocaisa commented Apr 4, 2022

To leave an update here: I did manage to get a patch that worked in a few cases, but it was quite invasive, and you would need quite a bit of knowledge (of both the programming language and the use case) to get it right. The core problem, it seems to me, is that CP2K uses MPI constants in variable initialisations, and MPItrampoline can't allow that, since it (and therefore the compiler) doesn't know what those constants should be until runtime. This looks like it might be a wider issue, since I've seen the same type of problem appear in other (Fortran) applications.
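
To illustrate the pattern being described (a minimal sketch, not the actual CP2K code): with a conventional MPI library the Fortran constants are compile-time parameters and can be used in initialisers, whereas with MPItrampoline their values are only known once the backend MPI library has been loaded, so only the runtime assignment form compiles:

! Hypothetical module, for illustration only.
module comm_store
   use mpi
   implicit none
   ! Compile-time initialisation: fine with Open MPI or MPICH, where MPI_COMM_NULL
   ! is a named constant, but not with MPItrampoline, where its value is only
   ! known at runtime.
   integer :: saved_comm = MPI_COMM_NULL
contains
   subroutine reset_comm()
      ! Runtime assignment: works in both cases.
      saved_comm = MPI_COMM_NULL
   end subroutine reset_comm
end module comm_store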

@eschnett made the suggestion that perhaps (for Fortran at least) MPItrampoline should define its own constants and then translate them at runtime to the constants of the actual MPI runtime used.


ocaisa commented Nov 11, 2022

@eschnett I have been trying to think of a way to get around this. Much as I want to, in our case it is very hard to consider using MPItrampoline as part of a toolchain if key Fortran applications won't work with it. As a compromise, I was wondering if there would be a way to let us fix the MPI constant values when using the MPItrampoline compiler wrappers. There are only two key variants that I can think of, OpenMPI and MPICH, so perhaps an option to the compiler wrappers that allows us to use a specific set of values? That would allow me to create two variants of problem applications like CP2K, one for an OpenMPI compatibility use case and one for an MPICH compatibility use case. This would cover every scenario that I can currently think of (and would be extensible to ones I can't).

@eschnett

@ocaisa This would be a good compromise. Let me think about this.
