Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Thread number has changed #217

Open
hfp opened this issue Aug 7, 2019 · 21 comments
Open

Thread number has changed #217

hfp opened this issue Aug 7, 2019 · 21 comments
Milestone

Comments

@hfp
Copy link
Member

hfp commented Aug 7, 2019

When building and running CP2K as POPT variant (any workload), DBCSR complains:

 *******************************************************************************
 *   ___                                                                       *
 *  /   \                                                                      *
 * [ABORT]                                                                     *
 *  \___/                        Thread number has changed                     *
 *    |                                                                        *
 *  O/|                                                                        *
 * /| |                                                                        *
 * / \                                                dbcsr_dist_methods.F:610 *
 *******************************************************************************

Apparently DBCSR is built with OpenMP even in case of CP2K/POPT. In dbcsr_dist_methods.F:204, dist%d%num_threads is set according to the number of OpenMP-threads. The suggestion is not to build DBCSR without OpenMP, but to revise the check about whether the "Thread number has changed" in a broken way. It should be possible to build and run DBCSR with and without OpenMP independent of CP2K's OpenMP-flag. I suggest to only warn and terminate if the thread-count transitions from N to M with N.GT.1 and N.NE.M.

@alazzaro
Copy link
Member

alazzaro commented Aug 7, 2019

This is a bug in CP2K :( Could you tell me where do you get it? In principle, in the popt you should not have any mention of openmp in the arch file...

Yes, the current model has a static and not centralized way to handle threads, which is inefficient. So, we have to adapt CP2K to this model.
Let's leave this issue open for any feature activity about threading.

@hfp
Copy link
Member Author

hfp commented Aug 7, 2019

I ran the following exe/Linux-x86-64-intelx/cp2k.popt tests/QS/benchmark_single_node/dbcsr.inp (any workload should fail similarly) and built with make ARCH=Linux-x86-64-intelx VERSION=popt GNU=1 LIBINTROOT=$HOME/libint/gnu-skx LIBXCROOT=$HOME/libxc/gnu-skx ELPAROOT=$HOME/elpa/gnu-skx-omp DBG=1 -j48 (CP2K/master). There was no -fopenmp flag present for any of the Fortran translation units and DBCSR was built just as part of CP2K (I double-checked with tee-ing the output and grep-ing for OpenMP). So for that matter any POPT-Archfile should be fine (you do not have to worry about this "Linux-x86-64-intelx" ;-).

@alazzaro
Copy link
Member

alazzaro commented Aug 7, 2019

Well, we don't see such a problem on the dashboard...
So, I see that you are using
VERSION=psmp
but I assume it is a wrong copy&paste. Somehow you are enabling OpenMP for DBCSR... Could you confirm that during the compilation you don't see any -fopenmp?

@hfp
Copy link
Member Author

hfp commented Aug 7, 2019

Edited - indeed copy/paste issue (PSMP->POPT). I will find out why I am getting this while the Dashboard configs seem to be clean.

@hfp
Copy link
Member Author

hfp commented Aug 7, 2019

Maybe I understand this incorrectly, but looking at https://github.com/cp2k/dbcsr/blob/develop/src/dist/dbcsr_dist_methods.F#L45 seems to pull-in routines from the OpenMP runtime and later calling e.g., OMP_GET_MAX_THREADS(). Should this be guarded by #if defined(_OPENMP)?

@alazzaro
Copy link
Member

alazzaro commented Aug 7, 2019

Well no, in Fortran is enough to have

!$

at the beginning of the line...

@alazzaro
Copy link
Member

alazzaro commented Aug 7, 2019

Would it be that you are linking to OpenMP libraries, even if the flag -fopenmp is off?

@hfp
Copy link
Member Author

hfp commented Aug 7, 2019

This also works if OpenMP is linked but not enabled at compile-time?

@hfp
Copy link
Member Author

hfp commented Aug 7, 2019

Would it be that you are linking to OpenMP libraries, even if the flag -fopenmp is off?

Yes, but that's "normal" I think. Though, OpenBLAS or MKL could pull-in OpenMP runtime as well.

@alazzaro
Copy link
Member

alazzaro commented Aug 7, 2019

So the assumption here is that we get OpenMP in DBCSR even if a compile-time we don't ask for it just because we are linking OpenMP libraries...
Could you try to avoid OpenMP everywhere (i.e. including linking) and see if this is the culprit?
From what I can say, POPT tests do not use OpenMP at all everywhere.

@hfp
Copy link
Member Author

hfp commented Aug 7, 2019

POPT tests do not use OpenMP at all everywhere.

Yes, that's the reason for Dashboard being clean. My POPT indirectly links against the OpenMP-runtime. I will double-check that this is the root cause and report back.

@alazzaro
Copy link
Member

alazzaro commented Aug 7, 2019

Assuming that you find the reason in the linking, then we can proceed to use _OPENMP macro and see if it works

@hfp
Copy link
Member Author

hfp commented Aug 7, 2019

I am now getting linker errors about missing OpenMP symbols. I will resolve this over the course of today and build a non-OpenMP POPT. Stay tuned.

@hfp
Copy link
Member Author

hfp commented Aug 7, 2019

Let me collect my findings. The problem already seems to be present at CP2K level (GLOBAL| Number of threads for this process 48) likely because CP2K uses OpenMP runtime in a similar fashion:

[hpabst@clx4 cp2k]$ exe/Linux-x86-64-intelx/cp2k.popt tests/QS/benchmark_single_node/dbcsr.inp

  **** **** ******  **  PROGRAM STARTED AT               2019-08-07 12:18:58.044
 ***** ** ***  *** **   PROGRAM STARTED ON                                  clx4
 **    ****   ******    PROGRAM STARTED BY                                hpabst
 ***** **    ** ** **   PROGRAM PROCESS ID                                307757
  **** **  *******  **  PROGRAM STARTED IN                /nfs/home2/hpabst/cp2k

 CP2K| version string:                    CP2K version 7.0 (Development Version)
 CP2K| source code revision number:                                  git:e4cf6be
 CP2K| cp2kflags: omp fftw3 parallel mpi3 scalapack xsmm has_no_mpi_mod mkl
 CP2K| is freely available from                            https://www.cp2k.org/
 CP2K| Program compiled at                           Wed 7 Aug 12:16:06 BST 2019
 CP2K| Program compiled on                                                  clx4
 CP2K| Program compiled for                                  Linux-x86-64-intelx
 CP2K| Data directory path                           /nfs/home2/hpabst/cp2k/data
 CP2K| Input file name                  tests/QS/benchmark_single_node/dbcsr.inp

 GLOBAL| Method name                                                        TEST
 GLOBAL| Project name                                                    PROJECT
 GLOBAL| Preferred FFT library                                             FFTW3
 GLOBAL| Preferred diagonalization lib.                                       SL
 GLOBAL| Run type                                                           NONE
 GLOBAL| All-to-all communication in single precision                          F
 GLOBAL| FFTs using library dependent lengths                                  F
 GLOBAL| Global print level                                               MEDIUM
 GLOBAL| MPI I/O enabled                                                       T
 GLOBAL| Total number of message passing processes                             1
 GLOBAL| Number of threads for this process                                   48
 GLOBAL| This output is from process                                           0
 GLOBAL| CPU model name            Intel(R) Xeon(R) Platinum 8260L CPU @ 2.40GHz
 GLOBAL| CPUID                                                              1003

@alazzaro
Copy link
Member

alazzaro commented Aug 7, 2019

Well, not a surprise since DBCSR was part of CP2K ;)
We can add a check for that and stop if this is the case, but I wonder if there is a problem during the preprocessing then... I mean, OpenMP should be off because we are using

!$

macro during the compiling time, so I don't see why it can be used at runtime.
@dev-zero Do you have any idea for that?

@hfp
Copy link
Member Author

hfp commented Aug 7, 2019

Ok, I found the root cause: I am using -fopenmp-simd which does not enable OpenMP support for multithreaded applications. This was decided when OpenMP introduced SIMD support, and the objective was to allow indepenent opt-in for both facilities. For instance, -fopenmp-simd does not implicitly define _OPENMP. However, at least GNU Fortran enables lines like !$ USE OMP_LIB, ONLY: omp_get_max_threads, which is likely unintended.

Here is some playground (perhaps called omprt.F90):

      PROGRAM omprt
!$      USE OMP_LIB, ONLY: omp_get_max_threads
        IMPLICIT NONE
        INTEGER :: max_nthreads

        max_nthreads = 1
!$      max_nthreads = omp_get_max_threads()
        WRITE(*,*) "max_nthreads:", max_nthreads
#if defined(_OPENMP)
        WRITE(*,*) "_OPENMP: defined"
#else
        WRITE(*,*) "_OPENMP: not defined"
#endif
      END PROGRAM

Some output:

gfortran omprt.F90
./a.out
 max_nthreads:           1
 _OPENMP: not defined

Some more:

gfortran -fopenmp-simd omprt.F90
/tmp/ccLNztfr.o: In function `MAIN__':
omprt.F90:(.text+0x13): undefined reference to `omp_get_max_threads_'
collect2: error: ld returned 1 exit status

The latter behavior is a bug, especially since ifort works correctly. Intel Fortran needs -qopenmp-simd (icc supports both -fopenmp-simd and -qopenmp-simd).

@hfp
Copy link
Member Author

hfp commented Aug 7, 2019

I understand that -fopenmp-simd has no adoption (yet) in CP2K/DBCSR and it is implicitly enabled by -fopenmp (as opposed to the other way around). Anyhow, I think with this it would be almost overly cautious to protect calling omp_* with #if defined(_OPENMP). However, there are likely not too many lines and someone may decide about it...

@alazzaro
Copy link
Member

alazzaro commented Aug 7, 2019

No really, we use !$ everywhere in the code! Is this with a recent GCC compiler?

@hfp
Copy link
Member Author

hfp commented Aug 7, 2019

I used GNU Fortran 8.3 i.e., the latest valid version for me (side note: 9.1 fails tons of CP2K's regtests).

@alazzaro
Copy link
Member

alazzaro commented Aug 7, 2019

The man page says:

-fopenmp-simd
           Enable handling of OpenMP's SIMD directives with "#pragma omp" in C/C++ and "!$omp" in Fortran. Other OpenMP directives are ignored.

so it is clearly a bug in GCC during the preprocessing... Anyhow, I think the problem is understood, and it is no related to DBCSR (it is CP2K), but let's leave the issue open for future threading activity in DBCSR...

@hfp
Copy link
Member Author

hfp commented Aug 7, 2019

Thanks! The fix for me is also easy, I will just pull-off the -fopenmp-simd in case of POPT (or similar). This way, I can proceed building POPT with some libraries being MT. In fact, I am planning to enable a race-free (synchronized) batch-processing of SMMs using libxsmmext in particular for POPT.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants