Thread number has changed #217
Comments
This is a bug in CP2K :( Could you tell me where you get it? In principle, for popt you should not have any mention of OpenMP in the arch file... Yes, the current model handles threads in a static, non-centralized way, which is inefficient. So, we have to adapt CP2K to this model. |
I ran the following |
Well, we don't see such a problem on the dashboard... |
Edited - indeed copy/paste issue (PSMP->POPT). I will find out why I am getting this while the Dashboard configs seem to be clean. |
Maybe I understand this incorrectly, but https://github.com/cp2k/dbcsr/blob/develop/src/dist/dbcsr_dist_methods.F#L45 seems to pull in routines from the OpenMP runtime and later call, e.g., |
Well no, in Fortran it is enough to have !$ at the beginning of the line... |
Could it be that you are linking to OpenMP libraries, even if the flag -fopenmp is off? |
This also works if OpenMP is linked but not enabled at compile-time? |
Yes, but that's "normal" I think. Though, OpenBLAS or MKL could pull in the OpenMP runtime as well. |
So the assumption here is that we get OpenMP in DBCSR even if we don't ask for it at compile time, just because we are linking OpenMP libraries... |
Yes, that's the reason for Dashboard being clean. My POPT indirectly links against the OpenMP-runtime. I will double-check that this is the root cause and report back. |
Assuming that you find the reason in the linking, we can then proceed to use the _OPENMP macro and see if it works |
I am now getting linker errors about missing OpenMP symbols. I will resolve this over the course of today and build a non-OpenMP POPT. Stay tuned. |
Let me collect my findings. The problem already seems to be present at CP2K level (
|
Well, not a surprise since DBCSR was part of CP2K ;) !$ macro during the compiling time, so I don't see why it can be used at runtime. |
Ok, I found the root cause: I am using Here is some playground (perhaps called
PROGRAM omprt
!$ USE OMP_LIB, ONLY: omp_get_max_threads
IMPLICIT NONE
INTEGER :: max_nthreads
max_nthreads = 1
!$ max_nthreads = omp_get_max_threads()
WRITE(*,*) "max_nthreads:", max_nthreads
#if defined(_OPENMP)
WRITE(*,*) "_OPENMP: defined"
#else
WRITE(*,*) "_OPENMP: not defined"
#endif
END PROGRAM
Some output:
gfortran omprt.F90
./a.out
max_nthreads: 1
_OPENMP: not defined
Some more:
gfortran -fopenmp-simd omprt.F90
/tmp/ccLNztfr.o: In function `MAIN__':
omprt.F90:(.text+0x13): undefined reference to `omp_get_max_threads_'
collect2: error: ld returned 1 exit status
The latter behavior is a bug, especially since |
I understand that |
Not really, we use !$ everywhere in the code! Is this with a recent GCC compiler? |
I used GNU Fortran 8.3 i.e., the latest valid version for me (side note: 9.1 fails tons of CP2K's regtests). |
The man page says:
so it is clearly a bug in GCC during preprocessing... Anyhow, I think the problem is understood, and it is not related to DBCSR (it is CP2K), but let's leave the issue open for future threading activity in DBCSR... |
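For readers following the _OPENMP suggestion made earlier in the thread, here is a minimal sketch of guarding the runtime call with the preprocessor macro instead of the !$ sentinel. The module and function names (omp_guard, get_max_threads) are hypothetical and not part of DBCSR or CP2K. The sketch assumes the file is run through the preprocessor (for example a .F90 extension with gfortran) and that _OPENMP is defined only when full OpenMP support is requested, so the call to omp_get_max_threads is left out otherwise.

```fortran
MODULE omp_guard
! Hypothetical helper, not DBCSR/CP2K code: the OpenMP runtime is referenced
! only when the compiler defines _OPENMP.
#if defined(_OPENMP)
   USE OMP_LIB, ONLY: omp_get_max_threads
#endif
   IMPLICIT NONE
   PRIVATE
   PUBLIC :: get_max_threads

CONTAINS

   ! Returns the OpenMP maximum thread count, or 1 in a build without OpenMP.
   FUNCTION get_max_threads() RESULT(max_nthreads)
      INTEGER :: max_nthreads

      max_nthreads = 1
#if defined(_OPENMP)
      max_nthreads = omp_get_max_threads()
#endif
   END FUNCTION get_max_threads

END MODULE omp_guard
```

Under these assumptions the function links without the OpenMP runtime when _OPENMP is undefined, which is the situation that produced the undefined reference in the playground above.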
Thanks! The fix for me is also easy, I will just pull-off the |
When building and running CP2K as POPT variant (any workload), DBCSR complains:
Apparently DBCSR is built with OpenMP even in the case of CP2K/POPT. In dbcsr_dist_methods.F:204, dist%d%num_threads is set according to the number of OpenMP threads. The suggestion is not to build DBCSR without OpenMP, but to revise the check so that "Thread number has changed" is reported only when the change actually breaks something. It should be possible to build and run DBCSR with and without OpenMP, independent of CP2K's OpenMP flag. I suggest warning and terminating only if the thread count transitions from N to M with N.GT.1 and N.NE.M.
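The proposed condition can be sketched as a small predicate. This is only an illustration of the rule stated above (transition from N to M with N.GT.1 and N.NE.M); the module and function names are hypothetical and do not correspond to any existing DBCSR routine.

```fortran
MODULE thread_check
   ! Hypothetical illustration of the proposed rule, not DBCSR code.
   IMPLICIT NONE
   PRIVATE
   PUBLIC :: thread_change_is_fatal

CONTAINS

   ! Returns .TRUE. only for a transition from n_old to n_new threads
   ! with n_old > 1 and n_old /= n_new.
   PURE FUNCTION thread_change_is_fatal(n_old, n_new) RESULT(fatal)
      INTEGER, INTENT(IN) :: n_old, n_new
      LOGICAL :: fatal

      fatal = (n_old .GT. 1) .AND. (n_old .NE. n_new)
   END FUNCTION thread_change_is_fatal

END MODULE thread_check
```

With this predicate, a distribution recorded with a single thread never triggers the error, whatever thread count is seen later, while a change from 4 to 2 threads still does.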