Open
Description
There are two potential issues with DBCSR's init/fini flow wrt ACC interface and LIBSMM:
- DBCSR calls `acc_init()`, which eventually calls `libsmm_acc_init()` internally. This is a potential issue because the backend (CUDA/HIP, OpenCL, etc.) then depends on LIBSMM. The latter routine (`libsmm_acc_init`) was only recently introduced, and everything here applies equally to the respective finalize routine. However, the expected flow (IMHO) is that DBCSR calls both `acc_init()` and `libsmm_acc_init()`, since both interfaces are used (and DBCSR does not need to reason about a potential internal dependency between the ACC and LIBSMM implementations). Then, `libsmm_acc_init()` may call `acc_init()` internally if the ACC interface is used to implement LIBSMM (note, this dependency is somewhat expected). In turn, all init/finalize routines should be safe against multiple/over-initialization.
- DBCSR calls `dbcsr_multiply_lib_init()` inside of a parallel region, and subsequent (sub-)initialization may be called only by the master thread. For instance, `acc_init()` is called by the master thread only (OMP MASTER construct). However, `acc_finalize()` is apparently called by all threads of a parallel region (workshare). The latter is inconsistent with a "tandem flow" of init/finalize. The expected behavior (IMHO) is to protect `acc_init`/`acc_finalize` and `libsmm_acc_init`/`libsmm_acc_finalize` against multiple threads, i.e., ACC and LIBSMM implementations are only expected to be thread-safe after initialization and before finalization.
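
To make the proposed flow concrete, here is a minimal sketch in C of init/finalize routines that tolerate repeated and concurrent calls. It is not DBCSR's code: all `demo_*` names, the reference counters, and the pthread mutexes are assumptions for illustration only. It shows the LIBSMM side calling the ACC side (not the other way around), with only the first init and the last finalize doing real work.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for acc_init/acc_finalize and
 * libsmm_acc_init/libsmm_acc_finalize; not the real DBCSR routines. */
static pthread_mutex_t demo_acc_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t demo_libsmm_lock = PTHREAD_MUTEX_INITIALIZER;
static int demo_acc_refcount = 0;      /* balanced init/finalize calls (ACC) */
static int demo_libsmm_refcount = 0;   /* balanced init/finalize calls (LIBSMM) */
static int demo_backend_setups = 0;    /* how often the backend was really set up */
static int demo_backend_teardowns = 0; /* how often it was really torn down */

int demo_acc_init(void) {
  pthread_mutex_lock(&demo_acc_lock);
  assert(0 <= demo_acc_refcount);
  /* Only the first caller performs the actual backend setup. */
  if (0 == demo_acc_refcount++) ++demo_backend_setups;
  pthread_mutex_unlock(&demo_acc_lock);
  return 0;
}

int demo_acc_finalize(void) {
  int result = 0;
  pthread_mutex_lock(&demo_acc_lock);
  if (0 >= demo_acc_refcount) {
    result = -1; /* finalize without a matching init */
  } else if (0 == --demo_acc_refcount) {
    ++demo_backend_teardowns; /* last caller performs the actual teardown */
  }
  pthread_mutex_unlock(&demo_acc_lock);
  return result;
}

int demo_libsmm_acc_init(void) {
  int result = 0;
  /* Lock order is always LIBSMM first, then ACC, so the nested call is safe. */
  pthread_mutex_lock(&demo_libsmm_lock);
  if (0 == demo_libsmm_refcount++) result = demo_acc_init();
  pthread_mutex_unlock(&demo_libsmm_lock);
  return result;
}

int demo_libsmm_acc_finalize(void) {
  int result = 0;
  pthread_mutex_lock(&demo_libsmm_lock);
  if (0 >= demo_libsmm_refcount) {
    result = -1; /* finalize without a matching init */
  } else if (0 == --demo_libsmm_refcount) {
    result = demo_acc_finalize(); /* release the reference taken at init */
  }
  pthread_mutex_unlock(&demo_libsmm_lock);
  return result;
}
```

With routines of this shape, the caller can invoke both init routines (and later both finalize routines) from any thread and in either order, as long as the calls are balanced. Whether reference counting or a simple "already initialized" flag is preferable depends on how strictly balanced calls can be guaranteed.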