test_square_sparse_rma.perf test in make test #238
Comments
Unfortunately, the problem is OpenMPI, which is buggy with RMA...
BTW, I see that you are using a single node. Could you check without MPI, i.e. cmake -DUSE_MPI=OFF -DUSE_CUDA=ON -DUSE_CUBLAS=ON -DWITH_GPU=P100?
@alazzaro
For 1 node there is no need to use MPI with DBCSR, and the performance is the same (as expected). At this point, my suspicion that OpenMPI is buggy with RMA seems to be the reason for your problem.
Closing this issue: cmake now detects the OpenMPI version and skips the RMA test if OpenMPI doesn't support it (#307).
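A version gate like the one described in the closing comment could be sketched as follows. This is not the logic added in #307: the function names and the `min_supported` threshold are hypothetical, and this thread does not say from which Open MPI version RMA is considered usable. The sketch only parses the `Open MPI:` line of `ompi_info`-style output, like the listing in the report.

```python
import re


def parse_openmpi_version(ompi_info_text):
    """Return the Open MPI version as a tuple of ints, or None if not found."""
    # Match a line such as "Open MPI: 2.1.1"; lines like
    # "Open MPI repo revision: ..." do not match because of the colon position.
    match = re.search(
        r"^\s*Open MPI:\s*(\d+)\.(\d+)\.(\d+)", ompi_info_text, re.MULTILINE
    )
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def should_run_rma_test(ompi_info_text, min_supported):
    """Decide whether to run the RMA test.

    min_supported is a hypothetical (major, minor, patch) threshold chosen by
    the caller; the real cut-off is not given in this thread.
    """
    version = parse_openmpi_version(ompi_info_text)
    if version is None:
        # Not Open MPI, or unparseable output: this sketch does not skip.
        return True
    return version >= min_supported


# Excerpt of the ompi_info output from the report.
sample = """Package: Open MPI buildd@lcy01-amd64-009 Distribution
Open MPI: 2.1.1
Open MPI repo revision: v2.1.0-100-ga2fdb5b
"""

print(parse_openmpi_version(sample))
```

With the reporter's installation, `should_run_rma_test(sample, (3, 0, 0))` would skip the test, where `(3, 0, 0)` is only a placeholder threshold.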
Describe the bug
When I build the CUDA version from the v2.0.0-rc7 tag and run make test, the run gets stuck at test_square_sparse_rma.perf. How long should this test take to finish, or is it simply hung?
To Reproduce
Steps to reproduce the behavior:
Test project /home/soga/dbcsr-v2.0.0-rc7/build
Start 1: dbcsr_perf:inputs/test_H2O.perf
1/18 Test #1: dbcsr_perf:inputs/test_H2O.perf ....................... Passed 22.88 sec
Start 2: dbcsr_perf:inputs/test_rect1_dense.perf
2/18 Test #2: dbcsr_perf:inputs/test_rect1_dense.perf ............... Passed 524.70 sec
Start 3: dbcsr_perf:inputs/test_rect1_sparse.perf
3/18 Test #3: dbcsr_perf:inputs/test_rect1_sparse.perf .............. Passed 388.77 sec
Start 4: dbcsr_perf:inputs/test_rect2_dense.perf
4/18 Test #4: dbcsr_perf:inputs/test_rect2_dense.perf ............... Passed 460.63 sec
Start 5: dbcsr_perf:inputs/test_rect2_sparse.perf
5/18 Test #5: dbcsr_perf:inputs/test_rect2_sparse.perf .............. Passed 441.30 sec
Start 6: dbcsr_perf:inputs/test_square_dense.perf
6/18 Test #6: dbcsr_perf:inputs/test_square_dense.perf .............. Passed 323.97 sec
Start 7: dbcsr_perf:inputs/test_square_sparse.perf
7/18 Test #7: dbcsr_perf:inputs/test_square_sparse.perf ............. Passed 274.20 sec
Start 8: dbcsr_perf:inputs/test_square_sparse_bigblocks.perf
8/18 Test #8: dbcsr_perf:inputs/test_square_sparse_bigblocks.perf ... Passed 281.11 sec
Start 9: dbcsr_perf:inputs/test_square_sparse_rma.perf
... stuck here ...
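One way to tell a hang from a slow test is to rerun the single test under a time limit. The sketch below uses Python's `subprocess.run` with its `timeout` argument; the helper name `run_with_timeout` and any limit passed to it are illustrative, not part of DBCSR.

```python
import subprocess
import sys


def run_with_timeout(command, limit_seconds):
    """Run command; return its exit code, or None if it exceeded the limit."""
    try:
        completed = subprocess.run(command, timeout=limit_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the direct child process before raising.
        return None
    return completed.returncode


# Harmless self-check: a child that exits immediately returns code 0.
print(run_with_timeout([sys.executable, "-c", "pass"], 30))
```

From the build directory, a call such as `run_with_timeout(["ctest", "-R", "test_square_sparse_rma", "--output-on-failure"], 1800)` would rerun only the stuck test. The 1800-second limit is an arbitrary choice, picked to be well above the 524.70 sec of the slowest test that passed.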
Environment:
Operating system & version: 'Ubuntu 18.04.2 LTS'
Compiler vendor & version:
Build environment (make or cmake): cmake version 3.15.3
Configuration of DBCSR (either the cmake flags or the Makefile.inc): cmake -DUSE_CUDA=ON -DUSE_CUBLAS=ON -DWITH_GPU=P100
MPI implementation and version:
Package: Open MPI buildd@lcy01-amd64-009 Distribution
Open MPI: 2.1.1
Open MPI repo revision: v2.1.0-100-ga2fdb5b
Open MPI release date: May 10, 2017
Open RTE: 2.1.1
Open RTE repo revision: v2.1.0-100-ga2fdb5b
Open RTE release date: May 10, 2017
OPAL: 2.1.1
OPAL repo revision: v2.1.0-100-ga2fdb5b
OPAL release date: May 10, 2017
MPI API: 3.1.0
Ident string: 2.1.1
Prefix: /usr
Configured architecture: x86_64-pc-linux-gnu
Configure host: lcy01-amd64-009
Configured by: buildd
Configured on: Mon Feb 5 19:59:59 UTC 2018
Configure host: lcy01-amd64-009
Built by: buildd
Built on: Mon Feb 5 20:05:56 UTC 2018
Built host: lcy01-amd64-009
C bindings: yes
C++ bindings: yes
Fort mpif.h: yes (all)
Fort use mpi: yes (full: ignore TKR)
Fort use mpi size: deprecated-ompi-info-value
Fort use mpi_f08: yes
Fort mpi_f08 compliance: The mpi_f08 module is available, but due to
limitations in the gfortran compiler, does not
support the following: array subsections, direct
passthru (where possible) to underlying Open MPI's
C functionality
Fort mpi_f08 subarrays: no
Java bindings: yes
Wrapper compiler rpath: disabled
C compiler: gcc
C compiler absolute: /usr/bin/gcc
C compiler family name: GNU
C compiler version: 7.3.0
C++ compiler: g++
C++ compiler absolute: /usr/bin/g++
Fort compiler: gfortran
Fort compiler abs: /usr/bin/gfortran
If CUDA is being used: CUDA version and GPU architecture: CUDA 10.1, GPU Card 1080Ti
BLAS/LAPACK implementation and version
If applicable: Runtime information (how many nodes, type of nodes, ...): one node only.
Thanks,
Vitesse.