Describe the bug
For the example below, the total time of the LCAO SCF calculation with scalapack_gvx is about 3 times longer than with genelpa, and the diagonalization step (hamiltSolvePsiK) is about 7 times longer.
The detailed time cost is:

The left one is for scalapack_gvx and the right one is for genelpa.
006_Ti15.zip
Expected behavior
No response
To Reproduce
Set ks_solver to scalapack_gvx and to genelpa in turn, and run each job with the command OMP_NUM_THREADS=1 mpirun -np 16 abacus
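The solver switch can be scripted so both runs use otherwise identical inputs. This is a minimal sketch, assuming the INPUT file holds one whitespace-separated "key value" pair per line; the helper name set_ks_solver is hypothetical and not part of ABACUS.

```python
import re


def set_ks_solver(input_text: str, solver: str) -> str:
    """Return input_text with the ks_solver value set to `solver`.

    Assumption: one whitespace-separated `key value` pair per line.
    If no ks_solver line exists, one is appended at the end.
    """
    pattern = re.compile(r"^([ \t]*ks_solver[ \t]+)\S+", flags=re.MULTILINE)
    if pattern.search(input_text):
        # Replace only the value; keep indentation and any trailing comment.
        return pattern.sub(lambda m: m.group(1) + solver, input_text)
    return input_text.rstrip("\n") + f"\nks_solver {solver}\n"


if __name__ == "__main__":
    # Illustrative INPUT content, not the file shipped in 006_Ti15.zip.
    sample = "INPUT_PARAMETERS\nbasis_type lcao\nks_solver genelpa\n"
    for solver in ("scalapack_gvx", "genelpa"):
        print(f"--- INPUT for {solver} ---")
        print(set_ks_solver(sample, solver), end="")
        # Each variant is then run with the command from the report:
        print("OMP_NUM_THREADS=1 mpirun -np 16 abacus")
```

Writing each variant into its own copy of the job directory keeps the two timing logs separate for comparison.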
Environment
Bohrium image: registry.dp.tech/dptech/abacus:3.1.0
Additional Context
No response