forked from abacusmodeling/abacus-develop
Closed
Labels
GPU & DCU & HPC (any issues related to GPU, DCU, and HPC)
Description
Describe the bug
When I set ks_solver=cusolver, ABACUS quits with a CUDA error, even for a simple SCF task. From my testing, this seems to have started after PR #5225. Maybe @Cstandardlib can have a look?
START CHARGE : atomic
DONE(1.21413 SEC) : INIT SCF
* * * * * *
<< Start SCF iteration.
CUDA error at /home/itztony/Documents/Research/Coding/abacus-develop/source/module_hsolver/kernels/cuda/diag_cusolver.cu:132 code=1(cudaErrorInvalidValue) "cudaMemcpy(d_A2, A, sizeof(cuDoubleComplex) * lda * m, cudaMemcpyHostToDevice)"
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[48991,1],0]
Exit code: 1
--------------------------------------------------------------------------
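For anyone triaging this, the failing call in the log is a host-to-device cudaMemcpy of sizeof(cuDoubleComplex) * lda * m bytes, and cudaErrorInvalidValue generally means one of its arguments was rejected. The host-only C++ sketch below lists a few argument problems that commonly lead to that error (a null pointer, a non-positive dimension, a byte count that overflows, a destination allocation smaller than the copy). It is not ABACUS code and does not claim to identify the actual cause here: check_copy_args, dst_bytes, and the 16-byte element size (standing in for sizeof(cuDoubleComplex)) are assumptions made for illustration.

```cpp
#include <cstddef>
#include <limits>
#include <string>

// Hypothetical diagnostic helper (not part of ABACUS or the CUDA runtime).
// Returns an empty string if the arguments of a copy of an lda x m matrix
// look sane, otherwise a short description of the first problem found.
// elem_size defaults to 16 bytes, assumed equal to sizeof(cuDoubleComplex).
std::string check_copy_args(const void* dst, std::size_t dst_bytes,
                            const void* src, long long lda, long long m,
                            std::size_t elem_size = 16)
{
    if (dst == nullptr) return "destination pointer is null";
    if (src == nullptr) return "source pointer is null";
    // A negative int multiplied by sizeof(...) wraps to a huge size_t.
    if (lda <= 0 || m <= 0) return "non-positive matrix dimension";

    const std::size_t max = std::numeric_limits<std::size_t>::max();
    const std::size_t ulda = static_cast<std::size_t>(lda);
    const std::size_t um = static_cast<std::size_t>(m);

    // Compute elem_size * lda * m step by step, checking for overflow.
    if (ulda > max / elem_size) return "byte count overflows size_t";
    const std::size_t row_bytes = elem_size * ulda;
    if (um > max / row_bytes) return "byte count overflows size_t";
    const std::size_t bytes = row_bytes * um;

    if (bytes > dst_bytes) return "copy is larger than the destination allocation";
    return "";
}
```

Calling such a check right before the copy, with dst_bytes set to the size that was passed to the matching device allocation, would show whether lda and m at the call site still agree with the size d_A2 was allocated for.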
Expected behavior
No response
To Reproduce
No response
Environment
No response
Additional Context
No response
Task list for Issue attackers (only for developers)
- Verify the issue is not a duplicate.
- Describe the bug.
- Steps to reproduce.
- Expected behavior.
- Error message.
- Environment details.
- Additional context.
- Assign a priority level (low, medium, high, urgent).
- Assign the issue to a team member.
- Label the issue with relevant tags.
- Identify possible related issues.
- Create a unit test or automated test to reproduce the bug (if applicable).
- Fix the bug.
- Test the fix.
- Update documentation (if necessary).
- Close the issue and inform the reporter (if applicable).
Cstandardlib