CUDA error when starting training: which CUDA version do you use? #14

Closed
elegentbamboo opened this issue Sep 4, 2021 · 1 comment

Comments

@elegentbamboo

I am using a single RTX 3080 GPU with CUDA 10.1, cuDNN 7.6.3, and torch 1.4.0, as specified in requirements.txt.
When I ran train.py to train the transfuser model, I got this error: RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED.
After searching for this error, I set torch.backends.cudnn.enabled = False, but then a different error appeared: RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc). I suspect a version mismatch.

@ap229997
Collaborator

ap229997 commented Sep 4, 2021

I'd suggest updating CUDA, cuDNN, and torch to the latest versions. I've seen older CUDA versions produce errors when running on an RTX 3080.
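
For anyone hitting the same errors, here is a rough sketch of how one might check whether the installed torch build can target the GPU at all. It is not part of this repository. The helper names (`AMPERE_MIN_CUDA`, `parse_version`, `build_supports_gpu`, `report`) are made up for illustration, and the table only covers the two Ampere compute capabilities: 8.6 (the RTX 3080) needs CUDA 11.1 or newer, and 8.0 needs CUDA 11.0 or newer. Any other GPU is reported as unknown rather than guessed.

```python
# Minimum CUDA toolkit version with native support for each Ampere compute
# capability. Only these two entries are filled in (an assumption made for
# this sketch); anything else is treated as "unknown".
AMPERE_MIN_CUDA = {
    (8, 0): (11, 0),  # e.g. A100
    (8, 6): (11, 1),  # e.g. RTX 3080
}


def parse_version(version):
    """Turn a string such as "10.1" into a comparable tuple such as (10, 1)."""
    return tuple(int(part) for part in version.split(".")[:2])


def build_supports_gpu(cuda_version, capability):
    """Return True/False for Ampere GPUs, or None if the capability is not in the table."""
    required = AMPERE_MIN_CUDA.get(tuple(capability))
    if required is None:
        return None
    return parse_version(cuda_version) >= required


def report():
    """Print the CUDA version torch was built with and whether it covers GPU 0."""
    import torch  # imported here so the helpers above work without torch installed

    if torch.version.cuda is None or not torch.cuda.is_available():
        print("No usable CUDA device, or a CPU-only torch build.")
        return
    capability = torch.cuda.get_device_capability(0)
    print("torch", torch.__version__, "built with CUDA", torch.version.cuda)
    print("GPU 0:", torch.cuda.get_device_name(0), "compute capability", capability)
    print("build supports this GPU:", build_supports_gpu(torch.version.cuda, capability))
```

Calling `report()` in the training environment prints the versions torch actually sees. With the setup described in this issue, `build_supports_gpu("10.1", (8, 6))` returns `False`, which fits the suggestion to upgrade.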
