RuntimeError: CUDA error: invalid configuration argument #702
Comments
Which version of CUDA and PyTorch are you using?
CUDA 11.1 and PyTorch 1.10.0
Could you switch to another CUDA version, e.g., CUDA 10.2?
Most people who hit this issue are using CUDA 11.1.
Sure, I will try. Thanks for the suggestion!
For future reference, the following issues are related to this one using CUDA 11.1
Looks like this is most likely a PyTorch bug that we just happen to be triggering, so it would probably be easiest to try different versions of PyTorch and/or CUDA, because we would not be able to fix this ourselves.
After we switched to CUDA 10.2, the issue was resolved. Thanks a lot!
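Since the fix in this thread came down to which CUDA build PyTorch was using, it may help to include the exact versions when reporting a similar error. The following is only a minimal sketch: the helper name collect_versions is hypothetical and not part of this project, and it assumes nothing beyond the standard torch.__version__, torch.version.cuda and torch.cuda.is_available() attributes. It falls back to None values if PyTorch is not installed.

```python
import importlib
import platform


def collect_versions():
    """Return the Python, PyTorch and CUDA versions seen by this interpreter.

    Hypothetical helper for bug reports; not part of the project in this thread.
    """
    info = {
        "python": platform.python_version(),
        "torch": None,
        "cuda": None,
        "cuda_available": None,
    }
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # PyTorch is not installed in this environment; report what we can.
        return info
    info["torch"] = torch.__version__
    # CUDA version PyTorch was built against (None for CPU-only builds).
    info["cuda"] = torch.version.cuda
    info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in collect_versions().items():
        print(f"{key}: {value}")
```

PyTorch also ships a more detailed report via python -m torch.utils.collect_env, which is probably the better choice when filing an issue upstream.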
(We can use |
When I tried a zipformer (pruned_transducer_stateless7) on spgispeech, I did the following:
I got the following error after training had run for a while:
It does not seem to be an OOM error. With --max-duration 300, the error can happen at batch 50. On the other hand, if I use --max-duration 100 as the default, training runs fine for many batches, but GPU memory usage is very low. Do you know what the issue may be?
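One general way to narrow down where a CUDA error like this originates is to rerun with synchronous kernel launches, so that the Python stack trace points closer to the failing operation. The sketch below is an assumption about how one might set that up, not something suggested in this thread: CUDA_LAUNCH_BLOCKING is a standard CUDA environment variable, while the helper name debug_env is hypothetical. The variable has to be set before CUDA is initialized, so it is applied to the environment of a fresh process.

```python
import os


def debug_env(base=None):
    """Return a copy of the environment with synchronous CUDA launches enabled.

    Hypothetical helper; pass the result as env= to subprocess.run when
    relaunching the training script for debugging.
    """
    env = dict(os.environ if base is None else base)
    # Make kernel launches synchronous so errors surface at the call site.
    # This slows training down, so use it only while debugging.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env
```

This does not fix the error; it may only make the failing call easier to identify before trying a different PyTorch or CUDA version.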