EWC fails with RNNs using CUDA #736
Comments
It is probably better to keep the eval mode for all the modules and add an exception for RNNs. We don't want the train mode for modules such as dropout or batch normalization.

@AntonioCarta in that case should the line in

I agree.

@iacobo thanks for reporting! Are you willing to submit a PR with the solution? In my opinion, we could raise a warning in case of RNN + CUDA, explaining the problem to the user, and then use the train mode instead of failing.

@AndreaCossu Yep, sure thing.
Fix #736 (CUDA bug with RNNs using EWC)
🐛 Describe the bug
When training a model containing an RNN/LSTM/GRU layer with the EWC strategy on CUDA, an error is raised, since RNN-like layers in PyTorch do not support `backward` calls on CUDA devices while in `eval` mode.

Trace:
Source of error:
https://avalanche-api.continualai.org/_modules/avalanche/training/plugins/ewc/#EWCPlugin
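The restriction comes from cuDNN's fused RNN kernels, so one user-side workaround (an assumption for illustration, not the Avalanche fix) is to disable cuDNN around the offending forward/backward. A minimal sketch:

```python
import torch
import torch.nn as nn

# Sketch of a user-side workaround: the "cudnn RNN backward can only
# be called in training mode" error comes from cuDNN's fused RNN
# kernels, so disabling cuDNN for this region sidesteps it even with
# the model left in eval mode.
model = nn.LSTM(input_size=4, hidden_size=8)
model.eval()

x = torch.randn(5, 1, 4)  # (seq_len, batch, input_size)
with torch.backends.cudnn.flags(enabled=False):
    out, _ = model(x)
    out.sum().backward()  # on CUDA, this raises if cuDNN is enabled
```

On CPU the snippet runs either way; the `torch.backends.cudnn.flags` context only matters on a CUDA device, where it trades the fused kernels for the slower native implementation.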
🐜 To Reproduce
Set `device` to the GPU.

🐝 Expected behavior
RNN models to run with EWC without error on GPU.
Additional context
This appears to be the opposite behaviour of the `EWCPlugin` as defined in: https://avalanche-api.continualai.org/_modules/avalanche/training/plugins/#EWCPlugin
🐞 Potential fix
If this is just a typo, the above would be fixed by changing `model.eval()` to `model.train()`.
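Following the suggestion in the comments above (keep eval mode globally, but exempt RNN-like modules, with a warning), a hedged sketch of such a helper; the name `eval_except_rnn` is hypothetical and not part of Avalanche:

```python
import warnings
import torch.nn as nn

def eval_except_rnn(model: nn.Module) -> nn.Module:
    """Hypothetical helper: put the whole model in eval mode, then
    switch RNN-like modules back to train mode, since cuDNN does not
    support RNN backward in eval mode on CUDA."""
    model.eval()
    for module in model.modules():
        if isinstance(module, nn.RNNBase):  # covers nn.RNN, nn.LSTM, nn.GRU
            warnings.warn(
                "RNN module kept in train mode: cuDNN RNN backward "
                "requires training mode on CUDA."
            )
            module.train()
    return model

# Usage: dropout (and batch norm) stay in eval mode, the LSTM does not.
model = nn.Sequential(nn.LSTM(4, 8), nn.Dropout(0.5))
eval_except_rnn(model)
```

This keeps the behaviour of `eval()` for the modules it actually matters for (dropout, batch normalization) while satisfying cuDNN's training-mode requirement for recurrent layers.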