CudnnRNN is not differentiable twice #1
Hi, I solved this problem by switching to CUDA 9.0 and reinstalling PyTorch. Another thing to try is using LSTMCell instead of LSTM. Best,
Thanks for the fast feedback. Sadly, still no luck. I switched my GRU layer to a GRUCell, which only changed the error to `RuntimeError: GRUFused is not differentiable twice`. Since I'm working in an environment where I can't easily change the CUDA version (currently 7.5), using a different CUDA version is not an option. Are you sure that would solve the problem?
I'm really hoping to solve this issue, since I need to implement this using an RNN policy.
I'm pretty positive that changing the CUDA version will solve the problem if you are using GRUCell, since that was my case and I didn't change a single line of code. Alternatively, you can use PPO instead of TRPO, which should give you similar performance.
I did change from TRPO to PPO. I will compare results, but it seems to train fine. Thank you!
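For context on why PPO sidesteps this error: TRPO's Fisher-vector product differentiates the policy a second time (`create_graph=True` on the first gradient), which is exactly what the cuDNN RNN kernel does not support, while PPO only ever backpropagates once. The sketch below is a minimal, hedged illustration of that double-backward pattern (not the repository's `Fvp_fim` code); it uses a `GRUCell` on CPU, where recent PyTorch versions support second derivatives.

```python
import torch
import torch.nn as nn

# Minimal sketch of the double-backward pattern TRPO relies on.
# The scalar objective and the direction vector `v` are stand-ins,
# not the repository's actual KL/Fisher computation.
torch.manual_seed(0)
cell = nn.GRUCell(4, 8)
x = torch.randn(3, 4)
h = torch.zeros(3, 8)

out = cell(x, h).pow(2).mean()          # stand-in scalar objective
params = list(cell.parameters())

# First differentiation: keep the graph so we can differentiate again.
grads = torch.autograd.grad(out, params, create_graph=True)
flat_grad = torch.cat([g.reshape(-1) for g in grads])

# Second differentiation: gradient of (grad . v) w.r.t. the parameters,
# i.e. a Hessian-vector product. This is the step that fails with the
# "CudnnRNN is not differentiable twice" error on a cuDNN-backed RNN.
v = torch.randn_like(flat_grad)
gv = (flat_grad * v).sum()
hvp = torch.autograd.grad(gv, params)
flat_hvp = torch.cat([g.reshape(-1) for g in hvp])
```

PPO, by contrast, optimizes a clipped surrogate loss with ordinary first-order gradients, so the cuDNN kernel's missing double backward never matters.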
Hi,
First of all thank you very much for providing this great repository!
I am currently implementing GAIL using an MLP as well as an RNN policy net (two different experiments). The MLP network works as intended, but if I switch to the RNN policy I get a `RuntimeError: CudnnRNN is not differentiable twice` during execution of this line in `core.trpo.Fvp_fim`:
The only difference between my MLP and RNN policy implementations is the initialization of the hidden state during `get_log_prob` and `get_fim` within my policy class. Given your recent commit (d66765eecad38ddc3f6e0f33d35ef70a7ed11892), I thought the network only differentiates once during TRPO.
Am I doing something wrong, or is the network still differentiating twice?
Thank you very much!
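One workaround that avoids changing the CUDA version entirely, sketched below under the assumption of a reasonably recent PyTorch: route the RNN forward pass off the cuDNN kernel, since PyTorch's native RNN implementation is built from ordinary differentiable ops and supports double backward. The `torch.backends.cudnn.flags` context manager is a real PyTorch API; the network and loss here are illustrative stand-ins, not the repository's policy.

```python
import torch
import torch.nn as nn

# Workaround sketch: disable cuDNN around the RNN so both backward
# passes hit the double-differentiable native implementation.
# On CPU the flag is a no-op, so this also runs without a GPU.
torch.manual_seed(0)
rnn = nn.GRU(input_size=4, hidden_size=8, batch_first=True)
x = torch.randn(2, 5, 4)

with torch.backends.cudnn.flags(enabled=False):
    out, _ = rnn(x)
    loss = out.pow(2).mean()             # stand-in objective
    grads = torch.autograd.grad(loss, rnn.parameters(),
                                create_graph=True)
    # Second backward over a scalar of the gradients now succeeds,
    # where the cuDNN kernel would raise "not differentiable twice".
    gnorm = sum(g.pow(2).sum() for g in grads)
    hv = torch.autograd.grad(gnorm, list(rnn.parameters()))
```

The trade-off is speed: the native RNN path is slower than cuDNN, but only the forward passes that feed the Fisher-vector product need to be wrapped this way.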