The problem only occurs on pytorch master, because its backprop engine is stricter: it rejects the gradient when running `benchmark.py cpp` (or `benchmark.py cuda`).

This is due to the module's `bias` parameter being of size `3 * state_size`, while the backward pass outputs a tensor of size `1 x 3 * state_size`. The problem is still there with torch 0.4.0, but its backprop engine doesn't complain, as the number of elements is the same.
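A minimal sketch of the mismatch (shapes only; `state_size`, the batch size and the variable names are illustrative, not taken from the repository):

```python
import torch

state_size = 128

# The nn module registers its bias as a flat parameter of size 3 * state_size.
bias = torch.nn.Parameter(torch.randn(3 * state_size))

# The backward pass reduces the per-sample gradients over the batch dimension
# with keepdim=True, which yields a 2-D tensor of size 1 x (3 * state_size).
grad_gates = torch.randn(16, 3 * state_size)
d_bias = grad_gates.sum(dim=0, keepdim=True)

print(bias.shape)    # torch.Size([384])
print(d_bias.shape)  # torch.Size([1, 384]) -- same number of elements,
                     # different shape: 0.4.0 accepts it, master rejects it.
```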
So the solution could be to remove the `keepdim=True` in the `d_bias` computation, e.g. here (but it's the same for the Python baseline, the C++ and the CUDA versions).

But then you get the opposite error message when running `check.py` and `grad_check.py`.
This is because now the bias given to the function is of size `1 x 15`!
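A shape-only sketch of that opposite mismatch (the `1 x 15` value corresponds to `3 * state_size` with `state_size = 5`; variable names are illustrative):

```python
import torch

state_size = 5

# check.py and grad_check.py pass the bias to the function as a 1 x 15 tensor ...
bias = torch.randn(1, 3 * state_size, requires_grad=True)

# ... but without keepdim=True the backward now returns a flat gradient of
# size 15, which no longer matches that 1 x 15 input.
grad_gates = torch.randn(4, 3 * state_size)
d_bias = grad_gates.sum(dim=0)

print(bias.shape)    # torch.Size([1, 15])
print(d_bias.shape)  # torch.Size([15])
```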
The solution is pretty simple, but it needs a decision on what to do: either

- give the `bias` parameter of every nn module the dimension `1 x ...`, or
- fix `check.py` and `grad_check.py` and remove the `keepdim=True` arguments when computing the `d_bias` sums (see the sketch below).

Thanks for fixing it. However, you forgot to change it in `python/lltm_baseline.py`. That module is actually never called from `benchmark.py` or `[grad_]check.py`, so it doesn't trigger an error there, but if you call it you will hit the same problem.
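For reference, a hedged sketch of what that change could look like in the Python baseline's backward; this is a simplified stand-in function, not the actual code from `python/lltm_baseline.py`:

```python
import torch

class ToyLLTMFunction(torch.autograd.Function):
    """Simplified stand-in for the LLTM function; only the bias handling matters here."""

    @staticmethod
    def forward(ctx, input, weights, bias):
        # gates has size batch x (3 * state_size), as in the LLTM example
        gates = input.mm(weights.t()) + bias
        ctx.save_for_backward(input, weights)
        return gates

    @staticmethod
    def backward(ctx, grad_gates):
        input, weights = ctx.saved_tensors
        d_input = grad_gates.mm(weights)
        d_weights = grad_gates.t().mm(input)
        # Second option from the list above: sum WITHOUT keepdim=True, so d_bias
        # is a flat vector of size 3 * state_size, matching the flat bias parameter.
        d_bias = grad_gates.sum(dim=0)
        return d_input, d_weights, d_bias


batch, features, state_size = 16, 32, 128
input = torch.randn(batch, features)
weights = torch.randn(3 * state_size, features, requires_grad=True)
bias = torch.randn(3 * state_size, requires_grad=True)  # flat, as in the nn module

out = ToyLLTMFunction.apply(input, weights, bias)
out.sum().backward()
print(bias.grad.shape)  # torch.Size([384]) -- matches bias.shape
```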