Consistency problem with check and modules regarding bias #10

Closed

ClementPinard opened this issue Jun 11, 2018 · 3 comments

@ClementPinard (Contributor)

The problem only occurs on PyTorch master, because its backprop engine is stricter about gradient shapes.
When running benchmark.py cpp (or cuda):

Traceback (most recent call last):
  File "benchmark.py", line 43, in <module>
    (new_h.sum() + new_C.sum()).backward()
  File "/home/cpinard/anaconda3/lib/python3.6/site-packages/torch/tensor.py", line 93, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/cpinard/anaconda3/lib/python3.6/site-packages/torch/autograd/__init__.py", line 89, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: Function LLTMFunctionBackward returned an invalid gradient at index 2 - expected shape [384] but got [1, 384]

This is because the module's bias parameter has size 3 * state_size while the backward outputs a tensor of size 1 x 3 * state_size. The problem also exists on torch 0.4.0, but there the backprop engine doesn't complain since the number of elements is the same.

So one solution could be to remove the keepdim=True in the d_bias computation, e.g. here (but it's the same for the Python baseline, cpp and cuda versions).
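To make the shape mismatch concrete, here is a minimal standalone sketch (not the repo's actual backward code; state_size and d_gates are just illustrative names) showing how keepdim changes the shape of the summed bias gradient:

    import torch

    state_size = 128
    # Pretend this is the gradient w.r.t. the gate pre-activations: [batch, 3 * state_size]
    d_gates = torch.randn(16, 3 * state_size)

    d_bias_keepdim = d_gates.sum(dim=0, keepdim=True)  # shape [1, 384] -> mismatches a 1D bias
    d_bias_flat = d_gates.sum(dim=0)                    # shape [384]    -> matches a 1D bias
    print(d_bias_keepdim.shape, d_bias_flat.shape)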

But then you get the opposite error message when running check.py and grad_check.py:

Traceback (most recent call last):
  File "check.py", line 107, in <module>
    check_backward(variables, options.cuda, options.verbose)
  File "check.py", line 53, in check_backward
    (baseline_values[0] + baseline_values[1]).sum().backward()
  File "/home/cpinard/anaconda3/lib/python3.6/site-packages/torch/tensor.py", line 93, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/cpinard/anaconda3/lib/python3.6/site-packages/torch/autograd/__init__.py", line 89, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: Function LLTMFunctionBackward returned an invalid gradient at index 2 - expected shape [1, 15] but got [15]

This is because the bias given to the function is now of size 1 x 15!

The fix is pretty simple, but requires deciding between two options (a rough sketch of both follows this list):

  • either make the bias parameter of every nn module of dimension 1 x ...
  • or squeeze bias in check.py and grad_check.py and remove the keepdim=True arguments when computing the d_bias sums.
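A rough sketch of what each option means in code (assuming the tutorial's 3 * state_size bias layout; names here are illustrative, not the repo's exact code):

    import torch
    import torch.nn as nn

    state_size = 5                    # 3 * state_size = 15, as in the check.py error above
    d_gates = torch.randn(8, 3 * state_size)

    # Option 1: declare the bias as 2D so it matches a keepdim=True reduction.
    bias_2d = nn.Parameter(torch.zeros(1, 3 * state_size))
    d_bias_2d = d_gates.sum(dim=0, keepdim=True)   # shape [1, 15]

    # Option 2: keep the bias 1D and drop keepdim in the backward reduction
    # (and squeeze the bias tensor used by check.py / grad_check.py).
    bias_1d = nn.Parameter(torch.zeros(3 * state_size))
    d_bias_1d = d_gates.sum(dim=0)                 # shape [15]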
@goldsborough (Contributor)

We recently added code to verify the gradient shape (pytorch/pytorch#8168), so it's expected that this would break. I'll fix it
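For context, a tiny self-contained example (unrelated to LLTM) of the kind of mismatch that this check rejects, assuming a PyTorch build that includes it:

    import torch

    class Doubler(torch.autograd.Function):
        @staticmethod
        def forward(ctx, x):
            return x * 2

        @staticmethod
        def backward(ctx, grad_out):
            # Correct values, wrong shape: the extra leading dim makes newer
            # PyTorch raise "returned an invalid gradient ... expected shape
            # [15] but got [1, 15]".
            return (grad_out * 2).unsqueeze(0)

    x = torch.randn(15, requires_grad=True)
    Doubler.apply(x).sum().backward()  # RuntimeError on builds with the shape check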

@goldsborough (Contributor)

Fixed on master

@ClementPinard (Contributor, Author)

Thanks for fixing it. However, you forgot to change it in python/lltm_baseline.py. The module there is never actually called, whether from benchmark.py or [grad_]check.py, so it doesn't trigger an error, but if you call it yourself you will hit the same problem.
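For completeness, a hedged sketch of the relevant change in a baseline-style module (illustrative only; the real python/lltm_baseline.py differs in detail, and the forward/backward code is omitted):

    import torch
    import torch.nn as nn

    class LLTMBaselineSketch(nn.Module):
        # Illustrative only: shows the bias shape choice, not the full module.
        def __init__(self, input_features, state_size):
            super().__init__()
            self.weights = nn.Parameter(
                torch.randn(3 * state_size, input_features + state_size))
            # Keep the bias 1D (3 * state_size) so it lines up with a backward
            # that reduces over the batch dimension without keepdim=True.
            self.bias = nn.Parameter(torch.randn(3 * state_size))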
