🐛 Bug

When running `torch.autograd.grad` recursively to compute a Hessian, the code crashes. It does not crash if I multiply my variables by 1.

To Reproduce

Steps to reproduce the behavior (a reconstruction of the `hessian`/`jacobian` helpers follows the list):

1. Run `torch.autograd.set_detect_anomaly(True)` to get better error messages.
2. Try `x = torch.tensor([1.], requires_grad=True); hessian(x * x * 1, x)` and see that it returns 2 correctly.
3. Try `x = torch.tensor([1.], requires_grad=True); hessian(x * x, x)` and see that the code crashes.
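The `hessian` and `jacobian` helpers are not included in the report; the sketch below reconstructs them from the traceback and the Stack Overflow thread linked under Additional context, so the exact details are an assumption:

```python
import torch

def jacobian(y, x, create_graph=False):
    """Full Jacobian of y w.r.t. x, built one grad_outputs basis vector at a time."""
    jac = []
    flat_y = y.reshape(-1)
    grad_y = torch.zeros_like(flat_y)
    for i in range(len(flat_y)):
        grad_y[i] = 1.
        grad_x, = torch.autograd.grad(flat_y, x, grad_y,
                                      retain_graph=True, create_graph=create_graph)
        jac.append(grad_x.reshape(x.shape))  # stores grad_x itself; see the reply below
        grad_y[i] = 0.
    return torch.stack(jac).reshape(y.shape + x.shape)

def hessian(y, x):
    # Jacobian of the Jacobian; create_graph=True keeps the inner graph
    # alive so it can be differentiated a second time.
    return jacobian(jacobian(y, x, create_graph=True), x)
```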
The error message is:
```
Warning: Traceback of forward call that caused the error:
  File "C:\Users\Ludvig\Miniconda3\envs\robustness2019\lib\traceback.py", line 197, in format_stack
    return format_list(extract_stack(f, limit=limit))
(print_stack at ..\torch\csrc\autograd\python_anomaly_mode.cpp:57)
Traceback (most recent call last):
  File "C:\Users\Ludvig\Miniconda3\envs\robustness2019\lib\site-packages\IPython\core\interactiveshell.py", line 3331, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-31-789b47ffdeb8>", line 19, in <module>
    x = torch.tensor([1.],requires_grad=True); hessian(x * x, x) # breaks
  File "<ipython-input-31-789b47ffdeb8>", line 16, in hessian
    return jacobian(jacobian(y, x, create_graph=True), x)
  File "<ipython-input-31-789b47ffdeb8>", line 10, in jacobian
    grad_x, = torch.autograd.grad(flat_y, x, grad_y, retain_graph=True, create_graph=create_graph)
  File "C:\Users\Ludvig\Miniconda3\envs\robustness2019\lib\site-packages\torch\autograd\__init__.py", line 157, in grad
    inputs, allow_unused)
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [1]] is at version 2; expected version 1 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
```
I cannot troubleshoot this further. It seems to happen somewhere in the C++ code?
Expected behavior
I expect that both my runs (with and without multiplying by 1) return the same answer.
Environment

Please copy and paste the output from our environment collection script (or fill out the checklist below manually). You can get the script and run it with:

```
wget https://raw.githubusercontent.com/pytorch/pytorch/master/torch/utils/collect_env.py
# For security purposes, please check the contents of collect_env.py before running it.
python collect_env.py
```

- PyTorch Version (e.g., 1.0): 1.4.0
- OS (e.g., Linux): Windows 10
- How you installed PyTorch (conda, pip, source): conda (cpuonly)
- Build command you used (if compiling from source): n/a

Additional context

Related to this SO thread: https://stackoverflow.com/questions/61308237/cannot-find-in-place-operation-causing-runtimeerror-one-of-the-variables-neede
We use GitHub issues only for bugs or feature requests.
Please use the forum to ask questions: https://discuss.pytorch.org/

For your particular issue, this is not related to PyTorch. What happens is that when you do `x * x` only, the gradient returned inside the `jacobian` computation is the same Tensor as the `grad_y` that is passed as input, and that Tensor is then modified in place to compute the rest of the full Jacobian. But its original value is needed for the double-backward computation.

You can fix this by either:

- making sure the returned gradient is not the same Tensor, by forcing the backward to clone it (this is what happens when you multiply by 1), or
- fixing the `jacobian` code so it does not reuse the original Tensor, by doing `jac.append(grad_x.reshape(x.shape).clone())`, as sketched below.
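Concretely, applying that second fix gives the sketch below; the one changed line is quoted from the reply above, while the surrounding helpers are the reconstruction from earlier, so treat the details as assumptions:

```python
import torch

def jacobian(y, x, create_graph=False):
    jac = []
    flat_y = y.reshape(-1)
    grad_y = torch.zeros_like(flat_y)
    for i in range(len(flat_y)):
        grad_y[i] = 1.
        grad_x, = torch.autograd.grad(flat_y, x, grad_y,
                                      retain_graph=True, create_graph=create_graph)
        # Clone so the stored row no longer aliases grad_x; the original
        # Tensor may be needed intact by the double-backward pass.
        jac.append(grad_x.reshape(x.shape).clone())
        grad_y[i] = 0.
    return torch.stack(jac).reshape(y.shape + x.shape)

def hessian(y, x):
    return jacobian(jacobian(y, x, create_graph=True), x)

x = torch.tensor([1.], requires_grad=True)
print(hessian(x * x, x))  # no RuntimeError; the single Hessian entry is 2
```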