Error with inplace operations due to recent changes in pytorch #36

Closed
jdollinger-bit opened this issue Jul 17, 2023 · 1 comment · Fixed by #37

Comments

@jdollinger-bit
Contributor

Hi, running the following script with torch==2.0.0:

import unet
import torch

loss_fn = torch.nn.BCEWithLogitsLoss()
input = torch.ones([8, 3, 24, 24])
targets = torch.ones([8, 10, 24, 24])
unet_model = unet.UNet2D(in_channels=3, out_classes=10, residual=True, num_encoding_blocks=2)
out = unet_model(input)
loss = loss_fn(targets, out)  # note: BCEWithLogitsLoss expects (input, target), but the error reproduces either way
loss.backward()

gave the following error:


RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_5969/4012461463.py in <module>
      5 out = unet_model(input)
      6 loss = loss_fn(targets, out)
----> 7 loss.backward()

~/.local/lib/python3.10/site-packages/torch/_tensor.py in backward(self, gradient, retain_graph, create_graph, inputs)
    485                 inputs=inputs,
    486             )
--> 487         torch.autograd.backward(
    488             self, gradient, retain_graph, create_graph, inputs=inputs
    489         )

~/.local/lib/python3.10/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
    198     # some Python versions print out the first line of a multi-line function
    199     # calls in the traceback and some print out the last line
--> 200     Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
    201         tensors, grad_tensors_, retain_graph, create_graph, inputs,
    202         allow_unreachable=True, accumulate_grad=True)  # Calls into the C++ engine to run the backward pass

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [8, 64, 24, 24]], which is output 0 of ReluBackward0, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
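The failure mode is not specific to unet: any in-place write to a tensor that autograd saved for the backward pass trips the version-counter check. A minimal standalone sketch of the same error:

import torch

x = torch.ones(3, requires_grad=True)
y = torch.relu(x)   # ReluBackward0 saves its output y for the backward pass
y += 1              # in-place write bumps y's version counter from 0 to 1
y.sum().backward()  # RuntimeError: ... output 0 of ReluBackward0, is at version 1

As the hint in the error message suggests, running the forward pass under torch.autograd.set_detect_anomaly(True) makes the traceback point at the forward-pass line that performed the offending in-place operation.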

The error occurs because x += connection modifies, in place, a ReLU output that autograd saved for the backward pass, which recent PyTorch versions no longer tolerate. To fix it, you need to change two lines that the newer torch version treats as illegal in-place operations into out-of-place additions:
decoding.py l.136: x += connection => x = x + connection
encoding.py l.146: x += connection => x = x + connection
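For illustration, here is a minimal sketch of the pattern behind those two lines; the class and layer names are made up for the sketch, not the repository's actual code:

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Toy stand-in for the residual blocks in encoding.py/decoding.py."""

    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        connection = x
        x = self.relu(self.conv(x))  # autograd saves this ReLU output
        # Broken: `x += connection` would mutate the saved tensor in place
        # and trigger the RuntimeError above during backward().
        x = x + connection  # out-of-place add leaves the saved tensor intact
        return x

With this version, ResidualBlock(64)(torch.randn(1, 64, 24, 24)).sum().backward() runs cleanly; switching the addition back to += reproduces the error.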
I made a clone of this repo and fixed the error locally this way, but I did not have permission to push.
Best Regards,
Johannes Dollinger

@fepegar
Owner

fepegar commented Jul 18, 2023

Hi, @jdollinger-bit. Thanks for reporting. Can you please fix this in a branch of a fork and open a pull request?
