
Loss can't backward #39

Closed
Originofamonia opened this issue Jun 15, 2021 · 2 comments

@Originofamonia

Hi, I encountered an issue where the loss can't backward. It seems some operation modifies a tensor in place, but I can't find where it is. Could you please look into it?
Here is the stack trace:
```
Traceback (most recent call last):
  File "/home/qiyuan/2021summer/VQA_ReGAT/main.py", line 275, in <module>
    main()
  File "/home/qiyuan/2021summer/VQA_ReGAT/main.py", line 271, in main
    train(model, train_loader, eval_loader, args, device)
  File "/home/qiyuan/2021summer/VQA_ReGAT/train.py", line 111, in train
    loss.backward()
  File "/home/qiyuan/miniconda3/envs/torch14/lib/python3.8/site-packages/torch/tensor.py", line 195, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/qiyuan/miniconda3/envs/torch14/lib/python3.8/site-packages/torch/autograd/__init__.py", line 97, in backward
    Variable._execution_engine.run_backward(
RuntimeError: unsupported operation: more than one element of the written-to tensor refers to a single memory location. Please clone() the tensor before performing the operation.
  0%|          | 0/3467 [00:04<?, ?it/s]

Process finished with exit code 1
```

Thanks.
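
For context on the message: PyTorch raises it when an in-place write hits a tensor produced by expand(), whose elements alias the same memory location. A minimal sketch that reproduces it:

```python
import torch

y = torch.zeros(1).expand(3)  # view: all three elements share one storage location
y.add_(1)  # RuntimeError: unsupported operation: more than one element of the
           # written-to tensor refers to a single memory location. Please clone() ...
```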

@Originofamonia (Author)

Never mind, it's solved by adding clone() wherever expand() is used.
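
A minimal sketch of that kind of fix; the tensor names and shapes here are illustrative, not the actual ones in VQA_ReGAT:

```python
import torch

v = torch.randn(64, 2048, requires_grad=True)

# Before: expand() returns a view whose 36 "rows" all alias the same storage,
# so any in-place write into it (including ones autograd performs during
# backward) fails with the error above.
broadcast = v.unsqueeze(1).expand(64, 36, 2048)

# After: clone() materializes the expanded view into its own contiguous
# memory, which is safe to write into and to backpropagate through.
broadcast = v.unsqueeze(1).expand(64, 36, 2048).clone()
```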

@jhss commented Dec 27, 2022

While running your forked code, I ran into a similar problem.

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [64, 2048]], which is output 0 of ReluBackward0, is at version 1; expected version 0 instead.

I changed all ReLU layers to inplace=False, but that didn't solve it.
Could you let me know how to solve this problem?
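
One general way to localize this kind of error (not specific to this repo): torch.autograd.set_detect_anomaly(True) makes backward() print a second traceback pointing at the forward-pass op whose saved output was later modified. A minimal sketch of the exact failure mode named in the message, an in-place edit of a ReLU output:

```python
import torch
import torch.nn.functional as F

torch.autograd.set_detect_anomaly(True)  # extra traceback points at the bad op

x = torch.randn(64, 2048, requires_grad=True)
h = F.relu(x)       # ReluBackward0 saves this output for the backward pass
h += 1              # in-place edit bumps its version counter (0 -> 1)
h.sum().backward()  # raises: "... output 0 of ReluBackward0, is at version 1"
```

Since setting inplace=False on the ReLU modules didn't help, the in-place write is probably happening elsewhere (a +=, masked_fill_(), or similar on a tensor the graph saved); anomaly mode should show where.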
