
inplace operation problem #3

Closed
Oceanusity opened this issue Nov 9, 2022 · 4 comments

Comments

@Oceanusity

Hello, thank you for the detailed README, which helped us a lot in running the code. I am now hitting an in-place operation problem when I run the code. It would be great if you could give me some insight into how to deal with it. Thank you.

python -m train.pattern sampling.mode random sampling.redundancy 5 sampling.random_rate 0.35 sampling.batch_factor 8

Here is the traceback when I set torch.autograd.set_detect_anomaly(True).

File "/GNNAK/core/model.py", line 37, in forward
    x = F.relu(x)
  File "/anaconda3/envs/exp/lib/python3.10/site-packages/torch/nn/functional.py", line 1457, in relu
    result = torch.relu(input)
 (Triggered internally at  /opt/conda/conda-bld/pytorch_1659484808560/work/torch/csrc/autograd/python_anomaly_mode.cpp:102.)
  Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
 
File "/GNNAK/train/pattern.py", line 85, in train
    loss.backward()
  File "/anaconda3/envs/exp/lib/python3.10/site-packages/torch/_tensor.py", line 396, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/anaconda3/envs/exp/lib/python3.10/site-packages/torch/autograd/__init__.py", line 173, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [54318, 144]], which is output 0 of ReluBackward0, is at version 1; expected version 0 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
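
For context, here is a minimal sketch (not from this repository) that reproduces this class of error: `F.relu` saves its output for the backward pass, so any later in-place write to that tensor bumps its version counter, and `backward()` then fails exactly as above.

```python
import torch
import torch.nn.functional as F

x = torch.randn(4, 3, requires_grad=True)
y = F.relu(x)       # ReLU saves its output for the backward pass
y += 1              # in-place update bumps the saved tensor's version counter
y.sum().backward()  # RuntimeError: ... modified by an inplace operation
```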

Here is my partial environment configuration:

ffmpeg                    4.3                  hf484d3e_0    pytorch
pyg                       2.1.0           py310_torch_1.12.0_cu113    pyg
pytorch                   1.12.1          py3.10_cuda11.3_cudnn8.3.2_0    pytorch
pytorch-cluster           1.6.0           py310_torch_1.12.0_cu113    pyg
pytorch-mutex             1.0                        cuda    pytorch
pytorch-scatter           2.0.9           py310_torch_1.12.0_cu113    pyg
pytorch-sparse            0.6.15          py310_torch_1.12.0_cu113    pyg
torchaudio                0.12.1              py310_cu113    pytorch
torchvision               0.13.1              py310_cu113    pytorch
@Oceanusity changed the title from "replacement operation problem" to "inplacement operation problem" on Nov 9, 2022
@Oceanusity changed the title from "inplacement operation problem" to "inplace operation problem" on Nov 9, 2022
@Oceanusity (Author) commented Nov 9, 2022

When I remove the F.relu line, this problem disappears.
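
That fits the error: without the relu, there is no saved activation left to invalidate. A minimal sketch of the two workarounds, assuming the in-place write happens somewhere after the relu:

```python
import torch
import torch.nn.functional as F

x = torch.randn(4, 3, requires_grad=True)

# Option 1: drop the relu, so no activation is saved for backward
# and the later in-place write is harmless.
y = x * 2
y += 1
y.sum().backward()

# Option 2: keep the relu but make the later update out-of-place,
# so the saved output stays at version 0.
x.grad = None
z = F.relu(x)
z = z + 1
z.sum().backward()
```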

@LingxiaoShawn (Owner)
Can you check whether you have updated to the newest packages following the instructions? I can run the code successfully without error. It is true, though, that the newest PyTorch makes in-place operations much harder to get right: you have to remove all in-place operations for backpropagation to work correctly. I don't know why PyTorch insists on this.
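
Concretely, "removing all in-place operations" usually means rewriting each in-place pattern as its out-of-place equivalent. A minimal sketch with hypothetical names, not taken from this repository:

```python
import torch
import torch.nn.functional as F

def forward(x, residual, scale, mask):
    x = F.relu(x)                 # instead of F.relu(x, inplace=True)
    x = x + residual              # instead of x += residual
    x = x * scale                 # instead of x.mul_(scale)
    x = x.masked_fill(mask, 0.0)  # instead of x[mask] = 0
    return x

x = torch.randn(8, 16, requires_grad=True)
out = forward(x, torch.zeros(8, 16), 0.5, torch.zeros(8, 16, dtype=torch.bool))
out.sum().backward()  # backpropagates cleanly
```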

@zhouxuan1223

I would like to ask whether the environment you installed came directly from the setup provided by the author. After I installed the latest setup provided by the author, I got an error when running it:
[screenshot of the error]

@LingxiaoShawn (Owner)
This looks like a CUDA problem. Also note that the setup uses Python 3.10.
