Error when training with torch 1.5.1 #21

Closed
mkmohangb opened this issue Jul 4, 2020 · 2 comments

@mkmohangb
Contributor

PyTorch has added some additional checks that cause an error when running the training script with torch 1.5.1. With torch 1.2.0, training proceeds without error.

pytorch/pytorch@3e8d813

python train.py --gpu 0

Traceback (most recent call last):
  File "train.py", line 231, in <module>
    main()
  File "train.py", line 178, in main
    preds = model(images)
  File "/home/xxx/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/xxx/miniconda3/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 153, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/xxx/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/xxx/Self-Correction-Human-Parsing/networks/AugmentCE2P.py", line 298, in forward
    x = self.relu1(self.bn1(self.conv1(x)))
  File "/home/xxx/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/xxx/Self-Correction-Human-Parsing/modules/bn.py", line 119, in forward
    self.training, self.momentum, self.eps, self.activation, self.slope)
RuntimeError: Some elements marked as dirty during the forward method were not returned as output. The inputs that are modified inplace must all be outputs of the Function.
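
For anyone trying to reproduce the message outside this repo, here is a minimal sketch of the kind of pattern that appears to trigger it: a custom `torch.autograd.Function` that passes a tensor to `ctx.mark_dirty` but does not return it from `forward`. The class names `DirtyNotReturned` and `DirtyReturned` and the `counter` buffer are hypothetical and only for illustration. This is not the code in `modules/bn.py`, and it is not necessarily the fix the project ends up using.

```python
import torch


class DirtyNotReturned(torch.autograd.Function):
    """Hypothetical: marks two inputs dirty but returns only one of them."""

    @staticmethod
    def forward(ctx, x, counter):
        x.mul_(2)        # modify the activation in place
        counter.add_(1)  # modify a statistics-like buffer in place
        ctx.mark_dirty(x, counter)
        return x         # counter is dirty but is not an output

    @staticmethod
    def backward(ctx, grad_x):
        return grad_x * 2, None


class DirtyReturned(torch.autograd.Function):
    """Hypothetical: same work, but every dirty tensor is also an output."""

    @staticmethod
    def forward(ctx, x, counter):
        x.mul_(2)
        counter.add_(1)
        ctx.mark_dirty(x, counter)
        # The buffer carries no gradient, so flag it as non-differentiable.
        ctx.mark_non_differentiable(counter)
        return x, counter

    @staticmethod
    def backward(ctx, grad_x, *unused_grads):
        return grad_x * 2, None


if __name__ == "__main__":
    # clone() gives a non-leaf tensor, so modifying it in place is allowed.
    x = torch.ones(3, requires_grad=True).clone()
    counter = torch.zeros(1)
    try:
        DirtyNotReturned.apply(x, counter)
    except RuntimeError as err:
        print("DirtyNotReturned failed:", err)

    x = torch.ones(3, requires_grad=True).clone()
    counter = torch.zeros(1)
    y, counter_out = DirtyReturned.apply(x, counter)
    print("DirtyReturned ok:", y, counter_out)
```

On PyTorch builds that include the check (1.5.1 in this report), the first call should raise the `RuntimeError` above, while the second should run. Another option, assuming the buffer does not need to take part in autograd, would be to leave it out of `mark_dirty` altogether.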

@GoGoDuck912
Owner

GoGoDuck912 commented Jul 5, 2020

I suppose this error depends on the PyTorch version and is caused by the implementation of InplaceABN.

To support pytorch>1.0, we adopted [InplaceABN v0.1](https://github.com/mapillary/inplace_abn/tree/v0.1), which empirically works well on most PyTorch versions.

@mkmohangb
Contributor Author

Created pull request #22; please review.
