[CUDA/CPU ERROR] when I trained on my data, I found this error: #2

Closed
Observerspy opened this issue Dec 3, 2019 · 2 comments

@Observerspy

RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/home/weiqiang/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
    output = module(*input, **kwargs)
  File "/home/weiqiang/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/weiqiang/cancer/code/models/efficientdet.py", line 130, in forward
    P1, P2, P3, P4, P5, P6, P7 = self.efficientnet(inputs)
  File "/home/weiqiang/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/weiqiang/anaconda3/lib/python3.7/site-packages/efficientnet_pytorch/model.py", line 204, in forward
    P1, P2, P3, P4, P5, P6, P7 = self.extract_features(inputs)
  File "/home/weiqiang/anaconda3/lib/python3.7/site-packages/efficientnet_pytorch/model.py", line 190, in extract_features
    x = MBConvBlock(block_args, self._global_params)(x, drop_connect_rate = drop_connect_rate)
  File "/home/weiqiang/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/weiqiang/anaconda3/lib/python3.7/site-packages/efficientnet_pytorch/model.py", line 78, in forward
    x = self._swish(self._bn1(self._depthwise_conv(x)))
  File "/home/weiqiang/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/weiqiang/anaconda3/lib/python3.7/site-packages/efficientnet_pytorch/utils.py", line 144, in forward
    x = F.conv2d(x, self.weight, self.bias, self.stride, self.padding, self.dilation, self.groups)
RuntimeError: Expected object of backend CUDA but got backend CPU for argument #2 'weight'

@dhananjaisharma10

dhananjaisharma10 commented Dec 4, 2019

Hi! How did you solve this error? I believe it happens because extract_features constructs new MBConvBlock instances on every call. In my opinion that's bad coding: the self._blocks ModuleList declared originally is never used, and the newly created blocks don't have their weights on the GPU.
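
To illustrate the point about unregistered blocks, here is a minimal sketch. It does not use the actual efficientnet_pytorch code: the class names `BuildsBlocksInForward` and `RegistersBlocks` are hypothetical, and a plain `nn.Conv2d` stands in for MBConvBlock. The idea is that a layer constructed inside `forward` is never registered as a submodule, so `.to(device)` (and, as far as I understand, the replication done by `nn.DataParallel`) cannot move its weights, whereas layers stored in an `nn.ModuleList` are moved along with the model.

```python
import torch
import torch.nn as nn


class BuildsBlocksInForward(nn.Module):
    """Mimics the reported pattern: a fresh layer is built on every call."""

    def forward(self, x):
        # New layer with default-device (normally CPU) weights, never registered.
        conv = nn.Conv2d(3, 3, kernel_size=3, padding=1)
        return conv(x)


class RegistersBlocks(nn.Module):
    """Builds the layers once and registers them through nn.ModuleList."""

    def __init__(self, num_blocks: int = 2):
        super().__init__()
        self.blocks = nn.ModuleList(
            [nn.Conv2d(3, 3, kernel_size=3, padding=1) for _ in range(num_blocks)]
        )

    def forward(self, x):
        for block in self.blocks:
            x = block(x)
        return x


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
unregistered = BuildsBlocksInForward().to(device)
registered = RegistersBlocks().to(device)

# The first model exposes no parameters, so .to(device) has nothing to move.
num_unregistered = len(list(unregistered.parameters()))
# The second exposes a weight and a bias per registered block.
num_registered = len(list(registered.parameters()))

# The registered model runs on whichever device was selected.
out = registered(torch.randn(1, 3, 16, 16, device=device))
```

A side effect of the first pattern, separate from the device mismatch, is that the layer is re-initialized on every call and never reaches the optimizer, so it would not be trained even if the forward pass ran.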

@Observerspy
Author

@dhananjaisharma10 In fact, I just added
if torch.cuda.is_available(): torch.set_default_tensor_type('torch.cuda.FloatTensor')
to my own train.py.
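
For anyone landing here later, a minimal sketch of that workaround as it might sit at the top of a training script. The function name `configure_default_tensor_type` is hypothetical and only wraps the call quoted above. The workaround presumably helps because tensors and parameters created afterwards default to CUDA, including the weights of layers built inside `forward`. Note that recent PyTorch releases mark `torch.set_default_tensor_type` as deprecated, so treat this as a stopgap rather than a recommended setup.

```python
import torch


def configure_default_tensor_type() -> str:
    """Make new float tensors default to CUDA when a GPU is visible.

    Returns the device type ("cuda" or "cpu") that a freshly created
    tensor lands on after the call.
    """
    if torch.cuda.is_available():
        torch.set_default_tensor_type("torch.cuda.FloatTensor")
    # Probe with an empty tensor to see where new tensors are allocated.
    return torch.empty(0).device.type


default_device_type = configure_default_tensor_type()
```

This has to run before the model is constructed. It changes a process-wide default, so code that expects new tensors on the CPU (some data-loading pipelines, for example) may need explicit `device="cpu"` arguments afterwards.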
