
DataParallel error -> RuntimeError: Caught RuntimeError in replica 0 on device 0 #75

Closed
praj441 opened this issue Dec 30, 2022 · 1 comment

praj441 commented Dec 30, 2022

I am not able to run your code with multi-GPU training via nn.DataParallel; it raises a RuntimeError. The code runs fine when I replace
model = torch.nn.DataParallel(model.cuda())
with
model = model.cuda()

Have you tried using the code with DataParallel enabled?

Log snippets -

output = net(feat, coord, offset, batch, neighbor_idx)
File "/home/prem/anaconda3/envs/alpha/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/prem/anaconda3/envs/alpha/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 168, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/home/prem/anaconda3/envs/alpha/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 178, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/home/prem/anaconda3/envs/alpha/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
output.reraise()
File "/home/prem/anaconda3/envs/alpha/lib/python3.7/site-packages/torch/_utils.py", line 425, in reraise
raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in replica 0 on device 0.

X-Lai (Collaborator) commented Jan 2, 2023

Thanks for your interest in our work. Currently, the code only supports DDP (DistributedDataParallel), not nn.DataParallel.
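For reference, a minimal single-process sketch of how a model would be wrapped with DDP instead of nn.DataParallel. This is not the repo's actual training code: the Linear model is a stand-in for the real network, and the hard-coded MASTER_ADDR/MASTER_PORT plus the CPU "gloo" backend are assumptions for a local demo (a real multi-GPU run would use torchrun, the "nccl" backend, and device_ids):

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def run_ddp_cpu():
    # torchrun normally sets these env vars; hard-code them for one local process.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group(backend="gloo", rank=0, world_size=1)

    model = torch.nn.Linear(8, 2)   # stand-in for the repo's actual network
    model = DDP(model)              # wrap with DDP instead of nn.DataParallel

    out = model(torch.randn(4, 8))  # forward pass runs in this one process
    dist.destroy_process_group()
    return tuple(out.shape)

if __name__ == "__main__":
    print(run_ddp_cpu())
```

With torchrun, each GPU gets its own process (one DDP replica per process), which avoids the single-process replication that nn.DataParallel performs and that this codebase does not support.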

@X-Lai X-Lai closed this as completed Jan 2, 2023