
RuntimeError: Default process group has not been initialized, please make sure to call init_process_group. #503

Open
km1562 opened this issue Dec 12, 2021 · 1 comment

Comments


km1562 commented Dec 12, 2021

When I use a single GPU, ABC produces this error. I found an explanation online: ABC uses SyncBatchNorm, which requires an initialized process group. So when running on a single GPU you need to call init_process_group first. Maybe you can update the code on main, for example:
[screenshot omitted]
import os

cuda_num = os.environ['CUDA_VISIBLE_DEVICES']
cuda_num_list = list(cuda_num.split(","))
if len(cuda_num_list) == 1:
    # Only one visible GPU: create a single-process group so SyncBatchNorm works.
    import torch.distributed as dist
    dist.init_process_group(backend='nccl', init_method='tcp://localhost:23456', rank=0, world_size=1)
    print("process group initialized\n")
@kimile599

Update: the solution worked on Colab. cuda_num = os.environ['CUDA_VISIBLE_DEVICES'] caused a KeyError (the variable is not set there), so I manually changed it to cuda_num = "0":

cuda_num = "0"  # set manually instead of reading os.environ['CUDA_VISIBLE_DEVICES']
cuda_num_list = list(cuda_num.split(","))
if len(cuda_num_list) == 1:
    import torch.distributed as dist
    dist.init_process_group(backend='nccl', init_method='tcp://localhost:23456', rank=0, world_size=1)
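A small alternative (my suggestion, not from the comment) is to fall back to a default only when the variable is missing, so the code still respects CUDA_VISIBLE_DEVICES when it is set:

import os
# Use GPU 0 when CUDA_VISIBLE_DEVICES is unset (e.g. on Colab).
cuda_num = os.environ.get('CUDA_VISIBLE_DEVICES', '0')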
