This repository has been archived by the owner on Mar 12, 2024. It is now read-only.

AttributeError: module 'torch.distributed' has no attribute 'init_process_group' #43

Closed
ewong18 opened this issue Jun 3, 2020 · 2 comments
Labels
question Further information is requested

Comments


ewong18 commented Jun 3, 2020

I'm trying to run the example as-is and I'm running into this issue. I did have to adjust the number of GPUs because the VM I'm working on only has one. I'm on a Windows 10 machine with PyTorch 1.5.0, CUDA 10.1, and CUDA compiler driver v10.0.130.

| distributed init (rank 0): env://
Traceback (most recent call last):
  File "main.py", line 248, in <module>
    main(args)
  File "main.py", line 106, in main
    utils.init_distributed_mode(args)
  File "C:\Users\-user-\Documents\Projects\detr\util\misc.py", line 374, in init_distributed_mode
    torch.distributed.init_process_group(backend=args.dist_backend, init_method=args.dist_url,
AttributeError: module 'torch.distributed' has no attribute 'init_process_group'
Traceback (most recent call last):
  File "C:\Anaconda\envs\detr\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Anaconda\envs\detr\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Anaconda\envs\detr\lib\site-packages\torch\distributed\launch.py", line 263, in <module>
    main()
  File "C:\Anaconda\envs\detr\lib\site-packages\torch\distributed\launch.py", line 258, in main
    raise subprocess.CalledProcessError(returncode=process.returncode,
subprocess.CalledProcessError: Command '['C:\\Anaconda\\envs\\detr\\python.exe', '-u', 'main.py', '--coco_path', 'F:/coco-data']' returned non-zero exit status 1.
Contributor

alcinos commented Jun 3, 2020

Hi,
If you have only one GPU, you should drop the distributed launcher entirely and run the script directly:
python main.py --coco_path /path/to/coco

This trains with a total batch size of 2, which is currently untested and unsupported. We recommend a total batch size of at least 16, which is unlikely to fit on a single GPU.
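
For context on the error itself: a likely cause (an assumption, not confirmed in this thread) is that the PyTorch 1.5.0 build for Windows ships `torch.distributed` without the process-group functions, so the module imports but `init_process_group` is missing. Below is a minimal sketch of a guard that checks for this before attempting distributed init. The helper name `distributed_supported` and the stub object are hypothetical and used only for illustration; `torch.distributed.is_available()` is the real PyTorch call being relied on.

```python
import importlib
import types


def distributed_supported(dist_module=None):
    """Return True if a torch.distributed-like module can start a process group.

    `distributed_supported` is a hypothetical helper, not part of DETR or PyTorch.
    """
    if dist_module is None:
        try:
            dist_module = importlib.import_module("torch.distributed")
        except ImportError:
            # PyTorch is not installed at all.
            return False
    is_available = getattr(dist_module, "is_available", None)
    if is_available is None or not is_available():
        # The build was compiled without distributed support.
        return False
    # Defensive check for the exact attribute the traceback reports as missing.
    return hasattr(dist_module, "init_process_group")


# Stub standing in for a build without distributed support (illustration only).
stub_without_dist = types.SimpleNamespace(is_available=lambda: False)
print(distributed_supported(stub_without_dist))
```

When such a check returns False, running `main.py` directly, as in the command above, avoids the distributed code path.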

Contributor

fmassa commented Jun 3, 2020

Closing following @alcinos's answer, but let us know if you have further issues or questions.

@fmassa fmassa closed this as completed Jun 3, 2020
@fmassa fmassa added the question Further information is requested label Jun 3, 2020