
module 'torch.distributed' has no attribute 'ReduceOp' #53

Closed · sctrueew opened this issue Jun 17, 2019 · 4 comments
sctrueew commented Jun 17, 2019

Hi everyone,

I installed all the requirements, and when I run python eval.py I get this error:
module 'torch.distributed' has no attribute 'ReduceOp'
My PyTorch version is 1.1.0.
Thanks.
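A quick way to check whether the installed PyTorch build was compiled with distributed support (which is where ReduceOp lives) is a diagnostic along these lines; this is a minimal sketch, not part of the repository:

import torch
import torch.distributed as dist

# torch.distributed is only functional in builds compiled with
# distributed support (Linux binaries by default in PyTorch 1.1.0).
print(dist.is_available())        # False means the build lacks distributed support
print(hasattr(dist, "ReduceOp"))  # False reproduces the AttributeError above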

ycszen (Owner) commented Jul 4, 2019

Could you give more details? The eval.py script doesn't call torch.distributed or ReduceOp directly, so I need more information to figure out why you are getting this error.

jay1009 commented Jul 24, 2019

I ran into this problem too, but it occurred when I ran python train.py.
Can you tell me what might be causing it?
Thanks.

ycszen (Owner) commented Jul 25, 2019

@jay1009 You need to launch train.py with the distributed launcher, as follows:

export NGPUS=8
python -m torch.distributed.launch --nproc_per_node=$NGPUS train.py
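For context, torch.distributed.launch spawns one process per GPU and passes each process a --local_rank argument, along with the environment variables (MASTER_ADDR, MASTER_PORT, RANK, WORLD_SIZE) that init_process_group consumes. A training script typically wires this up roughly as follows; this is a minimal sketch, not the repository's actual train.py:

import argparse
import torch
import torch.distributed as dist

parser = argparse.ArgumentParser()
parser.add_argument("--local_rank", type=int, default=0)  # injected by the launcher
args = parser.parse_args()

torch.cuda.set_device(args.local_rank)
# "env://" reads MASTER_ADDR, MASTER_PORT, RANK, and WORLD_SIZE
# from the environment variables set up by torch.distributed.launch
dist.init_process_group(backend="nccl", init_method="env://")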

ycszen closed this as completed Aug 1, 2019
jay1009 commented Aug 6, 2019

@ycszen Thanks for your reply. But I only have one GPU, and after running

export NGPUS=1
python -m torch.distributed.launch --nproc_per_node=$NGPUS train.py

I still get the same error. Is it necessary to use 8 GPUs to run this program?
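One GPU is fine for the launcher. To check that distributed initialization itself works on the machine, independent of this repository's train.py, a standalone single-process sketch like the following can help; it exercises the same ReduceOp attribute the error complains about:

import torch
import torch.distributed as dist

# Single-process "group of one": no launcher or environment variables needed.
dist.init_process_group(backend="nccl",
                        init_method="tcp://127.0.0.1:29500",
                        rank=0, world_size=1)
t = torch.ones(1).cuda()
dist.all_reduce(t, op=dist.ReduceOp.SUM)  # effectively a no-op with world_size=1
print(t)  # tensor([1.], device='cuda:0')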
