
torch.distributed not supported on Windows, but throws error for non-distributed training #206

Closed
sgbaird opened this issue Feb 26, 2021 · 2 comments


sgbaird (Contributor) commented Feb 26, 2021

Hi,

Can't seem to train:

runfile('main.py','--mode train --config-yml configs/s2ef/200k/cgcnn/cgcnn.yml')
Traceback (most recent call last):

  File "main.py", line 15, in <module>
    from ocpmodels.common import distutils

  File "C:\Users\sterg\Documents\GitHub\sparks-baird\ocp\ocpmodels\common\distutils.py", line 98, in <module>
    def broadcast(tensor, src, group=dist.group.WORLD, async_op=False):

AttributeError: module 'torch.distributed' has no attribute 'group'

Possibly because torch.distributed isn't supported on Windows(?).

There may have been some progress with Windows support; however, it seems like the command I used, main.py --mode train --config-yml configs/s2ef/200k/cgcnn/cgcnn.yml, shouldn't require torch.distributed, correct? Is there something basic I'm missing here? (See the sketch below for where I think the failure comes from.)
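
For what it's worth, the failure happens at import time: Python evaluates default argument values when a function is defined, so dist.group.WORLD is looked up as soon as distutils.py is imported, before any distributed call is ever made. Here is a minimal sketch of a guard that defers that lookup; this is hypothetical, not the repo's actual code beyond the signature shown in the traceback:

    import torch.distributed as dist

    def broadcast(tensor, src, group=None, async_op=False):
        # Single-process run, or a build without distributed support:
        # nothing to broadcast, so return the tensor unchanged.
        if not (dist.is_available() and dist.is_initialized()):
            return tensor
        # Defer the dist.group.WORLD lookup until we know it exists.
        if group is None:
            group = dist.group.WORLD
        return dist.broadcast(tensor, src, group=group, async_op=async_op)

With the default moved inside the function body, importing the module no longer touches dist.group on builds where it doesn't exist.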

Sterling

mshuaibii (Collaborator) commented

We still rely on torch.distributed for other calls throughout the repo, even when it isn't being used for distributed training. I'd recommend using WSL if you're working on a Windows machine (looks like you are, per #207), as we don't support Windows at this time.

@anuroopsriram, do you have anything else to add to this?

sgbaird (Contributor, Author) commented Apr 29, 2021

It seems that Windows is supported now (at least in some sense): pytorch/pytorch#37068 (comment)
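
If it helps anyone landing here, a quick way to check whether a given install ships distributed support (a sketch; the version note is from memory):

    import torch
    import torch.distributed as dist

    print(torch.__version__)
    # True once the wheel is built with distributed support; Windows
    # wheels gained prototype Gloo-backend support around PyTorch 1.7.
    print(dist.is_available())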
