[BUG]: AttributeError: module 'torch.distributed' has no attribute '_reduce_scatter_base' #2673

Closed
CreamyLong opened this issue Feb 13, 2023 · 2 comments
Labels
bug Something isn't working

Comments

@CreamyLong

🐛 Describe the bug

When I run the code from https://github.com/hpcaitech/ColossalAI-Examples/tree/main/image/resnet, I get the following error:

AttributeError: module 'torch.distributed' has no attribute '_reduce_scatter_base'

Following advice I found online, I then commented out this code in colossalai/communication/collective.py:

_all_gather_func = dist._all_gather_base \
    if "all_gather_into_tensor" not in dir(dist) else dist.all_gather_into_tensor
_reduce_scatter_func = dist._reduce_scatter_base \
    if "reduce_scatter_tensor" not in dir(dist) else dist.reduce_scatter_tensor

After that, I got a different error:

ModuleNotFoundError: No module named 'torch.fx._compatibility'
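For illustration, the attribute lookup in the collective.py snippet above could be made tolerant of missing functions by using `getattr` with a default instead of direct attribute access. This is only a sketch, not the ColossalAI implementation: `pick_collective` is a hypothetical helper, and `old_dist` is a stand-in object used so the example runs without torch installed. It would not address the separate `torch.fx._compatibility` import error.

```python
from types import SimpleNamespace


def pick_collective(dist_module, preferred, fallback):
    """Return the first of the two attributes that exists on dist_module, or None."""
    for name in (preferred, fallback):
        fn = getattr(dist_module, name, None)
        if fn is not None:
            return fn
    return None


# Stand-in for an older torch.distributed (assumption for this demo):
# it offers _all_gather_base but neither reduce-scatter variant.
old_dist = SimpleNamespace(_all_gather_base=lambda *args, **kwargs: "all_gather_base")

all_gather_func = pick_collective(old_dist, "all_gather_into_tensor", "_all_gather_base")
reduce_scatter_func = pick_collective(old_dist, "reduce_scatter_tensor", "_reduce_scatter_base")

if reduce_scatter_func is None:
    # Fail later with a clear message instead of raising AttributeError at import time.
    print("reduce-scatter collective unavailable; upgrade torch or avoid this code path")
```

With a real installation, `torch.distributed` would be passed in place of `old_dist`; a caller would still need to handle the `None` case explicitly.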

Environment

Python 3.6
torch 1.9.1+cu102
RTX 3090

@kurisusnowdeng
Member

Thank you for the report. This function is expected to be available only in torch>=1.10. We will improve compatibility as soon as possible.
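A quick way to confirm whether an installation meets the torch>=1.10 threshold mentioned here is to compare the major and minor components of the version string. The sketch below is an assumption-laden illustration: `major_minor` and `meets_minimum` are hypothetical helpers, and the version is passed as a plain string so the example runs without torch. In practice the string would come from `torch.__version__`.

```python
import re


def major_minor(version):
    """Extract (major, minor) from a version string such as '1.9.1+cu102'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version, minimum=(1, 10)):
    """True if the version is at least the given (major, minor) minimum."""
    # Tuple comparison avoids the string-comparison trap where "1.9" > "1.10".
    return major_minor(version) >= minimum


# The version reported in this issue.
print(meets_minimum("1.9.1+cu102"))
```

Comparing integer tuples rather than raw strings matters here, because a plain string comparison would rank "1.9" above "1.10".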

@binmakeswell
Member

Hi @CreamyLong, please check the environment requirements:
https://github.com/hpcaitech/ColossalAI#installation
This issue was closed due to inactivity. Thanks.
