import torch: AttributeError: module 'torch.distributed' has no attribute 'BuiltinCommHookType' #47153
Labels
high priority
oncall: distributed
Add this issue/PR to distributed oncall triage queue
triage review
Comments
Same problem on OS X.
@SciPioneer This breaks any
wayi1 pushed a commit that referenced this issue on Nov 2, 2020:
… as built-in comm hooks" Revert the diff because of #47153 Differential Revision: [D24691866](https://our.internmc.facebook.com/intern/diff/D24691866/) [ghstack-poisoned]
wayi1 pushed a commit that referenced this issue on Nov 2, 2020:
… as built-in comm hooks" Revert the diff because of #47153 Original PR issue: C++ DDP Communication Hook #46348 Differential Revision: [D24691866](https://our.internmc.facebook.com/intern/diff/D24691866/) [ghstack-poisoned]
wayi1 pushed a commit that referenced this issue on Nov 2, 2020:
… as built-in comm hooks" Revert the diff because of #47153 Original PR issue: C++ DDP Communication Hook #46348 Differential Revision: [D24691866](https://our.internmc.facebook.com/intern/diff/D24691866/) ghstack-source-id: 115720415 Pull Request resolved: #47234
facebook-github-bot pushed a commit that referenced this issue on Nov 3, 2020:
… as built-in comm hooks" (#47234)
Summary:
Pull Request resolved: #47234
Revert the diff because of #47153
Original PR issue: C++ DDP Communication Hook #46348
ghstack-source-id: 115720415
Test Plan: waitforbuildbot
Reviewed By: mrshenli
Differential Revision: D24691866
fbshipit-source-id: 58fe0c45943a2ae2a09fe5d5eac4a4d947586539
wayi1 pushed a commit that referenced this issue on Nov 3, 2020:
…in comm hooks
This is almost the same as #46959, except that in caffe2/torch/nn/parallel/distributed.py, BuiltinCommHookType is imported conditionally, only when dist.is_available(). Otherwise, this Python enum type, defined in caffe2/torch/csrc/distributed/c10d/init.cpp, cannot be imported, which is similar to another enum type, ReduceOp, defined in the same file. See #47153.
To review the diff on top of #46959, compare V1 vs Latest.
Main Changes in V1 (#46959):
1. Implemented the Pybind part.
2. In the reducer, once builtin_comm_hook_type is set, a C++ comm hook instance is created in Reducer::autograd_hook.
3. Added unit tests for the built-in comm hooks.
Original PR issue: C++ DDP Communication Hook #46348
Differential Revision: [D24700959](https://our.internmc.facebook.com/intern/diff/D24700959/)
[ghstack-poisoned]
wayi1 pushed a commit that referenced this issue on Nov 3, 2020:
…in comm hooks
This is almost the same as #46959, except that in caffe2/torch/nn/parallel/distributed.py, BuiltinCommHookType is imported conditionally, only when dist.is_available(). Otherwise, this Python enum type, defined in caffe2/torch/csrc/distributed/c10d/init.cpp, cannot be imported, which is similar to another enum type, ReduceOp, defined in the same file. See #47153.
To review the diff on top of #46959, compare V1 vs Latest.
Main Changes in V1 (#46959):
1. Implemented the Pybind part.
2. In the reducer, once builtin_comm_hook_type is set, a C++ comm hook instance is created in Reducer::autograd_hook.
3. Added unit tests for the built-in comm hooks.
Original PR issue: C++ DDP Communication Hook #46348
Differential Revision: [D24700959](https://our.internmc.facebook.com/intern/diff/D24700959/)
ghstack-source-id: 115753518
Pull Request resolved: #47270
wayi1 pushed a commit that referenced this issue on Nov 3, 2020:
…I as built-in comm hooks"
This is almost the same as #46959, except that in caffe2/torch/nn/parallel/distributed.py, BuiltinCommHookType is imported conditionally, only when dist.is_available(). Otherwise, this Python enum type, defined in caffe2/torch/csrc/distributed/c10d/init.cpp, cannot be imported, which is similar to another enum type, ReduceOp, defined in the same file. See #47153.
Main Changes in #46959:
1. Implemented the Pybind part.
2. In the reducer, once builtin_comm_hook_type is set, a C++ comm hook instance is created in Reducer::autograd_hook.
3. Added unit tests for the built-in comm hooks.
Original PR issue: C++ DDP Communication Hook #46348
Differential Revision: [D24700959](https://our.internmc.facebook.com/intern/diff/D24700959/)
[ghstack-poisoned]
wayi1 pushed a commit that referenced this issue on Nov 3, 2020:
…in comm hooks
Pull Request resolved: #47270
This is almost the same as #46959, except that in caffe2/torch/nn/parallel/distributed.py, BuiltinCommHookType should be imported conditionally, only when dist.is_available(). Otherwise, this Python enum type, defined in caffe2/torch/csrc/distributed/c10d/init.cpp, cannot be imported. See #47153.
I tried to follow another enum type, ReduceOp, defined in the same file, but that did not work, because that C++ enum class is defined in the torch/lib/c10d library, whereas BuiltinCommHookType is defined in the torch/csrc/distributed library. These two libraries are compiled in two different ways. To avoid adding typing to the distributed package, which could be a new project, I simply removed the BuiltinCommHookType argument type in this file.
To review the diff on top of #46959, compare V1 vs Latest: https://www.internalfb.com/diff/D24700959?src_version_fbid=270445741055617
Main Changes in V1 (#46959):
1. Implemented the Pybind part.
2. In the reducer, once builtin_comm_hook_type is set, a C++ comm hook instance is created in Reducer::autograd_hook.
3. Added unit tests for the built-in comm hooks.
Original PR issue: C++ DDP Communication Hook #46348
ghstack-source-id: 115783237
Differential Revision: [D24700959](https://our.internmc.facebook.com/intern/diff/D24700959/)
facebook-github-bot pushed a commit that referenced this issue on Nov 4, 2020:
…in comm hooks (#47270)
Summary:
Pull Request resolved: #47270
This is almost the same as #46959, except that in caffe2/torch/nn/parallel/distributed.py, BuiltinCommHookType should be imported conditionally, only when dist.is_available(). Otherwise, this Python enum type, defined in caffe2/torch/csrc/distributed/c10d/init.cpp, cannot be imported. See #47153.
I tried to follow another enum type, ReduceOp, defined in the same file, but that did not work, because that C++ enum class is defined in the torch/lib/c10d library, whereas BuiltinCommHookType is defined in the torch/csrc/distributed library. These two libraries are compiled in two different ways. To avoid adding typing to the distributed package, which could be a new project, I simply removed the BuiltinCommHookType argument type in this file.
To review the diff on top of #46959, compare V1 vs Latest: https://www.internalfb.com/diff/D24700959?src_version_fbid=270445741055617
Main Changes in V1 (#46959):
1. Implemented the Pybind part.
2. In the reducer, once builtin_comm_hook_type is set, a C++ comm hook instance is created in Reducer::autograd_hook.
3. Added unit tests for the built-in comm hooks.
Original PR issue: C++ DDP Communication Hook #46348
ghstack-source-id: 115783237
Test Plan:
buck test mode/dev-nosan caffe2/test/distributed:c10d -- test_builtin_ddp_comm_hooks_nccl
//arvr/projects/eye_tracking/Masquerade:python_test
USE_DISTRIBUTED=0 USE_GLOO=0 BUILD_TEST=0 USE_CUDA=1 USE_MKLDNN=0 DEBUG=0 python setup.py install
Reviewed By: mrshenli
Differential Revision: D24700959
fbshipit-source-id: 69f303a48ae275aa856e6e9b50e12ad8602e1c7a
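For readers who want to see the shape of the failure and of the guard that the fix commits describe, here is a minimal sketch. It does not import PyTorch and is not the code from the PR: `make_dist_stub`, `unguarded_lookup`, `guarded_lookup`, and the `ALLREDUCE` placeholder value are hypothetical names used only for illustration. The stub stands in for `torch.distributed` in a build where the distributed bindings are or are not compiled in.

```python
import types


def make_dist_stub(available: bool) -> types.ModuleType:
    """Build a stand-in for torch.distributed (hypothetical, not the real module)."""
    dist = types.ModuleType("dist_stub")
    dist.is_available = lambda: available
    if available:
        # Assumption: in a build with distributed support, the C++ bindings
        # register the enum on the module. A placeholder object is used here.
        dist.BuiltinCommHookType = types.SimpleNamespace(ALLREDUCE="allreduce")
    return dist


def unguarded_lookup(dist: types.ModuleType):
    # Mirrors the failing pattern: the enum is accessed unconditionally,
    # so a build without distributed support raises AttributeError.
    return dist.BuiltinCommHookType


def guarded_lookup(dist: types.ModuleType):
    # Mirrors the guard described in the commit message: only touch the
    # enum when dist.is_available() reports that the bindings exist.
    if dist.is_available():
        return dist.BuiltinCommHookType
    return None


if __name__ == "__main__":
    no_dist = make_dist_stub(available=False)
    try:
        unguarded_lookup(no_dist)
    except AttributeError as exc:
        print("unguarded access failed:", exc)
    print("guarded access returned:", guarded_lookup(no_dist))
```

The guarded version trades an import-time crash for a `None` that callers must handle, which is why the commit message also mentions dropping the type of the corresponding argument rather than referencing the enum in a signature.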
After building the latest master ee0033a with,
On importing torch, I get the following error: AttributeError: module 'torch.distributed' has no attribute 'BuiltinCommHookType'
Previously, this worked without any issue.
Probably related to ee0033a
cc @ezyang @gchanan @zou3519 @bdhirsh @heitorschueroff @pietern @mrshenli @pritamdamania87 @zhaojuanmao @satgera @rohan-varma @gqchen @aazzolini @xush6528 @osalpekar @jiayisuse @agolynski