Add warning on ProcessGroup and ProcessGroup::Work APIs #46220
💊 CI failures summary and remediations
As of commit 1a36843 (more details on the Dr. CI page):
codecov.io: 1 failed
This comment was automatically generated by Dr. CI. This comment has been revised 32 times.
[ghstack-poisoned]
ghstack-source-id: 70fae3b9745be76053f12a847873e8f6a9e1900b Pull Request resolved: #46220
Hey @agolynski, I might not have conveyed this clearly. The goal is that whenever users call certain APIs on `Work`, we print a warning message. We will need to add something like `TORCH_WARN_ONCE` to the impacted APIs. The impacted ones are those that won't be available in the new Future API, including at least the following.
pytorch/torch/csrc/distributed/c10d/init.cpp
Lines 979 to 981 in b98e359
.def("is_success", &::c10d::ProcessGroup::Work::isSuccess)
.def("exception", &::c10d::ProcessGroup::Work::exception)
.def("source_rank", &::c10d::ProcessGroup::Work::sourceRank)
pytorch/torch/csrc/distributed/c10d/init.cpp
Line 987 in b98e359
.def("synchronize", &::c10d::ProcessGroup::Work::synchronize)
@wanchaol @gmagogsfm please comment if there are more impacted APIs.
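For readers unfamiliar with the warn-once pattern requested above, here is a minimal self-contained sketch. It does not reproduce how `TORCH_WARN_ONCE` is implemented; `FakeWork`, `warningCount`, and `isCompletedWithWarning` are hypothetical names used only for illustration, and the static-flag approach is an assumption about one reasonable way to get "warn on first call only" behavior.

```cpp
#include <atomic>
#include <iostream>

// Hypothetical stand-in for a Work-like object; this is NOT the c10d class.
struct FakeWork {
  bool completed = true;
  bool isCompleted() const { return completed; }
};

// Counts emitted warnings so the once-only behavior can be observed.
std::atomic<int>& warningCount() {
  static std::atomic<int> count{0};
  return count;
}

// Wraps the real call and emits the deprecation message on the first call only.
bool isCompletedWithWarning(const FakeWork& work) {
  static std::atomic<bool> warned{false};
  // exchange() returns the previous value, so only the first caller sees false.
  if (!warned.exchange(true)) {
    ++warningCount();
    std::cerr << "ProcessGroup::Work::is_completed API is being deprecated, "
                 "please ping https://github.com/pytorch/pytorch/issues/46291 "
                 "if you see this warning\n";
  }
  return work.isCompleted();
}
```

Calling `isCompletedWithWarning` repeatedly still returns the underlying result each time, but the message is written only once per process.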
Codecov Report
@@                  Coverage Diff                   @@
##   gh/agolynski/10/base   #46220   +/-   ##
=====================================================
  Coverage     68.33%   68.33%
=====================================================
  Files           410      410
  Lines         53795    53795
=====================================================
  Hits          36760    36760
  Misses        17035    17035
Raised the concern that APIs like
looks good to me, minor comments inline. Thanks for doing this!
@@ -974,15 +974,48 @@ that adds a prefix to each key inserted to the store.

shared_ptr_class_<::c10d::ProcessGroup::Work>(module, "Work")
.def("is_completed", &::c10d::ProcessGroup::Work::isCompleted)
Shall we deprecate this as well? This is called `done` in Future, see
pytorch/torch/csrc/jit/python/init.cpp
Line 1094 in a3caa71
"done",
done
This is risky, as we did expose this API in our API doc. If you want to change this, please also mention it where it is defined in https://github.com/pytorch/pytorch/blob/master/docs/source/distributed.rst.
undone
torch/lib/c10d/ProcessGroup.hpp
Outdated
@@ -157,6 +160,12 @@ class ProcessGroup {
return size_;
}

// *************************************************************************
Shall we add this comment at the beginning of the `ProcessGroup` class instead?
moved
done
done
Stamp to unblock, but please fix the error I mentioned inline.
torch/csrc/distributed/c10d/init.cpp
Outdated
"deprecated, please ping "
"https://github.com/pytorch/pytorch/issues/46291 "
"if you see this warning");
return work.isSuccess();
Shouldn't this be `isCompleted`?
fixed
torch/csrc/distributed/c10d/init.cpp
Outdated
TORCH_WARN_ONCE("ProcessGroup::Work::is_completed API is being "
"deprecated, please ping "
"https://github.com/pytorch/pytorch/issues/46291 "
"if you see this warning");
Is this what clang-format gives you? The devserver clang-format gives me this:
TORCH_WARN_ONCE(
"ProcessGroup::Work::is_completed API is being "
"deprecated, please ping "
"https://github.com/pytorch/pytorch/issues/46291 "
"if you see this warning");
And ditto for other APIs as well.
reformatted via vscode clang-format
torch/csrc/distributed/c10d/init.cpp
Outdated
.def(
"is_completed",
[](::c10d::ProcessGroup::Work& work) -> bool {
TORCH_WARN_ONCE("ProcessGroup::Work::is_completed API is being "
"deprecated, please ping "
"https://github.com/pytorch/pytorch/issues/46291 "
"if you see this warning");
return work.isSuccess();
})
.def(
"is_success",
[](::c10d::ProcessGroup::Work& work) -> bool {
TORCH_WARN_ONCE("ProcessGroup::Work::is_success API is being "
"deprecated, please ping "
"https://github.com/pytorch/pytorch/issues/46291 "
"if you see this warning");
return work.isSuccess();
})
.def(
"exception",
[](::c10d::ProcessGroup::Work& work) -> std::exception_ptr {
TORCH_WARN_ONCE("ProcessGroup::Work::exception API is being "
"deprecated, please ping "
"https://github.com/pytorch/pytorch/issues/46291 "
"if you see this warning");
return work.exception();
})
.def(
"source_rank",
[](::c10d::ProcessGroup::Work& work) -> int {
TORCH_WARN_ONCE("ProcessGroup::Work::source_rank API is being "
"deprecated, please ping "
"https://github.com/pytorch/pytorch/issues/46291 "
"if you see this warning");
return work.sourceRank();
I don't like preprocessor macros, but one seems ideal here to avoid string literal duplication, i.e. use something like
#define _DEPRECATED_DEF(name, method) \
def( name, [](::c10d::ProcessGroup::Work& work) -> int { \
TORCH_WARN_ONCE("ProcessGroup::Work::" name " API is being " \
"deprecated, please ping " \
"https://github.com/pytorch/pytorch/issues/46291 " \
"if you see this warning"); \
return work.method(); \
})
.def(
"is_completed",
[](::c10d::ProcessGroup::Work& work) -> bool {
TORCH_WARN_ONCE("ProcessGroup::Work::is_completed API is being "
"deprecated, please ping "
"https://github.com/pytorch/pytorch/issues/46291 "
"if you see this warning");
return work.isSuccess();
})
.def(
"is_success",
[](::c10d::ProcessGroup::Work& work) -> bool {
TORCH_WARN_ONCE("ProcessGroup::Work::is_success API is being "
"deprecated, please ping "
"https://github.com/pytorch/pytorch/issues/46291 "
"if you see this warning");
return work.isSuccess();
})
.def(
"exception",
[](::c10d::ProcessGroup::Work& work) -> std::exception_ptr {
TORCH_WARN_ONCE("ProcessGroup::Work::exception API is being "
"deprecated, please ping "
"https://github.com/pytorch/pytorch/issues/46291 "
"if you see this warning");
return work.exception();
})
.def(
"source_rank",
[](::c10d::ProcessGroup::Work& work) -> int {
TORCH_WARN_ONCE("ProcessGroup::Work::source_rank API is being "
"deprecated, please ping "
"https://github.com/pytorch/pytorch/issues/46291 "
"if you see this warning");
return work.sourceRank();
._DEPRECATED_DEF("is_completed", isCompleted)
._DEPRECATED_DEF("is_success", isSuccess)
._DEPRECATED_DEF("exception", exception)
._DEPRECATED_DEF("source_rank", sourceRank)
Another option is `constexpr` + `fmt::format`
added a macro
... and switched to fmt::format since clang-tidy doesn't like macros
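As a rough illustration of a macro-free way to avoid the duplicated string literals discussed in this thread, the sketch below wraps a member function pointer in a lambda and builds the message at run time. It is not the code that landed in this PR: `DemoWork`, `deprecationMessage`, and `withDeprecationWarning` are hypothetical names, plain `std::string` concatenation stands in for `fmt::format`, and the wrapper warns on every call rather than once. Because the return type is deduced from the member pointer, it avoids the hard-coded `-> int` of the macro suggestion, which would not fit the `bool` and `std::exception_ptr` bindings.

```cpp
#include <iostream>
#include <string>
#include <utility>

// Hypothetical stand-in for a Work-like object; this is NOT the c10d class.
struct DemoWork {
  bool isSuccess() const { return true; }
  int sourceRank() const { return 3; }
};

// Builds the shared deprecation text once, parameterized by the API name.
std::string deprecationMessage(const std::string& apiName) {
  return "ProcessGroup::Work::" + apiName +
      " API is being deprecated, please ping "
      "https://github.com/pytorch/pytorch/issues/46291 "
      "if you see this warning";
}

// Wraps a const member function so each call logs the message first.
// Ret and Cls are deduced from the member pointer (requires C++14).
template <typename Ret, typename Cls>
auto withDeprecationWarning(std::string apiName, Ret (Cls::*method)() const) {
  return [apiName = std::move(apiName), method](const Cls& obj) -> Ret {
    std::cerr << deprecationMessage(apiName) << '\n';
    return (obj.*method)();
  };
}
```

The returned lambda takes the object as its first argument, which is the same shape as the hand-written lambdas in the diff above, so in principle it could be passed to a binding call in their place.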
Differential Revision: [D24294437](https://our.internmc.facebook.com/intern/diff/D24294437) [ghstack-poisoned]
torch/csrc/distributed/c10d/init.cpp
Outdated
@@ -972,17 +972,44 @@ that adds a prefix to each key inserted to the store.
py::call_guard<py::gil_scoped_release>());
#endif

#define PROCESS_GROUP_DEPRECATION_WARNING(api_method) \
TORCH_WARN_ONCE(#api_method \
"API is being deprecated, please ping " \
Curious, do you need a space before "API"?
fixed
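To make the missing-space question concrete: the preprocessor's `#` operator turns the macro argument into a string literal, and adjacent string literals are concatenated with nothing in between. The sketch below shows both variants; the macro names `MESSAGE_WITHOUT_SPACE` and `MESSAGE_WITH_SPACE` and the shortened message are hypothetical and are not the macro from this PR.

```cpp
#include <string>

// Mirrors the shape under review: #api_method is followed directly by "API ...".
#define MESSAGE_WITHOUT_SPACE(api_method) #api_method "API is being deprecated"

// A leading space inside the second literal separates the two parts.
#define MESSAGE_WITH_SPACE(api_method) #api_method " API is being deprecated"

// Helper functions so the resulting text can be inspected as std::string.
std::string withoutSpace() { return MESSAGE_WITHOUT_SPACE(is_success); }
std::string withSpace() { return MESSAGE_WITH_SPACE(is_success); }
```

Here `withoutSpace()` yields `is_successAPI is being deprecated`, while `withSpace()` yields `is_success API is being deprecated`.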
ghstack-source-id: 278069176ca85555d2d3bc458ca399bee5721a54
Pull Request resolved: #46220

Add warnings on deprecated ProcessGroup::Work functionality
ghstack-source-id: 278069176ca85555d2d3bc458ca399bee5721a54
Pull Request resolved: #46294

fix formatting
ghstack-source-id: 278069176ca85555d2d3bc458ca399bee5721a54
Pull Request resolved: #46295
ghstack-source-id: ea13ae5b2b37f929408cff5ffb18995649342ef0
Pull Request resolved: #46220

Add warnings on deprecated ProcessGroup::Work functionality
ghstack-source-id: ea13ae5b2b37f929408cff5ffb18995649342ef0
Pull Request resolved: #46294

fix formatting
ghstack-source-id: ea13ae5b2b37f929408cff5ffb18995649342ef0
Pull Request resolved: #46295
Clang-tidy is broken now, fix is on the way:
Isn't the clang-tidy failure caused by this PR?
It should be fixed now (before tidy run)
ghstack-source-id: f5427d315d18dc2585d68a394f36409602bbc505 Pull Request resolved: #46220
@gmagogsfm merged this pull request in e7e919f.
Stack from ghstack:
Differential Revision: D24294437