[RPC Framework] Support remote device format "<workername>/<device>" #46773

Closed
wayi1 wants to merge 3 commits into gh/SciPioneer/18/base from gh/SciPioneer/18/head

@wayi1 wayi1 commented Oct 23, 2020

Changed the constructor of RemoteModule to accept a `remote_device` arg in the following format:
"<workername>/<device>" (e.g., "trainer0/cpu", "ps0/cuda:0")

This arg merges the original `on` and `device` args.
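
For illustration, here is a minimal sketch of how a "<workername>/<device>" string could be split back into the two values it replaces, and how the former `on` and `device` values could be joined into the new form. The helper names `parse_remote_device` and `format_remote_device` are hypothetical and are not taken from this PR; the strict validation (exactly one "/", both sides non-empty) is also an assumption, not the behavior of RemoteModule.

```python
from typing import Tuple


def parse_remote_device(remote_device: str) -> Tuple[str, str]:
    """Split "<workername>/<device>" into (workername, device).

    Hypothetical helper for illustration; not the PR's actual code.
    """
    parts = remote_device.split("/")
    # Assumption: exactly one "/" with a non-empty name on each side.
    if len(parts) != 2 or not parts[0] or not parts[1]:
        raise ValueError(
            f'expected "<workername>/<device>", got {remote_device!r}'
        )
    worker_name, device = parts
    return worker_name, device


def format_remote_device(on: str, device: str) -> str:
    """Join the former `on` and `device` values into one string."""
    return f"{on}/{device}"


print(parse_remote_device("trainer0/cpu"))    # ('trainer0', 'cpu')
print(parse_remote_device("ps0/cuda:0"))      # ('ps0', 'cuda:0')
print(format_remote_device("ps0", "cuda:0"))  # ps0/cuda:0
```

The colon in "cuda:0" is left untouched because only "/" is treated as a separator, so the device part can be passed on unchanged.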

Original PR issue: RemoteDevice Format #46554

Differential Revision: D24482562

@facebook-github-bot facebook-github-bot added the oncall: distributed Add this issue/PR to distributed oncall triage queue label Oct 23, 2020
wayi1 pushed a commit that referenced this pull request Oct 23, 2020
ghstack-source-id: 115042926
Pull Request resolved: #46773
@wayi1 wayi1 changed the title Support remote device format "<workername>/<device>". [RPC Framework] Support remote device format "<workername>/<device>" Oct 23, 2020

codecov bot commented Oct 23, 2020

Codecov Report

Merging #46773 into gh/SciPioneer/18/base will decrease coverage by 0.00%.
The diff coverage is 11.11%.

@@                    Coverage Diff                    @@
##           gh/SciPioneer/18/base   #46773      +/-   ##
=========================================================
- Coverage                  68.44%   68.43%   -0.01%     
=========================================================
  Files                        413      413              
  Lines                      54366    54368       +2     
=========================================================
- Hits                       37210    37209       -1     
- Misses                     17156    17159       +3     

dr-ci bot commented Oct 23, 2020

💊 CI failures summary and remediations

As of commit 3a89ecc (more details on the Dr. CI page):


  • 1/1 failures possibly* introduced in this PR
    • 1/1 non-CircleCI failure(s)

codecov.io: 1 failed


@pritamdamania87 pritamdamania87 commented

Would be nice to mention the original issue in the PR summary: #46554

@pritamdamania87 pritamdamania87 left a comment

Thanks for getting this PR up quickly!

wayi1 pushed a commit that referenced this pull request Oct 28, 2020
Pull Request resolved: #46773

ghstack-source-id: 115340442

Differential Revision: [D24482562](https://our.internmc.facebook.com/intern/diff/D24482562/)
Two review threads on torch/distributed/nn/api/remote_module.py (outdated, resolved)
wayi1 pushed a commit that referenced this pull request Oct 29, 2020
Pull Request resolved: #46773

Changed the constructor of RemoteModule to accept a `remote_device` arg in the following format:
"<workername>/<device>" (e.g., "trainer0/cpu", "ps0/cuda:0")

This arg merges the original `on` and `device` args.

Original PR issue: RemoteDevice Format #46554
ghstack-source-id: 115448051

Differential Revision: [D24482562](https://our.internmc.facebook.com/intern/diff/D24482562/)

wayi1 commented Oct 29, 2020

> Would be nice to mention the original issue in the PR summary: #46554

Thanks for the reminder!

@facebook-github-bot facebook-github-bot commented

This pull request has been merged in cab32d9.

@facebook-github-bot facebook-github-bot deleted the gh/SciPioneer/18/head branch November 1, 2020 15:16
Labels
Merged oncall: distributed Add this issue/PR to distributed oncall triage queue

3 participants