convert output_device at data_parallel from torch.device to index #10189

Closed · wants to merge 1 commit

Conversation

weiyangfb (Contributor)

@weiyangfb (Contributor, Author)

@pytorchbot retest this please

@vishwakftw (Contributor)

I believe there are some instances of the same case in nn/parallel/distributed.py and nn/parallel/distributed_c10d.py. Could those be changed too?

@weiyangfb (Contributor, Author)

@vishwakftw I see, I will change them as well

test/test_nn.py (outdated):
# test output_device
l = nn.Linear(10, 5).float().cuda()
i = Variable(torch.randn(20, 10).float().cuda())
out = dp.data_parallel(l, i, (0, 1), torch.device('cuda'))

weiyangfb added the "ready for review" label on Aug 14, 2018
@li-roy (Contributor) left a comment:

This looks good to me.

@li-roy (Contributor) commented on Aug 16, 2018:

actually, can we add a test for the other two code paths as well?

@ssnl (Collaborator) left a comment:

The better fix is to make scatter, gather, parallel_apply, etc. accept device objects (instead of converting to an index in DataParallel). You can also make device_ids support device objects this way.
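
As a rough sketch of that suggestion (the helper name _to_cuda_index and its exact behavior are hypothetical, not code from this PR), accepting device objects could amount to normalizing them up front:

import torch

def _to_cuda_index(device):
    # Hypothetical helper: accept an int, a 'cuda:N' string, or a
    # torch.device and return a plain integer CUDA device index.
    if isinstance(device, str):
        device = torch.device(device)
    if isinstance(device, torch.device):
        if device.type != 'cuda':
            raise ValueError('expected a CUDA device, got {!r}'.format(device))
        # torch.device('cuda') carries no index; fall back to the current device
        return device.index if device.index is not None else torch.cuda.current_device()
    return int(device)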

@fmassa (Member) commented on Aug 28, 2018:

Ping @weiyangfb on @ssnl's suggestion.

@ssnl (Collaborator) commented on Aug 28, 2018:

You can probably use/adapt torch.cuda._get_device_index to do that now, after #10833.
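
For reference, a rough illustration of how that helper behaves (torch.cuda._get_device_index is a private utility, so the exact signature may differ across versions; the second positional argument below is assumed to mean optional=True, letting an index-less device resolve to the current device):

import torch
from torch.cuda._utils import _get_device_index

_get_device_index(1)                           # plain integer indices pass through -> 1
_get_device_index(torch.device('cuda:1'))      # device objects are unwrapped -> 1
_get_device_index(torch.device('cuda'), True)  # no index given: resolves to the current device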

@facebook-github-bot (Contributor) left a comment:

weiyangfb has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@weiyangfb (Contributor, Author)

is this good to go? @ssnl

@weiyangfb (Contributor, Author)

is this good to go? @ssnl @teng-li @ailzhang

@@ -36,7 +37,7 @@ def parallel_apply(modules, inputs, kwargs_tup=None, devices=None):
assert len(modules) == len(devices)
else:
devices = [None] * len(modules)

devices = list(map(lambda x: _get_device_index(x, True), devices))

test/test_nn.py (outdated):
@@ -2922,6 +2922,18 @@ def test_data_parallel_small_back(self):
out = dp.data_parallel(l, i, (0, 1))
self.assertEqual(out, l(i))

# test output_device
l = nn.Linear(10, 5).float().cuda()

device_ids: CUDA devices (default: all devices)
output_device: device location of output (default: device_ids[0])
module (Module): module to be parallelized
device_ids (list of int or Device): CUDA devices (default: all devices)
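
After this change, data_parallel should accept torch.device objects as well as plain indices for both device_ids and output_device; a minimal sketch of a call, assuming at least two visible GPUs:

import torch
import torch.nn as nn
from torch.nn.parallel import data_parallel

module = nn.Linear(10, 5).cuda()
inputs = torch.randn(20, 10).cuda()

# indices and torch.device objects can be mixed; both are normalized internally
out = data_parallel(module, inputs,
                    device_ids=[torch.device('cuda:0'), 1],
                    output_device=torch.device('cuda:0'))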

@@ -1,10 +1,12 @@
import torch.cuda.comm as comm
from torch.cuda._utils import _get_device_index


def replicate(network, devices, detach=False):
from ._functions import Broadcast

devices = tuple(devices)

@weiyangfb (Contributor, Author)

@ssnl would you like to take a quick pass on this? The updates are a separate test function and doc fixes. Thanks!

@ssnl (Collaborator) left a comment:

lgtm, but there is one remaining nit to be addressed

@weiyangfb (Contributor, Author)

@ssnl I see, fixed more places with Device -> torch.device

Commit: …d APIs

1. convert torch.device to device.index in APIs
2. docs fixes
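
For context, the conversion referred to here boils down to reading the .index attribute of a torch.device, which is None when no index is given:

import torch

assert torch.device('cuda:1').index == 1    # explicit index carried on the device object
assert torch.device('cuda').index is None   # index-less device; callers fall back to the current device
assert torch.device('cuda:1').type == 'cuda'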

Labels
ready for review
Development

Successfully merging this pull request may close these issues.

torch.device and torch.nn.parallel.data_parallel compatibility
9 participants