Move device type init from BackendSelect to backend kernels #37402


Closed
wants to merge 10 commits into from

Conversation


@bhosmer bhosmer commented Apr 28, 2020

Stack from ghstack:

Previously, BackendSelect kernels did just-in-time device type
initialization by calling LegacyTypeDispatch.initForDispatchKey()
with a computed dispatch key. Here we move the initialization to
the backend kernels themselves, where we can call the device-
specific initializer directly.
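As a rough sketch of the difference (the names `LegacyTypeDispatch`, `initForDispatchKey`, and `initCUDA` come from the PR; the Python structure, the toy key computation, and the kernel names `empty_backend_select` / `empty_cuda` are illustrative only, not PyTorch's actual generated code):

```python
class LegacyTypeDispatch:
    def __init__(self):
        self.initialized = set()

    # Old scheme: BackendSelect kernels computed a dispatch key and
    # routed through this generic, string/key-driven entry point.
    def initForDispatchKey(self, key):
        getattr(self, "init" + key)()

    # New scheme: backend kernels call these directly.
    def initCPU(self):
        self.initialized.add("CPU")

    def initCUDA(self):
        self.initialized.add("CUDA")

_dispatch = LegacyTypeDispatch()

def globalLegacyTypeDispatch():
    return _dispatch

def empty_backend_select(device):
    # Old: just-in-time init via a computed dispatch key.
    key = "CUDA" if device == "cuda" else "CPU"
    globalLegacyTypeDispatch().initForDispatchKey(key)
    # ...redispatch to the backend kernel...

def empty_cuda():
    # New: the backend kernel invokes its device-specific
    # initializer directly, with no key computation.
    globalLegacyTypeDispatch().initCUDA()
    # ...allocate the tensor...

empty_cuda()
assert "CUDA" in globalLegacyTypeDispatch().initialized
```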

**Note on HIPification**: this PR introduces direct calls to device-specific initializers in generated code; in particular, `globalLegacyTypeDispatch().initCUDA()` is called in factory kernels defined in `CUDAType.cpp`. No changes have been made to the conversion defined in https://github.com/pytorch/pytorch/tree/master/torch/utils/hipify and run by `build_amd.py`, so these calls remain in HIPified code. This isn't unusual: HIPified code contains unrenamed functions whose behavior has been repurposed. We rely on tests to verify that this call performs the correct initialization on HIPified builds.
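A toy illustration of that point (the two-entry rename table below is hypothetical and far smaller than hipify's real mapping; the aim is only to show that a call absent from the table, like `initCUDA()`, survives translation unchanged):

```python
# Hypothetical, drastically simplified rename table; real hipify
# applies many more mappings via regex-based source rewriting.
RENAMES = {"cudaMalloc": "hipMalloc", "cudaFree": "hipFree"}

def hipify(source):
    # Apply each rename; identifiers not in the table pass through.
    for cuda_name, hip_name in RENAMES.items():
        source = source.replace(cuda_name, hip_name)
    return source

src = "cudaMalloc(ptr); globalLegacyTypeDispatch().initCUDA();"
out = hipify(src)
# The allocation call is renamed, but initCUDA() is untouched --
# on a HIP build it is expected to perform the HIP initialization.
assert out == "hipMalloc(ptr); globalLegacyTypeDispatch().initCUDA();"
```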

Differential Revision: D21282974

Putting this up to run tests on it, but a couple questions remain:
* why were only BackendSelect kernels doing this initialization?
  Not all factory ops appear there, nor are all the ops that do
  appear there factory ops. Currently we generate init code for
  exactly the BackendSelect ops, but the choice should be better
  motivated.
* the previous scheme maps HIP to its own legacy type dispatch
  entry, but the logic assumes it's exclusive with CUDA, and no
  ops appear to mention HIP explicitly, so the new logic doesn't
  expose a static entry point for it. Needs to be verified.

bhosmer pushed a commit that referenced this pull request Apr 28, 2020
ghstack-source-id: 51d4dca
Pull Request resolved: #37402
dr-ci bot commented Apr 28, 2020

💊 Build failures summary and remediations

As of commit 4a27ab6 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚



bhosmer pushed a commit that referenced this pull request Apr 28, 2020
ghstack-source-id: 9b8000a
Pull Request resolved: #37402
bhosmer pushed a commit that referenced this pull request Apr 29, 2020
ghstack-source-id: ec9f3d3
Pull Request resolved: #37402
bhosmer pushed a commit that referenced this pull request Apr 29, 2020
ghstack-source-id: e574477
Pull Request resolved: #37402
bhosmer pushed a commit that referenced this pull request May 1, 2020
ghstack-source-id: fa2bbeb
Pull Request resolved: #37402
@bhosmer bhosmer requested a review from smessmer May 1, 2020 23:05
bhosmer pushed a commit that referenced this pull request May 3, 2020
ghstack-source-id: 829336a
Pull Request resolved: #37402
@bhosmer bhosmer requested a review from ezyang May 4, 2020 16:21
bhosmer pushed a commit that referenced this pull request May 4, 2020
ghstack-source-id: e91e8f7
Pull Request resolved: #37402
@@ -20,26 +20,26 @@ namespace at {

class CAFFE2_API LegacyTypeDispatch {
Contributor

Probably an even further improvement (which is not for this PR) would be to eliminate the dynamic dispatch indirection, now that the initializations are being done from code that has direct access to it (i.e., is not from `torch_cpu`).

def is_factory(option):
# type: (FunctionOption) -> bool
formals = option['formals_list']
return find_formal_by_type('TensorOptions', formals) is not None and 'method' not in option['variants']
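To make the gating concrete, here is a rough usage sketch of `is_factory` (the option dicts and the inline `find_formal_by_type` below are simplified stand-ins for the real codegen structures, which carry many more fields):

```python
def find_formal_by_type(type_name, formals):
    # Simplified stand-in: formals are plain dicts here, not the
    # codegen's real formal records.
    for formal in formals:
        if formal.get('type') == type_name:
            return formal
    return None

def is_factory(option):
    # A factory op takes TensorOptions and has no method variant.
    formals = option['formals_list']
    return (find_formal_by_type('TensorOptions', formals) is not None
            and 'method' not in option['variants'])

# Hypothetical, pared-down option entries for illustration.
empty_op = {
    'formals_list': [{'name': 'size', 'type': 'IntArrayRef'},
                     {'name': 'options', 'type': 'TensorOptions'}],
    'variants': ['function'],
}
add_op = {
    'formals_list': [{'name': 'self', 'type': 'Tensor'},
                     {'name': 'other', 'type': 'Tensor'}],
    'variants': ['function', 'method'],
}

assert is_factory(empty_op)      # TensorOptions formal, no method variant
assert not is_factory(add_op)    # no TensorOptions formal
```

Ops classified this way would be the ones for which the device-init call is emitted into the backend kernel.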
Contributor


I guess these are all non-substantive refactors?

Author


Yeah, sorry about the noisy diff. This just factors out the old RHS of `is_factory_method` (line 1224) so it can be used to gate the device init code as well. `find_formal_by_name` is just the old `find_formal`, moved up and out to accompany it.

@facebook-github-bot
Contributor

@bhosmer merged this pull request in 209c6f9.

ShawnZhong pushed a commit to ShawnZhong/pytorch that referenced this pull request May 5, 2020
Test Plan: Imported from OSS

Differential Revision: D21282974

Pulled By: bhosmer

fbshipit-source-id: cd46eb788596948e0572a15fac0f8b43feca5d75
bharatr21 pushed a commit to bharatr21/pytorch that referenced this pull request May 5, 2020
@facebook-github-bot facebook-github-bot deleted the gh/bhosmer/27/head branch May 8, 2020 14:16