Make codegen backend agnostic minus fallbacks #2944
Conversation
Force-pushed from 535ff1a to 6e49287
Force-pushed from e2fe1f4 to 1db52f3
Force-pushed from 6e49287 to fc9f54d
…nus CPU fallback)"

**Summary**

This PR tries to remove all xla-specific logic from the codegen except for two places:

- renaming the `aten_xla_type.h/cpp` template files; going to do that in a separate PR just to make the diff easier to understand
- CPU fallback logic (everything in `aten_xla_type_default.h/cpp` and `gen_external_aten_fallbacks.py`). I'm trying to kill all of that logic in a subsequent PR by making the CPU fallback a boxed kernel (sketched below), so it felt unnecessary to go through it all and remove the xla references here.

**Notable changes**

The xla codegen includes some custom logging in each kernel wrapper, so I added a few new knobs to the external yaml, which we now test. I have a corresponding [xla-side PR](pytorch/xla#2944) with the new yaml changes, which look like this:

```
per_op_log: XLA_FN_TRACK(3)
per_argument_log: TF_VLOG(3)
cpu_fallback_counter: XLA_COUNTER("aten::{name}", 1)
extra_headers: >
  #include <tensorflow/compiler/xla/xla_client/debug_macros.h>
  #include <tensorflow/compiler/xla/xla_client/metrics.h>
  #include <tensorflow/compiler/xla/xla_client/tf_logging.h>
  #include <torch_xla/csrc/function_call_tracker.h>
  #include <torch_xla/csrc/aten_xla_type.h>
  #include <torch_xla/csrc/aten_xla_type_default.h>
```

[ghstack-poisoned]
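For context on the boxed-kernel plan above, here is a minimal sketch of what a single boxed CPU fallback can look like, assuming only the public PyTorch dispatcher APIs; `xla_cpu_fallback` and its body are illustrative stand-ins, not the actual torch_xla implementation.

```cpp
#include <ATen/core/dispatch/Dispatcher.h>
#include <torch/library.h>

// Hypothetical boxed fallback: one kernel handles every op the backend has not
// explicitly lowered, instead of one generated wrapper per op.
void xla_cpu_fallback(const c10::OperatorHandle& op, torch::jit::Stack* stack) {
  // A real fallback would move XLA tensors to CPU, redispatch the op there,
  // and copy the results back; here we only log and redispatch to CPU as-is.
  TORCH_WARN("CPU fallback for ", op.schema().name());
  op.redispatchBoxed(c10::DispatchKeySet(c10::DispatchKey::CPU), stack);
}

// Registering one boxed kernel as the fallback for the entire XLA dispatch key
// is what would replace the per-op logic in aten_xla_type_default.h/cpp.
TORCH_LIBRARY_IMPL(_, XLA, m) {
  m.fallback(torch::CppFunction::makeFromBoxedFunction<&xla_cpu_fallback>());
}
```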
- convolution_overrideable
- convolution_backward_overrideable
- _copy_from
- _copy_from_and_resize
Huh, not actually sure why these are showing up in the diff, since they're already there in master.
fixed
Force-pushed from c8c7681 to d2ba37d
`xla_native_functions.yaml` (outdated)
backend: XLA
cpp_namespace: torch_xla
per_op_log: XLA_FN_TRACK(3)
per_argument_log: TF_VLOG(3)
@ailzhang @JackCaoG I'm actually hoping that I can just completely remove this, but I wanted to confirm with you. Right now, the auto-generated `add_out()` kernel for XLA has the same logging behavior as the original XLA codegen, and looks like this:
at::Tensor wrapper_Tensor_add(const at::Tensor &self, const at::Tensor &other,
const at::Scalar &alpha) {
// This is the actual XLA add kernel
return torch_xla::add(self, other, alpha);
}
} // anonymous namespace
at::Tensor &wrapper_out_add_out(const at::Tensor &self, const at::Tensor &other,
const at::Scalar &alpha, at::Tensor &out) {
XLA_FN_TRACK(3);
TF_VLOG(3) << "XLA wrapper_out_add_out :"
<< " self=" << self.toString() << " other=" << other.toString()
<< " out=" << out.toString();
auto wrapper_out_add_out_tmp = wrapper_Tensor_add(self, other, alpha);
at::_copy_from_and_resize(wrapper_out_add_out_tmp, out);
return out;
}
That logging information in the generated `add_out` kernel seems unnecessary though, since it ends up calling directly into `torch_xla::add`, which has its own logging too.
Passing C++ macros directly into yaml so the codegen can plop them into a C++ function is also pretty fragile; what if another backend wants some other, more custom logging?
So I think it would be cleaner to kill that code-generated logging and just expect all logging to be done in the functional kernel lowerings, but I wanted to see what your thoughts are!
I agree that the auto-generated `add_out()` should not have logging and a counter if the actual kernel already has the logging and counter.
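To make that concrete, this is roughly what the generated `out=` wrapper quoted above would reduce to once the codegen'd logging is dropped; it reuses `wrapper_Tensor_add` from the snippet above and is an illustration, not the exact generated output.

```cpp
// Same wrapper as above, minus XLA_FN_TRACK / TF_VLOG: it only adapts the out=
// signature onto the functional kernel, which already does its own logging.
at::Tensor &wrapper_out_add_out(const at::Tensor &self, const at::Tensor &other,
                                const at::Scalar &alpha, at::Tensor &out) {
  auto tmp = wrapper_Tensor_add(self, other, alpha);
  at::_copy_from_and_resize(tmp, out);
  return out;
}
```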
Mostly LGTM, some minor nits.
`OP_LOWERING_GUIDE.md` (outdated)
2. `aten_xla_type.h/.cpp` are entry points of PyTorch to the pytorch_xla world. `aten_xla_type.h` is auto-generated through a combination of `xla_native_functions.yaml` and the PyTorch core `native_functions.yaml` file, and contains declarations for kernels that need to be defined in `aten_xla_type.cpp`. The kernels written here need to construct 'XLATensor' using the input `at::Tensor` and other parameters. The resulting `XLATensor` needs to be converted back to the `at::Tensor` before returning to the PyTorch world.
1. `xla_native_functions.yaml` contains the list of all operators that are lowered. Each operator name must directly match a pytorch operator listed in [native_functions.yaml](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/native_functions.yaml). This file serves as the interface to adding new xla operators, and is an input to PyTorch's [codegen machinery](https://github.com/pytorch/pytorch/blob/master/tools/codegen/gen_backend_stubs.py). It generates the below 3 files: `XLANativeFunctions.h`, `aten_xla_type_default.h`, and `aten_xla_type_default.cpp`
2. `XLANativeFunctions.h` and `aten_xla_type.cpp` are entry points of PyTorch to the pytorch_xla world, and contain the manually written lowerings to XLA for each operator. `XLANativeFunctions.h` is auto-generated through a combination of `xla_native_functions.yaml` and the PyTorch core `native_functions.yaml` file, and contains declarations for kernels that need to be defined in `aten_xla_type.cpp`. The kernels written here need to construct 'XLATensor' using the input `at::Tensor` and other parameters. The resulting `XLATensor` needs to be converted back to the `at::Tensor` before returning to the PyTorch world.
2. `RegisterXLA.cpp` and `RegisterAutogradXLA.cpp` are auto-generated files that register all lowerings to the PyTorch Dispatcher. They also include auto-generated wrapper implementations of `out=` and `inplace` operators.
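As a rough sketch of the two pieces described above — a handwritten lowering that PyTorch dispatches into, and the auto-generated `RegisterXLA.cpp` entry that binds it to the XLA dispatch key — consider the following; `my_xla_add` is a hypothetical stand-in, and a real lowering would go through `XLATensor` rather than the placeholder body shown here.

```cpp
#include <ATen/ATen.h>
#include <torch/library.h>

// Stand-in for a handwritten lowering in aten_xla_type.cpp. A real kernel would
// unwrap the at::Tensor inputs into XLATensors, build the XLA lowering, and
// wrap the result back into an at::Tensor before returning to PyTorch.
at::Tensor my_xla_add(const at::Tensor &self, const at::Tensor &other,
                      const at::Scalar &alpha) {
  return at::add(self.cpu(), other.cpu(), alpha);  // placeholder body only
}

// Roughly what the generated RegisterXLA.cpp does: register the kernel with the
// PyTorch dispatcher under the XLA dispatch key.
TORCH_LIBRARY_IMPL(aten, XLA, m) {
  m.impl("add.Tensor", TORCH_FN(my_xla_add));
}
```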
Could you change `2.` to `3.` and update the order below.
`OP_LOWERING_GUIDE.md` (outdated)
2. `RegisterXLA.cpp` and `RegisterAutogradXLA.cpp` are auto-generated files that register all lowerings to the PyTorch Dispatcher. They also include auto-generated wrapper implementations of `out=` and `inplace` operators.
3. `aten_xla_type_default.h/.cpp` are also auto-generated, and contain our default implementation of the PyTorch operations which simply fall back to the underlying CPU implementation. Functions in here will be used if lowering is not explicitly defined in `xla_native_functions.yaml` + `aten_xla_type.cpp`.
4. `tensor.h` contains the `XLATensor` declarations. These declarations are one to one mapping of the `at::Tensor` nodes we declared in `aten_xla_type.h`
4. `tensor.h` contains the `XLATensor` declarations. These declarations are one to one mapping of the `at::Tensor` nodes we declared in `XLANativeFunctions.h`
Not directly related to your change, but could you update "These declarations are one to one mapping" to something similar to "These declarations are usually one to one mapping"? We sometimes reuse `XLATensor` for multiple AtenXlaType functions.
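A small sketch of that "usually one to one, but sometimes reused" point, assuming a hypothetical shared helper (in torch_xla the shared piece would be an `XLATensor` method rather than the `at::` call used here):

```cpp
#include <ATen/ATen.h>

// Hypothetical shared lowering reused by both aten-facing entry points.
static at::Tensor lowered_add(const at::Tensor &self, const at::Tensor &other,
                              const at::Scalar &alpha) {
  return at::add(self, other, alpha);  // placeholder for the XLATensor lowering
}

at::Tensor add(const at::Tensor &self, const at::Tensor &other,
               const at::Scalar &alpha) {
  return lowered_add(self, other, alpha);
}

at::Tensor &add_(at::Tensor &self, const at::Tensor &other,
                 const at::Scalar &alpha) {
  self.copy_(lowered_add(self, other, alpha));  // in-place variant reuses it
  return self;
}
```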
Fixed the docs. I also removed the logging from the generated inplace/out wrappers, which simplifies some of the codegen and makes the yaml more robust.
Thanks @bdhirsh!
…e kernels when possible
Force-pushed from 49791eb to c223040
Summary: Pull Request resolved: #58064
Test Plan: Imported from OSS
Reviewed By: anjali411
Differential Revision: D28711095
Pulled By: bdhirsh
fbshipit-source-id: 90a48440f2e865a948184e2fb167ea240ada47bb
This PR removes all xla-specific logic from the pytorch codegen, except for the CPU fallback logic (being worked on in a later PR here). The corresponding pytorch-side changes are pytorch/pytorch#58064 and pytorch/pytorch#58568.

Main changes:

- Renamed `aten_xla_type.h` -> `XLANativeFunctions.h`, which is a little more in line with how we name other in-tree generated files. Also updated the corresponding documentation.
- Removed `InitializeAtenBindings()`, mostly because it looks like legacy (it's a no-op), and it would be an extra hoop to force every external backend to implement the same function.
- Removed the `AtenXlaType` class. XLA lowerings are now just defined under `torch_xla::{op}` (see the sketch after this list).
- Removed the logging from the auto-generated `out`/`inplace` wrappers, but those wrappers just call into the handwritten xla functional lowering, which also has its own logging information.
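As referenced in the list above, a minimal sketch of the `AtenXlaType`-class-to-free-function change; the signatures here are illustrative, and the authoritative declarations are the ones generated into `XLANativeFunctions.h`.

```cpp
#include <ATen/ATen.h>

namespace torch_xla {

// Before: lowerings were declared as static methods on a class, e.g.
//   struct AtenXlaType {
//     static at::Tensor add(const at::Tensor &self, const at::Tensor &other,
//                           const at::Scalar &alpha);
//   };

// After: the same lowering is just a free function in the torch_xla namespace,
// matching the declaration code-generated into XLANativeFunctions.h.
at::Tensor add(const at::Tensor &self, const at::Tensor &other,
               const at::Scalar &alpha);

}  // namespace torch_xla
```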