
Conversation

@ezyang (Contributor) commented May 8, 2022

Stack from ghstack (oldest at bottom):

Prior to this PR, we had a mish-mash of ways of getting unconventional
sizes/strides behavior:

  • In OSS (but not in fbcode), some methods are virtual and you
    can override them directly

  • There is an is_contiguous policy, a bitfield tag that, when set,
    redirects is_contiguous() either to raise an error or to call the
    virtual method is_contiguous_custom(). Ordinarily is_contiguous()
    is virtual and you can just override it, but the policy works EVEN
    IF is_contiguous() is non-virtual (e.g., in fbcode)

  • There is also a sizes policy, which is the same idea but for sizes

This PR unifies these mechanisms, and in doing so, eliminates the
maybe-virtual/non-virtual split of the methods in question. The primary
downside of this change is that it is BC-breaking (but the BC break is
very easy to fix!). A sketch of the fix is below.
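
To make the fix concrete, here is a minimal, hedged sketch of the
migration for a hypothetical subclass that used to override
is_contiguous() directly; the names follow the scheme described below,
but treat the exact constructor signature as illustrative:

```cpp
#include <c10/core/TensorImpl.h>
#include <c10/util/Exception.h>

// Hedged sketch of the BC-break fix; not a definitive implementation.
struct MyTensorImpl : public c10::TensorImpl {
  MyTensorImpl(c10::DispatchKeySet ks, const caffe2::TypeMeta& dtype,
               c10::optional<c10::Device> device)
      : c10::TensorImpl(ks, dtype, device) {
    // Opt in to the CustomStrides level so the _custom() hooks get called.
    set_custom_sizes_strides(SizesStridesPolicy::CustomStrides);
  }

  // Before this PR (no longer compiles once the base accessor is
  // unconditionally non-virtual):
  //   bool is_contiguous(at::MemoryFormat mf) const override { ... }

  // After this PR: override the always-virtual _custom() variant instead.
  bool is_contiguous_custom(at::MemoryFormat memory_format) const override {
    TORCH_CHECK(false, "MyTensorImpl does not support is_contiguous");
  }
};
```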

The new scheme works like this: we have three levels of policy for
sizes/strides (order matters); a sketch of the resulting dispatch
follows the list.

  • The Default policy is a conventional dense tensor, where we use
    all of the built-in fields to directly represent the
    sizes/strides/numel/contiguity of the tensor, and it is possible
    to bypass virtual calls entirely.

  • The CustomStrides policy represents tensors which have a custom
    notion of strides (most typically, that they don't support them),
    shunting strides() and is_contiguous() to virtual methods
    strides_custom() and is_contiguous_custom(). This INCLUDES handling
    for contiguity, since strides and contiguity typically go
    hand-in-hand (although the situation is murky with batched
    tensors). The default implementations of these functions raise
    errors saying the tensor doesn't support them.

  • The CustomSizes policy represents tensors which have a custom
    notion of sizes (the two notable examples are nested tensor, which
    doesn't have a representation of sizes in the conventional form, and
    XLA/LTC tensor, which synchronizes its sizes with an underlying
    compiler backend). This shunts sizes(), numel() and dim() (along
    with everything covered by CustomStrides) to _custom() variants.
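
Not the literal TensorImpl code, but a self-contained sketch of the
dispatch pattern these three levels imply; "order matters" shows up as
an ordered comparison on the policy values:

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Sketch only: the public accessor is non-virtual, reads a small policy
// tag, and pays a virtual call only when the tag demands it.
enum class SizesStridesPolicy : uint8_t {
  Default = 0,       // dense: read inline fields, no virtual dispatch
  CustomStrides = 1, // strides()/is_contiguous() hit the _custom() hooks
  CustomSizes = 2,   // sizes()/numel()/dim() too (subsumes CustomStrides)
};

struct TensorImplSketch {
  std::vector<int64_t> sizes_;  // a "built-in field" used on the fast path
  SizesStridesPolicy policy_ = SizesStridesPolicy::Default;

  // Non-virtual: Default-policy tensors never make a virtual call here.
  const std::vector<int64_t>& sizes() const {
    if (policy_ >= SizesStridesPolicy::CustomSizes) {
      return sizes_custom();  // virtual hook; subclass may sync or throw
    }
    return sizes_default();
  }

  const std::vector<int64_t>& sizes_default() const { return sizes_; }

  // "No special policy for erroring": the default hook just throws.
  virtual const std::vector<int64_t>& sizes_custom() const {
    throw std::runtime_error("this tensor does not support sizes()");
  }

  virtual ~TensorImplSketch() = default;
};
```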

There is no special policy for erroring; instead, we just do a vcall
and expect the virtual method to raise an exception (the performance
hit from the vcall doesn't matter because you're about to raise a C++
exception anyway). The default implementations of all overridable
functions are available as _default() variants, which is helpful in
situations where you just want to do a "sync" and then run the
conventional semantics, as sketched below.
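
Building on the sketch above, the "sync then run conventional
semantics" pattern for an XLA/LTC-style subclass might look like this;
sync_from_backend() is a hypothetical stand-in for the real backend
query:

```cpp
// Continues TensorImplSketch from the previous sketch.
struct LazyTensorImplSketch : TensorImplSketch {
  LazyTensorImplSketch() { policy_ = SizesStridesPolicy::CustomSizes; }

  const std::vector<int64_t>& sizes_custom() const override {
    // "Sync" first, then fall through to the stock default behavior.
    const_cast<LazyTensorImplSketch*>(this)->sync_from_backend();
    return sizes_default();
  }

  void sync_from_backend() {
    sizes_ = {2, 3};  // hypothetical: refresh the inline field from the backend
  }
};
```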

This PR could be extended further in two ways, but I did not do so
due to time constraints:

  • Ideally, all TENSORIMPL_MAYBE_VIRTUAL would be eliminated from
    TensorImpl, by using the same policy trick.

  • set_size and set_stride are still virtual; it's not entirely clear
    that the same trick should be used here, though, as these methods
    are deprecated.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

@facebook-github-bot (Contributor) commented May 8, 2022

❌ 2 New Failures

As of commit ba86a68 (more details on the Dr. CI page):

  • 2/2 failures introduced in this PR

🕵️ 2 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages

See GitHub Actions build pull / linux-xenial-py3.7-gcc5.4 / test (backwards_compat, 1, 1, linux.2xlarge) (1/2)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-05-08T19:15:24.0983309Z processing existing schema:  text(__torch__.torch.classes.profiling.SourceRef _0) -> (str _0)
2022-05-08T19:15:24.0984718Z processing existing schema:  count(__torch__.torch.classes.profiling.InstructionStats _0) -> (int _0)
2022-05-08T19:15:24.0986156Z processing existing schema:  duration_ns(__torch__.torch.classes.profiling.InstructionStats _0) -> (int _0)
2022-05-08T19:15:24.0987623Z processing existing schema:  source(__torch__.torch.classes.profiling.SourceStats _0) -> (__torch__.torch.classes.profiling.SourceRef _0)
2022-05-08T19:15:24.0989571Z processing existing schema:  line_map(__torch__.torch.classes.profiling.SourceStats _0) -> (Dict(int, __torch__.torch.classes.profiling.InstructionStats) _0)
2022-05-08T19:15:24.0991030Z processing existing schema:  __init__(__torch__.torch.classes.profiling._ScriptProfile _0) -> (NoneType _0)
2022-05-08T19:15:24.0992370Z processing existing schema:  enable(__torch__.torch.classes.profiling._ScriptProfile _0) -> (NoneType _0)
2022-05-08T19:15:24.0993748Z processing existing schema:  disable(__torch__.torch.classes.profiling._ScriptProfile _0) -> (NoneType _0)
2022-05-08T19:15:24.0995664Z processing existing schema:  _dump_stats(__torch__.torch.classes.profiling._ScriptProfile _0) -> (__torch__.torch.classes.profiling.SourceStats[] _0)
2022-05-08T19:15:24.0997085Z processing existing schema:  __init__(__torch__.torch.classes.dist_rpc.WorkerInfo _0, str _1, int _2) -> (NoneType _0)
2022-05-08T19:15:24.0997509Z The PR is introducing backward incompatible changes to the operator library. Please contact PyTorch team to confirm whether this change is wanted or not. 
2022-05-08T19:15:24.0997532Z 
2022-05-08T19:15:24.0997668Z Broken ops: [
2022-05-08T19:15:24.0998125Z 	aten::set_.source_Tensor_storage_offset(Tensor(a!) self, Tensor source, int storage_offset, int[] size, int[] stride=[]) -> (Tensor(a!))
2022-05-08T19:15:24.0998175Z ]
2022-05-08T19:15:24.2022944Z + cleanup
2022-05-08T19:15:24.2023067Z + retcode=1
2022-05-08T19:15:24.2023139Z + set +x
2022-05-08T19:15:24.2054458Z ##[error]Process completed with exit code 1.
2022-05-08T19:15:24.2092165Z ##[group]Run pytorch/pytorch/.github/actions/get-workflow-job-id@master
2022-05-08T19:15:24.2092236Z with:

See GitHub Actions build pull / pytorch-xla-linux-bionic-py3.7-clang8 / test (xla, 1, 1, linux.2xlarge) (2/2)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-05-08T20:07:45.1950269Z /var/lib/jenkins/workspace/xla/torch_xla/csrc/tensor_impl.h:35:23: error: only virtual member functions can be marked 'override'
2022-05-08T20:07:45.1950793Z   int64_t dim() const override;
2022-05-08T20:07:45.1951087Z                       ^~~~~~~~
2022-05-08T20:07:45.1951751Z /var/lib/jenkins/workspace/xla/torch_xla/csrc/tensor_impl.h:37:25: error: only virtual member functions can be marked 'override'
2022-05-08T20:07:45.1952246Z   int64_t numel() const override;
2022-05-08T20:07:45.1952699Z                         ^~~~~~~~
2022-05-08T20:07:45.1953367Z /var/lib/jenkins/workspace/xla/torch_xla/csrc/tensor_impl.h:39:60: error: only virtual member functions can be marked 'override'
2022-05-08T20:07:45.1953955Z   bool is_contiguous(at::MemoryFormat memory_format) const override;
2022-05-08T20:07:45.1954373Z                                                            ^~~~~~~~
2022-05-08T20:07:45.1955046Z /var/lib/jenkins/workspace/xla/torch_xla/csrc/tensor_impl.h:41:33: error: only virtual member functions can be marked 'override'
2022-05-08T20:07:45.1955546Z   int64_t size(int64_t d) const override;
2022-05-08T20:07:45.1955866Z                                 ^~~~~~~~
2022-05-08T20:07:45.1956164Z 5 errors generated.
2022-05-08T20:07:45.7850047Z [9/179] clang++-8 -MMD -MF /var/lib/jenkins/workspace/xla/build/temp.linux-x86_64-3.7/torch_xla/csrc/computation.o.d -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -D_GLIBCXX_USE_CXX11_ABI=1 -fPIC -I/var/lib/jenkins/workspace/xla -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-bin -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/protobuf_archive/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_protobuf/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/eigen_archive -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_absl -I/var/lib/jenkins/workspace -I/var/lib/jenkins/workspace/torch/csrc -I/var/lib/jenkins/workspace/torch/lib/tmp_install/include -I/opt/conda/lib/python3.7/site-packages/torch/include -I/opt/conda/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.7/site-packages/torch/include/TH -I/opt/conda/lib/python3.7/site-packages/torch/include/THC -I/opt/conda/include/python3.7m -c -c /var/lib/jenkins/workspace/xla/torch_xla/csrc/computation.cpp -o /var/lib/jenkins/workspace/xla/build/temp.linux-x86_64-3.7/torch_xla/csrc/computation.o -std=c++14 -Wno-sign-compare -Wno-deprecated-declarations -Wno-return-type -Wno-macro-redefined -Wno-return-std-move -DNDEBUG -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_clang"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1002"' -DTORCH_EXTENSION_NAME=_XLAC -D_GLIBCXX_USE_CXX11_ABI=1
2022-05-08T20:07:47.0956043Z [10/179] clang++-8 -MMD -MF /var/lib/jenkins/workspace/xla/build/temp.linux-x86_64-3.7/torch_xla/csrc/convert_ops.o.d -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -D_GLIBCXX_USE_CXX11_ABI=1 -fPIC -I/var/lib/jenkins/workspace/xla -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-bin -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/protobuf_archive/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_protobuf/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/eigen_archive -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_absl -I/var/lib/jenkins/workspace -I/var/lib/jenkins/workspace/torch/csrc -I/var/lib/jenkins/workspace/torch/lib/tmp_install/include -I/opt/conda/lib/python3.7/site-packages/torch/include -I/opt/conda/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.7/site-packages/torch/include/TH -I/opt/conda/lib/python3.7/site-packages/torch/include/THC -I/opt/conda/include/python3.7m -c -c /var/lib/jenkins/workspace/xla/torch_xla/csrc/convert_ops.cpp -o /var/lib/jenkins/workspace/xla/build/temp.linux-x86_64-3.7/torch_xla/csrc/convert_ops.o -std=c++14 -Wno-sign-compare -Wno-deprecated-declarations -Wno-return-type -Wno-macro-redefined -Wno-return-std-move -DNDEBUG -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_clang"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1002"' -DTORCH_EXTENSION_NAME=_XLAC -D_GLIBCXX_USE_CXX11_ABI=1
2022-05-08T20:08:09.0503967Z [11/179] clang++-8 -MMD -MF /var/lib/jenkins/workspace/xla/build/temp.linux-x86_64-3.7/torch_xla/csrc/RegisterXLA.o.d -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -D_GLIBCXX_USE_CXX11_ABI=1 -fPIC -I/var/lib/jenkins/workspace/xla -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-bin -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/protobuf_archive/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_protobuf/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/eigen_archive -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_absl -I/var/lib/jenkins/workspace -I/var/lib/jenkins/workspace/torch/csrc -I/var/lib/jenkins/workspace/torch/lib/tmp_install/include -I/opt/conda/lib/python3.7/site-packages/torch/include -I/opt/conda/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.7/site-packages/torch/include/TH -I/opt/conda/lib/python3.7/site-packages/torch/include/THC -I/opt/conda/include/python3.7m -c -c /var/lib/jenkins/workspace/xla/torch_xla/csrc/RegisterXLA.cpp -o /var/lib/jenkins/workspace/xla/build/temp.linux-x86_64-3.7/torch_xla/csrc/RegisterXLA.o -std=c++14 -Wno-sign-compare -Wno-deprecated-declarations -Wno-return-type -Wno-macro-redefined -Wno-return-std-move -DNDEBUG -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_clang"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1002"' -DTORCH_EXTENSION_NAME=_XLAC -D_GLIBCXX_USE_CXX11_ABI=1
2022-05-08T20:08:09.0506300Z /var/lib/jenkins/workspace/xla/torch_xla/csrc/RegisterXLA.cpp:63:6: warning: unused function 'resize_out' [-Wunused-function]
2022-05-08T20:08:09.0506734Z void resize_out(const Tensor &out, IntArrayRef sizes, IntArrayRef strides, const TensorOptions &options) {
2022-05-08T20:08:09.0506988Z      ^
2022-05-08T20:08:09.0507395Z /var/lib/jenkins/workspace/xla/torch_xla/csrc/RegisterXLA.cpp:82:6: warning: unused function 'check_inplace' [-Wunused-function]

This comment was automatically generated by Dr. CI.

ezyang added a commit to pytorch/xla that referenced this pull request May 8, 2022
Companion PR for pytorch/pytorch#77036

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
@ezyang (Contributor, Author) commented May 8, 2022

Companion XLA PR: pytorch/xla#3555

@ezyang ezyang requested a review from zou3519 May 8, 2022 02:00
ezyang added a commit to pytorch/functorch that referenced this pull request May 8, 2022
Companion PR for pytorch/pytorch#77036

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
ezyang added a commit that referenced this pull request May 8, 2022
ghstack-source-id: af02861
Pull Request resolved: #77036
@ngimel (Collaborator) commented May 8, 2022

Can you add shunting for is_non_overlapping_and_dense to go with is_contiguous? #48892

ezyang added a commit that referenced this pull request May 8, 2022
ghstack-source-id: ba43663
Pull Request resolved: #77036
@ezyang (Contributor, Author) commented May 8, 2022

@ngimel This seems reasonable but not obviously semantics-preserving (so I may have to fix up code/tests; the current PR is supposed to be semantics-preserving). I'll give it a try in a follow-up after I get this PR green.

@ezyang (Contributor, Author) commented May 8, 2022

Companion functorch PR: pytorch/functorch#785

ezyang added a commit that referenced this pull request May 8, 2022
ghstack-source-id: 492943c
Pull Request resolved: #77036
@ezyang ezyang requested a review from albanD May 10, 2022 19:41
@ngimel (Collaborator) left a comment

This is awesome!

@ezyang (Contributor, Author) commented May 11, 2022

@pytorchbot merge this

@github-actions (Contributor)

Hey @ezyang.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

ezyang added a commit to pytorch/xla that referenced this pull request May 11, 2022
* Update for consolidated contiguous/sizes policy

Companion PR for pytorch/pytorch#77036

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* fixup

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Delete .torch_pin
ezyang added a commit to pytorch/functorch that referenced this pull request May 11, 2022
Companion PR for pytorch/pytorch#77036

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
@suo (Member) commented May 11, 2022

@ezyang I suspect this broke metal builds that are run periodically, see: https://hud.pytorch.org/minihud?name_filter=periodic%20/%20ios-12-5-1-arm64-metal%20/%20build

mind looking?

@ezyang (Contributor, Author) commented May 11, 2022

#77247

bdhirsh added a commit that referenced this pull request May 12, 2022
`FunctionalTensorWrapper` has been sneakily relying on `shallow_copy_from` to mirror metadata from the inner tensor onto the wrapper. The new changes from #77036 actually break this logic because we blindly mirror the contiguity policy from the inner tensor to the wrapper. But `FunctionalTensorWrapper` should always have the default policy - it has valid sizes and strides!

I originally wrote it this way to avoid having to alter `TensorImpl.h` just for functionalization, but now we have to; `shallow_copy_from()` copies the metadata and then immediately calls `refresh_sizes()`, which fails on `FunctionalTensorWrapper` if it was just given a non-default policy (because we don't have a `sizes_custom()` method override defined).

Instead, I tried to split out the copying methods in `TensorImpl.h` into a base method that copies all metadata relevant to wrappers, and I updated `shallow_copy_from` to use it, so the metadata copying isn't duplicated anywhere.

Companion functorch patch: pytorch/functorch#800

[ghstack-poisoned]
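
To illustrate the hazard the commit message above describes, here is a
hedged sketch reusing the TensorImplSketch type from the opening
comment (not the real TensorImpl API); all member names here are
illustrative:

```cpp
// Sketch of the bug: mirroring the policy onto a wrapper that has
// perfectly good sizes/strides of its own makes the wrapper's next
// metadata refresh hit the throwing _custom() hook.
struct WrapperImplSketch : TensorImplSketch {
  void shallow_copy_from(const TensorImplSketch& src) {
    copy_generic_metadata(src);
    policy_ = src.policy_;  // bug for wrappers: inherits CustomSizes...
    refresh_sizes();        // ...so this vcalls sizes_custom(), which throws
  }

  // The split-out base copy: metadata only, no policy mirroring.
  void copy_generic_metadata(const TensorImplSketch& src) {
    sizes_ = src.sizes_;
  }

  void refresh_sizes() { (void)sizes(); }  // recompute cached derived state
};
```
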
facebook-github-bot pushed a commit that referenced this pull request May 13, 2022
…#77036)

Summary:

Pull Request resolved: #77036

Approved by: https://github.com/bdhirsh

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/2896f81dd4f51062f4bf5338e31d15cc3c599eb7

Reviewed By: atalman

Differential Revision: D36322300

Pulled By: ezyang

fbshipit-source-id: 6723db4f277837467a0f1fe65e34b9c11c615178
@facebook-github-bot facebook-github-bot deleted the gh/ezyang/1148/head branch May 14, 2022 14:17
zou3519 pushed a commit to zou3519/pytorch that referenced this pull request Jul 20, 2022
bigfootjon pushed a commit that referenced this pull request Jul 21, 2022
…functorch#785)

Companion PR for #77036

Signed-off-by: Edward Z. Yang <ezyang@fb.com>