[primTorch] Enforces stride metadata #77542
Conversation
Dr. CI: As of commit 5166070, 1 new failure was recognized by patterns; it does not appear to be due to upstream breakages.
# Test doesn't support non-tensor inputs
DecorateInfo(unittest.expectedFailure,
             'TestMathBits',
             'test_neg_view'),
Join @zou3519 and me in advocating that these skips should be automatically generated and saved ;) It's very time-consuming to manually track all of these down.
+1 -> #74642
Yeah but not in this PR
I think you may already be past the tipping point where it will be faster to sit down and add this infrastructure than to play popcorn with the CI for the next week
torch/_prims/wrappers.py
Outdated
# the kernel is invoked on cpu, so it makes strides contiguous
if a.device.type == "cpu":
    return prims.convert_element_type(a, dtype)
return prims._to_dtype(a, dtype)
The comment is very reasonable, but I still do not understand why cpu gets special cased in the condition here.
I'm tweaking this now to see if I can get CPU strides validated
device = inferred_device if device is None else device

if isinstance(device, str):
    device = torch.device(device)
This is impossible according to the type signature. Relax the type signature?
Good point - fixed
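As an aside, a minimal sketch of what the relaxed signature could look like; `DeviceLikeType` and `canonicalize_device` are illustrative names, not necessarily the ones used in the PR:

```python
from typing import Optional, Union

import torch

# Illustrative alias: accept either a string or a torch.device.
DeviceLikeType = Union[str, torch.device]

def canonicalize_device(device: Optional[DeviceLikeType]) -> Optional[torch.device]:
    # Strings like "cuda:0" are converted; torch.device (or None) passes through.
    if isinstance(device, str):
        return torch.device(device)
    return device
```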
if a.device != b.device:
    msg = "Devices {0} and {1} are not equal!".format(a.device, b.device)
    raise AssertionError(msg)
# Handles special cuda:0 vs cuda case
Why is this necessary?
Somehow we're getting both values
If you query the device of a tensor, it should always have an index. If there is no index then there is some invariant violation when we are creating the tensors in the first place (we can probably force an index in TensorMeta's constructor)
I agree with you, and it would be interesting to hunt it down, but this PR is already a little sprawling.
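For context, a minimal illustration of the comparison being special-cased here: torch.device equality treats a missing index as distinct from index 0, even though a tensor allocated on "cuda" reports device "cuda:0" when queried.

```python
import torch

# A device parsed without an index compares unequal to the indexed form.
assert torch.device("cuda") != torch.device("cuda:0")
assert torch.device("cuda").index is None
assert torch.device("cuda:0").index == 0
```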
torch/_prims/utils.py
Outdated
# NOTE: currently we are only validating strides on CUDA, because
# we are using opmath on both CPU and CUDA, which causes
# divergent stride behavior vs. the CPU, which does not use opmath
the "we" here is ambiguous; I assume you're talking about refs, compared to the reference CPU implementations? But it's also surprising that CPU TensorIterator doesn't preserve strides because it "lost" the information when doing a dtype conversion for type promotion. Isn't that a bug?
It's not so nice that strides can only be validated on CUDA; this means that if you're working on strides it's mandatory to be on a CUDA machine (for me at least, my default dev env is non-CUDA)
it may be a bug but I'm trying to get the tests to pass at the moment by modeling the CPU behavior
# NOTE: Based on the implementation in TensorIterator.cpp, but note that
# the note [Computing output strides] is incorrect, because it
# says that strides will be preserved even if they are not
# "non overlapping and dense", but this is incorrect. The
Overlapping/sparse strides get preserved in the sense that they implicitly define some permutation, and that permutation is preserved in the (contiguous) output strides
Then the note in C++ should say that instead of what it does
if ndim == 0:
    return ()
if ndim == 1:
    return (1,)
TBH, I'm not sure PrimTorch should be in the business of defining these short circuits, if the general algorithm works for these cases too.
I added a comment to review removing them if they're unnecessary
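For reference, a sketch of the point above: a general contiguous-strides computation already covers the 0-D and 1-D cases, assuming the max(dim, 1) convention for zero-size dimensions; `contiguous_strides` is an illustrative stand-in for the utility being discussed.

```python
def contiguous_strides(shape):
    # stride[i] is the product of the sizes to its right, clamping zero-size
    # dims to 1 (assumed convention for tensors with no elements).
    strides = []
    acc = 1
    for dim in reversed(shape):
        strides.append(acc)
        acc *= max(dim, 1)
    return tuple(reversed(strides))

print(contiguous_strides(()))      # ()
print(contiguous_strides((7,)))    # (1,)
print(contiguous_strides((2, 3)))  # (3, 1)
```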
torch/_prims/utils.py
Outdated
    return 0

perm = tuple(range(ndim))
perm = sorted(perm, key=cmp_to_key(_cmp), reverse=True)
Why not define perm as a list and then .sort() it?
Yeah that would have worked, too
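A minimal sketch of the in-place variant; `strides` and `_cmp` below are illustrative stand-ins for the values used in the surrounding function.

```python
from functools import cmp_to_key

strides = (20, 1, 5)  # illustrative input strides

def _cmp(x: int, y: int) -> int:
    # Compare two dimension indices by their stride.
    return strides[x] - strides[y]

perm = list(range(len(strides)))
perm.sort(key=cmp_to_key(_cmp), reverse=True)  # descending stride order
print(perm)  # [0, 2, 1]
```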
perm = tuple(range(ndim))
perm = sorted(perm, key=cmp_to_key(_cmp), reverse=True)

permuted_shape = [-1] * ndim
Initializing these with None is safer, because -1 is a valid index
-1 is a valid index for a dimension but not a valid dimension length, and initializing with None would change the type
torch/_prims/utils.py
Outdated
relevant_pairs.append((x, y))

expected = 1
for x, y in sorted(relevant_pairs, key=lambda p: p[1]):
From a tracing perspective, this sort is terrifying
Luckily the final version of the PR didn't include this function, and the sort in the stride comparison function shouldn't define any validity conditions, although it is an example of how we may have to just run the meta functions for our ops to understand what the intermediate metadata values of certain tensors are
So, you are team "no symbolic strides"? Our current default assumption is that strides are symbolic, because from a design perspective that is easier. To make them not symbolic we will have to work (because strides are computed from symbolic quantities aka shapes).
torch/_prims/__init__.py
Outdated
# NOTE: _to_dtype
# This private op casts the input to the desired type while preserving its stride
# permutation, unlike .to(dtype) which will create a tensor with contiguous strides
to() is supposed to preserve strides (that's why its memory format defaults to preserve_format). File a bug?
Good call -- #77600
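A quick check of the expectation discussed above; since .to() defaults to torch.preserve_format, a dense permuted input is expected to keep its stride permutation rather than come back contiguous (the filed issue tracks where this diverges).

```python
import torch

a = torch.empty(2, 3, 4).permute(2, 0, 1)  # shape (4, 2, 3), strides (1, 12, 4)
b = a.to(torch.float64)
print(a.stride(), b.stride())  # expected to match if strides are preserved
```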
torch/_prims/__init__.py
Outdated
try:
    requires_grad = a.requires_grad
except Exception as e:
    requires_grad = False
@eellison if we fully replace TensorMeta with FakeTensor I think it will fix this
torch/_prims/__init__.py
Outdated
result = empty_like(a, device=a.device, dtype=dtype, requires_grad=requires_grad)

# TODO: review if the no_grad context is the best way to model this
The entire autograd story here is a bit wishy-washy. But my default assumption was that each prim in primtorch would have an autograd formula explicitly defined for it. So then no_grad here doesn't matter, because a use of _to_dtype should only ever be in a context where there's going to be an explicit autograd formula.
You're correct, but that's not a direction we've focused on modeling yet.
doc="", | ||
) | ||
|
||
# TODO: layout, pin_memory, memory_format |
Somewhat surprised the meta tests aren't complaining loudly at you on this ;)
The samples don't set these options
""" | ||
|
||
empty = _make_prim( | ||
schema="empty(int[] shape, *, ScalarType dtype, Device device, bool requires_grad) -> Tensor", |
Should empty really have a requires_grad argument in PrimTorch? From the perspective of a backend implementer, requires_grad ought to have been long erased; there's nothing they're going to be usefully able to do with it.
It's certainly interesting to consider; we can always make it exclusive to the ref later when we get into autograd
    impl_aten=_empty_like_aten,
    return_type=RETURN_TYPE.NEW,
    doc=_empty_like_doc,
)
How come this is a prim? It doesn't seem very primitive to me.
full_like is a jax.lax operation and to make this (torch.empty_like) non-prim in the current system we'd have to do empty+as_strided, and as_strided is an operation we generally don't want to call
per the below thinking, full_like can be made a ref by combining empty_like + fill
Edit: clarified what "this" was referring to and updated per comment below
empty_strided is probably better as a prim, as it is more powerful than empty_like, and empty_like can easily be expressed with it?
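A sketch of that suggestion, using the public torch API for illustration (the prims would supply equivalents; layout, pin_memory, and requires_grad are omitted for brevity):

```python
import torch

def empty_like_ref(a: torch.Tensor) -> torch.Tensor:
    # empty_like expressed in terms of the more powerful empty_strided.
    return torch.empty_strided(a.shape, a.stride(), dtype=a.dtype, device=a.device)

x = torch.empty(2, 3, 4).permute(1, 0, 2)
y = empty_like_ref(x)
assert y.shape == x.shape and y.stride() == x.stride()
```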
    impl_aten=_full_aten,
    return_type=RETURN_TYPE.NEW,
    doc=_full_doc,
)
Ditto here; in primitives I'd expect an empty allocation and then an in-place fill afterwards (you do have in-place ops in primtorch, right?)
We don't have fill at this time; full and full_like are jax.lax operators and they're kind of natural prims, but yes, we'll likely model them as references in the future.
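For illustration, a sketch of full_like modeled as a reference over an allocation plus an in-place fill, again using the public torch API as a stand-in for the eventual prims:

```python
import torch

def full_like_ref(a: torch.Tensor, fill_value) -> torch.Tensor:
    # Allocate with the input's metadata, then fill in place.
    return torch.empty_like(a).fill_(fill_value)

print(full_like_ref(torch.empty(2, 3), 7.0))
```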
if _tensor_requires_grad(a):
    return True
if isinstance(x, torch.Tensor) and x.requires_grad:
    return True
Why not use tree_map here?
yeah, that would work, too
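A minimal sketch of the pytree approach; `any_requires_grad` is an illustrative name, and it uses torch.utils._pytree directly rather than tree_map:

```python
import torch
from torch.utils._pytree import tree_flatten

def any_requires_grad(*args, **kwargs) -> bool:
    # Flatten the inputs and check requires_grad on every tensor leaf.
    leaves, _ = tree_flatten((args, kwargs))
    return any(isinstance(x, torch.Tensor) and x.requires_grad for x in leaves)

t = torch.randn(2, 2, requires_grad=True)
print(any_requires_grad([t], scale=1.0))  # True
```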
def _to_tensormeta(x):
    if isinstance(x, torch.Tensor):
        return prims.utils.TensorMeta(x)
    return x
oops lol
for ei in error_inputs:
    si = ei.sample_input
    meta_sample = si.transform(_to_tensormeta)
    # TODO: match strings
expect tests would be very helpful here, then you wouldn't have to manually type in the correct strings everywhere
    device: torch.device,
    requires_grad: bool,
) -> Tensor:
    # Note that Mypy thinks torch.full can't accept a complex fill_value
That just means full's pyi annotation is incorrect; it needs to be generalized a little, then.
torch/_refs/__init__.py
Outdated
    type_promotion_kind,
    use_opmath,
    CPU_use_opmath=None,
    CUDA_use_opmath=None,
why not just lower case here?
yeah that'd be reasonable -- they're capitalized in a lot of the test suite today so I suppose I was thinking of it
The CR from me is non-substantive; assuming you can get this to pass tests, merge this whenever the tests are passing. The longer we wait, the harder it will be to enforce this.
for idx, x in enumerate(perm):
    permuted_shape[idx] = shape[x]

new_strides = make_contiguous_strides_for(permuted_shape)
nit: I'd expect permuted_strides to correspond to permuted_shape, and what you are returning are the output strides.
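For reference, a self-contained sketch of the step being discussed: build the permuted shape from the stride-order permutation, take contiguous strides for it, then scatter those strides back to the original dimension order (`output_strides_for` is an illustrative name).

```python
import torch

def output_strides_for(shape, perm):
    permuted_shape = [shape[d] for d in perm]
    contiguous = torch.empty(permuted_shape).stride()  # contiguous strides for the permuted shape
    out = [0] * len(shape)
    for stride, d in zip(contiguous, perm):
        out[d] = stride  # scatter back to the original dimension order
    return tuple(out)

# Reproduces the strides of torch.empty(2, 3, 4, 5).permute(0, 2, 3, 1)
print(output_strides_for((2, 4, 5, 3), (0, 3, 1, 2)))  # (60, 5, 1, 20)
```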
torch/_refs/__init__.py
Outdated
-    prims.abs, type_promotion_kind=ELEMENTWISE_TYPE_PROMOTION_KIND.COMPLEX_TO_FLOAT
+    prims.abs,
+    type_promotion_kind=ELEMENTWISE_TYPE_PROMOTION_KIND.COMPLEX_TO_FLOAT,
+    use_opmath=False,
why?
@pytorchbot merge on green
Merge failed due to Refusing to merge as mandatory check(s) Lint are not yet run for rule superuser
@pytorchmergebot merge this
check_same_shape(*tensors, allow_cpu_scalar_tensors=True)

# Filters the tensors to actual tensors
all_tensors = all(isinstance(a, TensorLike) for a in tensors)
all_tensors is not used.
doh! you're right -- thanks @jjsjann123! I'll get that cleaned up.
Summary: This PR...

**Filed the Following Issues**
- #77553
- #77526
- #77600

**Testing**
- Updates test_dtypes to no longer attempt to test the backward of sample inputs where no inputs require grad
- Adds a new test_python_reference_errors; it ensures the meta operations for references throw errors as expected
- Updates compare_tensor_meta to better handle CUDA devices, and (temporarily) restricts stride checking to the CUDA device type
- Elementwise unary and elementwise binary operators now have arbitrarily strided reference inputs
- Adds reference inputs for _like functions
- Adds an OpInfo for torch.empty
- Adds reference inputs for torch.clone
- Adds a NumPy reference for clone
- Adds OpInfos for refs.empty and refs.empty_like

**Prims**
- Renames the "max" and "min" prims to "maximum" and "minimum," respectively, to better conform to their ATen names
- Adds the empty, empty_like, full, and full_like prims
- Fixes the elementwise meta function's stride propagation
- Fixes clone's meta function's stride propagation
- Fixes convert_element_type's meta function's stride propagation
- Adds a (temporary) private _to_dtype prim that casts a tensor while preserving its stride permutation
- Removes the _set prim comment
- Adds utils.compute_elementwise_output_strides, which computes the correct output strides for elementwise operations
- Corrects an issue where utils.make_contiguous_strides_for was creating incorrect strides for tensors with no elements

**References**
- Adds the empty, empty_like, full, full_like, and ones_like refs
- Extends make_elementwise_unary_reference to accept an additional callable that performs extra input validation
- Adds an extra validation function to handle refs.neg(BoolTensor)
- Updates the isfinite ref to call ones_like when appropriate
- Models Python scalar handling for elementwise binary operations
- Adds a 64-dim check to the amin and amax references
- opmath is now a flag that can be set separately for CPU and CUDA

Pull Request resolved: #77542
Approved by: https://github.com/ezyang
Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/580a053832cea61affce5fdb61c737036c8954af
Reviewed By: seemethere
Differential Revision: D36494082
Pulled By: mruberry
fbshipit-source-id: 1f833e53bbd1f50d8658d41dfed8cced99d0ea93