Structured kernels generate Meta registrations #48116
Conversation
If you port kernels to be structured, you get Meta kernels automatically generated for you. This is one payoff of structured kernels.

Code generation was mercifully simple, although at risk of "swiss cheese" syndrome: there are two new conditionals in the codegen that tweak behavior when generating for meta keys. It's not too bad right now, but there's a risk of things getting out of hand. One way to rationalize the logic here would be to transmit "TensorMeta-ness" inside the TensorOptions (so tensor_from_meta can deal with it); then the "Meta" kernel magic would literally just be generating empty out_impls to call after all the scaffolding is done. But I didn't do this because it seemed like it would be more annoying in the short term.

I also had to teach resize_ to work on meta tensors, since we use them to implement the out kernels.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
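To make the payoff concrete, here is a minimal sketch of what an auto-generated Meta kernel buys you, modeled on the test added in this PR. It assumes the op has been ported to structured (as `upsample_nearest1d` is here), and that the era's `torch.empty_meta` factory accepts sizes the way `torch.empty` does; later PyTorch spells this `torch.empty(..., device='meta')`.

```python
import torch

# Minimal sketch, modeled on this PR's test: pure shape inference, no compute.
# Assumes upsample_nearest1d is ported to structured (so a Meta kernel exists)
# and that torch.empty_meta accepts sizes like torch.empty does.
x = torch.empty_meta(2, 3, 8)   # carries only sizes/strides/dtype, no storage
z = torch.empty_meta(0)         # out= placeholder; the kernel resize_'s it

torch._C._nn.upsample_nearest1d(x, (16,), 2, out=z)

# Output metadata was computed, but no data was ever allocated or written.
print(z.size())  # torch.Size([2, 3, 16])
```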
💊 CI failures summary (Dr. CI, as of commit 69b0f4d): ✅ None of the CI failures appear to be your fault.
```cpp
inline Tensor tensor_from_meta(const TensorMeta& meta) {
  // TODO: eliminate indirection
  return at::empty_meta(meta.sizes, meta.options);
}
```
What's the downside of this indirection? I think the wrapper function name makes it a little clearer what the goal is in the codegen.
It's actually a different indirection: `at::empty_meta` makes a hop through the dispatcher, but that's unnecessary; we could hardcode `at::native::empty_meta` here. I haven't done it yet because I want to keep the implementation simple for now.
ahhh yep
```diff
@@ -1672,6 +1672,7 @@
   CPU: resize_
   CUDA: resize_cuda_
   QuantizedCPU: quantized_resize_cpu_
+  Meta: resize_meta_
```
This means that we can't make `resize_` structured, right? You have a check in `gen_structured` to assert that people don't explicitly include their own structured implementations.
Possible alternative: just call the meta function (`resize_meta_`) directly rather than through the dispatcher in our codegen. I think that just means we'll need to call `meta_tensor_from_meta` to convert it to a tensor directly in the codegen as well. Although you'd only want to do that when the dispatch key is `Meta`, which means emitting more different code for different kernels :/
Aside: I'm not clear on the use case for `out`-variant `meta` functions. If your goal is to find the meta-info of the output tensor, you can just examine the `out` argument directly, right?

I could imagine the goal being to take an existing model, stick the `Meta` dispatch key on all the input tensors, and run everything without having to make changes to the model. That's probably it.
> This means that we can't make resize_ structured, right?

This is correct. In fact, we can't make `resize_` structured anyway, because there is no out variant nor purely functional variant (nor would they make sense).

> Possible alternative: just call the meta function (resize_meta_) directly rather than through the dispatcher in our codegen.

We should do this anyway.

> I could imagine the goal being to take an existing model, stick the Meta dispatch key on all the input tensors, and run everything without having to make changes to the model. That's probably it.

Yep!
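A hedged sketch of that use case, in the later spelling (`device='meta'` and `Module.to('meta')` postdate this PR): shapes and dtypes propagate through a whole model with no parameter storage and no real compute, assuming every op the model hits has a Meta kernel.

```python
import torch

# Hedged sketch of the use case above, in the later device='meta' spelling
# (which postdates this PR). Assumes every op the model hits has a Meta kernel.
model = torch.nn.Sequential(
    torch.nn.Linear(128, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 10),
).to('meta')                            # parameters become data-less meta tensors

x = torch.empty(32, 128, device='meta')
y = model(x)                            # metadata propagates; no FLOPs, no memory
print(y.shape)                          # torch.Size([32, 10])
```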
test/test_torch.py (outdated)
```python
# (not sure why passing None here doesn't work? How strange...)
z = torch.empty_meta(0)
torch._C._nn.upsample_nearest1d(x, (4 * 10 ** 10,), 2, out=z)
self.assertEqual(z.size(), (2 * 10 ** 10, 3, 4 * 10 ** 10))
```
maybe worth asserting that we didn't actually perform the computation? Also totally understand that the meta testing is subject to change.
I'm not... sure exactly how to test that haha. I'm sort of trying to implicitly test this by using ludicrously large sizes here; if you actually allocated these you'd probably OOM.
I was just thinking of running the meta op, running the actual op, then asserting that their `out` results aren't equal. (I'm not actually sure what the contents of the `empty_meta` tensor are; if it's uninitialized memory we could just zero it out first.)
A meta tensor should give a predictable error if you try to access its data, right? Can we just test by expecting that error here?

In master, `torch.empty_meta(10)[0]` gets you a RuntimeError:

RuntimeError: Could not run 'aten::as_strided' with arguments from the 'Meta' backend.

Expecting a RuntimeError on data access (not on `as_strided` specifically, just in general) seems like a reasonable proxy for "didn't compute".
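A minimal sketch of that proxy test, assuming the era's behavior quoted above (any data access on a meta tensor raises a RuntimeError; exactly which ops raise has changed in later PyTorch versions):

```python
import torch

# Minimal sketch of the proposed proxy test. Assumes this-era behavior,
# where any data access on a meta tensor raises a RuntimeError.
z = torch.empty_meta(10)
assert z.size() == (10,)      # metadata is available...

try:
    z[0]                      # ...but touching data should not be
except RuntimeError:
    pass                      # "didn't compute" proxy: data access raised
else:
    raise AssertionError("meta tensor unexpectedly allowed data access")
```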
The ASAN failure on this PR is legit, but it looks like it's just because I picked input sizes that are too big.
Summary: Pull Request resolved: pytorch#48116 (description above). Test Plan: Imported from OSS. Reviewed By: bhosmer, ailzhang. Differential Revision: D25056640. Pulled By: ezyang.
Summary: Pull Request resolved: #48731. Signed-off-by: Edward Z. Yang <ezyang@fb.com>. Test Plan: Imported from OSS. Reviewed By: bhosmer. Differential Revision: D25278034. Pulled By: ezyang.