
Conversation

ezyang (Contributor) commented Nov 17, 2020

Stack from ghstack:

If you port kernels to be structured, you get Meta kernels automatically
generated for you. This is one payoff of structured kernels.

Code generation was mercifully simple, although at risk of
"swiss cheese" syndrome: there are two new conditionals in the codegen
that tweak behavior when generating for meta keys. It's not too bad
right now, but there's a risk of things getting out of hand. One
way to rationalize the logic here would be to transmit "TensorMeta-ness"
inside the TensorOptions (so tensor_from_meta can deal with it); then
the "Meta" kernel magic would literally just be generating empty
out_impls to call after all the scaffolding is done. But I didn't
do this because it seemed like it would be more annoying in the short term.

Also had to teach resize_ to work on meta tensors, since we use them
to implement the out kernels.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: D25056640
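
As a concrete illustration, a minimal sketch of what a generated Meta
kernel gives you, assuming the torch.empty_meta factory that this PR's
tests use: shape and dtype come out, but no storage is ever allocated,
and any data access raises a RuntimeError (as discussed in the review
thread below).

import torch

t = torch.empty_meta(10)   # metadata only; no storage is allocated
print(t.size(), t.dtype)   # torch.Size([10]) plus the default dtype
t[0]                       # RuntimeError: no data to access on a meta tensor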

ezyang added a commit that referenced this pull request Nov 17, 2020
ghstack-source-id: b11c064
Pull Request resolved: #48116
ezyang requested review from bhosmer and bdhirsh November 17, 2020 20:45
dr-ci bot commented Nov 17, 2020

💊 CI failures summary and remediations

As of commit 69b0f4d (more details on the Dr. CI page):


None of the CI failures appear to be your fault 💚

ezyang added a commit that referenced this pull request Nov 18, 2020
ghstack-source-id: c8e0cba
Pull Request resolved: #48116
inline Tensor tensor_from_meta(const TensorMeta& meta) {
  // TODO: eliminate indirection
  return at::empty_meta(meta.sizes, meta.options);
}
Contributor commented:
What's the downside of this indirection? I think the wrapper function name makes it a little clearer what the goal is in the codegen.

ezyang (Author) replied:
It's actually a different indirection: at::empty_meta makes an unnecessary hop through the dispatcher; we could hardcode at::native::empty_meta here instead. I haven't done it yet because I want to keep the implementation simple for now.

Contributor replied:
ahhh yep

@@ -1672,6 +1672,7 @@
     CPU: resize_
     CUDA: resize_cuda_
     QuantizedCPU: quantized_resize_cpu_
+    Meta: resize_meta_
Contributor commented:
This means that we can't make resize_ structured, right? You have a check in gen_structured to assert that people don't explicitly include their own structured implementations.

bdhirsh (Contributor) commented Nov 18, 2020:
Possible alternative: just call the meta function (resize_meta_) directly rather than through the dispatcher in our codegen. I think that just means we'll need to call meta_tensor_from_meta to convert it to a tensor directly in the codegen as well. Although you'll only want to do that when the dispatch key is Meta, which means emitting more key-specific code for different kernels :/

bdhirsh (Contributor) commented Nov 18, 2020:
aside: I'm not clear on the use case for out-variant meta functions. If your goal is to find the meta-info of the output tensor, you can just examine the out argument directly, right?

I could imagine the goal being to take an existing model, stick the Meta dispatch key on all the input tensors, and run everything without having to make changes to the model. That's probably it.

ezyang (Author) replied:
> This means that we can't make resize_ structured, right?

This is correct. In fact, we can't make resize_ structured anyway, because there is no out variant nor purely functional variant (nor would they make sense).

> Possible alternative: Just call the meta function (resize_meta_) directly rather than through the dispatcher in our codegen.

We should do this anyway.

> I could imagine the goal being to take an existing model, stick the Meta dispatch key on all the input tensors, and run everything without having to make changes to the model. That's probably it.

Yep!
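
A hedged sketch of that workflow, under the assumption that every op the
model hits has a Meta kernel (the nn.Upsample module here is a
hypothetical stand-in; it bottoms out in upsample_nearest1d, one of the
structured ops in this stack):

import torch

model = torch.nn.Upsample(scale_factor=2, mode="nearest")  # hypothetical stand-in model

inp = torch.empty_meta(32, 3, 128)  # "input" carrying only shape/dtype, no data
out = model(inp)                    # only the shape computation runs
print(out.size())                   # torch.Size([32, 3, 256]); nothing was allocated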

# (not sure why passing None here doesn't work? How strange...)
z = torch.empty_meta(0)
torch._C._nn.upsample_nearest1d(x, (4 * 10 ** 10,), 2, out=z)
self.assertEqual(z.size(), (2 * 10 ** 10, 3, 4 * 10 ** 10))
Contributor commented:
maybe worth asserting that we didn't actually perform the computation? Also totally understand that the meta testing is subject to change.

ezyang (Author) replied:
I'm not... sure exactly how to test that haha. I'm sort of trying to implicitly test this by using ludicrously large sizes here; if you actually allocated these you'd probably OOM.

Contributor replied:
I was just thinking: run the meta op, run the actual op, then assert that their out results aren't equal (I'm not actually sure what the contents of empty_meta are; if it's uninitialized memory we could just zero it out first).

Another commenter replied:
A meta tensor should give a predictable error if you try to access its data, right? Can we just test by expecting that error here?

In master, torch.empty_meta(10)[0] gets you a RuntimeError:

RuntimeError: Could not run 'aten::as_strided' with arguments from the 'Meta' backend.

Expecting a RuntimeError on data access (not on as_strided, just in general) seems like a reasonable proxy for "didn't compute"
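
That check is easy to bolt onto the existing test. A minimal sketch,
reusing the inputs the test already constructs and assuming unittest's
assertRaises:

z = torch.empty_meta(0)
torch._C._nn.upsample_nearest1d(x, (4 * 10 ** 10,), 2, out=z)
self.assertEqual(z.size(), (2 * 10 ** 10, 3, 4 * 10 ** 10))
# Proxy for "didn't compute": the meta output has no storage, so any
# data access must raise, regardless of which op raises it.
self.assertRaises(RuntimeError, lambda: z[0])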

ezyang (Author) commented Nov 30, 2020

The ASAN failure on this PR is legit, but it looks like it's just because I picked input sizes that are too big.


ezyang added a commit to ezyang/pytorch that referenced this pull request Dec 1, 2020
ghstack-source-id: 9e2c1fd
Pull Request resolved: pytorch#48116
ezyang added a commit to ezyang/pytorch that referenced this pull request Dec 2, 2020
ghstack-source-id: 9e2c1fd
Pull Request resolved: pytorch#48116
ezyang added a commit to ezyang/pytorch that referenced this pull request Dec 2, 2020
ghstack-source-id: c1642f6
Pull Request resolved: pytorch#48116
facebook-github-bot commented:

@ezyang merged this pull request in b4f5efa.

ezyang added a commit that referenced this pull request Dec 2, 2020
ezyang added a commit that referenced this pull request Dec 2, 2020
shaibagon pushed a commit to shaibagon/pytorch that referenced this pull request Dec 3, 2020
Summary:
Pull Request resolved: pytorch#48116

Test Plan: Imported from OSS

Reviewed By: bhosmer, ailzhang

Differential Revision: D25056640

Pulled By: ezyang

fbshipit-source-id: f8fcfa0dbb58a94d9b4196748f56e155f83b1521
ezyang added a commit to ezyang/pytorch that referenced this pull request Dec 3, 2020
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

ghstack-source-id: 1de9cce
Pull Request resolved: pytorch#48731
facebook-github-bot pushed a commit that referenced this pull request Dec 3, 2020
Summary:
Pull Request resolved: #48731

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: bhosmer

Differential Revision: D25278034

Pulled By: ezyang

fbshipit-source-id: 73652311b48d8d80c06e9385b7ff18ef3a158ae8
facebook-github-bot deleted the gh/ezyang/867/head branch December 6, 2020 15:17