Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hacky_wrapper_for_legacy_signatures reorders out arguments #48911

Closed
wants to merge 17 commits into from

Conversation

smessmer
Copy link
Contributor

@smessmer smessmer commented Dec 7, 2020

Stack from ghstack:

This enables us to use hacky_wrapper_for_legacy_signatures for ops with out arguments so they can use templated unboxing logic without having to be rewritten.

This only actually enables it for one op as a proof of concept. There will be a separate PR enabling it for more ops.

Differential Revision: D25363336

This enables us to use hacky_wrapper_for_legacy_signatures for ops with out arguments so they can use templated unboxing logic without having to be rewritten.

This only actually enables it for one op as a proof of concept. There will be a separate PR enabling it for more ops.

Differential Revision: [D25363336](https://our.internmc.facebook.com/intern/diff/D25363336/)

[ghstack-poisoned]
@dr-ci
Copy link

dr-ci bot commented Dec 7, 2020

💊 CI failures summary and remediations

As of commit c33786e (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions to the (internal) Dr. CI Users group.

This comment has been revised 68 times.

This enables us to use hacky_wrapper_for_legacy_signatures for ops with out arguments so they can use templated unboxing logic without having to be rewritten.

This only actually enables it for one op as a proof of concept. There will be a separate PR enabling it for more ops.

Differential Revision: [D25363336](https://our.internmc.facebook.com/intern/diff/D25363336/)

[ghstack-poisoned]
This enables us to use hacky_wrapper_for_legacy_signatures for ops with out arguments so they can use templated unboxing logic without having to be rewritten.

This only actually enables it for one op as a proof of concept. There will be a separate PR enabling it for more ops.

Differential Revision: [D25363336](https://our.internmc.facebook.com/intern/diff/D25363336/)

[ghstack-poisoned]
This enables us to use hacky_wrapper_for_legacy_signatures for ops with out arguments so they can use templated unboxing logic without having to be rewritten.

This only actually enables it for one op as a proof of concept. There will be a separate PR enabling it for more ops.

Differential Revision: [D25363336](https://our.internmc.facebook.com/intern/diff/D25363336/)

[ghstack-poisoned]
This enables us to use hacky_wrapper_for_legacy_signatures for ops with out arguments so they can use templated unboxing logic without having to be rewritten.

This only actually enables it for one op as a proof of concept. There will be a separate PR enabling it for more ops.

Differential Revision: [D25363336](https://our.internmc.facebook.com/intern/diff/D25363336/)

[ghstack-poisoned]
@@ -2,6 +2,7 @@

#include <c10/util/Metaprogramming.h>
#include <c10/util/TypeList.h>
#include <c10/util/IntegerSequence.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ngl: I think I would have preferred it if this was implemented via codegen rather than template

This enables us to use hacky_wrapper_for_legacy_signatures for ops with out arguments so they can use templated unboxing logic without having to be rewritten.

This only actually enables it for one op as a proof of concept. There will be a separate PR enabling it for more ops.

Differential Revision: [D25363336](https://our.internmc.facebook.com/intern/diff/D25363336/)

[ghstack-poisoned]
Copy link
Contributor

@bhosmer bhosmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments inline, but the only big question is whether it's really necessary to handle the general case of out args appearing in arbitrary positions in the signature - doing so makes things much much more complicated than they would need to be to handle the standard pattern.

On a quick search of native_functions.yaml I can't find any ops that actually break the standard pattern. I might have missed such, but if not, I'd like to take advantage by drastically simplifying both this code and the utilities in IntegerSequence.h.

(If there's a use case for the full capability, can you point me at it? I'd dearly love to find some other way of handling it. :)

[Editing to zoom out and note agreement with @ezyang that codegen feels like it would solve this problem more straightforwardly and without loss of generality (e.g. since we're servicing only native_functions ops, out-of-tree access to the functionality isn't an issue). But also worth noting that this path was settled on prior to the recent codegen refactor, which shifted the landscape in which the decision was made somewhat. Since the TTL of hacky_wrapper is quite finite, we've decided to land a template-based approach rather than retool.]

aten/src/ATen/core/boxing/impl/boxing.h Outdated Show resolved Hide resolved
struct with_out_arguments_reordered_impl final {
private:
// For an example op
// > aten::example(Tensor a, int64_t b, Tensor(a!) out_c, int64_t d, Tensor(b!) out_e) -> (Tensor(a!), Tensor(b!))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This example op has out arguments sprinkled through the signature, which AFAICT never happens in native_functions.yaml. I think it'd be clearer to use an example of the dominant pattern - out args at the end. (Among other things it would avoid the confusion of describing ops like this as having out arguments at the end, as on line 277.)

More importantly, are there any actual ops where out arguments aren't contiguous at the end? I think this logic (both here and in codegen, but especially here) could be made quite a bit simpler if it didn't have to handle that class of signatures.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right that the current implementation is more general than it would have to be. Not sure if restricting it to the actually occuring scenarios would simplify the logic though. Looking through it, the implementation could take two numbers [OutTensorArgIndicesBegin, OutTensorArgIndicesEnd] to represent a range instead of taking a full list of argument indices. But I think that's it, inside the metaprogramming I'd then still have to build the full OutTensorArgIndices array, so actually the metaprogramming would then contain an additional step and be more complex. Open to suggestions though if you have simplification ideas.

c10::impl::hacky_wrapper_for_legacy_signatures<
{dispatcher_sig.type()},
std::index_sequence<{', '.join(mutable_argument_indices(f.func))}>>(TORCH_FN({sig.name()}))
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be easy to avoid the copy pasted payload here and on 437 I hope - a little helper function rather than a CodeTemplate ideally

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm following the style in the rest of the file where we duplicate and explicitly spell out generated code instead of putting it into subroutines, search for example for "findSchemaOrThrow". But I'm open to writing it either way, wdyt?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this is a misunderstanding of why gen_structured is copy-pasted from gen_unstructured: the goal is that structured kernels never have any of the legacy crud that is in unstructured.

Of course, the reality is more bleak; gen structured has to test for use_c10_dispatcher: full because it prior to this patch series, it literally does not work. But I feel like at least for hacky_wrapper_for_legacy_signatures we should be able to avoid this for structured kernels, right? :o)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smessmer cool, house style works for me. But

@ezyang I could be misreading your comment, but are you thinking we just by fiat declare that a structured kernel/hacky wrapper combo is unsupported? That might be a nice inducement to finish kernel porting promptly, I guess.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. There's only two ops that are structured and I can volunteer to do the port. Need to take a look.

else:
payload = f'torch::CppFunction::makeUnboxedOnly({sig.name()})'
assert local.use_c10_dispatcher() is UseC10Dispatcher.with_codegenerated_unboxing_wrapper
payload = f"torch::CppFunction::makeUnboxedOnly(&{sig.name()})"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change necessitated by the hacky_wrapper stuff?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which change particularly? The added ampersand to get a pointer? No, that's a code style fix -- C++ is kind of inconsistent in that you can get function pointers without the ampersand but need it for getting pointers for anything else. Or do you mean the added assertion? That's just to make sure I handled all cases UseC10Dispatcher can take, just in case we add more states later (hopefully we don't).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant the ampersand :) thanks for the detail!

@smessmer smessmer requested a review from bhosmer December 9, 2020 08:29
This enables us to use hacky_wrapper_for_legacy_signatures for ops with out arguments so they can use templated unboxing logic without having to be rewritten.

This only actually enables it for one op as a proof of concept. There will be a separate PR enabling it for more ops.

Differential Revision: [D25363336](https://our.internmc.facebook.com/intern/diff/D25363336/)

[ghstack-poisoned]
This enables us to use hacky_wrapper_for_legacy_signatures for ops with out arguments so they can use templated unboxing logic without having to be rewritten.

This only actually enables it for one op as a proof of concept. There will be a separate PR enabling it for more ops.

Differential Revision: [D25363336](https://our.internmc.facebook.com/intern/diff/D25363336/)

[ghstack-poisoned]
This enables us to use hacky_wrapper_for_legacy_signatures for ops with out arguments so they can use templated unboxing logic without having to be rewritten.

This only actually enables it for one op as a proof of concept. There will be a separate PR enabling it for more ops.

Differential Revision: [D25363336](https://our.internmc.facebook.com/intern/diff/D25363336/)

[ghstack-poisoned]
This enables us to use hacky_wrapper_for_legacy_signatures for ops with out arguments so they can use templated unboxing logic without having to be rewritten.

This only actually enables it for one op as a proof of concept. There will be a separate PR enabling it for more ops.

Differential Revision: [D25363336](https://our.internmc.facebook.com/intern/diff/D25363336/)

[ghstack-poisoned]
This enables us to use hacky_wrapper_for_legacy_signatures for ops with out arguments so they can use templated unboxing logic without having to be rewritten.

This only actually enables it for one op as a proof of concept. There will be a separate PR enabling it for more ops.

Differential Revision: [D25363336](https://our.internmc.facebook.com/intern/diff/D25363336/)

[ghstack-poisoned]
This enables us to use hacky_wrapper_for_legacy_signatures for ops with out arguments so they can use templated unboxing logic without having to be rewritten.

This only actually enables it for one op as a proof of concept. There will be a separate PR enabling it for more ops.

Differential Revision: [D25363336](https://our.internmc.facebook.com/intern/diff/D25363336/)

[ghstack-poisoned]
This enables us to use hacky_wrapper_for_legacy_signatures for ops with out arguments so they can use templated unboxing logic without having to be rewritten.

This only actually enables it for one op as a proof of concept. There will be a separate PR enabling it for more ops.

Differential Revision: [D25363336](https://our.internmc.facebook.com/intern/diff/D25363336/)

[ghstack-poisoned]
This enables us to use hacky_wrapper_for_legacy_signatures for ops with out arguments so they can use templated unboxing logic without having to be rewritten.

This only actually enables it for one op as a proof of concept. There will be a separate PR enabling it for more ops.

Differential Revision: [D25363336](https://our.internmc.facebook.com/intern/diff/D25363336/)

[ghstack-poisoned]
This enables us to use hacky_wrapper_for_legacy_signatures for ops with out arguments so they can use templated unboxing logic without having to be rewritten.

This only actually enables it for one op as a proof of concept. There will be a separate PR enabling it for more ops.

Differential Revision: [D25363336](https://our.internmc.facebook.com/intern/diff/D25363336/)

[ghstack-poisoned]
Copy link
Contributor

@bhosmer bhosmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! I think the simplified version is much clearer and easier to follow.

else:
payload = f'torch::CppFunction::makeUnboxedOnly({sig.name()})'
assert local.use_c10_dispatcher() is UseC10Dispatcher.with_codegenerated_unboxing_wrapper
payload = f"torch::CppFunction::makeUnboxedOnly(&{sig.name()})"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant the ampersand :) thanks for the detail!

c10::impl::hacky_wrapper_for_legacy_signatures<
{dispatcher_sig.type()},
std::index_sequence<{', '.join(mutable_argument_indices(f.func))}>>(TORCH_FN({sig.name()}))
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smessmer cool, house style works for me. But

@ezyang I could be misreading your comment, but are you thinking we just by fiat declare that a structured kernel/hacky wrapper combo is unsupported? That might be a nice inducement to finish kernel porting promptly, I guess.

This enables us to use hacky_wrapper_for_legacy_signatures for ops with out arguments so they can use templated unboxing logic without having to be rewritten.

This only actually enables it for one op as a proof of concept. There will be a separate PR enabling it for more ops.

Differential Revision: [D25363336](https://our.internmc.facebook.com/intern/diff/D25363336/)

[ghstack-poisoned]
This enables us to use hacky_wrapper_for_legacy_signatures for ops with out arguments so they can use templated unboxing logic without having to be rewritten.

This only actually enables it for one op as a proof of concept. There will be a separate PR enabling it for more ops.

Differential Revision: [D25363336](https://our.internmc.facebook.com/intern/diff/D25363336/)

[ghstack-poisoned]
@facebook-github-bot
Copy link
Contributor

This pull request has been merged in 56a157f.

@facebook-github-bot facebook-github-bot deleted the gh/smessmer/271/head branch December 14, 2020 15:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants