-
Notifications
You must be signed in to change notification settings - Fork 26.2k
add to ExternalBackendFunction data model #56834
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
[ghstack-poisoned]
💊 CI failures summary and remediationsAs of commit 245034d (more details on the Dr. CI page):
1 failure not recognized by patterns:
This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.Please report bugs/suggestions to the (internal) Dr. CI Users group. |
[ghstack-poisoned]
A few tweaks to the codegen that make the next PR (generating out/inplace wrappers) easier. 1. [**main change**] When generating `ExternalBackendFunction` objects, I added the backend's kernel to the `dispatch` entry of `NativeFunction`. The invariant now is that if `ExternalBackendFunction` implements the `XLA` backend, then it should expect its corresponding `NativeFunction` to have an `XLA` entry in the dispatch table. I do that with a new `with_dispatch_key()` method on `NativeFunction`. This makes it easier to re-use the code in `register_dispatch_key.py`, since the external logic now is mostly just "another dispatch key" (with some exceptions; see the next PR). I'm open to other opinions though. 2. Doing that requires parsing the backend into a valid dispatch key, so I added more validations and tests. We also technically need to parse the backend's autograd key if they've provided any autograd kernels (if xla provides "backend: XLA" and they have an autograd section, I try to parse the "AutogradXLA" key), so I added tests for that too. 3. I ended up removing `ExternalBackendFunctionsGroup.from_function_group` and putting the logic directly in `gen_backend_stubs.py`. Kind of annoying, but I mostly did it because storing the kernel name directly in `ExternalBackendFunction` (in the dispatch table) requires calling the dispatcher API, and we can't add a dependency on `dispatcher.py` directly in `model.py`. We don't have that problem with NativeFunction objects because the kernel name is provided directly in the yaml, whereas for external ops, we force them to follow the dispatcher convention for naming. [ghstack-poisoned]
| return str(self).lower() | ||
|
|
||
| def is_autograd_key(self) -> bool: | ||
| return 'Autograd' in str(self) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps, if str(self).startswith('Autograd') as a more restrictive option
| try: | ||
| return DispatchKey.parse(value) | ||
| except AssertionError: | ||
| return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will be wondering why we need this. If we actually need to detect parse errors we'll need to roll out a separate exception hierarchy for them; catching assert errors is too dangerous
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair; I can rewrite this to avoid the try/catch. Alternatively, if we're fine with the original parsing error message then we don't need this
| ) | ||
|
|
||
| @staticmethod | ||
| def with_dispatch_entry(f: 'NativeFunction', dispatch_key: DispatchKey, kernel: str) -> 'NativeFunction': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a little odd this isn't a method, given that it takes a NativeFunction as argument
|
|
||
| backend = yaml_values.pop('backend', None) | ||
| assert backend is not None, 'You must provide a value for "backend"' | ||
| backend_key = DispatchKey.try_parse(backend) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't you just rely on the parser failure propagating here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I totally can- I added this to make the error message clearer. That way if e.g., a backend without a corresponding autograd key tried to add an autograd entry, they'd get a more actionable error message than a parsing error.
Do you think the parsing error is good enough? We probably want good documentation on the new codegen anyway, that makes it clear how to opt into autograd kernels.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another way to make the error better is to "push an error context" before doing the parse. The preexisting example:
with context(f'in native_functions.yaml line {f.loc}:\n {f.func}'):
| dispatch_key = DispatchKey.parse(f'Autograd{backend}') \ | ||
| if m is not None and m.is_autograd else DispatchKey.parse(backend) | ||
| kernel = kernel_name(f.func) | ||
| return ExternalBackendFunction(NativeFunction.with_dispatch_entry(f, dispatch_key, kernel), dispatch_key, m) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking back to https://github.com/pytorch/pytorch/pull/55050/files#r616215579
The stated motivation for patching in the dispatch information is to make it easier to reuse existing code in the codegen. This is very reasonable. But to me this more and more bends the orientation of the system towards trying to reuse the existing data structures or directly adding in the information as you need, and then augmenting it with side data (so the initial model doesn't have to changed).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After talking to Ed, I'm working on a new patch that will change most of the contents of this PR:
Rather than augmenting NativeFunction directly to be external-backend-aware, I'm going to move data on NativeFunction that is backend-dependent into a different struct. Important stuff it'll contain includes (a) the kernel name (different per backend), and (b) whether the kernel is structured (technically the same for all in-tree backends, but can be different per external backend).
| kernel = kernel_name(g.out.func) | ||
| dispatch_key = DispatchKey.parse(f'Autograd{backend}') \ | ||
| if out_meta is not None and out_meta.is_autograd else DispatchKey.parse(backend) | ||
| out = ExternalBackendFunction(NativeFunction.with_dispatch_entry(g.out, dispatch_key, kernel), dispatch_key, out_meta) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The three blocks here feel juuuust long enough to want some deduping
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair 😛
ezyang
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okey dokey
|
Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as |
A few tweaks to the codegen that make the next PR (generating out/inplace wrappers) easier.
ExternalBackendFunctionobjects, I added the backend's kernel to thedispatchentry ofNativeFunction. The invariant now is that ifExternalBackendFunctionimplements theXLAbackend, then it should expect its correspondingNativeFunctionto have anXLAentry in the dispatch table. I do that with a newwith_dispatch_key()method onNativeFunction. This makes it easier to re-use the code inregister_dispatch_key.py, since the external logic now is mostly just "another dispatch key" (with some exceptions; see the next PR). I'm open to other opinions though.ExternalBackendFunctionsGroup.from_function_groupand putting the logic directly ingen_backend_stubs.py. Kind of annoying, but I mostly did it because storing the kernel name directly inExternalBackendFunction(in the dispatch table) requires calling the dispatcher API, and we can't add a dependency ondispatcher.pydirectly inmodel.py. We don't have that problem with NativeFunction objects because the kernel name is provided directly in the yaml, whereas for external ops, we force them to follow the dispatcher convention for naming.Stack from ghstack: