Support view returns for functional inverses on narrowing views #115893
Conversation
🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/115893

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures as of commit 0ba36a6 with merge base 7b7f11f.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
…owing views"

Part 1 of implementation for general [subclass view fake-ification](https://docs.google.com/document/d/1C5taWiplmX7nKiURXDOAZG2W5VNJ2iV0fQFq92H0Cxw).

The following functional inverses are currently implemented scatter-style and thus don't properly respect `reapply_views=True`:

* `as_strided_copy_inverse()`
* `diagonal_copy_inverse()`
* `select_copy_int_inverse()`
* `slice_copy_Tensor_inverse()`
* `split_copy_Tensor_inverse()`
* `split_with_sizes_copy_inverse()`
* `unbind_copy_int_inverse()`
* `unfold_copy_inverse()`

We need `reapply_views=True` to be respected to get actual views for the introduction of reverse view funcs coming next.

Details:

* Use `as_strided()` to implement inverses for the above when `reapply_views=True`
  * Assumes we're given a mutated_view that is actually part of a bigger storage; this isn't really the case for functionalization
* Make sure functionalization works as before
  * Change codegen to unconditionally pass `reapply_views=False` for the above to keep the old behavior

TODO: More tests to make sure all the above are covered. At least some of them are covered in the pre-existing tests.

[ghstack-poisoned]
…owing views"

Part 1 of implementation for general [subclass view fake-ification](https://docs.google.com/document/d/1C5taWiplmX7nKiURXDOAZG2W5VNJ2iV0fQFq92H0Cxw).

The following functional inverses are currently implemented scatter-style and thus don't properly respect `reapply_views=True`:

* `as_strided_copy_inverse()`
* `diagonal_copy_inverse()`
* `expand_copy_inverse()`
* `select_copy_int_inverse()`
* `slice_copy_Tensor_inverse()`
* `split_copy_Tensor_inverse()`
* `split_with_sizes_copy_inverse()`
* `unbind_copy_int_inverse()`
* `unfold_copy_inverse()`

We need `reapply_views=True` to be respected to get actual views for the introduction of reverse view funcs coming next.

Details:

* Use `as_strided()` to implement inverses for the above when `reapply_views=True`
  * Assumes we're given a mutated_view that is actually part of a bigger storage; this isn't really the case for functionalization
* Make sure functionalization works as before
  * Adds `called_by_functionalization` flag to functional inverse signatures
  * Adds tests to ensure old behavior for above inverses **in functionalization** even when `reapply_views=True`

[ghstack-poisoned]
Do you really need to add this flag? Not that the
Nice, I'd love to avoid this flag :) lemme try this out.

Update: I tried this out, but it results in views being returned for functionalization when we actually don't want them. I'm unsure how to get around this without a flag that indicates whether or not we're in a functionalization context. Is there some other proxy we can use to determine this with the info we have available?
Hm, isn't it the other way around? i.e. for functionalization, it may pass
@albanD we discussed it a bit here: #115893 (comment). The tldr is that:

(1) functionalization has an existing error check in

(2) If you replace all of your views with view-copies (e.g.

What's annoying is that before this PR:

(1) functionalization uses

(2) We now need a way to differentiate between view_inverse calls made by functionalization (which normally reapplies views, but can't in some cases like unfold or slice), and view_inverse calls made by autograd's view_inverse logic (which always wants a view, and is ok with as_strided showing up sometimes).

WDYT?
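For context, the distinction being discussed can be observed from Python through `torch.func.functionalize`. This is a minimal sketch (not code from this PR) showing how a mutation through a view is rewritten into scatter ops, with view ops either preserved or replaced by their `*_copy` variants depending on the mode:

```python
import torch
from torch.func import functionalize
from torch.fx.experimental.proxy_tensor import make_fx

def f(x):
    x = x.clone()
    y = x.select(0, 0)  # a view into x
    y.add_(1)           # mutation through the view
    return x

# Views preserved (analogous to reapply_views=True): the mutation
# becomes a select_scatter, but select stays a real view op.
gm_views = make_fx(functionalize(f, remove="mutations"))(torch.zeros(2, 2))

# Views removed as well (analogous to reapply_views=False): select
# becomes select_copy alongside the select_scatter write-back.
gm_copies = make_fx(functionalize(f, remove="mutations_and_views"))(torch.zeros(2, 2))
```

Printing `gm_copies.code` shows the traced aten ops in each mode.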
As discussed offline, I introduced an enum to cover the 3 cases we need (see the PR description for more details).
Nice!
Did we agree to introduce new `[slice|select]_inverse` native ops, to make NestedTensor's life easier so it doesn't have to implement `as_strided()`?
I plan to toss these into a follow-up PR as needed, if that's cool with you
…views"

Part 1 of implementation for general [subclass view fake-ification](https://docs.google.com/document/d/1C5taWiplmX7nKiURXDOAZG2W5VNJ2iV0fQFq92H0Cxw).

The following functional inverses are currently implemented scatter-style and thus never return views:

* `as_strided_copy_inverse()`
* `diagonal_copy_inverse()`
* `expand_copy_inverse()`
* `select_copy_int_inverse()`
* `slice_copy_Tensor_inverse()`
* `split_copy_Tensor_inverse()`
* `split_with_sizes_copy_inverse()`
* `unbind_copy_int_inverse()`
* `unfold_copy_inverse()`

We need to get actual views for the introduction of reverse view funcs coming next.

Details:

* Use `as_strided()` to implement actual view inverses for the above
  * Assumes we're given a mutated_view that is actually part of a bigger storage; this isn't really the case for functionalization
* Introduce `InverseReturnMode` enum for customization of functional inverses
  * `AlwaysView` - always return an actual view; needed for reverse view_funcs()
  * `NeverView` - always do a copy; useful for certain functionalization use cases (e.g. XLA, executorch)
  * `ViewOrScatterInverse` - return an actual view in most cases, but prefer scatter inverses when they exist. This avoids the need to implement `as_strided()` for subclasses, which can be difficult or impossible
* Make sure functionalization works as before
  * Use `ViewOrScatterInverse` when reapply_views TLS is True or `NeverView` otherwise
  * Adds tests to ensure old behavior for above inverses **in functionalization**

[ghstack-poisoned]
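To make the view-vs-scatter distinction above concrete, here is a small illustrative sketch (not code from this PR) of the two inverse styles for `select`, including the caveat that the `as_strided()`-based inverse only works when the mutated view actually aliases the bigger storage:

```python
import torch

base = torch.arange(6.).reshape(2, 3)
# A view equivalent to base[0]; crucially, it aliases base's storage,
# which is exactly what the as_strided-based inverse relies on.
view = base.as_strided((3,), (1,), 0)

# Scatter-style inverse: always materializes a brand-new tensor,
# never a view into the original storage.
scattered = torch.select_scatter(base, view, 0, 0)

# View-style inverse: reinterpret the view's underlying storage with
# base's size/stride/offset, recovering base as an actual view.
recovered = view.as_strided(base.size(), base.stride(), 0)
```

Here `recovered` shares `base`'s storage while `scattered` does not, which is the behavioral difference the PR is after.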
lgtm!
@pytorchbot merge
Merge failed

Reason: Not merging any PRs at the moment because there is a merge blocking https://github.com/pytorch/pytorch/labels/ci:%20sev issue open at:

Details for Dev Infra team: Raised by workflow job
@pytorchbot merge
Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team
Stack from ghstack (oldest at bottom):
Part 1 of implementation for general subclass view fake-ification.
The following functional inverses are currently implemented scatter-style and thus never return views:

* `as_strided_copy_inverse()`
* `diagonal_copy_inverse()`
* `expand_copy_inverse()`
* `select_copy_int_inverse()`
* `slice_copy_Tensor_inverse()`
* `split_copy_Tensor_inverse()`
* `split_with_sizes_copy_inverse()`
* `unbind_copy_int_inverse()`
* `unfold_copy_inverse()`

We need to get actual views for the introduction of reverse view funcs coming next.

Details:

* Use `as_strided()` to implement actual view inverses for the above
  * Assumes we're given a mutated_view that is actually part of a bigger storage; this isn't really the case for functionalization
* Introduce `InverseReturnMode` enum for customization of functional inverses
  * `AlwaysView` - always return an actual view; needed for reverse view_funcs()
  * `NeverView` - always do a copy; useful for certain functionalization use cases (e.g. XLA, executorch)
  * `ViewOrScatterInverse` - return an actual view in most cases, but prefer scatter inverses when they exist. This avoids the need to implement `as_strided()` for subclasses, which can be difficult or impossible
* Make sure functionalization works as before
  * Use `ViewOrScatterInverse` when reapply_views TLS is True or `NeverView` otherwise
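As a rough Python sketch of how the `InverseReturnMode` dispatch described above might look (the actual enum lives in the C++/codegen layer; the enum member names and the `slice_inverse` helper here are purely illustrative):

```python
from enum import Enum

import torch

class InverseReturnMode(Enum):
    ALWAYS_VIEW = "always_view"        # actual view; for reverse view_funcs()
    NEVER_VIEW = "never_view"          # always copy; e.g. XLA, executorch
    VIEW_OR_SCATTER_INVERSE = "view_or_scatter"  # prefer scatter inverses

def slice_inverse(base, mutated_view, mode, dim=0, start=0, end=None):
    # Illustrative functional inverse for slice, honoring the requested mode.
    end = base.size(dim) if end is None else end
    if mode is InverseReturnMode.ALWAYS_VIEW:
        # Valid only when mutated_view really aliases a storage big enough
        # to hold base; reinterpret it with base's geometry.
        return mutated_view.as_strided(
            base.size(), base.stride(), base.storage_offset())
    # slice has a scatter-style inverse, so both NEVER_VIEW and
    # VIEW_OR_SCATTER_INVERSE can fall back to it here.
    return torch.slice_scatter(base, mutated_view, dim, start, end)

base = torch.arange(6.)
view = base.as_strided((3,), (1,), 2)  # aliases base's storage, like base[2:5]

as_view = slice_inverse(base, view, InverseReturnMode.ALWAYS_VIEW, 0, 2, 5)
as_copy = slice_inverse(base, view, InverseReturnMode.NEVER_VIEW, 0, 2, 5)
```

For an op with no scatter inverse, `VIEW_OR_SCATTER_INVERSE` would instead take the `as_strided()` path, which is why subclasses that can't implement `as_strided()` benefit from scatter inverses being preferred when they exist.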