
Higher order op for preserving leaf functions through trace, particularly for getting user defined hooks to compiled autograd #109690

Closed · 16 commits

Conversation

pytorch-bot commented Sep 20, 2023

🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/109690
Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 2 Pending
As of commit 807641b with merge base 34ded74:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

voznesenskym marked this pull request as draft September 20, 2023 08:08
fn = functools.partial(self_invoke, fn=fn)
fn.__name__ = fn.func.__name__

is_func = torch._is_functional_tensor(grad)
Contributor:
Hmm, seems fishy that we need to muck with functional tensors inside of ProxyTorchDispatchMode, lmk if I can help

Collaborator (author):
You commented on this here #107502 (comment) sorry to move stuff around so much

Contributor:
ah right :p

voznesenskym marked this pull request as ready for review September 20, 2023 20:35
# the functions as needed. This, in turn, means we can support functions in backward with complex python
# state mutation. If we were to not do this, the functions would get inlined into their composing aten ops,
# and we would lose the python state mutation.
def _trace_wrapped(*args, fn):
Collaborator (author):
Still bikeshedding on name, placeholder for now to not churn review comments overmuch.

Contributor:
something about leaving the inner fn opaque so we can trace it in the bw? trace_opaque, trace_opaque_for_bw (idk)

Collaborator (author):
@ezyang suggested PythonOp or some variation around the word leaf...

out_proxy = mode.tracer.create_proxy(
    "call_function", fn, proxy_args, {}, name="invocation"
)
grad = torch.empty_like(grad)
Contributor:
nit: zeros_like to prevent unfortunate accidents when grad is actually not a fake tensor. Or perhaps assert that grad must be fake?

Contributor:
(1) Do we ever intend to use this higher order op for stuff other than hooks?

(2) Is it asserted anywhere else (e.g. the autograd engine) that hook functions always take in a single tensor argument?

Collaborator (author):
(1) We want this for autograd function backwards too, and other stuff in the future

(2) Yea, that's the contract, but I like repeating invariants. I'll do a zeros_like and assert too.
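(A minimal sketch of what that could look like inside the proxy-mode rule; the fake-tensor check and its message are assumptions, and grad/out_proxy come from the surrounding handler.)

from torch._subclasses.fake_tensor import FakeTensor

# Sketch only: assert the invariant explicitly, then hand back zeros_like so a stray
# real tensor can never leak uninitialized values out of the trace.
assert isinstance(grad, FakeTensor), "expected a fake tensor during proxy tracing"
grad = torch.zeros_like(grad)
grad = track_tensor_tree(grad, out_proxy, constant=None, tracer=mode.tracer)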

name="assert",
)
grad = torch.empty_like(grad)
grad = track_tensor_tree(grad, out_proxy, constant=None, tracer=mode.tracer)
Contributor:
I'm confused, why do you need to do this twice

Contributor:
I think I understand why you did this (you need to prevent the assert from getting DCEd) but I don't think this is the right way to do it. Let me think...

Contributor:
I don't think you need to prevent this from DCE'd? Like, the assert can just have no data deps and you don't have to track at all. What happens when you do that?

Collaborator (author):
Hmm, lemme try, I thought it was cause of DCE but now I do not remember.

Collaborator (author):
Yeah, you get DCE if you don't track w/ create_proxy. However, if we change it to create_node, it breaks in other ways because none of the rest of this is nodes. It's all proxies. Is there a way to pass proxies to node creation? It seems like crossing streams...

Contributor:
Every proxy has a node, so you can extract the node from the proxy.

Collaborator (author):
Ofc, but is that kosher here? is that better than just repeating proxy binding code? Does it actually make a difference? I defer to you.

Contributor:
If there is some DCE thing, it will happen whether or not you create_proxy or create_node. I guess this is fine. Actually, why don't you just shove this into self_invoke, that will also prevent DCE
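(For reference, every torch.fx.Proxy exposes its underlying graph node as .node, so the extraction mentioned above is a one-liner; this is a sketch in the context of the handler, not the final diff.)

# out_proxy was returned by mode.tracer.create_proxy(...); its backing torch.fx.Node
# can be pulled out directly if a node-level API is ever needed.
assert_node = out_proxy.node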

tx=tx,
proxy=tx.output.create_proxy(
    "call_function",
    torch._dynamo._trace_wrapped_higher_order_op._assert_meta,
Contributor:
No, why do you have to do this? You've already inserted the assert meta into the graph; you're going to trace into it later.

Collaborator (author):
Discussing offline atm.

Collaborator (author):
we will keep it simple

("dtype", _tensor_mutating_dtype),
]:

def _graph_break_invoke(grad):
Contributor:
This technically won't cause a graph break (as its name implies) right - it'll cause a hard error in the backward?

Contributor:
Also, it would be nice to have another test for aliasing, where a hook returns a view of the input. (which is "wrong" because our fake tensor rule for the wrap higher order op assumes that the output of the hook never aliases the input).

Collaborator (author):
It's a graph break, which turns into a hard error.
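(A hypothetical sketch of the aliasing test asked for above; the hook and the expectation are assumptions, and driving it through compiled autograd is left to the PR's existing test harness.)

import torch

def bad_hook(grad):
    # Same metadata as the input, but the output aliases it, violating the fake-tensor
    # rule for the higher order op, which assumes the hook returns a fresh tensor.
    return grad.view_as(grad)

x = torch.randn(4, requires_grad=True)
x.register_hook(bad_hook)
# Expected behavior under the op's contract: tracing this hook causes a graph break,
# which surfaces as a hard error when the compiled backward runs.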

return _trace_wrapped_op(*args, fn=fn)


def _assert_meta(grad, size, stride, dtype):
Contributor:
noob dynamo q: how does dynamo know to execute these asserts at compile time (while dynamo is tracing), instead of automatically trying to add these asserts and metadata calls as proxies into the backward graph?

Contributor:
So long as this function is not allowed in graph, dynamo must inline into it

Contributor:
oh right- thanks!
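(For concreteness, a minimal sketch of what _assert_meta plausibly checks, based on the signature above; the exact messages and return value are assumptions.)

def _assert_meta(grad, size, stride, dtype):
    # Runtime guard traced into the graph: the hook must return a tensor with the
    # same metadata it received.
    assert grad.size() == size, "hook output changed the gradient's size"
    assert grad.stride() == stride, "hook output changed the gradient's stride"
    assert grad.dtype == dtype, "hook output changed the gradient's dtype"
    return grad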

ezyang (Contributor) commented Sep 25, 2023:

fwiw, for me, this is very close, just my two last comments

voznesenskym (Collaborator, author):

> fwiw, for me, this is very close, just my two last comments

Understood, tyvm :)

fn = kwargs["fn"]
assert len(args) == 1
grad = args[0]
assert isinstance(grad, TensorVariable)
Contributor:
These shouldn't be asserts right, because malformed user code can trigger them

Collaborator (author):
This is explicitly a user-invoked operation, provided almost akin to a compiler directive - it feels like it should assert. I don't mind doing unimplemented, but I feel a tad more strongly here than when we are aping something from std.

# we can support functions in backward with complex python. It can be thought of as an allow_in_graph
# for our aten graph. If we were to not do this, the functions would get inlined into their composing aten ops,
# and we would lose the python state mutation.
def trace_wrapped(*args, fn):
Contributor:
Is there any reason to take this variadically, you only support one argument 🤔

Collaborator (author):
I went back and forth. This feels better in case we want to use it in autograd.Function, where we take multiple args.

# the functions as needed. While there is nothing backward specific about this op, the way it is written means
# we can support functions in backward with complex python. It can be thought of as an allow_in_graph
# for our aten graph. If we were to not do this, the functions would get inlined into their composing aten ops,
# and we would lose the python state mutation.
ezyang (Contributor), Sep 27, 2023:

Here is a proposed rewrite of the top level comment:

trace_wrapped(*args, fn) is equivalent to fn(*args), but with a twist: if you make_fx trace through this call, we will not actually trace into fn; instead, we will directly insert it as a call_function to fn in the graph. (Unlike make_fx, Dynamo WILL inline into fn.) You can think of this as a one off allow_in_graph equivalent for proxy tensor tracing.

Because proxy tensor tracing does not actually run the function, there are requirements on the behavior of fn. We are still figuring it out, but here is the current state:

  • fn can only take a single argument, which must be a tensor
  • fn must return a new tensor with the same metadata as the original tensor (e.g., empty_like(input) is a permissible implementation of fn). This is verified via an extra assert that is inserted into the traced graph.
  • fn MAY have side effects, but it MAY NOT perform metadata mutation on other tensors participating in proxy tensor tracing (it MAY mutate other tensors, it MAY mutate Python state)

These requirements stem from the requirement that we need to continue performing proxy tensor tracing, which assumes accurate fake tensor metadata, without actually running fn. In the future, we may allow for a "meta" function associated with fn to allow for more interesting input-output patterns.

Note that tensors / Python state are allowed to be mutated. This relaxed constraint is not always sound, but it is sound for backward tracing with fake tensors as it takes place in AOTAutograd, as the backward pass is guaranteed not to depend on concrete tensor values (via fake tensor) or Python state (because the autograd engine doesn't depend on Python).

The intended use case for this function is to allow AOTAutograd to defer complex backward hooks to compiled autograd. AOTAutograd performs a make_fx trace which preserves the function call as is in the graph, and only when we Dynamo through the backward graph in compiled autograd do we inline into the function.

Collaborator (author):
Sure, this is better. Thanks for rewriting it. eg: zeros_like(input) I suppose

Collaborator (author):
if you are using may and must as https://www.rfc-editor.org/rfc/rfc2119 - let's use SHOULD and MUST ;)

Thank you again.
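(Based on that description, usage might look roughly like the following; the module path is taken from the diff earlier in this thread, while the hook wiring is an assumption for illustration only.)

import torch
from torch._dynamo._trace_wrapped_higher_order_op import trace_wrapped

hook_log = []

def my_hook(grad):
    # Arbitrary Python state mutation is allowed here; make_fx never runs this body.
    hook_log.append(grad.shape)
    # Must return a new tensor with the same metadata as the input.
    return grad.clone()

def wrapped_hook(grad):
    # Under make_fx this appears as a single opaque call_function node; compiled
    # autograd later Dynamo-traces (inlines) my_hook itself.
    return trace_wrapped(grad, fn=my_hook)

x = torch.randn(3, requires_grad=True)
x.register_hook(wrapped_hook)  # hypothetical wiring so AOTAutograd can defer the hook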

# calls into "leaf modules" as per traditional FX terminology.
# Note: Instead of naming it "allow_in_graph", we opted for a different name since "allow_in_graph"
# might imply that it's traceable, whereas this function is intrinsically non-traceable.
# Note2: I hate this name
Contributor:
This would be subsumed by the comment above I think


proxy_args = (mode.tracer.unwrap_proxy(grad),)
out_proxy = mode.tracer.create_proxy(
    "call_function", self_invoke, proxy_args, {}, name="invocation"
Contributor:
nit: call this trace_wrapped instead?

# a runtime assert
proxy_args = pytree.tree_map(
    mode.tracer.unwrap_proxy, (grad, grad.size(), grad.stride(), grad.dtype)
)
Contributor:
nit: the tree_map here is also unnecessary, just s/grad/out_proxy/
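(i.e., something along these lines; a sketch of the suggested simplification, not the final diff.)

# out_proxy is already a proxy, so no unwrapping pass is needed; the metadata values
# are plain Python objects and can be passed through as-is.
proxy_args = (out_proxy, grad.size(), grad.stride(), grad.dtype)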

ezyang (Contributor) left a review comment:

go go go

voznesenskym (Collaborator, author):

@pytorchbot merge -f "Flaky ci, graph break break in vision rcnn introduced in #110101"

pytorchmergebot (Collaborator):

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status.

facebook-github-bot deleted the gh/voznesenskym/226/head branch October 1, 2023 14:23