change torch._dynamo.export(aten_graph=...) to allow pre_autograd tracing #98031
Conversation
…cing [ghstack-poisoned]
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/98031
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 7f6bdb5
This comment was automatically generated by Dr. CI and updates every 15 minutes.
…cing ghstack-source-id: 7c13ae9d8f47d4bb85c7ab5b7b49c3cad7ec6fd9 Pull Request resolved: #98031
torch/_dynamo/eval_frame.py
Outdated
-    aten_graph (bool): If True, exports a graph with ATen operators.
-        If False, exports a graph with Python operators. Default is False.
+    aten_graph (str): Valid options include:
+        "none": export a graph with Python operations. Default is False.
Suggested change:
-    "none": export a graph with Python operations. Default is False.
+    "none": export a graph with Python operations.
torch/_dynamo/eval_frame.py
Outdated
+    "aten": export a graph with ATen operations.
+    "aten_pre_autograd": export a graph with pre_autograd ATen operations.
What's the difference between pre_autograd aten ops and non-pre_autograd ones?
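For readers following the thread, here is a toy sketch of the distinction as I understand it (the op names and the decomposition table are illustrative assumptions, not real PyTorch internals): a post-autograd ATen graph has composite ops decomposed into lower-level primitives, while pre_autograd tracing captures ops as seen above the autograd key, before those decompositions run.

```python
# Toy model of pre_autograd vs. decomposed ATen tracing.
# The ops and the decomposition table are illustrative, not PyTorch internals.

# Composite ops and what they might decompose into below autograd.
DECOMPOSITIONS = {
    "aten.linear": ["aten.t", "aten.addmm"],
    "aten.dropout": ["aten.bernoulli", "aten.mul"],
}

def trace(ops, pre_autograd):
    """Return a traced 'graph' (list of op names).

    With pre_autograd=True, composite ops are kept intact; otherwise
    each composite op is replaced by its decomposition.
    """
    graph = []
    for op in ops:
        if pre_autograd or op not in DECOMPOSITIONS:
            graph.append(op)
        else:
            graph.extend(DECOMPOSITIONS[op])
    return graph

# pre_autograd keeps aten.linear as a single node...
print(trace(["aten.linear", "aten.relu"], pre_autograd=True))
# ...while the post-autograd trace sees its decomposition instead.
print(trace(["aten.linear", "aten.relu"], pre_autograd=False))
```

The practical consequence is that a pre_autograd graph preserves higher-level op structure, which downstream passes (e.g. quantization) can match on before it is lost to decompositions.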
torch/_dynamo/eval_frame.py
Outdated
@@ -584,7 +584,7 @@ class directly; instead, use :func:`torch._export.dynamic_dim`.
 def export(
     f: Callable[..., Any],
     *args,
-    aten_graph: bool = False,
+    aten_graph: str = "none",
I don't think I would have personally gone ahead and done a BC-breaking change here. But if we are in the business of making a BC-breaking change, I think we ought to also rename the keyword argument as well. It's weird and redundant to say `aten_graph="aten"`. How about `dialect` or `target_dialect`? Also cc @SherlockNoMad for opinions.
`dialect` sounds good to me (but interested in Sherlock's opinion).
I'm also concerned about BC risk - if you're worried about changing the bool, I'm happy to leave it alone (although two bools to control the "dialect" feels bad). Fixing any internal call-sites will be annoying but doable.
If you're willing to put in the effort, fixing the API earlier rather than later is good. I just remember I had specifically asked you to plumb this, and had been thinking you'd do the low effort thing.
you're right - in the interest of landing sooner, I'm avoiding the API concern for now and adding an extra boolean argument. cc @andrewor14
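To make the "two bools to control the dialect feels bad" trade-off concrete, here is a minimal sketch (the argument names and dialect strings are assumptions for illustration, not the PR's actual code): two independent flags admit an invalid combination that a single dialect argument would rule out by construction.

```python
def resolve_dialect(aten_graph: bool = False, pre_autograd: bool = False) -> str:
    """Map two boolean flags onto a single 'dialect' name.

    Illustrative only: with two bools, one of the four combinations
    is meaningless and needs an explicit runtime check, whereas a
    single str/enum argument would make invalid states unrepresentable.
    """
    if pre_autograd and not aten_graph:
        raise ValueError("pre_autograd=True requires aten_graph=True")
    if pre_autograd:
        return "aten_pre_autograd"
    if aten_graph:
        return "aten"
    return "none"

print(resolve_dialect(aten_graph=True, pre_autograd=True))
```

This is why a `dialect`-style string argument is arguably the cleaner long-term API, even if the extra boolean is the lower-effort, BC-safe choice for now.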
…utograd tracing" pre_autograd tracing is still early, but it should work for basic cases. This PR changes the API a bit for export to expose pre_autograd tracing. Name bikeshedding is welcome, but it looks like: ``` torch._dynamo.export(..., aten_graph="aten_pre_autograd") ``` cc andrewor14 cc soumith voznesenskym penguinwu anijain2305 EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx desertfire [ghstack-poisoned]
updated, fixed tests
@@ -414,7 +414,8 @@ def can_handle_tensor(x):
     else:
         constant = None

-    track_tensor_tree(out, proxy_out, constant=constant, tracer=tracer)
+    with inside_mode(proxy_mode):
+        track_tensor_tree(out, proxy_out, constant=constant, tracer=tracer)
This seems surprising to me.
still waiting on explainer
sigh, wrote the comment when I initially made this update and forgot to submit it.
I'd be happy to make this a separate PR. The issue was that:
(1) We detach tensors as part of creating proxies.
(2) In pre_autograd tracing, we currently push TorchProxyDispatchMode onto both the autograd mode stack and the original python key mode stack.
(3) The detach() calls get intercepted by the proxy mode on the original python key mode stack.
So we need to be careful that any aten ops that we call inside of TorchProxyDispatchMode happen in a `with inside_mode()` now.
I think we agreed 2-3 weeks ago that we could avoid re-entrant issues if we solved the "fallthrough keys can't be intercepted by the python dispatcher" issue, but we agreed this is difficult.
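As a self-contained illustration of the re-entrancy problem described above (every name here is a toy stand-in, not the actual proxy_tensor internals): if a dispatch mode's own bookkeeping issues ops that the mode itself intercepts, tracing would recurse on itself; an `inside_mode`-style guard temporarily disables the mode around those internal calls.

```python
import contextlib

class ProxyMode:
    """Toy dispatch mode: intercepts every op unless temporarily disabled."""
    def __init__(self):
        self.enabled = True
        self.intercepted = []

    def dispatch(self, op_name):
        if self.enabled:
            self.intercepted.append(op_name)
            # Bookkeeping (like creating a proxy) may itself call ops,
            # e.g. a detach(); guard them so the mode doesn't re-enter.
            with inside_mode(self):
                run_op(self, "detach")
        return op_name

@contextlib.contextmanager
def inside_mode(mode):
    """Disable `mode` for the duration of the block (toy inside_mode)."""
    saved, mode.enabled = mode.enabled, False
    try:
        yield
    finally:
        mode.enabled = saved

def run_op(mode, op_name):
    return mode.dispatch(op_name)

mode = ProxyMode()
run_op(mode, "aten.add")
# Only the user op was recorded; the internal detach() was not re-intercepted.
print(mode.intercepted)  # ['aten.add']
```

Without the guard, the internal `detach` would itself trigger another intercept (and in the real system, unbounded re-entrancy), which is exactly why every aten op issued from inside the mode needs the `with inside_mode()` wrapper.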
Ok put this comment in the code?
This looks fine EXCEPT for the one line at the end, how come?
@pytorchbot merge
Merge started
Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed
Reason: 1 job has failed, first few of them are: trunk / macos-12-py3-arm64 / build
Details for Dev Infra team: Raised by workflow job
@pytorchbot merge

Merge started
Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
pre_autograd tracing is still early, but it should work for basic cases. This PR changes the API a bit for export to expose pre_autograd tracing. Name bikeshedding is welcome, but it looks like: `torch._dynamo.export(..., aten_graph="aten_pre_autograd")`
cc @soumith @voznesenskym @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @desertfire @andrewor14
Stack from ghstack (oldest at bottom):