
Switch calling convention back to real tensors #99320

Closed
wants to merge 3 commits

Conversation

ezyang
Contributor

@ezyang ezyang commented Apr 17, 2023

Stack from ghstack (oldest at bottom):

Months ago, in order to get dynamic shapes working through to Dynamo backends, we changed the calling convention to pass fake tensors rather than real tensors as example inputs to backends. The motivation at the time was, well, backends shouldn't really be peeking at the real tensors when they are doing compilation, and so it would make more sense to hide the real tensors from backends. But there were a bunch of problems:

  • This interacted poorly with our accuracy minifier design: the accuracy minifier needs access to the real inputs in order to run the model and figure out what happens!
  • The TensorRT backend required real inputs and we never figured out how to fix it.
  • In practice, all the backends needed to detect if they were passed real tensors and fakeify them anyway (certainly AOTAutograd does this).
  • Parameters and inputs were treated non-uniformly: parameters had to be passed as real tensors, because CUDA graphs requires knowing what the actual tensors are.

Furthermore, there were some more problems discovered after the fact:

  • Backends may want to optimize based on aspects of tensors that you cannot tell without having real tensors, e.g., the alignment of the data pointer (see the sketch after this description).

So, this PR decides that changing the calling convention was a bad idea and switches back to passing real tensors. There is a problem, though: AOTAutograd will perform fakeification, which means that in practice backends are still going to end up with fake tensors anyway. I want to change this, but it will require some work with bdhirsh's upcoming AOTAutograd export refactor.
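To make the restored contract concrete, here is a minimal sketch (not taken from this PR) of a custom backend that receives real example inputs, probes a property only real tensors have, and then fakeifies the inputs itself. The backend name, the 16-byte alignment threshold, and the toy compiled function are all illustrative assumptions:

```python
import torch
from torch._subclasses.fake_tensor import FakeTensorMode


def inspecting_backend(gm: torch.fx.GraphModule, example_inputs):
    # Hypothetical backend. Under the restored calling convention,
    # example_inputs are real tensors, so properties like the data pointer
    # are meaningful. The 16-byte threshold is purely illustrative.
    aligned = all(
        t.data_ptr() % 16 == 0
        for t in example_inputs
        if isinstance(t, torch.Tensor)
    )
    print(f"all tensor inputs 16-byte aligned: {aligned}")

    # A backend that wants to compile against fake tensors can still
    # fakeify the real inputs itself (AOTAutograd does something similar).
    fake_mode = FakeTensorMode()
    fake_inputs = [
        fake_mode.from_tensor(t) if isinstance(t, torch.Tensor) else t
        for t in example_inputs
    ]
    del fake_inputs  # this sketch performs no actual compilation

    # Fall back to running the captured graph eagerly.
    return gm.forward


fn = torch.compile(lambda x: x.sin() + 1.0, backend=inspecting_backend)
print(fn(torch.randn(8)))
```

Under the old convention, `example_inputs` would have arrived as fake tensors, so the alignment probe above would have had no real allocation to inspect.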

Signed-off-by: Edward Z. Yang ezyang@meta.com

cc @soumith @voznesenskym @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @desertfire

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

[ghstack-poisoned]
@pytorch-bot

pytorch-bot bot commented Apr 17, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/99320

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 2 Failures

As of commit 6fc2d5a:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

ezyang added a commit that referenced this pull request Apr 17, 2023
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

ghstack-source-id: 60d5c7d462c0c0e45b0fe1084280573ed16d237e
Pull Request resolved: #99320
ezyang added a commit that referenced this pull request Apr 17, 2023
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

ghstack-source-id: 630eba64bace3331bdc6788f356604e74c7ba8eb
Pull Request resolved: #99320
@albanD albanD removed their request for review April 17, 2023 15:46
@ezyang ezyang requested review from Chillee and albanD April 17, 2023 15:56
ezyang added a commit that referenced this pull request Apr 17, 2023
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

ghstack-source-id: 79091172e92460935a9b66a08a3960615a4eb6ce
Pull Request resolved: #99320
@albanD albanD removed their request for review April 17, 2023 15:58
@ezyang ezyang added ciflow/trunk, topic: bc breaking, topic: bug fixes, and release notes: dynamo labels Apr 17, 2023
@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

ezyang added a commit that referenced this pull request Apr 18, 2023
This bug was discovered by a stronger assert (which I will be posting
in a follow up PR.)

The explanation for this change is a bit long and windy, and I am not
sure I entirely understand the situation myself.  But here's what I
think is going on.

jansel's joint graph pattern matcher does something fairly unusual:
in order to initialize the pattern in question, it (lazily) runs
an aot_function invocation in order to trace out what the joint graph
of a given pattern looks like (we ought not use aot_function, but we
can't really do this until bdhirsh lands AOT Autograd export properly).
However, this lazy initialization occurs within the context of a
separate compilation, which has its own tracing context, and
importantly, fake tensor mode.

What we would like is for the pattern matcher's lazy-initialization fake
tensor mode to be unrelated to whatever ambient fake tensor mode belongs
to the graph we are actually compiling.  We want these to be independent,
because we don't really care what the current compiled graph is; this is
a lazy init function, it could have been initialized during any
compilation, and it just happens to be initialized on this one.

To prevent us from picking up the ambient fake mode, we have to do two
things: remove the tracing context (which stores a fake mode), and also
disable the ambiently active fake mode.

In #99377 eellison proposed an
alternative approach, where we reuse the fake mode.  While this probably
won't cause any errors, it's morally not the right thing to do, because
you'll end up polluting the enclosing fake tensor mode with tensors that
have nothing to do with the mode itself.

This might fix #99286
but it's also possible that #99320
fixed it already.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

ghstack-source-id: f572909369d2c03eeb031311724eb9c9cd91acbc
Pull Request resolved: #99391
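A minimal sketch of the isolation described above, assuming the `tracing` helper from `torch._guards` and `unset_fake_temporarily` from `torch._subclasses.fake_tensor` (helper names vary across PyTorch versions); `lazy_init_pattern` is a hypothetical stand-in for the pattern matcher's one-time initialization, not the actual code from #99391:

```python
import functools

import torch
from torch._guards import tracing
from torch._subclasses.fake_tensor import FakeTensorMode, unset_fake_temporarily


@functools.lru_cache(None)
def lazy_init_pattern():
    # Hypothetical one-time initializer. Whichever compilation happens to
    # trigger it, avoid tracing under that compilation's state:
    #   - tracing(None) detaches from the ambient tracing context
    #     (which stores a fake mode),
    #   - unset_fake_temporarily() disables the ambiently active fake mode,
    #   - a fresh FakeTensorMode keeps tensors created here out of the
    #     enclosing compilation's mode.
    with tracing(None), unset_fake_temporarily(), FakeTensorMode():
        example = torch.empty(2, 3)  # a FakeTensor owned by the fresh mode
        return tuple(example.shape)
```

Reusing the enclosing fake mode instead (the #99377 approach) avoids creating a second mode, but it pollutes that mode with tensors that have nothing to do with the graph being compiled, which is exactly what the message above objects to.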
ezyang added a commit that referenced this pull request Apr 18, 2023
pytorchmergebot pushed a commit that referenced this pull request Apr 18, 2023
Pull Request resolved: #99391
Approved by: https://github.com/bdhirsh
@DanilBaibak
Contributor

@pytorchbot revert -m "Break internal build" -c ghfirst

@pytorchmergebot
Collaborator

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

@pytorchmergebot
Collaborator

@ezyang your PR has been successfully reverted.

pytorchmergebot added a commit that referenced this pull request Apr 19, 2023
This reverts commit 780922c.

Reverted #99320 on behalf of https://github.com/DanilBaibak due to Break internal build
@DanilBaibak
Contributor

@ezyang sorry, I have to revert your PR. It leads to internal failures. Here you can find more details - D45103307.

@ezyang ezyang reopened this Apr 19, 2023
@ezyang
Contributor Author

ezyang commented Apr 19, 2023

@pytorchbot merge -f "force merging as need internal only changes"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

@ezyang
Contributor Author

ezyang commented Apr 19, 2023

attn oncall, fbcode side changes are at https://www.internalfb.com/diff/D45103307

@DanilBaibak
Contributor

@pytorchbot revert -m "Break internal build" -c ghfirst

@pytorchmergebot
Collaborator

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

@pytorchmergebot
Collaborator

Reverting PR 99320 failed

Reason: Command git -C /home/runner/work/pytorch/pytorch revert --no-edit c67c16bcd2007cce35440a0c95b163ede2b73b61 returned non-zero exit code 1

Auto-merging torch/_dynamo/config.py
Auto-merging torch/_dynamo/debug_utils.py
CONFLICT (content): Merge conflict in torch/_dynamo/debug_utils.py
Auto-merging torch/_dynamo/eval_frame.py
Auto-merging torch/_inductor/pattern_matcher.py
error: could not revert c67c16bcd20... Switch calling convention back to real tensors (#99320)
hint: After resolving the conflicts, mark them with
hint: "git add/rm <pathspec>", then run
hint: "git revert --continue".
hint: You can instead skip this commit with "git revert --skip".
hint: To abort and get back to the state before "git revert",
hint: run "git revert --abort".
Details for Dev Infra team (raised by workflow job)
