[test fix] try to re-use cached symnodes across dynamo and AOTAutograd #120090
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/120090

Note: Links to docs will display an error until the docs builds have been completed.

❌ 9 New Failures, 2 Unrelated Failures

As of commit 6340936 with merge base 24968ff:

NEW FAILURES - The following jobs have failed:
FLAKY - The following job failed but was likely due to flakiness present on trunk:
UNSTABLE - The following job failed but was likely due to flakiness present on trunk and has been marked as unstable:

This comment was automatically generated by Dr. CI and updates every 15 minutes.
ghstack-source-id: eca9e3bd708490d69e7567d178b40617f16ccb95 Pull Request resolved: #120090
ghstack-source-id: a1ea3edb77fc88720741578461f388340f9ff41c Pull Request resolved: #120090
ghstack-source-id: 376553890eea0758e46dc7d8f2d5be184a1a6517 Pull Request resolved: #120090
Meh, I am not convinced by the approach in this PR. What's not clear to me is why it matters that we have two different SymNodes for the same symbol. Yes, make_fx has a symnode_tracker, but having two SymNodes for the same symbol ordinarily isn't an error condition. The usual situation where this could occur is if you have

This is sort of related to the stuff @eellison was working on. Eventually it is actually undesirable to have an s0 proxy that depends on the tensor x, because it will keep x live longer than it needs to be. So there's a separate FX pass that tries to break these dependencies, based on the underlying sympy expression. I know I asked Elias to undo his pass interleaved directly with make_fx because it ended up being complicated and difficult to handle, but maybe a version of it where we eliminate symnode_tracker entirely might still be worth it. We have to do this carefully, though, because we need to avoid accidentally DCE'ing a spurious dependence on tangents.
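As a toy sketch of the liveness point above (the graph encoding and the op names `size_dim0`/`expand` are made up for illustration; this is not the real FX pass): if the only consumer of `x` is a size read producing `s0`, rewriting `s0` into a direct symbolic input leaves `x` with no users at all, which is exactly why such a pass must be careful not to DCE a dependence (e.g. on tangents) that is actually load-bearing.

```python
# Hypothetical toy graph: a node is (output_name, op, input_names).
graph = [
    ("x",  "placeholder", []),
    ("y",  "placeholder", []),
    ("s0", "size_dim0",   ["x"]),   # the ONLY use of x: reading its size
    ("z",  "expand",      ["y", "s0"]),
]

def users(graph, name):
    """Names of all nodes that read `name` as an input."""
    return [out for out, _, inputs in graph if name in inputs]

print(users(graph, "x"))   # ['s0'] -> x is kept live just for the size read

# The pass: the underlying sympy expression tells us s0 is just
# size(x, 0), so pass it in as its own graph input instead of
# deriving it from x inside the graph.
rewritten = [
    ("x",  "placeholder", []),
    ("y",  "placeholder", []),
    ("s0", "placeholder", []),      # s0 handed in directly
    ("z",  "expand",      ["y", "s0"]),
]

print(users(rewritten, "x"))  # [] -> x is now dead and could be pruned
```

In the rewritten graph, pruning `x` is safe here, but if an input's only remaining use was an intentionally-kept dependence (the tangents case), the same pruning would be a correctness bug.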
Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as stale.
More to generate discussion (maybe we want to land this? But the state of caching here feels pretty fragile). Partial fix to the issue here: https://fb.workplace.com/groups/1075192433118967/permalink/1381371379167736/

It looks like we used to try to cache symint creation - but that was changed in this PR, to cache raw sympy symbols instead: #115396 (the reason described in the PR is that it helps avoid creating spurious symbols).

That has an extra effect though: only caching the raw `sympy.Symbol`s means that we will generate new `SymNode` wrappers around these symbols each time we fakeify a tensor and generate symints for its sizes. This means that when we fakeify in Dynamo and then again in AOTAutograd:

(1) the symints that we create for the sizes of the fake tensors in these two cases **will** share the same underlying `sympy.Symbol` (thanks to the symbol cache)

(2) however, they will **not** share the same `SymNode`.

This is a problem because `make_fx()` has a `symnode_tracker` that is effectively a giant map from `[torch.SymNode] -> [Proxy]`. We can end up with symints from both AOTAutograd and Dynamo showing up in the function that we want to run `make_fx()` on, and `make_fx()` will see two versions of `s1` with different SymNodes, which it would then be obligated to create (different) proxies for.

Stack from ghstack (oldest at bottom):
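The identity mismatch described in the PR body can be modeled with a minimal self-contained sketch (the `Symbol` and `SymNodeLike` classes and the `fakeify_size` helper are hypothetical stand-ins, not PyTorch or sympy APIs): the symbol cache deduplicates the underlying symbol, but each fakeification builds a fresh wrapper, so an identity-keyed tracker sees two distinct keys for the "same" `s1`.

```python
class Symbol:
    """Stand-in for a raw sympy.Symbol."""
    def __init__(self, name):
        self.name = name

class SymNodeLike:
    """Stand-in for torch.SymNode: a wrapper around an underlying symbol."""
    def __init__(self, symbol):
        self.symbol = symbol

symbol_cache = {}

def get_symbol(name):
    # Cached: the same name always yields the same Symbol object
    # (the behavior #115396 introduced).
    if name not in symbol_cache:
        symbol_cache[name] = Symbol(name)
    return symbol_cache[name]

def fakeify_size(name):
    # NOT cached: every fakeification builds a fresh wrapper.
    return SymNodeLike(get_symbol(name))

dynamo_node = fakeify_size("s1")   # first fakeification (Dynamo)
aot_node = fakeify_size("s1")      # second fakeification (AOTAutograd)

assert dynamo_node.symbol is aot_node.symbol   # shared underlying symbol
assert dynamo_node is not aot_node             # distinct wrapper objects

# A tracker keyed by wrapper identity (like symnode_tracker) ends up
# with two entries, and hence two proxies, for one underlying symbol.
symnode_tracker = {}
for node in (dynamo_node, aot_node):
    symnode_tracker[id(node)] = f"proxy_for_{node.symbol.name}"

print(len(symnode_tracker))  # 2
```

This is why the PR's approach of reusing the cached SymNode across Dynamo and AOTAutograd would collapse the two keys into one, at the cost of making the caching story more fragile.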