Setup initial testing harness and cache key generation for AOTAutograd Cache #124642
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/124642
Note: Links to docs will display an error until the docs builds have been completed. ✅ No failures as of commit d9e2a33 with merge base 935a946. This comment was automatically generated by Dr. CI and updates every 15 minutes.
@@ -485,25 +485,56 @@ class FxGraphCachePickler(pickle.Pickler):
    dispatch_table[torch.Tensor] = _reduce_tensor
    dispatch_table[torch.SymInt] = _reduce_symint

    @staticmethod
    def dumps(obj) -> bytes:
    @classmethod
This just lets me reuse the pickling implementation
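A minimal sketch of the point being made here (not the actual torch/_inductor source; the subclass name is an assumption): making `dumps` a classmethod lets a subclass reuse the same pickling entry point while supplying only its own `dispatch_table`.

```python
import copyreg
import io
import pickle


class FxGraphCachePickler(pickle.Pickler):
    # Per-class reduction overrides; pickle.Pickler consults this table.
    dispatch_table = copyreg.dispatch_table.copy()

    @classmethod
    def dumps(cls, obj) -> bytes:
        # Because this is a classmethod, cls is whichever subclass was used,
        # so that subclass's dispatch_table (and other overrides) take effect.
        with io.BytesIO() as stream:
            pickler = cls(stream)
            pickler.dump(obj)
            return stream.getvalue()


class AOTAutogradCachePickler(FxGraphCachePickler):
    # Hypothetical subclass: inherits dumps() unchanged and registers its own
    # reducers (e.g. for AOTConfig) in a private copy of the table.
    dispatch_table = FxGraphCachePickler.dispatch_table.copy()
```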
elif isinstance(obj, bytes):
    return "<bytes>"
else:
    return str(obj)
Since this is for debugging, would it make sense to use `repr` instead of `str`?
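A quick illustration of the suggestion (illustrative only, not code from the PR): `repr` keeps quoting and escapes visible, which tends to be more useful in a debug dump than `str`.

```python
s = "line1\nline2"
print(str(s))    # prints across two lines; the embedded newline is not visible as an escape
print(repr(s))   # 'line1\nline2' -- quoting and escapes are preserved

b = b"\x00\xff"
print(str(b), repr(b))  # for bytes the two are identical: b'\x00\xff' b'\x00\xff'
```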
@pytorchbot rebase
Rebased
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here.
fx_graph = gm
return gm

g = torch.compile(fn, backend=compiler)
You might want `fullgraph=True`? If there are graph breaks, `compiler()` will get called multiple times (and the nonlocal will get overwritten with whatever the last graph was).
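A minimal sketch of the test pattern being discussed (the helper names are assumptions, not the PR's test code): the custom backend stashes the traced `GraphModule` in a nonlocal, and `fullgraph=True` guarantees there is exactly one graph to stash.

```python
import torch


def compile_and_capture(fn, *args):
    fx_graph = None

    def compiler(gm: torch.fx.GraphModule, example_inputs):
        nonlocal fx_graph
        fx_graph = gm  # capture the traced graph for hashing/inspection
        return gm      # run the captured graph as-is (an "eager" backend)

    # fullgraph=True: a graph break raises an error instead of silently calling
    # compiler() once per sub-graph and overwriting fx_graph with the last one.
    compiled = torch.compile(fn, backend=compiler, fullgraph=True)
    result = compiled(*args)
    return result, fx_graph


def fn(x):
    return torch.sin(x) + 1


_, gm = compile_and_capture(fn, torch.randn(4))
assert gm is not None
```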
def autograd_cache_hash(
    gm: torch.fx.GraphModule,
    config: AOTConfig,
Hmm, I think caching on the AOTConfig object is mostly reasonable. But just calling out: AOTConfig isn't an input to AOTAutograd - it's mostly a convenience object to bundle up and pass around state about what AOTAutograd should do during tracing, and it contains a mix of inputs to AOTAutograd and other stuff that is inferred implicitly (`is_export=False`, `aot_id=...`).

Just looking at some of the stuff in it:
- `aot_id`: Including `AOTConfig.aot_id` will probably give us some false negatives (this is just a counter used to uniquely identify each compiled region in a python process).
- `is_export`: AOTAutograd is also used in an export setting where we don't need the warm cache, so this will just always be set to false.
- `num_params_buffers`/`dynamic_shapes`: these are both values that are inferred from looking at the dynamo graph. Then again, these are all cheap to cache and won't cause false negatives, so caching on them probably won't matter.
nvm I guess this comment isn't too relevant, since you're already defining a custom reduce for the AOTConfig (so you can explicitly e.g. not hash on aot_id)
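A rough sketch of the idea referenced above (assumed field names, not the PR's actual `_reduce_aot_config`): a custom reducer that builds the cache key only from fields that are stable across processes, deliberately skipping `aot_id` and the compiler callables.

```python
from dataclasses import fields


def _ident(x):
    return x


def _reduce_aot_config_sketch(config):
    """Return a reduce tuple whose payload is a stable, picklable key."""
    # aot_id is a per-process counter, so hashing it would cause spurious
    # cache misses; the compiler/partitioner callables and the decomposition
    # table are not directly hashable, so they are handled separately.
    excluded = {"aot_id", "fw_compiler", "bw_compiler", "inference_compiler",
                "partition_fn", "decompositions"}
    key = tuple(
        (f.name, getattr(config, f.name))
        for f in fields(config)        # AOTConfig is assumed to be a dataclass
        if f.name not in excluded
    )
    return (_ident, (key,))
```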
def _reduce_aot_config(config: AOTConfig):
    """
    Reduce the config to a stable key for caching.
Hmmm - if we care about non-inductor backends working with the cache, then we probably want to include `fw_compiler`/`bw_compiler`/`decomposition_table` in some form here, right?

That is kind of annoyingly redundant for the inductor case - we are already caching on the inductor source code, so caching on the exact e.g. decomposition_table arguments is redundant. But two use cases are:
(1) if someone who is not inductor is using AOTAutograd as their backend, we probably want to recompile if they change the decompositions they pass in
(2) there are two nice backends that we have for debugging, `torch.compile(backend="aot_eager")` vs. `torch.compile(backend="aot_eager_decomp_partition")`, that are mostly the same except that they use different decomposition tables and partitioning functions. We probably want the cache key to encode their differences.
Yup, I think we do need to add some combination of these, though computing a hash key from the callables directly seems hard. Will add in a followup PR (there are also probably other global settings like `vmap` we'll need to add too).
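One possible shape for that follow-up, sketched under the assumption that keying on a stable name is acceptable (these helpers are not in the PR): fingerprint the compilers and partitioner by their qualified names and the decomposition table by the ops it covers, so e.g. `aot_eager` and `aot_eager_decomp_partition` produce different keys.

```python
def _callable_fingerprint(fn) -> str:
    # A cheap, stable proxy for a callable's identity across processes.
    if fn is None:
        return "<none>"
    module = getattr(fn, "__module__", "?")
    name = getattr(fn, "__qualname__", type(fn).__name__)
    return f"{module}.{name}"


def _compiler_key(config) -> tuple:
    # config.decompositions is assumed to map ops -> decomposition functions.
    decomp_ops = tuple(
        sorted(str(op) for op in (getattr(config, "decompositions", None) or {}))
    )
    return (
        _callable_fingerprint(getattr(config, "fw_compiler", None)),
        _callable_fingerprint(getattr(config, "bw_compiler", None)),
        _callable_fingerprint(getattr(config, "partition_fn", None)),
        decomp_ops,
    )
```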
@pytorchmergebot rebase
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here.
Successfully rebased.
@pytorchbot merge
Merge failed. Reason: this PR needs a label before it can be merged. To add a label, you can comment to pytorchbot. Details for Dev Infra team: raised by workflow job.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
…d Cache (#124642)

This doesn't introduce any new behavior, but sets up a basic cache key generation mechanism that I can test. From here I will:
- Add checks on the ops in an input FXGraph to make sure they are safe to cache. We'll be conservative in the first version here.
- Add serialization for FX graphs
- Save these FX graphs to disk in the cache
- Support graphs with more complicated ops like higher order ops and specialized nn modules

Pull Request resolved: #124642
Approved by: https://github.com/aorenste
Stack from ghstack (oldest at bottom):
This doesn't introduce any new behavior, but sets up a basic cache key generation mechanism that I can test. From here I will:
- Add checks on the ops in an input FXGraph to make sure they are safe to cache. We'll be conservative in the first version here.
- Add serialization for FX graphs
- Save these FX graphs to disk in the cache
- Support graphs with more complicated ops like higher order ops and specialized nn modules
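For illustration only, a self-contained sketch of what "cache key generation" means here; it hashes a readable form of the FX graph rather than using the PR's FxGraphCachePickler-based machinery, and the helper name is an assumption.

```python
import hashlib

import torch


def sketch_cache_key(gm: torch.fx.GraphModule, config_repr: str) -> str:
    # Hash a stable textual form of the graph plus a summary of the config.
    payload = (str(gm.graph) + config_repr).encode()
    return hashlib.sha256(payload).hexdigest()


def fn(x):
    return x.sin() + 1


gm = torch.fx.symbolic_trace(fn)
key1 = sketch_cache_key(gm, config_repr="dynamic_shapes=False")
key2 = sketch_cache_key(gm, config_repr="dynamic_shapes=False")
assert key1 == key2  # key generation is deterministic for the same graph/config
```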
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang