Conversation

@peterbell10 peterbell10 commented Oct 30, 2023

Stack from ghstack (oldest at bottom):

This function repeatedly flattens and unflattens the `args, kwargs` pair so we
get a quite significant perf improvement from saving the `flat_args` and
operating directly on those. I see a 15% improvement in dispatch for
`empty_strided`.

[ghstack-poisoned]
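
A minimal sketch of the flatten-once pattern described above, using torch.utils._pytree; the dispatch helper and the per-leaf conversion are illustrative assumptions, not the actual FakeTensorMode.dispatch code:

import torch
import torch.utils._pytree as pytree

def dispatch(func, args, kwargs):
    # Flatten (args, kwargs) exactly once, keeping the spec needed to
    # rebuild the original structure later.
    flat_args, args_spec = pytree.tree_flatten((args, kwargs))

    # Helpers now operate on the flat list directly instead of each one
    # re-flattening, e.g. converting every tensor leaf in a single pass.
    flat_args = [a.to("meta") if isinstance(a, torch.Tensor) else a
                 for a in flat_args]

    # Unflatten only at the point the structured form is actually needed.
    args, kwargs = pytree.tree_unflatten(flat_args, args_spec)
    return func(*args, **kwargs)
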
pytorch-bot bot commented Oct 30, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/112418

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 2cafa24 with merge base 29f3d39:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@peterbell10 peterbell10 changed the title from "Reuse flat_args throughout FakeTensorMode.dispatch" to "[FakeTensor] Reuse flat_args throughout FakeTensorMode.dispatch" on Oct 31, 2023
@peterbell10 peterbell10 marked this pull request as ready for review October 31, 2023 15:53
@peterbell10 peterbell10 requested a review from lezcano October 31, 2023 15:53
@peterbell10 peterbell10 added the `topic: not user facing` label Oct 31, 2023
@lezcano lezcano left a comment


Two optional nits


-    def wrap_meta_outputs_with_default_device_logic(self, r, func, args, kwargs):
-        wrap = self.gen_wrap_fn(func, args, kwargs)
+    def wrap_meta_outputs_with_default_device_logic(self, r, func, flat_args, kwargs):

nit: perhaps just pass `device` rather than passing `flat_args` and `kwargs`. I found this a bit confusing.
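
A sketch of what this suggestion might look like; the name `common_device` and the wrapping body are hypothetical, and the real FakeTensor wrapping logic is more involved:

import torch
import torch.utils._pytree as pytree

def wrap_outputs(r, common_device: torch.device):
    # Taking the device directly makes the dependency explicit instead of
    # threading flat_args and kwargs through just to re-derive it.
    def wrap(t):
        if isinstance(t, torch.Tensor) and t.device.type == "meta":
            return torch.empty_like(t, device=common_device)
        return t
    return pytree.tree_map(wrap, r)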


-        return tree_map(map_out, r)
+        flat_out = [map_out(o) for o in flat_out]
+        return pytree.tree_unflatten(flat_out, out_spec)

Here perhaps we just want to `tree_map` all the transformations on `r` rather than unpacking and packing again?
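
For reference, a sketch contrasting the two alternatives discussed here (illustrative only):

import torch.utils._pytree as pytree

def wrap_via_tree_map(r, map_out):
    # The suggestion: map over the structured output directly, with no
    # explicit flatten/unflatten round trip.
    return pytree.tree_map(map_out, r)

def wrap_via_flat_list(r, map_out):
    # What the diff does: flatten once, transform the flat list, then
    # rebuild the original structure from the saved spec.
    flat_out, out_spec = pytree.tree_flatten(r)
    flat_out = [map_out(o) for o in flat_out]
    return pytree.tree_unflatten(flat_out, out_spec)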

…patch"

This function repeatedly flattens and unflattens the `args, kwargs` pair so we
get a quite significant perf improvement from saving the `flat_args` and
operating directly on those. I see a 15% improvement in dispatch for
`empty_strided`.

[ghstack-poisoned]
…patch"

This function repeatedly flattens and unflattens the `args, kwargs` pair so we
get a quite significant perf improvement from saving the `flat_args` and
operating directly on those. I see a 15% improvement in dispatch for
`empty_strided`.

[ghstack-poisoned]
pytorchmergebot pushed a commit that referenced this pull request Nov 1, 2023
`ShapeEnv` has tons of functionality that is conditioned on this
`translation_validation_enabled()` check, to the point where 8% of
time in `empty_strided` is spent just in that function.

However, it doesn't really make sense for the value of
`translation_validation_enabled()` to change throughout the life of a `ShapeEnv`
so we might as well run the check once and store it in the `ShapeEnv`.
Pull Request resolved: #112493
Approved by: https://github.com/lezcano
ghstack dependencies: #112418
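
A minimal sketch of the caching idea from #112493; the stub below stands in for the real translation_validation_enabled() check, and the method name is hypothetical:

def translation_validation_enabled() -> bool:
    # Stand-in for the real check, which consults configuration and was
    # measured at ~8% of empty_strided dispatch time when called per use.
    return False

class ShapeEnv:
    def __init__(self) -> None:
        # The value cannot change over the life of a ShapeEnv, so evaluate
        # the check once and cache the result.
        self._translation_validation_enabled = translation_validation_enabled()

    def _maybe_validate(self) -> None:
        # Hot paths read a cached attribute instead of calling the
        # function every time.
        if self._translation_validation_enabled:
            pass  # run translation validation
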
@facebook-github-bot facebook-github-bot deleted the gh/peterbell10/651/head branch November 5, 2023 15:26
xuhancn pushed a commit to xuhancn/pytorch that referenced this pull request Nov 7, 2023
xuhancn pushed a commit to xuhancn/pytorch that referenced this pull request Nov 7, 2023
Skylion007 pushed a commit to Skylion007/pytorch that referenced this pull request Nov 14, 2023
Skylion007 pushed a commit to Skylion007/pytorch that referenced this pull request Nov 14, 2023