Add flag `_metrics_log_runtime` to disable runtime metric logging by default #153506

exclamaforte · 2025-05-13T23:58:44Z

#152708 extended get_estimated_runtime to many more types of SchedulerNodes. That increased compile time, because we always call get_estimated_runtime to populate the metrics table. This PR puts that logging behind a flag that is off by default, which reduces the instruction count by 8%. Long term, we should probably merge metrics.py with TORCH_LOGS/tlparse (suggestion from @xmfan).

Update: added support for TORCH_LOGS for the metrics logging.
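For readers skimming the thread, here is a minimal sketch of the gating idea described above. It is not the actual Inductor code: `Config`, `Metrics`, `estimate_runtime`, and `record_metrics` are hypothetical stand-ins, and the only point is that the expensive estimate is never called while the flag is off.

```python
from dataclasses import dataclass, field


@dataclass
class Config:
    # Hypothetical stand-in for the inductor config; logging is off by default.
    metrics_log_runtime: bool = False


@dataclass
class Metrics:
    # Hypothetical stand-in for the metrics table.
    node_runtimes: list = field(default_factory=list)


# Counts how often the expensive estimate actually runs.
CALLS = {"n": 0}


def estimate_runtime(node: str) -> float:
    """Hypothetical stand-in for an expensive per-node runtime estimate."""
    CALLS["n"] += 1
    return float(len(node))


def record_metrics(nodes, config, metrics):
    # Return early so the estimate is skipped entirely when the flag is off.
    if not config.metrics_log_runtime:
        return
    metrics.node_runtimes += [(n, estimate_runtime(n)) for n in nodes]
```

With the default `Config()`, `record_metrics` leaves the table empty and `CALLS["n"]` stays at zero; the cost is only paid when the flag is set explicitly.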

Test Plan:
Covered by mm_loop.py and many existing tests.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov

pytorch-bot · 2025-05-13T23:58:48Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/153506

✅ No Failures

As of commit e3f7169 with merge base 72a3c8d:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

eellison

Need to fix tests.

eellison · 2025-05-15T16:15:09Z

torch/_inductor/compile_fx.py

-                    metrics.num_bytes_accessed += num_bytes
-                    metrics.node_runtimes += node_runtimes
-                    metrics.nodes_num_elem += nodes_num_elem
+                    if config._metrics_log_runtime:


Should we just move this to TORCH_LOGS="inductor_metrics"? It's a small change to add the logs; see https://github.com/pytorch/pytorch/pull/147248/files
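A rough illustration of what gating on a log setting instead of a config flag could look like. This sketch uses only the standard `logging` module; the logger name `example.inductor_metrics` and both helper functions are assumptions for illustration and do not reflect how TORCH_LOGS artifacts are registered in PyTorch.

```python
import logging

# Hypothetical logger name, standing in for a TORCH_LOGS-controlled artifact.
metrics_log = logging.getLogger("example.inductor_metrics")
metrics_log.addHandler(logging.NullHandler())
metrics_log.propagate = False
metrics_log.setLevel(logging.WARNING)  # disabled by default in this sketch


def expensive_summary(nodes):
    """Hypothetical stand-in for the costly runtime estimate."""
    return {n: float(len(n)) for n in nodes}


def log_metrics(nodes) -> bool:
    # Check the level first so the expensive work is skipped when logging is off.
    if not metrics_log.isEnabledFor(logging.DEBUG):
        return False
    metrics_log.debug("node runtimes: %s", expensive_summary(nodes))
    return True
```

The design point is the same as with the flag: the check happens before the estimate is computed, so a disabled log costs nothing beyond the level lookup.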

eellison

looks good

eellison · 2025-05-21T18:26:21Z

test/dynamo/test_modules.py

        "inductor backend is not available",
    )
    def test_save_and_load_inductor(self):
+        torch._logging.set_logs(inductor_metrics=True)


Set this back to `None` at the end of the test.
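One way to make sure a test always restores the log setting is to wrap the change in a context manager with `try`/`finally`. The sketch below uses a standard-library logger purely as a stand-in for the `torch._logging.set_logs(inductor_metrics=True)` call in the diff; `temporary_level` is a hypothetical helper, not a PyTorch API.

```python
import contextlib
import logging


@contextlib.contextmanager
def temporary_level(logger: logging.Logger, level: int):
    """Set a logger level for the duration of a block, then restore it."""
    previous = logger.level
    logger.setLevel(level)
    try:
        yield logger
    finally:
        # Runs even if the test body raises, so later tests see the old level.
        logger.setLevel(previous)
```

A test body would then run inside `with temporary_level(some_logger, logging.DEBUG): ...`, and the previous level comes back whether the test passes or fails.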

exclamaforte · 2025-05-21T19:58:16Z

@pytorchbot merge

pytorchmergebot · 2025-05-21T20:01:05Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status
here

pytorchmergebot · 2025-05-21T20:06:33Z

Merge failed

Reason: 1 mandatory check(s) failed. The first few are:

pull / linux-focal-cuda12.6-py3.10-gcc11-sm89 / test (default, 5, 5, linux.g6.4xlarge.experimental.nvidia.gpu)

Dig deeper by viewing the failures on hud

Details for Dev Infra team

Raised by workflow job

Failing merge rule: Core Maintainers

exclamaforte · 2025-05-21T20:11:09Z

@pytorchbot merge

pytorchmergebot · 2025-05-21T20:13:08Z

Merge started

pytorchmergebot · 2025-05-21T20:29:08Z

Merge failed

Reason: 1 mandatory check(s) failed. The first few are:

pull / before-test / target-determination

Failing merge rule: Core Maintainers

exclamaforte · 2025-05-22T00:17:27Z

@pytorchbot merge

pytorchmergebot · 2025-05-22T00:19:27Z

Merge started

pytorch-bot bot added ciflow/inductor module: inductor labels May 13, 2025

exclamaforte changed the title ~~add flag _metrics_log_runtime to disable runtime metric logging by default~~ Add flag _metrics_log_runtime to disable runtime metric logging by default May 13, 2025

exclamaforte added the topic: improvements topic category label May 14, 2025

exclamaforte requested review from eellison and laithsakka May 14, 2025 00:01

exclamaforte added the release notes: inductor label May 14, 2025

eellison reviewed May 15, 2025

exclamaforte force-pushed the exclamaforte/flag-metrics branch from 2739678 to 7a15fa1 Compare May 20, 2025 18:48

pytorch-bot bot added the module: dynamo label May 20, 2025

exclamaforte removed the module: dynamo label May 20, 2025

pytorch-bot bot added the module: dynamo label May 20, 2025

exclamaforte force-pushed the exclamaforte/flag-metrics branch from de5d03a to f5b3f92 Compare May 21, 2025 08:21

exclamaforte requested a review from eellison May 21, 2025 17:05

eellison approved these changes May 21, 2025

pytorch-bot bot added the ciflow/trunk label May 21, 2025

pytorchmergebot added the merging label May 21, 2025

pytorchmergebot removed the merging label May 21, 2025

pytorchmergebot added the merging label May 21, 2025

pytorchmergebot removed the merging label May 21, 2025

exclamaforte added 3 commits May 21, 2025 13:57

add flag _metrics_log_runtime

27e681a

add inductor metrics to logging

ac64b37

remove breakpoint

a76b5d5

exclamaforte added 8 commits May 21, 2025 13:57

fix tests

bf45f18

fix tests

b1c2856

fix tests that are missing new log setting

79d1850

lint

266bf1f

skip sm70

5a8a3dc

unset logging in tests

7611297

add simple docstring to codegen_and_compile to stop linter

60794df

add missing log start

e3f7169

exclamaforte force-pushed the exclamaforte/flag-metrics branch from cfc623b to e3f7169 Compare May 21, 2025 20:57

pytorchmergebot added the merging label May 22, 2025

pytorchmergebot closed this in 254293b May 22, 2025

pytorchmergebot added Merged and removed merging labels May 22, 2025

github-actions bot deleted the exclamaforte/flag-metrics branch June 21, 2025 02:20
