[mm_logs][ez] dump tuned mm info at lowering stage #148363

Closed

YUNQIUGUO wants to merge 1 commit into pytorch:main from YUNQIUGUO:export-D70507880

Contributor

YUNQIUGUO commented Mar 3, 2025 •

edited by pytorch-bot bot

Summary:
As the title says: dumping tuned mm info would be beneficial for judging e2e perf improvements.

Easy first step to dump mm info at lowering stage.

e.g.

fbsource/fbcode/caffe2/torch/_inductor/kernel/mm.py:525] [0/0] Tuned aten.addmm: m=16, n=6, k=16, layout=FixedLayout('cuda:0', torch.float32, size=[16, 6], stride=[6, 1])
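For illustration, a message of the shape shown above could be produced with Python's standard `logging` module. This is a minimal sketch, not the code from this PR: `FakeLayout`, `format_tuned_mm`, `log_tuned_mm`, and the logger name are hypothetical, and the file/line prefix in the real log comes from the logging framework rather than from the message itself.

```python
import logging
from dataclasses import dataclass

# Hypothetical logger name, chosen only for this sketch.
log = logging.getLogger("mm_logs_sketch")


@dataclass(repr=False)
class FakeLayout:
    """Hypothetical stand-in for Inductor's FixedLayout; used only for its repr."""

    device: str
    dtype: str
    size: list
    stride: list

    def __repr__(self) -> str:
        return (
            f"FixedLayout('{self.device}', {self.dtype}, "
            f"size={self.size}, stride={self.stride})"
        )


def format_tuned_mm(op: str, m: int, n: int, k: int, layout: FakeLayout) -> str:
    """Build the message body in the same shape as the example log line."""
    return f"Tuned {op}: m={m}, n={n}, k={k}, layout={layout}"


def log_tuned_mm(op: str, m: int, n: int, k: int, layout: FakeLayout) -> None:
    """Emit the message at INFO level."""
    log.info(format_tuned_mm(op, m, n, k, layout))


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    example = FakeLayout("cuda:0", "torch.float32", [16, 6], [6, 1])
    log_tuned_mm("aten.addmm", 16, 6, 16, example)
```

Keeping the formatting in its own function makes the message shape easy to check without capturing log output.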

Next step:

Dump overview info at the `post_grad_graph` stage, such as the overall count of `aten.mm` in the graph, and visualize it as a table.
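As a rough idea of what such an overview could look like, the sketch below counts ops and shapes and renders them as a plain-text table. It is an assumption-laden stand-in: it parses log lines of the form shown in the example above instead of walking the post-grad graph, it assumes static integer sizes for `m`, `n`, and `k`, and `count_tuned_mm` and `render_table` are hypothetical helper names.

```python
import re
from collections import Counter

# Matches the "Tuned <op>: m=.., n=.., k=.." part of the example log line.
# Assumes m, n, k are plain integers (no symbolic shapes).
TUNED_RE = re.compile(
    r"Tuned (?P<op>[\w.]+): m=(?P<m>\d+), n=(?P<n>\d+), k=(?P<k>\d+)"
)


def count_tuned_mm(lines):
    """Count occurrences of each (op, m, n, k) combination in the log lines."""
    counts = Counter()
    for line in lines:
        match = TUNED_RE.search(line)
        if match:
            key = (match["op"], int(match["m"]), int(match["n"]), int(match["k"]))
            counts[key] += 1
    return counts


def render_table(counts):
    """Render the counts as a fixed-width text table, sorted by op and shape."""
    header = f"{'op':<12} {'m':>6} {'n':>6} {'k':>6} {'count':>6}"
    rows = [header, "-" * len(header)]
    for (op, m, n, k), count in sorted(counts.items()):
        rows.append(f"{op:<12} {m:>6} {n:>6} {k:>6} {count:>6}")
    return "\n".join(rows)


if __name__ == "__main__":
    sample = [
        "mm.py:525] [0/0] Tuned aten.addmm: m=16, n=6, k=16, layout=...",
        "mm.py:525] [0/0] Tuned aten.addmm: m=16, n=6, k=16, layout=...",
        "mm.py:525] [0/0] Tuned aten.mm: m=8, n=4, k=2, layout=...",
    ]
    print(render_table(count_tuned_mm(sample)))
```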

Test Plan: Manually inspected the log output of the AOT Inductor bmm and mm unit tests.

Differential Revision: D70507880

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov

pytorch-bot bot commented Mar 3, 2025 •

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/148363

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit d1dd36a with merge base e4c558b:

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

trunk / libtorch-linux-focal-cuda12.4-py3.10-gcc9-debug / build (gh) (#148495)
undefined reference to `std::__throw_bad_array_new_length()'

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot bot added ciflow/inductor module: inductor labels

YUNQIUGUO force-pushed the export-D70507880 branch 2 times, most recently from e4c47be to 46156dc Compare

March 3, 2025 21:18

Contributor

facebook-github-bot commented Mar 3, 2025

This pull request was exported from Phabricator. Differential Revision: D70507880

facebook-github-bot added the fb-exported label

YUNQIUGUO force-pushed the export-D70507880 branch from 46156dc to 4449118 Compare

March 3, 2025 21:18

YUNQIUGUO added the topic: not user facing label

ColinPeppler requested a review from eellison

March 3, 2025 21:24

Contributor

ColinPeppler commented Mar 3, 2025

I think log.info is the appropriate log level. LGTM!

YUNQIUGUO force-pushed the export-D70507880 branch from 4449118 to c816591 Compare

March 3, 2025 21:34

YUNQIUGUO force-pushed the export-D70507880 branch from c816591 to 426cce7 Compare

March 3, 2025 21:49

YUNQIUGUO added a commit to YUNQIUGUO/pytorch that referenced this pull request


          [mm_logs][ez] dump tuned mm info at lowering stage (pytorch#148363)

426cce7

Summary:
Pull Request resolved: pytorch#148363

As the title says: dumping tuned mm info would be beneficial for judging e2e perf improvements.

Easy first step to dump mm info at lowering stage.

e.g.

```
fbsource/fbcode/caffe2/torch/_inductor/kernel/mm.py:525] [0/0] Tuned aten.addmm: m=16, n=6, k=16, layout=FixedLayout('cuda:0', torch.float32, size=[16, 6], stride=[6, 1])
```

Next step:

Dump overview info at the `post_grad_graph` stage, such as the overall count of `aten.mm` in the graph, and visualize it as a table.

Test Plan: by looking very hard in aot inductor bmm and mm UTs.

Differential Revision: D70507880

YUNQIUGUO force-pushed the export-D70507880 branch from 426cce7 to a4bebee Compare

March 4, 2025 01:27

YUNQIUGUO force-pushed the export-D70507880 branch 2 times, most recently from de8ab58 to 5852540 Compare

March 4, 2025 06:41

YUNQIUGUO force-pushed the export-D70507880 branch from 5852540 to 1b63fc1 Compare

March 4, 2025 06:47

YUNQIUGUO added a commit to YUNQIUGUO/pytorch that referenced this pull request


          [mm_logs][ez] dump tuned mm info at lowering stage (pytorch#148363)

cb34040

Summary:

As the title says: dumping tuned mm info would be beneficial for judging e2e perf improvements.

Easy first step to dump mm info at lowering stage.

e.g.

```
fbcode/caffe2/torch/_inductor/kernel/mm.py:525] [0/0] Tuned aten.addmm: m=16, n=6, k=16, mat1_dtype=torch.float32, mat2_dtype=torch.float32, output_layout=FixedLayout('cpu', torch.float32, size=[16, 6], stride=[6, 1])
```


Next step:

Dump overview info at the `post_grad_graph` stage, such as the overall count of `aten.mm` in the graph, and visualize it as a table.

Test Plan: by looking very hard in aot inductor bmm and mm UTs.

Reviewed By: ColinPeppler

Differential Revision: D70507880

YUNQIUGUO force-pushed the export-D70507880 branch from 1b63fc1 to cb34040 Compare

March 4, 2025 06:58

YUNQIUGUO requested a review from henrylhtsang

March 4, 2025 20:40


          [mm_logs][ez] dump tuned mm info at lowering stage (pytorch#148363)

d1dd36a

YUNQIUGUO force-pushed the export-D70507880 branch from cb34040 to d1dd36a Compare

March 4, 2025 20:46

henrylhtsang approved these changes

pytorch-bot bot added the ciflow/trunk label

Contributor

facebook-github-bot commented Mar 5, 2025

@pytorchbot merge

(Initiating merge automatically since Phabricator Diff has merged)

pytorchmergebot added the merging label

Collaborator

pytorchmergebot commented Mar 5, 2025

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

pytorchmergebot added the Merged label

pytorchmergebot closed this in 1673bc7

pytorchmergebot removed the merging label

Labels

ciflow/inductor ciflow/trunk fb-exported Merged module: inductor topic: not user facing