
[inductor] optimize isa dry compile time. #124602

Closed · wants to merge 10 commits

Conversation

xuhancn
Collaborator

@xuhancn xuhancn commented Apr 22, 2024

Fixes #100378
The original issue is that the startup dry compile costs almost 1 second.

This PR adds the compiler version, ISA build options, and PyTorch version to the test binary path hash, so runs with the same compiler, same ISA, and same PyTorch version can skip the dry compile.
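The fingerprinting idea can be sketched roughly like this; a hypothetical illustration only, with made-up names (`isa_fingerprint`) rather than the actual torch/_inductor/codecache.py code:

```python
import hashlib

def isa_fingerprint(compiler_version: str, isa_build_flags: list[str], torch_version: str) -> str:
    """Combine the compiler version string, the ISA build flags, and the
    PyTorch version into one stable short hash. If a previously built test
    binary exists under a path containing this hash, the dry compile can
    be skipped, since the same inputs would produce the same binary."""
    key = "|".join([compiler_version, " ".join(isa_build_flags), torch_version])
    return hashlib.sha256(key.encode()).hexdigest()[:16]
```

Because the hash is deterministic, a second process with identical compiler, flags, and PyTorch version computes the same path and finds the cached binary.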

Local test:

First time: we need to compile all the C++ modules, which costs 16.5s.
(screenshot)

Second time: we skip the dry compile because the ISA fingerprint matches; it costs only 0.36s.
(screenshot)

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @voznesenskym @penguinwu @EikanWang @Guobing-Chen @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang

@xuhancn xuhancn requested a review from EikanWang April 22, 2024 09:14

pytorch-bot bot commented Apr 22, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/124602

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit fe29f70 with merge base 87079f5:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@xuhancn xuhancn marked this pull request as ready for review April 22, 2024 09:24
@xuhancn xuhancn requested a review from jgong5 April 22, 2024 09:24
@xuhancn
Collaborator Author

xuhancn commented Apr 22, 2024

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Collaborator

Successfully rebased xu_optimize_isa_dry_compile onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout xu_optimize_isa_dry_compile && git pull --rebase)

@xuhancn xuhancn added the intel label Apr 22, 2024
@colesbury colesbury added the triaged label Apr 22, 2024
torch/_inductor/codecache.py (review thread: outdated, resolved)
@xuhancn xuhancn requested a review from ezyang April 24, 2024 07:13
@xuhancn xuhancn changed the title optimize isa dry compile time. [inductor] optimize isa dry compile time. Apr 24, 2024
@ezyang
Contributor

ezyang commented Apr 24, 2024

I guess we can land this as is, but you're still doing multiple(!) subprocess calls, and I would hope that there was some way to architect this where those were not needed either.

@xuhancn
Collaborator Author

xuhancn commented Apr 24, 2024

> I guess we can land this as is, but you're still doing multiple(!) subprocess calls, and I would hope that there was some way to architect this where those were not needed either.

There are already some lru_cache decorators caching the results. What further optimization do you have in mind?

@xuhancn
Collaborator Author

xuhancn commented Apr 24, 2024

@ezyang In 3e09156 I added a cache to pick_vec_isa, which reduces the number of calls into the underlying detection functions.

@xuhancn xuhancn requested a review from malfet April 24, 2024 16:54
@xuhancn
Collaborator Author

xuhancn commented Apr 24, 2024

@malfet Any comments on the next step?

@xuhancn xuhancn added the intel priority label Apr 24, 2024
@ezyang
Contributor

ezyang commented Apr 24, 2024

lru_cache doesn't help, you need a cache that persists across processes.

@xuhancn
Collaborator Author

xuhancn commented Apr 24, 2024

> lru_cache doesn't help, you need a cache that persists across processes.

I guess you mean that multiple processes call pick_vec_isa to get the ISA for the C++ build. What I could do is write the ISA info to a temp file. But there is still a problem: we always need the compiler version information to verify that the compiler's capabilities cover the ISA level, and getting the compiler version still requires a subprocess call. So I think we may not be able to eliminate the subprocess calls entirely.

@xuhancn
Collaborator Author

xuhancn commented Apr 25, 2024

@ezyang we can land this PR first.
If there are other optimization ideas, let's create a new PR for them.

@ezyang
Contributor

ezyang commented Apr 25, 2024

ci fails look real

@xuhancn
Collaborator Author

xuhancn commented Apr 25, 2024

> ci fails look real

Let me debug it.

@xuhancn
Collaborator Author

xuhancn commented Apr 25, 2024

> ci fails look real

@ezyang the CI failed because the lru_cache on pick_vec_isa cached a value that did not match the test case's expected result. After reverting that code, it works well.

@xuhancn
Collaborator Author

xuhancn commented Apr 25, 2024

@pytorchmergebot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk label Apr 25, 2024
@pytorchmergebot
Collaborator

Merge failed

Reason: This PR needs a release notes: label
If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.


@xuhancn
Collaborator Author

xuhancn commented Apr 25, 2024

@pytorchbot label "topic: not user facing"

@pytorch-bot pytorch-bot bot added the topic: not user facing label Apr 25, 2024
@xuhancn
Collaborator Author

xuhancn commented Apr 25, 2024

@pytorchmergebot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status.

@xuhancn xuhancn deleted the xu_optimize_isa_dry_compile branch April 26, 2024 01:23
carmocca pushed a commit to carmocca/pytorch that referenced this pull request Apr 29, 2024
Pull Request resolved: pytorch#124602
Approved by: https://github.com/jgong5, https://github.com/ezyang
andoorve pushed a commit to andoorve/pytorch that referenced this pull request May 1, 2024
petrex pushed a commit to petrex/pytorch that referenced this pull request May 3, 2024
Labels
ciflow/inductor, ciflow/trunk, intel priority, intel, Merged, module: inductor, open source, topic: not user facing, triaged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

VecISA.__bool__ is very expensive (nearly a second) on startup
6 participants