[inductor] optimize isa dry compile time. #124602
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/124602
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit fe29f70 with merge base 87079f5. This comment was automatically generated by Dr. CI and updates every 15 minutes. |
@pytorchbot rebase |
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here |
Successfully rebased |
d69e743 to 6a1b4ec
I guess we can land this as is, but you're still doing multiple(!) subprocess calls, and I would hope that there was some way to architect this where those were not needed either. |
Here has some |
@malfet Any comments for me on the next step? |
lru_cache doesn't help, you need a cache that persists across processes. |
I guess you mean multiple process calls |
@ezyang we can land this PR first. |
CI failures look real |
Let me debug it. |
@ezyang the CI failed for the reason that |
@pytorchmergebot merge |
Merge failed. Reason: this PR needs a label before it can be merged. To add a label, you can comment to pytorchbot. |
@pytorchbot label "topic: not user facing" |
@pytorchmergebot merge |
Merge started. Your change will be merged once all checks pass (ETA 0-4 Hours). |
Fixes pytorch#100378
The original issue was caused by the startup dry compile, which costs almost 1 second. This PR adds compiler version info, ISA build options, and PyTorch version info to the hash of the test binary path, so runs with the same compiler, same ISA, and same PyTorch version can skip the dry compile.
Local test:
First time: <img width="1588" alt="image" src="https://github.com/pytorch/pytorch/assets/8433590/d0b83f5d-849e-4f37-9977-3b0276e5a5a5"> We need to compile all C++ modules, and it costs 16.5s.
Second time: <img width="1589" alt="image" src="https://github.com/pytorch/pytorch/assets/8433590/44f07fb0-5a15-4342-b0f6-dfe2c880b5d3"> We skipped the dry compile because the ISA fingerprint was the same. It costs only 0.36s.
Pull Request resolved: pytorch#124602
Approved by: https://github.com/jgong5, https://github.com/ezyang
Fixes #100378
The original issue was caused by the startup dry compile, which costs almost 1 second.
This PR adds compiler version info, ISA build options, and PyTorch version info to the hash of the test binary path.
As a result, runs with the same compiler, same ISA, and same PyTorch version can skip the dry compile.
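To illustrate the idea of folding version information into a hash, here is a minimal sketch. It is not the code from this PR: the function name `isa_fingerprint`, its parameters, and the example version strings are all hypothetical, and SHA-256 is just one reasonable choice of hash.

```python
import hashlib
import json


def isa_fingerprint(compiler_version: str, isa_build_flags: str, torch_version: str) -> str:
    """Hypothetical helper: hash the inputs that decide whether an
    earlier dry-compile result can be reused."""
    # json.dumps keeps field boundaries unambiguous, so ("ab", "c")
    # and ("a", "bc") cannot collide by concatenation.
    payload = json.dumps([compiler_version, isa_build_flags, torch_version])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Example values below are made up for illustration.
same_a = isa_fingerprint("g++ 11.4.0", "-mavx2 -mfma", "2.4.0")
same_b = isa_fingerprint("g++ 11.4.0", "-mavx2 -mfma", "2.4.0")
different_isa = isa_fingerprint("g++ 11.4.0", "-mavx512f", "2.4.0")
```

Identical inputs give the same fingerprint (`same_a == same_b`), while changing any one of the three inputs gives a different one, which is what lets a path derived from the fingerprint act as a cache key.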
Local test:
First time:
![image](https://github.com/pytorch/pytorch/assets/8433590/d0b83f5d-849e-4f37-9977-3b0276e5a5a5)
We need to compile all C++ modules, and it costs 16.5s.
Second time:
![image](https://github.com/pytorch/pytorch/assets/8433590/44f07fb0-5a15-4342-b0f6-dfe2c880b5d3)
We skipped the dry compile because the ISA fingerprint was the same. It costs only 0.36s.
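The skip behaviour measured above can be sketched as a check for an on-disk marker keyed by the fingerprint. This is an assumption-laden illustration rather than the PR's implementation: `cached_dry_compile`, the marker file naming, and `fake_dry_compile` are hypothetical. Because the marker lives on disk, it survives across processes, which an in-memory `functools.lru_cache` does not (the point raised in the review thread).

```python
import os
import tempfile
from typing import Callable, List


def cached_dry_compile(cache_dir: str, fingerprint: str,
                       dry_compile: Callable[[], bool]) -> bool:
    """Hypothetical helper: run dry_compile only if no marker file
    exists for this fingerprint."""
    marker = os.path.join(cache_dir, f"isa_check_{fingerprint}.ok")
    if os.path.exists(marker):
        # A previous run (possibly another process) already succeeded.
        return True
    ok = dry_compile()
    if ok:
        os.makedirs(cache_dir, exist_ok=True)
        with open(marker, "w") as f:
            f.write("ok\n")
    return ok


calls: List[int] = []


def fake_dry_compile() -> bool:
    # Stand-in for the expensive compiler invocation.
    calls.append(1)
    return True


with tempfile.TemporaryDirectory() as cache_dir:
    first = cached_dry_compile(cache_dir, "abc123", fake_dry_compile)
    second = cached_dry_compile(cache_dir, "abc123", fake_dry_compile)  # skipped
    other = cached_dry_compile(cache_dir, "def456", fake_dry_compile)   # new fingerprint
```

In this sketch the stand-in compile runs twice in total: once for `"abc123"` and once for `"def456"`. The second call with `"abc123"` returns from the marker without invoking it.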
cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @voznesenskym @penguinwu @EikanWang @Guobing-Chen @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang