
[aotinductor] Avoid generating redundant kernel loading code #110510

Closed · wants to merge 6 commits

Conversation

@desertfire (Contributor) commented Oct 4, 2023

Stack from ghstack (oldest at bottom):

Summary: 1) Stop forcing triton.unique_kernel_names to True for AOTInductor, because the unique kernel name can be read from metadata; 2) generate load_kernel only once per kernel, since there is no control flow in the generated code. This solves #105553.
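The deduplication in (2) can be sketched as follows. This is a hypothetical simplification of the wrapper-codegen idea, not the actual Inductor implementation; the names `generate_wrapper`, `loadKernel`, and `launchKernel` are invented for illustration:

```python
# Sketch: emit a loadKernel line only the first time a kernel name is seen.
# Because the generated wrapper has no control flow, a single load per
# kernel at its first use is always sufficient.

def generate_wrapper(kernel_calls):
    """kernel_calls: kernel names in the order they are launched."""
    seen_kernels = set()
    lines = []
    for name in kernel_calls:
        if name not in seen_kernels:
            # First use: emit the (hypothetical) load statement once.
            seen_kernels.add(name)
            lines.append(f'{name} = loadKernel("{name}.cubin");')
        lines.append(f'launchKernel({name}, grid, args);')
    return lines

# A kernel launched multiple times is loaded only once:
code = generate_wrapper(["triton_poi_fused_add_0",
                         "triton_poi_fused_add_0",
                         "triton_poi_fused_mul_1"])
```

Under this sketch, two distinct kernels produce exactly two load statements regardless of how many launches the wrapper contains.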

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @aakhundov @ColinPeppler

@pytorch-bot (bot) commented Oct 4, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/110510

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 7 Unrelated Failures

As of commit b4b55bd with merge base cf1b494:


This comment was automatically generated by Dr. CI and updates every 15 minutes.

desertfire added a commit that referenced this pull request Oct 4, 2023
ghstack-source-id: 0a8626035ffb259e5ead943685f3b1ed7cc3c531 · Pull Request resolved: #110510

desertfire added a commit that referenced this pull request Oct 4, 2023
ghstack-source-id: 8deae81099e0270a4ebb0f2569f5f8dbd6fad410 · Pull Request resolved: #110510

desertfire added a commit that referenced this pull request Oct 4, 2023
ghstack-source-id: 48b35e9ceca2f40ba34a57acdbebcd7fcbf0961f · Pull Request resolved: #110510
@desertfire desertfire added the topic: not user facing topic category label Oct 4, 2023
@chenyang78 (Contributor) left a comment


LGTM. Thanks.

desertfire added a commit that referenced this pull request Oct 5, 2023
ghstack-source-id: 4b85f2909ab748fafd418d6d9a339767dfece063 · Pull Request resolved: #110510

desertfire added a commit that referenced this pull request Oct 5, 2023
ghstack-source-id: 66f5415322618aa71fa83126ec2e198c0d513937 · Pull Request resolved: #110510

desertfire added a commit that referenced this pull request Oct 5, 2023
ghstack-source-id: 4725771e16833a6fd2e49cc4488e686f38fe9b00 · Pull Request resolved: #110510
@desertfire (Contributor, Author):

@pytorchbot merge -f "only affects AOTInductor tests and they have passed"

@pytorchmergebot (Collaborator):

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f only as a last resort; consider -i/--ignore-current instead to merge while ignoring current failures, which allows pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status.

@eellison (Contributor) commented Oct 6, 2023

Flagging that this caused an aot_inductor performance regression in the last day.

huydhn added a commit to pytorch/test-infra that referenced this pull request Oct 6, 2023
This handles the cases like
pytorch/pytorch#110608 or
pytorch/pytorch#110510 where there were a bunch
of infra flaky failures in which the runner crashes and no log was
found. The `runner_name` and `failure_line` fields are all empty in such
cases. Having no associated runner guarantees that the failure is an
unrelated infra flake.
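That classification rule can be written down as a small predicate. This is a hypothetical sketch of the stated logic, not the actual Dr. CI code; the function name `is_infra_flake` and the dict layout are invented for illustration:

```python
# Sketch: a failed job counts as an unrelated infra flake when the runner
# crashed and produced no log, i.e. both the runner_name and failure_line
# fields are empty (no runner was ever associated with the failure).

def is_infra_flake(job):
    """job: dict with string 'runner_name' and 'failure_line' fields."""
    return job.get("runner_name", "") == "" and job.get("failure_line", "") == ""

crashed = {"runner_name": "", "failure_line": ""}
real_failure = {"runner_name": "linux.2xlarge", "failure_line": "AssertionError"}
```

A job with an associated runner and a captured failure line is never classified as an infra flake under this rule.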

### Testing

* With pytorch/pytorch#110608, Dr. CI reported "✅ You can merge normally! (7 Unrelated Failures)" as of commit 2c38c88 with merge base f17fe89: seven FLAKY jobs that failed but were likely due to flakiness present on trunk.
* With pytorch/pytorch#110510, Dr. CI reported "❌ 1 New Failure, 7 Unrelated Failures" as of commit b4b55bd with merge base cf1b494: the NEW FAILURE was inductor / cuda12.1-py3.10-gcc9-sm86 / test (inductor_torchbench, 1, 1, linux.g5.4xlarge.nvidia.gpu), plus seven FLAKY jobs likely due to flakiness on trunk.
@huydhn (Contributor) commented Oct 6, 2023

@pytorchbot drci

(Please ignore this comment, I'm testing Dr.CI)

desertfire added a commit that referenced this pull request Oct 7, 2023
Summary: Forward fix a performance regression caused by #110510. After a model has been run once, all the kernel pointers are initialized, so removing the if-nullptr check causes those loadKernel calls to be unnecessarily executed again when the forward function is rerun. Another way to do this is to codegen loadKernel in the initializer, which I may do in a later PR.
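The guard in question can be sketched as follows. This is a hypothetical Python analogue of the generated C++ wrapper pattern `if (kernel == nullptr) { kernel = loadKernel(...); }`; the names `get_kernel` and `forward` are invented for illustration:

```python
# Sketch: cache loaded kernel handles so rerunning forward() does not
# reload them. Without the guard, every rerun would call load_kernel again.

load_count = 0
kernels = {}  # kernel name -> handle (absent/None until first load)

def load_kernel(name):
    # Stand-in for the expensive driver call that loads a compiled kernel.
    global load_count
    load_count += 1
    return f"<handle:{name}>"

def get_kernel(name):
    # The guard: load on first use only; later calls reuse the handle.
    if kernels.get(name) is None:
        kernels[name] = load_kernel(name)
    return kernels[name]

def forward():
    get_kernel("triton_poi_fused_add_0")
    get_kernel("triton_poi_fused_mul_1")

forward()
forward()  # rerun: with the guard, no additional loads happen
```

With the guard, two distinct kernels cost exactly two loads no matter how many times forward() runs; dropping the guard doubles the loads on every rerun, which is the regression described above.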

desertfire added a commit that referenced this pull request Oct 7, 2023
ghstack-source-id: d2d5531df77c4e69c38e0e13c21278ca6943f0f0 · Pull Request resolved: #110800

pytorchmergebot pushed a commit that referenced this pull request Oct 8, 2023
Pull Request resolved: #110800 · Approved by: https://github.com/jansel
@facebook-github-bot facebook-github-bot deleted the gh/desertfire/234/head branch October 9, 2023 14:23
desertfire added a commit that referenced this pull request Oct 10, 2023
Summary: To prevent perf regressions like the one caused by #110510.

ghstack-source-id: 6ffb0e3e035061dc24881cdd11651cf4e5122d2e · Pull Request resolved: #110972

desertfire added a commit that referenced this pull request Oct 10, 2023
ghstack-source-id: 329434d1fd74d36cede2096033da26615916606d · Pull Request resolved: #110972

desertfire added a commit that referenced this pull request Oct 10, 2023
ghstack-source-id: 93e313748ee7cbc37cac4bcbc44baa0e93c21438 · Pull Request resolved: #110972

desertfire added a commit that referenced this pull request Oct 11, 2023
ghstack-source-id: 039a824462413614d918d9dba63507f7251770a9 · Pull Request resolved: #110972

pytorchmergebot pushed a commit that referenced this pull request Oct 11, 2023
Pull Request resolved: #110972 · Approved by: https://github.com/chenyang78
6 participants