Only call triton in worker process, ahead of time compile #146334

jamesjwu · 2025-02-03T19:52:59Z

Stack from ghstack (oldest at bottom):

-> Only call triton in worker process, ahead of time compile #146334

This PR extends #144288 by combining two pieces: calling triton only in worker processes, and the future cache. Triton compilation now starts in the worker processes earlier, during inductor codegen. Instead of calling `async_compile.triton` for the first time only after all of the code has been generated, we start compiling a kernel as soon as we know it will be needed. Later, when the generated inductor code is loaded, it can simply read from the in-memory future cache, which considerably increases parallelism.

LambdaFutures are much faster to kick off than TritonFutures because they do not need to load from TritonCodeCache at all, so kicking off these worker jobs adds very little time to inductor codegen.
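The early kick-off pattern described above can be illustrated with a small, generic sketch. This is not the code from this PR or from `torch/_inductor/async_compile.py`: the names `FutureCache`, `kick_off`, `load`, and `compile_kernel` are hypothetical, and a thread pool from the standard library stands in for the worker processes so that the example stays self-contained.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Dict


def compile_kernel(source: str) -> str:
    # Hypothetical stand-in for an expensive kernel compile.
    # This is NOT a real triton or inductor API.
    return f"compiled<{source}>"


class FutureCache:
    """Illustrative in-memory cache of in-flight compile jobs, keyed by source."""

    def __init__(self, max_workers: int = 4) -> None:
        # Assumption: a thread pool plays the role of the compile workers.
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._futures: Dict[str, Future] = {}

    def kick_off(self, source: str) -> Future:
        # Called during "codegen": submit the job as soon as we know the
        # kernel is needed, and remember the future so it is not resubmitted.
        if source not in self._futures:
            self._futures[source] = self._pool.submit(compile_kernel, source)
        return self._futures[source]

    def load(self, source: str) -> str:
        # Called when the generated code is "loaded": reuse the in-flight
        # future if one exists, otherwise fall back to submitting it now.
        return self.kick_off(source).result()


cache = FutureCache()
for kernel_source in ["kernel_a", "kernel_b"]:
    cache.kick_off(kernel_source)  # compilation overlaps with remaining codegen
print(cache.load("kernel_a"))
```

In this sketch, `load` only blocks for whatever compile time is still outstanding, since the job was submitted while the rest of the code was being generated.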

Differential Revision: D69013710

pytorch-bot · 2025-02-03T19:53:04Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/146334

❌ 2 New Failures, 65 Pending

As of commit 19b1b17 with merge base 550441a:

NEW FAILURES - The following jobs have failed:

- Check Labels / Check labels (gh)
  `RuntimeError: Error checking labels: PR does not have required labels`
- Lint / lintrunner-noclang / linux-job (gh)
  `>>> Lint for torch/_inductor/async_compile.py:`

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2025-02-03T19:53:05Z

This pull request was exported from Phabricator. Differential Revision: D69013710

ghstack-source-id: 264436138
Pull Request resolved: #146334

github-actions · 2025-02-03T19:54:14Z

This PR needs a `release notes:` label

If your changes are user facing and intended to be a part of release notes, please use a label starting with `release notes:`.

If not, please add the `topic: not user facing` label.

To add a label, you can comment to pytorchbot, for example:
`@pytorchbot label "topic: not user facing"`

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

pytorch-bot bot added the ciflow/inductor and module: inductor labels Feb 3, 2025

facebook-github-bot added the fb-exported label Feb 3, 2025

jamesjwu closed this Feb 3, 2025

github-actions bot deleted the gh/jamesjwu/101/head branch March 6, 2025 02:07
