perf(build): batch generated dispatch builds #609
Merged
Force-pushed from 1f7d42d to bbd9546
Author (Collaborator): @wooway777, please do the initial review; @Ziminli, please do the final review.
Ziminli previously approved these changes on May 15, 2026
Force-pushed from bbd9546 to 17d7823
wooway777 approved these changes on May 15, 2026
Ziminli approved these changes on May 15, 2026
Summary
- Keep a `generated_dispatch.h` declaration header while batching per-operator dispatch definitions.
- Make the `IndexToOffset` header helper `inline` so generated dispatch shards can include the same implementation headers safely.
- Add `INFINIOPS_DISPATCH_BATCH_SIZE` to tune generated dispatch source batch size.

Motivation
The generated dispatch translation unit can become large as bindings grow. Splitting it into multiple generated source files lets the build system compile dispatch definitions in parallel and reduces single-translation-unit compiler pressure.
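As a rough illustration of the splitting described above, the sketch below groups operator names into fixed-size batches and assigns each batch its own generated source file. The function names `batch_operators` and `plan_shards`, the file-name pattern `generated_dispatch_<index>.cc`, and the operator names are all hypothetical; the actual logic in `scripts/generate_wrappers.py` may differ.

```python
from typing import Dict, List, Sequence


def batch_operators(operators: Sequence[str], batch_size: int) -> List[List[str]]:
    """Split operator names into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    return [
        list(operators[start:start + batch_size])
        for start in range(0, len(operators), batch_size)
    ]


def plan_shards(operators: Sequence[str], batch_size: int) -> Dict[str, List[str]]:
    """Map each (hypothetical) shard file name to the operators it would define."""
    batches = batch_operators(operators, batch_size)
    return {
        f"generated_dispatch_{index}.cc": batch
        for index, batch in enumerate(batches)
    }


# Hypothetical operator names, for illustration only.
shards = plan_shards(
    ["add", "gemm", "rms_norm", "swiglu", "causal_softmax"], batch_size=2
)
for file_name, ops in shards.items():
    print(file_name, ops)
```

Each resulting file is an independent translation unit, so a parallel build can compile the shards concurrently instead of waiting on one large source file.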
Closes # N/A
Type of Change
- `feat`: new feature / new operator / new platform
- `fix`: bug fix
- `perf`: performance improvement
- `refactor`: code restructuring without behavior change
- `test`: adding or fixing tests only
- `docs`: documentation only
- `build`/`ci`: build system or CI configuration
- `chore`: tooling, formatting, or other non-code changes

Platforms Affected
- `WITH_CPU`
- `WITH_NVIDIA`
- `WITH_ILUVATAR`
- `WITH_METAX`
- `WITH_CAMBRICON`
- `WITH_MOORE`
- `WITH_ASCEND`
- `WITH_TORCH`

Test Results on Supported Platforms
Direct profile, `PYTEST_WORKERS=1`.

| `pytest` result | Notes |
| --- | --- |
| `4151 passed, 1375 skipped in 312.73s` | `pytest tests/`. Matches #604/#605/#606/#607/#608 pass/skip counts. |
| `3651 passed, 375 skipped in 259.43s` | `pytest tests/`. Matches #604/#605/#606/#607/#608 pass/skip counts. |
| `5795 passed, 1447 skipped in 351.71s` | `pytest tests/`. Matches #604/#605/#606/#607/#608 pass/skip counts. |
| `3073 passed, 3857 skipped in 891.89s` | `pytest tests/`. Preserves #607's fix for the previous integer-input failures. |
| `300 failed, 5459 passed, 1483 skipped in 536.58s` | `pytest tests/`. Same known pre-existing `tests/test_gemm.py` MUSA failures as #604/#605/#606/#607/#608. |
| `3828 passed, 138 skipped in 509.84s`; container exit code 137 after pytest | `pytest tests/`. No pytest failures in this run; same post-test exit-code behavior as #605/#606/#607/#608 and no regression versus #604. |

Compared with the last merged baseline PRs #604, #608, #605, #607, and #606, this PR has no regression in build status, collected coverage, or failure count.
Full validation summaries
Benchmark / Performance Impact
Expected to improve build parallelism for generated dispatch bindings. Runtime behavior is unchanged.
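How much parallelism is available depends on how many generated source files a given batch size yields. The sketch below reads `INFINIOPS_DISPATCH_BATCH_SIZE` and computes the resulting number of translation units. It assumes the setting arrives as an environment variable and uses a made-up default of 8; this description does not say how the value is actually supplied (it could equally be a build-system option) or what its real default is. The helper names `read_batch_size` and `shard_count` are hypothetical.

```python
import math
import os
from typing import Mapping, Optional

# Assumed default for illustration; the real default is not stated in this PR.
ASSUMED_DEFAULT_BATCH_SIZE = 8


def read_batch_size(env: Optional[Mapping[str, str]] = None) -> int:
    """Read the batch size from a mapping (os.environ when none is given)."""
    env = os.environ if env is None else env
    raw = env.get("INFINIOPS_DISPATCH_BATCH_SIZE")
    if raw is None or raw.strip() == "":
        return ASSUMED_DEFAULT_BATCH_SIZE
    value = int(raw)  # raises ValueError for non-integer input
    if value < 1:
        raise ValueError("INFINIOPS_DISPATCH_BATCH_SIZE must be >= 1")
    return value


def shard_count(num_operators: int, batch_size: int) -> int:
    """Number of generated source files needed for num_operators."""
    return math.ceil(num_operators / batch_size)


# 50 operators in batches of 4 need 13 source files (12 full, 1 partial).
batch_size = read_batch_size({"INFINIOPS_DISPATCH_BATCH_SIZE": "4"})
print(shard_count(50, batch_size))  # prints 13
```

A smaller batch size gives more, smaller translation units (more parallelism, more repeated header parsing); a larger one gives fewer, heavier units.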
Notes for Reviewers
This only changes generated binding source layout.
`generated_dispatch.h` still declares the same entrypoints, while generated source files are split by batches of operators. `IndexToOffset` is made `inline` because generated dispatch shards can now include the same implementation headers in multiple translation units.

Checklist
Title, Branch, and Commits
- `<type>/xxx-yyyy-zzzz`.
- `master`.
- No `fixup!`/`squash!`/`wip` commits remain.

Scope and Design
General Code Hygiene
- backticks.

C++ Specific
- `clang-format` was run on the modified C++ header.
- `clang-tidy` was not run for this small helper linkage fix.
- `new`/`delete`.

Python Specific
- `ruff check scripts/generate_wrappers.py` passed.
- `ruff format --check scripts/generate_wrappers.py` passed.

Testing
Build, CI, and Tooling
- `compile_commands.json`.

Documentation
Security and Safety