[inductor max autotune] Flexible GEMM layout autotuning #114319
Closed
This diff introduces memory layout autotuning and makes the memory layouts that the Cutlass GEMM kernels accept and write more flexible. During autotuning, if a Cutlass GEMM kernel has inputs with flexible layouts, all possible combinations of row-major and column-major layouts are tried.
Note: Flexible input layouts are practically relevant in certain internal production models, which made these changes necessary.
Test Plan:
* Additional unit test(s) (more TBD)
* CI
[ghstack-poisoned]
This was referenced Nov 21, 2023
kadeng added a commit that referenced this pull request on Nov 21, 2023
ghstack-source-id: 852e7444693f7c57ea86d059ecc6a5f654075d90
Pull Request resolved: #114319
kadeng added a commit that referenced this pull request on Nov 22, 2023
ghstack-source-id: 42fcc45d0efac03c5c2fcdb68ed714def8346bef
Pull Request resolved: #114319
This was referenced Nov 27, 2023
Closed
This was referenced Dec 5, 2023
This was referenced Dec 12, 2023
Closed
Moved to a (draft) feature branch; see #115919.
kadeng added a commit that referenced this pull request on Jan 17, 2024
ghstack-source-id: 5dcfc8eb1712ec40672e9cf2b1a878cae1ee2311
Pull Request resolved: #114319
kadeng added a commit that referenced this pull request on Jan 18, 2024
ghstack-source-id: 5dcfc8eb1712ec40672e9cf2b1a878cae1ee2311
Pull Request resolved: #114319
kadeng added a commit that referenced this pull request on Jan 18, 2024
ghstack-source-id: 5dcfc8eb1712ec40672e9cf2b1a878cae1ee2311
Pull Request resolved: #114319
kadeng added a commit that referenced this pull request on Jan 23, 2024
ghstack-source-id: 5dcfc8eb1712ec40672e9cf2b1a878cae1ee2311
Pull Request resolved: #114319
kadeng added a commit that referenced this pull request on Jan 26, 2024
ghstack-source-id: 5dcfc8eb1712ec40672e9cf2b1a878cae1ee2311
Pull Request resolved: #114319
kadeng added a commit that referenced this pull request on Jan 26, 2024
ghstack-source-id: 5dcfc8eb1712ec40672e9cf2b1a878cae1ee2311
Pull Request resolved: #114319
kadeng added a commit that referenced this pull request on Jan 29, 2024
ghstack-source-id: 5dcfc8eb1712ec40672e9cf2b1a878cae1ee2311
Pull Request resolved: #114319
kadeng added a commit that referenced this pull request on Jan 29, 2024
ghstack-source-id: 5dcfc8eb1712ec40672e9cf2b1a878cae1ee2311
Pull Request resolved: #114319
kadeng added a commit that referenced this pull request on Jan 29, 2024
ghstack-source-id: 5dcfc8eb1712ec40672e9cf2b1a878cae1ee2311
Pull Request resolved: #114319
kadeng added a commit that referenced this pull request on Jan 29, 2024
ghstack-source-id: 5dcfc8eb1712ec40672e9cf2b1a878cae1ee2311
Pull Request resolved: #114319
kadeng added a commit that referenced this pull request on Jan 30, 2024
ghstack-source-id: 5dcfc8eb1712ec40672e9cf2b1a878cae1ee2311
Pull Request resolved: #114319
kadeng added a commit that referenced this pull request on Jan 30, 2024
This diff introduces memory layout autotuning and makes the memory layouts that the Cutlass GEMM kernels accept and write more flexible. During autotuning, if a Cutlass GEMM kernel has inputs with flexible memory layouts, all possible combinations of row-major and column-major layouts are tried.
Test Plan:
* Additional unit test(s)
* CI
ghstack-source-id: 5dcfc8eb1712ec40672e9cf2b1a878cae1ee2311
Pull Request resolved: #114319
kadeng added a commit that referenced this pull request on Jan 30, 2024
ghstack-source-id: 5dcfc8eb1712ec40672e9cf2b1a878cae1ee2311
Pull Request resolved: #114319
kadeng added a commit that referenced this pull request on Feb 1, 2024
ghstack-source-id: 5dcfc8eb1712ec40672e9cf2b1a878cae1ee2311
Pull Request resolved: #114319
Stack from ghstack (oldest at bottom):
This diff introduces memory layout autotuning and
makes the memory layouts that the Cutlass GEMM kernels
accept and write more flexible.
During autotuning, if a Cutlass GEMM kernel has
inputs with flexible layouts, all possible combinations
of row-major and column-major layouts are tried.
Note: Flexible input layouts are practically relevant in certain internal production models, which made these changes necessary.
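To illustrate the idea of trying every row-major/column-major combination, here is a minimal, self-contained sketch in plain Python. It is not the code from this diff and does not use Inductor or Cutlass APIs: the names `LAYOUTS`, `strides_for`, `layout_combinations`, and `pick_fastest` are hypothetical, and the `benchmark` callable is an assumed stand-in for timing a compiled kernel.

```python
from itertools import product

# Hypothetical layout labels for 2-D GEMM operands (not Inductor/Cutlass identifiers).
LAYOUTS = ("row_major", "column_major")


def strides_for(shape, layout):
    """Return element strides of a dense (rows, cols) matrix in the given layout."""
    rows, cols = shape
    if layout == "row_major":
        return (cols, 1)  # consecutive elements of a row are adjacent in memory
    if layout == "column_major":
        return (1, rows)  # consecutive elements of a column are adjacent in memory
    raise ValueError(f"unknown layout: {layout}")


def layout_combinations(inputs):
    """Enumerate layout assignments for a list of (name, fixed_layout_or_None) pairs.

    An input whose layout is None is treated as flexible and gets both options;
    an input with a fixed layout contributes only that single option.
    """
    names = [name for name, _ in inputs]
    options = [LAYOUTS if fixed is None else (fixed,) for _, fixed in inputs]
    return [dict(zip(names, combo)) for combo in product(*options)]


def pick_fastest(candidates, benchmark):
    """Return the candidate with the lowest cost reported by `benchmark`.

    `benchmark` is caller-supplied; in a real autotuner it would time a kernel
    compiled for that layout assignment.
    """
    return min(candidates, key=benchmark)
```

With two flexible inputs, `layout_combinations([("A", None), ("B", None)])` yields four candidate assignments; fixing one input halves that. The number of candidates doubles with each flexible operand, so an exhaustive search of this kind is only practical because a GEMM has few operands.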
Test Plan: