[inductor max autotune] Flexible GEMM layout autotuning #114319

Closed
wants to merge 17 commits into from


@kadeng kadeng commented Nov 21, 2023

Stack from ghstack (oldest at bottom):

This diff introduces memory layout autotuning and
makes the memory layouts that are accepted and
written by the Cutlass GEMM kernels more flexible.

During autotuning, if a Cutlass GEMM kernel has
inputs with Flexible Layouts, every combination of
row-major and column-major layouts for those inputs
is tried.

Note: Flexible input layouts are practically relevant in certain internal production models, which made these changes necessary.
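
As a rough illustration of the search space described above (this is a sketch, not the code in this PR), the layout combinations could be enumerated as follows. The function `enumerate_layout_choices`, the layout constants, and the operand names are hypothetical and exist only for this example; an operand mapped to `None` stands in for an input with a flexible layout.

```python
from itertools import product

# Hypothetical layout tags, used only for this illustration.
ROW_MAJOR = "row_major"
COLUMN_MAJOR = "column_major"


def enumerate_layout_choices(inputs):
    """Yield one {operand name: layout} dict per layout combination.

    `inputs` maps an operand name to its fixed layout, or to None when
    the layout is flexible and may be picked during autotuning.
    """
    names = list(inputs)
    # Flexible operands get both options; fixed operands keep their layout.
    options = [
        (ROW_MAJOR, COLUMN_MAJOR) if inputs[name] is None else (inputs[name],)
        for name in names
    ]
    for combo in product(*options):
        yield dict(zip(names, combo))


# Two flexible operands and one fixed operand give 2 * 2 * 1 candidates.
candidates = list(
    enumerate_layout_choices({"A": None, "B": None, "Bias": ROW_MAJOR})
)
```

Each candidate would then be benchmarked like any other autotuning choice. The number of candidates doubles with every flexible input, so the extra autotuning cost grows quickly with the number of flexible operands.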

Test Plan:

  • Additional Unit test(s) (more tbd)
  • CI


pytorch-bot bot commented Nov 21, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/114319

Note: Links to docs will display an error until the docs builds have been completed.

❌ 23 New Failures, 9 Unrelated Failures

As of commit 4683580 with merge base afe6d27:

NEW FAILURES - The following jobs have failed:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

kadeng added a commit that referenced this pull request Nov 21, 2023
ghstack-source-id: 852e7444693f7c57ea86d059ecc6a5f654075d90
Pull Request resolved: #114319
kadeng added a commit that referenced this pull request Nov 22, 2023
ghstack-source-id: 42fcc45d0efac03c5c2fcdb68ed714def8346bef
Pull Request resolved: #114319
@kadeng
Contributor Author

kadeng commented Dec 15, 2023

Moved to a (draft) feature branch, see #115919

@kadeng kadeng closed this Dec 15, 2023
@facebook-github-bot facebook-github-bot deleted the gh/kadeng/32/head branch January 14, 2024 15:23
kadeng added a commit that referenced this pull request Jan 17, 2024
ghstack-source-id: 5dcfc8eb1712ec40672e9cf2b1a878cae1ee2311
Pull Request resolved: #114319
kadeng added a commit that referenced this pull request Jan 18, 2024
ghstack-source-id: 5dcfc8eb1712ec40672e9cf2b1a878cae1ee2311
Pull Request resolved: #114319
kadeng added a commit that referenced this pull request Jan 18, 2024
ghstack-source-id: 5dcfc8eb1712ec40672e9cf2b1a878cae1ee2311
Pull Request resolved: #114319
kadeng added a commit that referenced this pull request Jan 23, 2024
ghstack-source-id: 5dcfc8eb1712ec40672e9cf2b1a878cae1ee2311
Pull Request resolved: #114319
kadeng added a commit that referenced this pull request Jan 26, 2024
ghstack-source-id: 5dcfc8eb1712ec40672e9cf2b1a878cae1ee2311
Pull Request resolved: #114319
kadeng added a commit that referenced this pull request Jan 26, 2024
ghstack-source-id: 5dcfc8eb1712ec40672e9cf2b1a878cae1ee2311
Pull Request resolved: #114319
kadeng added a commit that referenced this pull request Jan 29, 2024
ghstack-source-id: 5dcfc8eb1712ec40672e9cf2b1a878cae1ee2311
Pull Request resolved: #114319
kadeng added a commit that referenced this pull request Jan 29, 2024
ghstack-source-id: 5dcfc8eb1712ec40672e9cf2b1a878cae1ee2311
Pull Request resolved: #114319
kadeng added a commit that referenced this pull request Jan 29, 2024
ghstack-source-id: 5dcfc8eb1712ec40672e9cf2b1a878cae1ee2311
Pull Request resolved: #114319
kadeng added a commit that referenced this pull request Jan 29, 2024
ghstack-source-id: 5dcfc8eb1712ec40672e9cf2b1a878cae1ee2311
Pull Request resolved: #114319
kadeng added a commit that referenced this pull request Jan 30, 2024
ghstack-source-id: 5dcfc8eb1712ec40672e9cf2b1a878cae1ee2311
Pull Request resolved: #114319
kadeng added a commit that referenced this pull request Jan 30, 2024
This diff introduces memory layout autotuning and makes the memory layouts
that are accepted and written by the Cutlass GEMM kernels more flexible.

During autotuning, if a Cutlass GEMM kernel has inputs with flexible memory
layouts, every combination of row-major and column-major layouts for those
inputs is tried.

Test Plan:

 * Additional Unit test(s)
 * CI

ghstack-source-id: 5dcfc8eb1712ec40672e9cf2b1a878cae1ee2311
Pull Request resolved: #114319
kadeng added a commit that referenced this pull request Jan 30, 2024
ghstack-source-id: 5dcfc8eb1712ec40672e9cf2b1a878cae1ee2311
Pull Request resolved: #114319
kadeng added a commit that referenced this pull request Feb 1, 2024
ghstack-source-id: 5dcfc8eb1712ec40672e9cf2b1a878cae1ee2311
Pull Request resolved: #114319