@jeffdaily
Collaborator

This includes both the initial carveout PR and its later fix.

Fixes pytorch#149280.  Follow up to pytorch#147966, but now available for ROCm.

Since hipBLASLt does not support HIPBLASLT_MATMUL_DESC_CU_COUNT_TARGET, we instead create a hipStream with a CU mask applied and pass this masked stream to hipBLASLt in place of PyTorch's current stream. Ordering between the two streams is enforced with hipEvents and stream synchronization.
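As a rough illustration of the CU mask idea (not the code from this PR), the sketch below builds a mask in the layout that `hipExtStreamCreateWithCUMask` takes: an array of 32-bit words in which each set bit enables one CU, with the first word covering the first 32 CUs. The helper name `make_cu_mask` and the choice to keep the lowest-indexed CUs are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the PR): returns a CU mask where each set bit
// enables one CU and word 0 covers CUs 0..31. The matmul is allowed to use
// (total_cus - carveout) CUs; the remaining CUs are left for other work.
// Assumption: the lowest-indexed CUs are the ones that stay enabled.
std::vector<uint32_t> make_cu_mask(uint32_t total_cus, uint32_t carveout) {
    assert(carveout <= total_cus);
    const uint32_t enabled = total_cus - carveout;
    // One 32-bit word per group of 32 CUs, rounded up.
    std::vector<uint32_t> mask((total_cus + 31) / 32, 0u);
    for (uint32_t cu = 0; cu < enabled; ++cu) {
        mask[cu / 32] |= (1u << (cu % 32));
    }
    return mask;
}
```

On a ROCm system, `mask.size()` and `mask.data()` would be the size and mask arguments to `hipExtStreamCreateWithCUMask`, and the resulting stream would be the one handed to hipBLASLt.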

Pull Request resolved: pytorch#149466
Approved by: https://github.com/malfet, https://github.com/atalman
@rocm-repo-management-api

rocm-repo-management-api bot commented Oct 7, 2025

Jenkins build for 47700f36cd3a73932df7b54c92e5fbb2ee41960a commit finished as NOT_BUILT

Fixes pytorch#164271.

Carveout had been applied with the opposite bitmask. Besides being incorrect, this led to flaky unit test behavior due to the carveout being too high.
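To make the effect of an inverted mask concrete, here is a minimal sketch for a hypothetical device with at most 32 CUs. It is only one plausible reading of the bug, not the actual diff: the function names are invented for the example, and it assumes a set bit means "CU enabled". With 32 CUs and a requested carveout of 8, the intended mask enables 24 CUs, while the complemented mask enables only 8, so the effective carveout becomes 24.

```cpp
#include <bitset>
#include <cstdint>

// Lowest n bits set, guarding against the undefined 1u << 32.
uint32_t low_bits(uint32_t n) {
    return n >= 32 ? 0xFFFFFFFFu : ((1u << n) - 1u);
}

// Intended polarity: enable the (total - carveout) CUs the matmul may use.
uint32_t intended_mask(uint32_t total, uint32_t carveout) {
    return low_bits(total - carveout);
}

// Opposite polarity (illustrative): enable only the carved-out CUs instead.
uint32_t inverted_mask(uint32_t total, uint32_t carveout) {
    return low_bits(total) & ~intended_mask(total, carveout);
}

// Number of CUs a mask leaves enabled.
uint32_t enabled_cus(uint32_t mask) {
    return static_cast<uint32_t>(std::bitset<32>(mask).count());
}
```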

Pull Request resolved: pytorch#164303
Approved by: https://github.com/jeffdaily

Co-authored-by: Jeff Daily <jeff.daily@amd.com>
@jeffdaily jeffdaily force-pushed the release/2.7_carveout branch from f8a230b to 47700f3 Compare October 8, 2025 00:03
@rocm-repo-management-api

rocm-repo-management-api bot commented Oct 8, 2025

Jenkins build for 47700f36cd3a73932df7b54c92e5fbb2ee41960a commit finished as FAILURE

@jithunnair-amd jithunnair-amd merged commit c2114ee into release/2.7 Oct 8, 2025
40 of 45 checks passed
@jithunnair-amd jithunnair-amd deleted the release/2.7_carveout branch October 8, 2025 17:38
3 participants