@raikonenfnu
Contributor

  • Implement the memory_counter_wait op
  • Use the memory_counter_wait op to optimize waitcnt in the async BF16 PP GEMM
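
For readers new to counter-based waits, here is a toy Python model of the general idea: a wait with a non-zero threshold lets newer memory operations stay in flight while only the oldest ones are required to finish. This is only a sketch, not the op added in this PR and not the amdgpu dialect API. The class `OutstandingCounter` and its methods are hypothetical names, and the model assumes operations tracked by one counter complete in issue order.

```python
from collections import deque


class OutstandingCounter:
    """Toy model of a single memory counter (hypothetical, for illustration).

    Assumes operations tracked by this counter complete in issue order.
    """

    def __init__(self):
        self.pending = deque()

    def issue(self, tag):
        # Record a newly issued memory operation as outstanding.
        self.pending.append(tag)

    def wait(self, max_outstanding):
        # Block (here: retire oldest ops) until at most `max_outstanding`
        # operations remain in flight. Returns the tags that were retired.
        retired = []
        while len(self.pending) > max_outstanding:
            retired.append(self.pending.popleft())
        return retired


def demo():
    loads = OutstandingCounter()
    loads.issue("tile0")  # load for the current iteration
    loads.issue("tile1")  # prefetch for the next iteration
    # Waiting for "at most 1 outstanding" only requires tile0 to finish,
    # so the prefetch of tile1 can keep overlapping with compute.
    partial = loads.wait(1)
    # Waiting for 0 outstanding drains everything.
    full = loads.wait(0)
    return partial, full
```

In this model, a pipelined loop that always waits for zero outstanding operations would serialize the prefetch with the compute, while a threshold of one keeps the next tile's load in flight.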

Signed-off-by: Stanley Winata <stanley.winata@amd.com>
Collaborator

@harsh-nod harsh-nod left a comment

LGTM! Just a minor refactor comment.

Signed-off-by: Stanley Winata <stanley.winata@amd.com>
Contributor

@Hardcode84 Hardcode84 left a comment

We probably need to update amdgpu.memory_counter_wait upstream to support tensorcnt too.

@raikonenfnu raikonenfnu merged commit 5517156 into iree-org:main Nov 11, 2025
19 of 22 checks passed
3 participants