[mlir][gpu] Use known_block_size to set maxntid for NVVM target #77301

Merged
merged 1 commit into from Jan 8, 2024

Commits on Jan 8, 2024

  1. [mlir][gpu] Use known_block_size to set maxntid for NVVM target

    Setting the thread block size on a kernel with `maxntid` can yield significant performance benefits, because the downstream PTX compiler can then make better register-allocation decisions.
    
    MLIR's `gpu.launch` and `gpu.launch_func` already have an attribute (`known_block_size`) that records the thread block size when it is known. This PR simply uses that attribute to set `maxntid`.
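
For readers more familiar with CUDA C++ than with MLIR, a rough analogue of this idea is the `__launch_bounds__` qualifier, which nvcc typically records as a `.maxntid` directive in the generated PTX. The sketch below is not code from this PR; the kernel `scale`, the constant `kBlockSize`, and the problem size are all hypothetical and chosen only to illustrate promising an upper bound on the block size to the compiler.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical block size; the launch below must not exceed it.
constexpr int kBlockSize = 128;

// __launch_bounds__ promises that blocks have at most kBlockSize threads,
// which lets the compiler budget registers per thread for that bound.
__global__ void __launch_bounds__(kBlockSize)
scale(float *data, float factor, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) data[i] *= factor;
}

int main() {
  const int n = 1 << 20;
  float *d = nullptr;
  cudaMalloc(&d, n * sizeof(float));
  cudaMemset(d, 0, n * sizeof(float));

  // Launch with exactly kBlockSize threads per block; a larger block
  // would violate the declared bound and the launch would fail.
  scale<<<(n + kBlockSize - 1) / kBlockSize, kBlockSize>>>(d, 2.0f, n);

  cudaError_t err = cudaDeviceSynchronize();
  std::printf("%s\n", cudaGetErrorString(err));
  cudaFree(d);
  return err == cudaSuccess ? 0 : 1;
}
```

Compiling with `nvcc -ptx` and searching the output for `.maxntid` is one way to check what the compiler actually emitted for a given toolchain version.
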
    grypp committed Jan 8, 2024
    498a987