[TIR] Support more mma intrinsics and get_mma_intrin_group utility #16073

Merged
merged 5 commits into from Nov 7, 2023

@Ubospica Ubospica commented Nov 5, 2023

This PR adds support for more mma intrinsics for matmul scheduling. Specifically, this PR:

  • Adds support for transposed A in ldmatrix and mma_sync
  • Replaces every T.launch_thread(tx, 32) annotation with an explicit loop, for tx in T.thread_binding(0, WARP_SIZE, "threadIdx.x"), which makes later transformations easier.
  • Refactors some logic and adds a utility, get_mma_intrin_group, that returns a group of intrinsics:
def get_mma_intrin_group(
    load_scope: Literal["shared", "shared.dyn"],
    store_scope: Literal["global", "shared", "shared.dyn"],
    in_dtype: Literal["float16", "int8"],
    out_dtype: Literal["float16", "float32", "int32"],
    trans_a: bool,
    trans_b: bool,
    not_use_mma_store_intrinic: bool = True,
    store_to_smem_dtype: Optional[Literal["float16", "float32", "int32"]] = None,
) -> Dict[str, str]
  • Avoids the current mma_store intrinsic and uses BufferStore statements instead.
    • When shared memory accesses are swizzled, the rearrangement scheme spans areas accessed by different mma_store calls, which makes swizzling with the mma_store intrinsic quite complex. BufferStore statements do not have this problem.
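
To illustrate the swizzling point, here is a toy, pure-Python sketch (not TVM code). It assumes a hypothetical XOR swizzle over 8-element column chunks and two 16x16 accumulator tiles placed side by side in a 16x32 shared buffer. The names swizzle and store_tile_elementwise, and the swizzle function itself, are made up for this example and are not the scheme used in TVM. Because every store applies the swizzle to its own index, nothing needs to know which tile a destination slot originally belonged to.

```python
# Toy model of swizzled shared memory: NOT the layout used by TVM.
ROWS, COLS = 16, 32   # two 16x16 accumulator tiles side by side
CHUNK = 8             # hypothetical swizzle granularity (elements)


def swizzle(row, col):
    """Hypothetical XOR swizzle: permute 8-element column chunks per row."""
    chunk = (col // CHUNK) ^ (row % (COLS // CHUNK))
    return row, chunk * CHUNK + col % CHUNK


def store_tile_elementwise(smem, tile, col_offset):
    """Write one 16x16 tile with one store per element.

    Each store computes its own swizzled destination, mimicking a
    sequence of independent per-element stores.
    """
    for i in range(16):
        for j in range(16):
            r, c = swizzle(i, col_offset + j)
            smem[r][c] = tile[i][j]


smem = [[None] * COLS for _ in range(ROWS)]
tile0 = [[("t0", i, j) for j in range(16)] for i in range(16)]
tile1 = [[("t1", i, j) for j in range(16)] for i in range(16)]

store_tile_elementwise(smem, tile0, col_offset=0)
store_tile_elementwise(smem, tile1, col_offset=16)

# Columns of the buffer that ended up holding elements of tile 0.
cols_touched_by_tile0 = sorted(
    {c for row in smem for c, v in enumerate(row) if v[0] == "t0"}
)
```

In this toy layout, tile 0 logically owns columns 0 to 15, yet its elements land in all 32 columns after swizzling. A store primitive that writes a whole 16x16 region at once would have to account for this interleaving, whereas the per-element loop does not.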

This PR is used and tested in the dlight matmul schedule rule.
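
For illustration, a schedule rule might request an intrinsic group roughly as follows. This is a minimal sketch under stated assumptions: the import path tvm.tir.tensor_intrin.cuda and the chosen argument values are assumptions rather than something stated in this description, and request_mma_group is a hypothetical helper. The import is guarded so the snippet still runs when TVM is not installed.

```python
from typing import Dict, Optional


def request_mma_group() -> Optional[Dict[str, str]]:
    """Hypothetical helper: ask for an fp16 -> fp32 mma intrinsic group.

    Returns None when TVM (or the utility) is not importable.
    """
    try:
        # Assumed module path for the utility added in this PR.
        from tvm.tir.tensor_intrin.cuda import get_mma_intrin_group
    except ImportError:
        return None

    # Argument values are illustrative; the optional parameters keep
    # their defaults from the signature above.
    return get_mma_intrin_group(
        load_scope="shared.dyn",
        store_scope="shared.dyn",
        in_dtype="float16",
        out_dtype="float32",
        trans_a=False,
        trans_b=True,
    )


group = request_mma_group()
if group is not None:
    # The result maps a role to the name of a registered tensor intrinsic.
    for role, intrin_name in group.items():
        print(role, "->", intrin_name)
```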

@spectrometerHBH @vinx13 @Hzfengsy

@Ubospica Ubospica changed the title [TIR] Support more intrinsics and get_mma_intrin_group utility [TIR] Support more mma intrinsics and get_mma_intrin_group utility Nov 5, 2023
@vinx13 vinx13 merged commit db4290b into apache:main Nov 7, 2023
17 checks passed
Ubospica added a commit to Ubospica/tvm-develop that referenced this pull request Nov 13, 2023