ConjTrans support for batched TeamVector gemm #2628

Open: wants to merge 5 commits into base: develop

yasahi-hpc
Contributor

This PR improves the implementation and testing of TeamVector Gemm (see also #2580).

  • Move implementation details into the Impl namespace
  • Add specializations for all combinations of Non-Trans/Trans/ConjTrans
  • Add an input check function that requires A, B, and C to be Kokkos::View objects of rank 0, 1, or 2
  • Extend the unit tests to cover all combinations of Non-Trans/Trans/ConjTrans
  • Merge the unit tests for Team and TeamVector, which are now parameterized by Mode
  • Add deprecation warnings for the older implementation details in the KokkosBatched namespace
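
For readers unfamiliar with the transpose modes listed above, here is a minimal serial sketch of what each Non-Trans/Trans/ConjTrans combination computes: C = alpha * op(A) * op(B) + beta * C, where op is the identity, the transpose, or the conjugate transpose. It is not the code in this PR and does not use Kokkos. The tag structs, `Matrix`, `OpTraits`, and `reference_gemm` are hypothetical names made up for illustration, and the `assert` is only a stand-in for the kind of input check described above.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical tags for illustration only; not the library's own types.
struct NoTrans {};
struct Trans {};
struct ConjTrans {};

using Scalar = std::complex<double>;

// Minimal row-major dense matrix (hypothetical helper, zero-initialized).
struct Matrix {
  std::size_t rows, cols;
  std::vector<Scalar> data;
  Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c) {}
  Scalar& operator()(std::size_t i, std::size_t j) { return data[i * cols + j]; }
  const Scalar& operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// OpTraits<Op> describes op(M): its extents and its (i, j) entry.
template <class Op>
struct OpTraits;

template <>
struct OpTraits<NoTrans> {
  static std::size_t rows(const Matrix& m) { return m.rows; }
  static std::size_t cols(const Matrix& m) { return m.cols; }
  static Scalar at(const Matrix& m, std::size_t i, std::size_t j) { return m(i, j); }
};

template <>
struct OpTraits<Trans> {
  static std::size_t rows(const Matrix& m) { return m.cols; }
  static std::size_t cols(const Matrix& m) { return m.rows; }
  static Scalar at(const Matrix& m, std::size_t i, std::size_t j) { return m(j, i); }
};

template <>
struct OpTraits<ConjTrans> {
  static std::size_t rows(const Matrix& m) { return m.cols; }
  static std::size_t cols(const Matrix& m) { return m.rows; }
  static Scalar at(const Matrix& m, std::size_t i, std::size_t j) { return std::conj(m(j, i)); }
};

// Serial reference: C = alpha * op(A) * op(B) + beta * C.
template <class OpA, class OpB>
void reference_gemm(Scalar alpha, const Matrix& A, const Matrix& B, Scalar beta, Matrix& C) {
  using TA = OpTraits<OpA>;
  using TB = OpTraits<OpB>;
  const std::size_t m = TA::rows(A), k = TA::cols(A), n = TB::cols(B);
  // Stand-in input check: extents of op(A), op(B), and C must be compatible.
  assert(TB::rows(B) == k && C.rows == m && C.cols == n);
  for (std::size_t i = 0; i < m; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      Scalar acc(0.0, 0.0);
      for (std::size_t p = 0; p < k; ++p) acc += TA::at(A, i, p) * TB::at(B, p, j);
      C(i, j) = alpha * acc + beta * C(i, j);
    }
  }
}
```

As a worked case, take the 2x1 matrices A = [1+2i; 3-i] and B = [i; 2] with alpha = 1 and beta = 0. `reference_gemm<ConjTrans, NoTrans>` gives the 1x1 result 8+3i, while `reference_gemm<Trans, NoTrans>` gives 4-i. The two modes differ only for complex scalars, which is why each combination needs its own test coverage.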

Yuuichi Asahi added 2 commits May 7, 2025 04:09
Signed-off-by: Yuuichi Asahi <y.asahi@nr.titech.ac.jp>
Signed-off-by: Yuuichi Asahi <y.asahi@nr.titech.ac.jp>
@yasahi-hpc yasahi-hpc self-assigned this May 6, 2025
Yuuichi Asahi added 2 commits May 7, 2025 04:18
Signed-off-by: Yuuichi Asahi <y.asahi@nr.titech.ac.jp>
Signed-off-by: Yuuichi Asahi <y.asahi@nr.titech.ac.jp>
@yasahi-hpc yasahi-hpc added the CI: skip-docs and enhancement labels May 6, 2025
@yasahi-hpc yasahi-hpc changed the title from "Improve batched team vector gemm" to "ConjTrans support for batched TeamVector gemm" May 6, 2025
Signed-off-by: Yuuichi Asahi <y.asahi@nr.titech.ac.jp>
Labels
AT2-CI-APPROVAL (Approve CI to run at SNL), CI: skip-docs (Do not run the documentation checks for this pull request), enhancement
1 participant