[CUDA] FpA IntB Gemm Kernel Test #25109

tianleiwu · 2025-06-18T22:00:05Z

Enhance MatMulNBits CUDA kernel testing:
(1) Add a kernel test for the different CUDA kernels used in MatMulNBits.
(2) Refactor the GEMM profiler to use the CUDA allocator.
(3) Add verbose logging macros.
(4) Adjust the build to speed up compilation when sm90 is excluded.
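
For readers unfamiliar with the pattern, here is a minimal sketch of how a compile-time verbose logging macro can work. It is not the code in `logger.h`: the names `DEMO_VERBOSE`, `DEMO_LOG_VERBOSE`, `DemoLogSink`, and `PickTile` are hypothetical, and the log goes to an in-memory stream only so the example is easy to check.

```cpp
#include <sstream>
#include <string>

// Hypothetical switch; the macro names actually added by this PR are not shown here.
#ifndef DEMO_VERBOSE
#define DEMO_VERBOSE 1
#endif

// Hypothetical sink: an in-memory stream standing in for a real logger.
inline std::ostringstream& DemoLogSink() {
  static std::ostringstream sink;
  return sink;
}

#if DEMO_VERBOSE
// Prefix each message with the calling function's name.
#define DEMO_LOG_VERBOSE(msg)                            \
  do {                                                   \
    DemoLogSink() << __func__ << ": " << msg << '\n';    \
  } while (0)
#else
// Compiles to nothing when verbose logging is disabled.
#define DEMO_LOG_VERBOSE(msg) \
  do {                        \
  } while (0)
#endif

// Hypothetical kernel-selection helper, used only to show the macro at a call site.
inline int PickTile(int m) {
  int tile = (m <= 16) ? 16 : 64;
  DEMO_LOG_VERBOSE("m=" << m << " tile=" << tile);
  return tile;
}
```

With `DEMO_VERBOSE` set to 0 the call sites stay in the source but generate no code, which is the usual reason to prefer a macro over a runtime check in hot paths.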

Example kernel test output:

onnxruntime/contrib_ops/cuda/llm/fpA_intB_gemm/fpA_intB_gemm_template.h

onnxruntime/contrib_ops/cuda/llm/fpA_intB_gemv/fpA_intB_gemv.cu

onnxruntime/contrib_ops/cuda/llm/common/logger.h

onnxruntime/contrib_ops/cuda/llm/fpA_intB_gemm_profiler.cc

add fpA intB gemm kernel test

d51f07f

tianleiwu marked this pull request as draft June 18, 2025 22:03

tianleiwu added 2 commits June 18, 2025 17:49

use size_t to avoid int overflow

339beb1

minor change

902eaa6
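
The "use size_t to avoid int overflow" commit above addresses a common pitfall: a product of two `int` dimensions is evaluated in `int` and can overflow before it is assigned to a wider type. The sketch below illustrates the general fix; `ElementCount` and `PackedInt4Bytes` are hypothetical helpers, not functions from this PR, and the example assumes a 64-bit `size_t`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the PR): number of elements in an n x k matrix.
// Each operand is widened to size_t *before* the multiplication, so the product
// is never computed in int.
inline std::size_t ElementCount(int n, int k) {
  assert(n >= 0 && k >= 0);
  return static_cast<std::size_t>(n) * static_cast<std::size_t>(k);
}

// Hypothetical helper: bytes needed to store n x k int4 values packed two per byte,
// rounding up when the element count is odd.
inline std::size_t PackedInt4Bytes(int n, int k) {
  return (ElementCount(n, k) + 1) / 2;
}
```

By contrast, writing `size_t count = n * k;` with `int` operands does not help, because the overflow happens in the `int` multiplication before the conversion.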

tianleiwu marked this pull request as ready for review June 19, 2025 20:38

tianleiwu requested review from jiafatom, kunal-vaishnavi and nenad1002 June 19, 2025 20:42

kunal-vaishnavi reviewed Jun 19, 2025

onnxruntime/contrib_ops/cuda/llm/fpA_intB_gemm/fpA_intB_gemm_template.h (resolved)

nenad1002 reviewed Jun 19, 2025

onnxruntime/contrib_ops/cuda/llm/fpA_intB_gemm/fpA_intB_gemm_template.h (resolved)

onnxruntime/contrib_ops/cuda/llm/fpA_intB_gemv/fpA_intB_gemv.cu (resolved)

nenad1002 reviewed Jun 19, 2025

onnxruntime/contrib_ops/cuda/llm/common/logger.h (resolved)

nenad1002 approved these changes Jun 20, 2025

onnxruntime/contrib_ops/cuda/llm/fpA_intB_gemm_profiler.cc (resolved)

tianleiwu merged commit 7268117 into main Jun 20, 2025
89 checks passed

tianleiwu deleted the tlwu/fpA_intB_gemm_kernel_test branch June 20, 2025 20:45
