[webgpu] make DP4AMatMulNBitsSmallMProgram shader template #25025

Merged on Jun 13, 2025 (5 commits)

Conversation

jing-bao
Contributor

Description

This commit refactors DP4AMatMulNBitsSmallMProgram into a shader template so that both tile_size_k_vec and tile_size can be configured. This gives more flexibility for performance tuning without altering the core shader functionality.

There is no functional change in this commit.
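
As an illustration of the idea only (not the code from this PR), here is a minimal Python sketch of a shader template in which the two tile parameters are placeholders and the rest of the source is fixed. The WGSL fragment, the `render_shader` helper, and the example values are all hypothetical; the actual template and its parameter plumbing live in the ONNX Runtime WebGPU sources and may look quite different.

```python
from string import Template

# Hypothetical WGSL fragment: only the two tile constants are placeholders.
# The kernel body is a stand-in, not the actual shader from this PR.
SHADER_TEMPLATE = Template(
    "const tile_size_k_vec: u32 = ${tile_size_k_vec}u;\n"
    "const tile_size: u32 = ${tile_size}u;\n"
    "// ... kernel body, identical for every configuration ...\n"
)


def render_shader(tile_size_k_vec: int, tile_size: int) -> str:
    """Return shader source with the two tile parameters filled in."""
    if tile_size_k_vec <= 0 or tile_size <= 0:
        raise ValueError("tile sizes must be positive")
    return SHADER_TEMPLATE.substitute(
        tile_size_k_vec=tile_size_k_vec, tile_size=tile_size
    )


if __name__ == "__main__":
    # Example values are arbitrary and chosen only for illustration.
    print(render_shader(16, 32))
```

In this sketch, changing the tile values changes only the generated constants, which mirrors the "no functional change" property described above: the shader body stays the same while the tuning knobs vary.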

Motivation and Context

This is a preparatory change that enables performance optimization of DP4AMatMulNBitsSmallMProgram in subsequent commits.

@daijh
Contributor

daijh commented Jun 11, 2025

@qjia7 could you help take a look?

jing-bao and others added 3 commits June 11, 2025 16:26
qjia7 requested review from fs-eire and guschmue on June 13, 2025 07:18
@fs-eire
Contributor

fs-eire commented Jun 13, 2025

/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI, Windows ARM64 QNN CI Pipeline, Windows GPU Doc Gen CI Pipeline, Windows x64 QNN CI Pipeline

Azure Pipelines successfully started running 5 pipeline(s).

fs-eire merged commit e7c9a6c into microsoft:main on Jun 13, 2025
83 checks passed
4 participants