jeffbolznv (Collaborator)

I've only looked at the directed perf tests, and there may still be more to do. But this gets these types to the same order of magnitude as Q4_K (included for comparison):

before:
  MUL_MAT(type_a=q4_K,type_b=f32,m=4096,n=512,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3]):                  626 runs -  1597.97 us/run -  60.13 GFLOP/run -  37.63 TFLOPS
  MUL_MAT(type_a=iq2_xxs,type_b=f32,m=4096,n=512,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3]):               276 runs -  3629.79 us/run -  60.13 GFLOP/run -  16.57 TFLOPS
  MUL_MAT(type_a=iq2_xs,type_b=f32,m=4096,n=512,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3]):                252 runs -  3981.24 us/run -  60.13 GFLOP/run -  15.10 TFLOPS
  MUL_MAT(type_a=iq2_s,type_b=f32,m=4096,n=512,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3]):                 338 runs -  2970.81 us/run -  60.13 GFLOP/run -  20.24 TFLOPS
  MUL_MAT(type_a=iq3_xxs,type_b=f32,m=4096,n=512,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3]):               284 runs -  3525.55 us/run -  60.13 GFLOP/run -  17.06 TFLOPS
  MUL_MAT(type_a=iq3_s,type_b=f32,m=4096,n=512,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3]):                 330 runs -  3039.16 us/run -  60.13 GFLOP/run -  19.78 TFLOPS

after:
  MUL_MAT(type_a=q4_K,type_b=f32,m=4096,n=512,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3]):                  626 runs -  1598.70 us/run -  60.13 GFLOP/run -  37.61 TFLOPS
  MUL_MAT(type_a=iq2_xxs,type_b=f32,m=4096,n=512,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3]):               632 runs -  1585.54 us/run -  60.13 GFLOP/run -  37.92 TFLOPS
  MUL_MAT(type_a=iq2_xs,type_b=f32,m=4096,n=512,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3]):                552 runs -  1811.94 us/run -  60.13 GFLOP/run -  33.19 TFLOPS
  MUL_MAT(type_a=iq2_s,type_b=f32,m=4096,n=512,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3]):                 496 runs -  2016.71 us/run -  60.13 GFLOP/run -  29.82 TFLOPS
  MUL_MAT(type_a=iq3_xxs,type_b=f32,m=4096,n=512,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3]):               670 runs -  1494.11 us/run -  60.13 GFLOP/run -  40.24 TFLOPS
  MUL_MAT(type_a=iq3_s,type_b=f32,m=4096,n=512,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3]):                 670 runs -  1496.61 us/run -  60.13 GFLOP/run -  40.18 TFLOPS
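For reference, the GFLOP/run and TFLOPS columns in these tables follow directly from the test dimensions: multiplying an m×k matrix by a k×n matrix costs 2·m·n·k floating-point operations (one multiply and one add per output element per k step), and TFLOPS is that count divided by the measured time per run. A quick sketch of the arithmetic (a hand calculation for illustration, not part of test-backend-ops):

```python
# Derive GFLOP/run and TFLOPS from the MUL_MAT test dimensions above.
m, n, k = 4096, 512, 14336
flops_per_run = 2 * m * n * k          # 60,129,542,144 ~= 60.13 GFLOP

# TFLOPS = FLOPs per run / seconds per run / 1e12, using the optimized
# iq2_xxs result at 1585.54 us/run as an example:
us_per_run = 1585.54
tflops = flops_per_run / (us_per_run * 1e-6) / 1e12

print(f"{flops_per_run / 1e9:.2f} GFLOP/run")   # 60.13 GFLOP/run
print(f"{tflops:.2f} TFLOPS")                   # 37.92 TFLOPS
```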

@jeffbolznv jeffbolznv requested a review from 0cc4m January 30, 2025 17:42
@github-actions github-actions bot added the Vulkan (Issues specific to the Vulkan backend), devops (improvements to build systems and github actions), and ggml (changes relating to the ggml tensor library for machine learning) labels Jan 30, 2025
@0cc4m 0cc4m left a comment

LGTM

@0cc4m 0cc4m merged commit 2c6c8df into ggml-org:master Feb 6, 2025
41 checks passed
tinglou pushed a commit to tinglou/llama.cpp that referenced this pull request Feb 13, 2025
* vulkan: optimize coopmat2 iq2/iq3 callbacks

* build: trigger CI on GLSL compute shader changes
orca-zhang pushed a commit to orca-zhang/llama.cpp that referenced this pull request Feb 26, 2025
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Feb 26, 2025
mglambda pushed a commit to mglambda/llama.cpp that referenced this pull request Mar 8, 2025