
vulkan: dequantize iq4_xs 4 at a time#20657

Merged
0cc4m merged 1 commit into ggml-org:master from netrunnereve:matmul
Mar 19, 2026


@netrunnereve
Collaborator

I guess I missed this when I did the other quants 🤷‍♀️.

On my RX 470:

PR:

MUL_MAT(type_a=iq4_xs,type_b=f32,m=4096,n=512,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3],k_v=0,o=1):                       42 runs - 24842.74 us/run -  60.13 GFLOP/run -   2.42 TFLOPS
| model | size | params | backend | ngl | test | t/s |
| --- | --- | --- | --- | --- | --- | --- |
| llama 1B IQ4_XS - 4.25 bpw | 580.86 MiB | 1.10 B | Vulkan | 100 | pp512 | 1116.81 ± 0.27 |

Master:

MUL_MAT(type_a=iq4_xs,type_b=f32,m=4096,n=512,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3],k_v=0,o=1):                       38 runs - 26709.47 us/run -  60.13 GFLOP/run -   2.25 TFLOPS
| model | size | params | backend | ngl | test | t/s |
| --- | --- | --- | --- | --- | --- | --- |
| llama 1B IQ4_XS - 4.25 bpw | 580.86 MiB | 1.10 B | Vulkan | 100 | pp512 | 1055.85 ± 2.70 |
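
For anyone comparing the two runs, here is a minimal sketch of the arithmetic behind the numbers above. It assumes the usual `2 * m * n * k` FLOP count for a matrix multiply, which matches the reported 60.13 GFLOP/run. The variable names are illustrative and not part of any ggml tooling. With the figures as posted, the PR comes out roughly 7.5% faster on the `MUL_MAT` test and roughly 5.8% faster on pp512.

```python
# Figures copied from the MUL_MAT lines and llama-bench rows in this comment.
M, N, K = 4096, 512, 14336

# Assumption: one multiply and one add per (m, n, k) triple.
flop_per_run = 2 * M * N * K


def tflops(us_per_run: float) -> float:
    """Convert a time per run in microseconds to TFLOPS for this shape."""
    seconds = us_per_run * 1e-6
    return flop_per_run / seconds / 1e12


pr_us, master_us = 24842.74, 26709.47      # us/run, lower is better
pr_tps, master_tps = 1116.81, 1055.85      # pp512 t/s, higher is better

# Relative improvement of the PR over master.
matmul_speedup = master_us / pr_us - 1.0
pp512_speedup = pr_tps / master_tps - 1.0

print(f"GFLOP/run:      {flop_per_run / 1e9:.2f}")
print(f"PR TFLOPS:      {tflops(pr_us):.2f}")
print(f"master TFLOPS:  {tflops(master_us):.2f}")
print(f"MUL_MAT gain:   {matmul_speedup:.1%}")
print(f"pp512 gain:     {pp512_speedup:.1%}")
```

The two gains differ because pp512 times the whole prompt-processing pass, of which the IQ4_XS matrix multiplies are only one part.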

@netrunnereve netrunnereve requested a review from a team as a code owner March 16, 2026 21:50
@github-actions github-actions bot added the Vulkan (Issues specific to the Vulkan backend) and ggml (changes relating to the ggml tensor library for machine learning) labels Mar 16, 2026
Contributor

@0cc4m 0cc4m left a comment


LGTM

@0cc4m 0cc4m merged commit 07feeaa into ggml-org:master Mar 19, 2026
51 of 57 checks passed
@netrunnereve netrunnereve deleted the matmul branch March 19, 2026 15:28

Labels

ggml (changes relating to the ggml tensor library for machine learning), Vulkan (Issues specific to the Vulkan backend)


2 participants