ggerganov commented on Sep 18, 2025

  • Make NSG and nxpsg compile-time constants
  • Remove NW template arg
  • Tune some mat-vec constants
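
For readers less familiar with why moving a parameter to compile time can help, here is a rough C++ analogy. It is not the Metal kernel code from this PR: Metal function constants are specialized when the pipeline is created, whereas the sketch below uses a C++ template parameter to get a similar effect. `NSG` is borrowed from the list above and treated simply as an integer tuning parameter; `sum_partials` and `sum_partials_runtime` are hypothetical names made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Runtime variant (hypothetical): `nsg` is an ordinary argument, so the
// compiler does not know the loop trip count when it compiles this function.
float sum_partials_runtime(const float* partials, int nsg) {
    assert(nsg >= 0);
    float acc = 0.0f;
    for (int i = 0; i < nsg; ++i) {
        acc += partials[i];
    }
    return acc;
}

// Compile-time variant (hypothetical): NSG is fixed per instantiation, so the
// trip count is a constant and the loop can be fully unrolled by the compiler.
template <std::size_t NSG>
float sum_partials(const std::array<float, NSG>& partials) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < NSG; ++i) {
        acc += partials[i];
    }
    return acc;
}
```

Both variants return the same result; the difference is only in what the compiler knows ahead of time. Whether that translates into a measurable gain depends on the kernel and the hardware.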

| Model | Test | t/s master | t/s gg/metal-mul-mv-ext-opt | Speedup |
| --- | --- | ---: | ---: | ---: |
| qwen2 14B Q4_K_M | tg32 | 54.54 | 55.61 | 1.02 |
| qwen2 32B IQ4_NL - 4.5 bpw | tg32 | 28.20 | 28.27 | 1.00 |
| qwen2 32B Q4_K_M | tg32 | 27.33 | 27.51 | 1.01 |
| qwen2 32B Q5_K_M | tg32 | 21.18 | 21.47 | 1.01 |
| qwen2 32B Q6_K | tg32 | 21.25 | 22.97 | 1.08 |
| qwen2 3B IQ4_NL - 4.5 bpw | tg32 | 155.64 | 157.27 | 1.01 |
| qwen2 3B Q4_K_M | tg32 | 148.75 | 155.12 | 1.04 |
| qwen2 3B Q5_K_M | tg32 | 130.03 | 131.54 | 1.01 |
| qwen2 3B Q6_K | tg32 | 136.94 | 143.11 | 1.05 |
| qwen2 7B IQ4_NL - 4.5 bpw | tg32 | 108.95 | 109.78 | 1.01 |
| qwen2 7B Q4_K_M | tg32 | 100.48 | 103.56 | 1.03 |
| qwen2 7B Q5_K_M | tg32 | 78.64 | 79.69 | 1.01 |
| qwen2 7B Q6_K | tg32 | 84.44 | 90.19 | 1.07 |
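
The Speedup values are consistent with the branch throughput divided by the master throughput, rounded to two decimals. A minimal sketch of that arithmetic, assuming this is how the column was derived (the helper names `speedup` and `round2` are made up here and are not part of any benchmark tool):

```cpp
#include <cassert>
#include <cmath>

// Ratio of branch throughput to master throughput, both in tokens per second.
double speedup(double ts_master, double ts_branch) {
    assert(ts_master > 0.0);
    return ts_branch / ts_master;
}

// Round to two decimals, matching the precision of the Speedup column.
double round2(double x) {
    return std::round(x * 100.0) / 100.0;
}
```

For example, the qwen2 32B Q6_K row gives 22.97 / 21.25 ≈ 1.0809, which rounds to the reported 1.08, and the qwen2 14B Q4_K_M row gives 55.61 / 54.54 ≈ 1.0196, which rounds to 1.02.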

github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) and Apple Metal (https://en.wikipedia.org/wiki/Metal_(API)) labels on Sep 18, 2025
ggerganov merged commit 703f9e3 into master on Sep 18, 2025
53 of 54 checks passed
ggerganov deleted the gg/metal-mul-mv-ext-opt branch on September 18, 2025 13:28
yael-works pushed a commit to yael-works/llama.cpp that referenced this pull request Oct 15, 2025
* metal : use function constants for mul_mv_ext kernels

ggml-ci

* metal : remove NW template argument

ggml-ci

* metal : adjust constants

ggml-ci