opencl: refine condition for using kqv mm kernel #17392

lhez · 2025-11-19T22:03:07Z

The kernel introduced in #17181 is specialized for f16f32 mm in attention, in particular for KQ and KQV. Ordinary f16f32 mm would go to this kernel under the original condition causing incorrect results like in #17351. This PR refines the condition (similarly to what vulkan backend does) so that this kernel will only be used for attention.

forforever73 · 2025-11-20T06:13:55Z

@lhez Great, the output is correct now. However, I'm still a bit confused about the condition you use to determine KQV. Most implementations of build_attn call ggml_cont on v, which prevents kqv from entering the ggml_cl_mul_mat_kq_kqv_adreno kernel. After enabling the GGML_OPENCL_PROFILING=ON build option, I confirmed that this is indeed the case.

lhez · 2025-11-21T22:24:11Z

@lhez Great, the output is correct now. However, I'm still a bit confused about the condition you use to determine KQV. Most implementations of build_attn call ggml_cont on v, which prevents kqv from entering the ggml_cl_mul_mat_kq_kqv_adreno kernel.

@forforever73 I think what you are referring to is for mmproj models and you are right - v is made contiguous with ggml_cont (here) for mmproj. In this case, kqv will use the ordinary f16f32 mm.

However, this is not the case for language models. For language models, v usually is not made contiguous and remains a view into the v cache. In this case, v is a non-contiguous tensor and kq is contiguous (output of softmax).

If you dump the kernel timing, e.g., qwen2.5-1.5b, you will see something like this,

So, for language models there is usually no cont and we still need to identify for kqv.

opencl: refine condition for kqv mm

a8b6189

github-actions bot added ggml changes relating to the ggml tensor library for machine learning OpenCL Issues specific to the OpenCL backend labels Nov 19, 2025

lhez mentioned this pull request Nov 19, 2025

Eval bug: Incorrect VLM image description on Android after commit 4db5641 #17351

Closed

lhez marked this pull request as ready for review November 20, 2025 18:55

lhez requested a review from max-krasnyansky as a code owner November 20, 2025 18:55

max-krasnyansky approved these changes Nov 21, 2025

View reviewed changes

max-krasnyansky merged commit 8e9ddba into ggml-org:master Nov 21, 2025
133 of 134 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

opencl: refine condition for using kqv mm kernel #17392

opencl: refine condition for using kqv mm kernel #17392

lhez commented Nov 19, 2025

Uh oh!

forforever73 commented Nov 20, 2025

Uh oh!

lhez commented Nov 21, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

opencl: refine condition for using kqv mm kernel #17392

opencl: refine condition for using kqv mm kernel #17392

Conversation

lhez commented Nov 19, 2025

Uh oh!

forforever73 commented Nov 20, 2025

Uh oh!

lhez commented Nov 21, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants