opencl: add sqr, sqrt, mean and ssm_conv #17476

lhez · 2025-11-24T17:07:26Z

This PR adds new ops (sqr, sqrt, mean and ssm_conv) needed by models like gemma3n and lfm2.
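
For readers unfamiliar with these ops, here is a rough reference sketch of what each one computes, written in plain Python rather than OpenCL C. It is not taken from this PR: the function names (`op_sqr`, `op_sqrt`, `op_mean`, `op_ssm_conv`) and the list-based tensor layout are assumptions made for illustration, and the `ssm_conv` semantics (a per-channel sliding-window dot product over `d_conv` taps) reflect my understanding of the ggml op, not the kernel added here.

```python
import math

def op_sqr(x):
    """Element-wise square."""
    return [v * v for v in x]

def op_sqrt(x):
    """Element-wise square root."""
    return [math.sqrt(v) for v in x]

def op_mean(rows):
    """Mean along each row: one output value per input row."""
    return [sum(r) / len(r) for r in rows]

def op_ssm_conv(conv_x, weight, n_tokens):
    """Per-channel sliding-window convolution (illustrative layout).

    conv_x[ch] holds d_conv - 1 + n_tokens samples for channel ch.
    weight[ch] holds the d_conv filter taps for channel ch.
    Returns out[t][ch] for each token t.
    """
    d_conv = len(weight[0])
    out = []
    for t in range(n_tokens):
        out.append([
            sum(conv_x[ch][t + k] * weight[ch][k] for k in range(d_conv))
            for ch in range(len(weight))
        ])
    return out
```

Each channel is convolved independently with its own taps, so the work parallelizes naturally over channels and tokens, which is presumably why the op is a good fit for a GPU backend.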

CISC · 2025-11-24T18:19:33Z

@lhez It's time to update ops.md and OpenCL.csv too. :)

max-krasnyansky

Very nice!

SSM_CONV in particular gives a nice perf boost for the LFM2 models.

LFM2-1.2B-Q4_0.gguf

Before:
common_perf_print: prompt eval time =     286.12 ms /   212 tokens (    1.35 ms per token,   740.95 tokens per second)
common_perf_print:        eval time =     966.67 ms /    50 runs   (   19.33 ms per token,    51.72 tokens per second)

After:
common_perf_print: prompt eval time =     250.62 ms /   212 tokens (    1.18 ms per token,   845.91 tokens per second)
common_perf_print:        eval time =    1126.55 ms /    63 runs   (   17.88 ms per token,    55.92 tokens per second)

I checked the SCHED_DEBUG output and we now claim all the SSM_CONV ops in LFM2.
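
To put the numbers above in relative terms, a quick calculation on the reported tokens-per-second figures (the helper name `speedup_percent` is just for illustration). Note that the two eval runs generated a different number of tokens (50 vs 63), so only the per-token rates are compared.

```python
def speedup_percent(before_tps, after_tps):
    """Relative throughput gain in percent."""
    return (after_tps / before_tps - 1.0) * 100.0

# Tokens-per-second figures copied from the logs above.
prompt_gain = speedup_percent(740.95, 845.91)
eval_gain = speedup_percent(51.72, 55.92)

print(f"prompt eval: +{prompt_gain:.1f}%")  # prints "prompt eval: +14.2%"
print(f"eval:        +{eval_gain:.1f}%")    # prints "eval:        +8.1%"
```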

loci-dev mentioned this pull request Nov 24, 2025

UPSTREAM PR #17476: opencl: add sqr, sqrt, mean and ssm_conv auroralabs-loci/llama.cpp#311

Open

github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) and OpenCL (issues specific to the OpenCL backend) labels Nov 24, 2025

lhez assigned max-krasnyansky Nov 25, 2025

lhez added 6 commits November 24, 2025 22:28

opencl: add sqr

3c5d96a

opencl: add sqrt

d478554

opencl: add mean

95949c5

opencl: add ssm_conv

dd64bc9

opencl: add missing cl_khr_fp16

2dd65e4

opencl: do sqrt in f32 then convert to f16 for better precision

338d9ac
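
The idea behind the last commit can be sketched on the host side: widen the f16 input to f32, take the square root there, and round to f16 only once at the end. This is a Python illustration only, not the OpenCL kernel from this PR; the helper names (`to_f16`, `to_f32`, `sqrt_f16_via_f32`) are hypothetical, and rounding is emulated with `struct` format codes `e` (half) and `f` (single).

```python
import math
import struct

def to_f16(v):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", v))[0]

def to_f32(v):
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", v))[0]

def sqrt_f16_via_f32(x_f16):
    """Square root of an f16 value, computed in f32 and rounded to f16 once."""
    x_f32 = to_f32(x_f16)            # widening f16 -> f32 is exact
    r_f32 = to_f32(math.sqrt(x_f32)) # sqrt result rounded to f32
    return to_f16(r_f32)             # single final rounding to f16

print(sqrt_f16_via_f32(to_f16(2.0)))  # prints 1.4140625
```

The intermediate result carries 24 bits of significand instead of 11, so the only loss relative to the f32 result is the final conversion.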

lhez force-pushed the lh/sqr-sqrt-mean-ssm-conv branch from 253dd70 to 338d9ac November 25, 2025 06:41

lhez marked this pull request as ready for review November 25, 2025 18:15

lhez requested a review from max-krasnyansky as a code owner November 25, 2025 18:15

max-krasnyansky approved these changes Nov 26, 2025

max-krasnyansky merged commit 7cba58b into ggml-org:master Nov 26, 2025
65 of 74 checks passed
