opencl: initial `q8_0` mv support #15732

lhez · 2025-09-02T06:42:42Z

This PR adds initial q8_0 support.

lhez · 2025-09-08T21:10:59Z

ggml/src/ggml-opencl/ggml-opencl.cpp

+            extra = new ggml_tensor_extra_cl_q8_0();
+        } else {
+            extra = temp_tensor_extras_q8_0.back();
+            temp_tensor_extras_q4_0.pop_back();


This should be `temp_tensor_extras_q8_0.pop_back()`.
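To make the issue concrete, here is a minimal sketch of the reuse pattern with the pop applied to the matching pool. The struct names, the `extra_pools` wrapper, and the `release_q8_0` helper are hypothetical stand-ins for illustration, not the backend's actual types. With the original line, the q8_0 pointer would stay in its free list (so it could be handed out twice) while the q4_0 list loses an entry; calling `pop_back()` on an empty `std::vector` is also undefined behavior.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the per-type tensor extras; the real structs
// hold OpenCL buffer handles, which are omitted here.
struct extra_q4_0 { int id = 0; };
struct extra_q8_0 { int id = 0; };

// Hypothetical wrapper holding one free list per quantization type.
struct extra_pools {
    std::vector<extra_q4_0 *> free_q4_0;
    std::vector<extra_q8_0 *> free_q8_0;

    // Reuse a pooled q8_0 extra if one is available, otherwise allocate.
    extra_q8_0 * alloc_q8_0() {
        extra_q8_0 * extra;
        if (free_q8_0.empty()) {
            extra = new extra_q8_0();
        } else {
            extra = free_q8_0.back();
            free_q8_0.pop_back();  // pop the same list that back() read from
        }
        return extra;
    }

    // Return an extra to the q8_0 free list so it can be reused later.
    void release_q8_0(extra_q8_0 * extra) {
        assert(extra != nullptr);
        free_q8_0.push_back(extra);
    }
};
```

In this sketch, allocating, releasing, and allocating again returns the same pointer and leaves the q4_0 list untouched, which is the behavior the suggested fix restores.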

lhez · 2025-09-21T03:39:44Z

@max-krasnyansky @rmatif I am going to merge this to move things forward. We can iterate on this if needed.

rmatif · 2025-09-21T08:41:03Z

> @max-krasnyansky @rmatif I am going to merge this to move things forward. We can iterate on this if needed.

Sorry for the delay. I’ve been pretty busy. I’ll look into this (and the others) today.

rmatif · 2025-09-21T16:56:08Z

ggml_opencl: default device: 'QUALCOMM Adreno(TM) 830 (OpenCL 3.0 Adreno(TM) 830)

| model | size | params | backend | ngl | fa | test | t/s |
| --- | --- | --- | --- | --- | --- | --- | --- |
| llama 1B Q8_0 | 1.22 GiB | 1.24 B | OpenCL | 99 | 0 | pp512 | 58.38 ± 0.76 |
| llama 1B Q8_0 | 1.22 GiB | 1.24 B | OpenCL | 99 | 0 | tg128 | 38.80 ± 0.20 |
| llama 1B Q8_0 | 1.22 GiB | 1.24 B | OpenCL | 99 | 1 | pp512 | 47.74 ± 11.05 |
| llama 1B Q8_0 | 1.22 GiB | 1.24 B | OpenCL | 99 | 1 | tg128 | 34.16 ± 2.31 |

rmatif · 2025-09-21T16:59:42Z

@lhez I'll let you merge the PR according to the new guidelines.

lhez · 2025-09-21T21:48:06Z

@rmatif Thank you for taking a look!

github-actions bot added the ggml and OpenCL labels Sep 2, 2025

lhez marked this pull request as ready for review September 5, 2025 05:53

lhez requested review from max-krasnyansky and rmatif September 5, 2025 05:53

lhez commented Sep 8, 2025

lhez force-pushed the q8_0-mv branch 2 times, most recently from 9e07767 to 066b21d September 16, 2025 03:23

lhez force-pushed the q8_0-mv branch from 066b21d to 7f64bfd September 18, 2025 05:52

lhez added 6 commits September 18, 2025 14:17

opencl: initial q8_0 mv

e9a5410

opencl: initial q8_0 mv_id

9cce98b

opencl: add flattened q8_0 mv

3e0ee39

opencl: add flattened q8_0 mv_id

fd78540

opencl: improve mul_mv_q8_0_f32_flat

a07dada

opencl: improve mul_mv_id_q8_0_f32_flat

887e37f

lhez force-pushed the q8_0-mv branch from 7f64bfd to 887e37f September 19, 2025 06:18

rmatif approved these changes Sep 21, 2025

lhez merged commit c4510dc into ggml-org:master Sep 21, 2025
94 of 96 checks passed

struct pushed a commit to struct/llama.cpp that referenced this pull request Sep 26, 2025

opencl: initial q8_0 mv support (ggml-org#15732)

86b1356

yael-works pushed a commit to yael-works/llama.cpp that referenced this pull request Oct 15, 2025

opencl: initial q8_0 mv support (ggml-org#15732)

f3b5b2f

pwilkin pushed a commit to pwilkin/llama.cpp that referenced this pull request Oct 23, 2025

opencl: initial q8_0 mv support (ggml-org#15732)

7be1cfd
