Vulkan: RTE rounding for cpy to quant #12480

stduhpf · 2025-03-20T18:50:33Z

Fixes some failing tests by using round-to-nearest-even (RTE) rounding when copying to quantized types.
Discussed here: #11166
@jeffbolznv
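For readers unfamiliar with the term, here is a minimal sketch of what RTE means when a float32 value is narrowed to fp16 precision, compared with plain truncation (round toward zero). It is illustrative only and is not the shader code from this PR. The function names are hypothetical, and the sketch assumes finite inputs within the fp16 normal range (no NaN, infinity, overflow, or subnormal handling).

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret a float as its 32-bit pattern and back (hypothetical helpers).
static uint32_t float_bits(float x) {
    uint32_t u;
    std::memcpy(&u, &x, sizeof u);
    return u;
}

static float bits_float(uint32_t u) {
    float x;
    std::memcpy(&x, &u, sizeof x);
    return x;
}

// Keep only the top 10 mantissa bits (fp16 precision) by dropping the
// low 13 bits of the float32 mantissa: round toward zero (RTZ).
float round_to_fp16_precision_rtz(float x) {
    return bits_float(float_bits(x) & ~0x1FFFu);
}

// Same narrowing, but round to nearest with ties going to the even
// mantissa (RTE). Adding 0x0FFF plus the current lowest kept bit makes
// exact ties round up only when the kept mantissa is odd.
float round_to_fp16_precision_rte(float x) {
    uint32_t u = float_bits(x);
    uint32_t lsb = (u >> 13) & 1u;   // lowest mantissa bit that survives
    u += 0x0FFFu + lsb;              // a carry may ripple into the exponent, which is intended
    return bits_float(u & ~0x1FFFu);
}
```

With truncation every discarded fraction biases the result toward zero, whereas RTE picks the nearest representable value, which is presumably why a reference that rounds to nearest can disagree with a truncating path in the last bit.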

ggml/src/ggml-vulkan/ggml-vulkan.cpp

jeffbolznv · 2025-03-20T19:19:07Z

The changes look good to me, but I haven't tested locally.

stduhpf · 2025-03-20T21:33:25Z

I just tested these changes in sdcpp. LoRAs still load properly on quantized models, and image quality (with LoRA) seemed slightly better with this PR than with the current implementation, but that might just be luck or a placebo effect.

0cc4m

Thank you!

* Vulkan: RTE rounding for cpy to quant

  Co-Authored-By: Jeff Bolz <jbolz@nvidia.com>
* remove trailing whitespace
* avoid duplicating pipeline_cpy_f32_quant
* fix copypasting issue
* remove duplicated code

---------

Co-Authored-By: Jeff Bolz <jbolz@nvidia.com>

github-actions bot added Vulkan ggml labels Mar 20, 2025

0cc4m reviewed Mar 20, 2025

ggml/src/ggml-vulkan/ggml-vulkan.cpp (outdated, resolved)

stduhpf marked this pull request as ready for review March 20, 2025 21:31

stduhpf and others added 5 commits March 20, 2025 22:42

Vulkan: RTE rounding for cpy to quant

c6969b8

Co-Authored-By: Jeff Bolz <jbolz@nvidia.com>

remove trailing whitespace

c876daa

avoid duplicating pipeline_cpy_f32_quant

f5565cf

fix copypasting issue

17933a7

remove duplicated code

b241383

stduhpf force-pushed the vk-cpy-rte branch from b95649b to b241383 Compare March 20, 2025 21:47

0cc4m approved these changes Mar 21, 2025

0cc4m merged commit 4375415 into ggml-org:master Mar 21, 2025
48 checks passed
