vulkan: Replace uses of maxMemoryAllocationSize and VK_WHOLE_SIZE #16354

jeffbolznv · 2025-09-30T16:14:52Z

Replace the maxMemoryAllocationSize check with a maxBufferSize check when creating buffers. maxMemoryAllocationSize is a "soft" limit, and allocations can succeed beyond it. This allows buffers larger than 4GB to be allocated on some implementations (e.g. NVIDIA), and tensors this large can be used for im2col and mul_mat.
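As a rough illustration of the difference between the two checks, here is a minimal sketch. It is not the ggml-vulkan code: `DeviceLimits`, `fits_allocation_limit`, and `fits_buffer_limit` are hypothetical names, and the struct simply mirrors the two Vulkan limits (which in a real program would be queried from `VkPhysicalDeviceMaintenance3Properties` and `VkPhysicalDeviceMaintenance4Properties`).

```cpp
#include <cstdint>

// Hypothetical mirror of the two device limits discussed above.
struct DeviceLimits {
    uint64_t max_memory_allocation_size; // "soft" limit; allocations may succeed beyond it
    uint64_t max_buffer_size;            // limit on the size of a single buffer
};

// Old-style check: reject any buffer larger than maxMemoryAllocationSize.
inline bool fits_allocation_limit(const DeviceLimits & limits, uint64_t size) {
    return size <= limits.max_memory_allocation_size;
}

// New-style check: only reject buffers larger than maxBufferSize.
inline bool fits_buffer_limit(const DeviceLimits & limits, uint64_t size) {
    return size <= limits.max_buffer_size;
}
```

With illustrative limits of 4 GiB and 16 GiB (made-up numbers, not taken from any particular driver), a 5 GiB request fails the first check but passes the second. The allocation itself can still fail, so the result of the actual allocation call has to be checked as well.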

For temporary buffers (prealloc_x/y/etc), check against maxStorageBufferRange. I'm not sure this check is ideal, but we always use these buffers as a single full-size binding, and that limit may be smaller than maxMemoryAllocationSize or maxBufferSize, so I think this is reasonable.
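A minimal sketch of that kind of check, assuming a hypothetical helper `scratch_buffer_fits` (this name does not come from the PR). In Vulkan, `maxStorageBufferRange` is a 32-bit value in `VkPhysicalDeviceLimits`, so a buffer bound as one full-size storage binding cannot exceed 4 GiB minus one byte regardless of the other limits.

```cpp
#include <cstdint>

// Hypothetical helper: a temporary buffer that is always bound in full as a
// single storage-buffer descriptor must not be larger than the device's
// maxStorageBufferRange (a uint32_t in VkPhysicalDeviceLimits).
inline bool scratch_buffer_fits(uint64_t size, uint32_t max_storage_buffer_range) {
    return size <= static_cast<uint64_t>(max_storage_buffer_range);
}
```

Taking the size as a 64-bit value and widening the limit before comparing avoids silently truncating a request of 4 GiB or more.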

Replace descriptor range uses of VK_WHOLE_SIZE with a manually computed range. maxStorageBufferRange may be smaller than maxBufferSize or maxMemoryAllocationSize (the Vulkan spec warns about this in a note), and it is invalid usage if VK_WHOLE_SIZE resolves to a range larger than maxStorageBufferRange.
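One way such a manual range computation could look is sketched below. `descriptor_range` is a hypothetical helper, and clamping to the limit is an assumption made for illustration; the PR text only says the range is computed manually. For a descriptor, VK_WHOLE_SIZE stands for "buffer size minus offset", which is the first quantity computed here.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper, not the ggml-vulkan implementation.
// Computes an explicit descriptor range for a binding that starts at `offset`
// in a buffer of `buffer_size` bytes, clamped to `max_range`
// (e.g. maxStorageBufferRange). Returns 0 if the offset is out of bounds.
inline uint64_t descriptor_range(uint64_t buffer_size, uint64_t offset, uint64_t max_range) {
    if (offset >= buffer_size) {
        return 0; // nothing left to bind; callers must treat this as an error
    }
    const uint64_t remaining = buffer_size - offset; // what VK_WHOLE_SIZE would resolve to
    return std::min(remaining, max_range);
}
```

If the clamped range is smaller than the data a shader needs to access, the dispatch has to be split or rejected; the sketch only makes sure the descriptor itself stays within the limit.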

With this change, it should be possible to generate videos using wan networks in stable-diffusion.cpp.



* origin/master: (124 commits)
  metal : fix loop bound in ggml_mem_ranges (ggml-org#16412)
  llama : fix shapes for bert/mpt q/k norm (ggml-org#16409)
  ggml : fix graph reallocation with multiple chunks (ggml-org#16396)
  Fix missing messages on sibling navigation (ggml-org#16408)
  vulkan: Replace uses of maxMemoryAllocationSize and VK_WHOLE_SIZE (ggml-org#16354)
  vulkan: Fix FA coopmat1 invalid array indexing (ggml-org#16365)
  ci : change macos-13 to macos-15-intel (ggml-org#16401)
  Capture model name only after first token (streaming) or completed request (ggml-org#16405)
  vulkan: in flash attention, bounds check against nem1 (don't rely on GGML_KQ_MASK_PAD) (ggml-org#16316)
  webui : Fix messages payload sent to chat completions (ggml-org#16402)
  fix: track viewportHeight via window.innerHeight to avoid unwanted scrolling (ggml-org#16356)
  test-barrier : do not use more threads than physically available (ggml-org#16389)
  ggml webgpu: add support for soft_max, optimize rms_norm (ggml-org#16357)
  model : Apertus model implementation (ggml-org#15852)
  musa: update compile flags (ggml-org#16265)
  ci : fix ubuntu-latest-cmake-rpc (disable ccache) (ggml-org#16388)
  ci: update vulkan ci (ggml-org#16294)
  ci : fix clean-up of old logs (ggml-org#16381)
  SYCL: Update to oneAPI 2025.2 (ggml-org#16371)
  HIP: add IMbackK to codeowner (ggml-org#16375)
  ...

jeffbolznv requested a review from 0cc4m as a code owner September 30, 2025 16:14

github-actions bot added the labels Vulkan (Issues specific to the Vulkan backend) and ggml (changes relating to the ggml tensor library for machine learning) Sep 30, 2025

jeffbolznv mentioned this pull request Oct 1, 2025

Vulkan: is it possible to work around maxBufferSize? leejet/stable-diffusion.cpp#673

Closed

vulkan: Add env var GGML_VK_FORCE_MAX_BUFFER_SIZE and use stoull

63974a3
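The commit above mentions an environment variable override parsed with stoull. A minimal sketch of what such parsing might look like follows; `max_buffer_size_override` is a hypothetical helper and the real code may be structured differently. One plausible reason to prefer `std::stoull` over `std::stoul` is that `unsigned long` is 32 bits wide on some platforms (e.g. 64-bit Windows), which cannot represent sizes of 4 GiB or more.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: returns the value of an override string (for example
// the result of std::getenv("GGML_VK_FORCE_MAX_BUFFER_SIZE")) parsed as a
// 64-bit unsigned integer, or `fallback` when the variable is not set.
// std::stoull throws std::invalid_argument or std::out_of_range on bad input.
inline uint64_t max_buffer_size_override(const char * env_value, uint64_t fallback) {
    if (env_value == nullptr) {
        return fallback;
    }
    return static_cast<uint64_t>(std::stoull(env_value));
}
```

A caller would pass `std::getenv("GGML_VK_FORCE_MAX_BUFFER_SIZE")` (from `<cstdlib>`) as the first argument and the device-reported limit as the fallback.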

0cc4m approved these changes Oct 3, 2025


0cc4m merged commit 2aaf0a2 into ggml-org:master Oct 3, 2025
60 of 66 checks passed

Nexesenex added a commit to Nexesenex/croco.cpp that referenced this pull request Oct 3, 2025

Revert "vulkan: Replace uses of maxMemoryAllocationSize and VK_WHOLE_SIZE (ggml-org#16354)"

76aed17

This reverts commit 2aaf0a2.
