Name and Version
ggml_vulkan: Found 2 Vulkan devices:
ggml_vulkan: 0 = NVIDIA GeForce RTX 4070 SUPER (NVIDIA) | uma: 0 | fp16: 1 | bf16: 0 | warp size: 32 | shared memory: 49152 | int dot: 0 | matrix cores: KHR_coopmat
ggml_vulkan: 1 = Intel(R) UHD Graphics 770 (Intel Corporation) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 32 | shared memory: 32768 | int dot: 0 | matrix cores: none
register_backend: registered backend Vulkan (2 devices)
register_device: registered device Vulkan0 (NVIDIA GeForce RTX 4070 SUPER)
register_device: registered device Vulkan1 (Intel(R) UHD Graphics 770)
register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (13th Gen Intel(R) Core(TM) i7-13700)
load_backend: failed to find ggml_backend_init in M:\llama.cpp-master\build\bin\Debug\ggml-vulkan.dll
load_backend: failed to find ggml_backend_init in M:\llama.cpp-master\build\bin\Debug\ggml-cpu.dll
version: 0 (unknown)
built with MSVC 19.44.35214.0 for x64
Operating systems
Windows
Which llama.cpp modules do you know to be affected?
No response
Command line
Problem description & steps to reproduce
The Vulkan backend can use shared memory when dedicated memory is insufficient on discrete GPUs. However, the value obtained via ggml_backend_dev_memory represents dedicated memory only. It would be desirable to also be able to obtain the shared memory value.
Some discrete GPUs have only about 512 MB of dedicated memory, yet by utilizing shared memory they can still run large models. This is important for verifying operational requirements, i.e. checking whether a model fits in the memory that is actually available.
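For context, a minimal sketch of what a caller sees today through the public device API (assuming the standard ggml-backend enumeration functions ggml_backend_dev_count, ggml_backend_dev_get, ggml_backend_dev_name and ggml_backend_dev_memory):

```cpp
// Minimal sketch: enumerate registered devices and print the memory that
// ggml_backend_dev_memory reports. On the Vulkan backend this currently
// reflects the device-local heap only, which is the limitation described above.
#include <cstdio>
#include "ggml-backend.h"

int main() {
    for (size_t i = 0; i < ggml_backend_dev_count(); i++) {
        ggml_backend_dev_t dev = ggml_backend_dev_get(i);
        size_t free_mem = 0, total_mem = 0;
        ggml_backend_dev_memory(dev, &free_mem, &total_mem);
        printf("%s: free %zu MiB / total %zu MiB\n",
               ggml_backend_dev_name(dev),
               free_mem / (1024 * 1024), total_mem / (1024 * 1024));
    }
    return 0;
}
```

On a discrete GPU such as the RTX 4070 SUPER above, this prints only the dedicated VRAM, with no indication of the additional shared memory the driver can fall back to.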
A possible solution is to remove this check:

llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp, line 12111 in 4ca088b:

if (heap.flags & vk::MemoryHeapFlagBits::eDeviceLocal) {
However, there may be cases where you want to keep usage within the capacity of dedicated memory, so it might be better if dedicated memory and shared memory could be queried separately.
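To illustrate that second option, a rough sketch of how the heap enumeration around the line above could report both values (the function name and struct here are hypothetical, not the actual upstream code):

```cpp
// Rough sketch, not the upstream implementation: sum Vulkan memory heaps
// into separate dedicated (device-local) and shared totals instead of
// stopping at the first device-local heap.
#include <vulkan/vulkan.hpp>

struct vk_device_memory_info {
    size_t dedicated; // device-local heaps (VRAM on discrete GPUs)
    size_t shared;    // remaining heaps (system memory visible to the GPU)
};

static vk_device_memory_info get_device_memory_info(vk::PhysicalDevice device) {
    vk_device_memory_info info = {0, 0};
    vk::PhysicalDeviceMemoryProperties memprops = device.getMemoryProperties();
    for (uint32_t i = 0; i < memprops.memoryHeapCount; i++) {
        const vk::MemoryHeap & heap = memprops.memoryHeaps[i];
        if (heap.flags & vk::MemoryHeapFlagBits::eDeviceLocal) {
            info.dedicated += heap.size;
        } else {
            info.shared += heap.size;
        }
    }
    return info;
}
```

Exposing the shared value to callers would also require an API change, for example an extra field next to memory_free/memory_total in ggml_backend_dev_props.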
First Bad Commit
No response