Name and Version
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon Graphics (RADV RENOIR) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 0 | matrix cores: none
version: 6665 (95ce098)
built with cc (Ubuntu 14.2.0-19ubuntu2) 14.2.0 for x86_64-linux-gnu
Operating systems
Linux (Ubuntu 25.04 plucky)
Which llama.cpp modules do you know to be affected?
llama-server
Command line
./llama-server --gpu-layers 100 -dev Vulkan0 -t 1 -tb 1 --no-mmap -c 10000 -fa off -m ./gpt/gpt-oss-20b-mxfp4.gguf
Problem description & steps to reproduce
After the recent changes, llama.cpp is consistently crashing when using Vulkan with the default physical batch size (-ub 512).
I've found that reducing the physical batch size to 401 or lower seems to work correctly.
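For reference, the same invocation as above runs without crashing when only the physical batch size is lowered (401 is simply the largest value that worked for me, not a meaningful constant):

./llama-server --gpu-layers 100 -dev Vulkan0 -t 1 -tb 1 --no-mmap -c 10000 -fa off -ub 401 -m ./gpt/gpt-oss-20b-mxfp4.gguf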
First Bad Commit
No response
Relevant log output
...
slot update_slots: id 0 | task 0 | prompt processing progress, n_past = 4096, n_tokens = 2048, progress = 0.590457
/llama.cpp/ggml/src/ggml-backend.cpp:1850: GGML_ASSERT((char *)addr + ggml_backend_buffer_get_alloc_size(buffer, tensor) <= (char *)ggml_backend_buffer_get_base(buffer) + ggml_backend_buffer_get_size(buffer)) failed
[New LWP 82368]
[New LWP 82367]
[New LWP 82366]
[New LWP 82365]
[New LWP 82364]
[New LWP 82363]
[New LWP 82362]
[New LWP 82361]
[New LWP 82360]
[New LWP 82359]
[New LWP 82358]
[New LWP 82357]
[New LWP 82356]
[New LWP 82355]
[New LWP 82354]
[New LWP 82353]
[New LWP 82352]
[New LWP 82350]
[New LWP 82349]
This GDB supports auto-downloading debuginfo from the following URLs:
<https://debuginfod.ubuntu.com>
Enable debuginfod for this session? (y or [n]) [answered N; input not from terminal]
Debuginfod has been disabled.
To make this setting permanent, add 'set debuginfod enabled off' to .gdbinit.
Function(s) ^std::(move|forward|as_const|(__)?addressof) will be skipped when stepping.
Function(s) ^std::(shared|unique)_ptr<.*>::(get|operator) will be skipped when stepping.
Function(s) ^std::(basic_string|vector|array|deque|(forward_)?list|(unordered_|flat_)?(multi)?(map|set)|span)<.*>::(c?r?(begin|end)|front|back|data|size|empty) will be skipped when stepping.
Function(s) ^std::(basic_string|vector|array|deque|span)<.*>::operator.] will be skipped when stepping.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
__syscall_cancel_arch () at ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S:56
warning: 56 ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S: No such file or directory
#0 __syscall_cancel_arch () at ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S:56
56 in ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S
#1 0x000077136fc9eb63 in __internal_syscall_cancel (a1=<optimized out>, a2=<optimized out>, a3=<optimized out>, a4=<optimized out>, a5=0, a6=0, nr=61) at ./nptl/cancellation.c:49
warning: 49 ./nptl/cancellation.c: No such file or directory
#2 __syscall_cancel (a1=<optimized out>, a2=<optimized out>, a3=<optimized out>, a4=<optimized out>, a5=a5@entry=0, a6=a6@entry=0, nr=61) at ./nptl/cancellation.c:75
75 in ./nptl/cancellation.c
#3 0x000077136fd1ae9f in __GI___wait4 (pid=<optimized out>, stat_loc=<optimized out>, options=<optimized out>, usage=<optimized out>) at ../sysdeps/unix/sysv/linux/wait4.c:30
warning: 30 ../sysdeps/unix/sysv/linux/wait4.c: No such file or directory
#4 0x00007713706cdf13 in ggml_print_backtrace () from /llama.cpp/build/bin/libggml-base.so
#5 0x00007713706ce0bb in ggml_abort () from /llama.cpp/build/bin/libggml-base.so
#6 0x00007713706e853d in ggml_backend_tensor_alloc () from /llama.cpp/build/bin/libggml-base.so
#7 0x00007713706e2367 in ggml_gallocr_alloc_graph () from /llama.cpp/build/bin/libggml-base.so
#8 0x00007713706e818f in ggml_backend_sched_alloc_graph () from /llama.cpp/build/bin/libggml-base.so
#9 0x000077137049fa88 in llama_context::process_ubatch(llama_ubatch const&, llm_graph_type, llama_memory_context_i*, ggml_status&) () from /llama.cpp/build/bin/libllama.so
#10 0x00007713704a4c5f in llama_context::decode(llama_batch const&) () from /llama.cpp/build/bin/libllama.so
#11 0x00007713704a5c0f in llama_decode () from /llama.cpp/build/bin/libllama.so
#12 0x000062cb89e63220 in server_context::update_slots() ()
#13 0x000062cb89e263b2 in server_queue::start_loop() ()
#14 0x000062cb89de2bc9 in main ()
[Inferior 1 (process 82348) detached]
Aborted (core dumped)
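For context, the assertion that fires in ggml_backend_tensor_alloc (ggml-backend.cpp:1850) is a bounds check: the allocation assigned to a tensor must end within the backend buffer it is placed in. Below is a minimal standalone sketch of that invariant using plain pointers, not the actual ggml API (all names here are illustrative only):

#include <cstddef>
#include <cstdio>

// Returns true if an allocation of `alloc_size` bytes starting at `addr`
// still ends inside a buffer that starts at `base` and spans `buf_size` bytes.
static bool fits_in_buffer(const char *base, std::size_t buf_size,
                           const char *addr, std::size_t alloc_size) {
    return addr + alloc_size <= base + buf_size;
}

int main() {
    char buffer[1024];
    // Example of the failing case: an offset/size pair whose end lies past
    // the end of the buffer, which is what the GGML_ASSERT above rejects.
    bool ok = fits_in_buffer(buffer, sizeof(buffer), buffer + 900, 256);
    std::printf("fits: %d\n", ok); // prints 0, since 900 + 256 > 1024
    return 0;
}

Judging from the backtrace (ggml_gallocr_alloc_graph -> ggml_backend_tensor_alloc), with -ub 512 the graph allocator appears to hand out an offset/size combination on the Vulkan backend that violates this check.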