
Misc. bug: Core dumped with Vulkan using Default Physical Batch Size. #16383

@Nabokov86

Name and Version

ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon Graphics (RADV RENOIR) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 0 | matrix cores: none
version: 6665 (95ce098)
built with cc (Ubuntu 14.2.0-19ubuntu2) 14.2.0 for x86_64-linux-gnu

Operating systems

Linux (Ubuntu 25.04 plucky)

Which llama.cpp modules do you know to be affected?

llama-server

Command line

./llama-server --gpu-layers 100 -dev Vulkan0 -t 1 -tb 1 --no-mmap -c 10000 -fa off -m ./gpt/gpt-oss-20b-mxfp4.gguf

Problem description & steps to reproduce

After the recent changes, llama.cpp consistently crashes when using Vulkan with the default physical batch size (-ub 512).

I've found that reducing the physical batch size to 401 or lower avoids the crash (see the example below).
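
For example, the same command as above with the physical batch size capped at 401 (by adding -ub 401) runs without hitting the assert on my setup:

./llama-server --gpu-layers 100 -dev Vulkan0 -t 1 -tb 1 --no-mmap -c 10000 -fa off -ub 401 -m ./gpt/gpt-oss-20b-mxfp4.gguf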

First Bad Commit

No response

Relevant log output

...
slot update_slots: id  0 | task 0 | prompt processing progress, n_past = 4096, n_tokens = 2048, progress = 0.590457
/llama.cpp/ggml/src/ggml-backend.cpp:1850: GGML_ASSERT((char *)addr + ggml_backend_buffer_get_alloc_size(buffer, tensor) <= (char *)ggml_backend_buffer_get_base(buffer) + ggml_backend_buffer_get_size(buffer)) failed
[New LWP 82368]
[New LWP 82367]
[New LWP 82366]
[New LWP 82365]
[New LWP 82364]
[New LWP 82363]
[New LWP 82362]
[New LWP 82361]
[New LWP 82360]
[New LWP 82359]
[New LWP 82358]
[New LWP 82357]
[New LWP 82356]
[New LWP 82355]
[New LWP 82354]
[New LWP 82353]
[New LWP 82352]
[New LWP 82350]
[New LWP 82349]

This GDB supports auto-downloading debuginfo from the following URLs:
  <https://debuginfod.ubuntu.com>
Enable debuginfod for this session? (y or [n]) [answered N; input not from terminal]
Debuginfod has been disabled.
To make this setting permanent, add 'set debuginfod enabled off' to .gdbinit.
Function(s) ^std::(move|forward|as_const|(__)?addressof) will be skipped when stepping.
Function(s) ^std::(shared|unique)_ptr<.*>::(get|operator) will be skipped when stepping.
Function(s) ^std::(basic_string|vector|array|deque|(forward_)?list|(unordered_|flat_)?(multi)?(map|set)|span)<.*>::(c?r?(begin|end)|front|back|data|size|empty) will be skipped when stepping.
Function(s) ^std::(basic_string|vector|array|deque|span)<.*>::operator.] will be skipped when stepping.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
__syscall_cancel_arch () at ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S:56
warning: 56	../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S: No such file or directory
#0  __syscall_cancel_arch () at ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S:56
56	in ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S
#1  0x000077136fc9eb63 in __internal_syscall_cancel (a1=<optimized out>, a2=<optimized out>, a3=<optimized out>, a4=<optimized out>, a5=0, a6=0, nr=61) at ./nptl/cancellation.c:49
warning: 49	./nptl/cancellation.c: No such file or directory
#2  __syscall_cancel (a1=<optimized out>, a2=<optimized out>, a3=<optimized out>, a4=<optimized out>, a5=a5@entry=0, a6=a6@entry=0, nr=61) at ./nptl/cancellation.c:75
75	in ./nptl/cancellation.c
#3  0x000077136fd1ae9f in __GI___wait4 (pid=<optimized out>, stat_loc=<optimized out>, options=<optimized out>, usage=<optimized out>) at ../sysdeps/unix/sysv/linux/wait4.c:30
warning: 30	../sysdeps/unix/sysv/linux/wait4.c: No such file or directory
#4  0x00007713706cdf13 in ggml_print_backtrace () from /llama.cpp/build/bin/libggml-base.so
#5  0x00007713706ce0bb in ggml_abort () from /llama.cpp/build/bin/libggml-base.so
#6  0x00007713706e853d in ggml_backend_tensor_alloc () from /llama.cpp/build/bin/libggml-base.so
#7  0x00007713706e2367 in ggml_gallocr_alloc_graph () from /llama.cpp/build/bin/libggml-base.so
#8  0x00007713706e818f in ggml_backend_sched_alloc_graph () from /llama.cpp/build/bin/libggml-base.so
#9  0x000077137049fa88 in llama_context::process_ubatch(llama_ubatch const&, llm_graph_type, llama_memory_context_i*, ggml_status&) () from /llama.cpp/build/bin/libllama.so
#10 0x00007713704a4c5f in llama_context::decode(llama_batch const&) () from /llama.cpp/build/bin/libllama.so
#11 0x00007713704a5c0f in llama_decode () from /llama.cpp/build/bin/libllama.so
#12 0x000062cb89e63220 in server_context::update_slots() ()
#13 0x000062cb89e263b2 in server_queue::start_loop() ()
#14 0x000062cb89de2bc9 in main ()
[Inferior 1 (process 82348) detached]
Aborted (core dumped)
