Description
Name and Version
./llama.cpp/build/bin/llama-cli --version
ggml_vulkan: Found 3 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon Graphics (RADV NAVI21) (radv) | uma: 0 | fp16: 1 | bf16: 0 | warp size: 32 | shared memory: 65536 | int dot: 1 | matrix cores: none
ggml_vulkan: 1 = AMD Radeon Graphics (RADV NAVI21) (radv) | uma: 0 | fp16: 1 | bf16: 0 | warp size: 32 | shared memory: 65536 | int dot: 1 | matrix cores: none
ggml_vulkan: 2 = AMD Radeon Graphics (RADV VEGA20) (radv) | uma: 0 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: none
version: 7058 (45c6ef7)
built with cc (GCC) 15.2.1 20251022 (Red Hat 15.2.1-3) for x86_64-redhat-linux
Operating systems
Linux
GGML backends
Vulkan
Hardware
AMD Radeon Pro Duo W6800X + AMD Radeon Pro Vega II
No infinity fabric link installed
Models
unsloth/Kimi-K2-Thinking-GGUF:UD-TQ1_0
unsloth/Kimi-K2-Thinking-GGUF:UD-Q4_K_XL
Problem description & steps to reproduce
./build/bin/llama-cli -hf unsloth/Kimi-K2-Thinking-GGUF:UD-Q4_K_XL -ot ".ffn_.*_exps.=CPU" --special
"Tell me all about the rise and fall of the Roman Empire"
After only a few tokens, sometimes a few hundred, the run crashes with a segmentation fault. Some time later the whole machine abends and restarts.
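The backtrace in the log below comes from the automatic handler (ggml_abort -> ggml_print_backtrace, which attaches gdb to the crashing process). If a fuller trace is useful, a minimal sketch of one way to capture all threads is shown here; the llama-cli arguments are copied from the reproduce command above, while the gdb batch-mode wrapper itself is only an illustration, not something this report actually used.

# Sketch: rerun the same command under gdb in batch mode so every thread's
# backtrace is dumped when the abort fires. Paths and model tag match the
# reproduce command above; the gdb invocation is an assumption/example.
gdb --batch -ex run -ex "thread apply all bt" --args \
    ./build/bin/llama-cli \
    -hf unsloth/Kimi-K2-Thinking-GGUF:UD-Q4_K_XL \
    -ot ".ffn_.*_exps.=CPU" \
    --special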
First Bad Commit
No response
Relevant log output
This GDB supports auto-downloading debuginfo from the following URLs:
<ima:enforcing>
<https://debuginfod.fedoraproject.org/>
<ima:ignore>
Enable debuginfod for this session? (y or [n]) [answered N; input not from terminal]
Debuginfod has been disabled.
To make this setting permanent, add 'set debuginfod enabled off' to .gdbinit.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007fe5fc286982 in __syscall_cancel_arch () from /lib64/libc.so.6
#0 0x00007fe5fc286982 in __syscall_cancel_arch () from /lib64/libc.so.6
#1 0x00007fe5fc27ac3c in __internal_syscall_cancel () from /lib64/libc.so.6
#2 0x00007fe5fc27ac84 in __syscall_cancel () from /lib64/libc.so.6
#3 0x00007fe5fc2eab4f in wait4 () from /lib64/libc.so.6
#4 0x00007fe5ffd02963 in ggml_print_backtrace () from /home/threaded/projects/llama.cpp/build/bin/libggml-base.so.0
#5 0x00007fe5ffd02aaf in ggml_abort () from /home/threaded/projects/llama.cpp/build/bin/libggml-base.so.0
#6 0x00007fe5fc8df839 in ggml_vk_mul_mat_vec_q_f16(ggml_backend_vk_context*, std::shared_ptr<vk_context_struct>&, ggml_cgraph const*, int) () from /home/threaded/projects/llama.cpp/build/bin/libggml-vulkan.so.0
#7 0x00007fe5fc90d1f0 in ggml_vk_build_graph(ggml_backend_vk_context*, ggml_cgraph*, int, ggml_tensor*, int, bool, bool, bool) () from /home/threaded/projects/llama.cpp/build/bin/libggml-vulkan.so.0
#8 0x00007fe5fc90f862 in ggml_backend_vk_graph_compute(ggml_backend*, ggml_cgraph*) () from /home/threaded/projects/llama.cpp/build/bin/libggml-vulkan.so.0
#9 0x00007fe5ffd1cde3 in ggml_backend_sched_graph_compute_async () from /home/threaded/projects/llama.cpp/build/bin/libggml-base.so.0
#10 0x00007fe5ffa384d0 in llama_context::graph_compute(ggml_cgraph*, bool) () from /home/threaded/projects/llama.cpp/build/bin/libllama.so.0
#11 0x00007fe5ffa3a162 in llama_context::process_ubatch(llama_ubatch const&, llm_graph_type, llama_memory_context_i*, ggml_status&) () from /home/threaded/projects/llama.cpp/build/bin/libllama.so.0
#12 0x00007fe5ffa3f03f in llama_context::decode(llama_batch const&) () from /home/threaded/projects/llama.cpp/build/bin/libllama.so.0
#13 0x00007fe5ffa3fe8e in llama_decode () from /home/threaded/projects/llama.cpp/build/bin/libllama.so.0
#14 0x000000000042f2b0 in main ()
[Inferior 1 (process 15299) detached]
Aborted (core dumped) ./llama.cpp/build/bin/llama-cli -hf unsloth/Kimi-K2-Thinking-GGUF:UD-Q4_K_XL -ot ".ffn_.*_exps.=CPU" --special