### Name and Version

```
$ ./build/bin/llama-server --version
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 ROCm devices:
  Device 0: AMD Radeon RX 7900 XTX, gfx1100 (0x1100), VMM: no, Wave Size: 32
version: 6793 (38355c6)
built with cc (GCC) 15.2.1 20250813 for x86_64-pc-linux-gnu
```
Built with:

```sh
cmake -S . -B build \
  -DGGML_HIP=ON \
  -DAMDGPU_TARGETS=gfx1100 \
  -DCMAKE_BUILD_TYPE=Release \
  -DGGML_NATIVE=ON \
  -DGGML_HIP_ROCWMMA_FATTN=ON \
  -DGGML_HIP_GRAPHS=ON
```
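followed by the usual CMake build step (the exact invocation below is an assumption; any equivalent `cmake --build` call applies, and Release was already selected at configure time):

```sh
# Assumed build step following the configure command above;
# -j parallelizes the build across all available cores.
cmake --build build -j
```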
The issue also occurs with the HIP build on Windows using the same hardware.
### Operating systems
Linux (and Windows)
### GGML backends
HIP
### Hardware
Radeon RX 7900 XTX
### Models

Qwen3-30B-A3B-Thinking-2507, Q4_K_XL (Unsloth)
### Problem description & steps to reproduce
When I run

```sh
llama-server \
  --threads 12 \
  --gpu-layers 99 \
  --flash-attn auto \
  --jinja \
  --hf-repo unsloth/Qwen3-30B-A3B-Thinking-2507-GGUF:Q4_K_XL \
  --ctx-size 40960 \
  --temp 0.6 \
  --top-k 20 \
  --top-p 0.95 \
  --min-p 0.0 \
  --ubatch-size 2048
```
on b6792, the output is as expected: responses are well thought out, detailed, and somewhat lengthy.

When I run the same command on b6793, I get shorter answers with less accurate and less detailed information, and the model is also less inclined to format its output with Markdown.
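The difference shows up in ordinary chat completions, so a single request is enough to compare the two builds side by side. A minimal sketch, assuming the default `127.0.0.1:8080` address of llama-server's OpenAI-compatible endpoint (the prompt is an arbitrary example):

```sh
# Send one chat completion to the running server and compare the
# response length, detail, and formatting between b6792 and b6793.
curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Compare B-trees and LSM trees."}]}'
```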
### First Bad Commit

Not yet bisected to a single commit; b6792 is the last known-good release build and b6793 (38355c6) is the first known-bad one.
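Since the regression is bracketed by two adjacent release tags, the offending commit could be pinned down with a bisect. A sketch, assuming the `b6792`/`b6793` tags are present in the local clone and that the degraded output reproduces reliably at each step:

```sh
# Bisect between the first known-bad and last known-good release tags.
git bisect start b6793 b6792
# At each step: rebuild, re-run the server command above, judge the
# response quality, then mark the commit before continuing.
cmake --build build -j
git bisect good   # or: git bisect bad
```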
### Relevant log output
N/A, logs look normal.