Eval bug: MTP and NTP output is random garbage

### Name and Version

bee llama-server v0.3.0

### Operating systems

Windows

### GGML backends

CUDA

### Hardware

Ryzen 7500x + RTX 5060 Ti 16 GB

### Models

Qwen3.6-35B-A3B-UD-IQ3_XXS-MTP.gguf

### Problem description & steps to reproduce

MTP output is short and contains random garbage only. Upstream llamacpp works ok with the same settings.

### First Bad Commit

_No response_

### Relevant log output

<details>
<summary>Logs</summary>


```console
llama-server ^
  -m "e:\LMStudio_Models\models\unsloth\Qwen3.6-35B-A3B-MTP-GGUF\Qwen3.6-35B-A3B-UD-IQ3_XXS.gguf" ^
  --alias "Qwen3.6" ^
  --host 127.0.0.1 --port 8001 ^
  --ctx-size 60000 ^
  --fit off ^
  --n-gpu-layers 999 ^
  --temp 0.6 --top-p 0.95 --top-k 20 --min-p 0.0 ^
  --presence-penalty 0.0 --repeat-penalty 1.0 ^
  -ctk q8_0 -ctv q8_0 ^
  --flash-attn on ^
  --batch-size 512 --ubatch-size 256 ^
  --threads 8 --threads-batch 8 ^
  --no-mmap --mlock ^
  --parallel 1 --prio 2 ^
  --spec-type draft-mtp --spec-draft-n-max 3 ^
  --spec-draft-ngl 999 ^
  --log-verbosity 3 ^
  --metrics ^
  --log-colors off ^
  --ctx-checkpoints 0 ^
  --cache-ram 0 ^
  --jinja
```
</details>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Eval bug: MTP and NTP output is random garbage #36

Name and Version

Operating systems

GGML backends

Hardware

Models

Problem description & steps to reproduce

First Bad Commit

Relevant log output

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Uh oh!

Eval bug: MTP and NTP output is random garbage #36

Description

Name and Version

Operating systems

GGML backends

Hardware

Models

Problem description & steps to reproduce

First Bad Commit

Relevant log output

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions