Eval bug: How to use Gemma 31B QAT with MTP?

### Name and Version

Hi, I'm trying to use Gemma 31B with the MTP assistant, but I'm having trouble.

### Operating systems

Windows

### GGML backends

CUDA

### Hardware

i5 13600KF and RTX 4090

### Models

_No response_

### Problem description & steps to reproduce

llama-server.exe ^
  -m "C:\Users\ajcn2\Desktop\Lamma.cpp\Models\gemma-4-31B-it-qat-UD-Q4_K_XL.gguf" ^
  --model-draft "C:\Users\ajcn2\Desktop\Lamma.cpp\Models\gemma-4-31b-it-qat-q4_0-assistant.gguf" ^
  --spec-type draft-mtp ^
  --spec-draft-n-max 2 ^
  --mmproj "C:\Users\ajcn2\Desktop\Lamma.cpp\Models\mmproj-BF16.gguf" ^
  --no-mmproj-offload ^
  --chat-template-file "C:\Users\ajcn2\Desktop\Lamma.cpp\Models\custom_pub_chat_template_gemma4.jinja" ^
  --port 5051 ^
  -np 1 ^
  -ngl 99 ^
  -ngld 99 ^
  -b 2048 -ub 512 ^
  --ctx-size 32768 ^
  --cache-type-k q8_0 --cache-type-v q8_0 ^
  --flash-attn on ^
  --jinja ^
  --no-mmap --mlock ^
  --no-host ^
  --reasoning on ^
  --temp 1.0 --top-k 64 --top-p 0.95 --min-p 0.0
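
The report does not say what actually fails, so one cheap check before launching is to confirm that every file path passed on the command line exists. The following is a minimal Python sketch of that check, not part of llama.cpp: `missing_files` and `PATH_FLAGS` are hypothetical helper names, and the list of path-taking flags is simply copied from the command above.

```python
from pathlib import Path

# Flags from the command above whose value is a file path
# (assumption: copied from this report, not from llama-server's option table).
PATH_FLAGS = ("-m", "--model-draft", "--mmproj", "--chat-template-file")


def missing_files(args):
    """Return (flag, path) pairs for path-taking flags whose file does not exist."""
    missing = []
    # Walk the argument list as (flag, value) neighbours.
    for flag, value in zip(args, args[1:]):
        if flag in PATH_FLAGS and not Path(value).is_file():
            missing.append((flag, value))
    return missing


if __name__ == "__main__":
    models = Path(r"C:\Users\ajcn2\Desktop\Lamma.cpp\Models")
    args = [
        "-m", str(models / "gemma-4-31B-it-qat-UD-Q4_K_XL.gguf"),
        "--model-draft", str(models / "gemma-4-31b-it-qat-q4_0-assistant.gguf"),
        "--mmproj", str(models / "mmproj-BF16.gguf"),
        "--chat-template-file", str(models / "custom_pub_chat_template_gemma4.jinja"),
    ]
    for flag, path in missing_files(args):
        print(f"{flag}: file not found: {path}")
```

If the script prints nothing, all four files were found and the problem lies elsewhere; in that case the server's console output would be the most useful thing to add under "Relevant log output".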


### First Bad Commit

_No response_

### Relevant log output

<details>
<summary>Logs</summary>


```console

```
</details>

Eval bug: How to use Gemma 31B QAT with MTP? #64
