Misc. bug: server/rerank output is wrong with most models, including Qwen3-Reranker #16407

### Name and Version

`llama-server --version`
```
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 4090, compute capability 8.9, VMM: yes
version: 6674 (5113efd3)
built with cc (Ubuntu 12.4.0-2ubuntu1~24.04) 12.4.0 for x86_64-linux-gnu
```

### Operating systems

Linux

### Which llama.cpp modules do you know to be affected?

llama-server

### Command line

```shell
llama-server -m Qwen3-Reranker-8B-q4_k_s.gguf -c 4096 -ngl 99  --host 0.0.0.0 --port 8181 --prio 2 --no-webui -ctk q4_0 -ctv q4_0 -fa auto --rerank
```

### Problem description & steps to reproduce

Reranking gives bad results with Qwen3-Reranker (0.6B, 4B, 8B), bge, and mxbai. The server is started with:
`llama-server -m Qwen3-Reranker-8B-q4_k_s.gguf -c 4096 -ngl 99  --host 0.0.0.0 --port 8181 --prio 2 --no-webui -ctk q4_0 -ctv q4_0 -fa auto --rerank`

I test with this shell script:
```shell
#!/bin/bash

# Set default URL if not provided
URL=${1:-http://127.0.0.1:8181}

curl "$URL/v1/rerank" -H "Content-Type: application/json" \
 -d '{ "model": "M", "query": "What is the recipe to make bread ?",
 "return_text" : false,
 "texts" : true,
 "top_n": 6,"documents": [
 "voici la recette pour faire du pain, il faut de la farine de l eau et du levain et du sel",
 "it is a bear",
 "bread recipe : floor, water, yest, salt",
 "The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear or simply panda, is a bear species endemic to China.",
 "here is the ingedients to bake bread : 500g floor, 350g water, 120g fresh refresh yest, 15g salt",
 "recipe to make cookies : floor, eggs, water, chocolat",
 "here is the recipe to make bread : 500g floor, 350g water, 120g fresh refresh yest, 15g salt",
 "il fait tres beau aujourd hui",
 "je n ai pas faim, je ne veux pas manger",
 "je suis a paris"
 ] }' | jq
```
I always get results like this, with near-zero scores and an ordering that does not match relevance:
```json
[
  {
    "index": 5,
    "score": 1.1353239058953662E-28
  },
  {
    "index": 3,
    "score": 3.111864641067425E-29
  },
  {
    "index": 8,
    "score": 2.3408178355156106E-29
  },
  {
    "index": 9,
    "score": 2.6804792039427738E-30
  },
  {
    "index": 1,
    "score": 6.065239931987211E-32
  },
  {
    "index": 2,
    "score": 5.335152733382172E-32
  }
]
```

The only model that seems to work with llama-server is jina-reranker:
`llama-server -m jina-reranker-v2-base-multilingual-Q8_0.gguf -c 16000 -ngl 99  --host 0.0.0.0 --port 8181 --prio 2 --no-webui -ctk q4_0 -ctv q4_0 -fa auto --rerank`
The result is not too bad, but it is very different from https://jina.ai/reranker/.
Result with llama-server:
```json
[
  {
    "index": 6,
    "score": 0.7979143261909485
  },
  {
    "index": 0,
    "score": 0.3886369466781616
  },
  {
    "index": 2,
    "score": 0.2865810990333557
  },
  {
    "index": 4,
    "score": -0.5105927586555481
  },
  {
    "index": 5,
    "score": -1.9573085308074951
  },
  {
    "index": 8,
    "score": -3.1544036865234375
  }
]
```
Result from `https://jina.ai/reranker/` with the same model and the same content:
```json
[
    {
      "index": 6,
      "relevance_score": 0.69595832
    },
    {
      "index": 0,
      "relevance_score": 0.60346454
    },
    {
      "index": 2,
      "relevance_score": 0.5559175
    },
    {
      "index": 4,
      "relevance_score": 0.3684057
    },
    {
      "index": 5,
      "relevance_score": 0.12085322
    },
    {
      "index": 7,
      "relevance_score": 0.04146227
    }
  ]
```
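
The two jina-reranker lists agree on the first five indices (6, 0, 2, 4, 5) and differ mainly in scale: llama-server returns negative values, while the hosted API stays within 0 to 1. One possible explanation, which is only an assumption and not confirmed anywhere in this report, is that llama-server returns raw logits while the hosted API returns sigmoid-normalized scores. The sketch below applies a logistic sigmoid to the llama-server scores so both lists can be compared on the same scale. The `sigmoid` helper is a hypothetical name introduced for illustration.

```shell
#!/bin/bash

# Hypothetical helper: logistic sigmoid, maps a raw logit into the (0, 1) range.
sigmoid() {
  awk -v x="$1" 'BEGIN { printf "%.4f\n", 1 / (1 + exp(-x)) }'
}

# Scores copied from the llama-server jina-reranker output (indices 6, 0, 2, 4, 5, 8).
# Assumption: these are raw logits; this is not confirmed by the report.
for s in 0.7979143261909485 0.3886369466781616 0.2865810990333557 \
         -0.5105927586555481 -1.9573085308074951 -3.1544036865234375; do
  echo "$s -> $(sigmoid "$s")"
done
```

If the assumption holds, the transformed values should land close to the `relevance_score` values from the hosted API (the first one comes out at roughly 0.69 versus 0.696), with the remaining gap possibly due to the Q8_0 quantization. This would not explain the near-zero Qwen3-Reranker scores, which are a separate symptom.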


### First Bad Commit

_No response_

### Relevant log output

```shell

```
