Name and Version
Operating systems
Linux
Which llama.cpp modules do you know to be affected?
libllama (core library)
Command line
./llama-server --grammar-file fuck-emdashes.gbnf -ngl 99 --host 0.0.0.0 -c 190000 -m Qwen3-4B-Instruct-2507-Q8_0.gguf -fa auto -cram 0 --slots --slot-save-path kv/qwen3-4b --no-mmap
Problem description & steps to reproduce
slot update_slots: id 1 | task 9700 | prompt processing progress, n_tokens = 6144, batch.n_tokens = 2048, progress = 0.033170
decode: failed to find a memory slot for batch of size 2048
srv try_purge_id: purging slot 2 with 1724 tokens
srv update_slots: failed to find free space in the KV cache, retrying with smaller batch size, i = 0, n_batch = 2048, ret = 1
decode: failed to find a memory slot for batch of size 2048
srv try_purge_id: purging slot 3 with 184145 tokens
srv update_slots: failed to find free space in the KV cache, retrying with smaller batch size, i = 0, n_batch = 2048, ret = 1
slot update_slots: id 1 | task 9700 | n_tokens = 6144, memory_seq_rm [6144, end)
slot update_slots: id 1 | task 9700 | prompt processing progress, n_tokens = 8192, batch.n_tokens = 2048, prog
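The exact requests aren't shown above, but a loop along these lines hits the same path (the /completion endpoint and the id_slot, cache_prompt, and n_predict fields are standard llama-server request fields; the prompt file, request count, and port are placeholders):

# Hypothetical reproduction loop: repeated long-prompt completions against one slot.
# long-prompt.txt is a placeholder and must be JSON-safe (no unescaped quotes/newlines).
for i in $(seq 1 5); do
  curl -s http://localhost:8080/completion \
    --data '{"id_slot": 1, "cache_prompt": true, "n_predict": 32, "prompt": "'"$(cat long-prompt.txt)"'"}' \
    > /dev/null
done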
There aren't any other slots in use, yet after 4-5 requests the server seemingly at random decides it is out of KV cache space and purges the slot's cached content.
Could you please not do that? Best regards.
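Slot occupancy can be double-checked through the /slots endpoint, which the --slots flag in the command line above enables:

# Returns per-slot JSON state, including whether each slot is currently processing.
curl -s http://localhost:8080/slots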
Also, you might as well revert the RAM caching misfeature: tmpfs (or any other RAM disk) already exists for this, and llama.cpp can't detect the total memory size anyway, so the server just gets OOM-killed every time.
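For example, a tmpfs mount pointed at --slot-save-path keeps the saved KV state RAM-resident without any in-process RAM cache (the mount point and size below are illustrative, not from the original report):

# Illustrative only: back the slot save path with tmpfs instead of the built-in RAM cache.
sudo mount -t tmpfs -o size=32G tmpfs /mnt/kv
./llama-server -m Qwen3-4B-Instruct-2507-Q8_0.gguf --slots --slot-save-path /mnt/kv/qwen3-4b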
First Bad Commit
No response
Relevant log output