Name and Version
Operating systems
Linux
GGML backends
CPU
Hardware
Mac M4
Models
llama3.2
Problem description & steps to reproduce
When running llama-server through ramalama (which runs llama.cpp inside a container) with the argument --slot-save-path /tmp to enable the slots feature, the command curl -X POST "http://localhost:8080/slots/0?action=erase" hangs until I press Ctrl-C; only then does the response show up on the server side, and it is never received by curl. I tried issuing the request from inside the container as well to rule out networking issues, but it still hangs.
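Roughly how this is set up, as a minimal sketch outside ramalama (the model path and port are illustrative; in practice ramalama launches llama-server inside the container):

# start llama-server with the slots feature enabled, as in the report
# (model path is illustrative)
llama-server -m ./llama3.2.gguf --port 8080 --slot-save-path /tmp

# in another shell, this request hangs until curl is interrupted with Ctrl-C
curl -X POST "http://localhost:8080/slots/0?action=erase"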
My goal is to clear the prompt cache for a summarization feature: when the context size is reached, clear the cache, summarize the history, and feed the summary back in. The workaround is to specify a small client-side timeout on the request (sketched below), but this seems like a bug.
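A sketch of that workaround (the 5-second value is arbitrary):

# bound the erase request with a client-side timeout so the
# summarization flow is not blocked indefinitely
curl -X POST --max-time 5 "http://localhost:8080/slots/0?action=erase"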
ramalama latest; llama.cpp commit b52edd2
First Bad Commit
No response
Relevant log output
bmahabir@bmahabir-mac ramalama % curl -X POST "http://localhost:8080/slots/0?action=erase"
^C
bmahabir@bmahabir-mac ramalama %
srv remove_waiti: remove task 9 from waiting list. current waiting = 1 (before remove)
srv log_server_r: request: POST /slots/0 192.168.127.1 200
srv log_server_r: request:
srv log_server_r: response: {"id_slot":0,"n_erased":43}
The server log output only appears after the Ctrl-C; something is hanging inside llama-server.