
Eval bug: llama server response hangs for /slots/0?action=erase #17387

@bmahabirbu

Description


Name and Version

b52edd2

Operating systems

Linux

GGML backends

CPU

Hardware

Mac M4

Models

llama3.2

Problem description & steps to reproduce

When running llama-server via ramalama (which runs llama.cpp inside a container), with `--slot-save-path /tmp` passed to enable the slots feature, the command `curl -X POST "http://localhost:8080/slots/0?action=erase"` hangs until I press Ctrl-C. Only then does the response appear on the server side, and it is never received by the curl command. I also tried running the command from inside the container to rule out networking issues, but it still hangs.

My goal is to clear the prompt cache for a summarization feature: when the context size is reached, clear the cache, summarize the history, and feed the summary back in. The workaround is to specify a small request timeout, but this seems like a bug.
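For reference, a minimal sketch of the timeout workaround (the `erase_slot` helper name and client code are my own, not part of llama.cpp; it assumes the erase still completes server-side even though the response never arrives, as the server log suggests):

```python
# Sketch: POST to the slot-erase endpoint with a short timeout and treat
# a timeout as "request sent", since the server appears to process the
# erase even when the HTTP response is never delivered.
import json
import urllib.error
import urllib.request


def erase_slot(base_url: str, slot_id: int = 0, timeout: float = 2.0):
    """POST /slots/<id>?action=erase; return parsed JSON, or None on timeout."""
    req = urllib.request.Request(
        f"{base_url}/slots/{slot_id}?action=erase", method="POST"
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return json.loads(resp.read())
    except (TimeoutError, urllib.error.URLError):
        # The erase still happens server-side; we just never see the reply.
        return None
```

On an affected server this returns `None` after the timeout instead of blocking the caller indefinitely.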

ramalama latest llama.cpp commit = b52edd2

First Bad Commit

No response

Relevant log output

bmahabir@bmahabir-mac ramalama % curl -X POST "http://localhost:8080/slots/0?action=erase"
^C
bmahabir@bmahabir-mac ramalama % 


srv  remove_waiti: remove task 9 from waiting list. current waiting = 1 (before remove)
srv  log_server_r: request: POST /slots/0 192.168.127.1 200
srv  log_server_r: request:  
srv  log_server_r: response: {"id_slot":0,"n_erased":43}

The server log lines only appear after the Ctrl-C; something is hanging inside llama-server.

Metadata

Labels

bug: Something isn't working
help wanted: Needs help from the community
medium severity: Used to report medium severity bugs in llama.cpp (e.g. malfunctioning features but still usable)
server/api
