
Eval bug: llama-server ignores SIGINT and SIGTERM when running in a container. #11742

@rhatdan

Description


Name and Version

llama-cli --version
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = llvmpipe (LLVM 19.1.3, 256 bits) (llvmpipe) | uma: 0 | fp16: 1 | warp size: 8 | matrix cores: none
ggml_vulkan: Warning: Device type is CPU. This is probably not the device you want.
version: 4607 (aa6fb13)
built with cc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-2) for x86_64-redhat-linux

Operating systems

Linux

GGML backends

Vulkan

Hardware

When we run llama-server in a podman container, it ignores kill -TERM and kill -INT, whether the signal is sent from inside the container or from the host.

Models

Granite, but I believe this has nothing to do with the model.

Problem description & steps to reproduce

llama-server --port 8080 -m /mnt/models/model.file -c 2048 --temp 0.8 -ngl -1 --host 0.0.0.0

First Bad Commit

/bin/ramalama --image quay.io/ramalama/vulkan bench granite
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Intel(R) Graphics (RPL-S) (Intel open-source Mesa driver) | uma: 1 | fp16: 1 | warp size: 32 | matrix cores: none

| model | size | params | backend | ngl | test | t/s |
^C^C

Relevant log output

None.
