Description
Name and Version
llama-cli --version
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = llvmpipe (LLVM 19.1.3, 256 bits) (llvmpipe) | uma: 0 | fp16: 1 | warp size: 8 | matrix cores: none
ggml_vulkan: Warning: Device type is CPU. This is probably not the device you want.
version: 4607 (aa6fb13)
built with cc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-2) for x86_64-redhat-linux
Operating systems
Linux
GGML backends
Vulkan
Hardware
When we run llama-server in a Podman container, it ignores kill -TERM and kill -INT, whether the signal is sent from inside or outside the container.
Models
Granite, but I believe this has nothing to do with the model.
Problem description & steps to reproduce
llama-server --port 8080 -m /mnt/models/model.file -c 2048 --temp 0.8 -ngl -1 --host 0.0.0.0
The same hang can be reproduced with ramalama:

/bin/ramalama --image quay.io/ramalama/vulkan bench granite
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Intel(R) Graphics (RPL-S) (Intel open-source Mesa driver) | uma: 1 | fp16: 1 | warp size: 32 | matrix cores: none

| model | size | params | backend | ngl | test | t/s |
| --- | --- | --- | --- | --- | --- | --- |

The bench stalls after printing the table header; typing ^C^C produces no response.

First Bad Commit

Not identified.

Relevant log output

None