Name and Version
Minor.
The webui should not block the user interface when llama-server is stopped. Conversations stored in local storage should remain visible, and sending a message should still be possible, so that workflows like llama-swap's dynamic loading of model weights keep working as they did with the old webui.
(This also keeps the GPU free for other tasks, such as image generation.)
Proposal that works for me: master...ServeurpersoCom:llama.cpp:webui-cache-offline-chat
webui: allow viewing conversations and sending messages even if llama-server is down
- Cached llama.cpp server properties in browser localStorage on startup, persisting successful fetches and reloading them when refresh attempts fail so the chat UI continues to render while the backend is unavailable.
- Cleared the stored server properties when resetting the store to prevent stale capability data after cache-backed operation.
- Kept the original error-splash behavior when no cached props exist, so fresh installs still surface a clear failure state instead of rendering stale data (a sketch of this behavior follows below).
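A minimal sketch of that caching behavior, assuming illustrative identifiers (`PROPS_CACHE_KEY`, `loadServerProps`, `clearServerPropsCache`) rather than the actual webui code:

```ts
// Illustrative cache key; not the real webui storage key.
const PROPS_CACHE_KEY = 'llama-server-props';

type ServerProps = Record<string, unknown>;

async function loadServerProps(baseUrl: string): Promise<ServerProps | null> {
  try {
    const res = await fetch(`${baseUrl}/props`);
    if (!res.ok) throw new Error(`Failed to fetch server props: ${res.status}`);
    const props = (await res.json()) as ServerProps;
    // Persist the successful fetch so later failures can fall back to it.
    localStorage.setItem(PROPS_CACHE_KEY, JSON.stringify(props));
    return props;
  } catch (err) {
    console.error('Error fetching server props:', err);
    // Backend unavailable (e.g. llama-swap has unloaded the weights):
    // reuse the last known props so the chat UI keeps rendering.
    const cached = localStorage.getItem(PROPS_CACHE_KEY);
    return cached ? (JSON.parse(cached) as ServerProps) : null; // null -> error splash
  }
}

// Called when the store is reset, so stale capability data is not reused.
function clearServerPropsCache(): void {
  localStorage.removeItem(PROPS_CACHE_KEY);
}
```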
Another possible approach would be to create a blank fallback state independent of the /props endpoint, instead of persisting the last /props response in localStorage. The goal is to keep the solution more generic and not tied to llama-swap specifically, while still remaining compatible with it. The /v1/models endpoint is always active with llama-swap but inactive in standalone llama-server, and I successfully tested the webui offline behavior in standalone mode as well. For my model-swap usage, I simply refresh the UI's /props state on the first streaming chunk, and it works.
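A rough sketch of that alternative, using hypothetical names (`BLANK_PROPS`, `refreshServerProps`, `onStreamChunk`) rather than the actual webui hooks:

```ts
type ServerProps = Record<string, unknown>;

// Blank fallback state: just enough for the UI to render without a backend,
// independent of any previously cached /props response.
const BLANK_PROPS: ServerProps = {};

let serverProps: ServerProps = BLANK_PROPS;
let propsRefreshed = false;

async function refreshServerProps(baseUrl: string): Promise<void> {
  const res = await fetch(`${baseUrl}/props`);
  if (!res.ok) throw new Error(`Failed to fetch server props: ${res.status}`);
  serverProps = (await res.json()) as ServerProps;
  propsRefreshed = true;
}

// Invoked for each streaming chunk of a chat completion. The first chunk
// proves the backend is up (llama-swap has loaded the model), so it is a
// safe point to refresh the UI's /props state.
async function onStreamChunk(baseUrl: string, chunk: string): Promise<void> {
  if (!propsRefreshed) {
    await refreshServerProps(baseUrl).catch(() => {
      /* keep the blank fallback if the refresh fails */
    });
  }
  // ...append chunk to the active conversation...
}
```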
I like the integrated webui because it keeps everything light: I can follow llama.cpp development in real time with a simple git pull / build / automation loop, and quickly download and integrate new model releases via the llama-swap YAML config to test compatibility, compare models, and evaluate quantizations, without waiting for other wrappers or stacks to update.
The video shows two servers:
- llama.cpp / llama-swap / patched svelte-webui / reverse proxy / llama-server offline test / real web domain
- Raspberry Pi 5 with llama.cpp / patched webui / reverse proxy / LAN IP (Qwen3-30B A3B @ 5 tok/s)
Llama.cpp-llama-swap-webui.mp4
Operating systems
Linux
Which llama.cpp modules do you know to be affected?
llama-server
Command line
Problem description & steps to reproduce
Shut down llama-server while hosting the webui client behind a production reverse proxy.
First Bad Commit
Since the Svelte WebUI refactor.
Relevant log output
Error fetching server props: Error: Failed to fetch server props: 503