
Misc. bug: webui: allow viewing conversations in local storage and sending messages even if llama-server is down #16120


Description

@ServeurpersoCom

Name and Version

Minor.

The webui should not block the user interface when llama-server is stopped. Conversations in local storage should remain visible and sending a message should still be possible, so that workflows such as llama-swap's dynamic loading of model weights keep working, as they did with the old webui.

(This is also useful to keep the GPU free for other tasks, such as image generation.)
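For illustration, here is a minimal TypeScript sketch of that behavior; STORAGE_KEY, Conversation, loadConversations and sendMessage are hypothetical names, not the actual webui storage layer. The point is that reading the history never touches the network, and a failed send leaves the drafted message persisted.

```ts
// Hypothetical sketch: STORAGE_KEY, Conversation, loadConversations and sendMessage
// are illustrative names, not the actual webui storage layer.
interface ChatMessage {
  role: 'user' | 'assistant';
  content: string;
}

interface Conversation {
  id: string;
  title: string;
  messages: ChatMessage[];
}

const STORAGE_KEY = 'webui-conversations';

// Reading the conversation list never touches the network, so the sidebar and
// chat history can render even while llama-server is down.
export function loadConversations(): Conversation[] {
  try {
    const raw = localStorage.getItem(STORAGE_KEY);
    return raw ? (JSON.parse(raw) as Conversation[]) : [];
  } catch {
    return []; // corrupted cache: show an empty history instead of blocking the UI
  }
}

function persist(conv: Conversation): void {
  const others = loadConversations().filter((c) => c.id !== conv.id);
  localStorage.setItem(STORAGE_KEY, JSON.stringify([...others, conv]));
}

// The user message is persisted before the request, so it survives a failed send
// (e.g. llama-swap is still loading weights) and can be retried later.
export async function sendMessage(conv: Conversation, content: string): Promise<void> {
  conv.messages.push({ role: 'user', content });
  persist(conv);
  try {
    await fetch('/v1/chat/completions', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ messages: conv.messages, stream: true }),
    });
    // (streaming response handling omitted in this sketch)
  } catch {
    // Backend unreachable: keep the UI usable instead of showing an error splash.
  }
}
```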

A proposal that worked for me: master...ServeurpersoCom:llama.cpp:webui-cache-offline-chat

@allozaur

webui: allow viewing conversations and sending messages even if llama-server is down

  • Cached llama.cpp server properties in browser localStorage on startup, persisting successful fetches and reloading them when refresh attempts fail so the chat UI continues to render while the backend is unavailable (see the sketch after this list).
  • Cleared the stored server properties when resetting the store to prevent stale capability data after cache-backed operation.
  • Kept the original error-splash behavior when no cached props exist so fresh installs still surface a clear failure state instead of rendering stale data.
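A minimal TypeScript sketch of that caching flow, assuming a hypothetical PROPS_CACHE_KEY and helper names (fetchServerProps, clearServerPropsCache); the actual implementation in the linked branch may differ.

```ts
// Hypothetical sketch of the /props caching flow; PROPS_CACHE_KEY and the helper
// names are illustrative, not the code from the linked branch.
type ServerProps = Record<string, unknown>;

const PROPS_CACHE_KEY = 'llama-server-props';

// Persist a successful /props fetch; on failure fall back to the cached copy.
// When no cached copy exists, rethrow so a fresh install still gets the error splash.
export async function fetchServerProps(): Promise<ServerProps> {
  try {
    const res = await fetch('/props');
    if (!res.ok) throw new Error(`Failed to fetch server props: ${res.status}`);
    const props = (await res.json()) as ServerProps;
    localStorage.setItem(PROPS_CACHE_KEY, JSON.stringify(props));
    return props;
  } catch (err) {
    const cached = localStorage.getItem(PROPS_CACHE_KEY);
    if (cached !== null) return JSON.parse(cached) as ServerProps;
    throw err; // no cache yet: surface the failure rather than render stale data
  }
}

// Clear the cached props together with the store reset to avoid stale capability data.
export function clearServerPropsCache(): void {
  localStorage.removeItem(PROPS_CACHE_KEY);
}
```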

Another possible approach would be to create a blank fallback state independent of the /props endpoint, instead of persisting the last /props response in localStorage. The goal is to keep the solution generic rather than tied to llama-swap specifically, while still remaining compatible with it. The /v1/models endpoint is always active with llama-swap but inactive in standalone llama-server, and I successfully tested the offline webui behavior in standalone mode as well. For my model-swap usage, I simply refresh the /props data in the UI on the first streaming chunk, and it works.
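A sketch of this alternative under the same caveats: DEFAULT_PROPS, loadProps and refreshPropsOnFirstChunk are hypothetical names, and the blank fallback plus the one-shot refresh on the first streaming chunk only illustrate the idea.

```ts
// Hypothetical sketch: DEFAULT_PROPS, loadProps and refreshPropsOnFirstChunk are
// illustrative names; nothing here is tied to llama-swap specifically.
type ServerProps = Record<string, unknown>;

// Neutral defaults used when /props has never been fetched, so the chat view can
// render without any server-derived capabilities.
const DEFAULT_PROPS: ServerProps = {};

export async function loadProps(): Promise<ServerProps> {
  try {
    const res = await fetch('/props');
    return res.ok ? ((await res.json()) as ServerProps) : DEFAULT_PROPS;
  } catch {
    return DEFAULT_PROPS; // server down: start from the blank fallback state
  }
}

// Refresh the props once, when the first streaming chunk arrives, which covers the
// case where llama-swap has loaded a different model since the UI was opened.
let refreshed = false;
export async function refreshPropsOnFirstChunk(apply: (p: ServerProps) => void): Promise<void> {
  if (refreshed) return;
  refreshed = true;
  try {
    const res = await fetch('/props');
    if (res.ok) apply((await res.json()) as ServerProps);
  } catch {
    refreshed = false; // refresh failed: allow another attempt on a later chunk
  }
}
```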

I like the integrated webui because it keeps everything light: I can follow llama.cpp development in real time with a simple "git pull / build / automation" loop, and quickly download and integrate new model releases via llama-swap YAML to test compatibility, compare models, and evaluate quantizations, without waiting for other wrappers or stacks to update.

The video shows two setups:

  • llama.cpp / llama-swap / patched svelte-webui / reverse proxy / llama-server offline test / real web domain
  • Raspberry Pi 5 with llama.cpp / patched webui / reverse proxy / LAN IP (Qwen3-30B A3B @ 5 tok/s)
Llama.cpp-llama-swap-webui.mp4

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

llama-server

Command line

Problem description & steps to reproduce

Shut down llama-server while the client is hosted behind a production reverse proxy.

First Bad Commit

Since the Svelte WebUI refactor

Relevant log output

Error fetching server props: Error: Failed to fetch server props: 503
