Name and Version
Minor.
The webui should not block the user interface when llama-server is stopped. Conversations stored in local storage should remain visible, and sending a message should still be possible, so that workflows like llama-swap's dynamic loading of model weights keep working as they did with the old webui.
(This also keeps the GPU free for other tasks, such as image generation.)
Proposal that works for me: master...ServeurpersoCom:llama.cpp:webui-cache-offline-chat
webui: allow viewing conversations and sending messages even if llama-server is down
- Cached llama.cpp server properties in browser localStorage on startup, persisting successful fetches and reloading them when refresh attempts fail so the chat UI continues to render while the backend is unavailable.
- Cleared the stored server properties when resetting the store to prevent stale capability data after cache-backed operation.
- Kept the original error-splash behavior when no cached props exist, so fresh installs still surface a clear failure state instead of rendering stale data (a sketch of this behavior follows below).
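A minimal sketch of that caching behavior, assuming illustrative identifiers (`PROPS_CACHE_KEY`, `loadServerProps`, `clearServerPropsCache`) rather than the actual webui code:

```ts
// Illustrative cache key; not the real webui storage key.
const PROPS_CACHE_KEY = 'llama-server-props';

type ServerProps = Record<string, unknown>;

async function loadServerProps(baseUrl: string): Promise<ServerProps | null> {
  try {
    const res = await fetch(`${baseUrl}/props`);
    if (!res.ok) throw new Error(`Failed to fetch server props: ${res.status}`);
    const props = (await res.json()) as ServerProps;
    // Persist the successful fetch so later failures can fall back to it.
    localStorage.setItem(PROPS_CACHE_KEY, JSON.stringify(props));
    return props;
  } catch (err) {
    console.error('Error fetching server props:', err);
    // Backend unavailable (e.g. llama-swap has unloaded the weights):
    // reuse the last known props so the chat UI keeps rendering.
    const cached = localStorage.getItem(PROPS_CACHE_KEY);
    return cached ? (JSON.parse(cached) as ServerProps) : null; // null -> error splash
  }
}

// Called when the store is reset, so stale capability data is not reused.
function clearServerPropsCache(): void {
  localStorage.removeItem(PROPS_CACHE_KEY);
}
```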
Another possible approach would be to create a blank fallback state independent of the /props endpoint, instead of persisting the last /props response in localStorage. The goal is to keep the solution more generic and not tied to llama-swap specifically, while still remaining compatible with it. The /v1/models endpoint is always active with llama-swap but inactive in standalone llama-server, and I successfully tested the webui offline behavior in standalone mode as well. For my model-swap usage, I simply refresh the UI's /props state on the first streaming chunk, and it works.
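A rough sketch of that alternative, using hypothetical names (`BLANK_PROPS`, `refreshServerProps`, `onStreamChunk`) rather than the actual webui hooks:

```ts
type ServerProps = Record<string, unknown>;

// Blank fallback state: just enough for the UI to render without a backend,
// independent of any previously cached /props response.
const BLANK_PROPS: ServerProps = {};

let serverProps: ServerProps = BLANK_PROPS;
let propsRefreshed = false;

async function refreshServerProps(baseUrl: string): Promise<void> {
  const res = await fetch(`${baseUrl}/props`);
  if (!res.ok) throw new Error(`Failed to fetch server props: ${res.status}`);
  serverProps = (await res.json()) as ServerProps;
  propsRefreshed = true;
}

// Invoked for each streaming chunk of a chat completion. The first chunk
// proves the backend is up (llama-swap has loaded the model), so it is a
// safe point to refresh the UI's /props state.
async function onStreamChunk(baseUrl: string, chunk: string): Promise<void> {
  if (!propsRefreshed) {
    await refreshServerProps(baseUrl).catch(() => {
      /* keep the blank fallback if the refresh fails */
    });
  }
  // ...append chunk to the active conversation...
}
```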
I like the integrated webui because it keeps everything light: I can follow llama.cpp development in real time with a simple git pull / build / automation loop, and quickly download and integrate new model releases via the llama-swap YAML config to test compatibility, compare models, and evaluate quantizations, without waiting for other wrappers or stacks to update.
The video shows two servers:
- llama.cpp / llama-swap / patched svelte-webui / reverse proxy / llama-server offline test / real web domain
- Raspberry Pi 5 with llama.cpp / patched webui / reverse proxy / LAN IP (Qwen3-30B A3B @ 5 tok/s)
Llama.cpp-llama-swap-webui.mp4
Operating systems
Linux
Which llama.cpp modules do you know to be affected?
llama-server
Command line
Problem description & steps to reproduce
Shut down llama-server while hosting the webui client behind a production reverse proxy.
First Bad Commit
Since the Svelte WebUI refactor.
Relevant log output
Error fetching server props: Error: Failed to fetch server props: 503