v0.5.7

@github-actions github-actions released this 11 Jun 13:50
· 151 commits to main since this release

Upgrade notes

Breaking changes

Reload any chat tab left open across the upgrade. This release moves image and file attachments from inline base64 frames to a staged multipart-upload pipeline (see Resumable, staged file uploads below). A chat tab that was open before the upgrade is still running the previous client; the first time it tries to send an image it receives a PROTOCOL_OUTDATED error toast instead of delivering the message. Reloading the tab loads the new client and clears the error. Text-only messages from a stale tab are unaffected, and no server-side, config, or data action is required — this is purely a client-refresh transition.

Upgrade notes

v0.5.7 is a large release. The headline changes are a new staged file-upload pipeline, capability-aware attachment handling with inline recovery, resumable streaming across reconnects, and a more reliable first chat on freshly-created agents. Everything below is picked up by the standard docker compose pull && docker compose up -d flow; three additive database migrations (0034 to 0036) run automatically on startup and require no manual steps.

Resumable, staged file uploads

Attachments no longer travel inline as base64 inside the chat WebSocket frame. Instead the browser uploads each file to a new endpoint — POST /api/agents/<agentId>/uploads — which stages it server-side and returns an upload id; the chat message then references files by id. This makes large attachments reliable, shows per-file progress and retry in the composer, and keeps the chat socket lean.

  • Limits. Up to 10 attachments per message and 15 MB per file (the endpoint returns 413 above that). Staged files live for 24 hours before they become eligible for cleanup.
  • Wider file support. The picker now accepts CSV, plain text, Markdown, JSON, and YAML in addition to images and PDFs.
  • Automatic cleanup. A garbage-collection sweep runs at startup and hourly, deleting staged files whose 24-hour TTL has elapsed and that were never attached to a sent message. Each sweep is correlated by a sweepId and emits one file.upload.expired audit row per reclaimed file.
  • New audit events. file.upload.staged (a file was accepted, or rejected with outcome: "failure"), file.upload.attached (a staged file was promoted into a sent message), and file.upload.expired (a staged orphan was swept). These are additive — existing audit queries are unaffected.
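
The limits above lend themselves to a pre-flight check in the client before any bytes are uploaded. The following is a minimal sketch, not Pinchy's actual composer code: the names `PendingFile`, `checkAttachments`, and `uploadPath` are hypothetical, and it assumes "15 MB" means 15 × 1024 × 1024 bytes. Only the endpoint path and the two limits come from these notes.

```typescript
// Limits quoted in the release notes.
const MAX_ATTACHMENTS_PER_MESSAGE = 10;
// Assumption: "15 MB" is read as 15 MiB; the server's exact byte threshold may differ.
const MAX_FILE_BYTES = 15 * 1024 * 1024;

// Hypothetical minimal view of a file picked in the composer.
interface PendingFile {
  name: string;
  sizeBytes: number;
}

type Verdict = { ok: true } | { ok: false; reason: string };

// Builds the staging endpoint path documented above for a given agent.
function uploadPath(agentId: string): string {
  return `/api/agents/${encodeURIComponent(agentId)}/uploads`;
}

// Rejects a selection locally if it would exceed the per-message count
// or the per-file size limit (the server answers 413 for oversize files).
function checkAttachments(files: PendingFile[]): Verdict {
  if (files.length > MAX_ATTACHMENTS_PER_MESSAGE) {
    return {
      ok: false,
      reason: `too many attachments (${files.length} > ${MAX_ATTACHMENTS_PER_MESSAGE})`,
    };
  }
  for (const file of files) {
    if (file.sizeBytes > MAX_FILE_BYTES) {
      return { ok: false, reason: `${file.name} exceeds the per-file limit` };
    }
  }
  return { ok: true };
}
```

A check like this only saves a round trip; the server-side limits remain authoritative.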

As noted under Breaking changes, the server now rejects the old inline image_url frame with PROTOCOL_OUTDATED, so each browser tab must be reloaded once after the upgrade before it can send images again.

Capability-aware attachments and inline recovery

Pinchy now knows which capabilities (vision, documents, audio, video, long-context, tools) each model supports and uses that to stop a doomed attachment before it is sent, rather than failing deep inside the agent.

  • Pre-send block + RecoveryPanel. Attaching an image or PDF to an agent whose current model can't handle it hard-blocks the send and shows an inline RecoveryPanel with a plain-English explanation and a one-click link to switch the agent's model. Nothing is sent until the mismatch is resolved.
  • Capability hints in the model picker. The model picker shows per-capability icons and tooltips, and an amber warning on any model that fails a capability required by the current agent or template.
  • 422 on agent creation. POST /api/agents now returns 422 template_capability_unavailable (with the missing capabilities and a docs link) and writes a failure agent.created audit row when no installed model satisfies a template's required capabilities — instead of silently creating an agent that can't do its job.
  • Onboarding warning. The setup wizard warns when no available model satisfies the default agent's required capabilities.
  • models table — zero configuration. A new models table records each model's capabilities. It is seeded from Pinchy's built-in catalog on every boot; for local Ollama, capabilities are detected from the Ollama API when you save the URL. No manual entry is required.
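
The pre-send block described above amounts to comparing what the attachments need with what the current model offers. The sketch below illustrates that comparison under stated assumptions: the capability names are taken from these notes, but the MIME-type mapping (images need `vision`, PDFs need `documents`) and the function names are hypothetical, not Pinchy's implementation.

```typescript
// Capability names as listed in the release notes.
type Capability = "vision" | "documents" | "audio" | "video" | "long-context" | "tools";

// Assumption: images require "vision" and PDFs require "documents".
// Other file types are treated as needing no special capability.
function requiredCapability(mimeType: string): Capability | null {
  if (mimeType.indexOf("image/") === 0) return "vision";
  if (mimeType === "application/pdf") return "documents";
  return null;
}

// Returns the capabilities the attachments need but the model lacks.
// An empty result means the send can proceed.
function missingCapabilities(
  attachmentMimeTypes: string[],
  modelCapabilities: ReadonlySet<Capability>,
): Capability[] {
  const missing: Capability[] = [];
  for (const mime of attachmentMimeTypes) {
    const needed = requiredCapability(mime);
    if (needed !== null && !modelCapabilities.has(needed) && missing.indexOf(needed) === -1) {
      missing.push(needed);
    }
  }
  return missing;
}
```

A non-empty result is the condition under which a UI would show something like the RecoveryPanel instead of sending.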

Earlier docs described this capability handling under the v0.5.4 notes; it actually ships in v0.5.7, and the v0.5.4 section has been corrected accordingly.

Streaming resume across reconnect

When your browser drops mid-stream and reconnects (closed tab, network blip, Wi-Fi handoff), Pinchy now joins the new connection to the in-flight run and keeps streaming chunks into the same assistant bubble — no more "The agent didn't respond" flash bubble for runs that were actually still running server-side. Under the hood:

  • A new in-memory ActiveRuns registry on Pinchy tracks every in-flight chat run by sessionKey, with each entry holding the OpenClaw-correlated runId, the per-turn messageId, and the set of currently-connected listener WebSockets. Multi-tab sessions on the same chat now naturally see the same stream — both tabs end up in the listener set.
  • The in-flight reply text is buffered server-side and replayed on reconnect. Streaming chunks are deltas the server never re-sends, so a tab reloaded mid-stream used to lose the words that had already streamed. The registry now accumulates the emitted text and seeds the resumed bubble with it — the reply arrives complete, not just from the reconnect point onward.
  • A server-side run watchdog scans ActiveRuns every 30 seconds and tears down runs whose absolute age exceeds the per-deployment cap (default 15 minutes). This is the server-side belt to the existing client-side stuck-timer suspenders — it catches runs that hang while the tab is backgrounded or after the laptop sleeps. Stuck runs are aborted via OpenClaw, audited as chat.run_timed_out, and a terminal error frame is broadcast to any still-connected listener.
  • The same watchdog also catches the opposite failure: a run that the backend accepts at dispatch but that never produces a first chunk within the first-chunk timeout (default 180 seconds), which typically indicates a wedged or rate-limited provider/lane. Such a run is torn down, audited as chat.run_no_first_chunk, and surfaced to the user as a retryable error so they can resend instead of staring at a blank thread. This is the reload-surviving server-side backstop to the client's 60-second stuck timer.
  • The audit trail gains four new event types: chat.run_timed_out, chat.run_no_first_chunk, chat.run_completed_after_disconnect, and chat.run_aborted (the last is declared in the audit union for a future user-triggered abort UI; emission lands when that UI ships). Existing audit queries are unaffected — the new events are additive.
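
To make the buffer-and-replay idea concrete, here is a small in-memory sketch of a registry keyed by sessionKey. It is an illustration only: the class name `ActiveRunsSketch`, the `Listener` interface, and all method names are hypothetical and stand in for the real ActiveRuns registry, whose API is not shown in these notes.

```typescript
// Hypothetical stand-in for a connected WebSocket.
interface Listener {
  send(delta: string): void;
}

interface ActiveRun {
  runId: string;        // OpenClaw-correlated run id
  messageId: string;    // per-turn message id
  bufferedText: string; // everything emitted so far, kept for replay
  listeners: Set<Listener>;
}

class ActiveRunsSketch {
  private runs = new Map<string, ActiveRun>();

  start(sessionKey: string, runId: string, messageId: string): void {
    this.runs.set(sessionKey, {
      runId,
      messageId,
      bufferedText: "",
      listeners: new Set<Listener>(),
    });
  }

  // Attaches a new or reconnected socket to the in-flight run and returns
  // the text it missed, so the client can seed the assistant bubble.
  join(sessionKey: string, listener: Listener): string | null {
    const run = this.runs.get(sessionKey);
    if (!run) return null;
    run.listeners.add(listener);
    return run.bufferedText;
  }

  leave(sessionKey: string, listener: Listener): void {
    const run = this.runs.get(sessionKey);
    if (run) run.listeners.delete(listener);
  }

  // Deltas are never re-sent, so each one is accumulated before fan-out.
  emit(sessionKey: string, delta: string): void {
    const run = this.runs.get(sessionKey);
    if (!run) return;
    run.bufferedText += delta;
    run.listeners.forEach((listener) => listener.send(delta));
  }

  finish(sessionKey: string): void {
    this.runs.delete(sessionKey);
  }
}
```

Because every listener for a sessionKey sits in the same set, two tabs on the same chat receive the same deltas, which matches the multi-tab behavior described above.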

The ActiveRuns registry is in-memory only — a Pinchy restart drops every entry, which is acceptable because the OpenClaw side is restarted (or unreachable) in that case. No database migration is required for it.

Operationally:

  • New audit-trail queries you might want bookmarked:
    • chat.run_timed_out — how many runs hit the 15-min hard cap last week?
    • chat.run_no_first_chunk — how often does the backend accept a run but never start streaming? A spike here points at a wedged or rate-limited provider/lane (this is the signal that was missing during exactly that class of incident).
    • chat.run_completed_after_disconnect — how often do users abandon a long-running chat? Correlates with background-tab behavior.
  • The 15-minute absolute cap and the 180-second first-chunk timeout are both per-deployment and live in packages/web/src/server/run-watchdog.ts (DEFAULT_MAX_RUN_DURATION_MS and DEFAULT_FIRST_CHUNK_TIMEOUT_MS). There is intentionally no per-agent override; surface one as a deployment env var if you ever see a legitimate need (YAGNI today).
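
The two watchdog thresholds can be read as a single per-run decision. The sketch below shows one way that decision could look. The constant names and default values come from the notes above, but `RunSnapshot`, `classifyRun`, and the order in which the two checks are applied are assumptions, not the contents of run-watchdog.ts.

```typescript
// Defaults quoted in the release notes.
const DEFAULT_MAX_RUN_DURATION_MS = 15 * 60 * 1000;
const DEFAULT_FIRST_CHUNK_TIMEOUT_MS = 180 * 1000;

// Hypothetical view of what a watchdog needs to know about a run.
interface RunSnapshot {
  startedAt: number;           // epoch milliseconds at dispatch
  firstChunkAt: number | null; // null until the first chunk arrives
}

type WatchdogVerdict = "ok" | "chat.run_timed_out" | "chat.run_no_first_chunk";

// Decides which audit event, if any, a run has earned at time `now`.
// Assumption: the absolute cap is checked before the first-chunk timeout.
function classifyRun(
  run: RunSnapshot,
  now: number,
  maxRunMs: number = DEFAULT_MAX_RUN_DURATION_MS,
  firstChunkMs: number = DEFAULT_FIRST_CHUNK_TIMEOUT_MS,
): WatchdogVerdict {
  const age = now - run.startedAt;
  if (age > maxRunMs) return "chat.run_timed_out";
  if (run.firstChunkAt === null && age > firstChunkMs) return "chat.run_no_first_chunk";
  return "ok";
}
```

A periodic scan would call a function like this for every registry entry and tear down anything that is not "ok".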

More reliable first chat on freshly-created agents

The first message to a just-created agent could previously fail with unknown agent id when the chat dispatch raced ahead of OpenClaw applying the new agent config. Pinchy now gates the first dispatch on OpenClaw's agents.list runtime-readiness signal, retries config.apply when the gateway rate-limits it instead of silently dropping the change, and applies new agents via hot-reload rather than an atomic file write the gateway's watcher could miss. No action required — it's a reliability fix.
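
The readiness gate can be pictured as polling until the new agent id shows up. The sketch below is a generic illustration under assumptions: `waitForAgent` is a hypothetical name, the list call and the sleep are injected rather than taken from openclaw-node (whose `agents.list()` return shape is not described here), and the attempt count and delay are placeholder values.

```typescript
// Polls an injected "list agent ids" function until `agentId` appears
// or the attempts run out. Returns true when the agent is ready.
async function waitForAgent(
  agentId: string,
  listAgentIds: () => Promise<string[]>, // assumption: wraps the runtime's agent listing
  sleep: (ms: number) => Promise<void>,  // injected so callers and tests control timing
  attempts: number = 10,                 // placeholder value
  delayMs: number = 500,                 // placeholder value
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    const ids = await listAgentIds();
    if (ids.indexOf(agentId) !== -1) return true;
    // Wait between polls, but not after the final attempt.
    if (i < attempts - 1) await sleep(delayMs);
  }
  return false;
}
```

Dispatching the first chat message only after a check like this returns true avoids the race that produced the unknown agent id error.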

Channel-health watchdog for Telegram

OpenClaw owns the Telegram poller, so a channel worker that crash-loops below the gateway WebSocket used to be invisible — the connection stayed "connected" while the bot silently dropped messages. A new watchdog probes the channel status every 30 seconds and makes those failures operator-visible:

  • A red Degraded badge appears on the agent's Telegram settings while the poller is failing, with the underlying error on hover.
  • The audit trail gains three event types: channel.degraded (first failure), channel.polling_failed (still down after several consecutive probes), and channel.recovered — together they bound the outage window. Error detail is PII-scrubbed.

The classic trigger is one bot token polled by two deployments (staging and production, say): Telegram allows a single getUpdates consumer per bot, so the instances terminate each other in a loop. If you see a persistent degraded badge with a "terminated by other getUpdates request" error, give each environment its own bot. Nothing to configure — the watchdog is on after the upgrade.
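
The three channel events can be derived from a stream of probe results with a small counter. The following sketch shows the idea; `ChannelHealthTracker` is a hypothetical name, and the threshold of 3 consecutive failures is an assumption, since these notes only say "several consecutive probes".

```typescript
type ChannelEvent = "channel.degraded" | "channel.polling_failed" | "channel.recovered";

class ChannelHealthTracker {
  private consecutiveFailures = 0;
  private readonly pollingFailedAfter: number;

  // Assumption: 3 failed probes in a row count as "still down".
  constructor(pollingFailedAfter: number = 3) {
    this.pollingFailedAfter = pollingFailedAfter;
  }

  // Feeds one probe result in and returns the audit event to emit, if any.
  record(probeOk: boolean): ChannelEvent | null {
    if (probeOk) {
      const wasDown = this.consecutiveFailures > 0;
      this.consecutiveFailures = 0;
      return wasDown ? "channel.recovered" : null;
    }
    this.consecutiveFailures += 1;
    if (this.consecutiveFailures === 1) return "channel.degraded";
    if (this.consecutiveFailures === this.pollingFailedAfter) return "channel.polling_failed";
    return null;
  }
}
```

Emitting each event once per outage keeps the audit trail compact: the first and last events bound the outage window, as described above.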

Diagnostics export from the Support tab

Admins can now generate a downloadable bug-report bundle from Settings → Support: pick an agent, and Pinchy packages the version triplet (Pinchy, OpenClaw, openclaw-node), the scoped session trace as OTel-style spans with per-turn timing and token usage, and the related audit entries. Secrets are redacted with the audit-trail sanitizer and session keys are hashed before anything is written; the bundle does include conversation text, since that is usually what a bug report needs. Each export is audited as diagnostics.exported.

Setup-wizard secrets fix

v0.5.7 also fixes the v0.5.6 setup-wizard regression where Smithers replied with No API key found for provider '<name>' on the very first chat message after a fresh install. The bug was provider-agnostic: OpenClaw's secrets-provider initialised at gateway boot with a "file missing" state because secrets.json was only written after the user completed the wizard. The provider never reinitialised when Pinchy hot-reloaded openclaw.json (the secrets-section diff was empty), so the gateway ran the rest of the session with no provider keys until a manual container restart.

The fix is Pinchy-side: an inotifywait handler in start-openclaw.sh writes a /openclaw-secrets/.bootstrap-applied marker on the first appearance of secrets.json and SIGTERMs the gateway exactly once so the next boot picks up the file. The health-check loop respawns the gateway within ~10–40 s, and a build-time drift guard in validateBuiltConfig rejects emitted configs whose SecretRef pointers don't resolve in secrets.json — closing the related class of "config references a key that isn't there" bugs that would have produced the same symptom from the Pinchy side.

If you saw the "No API key found" error on v0.5.6 and restarted the OpenClaw container manually to recover, no follow-up action is needed — the secrets-provider re-reads cleanly from disk on every boot.

Upgrade command

No manual steps beyond the standard upgrade flow:

cd /opt/pinchy
sed -i 's/PINCHY_VERSION=v0.5.6/PINCHY_VERSION=v0.5.7/' .env
docker compose pull && docker compose up -d && docker image prune -f

BETTER_AUTH_URL removed — no action needed

Pinchy no longer reads BETTER_AUTH_URL. An investigation (#352) confirmed the variable had no functional consumer:

  • Pinchy doesn't use Better Auth's email-verification or password-reset email flows. Password resets run through Pinchy's own invite-token system, and the reset link is built from the browser's current origin — not from a configured base URL.
  • Origin / CSRF checks read the Domain Lock value (and the request host), not BETTER_AUTH_URL.
  • We don't use OAuth, the one remaining Better Auth feature that needs an absolute callback URL.

The variable is gone from docker-compose.yml, .env.example, and the startup warning that previously told Domain-Locked deployments to set it (added in v0.5.4 — see below). If you set BETTER_AUTH_URL in your .env or a docker-compose.override.yml, you can drop it. Leaving it in place is harmless — it is simply ignored — but the line is now dead config. Domain Lock remains the single source of truth for your public origin; HTTPS, secure cookies, and origin enforcement are unchanged.

Ollama Cloud model catalog refreshed against the live API

The Ollama Cloud library pages over-promise: some models advertise capabilities the live API doesn't actually honor. We re-probed every cloud model against the real /v1/chat/completions endpoint — sending image payloads with non-guessable content for vision and a function schema for tools — and corrected the catalog to match what the API does, not what the library page claims.

  • MiniMax M3 added. minimax-m3 joins the picker as a vision + reasoning + tools model (512K context). Vision and tool calling were both confirmed against the live API. It is now the reasoning-tier vision default.
  • qwen3.5:397b is now text-only. Its library page lists image input, but the live endpoint accepts image payloads and then hallucinates their contents rather than rejecting them — it does not actually see images. It stays available as a text/reasoning model and is no longer offered as a vision choice. Image work that previously could route here now uses the canonical vision line (qwen3-vl > gemini-3-flash-preview > gemma4).
  • qwen3-next:80b removed. On Ollama Cloud's OpenAI-completions endpoint it never emits a structured tool call — it returns empty content or a malformed tool-call blob, even when tools are required. Every Pinchy agent uses tools, so a tool-broken model has no place in the picker. The Ollama Cloud balanced default moves to glm-4.7.

No action is required for agents created the normal way: their model comes from the capability resolver (which never selected qwen3-next:80b), and the global default is recomputed on the upgrade's config regeneration. The only agents affected are ones where someone manually pinned ollama-cloud/qwen3-next:80b — those couldn't call tools before this release either, so switch them to ollama-cloud/glm-4.7 (or another model) in the agent's settings.
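
Reading the vision line above as a preference order, selection reduces to taking the first preferred model that is installed. This is a minimal sketch of that reading, not the capability resolver itself: `pickVisionModel` is a hypothetical name, and treating ">" as strict preference is an assumption.

```typescript
// Preference order as written in the notes above.
const VISION_PREFERENCE = ["qwen3-vl", "gemini-3-flash-preview", "gemma4"];

// Returns the first preferred vision model that is available, or null.
function pickVisionModel(available: ReadonlySet<string>): string | null {
  for (const model of VISION_PREFERENCE) {
    if (available.has(model)) return model;
  }
  return null;
}
```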

Install Pinchy as an app

Pinchy now ships a web-app manifest and platform splash screens, so it can be installed as a standalone app. In Chrome on the desktop an install icon appears in the address bar; on iOS Safari, Share → Add to Home Screen produces a launcher that opens Pinchy full-screen with a branded splash. See the new Install as an app guide. Nothing to configure — it is available after the upgrade.

Operator notes

  • Usage dashboard now records prompt-cache tokens. Previously the poller read the wrong field names from OpenClaw and stored every cache counter as 0. With providers that cache aggressively (Anthropic), the dashboard could show single-digit input token counts on heavy days. After the upgrade, new usage records carry cache reads/writes, the daily chart gains a dashed Cached Input series, and cost estimates include cache pricing. Historic rows stay as recorded, so expect daily totals to jump from the upgrade onward. Totals remain approximate until per-turn accounting lands (#483).
  • Configurable usage poll interval. The usage-counter poller interval is now configurable via PINCHY_USAGE_POLL_INTERVAL_MS (default 60000, minimum 1000). Most deployments need not set it.
  • Database migrations. 0034 adds last_error/last_error_at to integration_connections (surfaces integration auth-failure detail), 0035 creates the uploaded_files staging table, and 0036 creates the models capability table. All three are additive and run automatically on startup.
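
For PINCHY_USAGE_POLL_INTERVAL_MS, the documented default (60000) and minimum (1000) suggest a parse step along the following lines. This is a sketch under assumptions: `resolveUsagePollInterval` is a hypothetical name, and clamping low values up to the minimum (rather than rejecting them) and falling back to the default on unparsable input are guesses about behavior these notes do not specify.

```typescript
// Values quoted in the release notes.
const DEFAULT_USAGE_POLL_INTERVAL_MS = 60000;
const MIN_USAGE_POLL_INTERVAL_MS = 1000;

// Takes the raw environment value (or undefined when unset) and returns
// the interval to use, in milliseconds.
function resolveUsagePollInterval(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === "") return DEFAULT_USAGE_POLL_INTERVAL_MS;
  const parsed = Number(raw);
  // Assumption: unparsable input falls back to the default.
  if (!isFinite(parsed)) return DEFAULT_USAGE_POLL_INTERVAL_MS;
  // Assumption: values below the minimum are clamped, not rejected.
  return Math.max(MIN_USAGE_POLL_INTERVAL_MS, Math.floor(parsed));
}
```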

Dependencies

  • openclaw-node 0.12.1 (was 0.10.0) — Pinchy bundles this internally; nothing for operators to install. It adds agents.list(), which powers the dispatch-readiness gate above, and carries the Gateway-correlated runId on every ChatChunk.
  • OpenClaw 2026.5.28 (was 2026.5.20) — picked up via Dockerfile.openclaw and surfaced in GET /api/version.
  • hono ≥ 4.12.21 — pinned to clear four disclosed CVEs in this transitive dependency.

What's Changed

  • chore(deps): bump deps and adapt to react-hooks 7.1 stricter rules by @clemenshelm in #440
  • fix(chat): reconcile history when WS dropped before first chunk (#310) by @clemenshelm in #439
  • feat(chat): server-side run registry + watchdog for stuck runs (#310 Tier 2a) by @clemenshelm in #441
  • feat(chat): resume in-flight chat stream across browser reconnect (#310 Tier 2b) by @clemenshelm in #442
  • test(chat): pin heartbeat lazy-start on userMessagePersisted (#310 Tier 2c) by @clemenshelm in #443
  • fix(restart): keep overlay correct during OC deferred restarts by @clemenshelm in #446
  • fix: v0.5.6 secrets.json setup race + setup-wizard smoke tests for all providers by @clemenshelm in #445
  • docs: streaming resume, run watchdog, new audit events (#310 Tier 2d) by @clemenshelm in #444
  • chore: gitignore mock-server package-lock.json files by @clemenshelm in #447
  • feat(memory): give agents working long-term memory (write path, prompt, compaction, watcher fix) by @clemenshelm in #448
  • feat(diagnostics): self-service support bundle export by @clemenshelm in #422
  • fix(composer): prevent IME dead-key freeze from assistant-ui 0.14 regression by @clemenshelm in #450
  • ci: guard against untracked test deletions by @clemenshelm in #453
  • feat(chat): accept CSV/text files as workspace attachments by @clemenshelm in #455
  • refactor(auth): drop redundant BETTER_AUTH_URL (#352) by @clemenshelm in #457
  • feat: PWA install support (Chrome/Edge/iOS/Android) by @clemenshelm in #408
  • fix(invite): render reset-specific UI and never overwrite display name (#436) by @clemenshelm in #458
  • fix(pinchy-files): resolve pre-existing tsc errors in pdf-extract/pdf-render by @clemenshelm in #460
  • chore(deps): bump openclaw 2026.5.20 -> 2026.5.28 by @clemenshelm in #452
  • feat(release): enforce package.json/tag/version parity in CI (#438) by @clemenshelm in #454
  • feat: convert DOCX to Markdown, raise size limit to 50 MB by @clemenshelm in #406
  • fix(pinchy-files): read image files as image content blocks (#420) by @clemenshelm in #461
  • fix(openclaw-config): size per-agent bootstrap caps so long AGENTS.md is not truncated (#373) by @clemenshelm in #459
  • feat(integrations): auth_failed status + Edit credentials (#300, #301) by @clemenshelm in #334
  • feat(uploads): multipart attachment endpoint — binary off WebSocket base64 by @clemenshelm in #342
  • feat: model capability tracking + pre-send attachment validation (closes #330) by @clemenshelm in #343
  • fix(chat): deterministic agent-readiness gate + reliable config-push across OC restart (de-flake dispatch E2E) by @clemenshelm in #464
  • test(usage-tracking): implement Tier-2 token-total tests (#426) by @clemenshelm in #462
  • fix(models): refresh Ollama Cloud catalog against the live API by @clemenshelm in #456
  • fix(db): correct out-of-order migration journal timestamps (v0.5.7 upload release blocker) by @clemenshelm in #468
  • ci: harden link-check and ollama install against transient 5xx by @clemenshelm in #471
  • test(db): upgrade-path integration test (behavior guard for the #468 migration skip) by @clemenshelm in #469
  • fix(db): forward repair migration for stranded uploaded_files table by @clemenshelm in #472
  • fix(chat): streaming-resume duplicate message-id crash by @clemenshelm in #470
  • feat(channels): channel-health watchdog for silent Telegram getUpdates-409 conflicts by @clemenshelm in #473
  • feat(chat): server-side first-chunk watchdog for wedged runs (B-1) by @clemenshelm in #474
  • fix: oversize-attachment UX + model-picker capability icons (v0.5.7 staging findings) by @clemenshelm in #475
  • fix(diagnostics): correct bundle timestamps, openclaw-node version, span timing by @clemenshelm in #480
  • fix(diagnostics): per-turn span startTime from prompt.submitted pairing by @clemenshelm in #481
  • fix(usage): record prompt-cache tokens and chart them as Cached Input by @clemenshelm in #482
  • fix(audit): keep numeric token counts readable in sanitized output by @clemenshelm in #484
  • docs: reconcile docs with v0.5.7 shipped code (upgrade notes + 47 fixes) by @clemenshelm in #466

Full Changelog: v0.5.6...v0.5.7