v3.21.0 - Buddy Companion, Model Picker Polish & Linux Startup Reliability

@siddsachar siddsachar released this 07 May 12:19
· 120 commits to main since this release
1dd188b

This release adds Thoth's Buddy companion foundation, a local-first animated presence that can live in the app sidebar, move around the workspace, and optionally open as a native desktop overlay. It also tightens Settings -> Models behavior, improves provider and Vision model selection, and hardens packaged startup on Windows and Linux so optional native dependency failures are easier to diagnose and less likely to block launch.

Buddy Companion Foundation

  • Buddy subsystem — adds a prompt-generated Buddy architecture with a thread-safe event bus, deterministic behavior brain, persistent config, pack validation, Hatch art/motion generation, canvas playback/effects, one dockable in-app Buddy, and a separate desktop overlay surface.
  • Live Thoth awareness — Buddy receives chat streaming, thinking, tool, approval, workflow, notification, and voice-state events from existing runtime paths.
  • Single configured identity — Buddy surfaces no longer render a separate companion name or duplicate Buddy-name setting; the assistant identity remains owned by Preferences, while Buddy UI focuses on state, personality, and motion.
  • Desktop overlay route — adds /buddy-overlay plus pywebview helpers for a named Buddy window where native overlay support is available.
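As a rough illustration of the thread-safe event bus idea mentioned above, here is a minimal publish/subscribe sketch. The class name `EventBus`, its methods, and the topic strings are hypothetical and are not taken from Thoth's `buddy/` package; the sketch only shows one common way to let runtime threads publish events safely.

```python
import threading
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class EventBus:
    """Minimal thread-safe publish/subscribe bus (hypothetical, not Thoth's class)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # Registration mutates shared state, so it happens under the lock.
        with self._lock:
            self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: dict) -> int:
        # Copy the handler list under the lock, then call handlers outside it,
        # so a slow or re-entrant handler cannot block other publishers.
        with self._lock:
            handlers = list(self._subscribers.get(topic, ()))
        for handler in handlers:
            handler(payload)
        return len(handlers)  # number of handlers that were notified
```

A chat-streaming or tool-call path could then call something like `bus.publish("thinking", {"active": True})` from any thread, and a Buddy surface would react in its subscribed handler.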

Buddy Motion & UI Polish

  • Generated animation boundary — Buddy ships with bundled first-party glyph, lumen, ember, pixel, sprout, and orbit motion packs. Hatch-generated custom Buddy art and compact image-to-video motion packs are copied into Thoth's served Buddy assets, while normal playback switches locally across idle, thinking, working, approval, success, and error states without runtime model calls.
  • Generated pack quality — Hatch prompts request keyable backgrounds, frame padding, and rim-lit dark edges so generated packs preserve character detail during transparency compositing; Google Veo starts are paced during fresh bundled regenerations, and runtime corner-keying is gentler so bundled pack edges stay intact.
  • Motion semantics — approval, denial, timeout, cancellation, interruption, and completion states now map to explicit Buddy clips; MP4 playback crossfades state changes, smooths loop restarts, and replays idle motion periodically without looking busy.
  • Dockable in-app presence — Buddy starts inside a sidebar home circle, can be dragged into the workspace, leaves the sidebar dock visibly empty while away, snaps home when released near the dock, and returns home on app restart instead of persisting a stray position.
  • Settings polish — Buddy Settings groups where Buddy appears, behavior, look, and generated-motion guidance in a dense Models-tab-style layout. Visual pack selection uses preview tiles, clears stale Hatch overrides when a bundled pack is selected, and refreshes existing in-app and desktop clients.
  • Hatch save recovery — saving Buddy settings now preserves freshly generated Hatch art and motion pointers instead of falling back to the selected bundled pack. Hatch outputs are promoted into selectable user packs, and generated Hatch packs can be deleted from the picker. Still-only generated art remains valid when motion is poor or unavailable, and generated packs can be switched back to still-only mode. Motion retry regenerates the selected user-generated still without overwriting the selectable pack manifest, and new motion requests use provider-compatible 5-second clips. Full Buddy generation now runs as a background job with Home status-bar progress, completion notifications, and private baked-in still/video prompts so simple user concepts do not turn into pose sheets; the visible concept field stays clean while internal personality/style guidance remains private. Generated Hatch motion now preserves full-frame opaque stills and uses the same cover-framed corner-keying path as bundled motion packs, while transparent stills are composited onto a stable keyable background before video generation. Existing Hatch packs whose manifest was overwritten by retry metadata are recovered when loaded. Stopping a workflow immediately moves Buddy out of the running-workflow state.
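The "snaps home when released near the dock" behavior described above can be pictured as a simple distance check at drop time. The sketch below is an assumption about how such a rule might look; the function name `resolve_drop` and the 48-pixel `snap_radius` default are invented for illustration and do not come from the release.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def resolve_drop(release: Point, dock: Point, snap_radius: float = 48.0) -> Tuple[Point, bool]:
    """Return (final_position, docked) for a drag that ended at `release`.

    `snap_radius` is a hypothetical threshold in pixels.
    """
    # Inside the snap radius the companion returns to the dock position.
    if math.dist(release, dock) <= snap_radius:
        return dock, True
    # Otherwise it stays where it was dropped and the dock remains empty.
    return release, False
```

Because the release notes say Buddy returns home on restart, a rule like this would only need the in-memory position; nothing has to be persisted.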

Buddy Desktop Overlay Reliability

  • Native overlay stability — desktop Buddy preserves important approval, denial, workflow, and error bubbles even in Quiet mode, keeps bubbles visible across rapid state settling, applies first-paint transparent document styling, and reveals only after the transparent Buddy document has painted.
  • Window creation fallback — the native overlay retries with simpler pywebview options if a backend rejects transparency or hidden-window hints, avoids snapshot pushes into deleted NiceGUI clients, and guards startup health-check results so transient None values cannot crash the native window.
  • Workflow state cleanup — approval denials and timeouts clear approval and workflow activity immediately; denied, timed-out, stopped, or cancelled workflow endings clear Buddy workflow-step state; successful multi-step workflow endings emit done instead of a misleading cancellation.
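The window creation fallback above amounts to trying progressively simpler option sets until the backend accepts one. The following is a generic sketch of that pattern, not Thoth's code: `create_with_fallback` and `fake_create_window` are hypothetical names, and the option keys are placeholders rather than a statement about which pywebview arguments Thoth passes.

```python
from typing import Any, Callable, Dict, Iterable, Optional


def create_with_fallback(create: Callable[..., Any],
                         option_sets: Iterable[Dict[str, Any]]) -> Any:
    """Call `create` with each option set in order; return the first success."""
    last_error: Optional[Exception] = None
    for options in option_sets:
        try:
            return create(**options)
        except Exception as exc:
            # The backend rejected this combination; remember why and simplify.
            last_error = exc
    raise RuntimeError("no window option set was accepted") from last_error


# Stand-in backend for demonstration only: it rejects transparency,
# as a GUI backend without compositing support might.
def fake_create_window(**options: Any) -> Dict[str, Any]:
    if options.get("transparent"):
        raise ValueError("transparent windows are not supported")
    return options


window = create_with_fallback(
    fake_create_window,
    [{"transparent": True, "hidden": True}, {"hidden": True}, {}],
)
```

Ordering the option sets from richest to plainest keeps the preferred experience where it works and still produces a usable window elsewhere.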

Models, Vision & Settings Reliability

  • Settings and timer stability — Settings -> Models opens the provider/model catalog lazily and caps provider rows so very large catalogs no longer crash the UI, while NiceGUI one-shot and polling timers clean up when clients disconnect or parent slots are deleted instead of flooding logs with deleted-slot errors.
  • Model catalog and picker clarity — installed local Ollama chat models appear in Settings -> Models even when their family is not yet in Thoth's curated tool/vision capability lists. Brain and Vision pickers now make it clear that catalog rows must be pinned before they appear as everyday choices.
  • ChatGPT / Codex Vision pins — Codex Vision pins keep their provider-specific image-input capability during Quick Choice refreshes, and the Codex Responses transport preserves multimodal image blocks so captured screenshots are sent to Codex Vision models instead of being flattened to text-only requests.
  • Vision and setting updates — thoth_update_setting validates Brain and Vision model changes against Quick Choices, installed local Ollama models, and provider catalog rows before saving, exposes an explicit vision_model setting, and rejects invented or unavailable model names with actionable guidance.
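To make the validation rule above concrete, here is a small sketch of checking a requested model name against the three sources the notes list. It is an assumption-level illustration: `validate_model_choice` and its parameters are hypothetical, and the real thoth_update_setting tool may structure its checks and messages differently.

```python
from typing import Iterable


def validate_model_choice(name: str,
                          quick_choices: Iterable[str],
                          installed_local: Iterable[str],
                          catalog_rows: Iterable[str]) -> str:
    """Return `name` if it is a known model, otherwise raise with guidance."""
    known = set(quick_choices) | set(installed_local) | set(catalog_rows)
    if name in known:
        return name
    # Reject invented or unavailable names instead of saving them.
    raise ValueError(
        f"Unknown model {name!r}: choose a pinned Quick Choice, "
        "an installed local model, or a provider catalog row."
    )
```

Validating before saving means a mistyped or hallucinated model name fails at the setting update rather than at the next chat request.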

Linux & Startup Reliability

  • Linux launcher install-path fix — the generated Linux launcher resolves installed symlink chains before computing the app root, so ~/.local/bin/thoth starts the packaged app from ~/.local/share/thoth/current; release CI smokes through the installed user launcher path.
  • Linux packaged startup resilience — packaged Linux launches now report startup log tails, child-process exit details, configurable THOTH_STARTUP_TIMEOUT, and targeted hints for native OpenCV/FAISS/NumPy dependency failures. Camera and screenshot capture degrade gracefully if OpenCV/MSS cannot import instead of blocking app startup.
  • Linux installer UX hardening — source-checkout builds support the root-level bash build_linux_app.sh <version> command, install success messages print ~/.local/bin/thoth when ~/.local/bin is not on PATH, and maintainer docs distinguish unreleased tarball testing from the one-line installer that resolves published GitHub Release assets.
  • Optional native package diagnostics — startup detects installed-but-broken optional native packages such as TorchCodec, logs a concrete recovery command, and makes Transformers treat broken TorchCodec as unavailable instead of letting optional audio/video helpers crash Thoth during startup.
  • Windows embedded-Python repair hardening — Windows installer repair/upgrade replaces the bundled {app}\python runtime before copying the new payload, preventing manually installed or corrupted packages from surviving an over-the-top reinstall.
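The launcher fix and the configurable timeout above can be sketched in a few lines. This is only an illustration in Python; the generated Linux launcher may well be a shell script, and both function names, the assumption that the launcher sits directly in the app root, and the 60-second default are hypothetical. Only the THOTH_STARTUP_TIMEOUT variable name comes from the release notes.

```python
import os
from pathlib import Path


def resolve_app_root(launcher_path: str) -> Path:
    """Follow every symlink in the chain, then take the containing directory.

    Assumes (hypothetically) that the real launcher file lives in the app root.
    """
    real_launcher = Path(os.path.realpath(launcher_path))
    return real_launcher.parent


def startup_timeout(default: float = 60.0) -> float:
    """Read THOTH_STARTUP_TIMEOUT in seconds, falling back to `default`."""
    raw = os.environ.get("THOTH_STARTUP_TIMEOUT", "")
    try:
        value = float(raw)
    except ValueError:
        return default  # unset or not a number
    return value if value > 0 else default
```

Without the `realpath` step, an app root computed from `~/.local/bin/thoth` would point at `~/.local/bin` rather than the installed package directory, which is the class of bug the release describes.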

Tests & Release Checks

  • Buddy coverage — focused tests cover core event/config/asset behavior, Hatch motion activation, UTF-8 config loading, UI wiring, event source hooks, runtime fallback behavior, dockable in-app behavior, built-in motion semantics, and packaging inclusion. Manual-style browser smokes verify docked, undocked, and overlay playback from the bundled pack.
  • Reliability coverage — startup hardening tests cover broken TorchCodec detection, Linux native dependency recovery hints, launcher log-tail diagnostics, Windows installer embedded-Python replacement, app import smoke, Settings -> Models catalog bounds and picker guidance, status-tool model validation, safe timer cleanup, and installed Linux launcher symlink/default invocation resolution.
  • Provider/Vision coverage — provider tests cover ChatGPT / Codex Vision Quick Choice capability retention and Codex Responses multimodal image payload preservation.
  • Release smoke — release and CI workflows build Windows, macOS, and Linux artifacts for v3.21.0, run focused startup/provider suites before installer builds, and smoke the installed Linux launcher path.
  • Test layout cleanup — root-level test files now live under tests/, pytest discovers that folder by default, CI/release workflows call the moved paths, and installer regressions assert the tests/ tree is not shipped in Windows, Linux, or macOS packages.
  • Current validation — focused release tests pass (110 passed, 2 skipped), the legacy release smoke suite reports ALL TESTS PASSED!, full pytest -q passes (255 passed, 3 skipped), git diff --check is clean, stale-version search only finds the previous release's historical changelog section, and docs/index.html remains untouched.

Release Notes & Risk Notes

  • Desktop overlay support varies by platform — Buddy's in-app surface is the primary supported experience; the native transparent desktop overlay depends on pywebview/backend support and may fall back to simpler window options.
  • Generated Buddy assets are optional — bundled motion packs run locally with no model call; Hatch-generated Buddy art/motion requires the configured image/video generation providers and their normal quotas/rate limits.
  • Linux native capture dependencies are optional — missing OpenCV/MSS native libraries should not block startup, but camera and screenshot tools remain unavailable until the relevant platform packages are installed.
  • Landing page update deferred — docs/index.html is intentionally not updated in this release-prep pass; download links and website version text will be updated separately after the v3.21.0 release assets are published.
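The note about optional native capture dependencies describes a common pattern: attempt the import, record why it failed, and disable only the affected feature. Below is a minimal sketch of that pattern; `try_import` is a hypothetical helper and not a function from Thoth's vision.py or startup_diagnostics.py.

```python
import importlib
from types import ModuleType
from typing import Optional, Tuple


def try_import(name: str) -> Tuple[Optional[ModuleType], Optional[str]]:
    """Import `name` if possible; otherwise return (None, reason)."""
    try:
        return importlib.import_module(name), None
    except Exception as exc:
        # Catches ImportError as well as errors raised while loading a
        # broken native library, so startup can continue without the module.
        return None, f"{type(exc).__name__}: {exc}"
```

A capture tool could call `try_import("cv2")` once at startup, keep the returned reason for its diagnostics log, and report the camera as unavailable instead of raising during app launch.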

Files Changed

| File | Change |
| --- | --- |
| buddy/, static/buddy/, ui/buddy.py | Buddy event/config/runtime surfaces, bundled motion packs, in-app docked/undocked UI, and desktop overlay route/runtime assets |
| ui/settings.py, ui/model_catalog.py, providers/selection.py, providers/catalog.py, providers/codex.py, models.py | Settings -> Models stability, picker clarity, Codex Vision capability retention, and provider/model catalog refinements |
| providers/transports/codex_responses.py, vision.py, tools/thoth_status_tool.py | Codex multimodal image payload preservation, startup-safe Vision capture backends, and controlled Brain/Vision setting updates |
| launcher.py, startup_diagnostics.py, installer/thoth_setup.iss, installer/install_deps.bat | Startup diagnostics, Linux readiness failure context, Windows embedded-Python repair, and optional native package recovery hints |
| installer/build_linux_app.sh, installer/install-linux.sh, build_linux_app.sh, .github/workflows/release.yml, .github/workflows/ci.yml | Linux launcher symlink resolution, root build wrapper, installed launcher smoke, and release/CI packaging checks |
| docs/RELEASING.md, installer/README.md, README.md, docs/ARCHITECTURE.md | Release checklist, installer, architecture, and user-facing Linux/provider/model guidance updates |
| tests/, pytest.ini | Focused startup/Linux/provider/model-selection regressions, release-smoke coverage, moved test discovery, and installer exclusion guards |

schema: 1
files:
  Thoth-3.21.0-Linux-x86_64.tar.gz: sha256=4347d7d031d6fbf1b9464dcb042609ab4c461cd59af0b3fc5d8e36fa97acf6b9
  Thoth-3.21.0-macOS-arm64.dmg: sha256=5f810c90703f6260d4173efb3de7e8f757ce799243e4591b5949c946a9e67270
  ThothSetup_3.21.0.exe: sha256=b5d5732ea49f8978c0a4074ed0ba76b8632419c246fa981592c693c90aed2e33
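
The checksum manifest above can be used to verify a downloaded artifact. The sketch below shows one way to do that with Python's standard library; the helper names `sha256_of` and `matches_manifest` are hypothetical, and the only assumption about the manifest is the `sha256=<hex>` value format shown above.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large installers need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_manifest(path: str, manifest_value: str) -> bool:
    """Compare a file against a manifest entry such as 'sha256=4347d7d0...'."""
    expected = manifest_value.split("=", 1)[-1].strip().lower()
    return sha256_of(path) == expected
```

For example, `matches_manifest("ThothSetup_3.21.0.exe", "sha256=b5d5732ea49f8978c0a4074ed0ba76b8632419c246fa981592c693c90aed2e33")` would return True only if the local file is byte-identical to the published installer.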