A single self-contained Rust binary that pulls image, LLM, audio (STT/TTS), and video jobs from the minis.gg studio API, runs them locally, and posts the results back.
Install the worker on any PC, register once, and it will hold a
hibernatable WebSocket session to the studio API's
WorkerConnections Durable Object. The studio pushes job offers over
the socket as soon as they're queued; the worker accepts, runs the
engine, and posts the result back the same way (or via a single HTTP
multipart route for image / audio / video bytes). The worker also
auto-updates itself between jobs.
```
studio-worker binary  <----- WebSocket ----->  WorkerConnections DO  <->  D1
        ^                                               ^
        |           HTTP multipart /complete            |
        +-----------------------------------------------+  (binary outputs only)
```
Replaces the previous push-based studio-proxy + cloudflared topology
and the intermediate pull-based polling pipeline. All five legacy
worker HTTP routes (heartbeat, claim, complete-json, fail,
logs) are now WS frame types.
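As a rough illustration of what "routes became frame types" can look like on the worker side, the sketch below models outbound frames as one enum. The variant names and wire tags are assumptions made for this example (only `completeJson` and `logBatch` appear elsewhere in this README); the real `WorkerOutbound` type may be shaped differently.

```rust
/// Hypothetical worker -> studio frames. Not the crate's actual definition.
#[derive(Debug, Clone, PartialEq)]
pub enum WorkerOutbound {
    Hello { worker_id: String },
    Heartbeat,
    Accept { job_id: String },
    CompleteJson { job_id: String, body: String },
    Fail { job_id: String, reason: String },
    LogBatch { entries: Vec<String> },
}

impl WorkerOutbound {
    /// Wire tag for the frame (assumed camelCase names).
    pub fn frame_type(&self) -> &'static str {
        match self {
            WorkerOutbound::Hello { .. } => "hello",
            WorkerOutbound::Heartbeat => "heartbeat",
            WorkerOutbound::Accept { .. } => "accept",
            WorkerOutbound::CompleteJson { .. } => "completeJson",
            WorkerOutbound::Fail { .. } => "fail",
            WorkerOutbound::LogBatch { .. } => "logBatch",
        }
    }

    /// Job id the frame refers to, if it is job-scoped.
    pub fn job_id(&self) -> Option<&str> {
        match self {
            WorkerOutbound::Accept { job_id }
            | WorkerOutbound::CompleteJson { job_id, .. }
            | WorkerOutbound::Fail { job_id, .. } => Some(job_id.as_str()),
            _ => None,
        }
    }
}
```

Keeping every frame in one enum means a single `match` covers heartbeat, claim, completion, failure, and log traffic, which is the practical upside of collapsing five HTTP routes into one socket.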
| Kind | Wire kind | Synthetic engine (default) | Real engine (planned) |
|---|---|---|---|
| Image | `image` | real WEBP / PNG via the `image` crate | `image-candle` / `sd-cpp` |
| LLM | `llm` | OpenAI-shape JSON (`chat.completion`) | `llama` (llama.cpp) |
| Audio STT | `audio_stt` | Whisper-shape JSON | `whisper` (whisper.cpp) |
| Audio TTS | `audio_tts` | real WAV (sine wave keyed by `hash(text)`) | `tts-piper` |
| Video | `video` | real WebP image (single-frame stand-in) | `video-ffmpeg` |
The synthetic engine is the default and exercises the full pipeline end-to-end with no GPU, no model downloads, and ~0 ms per task — exactly what the unattended CI suite uses. Real high-performance backends (llama.cpp, whisper.cpp, candle, Piper, ffmpeg) are wired in via feature flags and are deferred to a follow-up iteration (the trait, contract, and dispatch are already in place).
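To make the "real WAV (sine wave keyed by `hash(text)`)" row concrete, here is a minimal std-only sketch of a deterministic tone generator. It is not the worker's implementation: the function names, the FNV-1a hash (standing in for whatever hash the synthetic engine actually uses), the 16 kHz sample rate, and the 220 to 879 Hz range are all assumptions chosen for illustration.

```rust
/// FNV-1a 64-bit hash. A stand-in hash for this sketch only.
fn fnv1a(text: &str) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for b in text.bytes() {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Map the text to a tone between 220 Hz and 879 Hz.
fn tone_hz(text: &str) -> f32 {
    220.0 + (fnv1a(text) % 660) as f32
}

/// Render `millis` ms of 16-bit mono PCM at 16 kHz behind a 44-byte WAV header.
pub fn synth_wav(text: &str, millis: u32) -> Vec<u8> {
    const RATE: u32 = 16_000;
    let n = (RATE as u64 * millis as u64 / 1000) as u32; // sample count
    let data_len = n * 2; // 2 bytes per sample
    let mut out = Vec::with_capacity(44 + data_len as usize);
    out.extend_from_slice(b"RIFF");
    out.extend_from_slice(&(36 + data_len).to_le_bytes());
    out.extend_from_slice(b"WAVE");
    out.extend_from_slice(b"fmt ");
    out.extend_from_slice(&16u32.to_le_bytes()); // fmt chunk size
    out.extend_from_slice(&1u16.to_le_bytes()); // format: PCM
    out.extend_from_slice(&1u16.to_le_bytes()); // channels: mono
    out.extend_from_slice(&RATE.to_le_bytes()); // sample rate
    out.extend_from_slice(&(RATE * 2).to_le_bytes()); // byte rate
    out.extend_from_slice(&2u16.to_le_bytes()); // block align
    out.extend_from_slice(&16u16.to_le_bytes()); // bits per sample
    out.extend_from_slice(b"data");
    out.extend_from_slice(&data_len.to_le_bytes());
    let hz = tone_hz(text);
    for i in 0..n {
        let t = i as f32 / RATE as f32;
        let s = (2.0 * std::f32::consts::PI * hz * t).sin();
        out.extend_from_slice(&((s * 0.5 * i16::MAX as f32) as i16).to_le_bytes());
    }
    out
}
```

Because the tone depends only on the input text, the same request always yields byte-identical output, which is what makes this kind of engine convenient for unattended CI.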
The worker ships a native desktop window built on egui/eframe that
surfaces every config knob, the live job in flight, the recent-jobs
history, the rolling log tail, and a system-tray icon with Open /
Pause-Resume / Quit. It is on by default — cargo install studio-worker gives you the windowed worker, and studio-worker ui
launches it.
The UI build is free of GTK: the window uses eframe/glow (OpenGL via
dlopen), notifications use notify-rust (pure-Rust zbus on Linux), and
the system tray uses ksni (pure-Rust StatusNotifierItem) on Linux and
the native tray-icon APIs on macOS / Windows. So a source build needs
no pkg-config, no -dev packages, and no OpenSSL (reqwest +
sentry use rustls). Headless rigs can still opt out:
```sh
cargo install studio-worker --no-default-features   # service / `run` only
```

Five tabs:
| Tab | What it shows |
|---|---|
| Status | Worker id, API URL, VRAM total + threshold, busy / idle / paused badge, last heartbeat age + outcome. When the worker isn't registered, an in-window Register form. |
| Jobs | Current job in flight (kind, model, prompt, elapsed time) + bounded ring of the last 50 finished jobs with completed / failed badges. |
| Config | Every config.toml field as an editable widget grouped into Connection / Worker / Engine / Auto-update / Models / Notifications / Background mode. Save writes through config::save and the runtime picks up new values on the next tick. Engine swaps surface a "restart required" banner. |
| Logs | Level filter (info / warn / error), free-text search across category / message / job id, auto-scroll toggle, windowed at the last 500 entries. |
| About | Version, Sentry release name, resolved config path, "Check for updates" button. |
The tray icon reflects state (idle = green, busy = amber, disconnected = red) and exposes:
- Open Window: re-show the window after hide-to-tray.
- Pause / Resume claiming: toggles `auto_enabled`, persisted to `config.toml`.
- Quit: signals the runtime loops to stop, awaits any in-flight job briefly, then exits.
Closing the window hides it to the tray; the worker keeps running.
For an autostart-on-login workflow, tick the Run in tray on login
toggle on the Config tab (writes ~/.config/autostart/studio-worker-ui.desktop
on Linux, a LaunchAgent plist on macOS, a marker file on Windows).
The UI itself needs no extra build dependencies on any platform; that is
the point of the GTK-free stack above (no pkg-config, no cairo/gtk -dev
packages, no OpenSSL). A standard Rust toolchain is enough.
The all-backends build (--features all, used for the release
binaries) additionally compiles llama.cpp in-process, which needs
cmake + a C/C++ toolchain. The release runners install cmake
automatically (cargo-dist system dependency); for a local
cargo install studio-worker --features all make sure cmake and a
C++ compiler are on PATH.
```sh
curl --proto '=https' --tlsv1.2 -LsSf \
  https://github.com/webbertakken/studio-worker/releases/latest/download/studio-worker-installer.sh | sh
```

```powershell
irm https://github.com/webbertakken/studio-worker/releases/latest/download/studio-worker-installer.ps1 | iex
```

```sh
cargo install studio-worker                         # windowed UI by default
cargo install studio-worker --features all          # + in-process llama.cpp + media (needs cmake)
cargo install studio-worker --no-default-features   # headless service build
```

The install script is the turnkey path: its pre-built binaries
already bundle the UI and every backend (in-process llama.cpp LLM +
media engines), auto-start on login, auto-update, and auto-download
models on demand — nothing else to install. cargo install studio-worker from source is UI-first but ships only the synthetic
engine unless you add --features all (which needs a C/C++ toolchain).
Each release ships pre-built binaries for:
- `x86_64-pc-windows-msvc`
- `x86_64-unknown-linux-gnu`
- `aarch64-unknown-linux-gnu`
- `aarch64-apple-darwin`
- `x86_64-apple-darwin`
No shared secret to copy around. The worker auto-registers against
https://studio.minis.gg on first launch; the studio operator sees a
row in the dashboard's Pending Workers panel and clicks Approve, and
the worker's next 30s poll picks up its worker_id + auth_token
and starts heartbeating. Two ways to launch:
```sh
# Windowed (recommended): Status tab shows 'Waiting for approval'
# until the operator approves.
studio-worker ui

# Headless: same flow, no window; pipe to journalctl in production.
studio-worker run
```

Optional pre-launch tweaks (none of these talk to the network):
```sh
# Pre-set the human label shown in the dashboard's Pending Workers panel.
studio-worker register --label "alice's gaming rig"

# Point at a self-hosted studio instead of studio.minis.gg.
studio-worker register --api-base-url https://my-studio.example.com

# Optionally install the auto-start OS service (systemd --user on Linux,
# launchd on macOS, scheduled task on Windows). Alternative: the desktop
# UI's Config tab has a `Run in tray on login` toggle.
studio-worker install-service
```

If your registration is rejected (or you want to move the worker to a different studio), clear the local state and submit a fresh request:
```sh
studio-worker register --reset
```

| Subcommand | Purpose |
|---|---|
| `run` | Auto-register if needed, then hold the WS session + auto-update loop. |
| `ui` (default) | Same as `run` plus the desktop window + tray + notifications. Built unless installed with `--no-default-features`. |
| `register` | Persist `--label` / `--api-base-url`; `--reset` clears local state. |
| `status` | Print the local config + registration state. |
| `install-service` | Install the auto-start OS service. |
| `uninstall-service` | Remove the auto-start OS service. |
| `enable` | Set `auto_enabled = true` (resume claiming). |
| `disable` | Set `auto_enabled = false` (worker online but doesn't claim). |
| `set-threshold <gb>` | Set the max VRAM (GB) the worker is willing to claim per job. |
| `config` | Print the resolved config + its on-disk path. |
| `check-update` | Check the release feed for a newer version (does not install). |
Config lives at:
- Linux/macOS: `~/.config/minis-studio-worker/config.toml`
- Windows: `%APPDATA%\minis-studio-worker\config.toml`
```toml
api_base_url       = "https://studio.minis.gg"
worker_id          = "<filled on operator approval>"
auth_token         = "<filled on operator approval>"
vram_threshold_gb  = 12.0          # max GB per claim
auto_start         = true

# Where on-demand model files are cached (defaults to ~/models).
models_root = "~/models"

# Auto-update: checks the release feed on the cadence below, applies
# updates only when no job is running, then re-execs the new binary.
auto_update_enabled       = true
auto_update_interval_secs = 1800
auto_update_feed          = "https://api.github.com/repos/webbertakken/studio-worker/releases"
auto_update_prerelease    = false

# WebSocket reconnect cap. When the session drops the worker tries
# to reconnect with exponential backoff up to this many times before
# exiting non-zero (and letting systemd/launchd/Task-Scheduler
# restart it). `0` = infinite. Omit to use the default of 5.
ws_reconnect_attempts = 5

# Internal state written by the auto-register flow. Don't edit by hand.
install_id              = "<uuidv4>"
registration_request_id = "<rr-...>"   # cleared on approval
registration_secret     = "<hex>"      # cleared on approval
```

The worker doesn't ship a shared secret. On first launch:
- Generates a per-install UUID + 256-bit `registration_secret` and keeps both in `config.toml`. Only the SHA-256 hash of the secret leaves the box.
- POSTs `/workers/register-request` to `api_base_url` with hostname, username, VRAM, supported models, optional label.
- The studio creates a Pending Workers row. The operator sees it in the studio dashboard, clicks Approve (or Reject), and the worker's next 30s poll picks up the decision.
- On Approve: `worker_id` + `auth_token` written to `config.toml`, normal heartbeat / claim loops take over.
- On Reject: worker stops trying. `studio-worker register --reset` clears state and the next launch submits a fresh request.
See docs/architecture/overview.md
for the full state machine + per-install identity details.
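The registration flow can be pictured as a small state machine. The sketch below is one possible reading of the steps described here, not the code in the linked architecture doc; the type names `Registration` and `PollResult` and their variants are hypothetical.

```rust
/// Local registration state (hypothetical names, for illustration only).
#[derive(Debug, Clone, PartialEq)]
pub enum Registration {
    Unregistered,
    Pending { request_id: String },
    Approved { worker_id: String, auth_token: String },
    Rejected,
}

/// Outcome of one 30s poll against the studio (assumed shape).
pub enum PollResult {
    StillPending,
    Approved { worker_id: String, auth_token: String },
    Rejected,
}

impl Registration {
    /// Only a pending request reacts to a poll; other states are left unchanged.
    pub fn on_poll(self, result: PollResult) -> Registration {
        match (self, result) {
            (Registration::Pending { .. }, PollResult::Approved { worker_id, auth_token }) => {
                Registration::Approved { worker_id, auth_token }
            }
            (Registration::Pending { .. }, PollResult::Rejected) => Registration::Rejected,
            (state, _) => state,
        }
    }

    /// Models `register --reset`: drop all local state so the next launch re-registers.
    pub fn reset(self) -> Registration {
        Registration::Unregistered
    }

    /// The worker keeps polling only while a request is pending.
    pub fn should_poll(&self) -> bool {
        matches!(self, Registration::Pending { .. })
    }
}
```

Note that `Rejected` is terminal in this model until `reset` is called, matching the "worker stops trying" behaviour above.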
- Worker exits with `ws auth failed: ...`: the studio API rejected the auth token on the upgrade (HTTP 401) or via a close-code 4001 after a successful upgrade. Either the token was revoked, the worker was deleted from the studio admin UI, or `config.toml` carries a stale token. Clear local state and let the next launch auto-register again: `studio-worker register --reset` then `studio-worker run` (or `studio-worker ui`).
- Worker exits with `ws reconnect cap reached`: every reconnect attempt failed (DNS, TLS, or the API is down). The service manager will restart the worker; if it keeps happening, check that the API is reachable from the worker host.
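For context on the reconnect cap, a capped exponential backoff can be sketched as below. The 1s base delay and 60s ceiling are assumptions made for the example (the README does not state them); only the `0` = infinite rule and the default cap of 5 come from the config reference.

```rust
use std::time::Duration;

/// Delay before reconnect attempt `attempt` (1-based): 1s, 2s, 4s, ...
/// The 1s base and 60s ceiling are assumed values for this sketch.
pub fn backoff_delay(attempt: u32) -> Duration {
    let exp = attempt.saturating_sub(1).min(6); // 2^6 = 64 already exceeds the ceiling
    Duration::from_secs((1u64 << exp).min(60))
}

/// Mirrors the `ws_reconnect_attempts` semantics: `0` means retry forever.
pub fn should_retry(failed_attempts: u32, cap: u32) -> bool {
    cap == 0 || failed_attempts < cap
}
```

With the default cap of 5, `should_retry` turns false after the fifth failed attempt, at which point the worker would exit non-zero and leave the restart to the service manager.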
There's no engine-selection knob in the config. The worker advertises
capabilities for every backend compiled into the binary and routes each
incoming job to the first backend that supports its (kind, model) pair
(see MultiEngine).
- `synthetic` (always present, last in the chain): produces deterministic, real WEBP/PNG/WAV/JSON outputs keyed by SHA-256 of the prompt/text/input. No GPU required. Use for smoke-tests, CI, and end-to-end verification of every modality.
- `sd-cpp`: real image inference via `stable-diffusion.cpp` as a subprocess. Self-registers only when the `sd-cli` binary and at least one model's files are present under `models_root`. See `docs/engines/sdcpp.md`.
- `llama`: real LLM inference via `llama.cpp` linked in-process (`llama-cpp-2`). Shipped in the release binaries (and any `--features all` / `--features llama` build); downloads the GGUF named by the offer's `ModelSource` into `<models_root>/llm/` on demand and advertises the `llama-cpp:*` wildcard so a fresh worker is claimable.
- feature-gated heavyweights: `whisper` (STT), `image-candle` (pure-Rust SD), `video`, `tts` drop in via the same trait when their cargo feature is enabled. `whisper` and `llama` each static-link their own `ggml`, which can't coexist in one binary, so `whisper` ships in its own bundle (`all-engines-stt`); the all-backends release pairs `llama` (in-process) with `sd-cli` (subprocess) to sidestep the clash.
When the studio offers a model whose engine isn't compiled into the
worker, the job fails loudly with an actionable message (install the
all-backends release, or rebuild with --features all) rather than
silently producing placeholder bytes.
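The first-match routing and the loud failure described above might look roughly like this. It is a simplified stand-in for `MultiEngine`, not its actual code: the `Backend` struct, the `route` function, and the trailing-`*` matching rule are assumptions for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TaskKind {
    Image,
    Llm,
    AudioStt,
    AudioTts,
    Video,
}

/// A compiled-in backend and the (kind, model pattern) pairs it advertises.
pub struct Backend {
    pub name: &'static str,
    /// A trailing `*` in the pattern matches any suffix (assumed wildcard rule).
    pub supports: Vec<(TaskKind, &'static str)>,
}

fn pattern_matches(pattern: &str, model: &str) -> bool {
    match pattern.strip_suffix('*') {
        Some(prefix) => model.starts_with(prefix),
        None => pattern == model,
    }
}

/// Return the first backend in chain order that supports the pair,
/// or an actionable error instead of falling back to placeholder output.
pub fn route<'a>(chain: &'a [Backend], kind: TaskKind, model: &str) -> Result<&'a Backend, String> {
    chain
        .iter()
        .find(|b| b.supports.iter().any(|(k, p)| *k == kind && pattern_matches(p, model)))
        .ok_or_else(|| {
            format!(
                "no compiled-in engine supports {kind:?} model `{model}`; \
                 install the all-backends release or rebuild with --features all"
            )
        })
}
```

Because the chain is ordered, placing the synthetic backend last means a real engine wins whenever both advertise the same pair.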
Implement the Engine trait under src/engine/ (see SyntheticEngine
and SdCppEngine for examples). An engine declares its capabilities
(per-kind supported models) and a dispatch(model, task) -> TaskResult
function. Wire it into engine::build() behind a cargo feature, e.g.:
```toml
[features]
llama = ["dep:llama-cpp-2"]
```

The trait is already kind-aware so a single binary can host multiple engines (one per modality).
The worker reports two numbers to the API:
- `vramTotalGb`: physical VRAM on the host (probed from `/proc/driver/nvidia` on Linux; `0` when no NVIDIA GPU is present).
- `vramThresholdGb`: the max estimated VRAM per claim, controlled by the operator via `set-threshold` or by editing `config.toml`.
The studio API only hands a job to a worker if job.vramGbEstimate ≤ worker.vramThresholdGb and job.model ∈ worker.supportedModels.
Jobs that no worker can take stay queued until either a suitable worker
appears or the operator cancels.
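The hand-off rule is a two-part predicate. A minimal sketch follows, written in Rust for consistency with the rest of this README even though the check runs on the studio side; the struct and function names are hypothetical, and wildcard entries such as `llama-cpp:*` are deliberately not handled here.

```rust
pub struct Worker {
    pub vram_threshold_gb: f64,
    pub supported_models: Vec<String>,
}

pub struct Job {
    pub vram_gb_estimate: f64,
    pub model: String,
}

/// True when the job fits under the worker's threshold and names a model
/// the worker advertises (exact string match only in this sketch).
pub fn can_claim(worker: &Worker, job: &Job) -> bool {
    job.vram_gb_estimate <= worker.vram_threshold_gb
        && worker.supported_models.iter().any(|m| m == &job.model)
}
```

A job for which `can_claim` is false on every connected worker simply stays queued, as described above.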
A dedicated background task polls the GitHub Releases feed every
auto_update_interval_secs (default 30 min). When a higher semver is
available the worker:
- Confirms no job is currently in flight (per a shared `busy` flag).
- Downloads the cargo-dist installer for the current platform.
- Runs it (it overwrites the binary in place).
- Re-execs itself so the new code takes over.
Set auto_update_enabled = false to opt out. Set
auto_update_prerelease = true to track pre-releases.
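The decision to apply an update combines three inputs: version ordering, the busy flag, and the pre-release setting. The sketch below shows one way to express it with std only; it is not the worker's updater, the names are hypothetical, and the parsing is simplified (pre-release suffixes are only detected, not ordered, and build metadata is not supported).

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Version {
    major: u64,
    minor: u64,
    patch: u64,
}

/// Parse "1.2.3" or "v1.2.3", optionally followed by "-<pre-release>".
/// Returns the core version and whether a pre-release suffix was present.
pub fn parse(tag: &str) -> Option<(Version, bool)> {
    let tag = tag.strip_prefix('v').unwrap_or(tag);
    let (core, pre) = match tag.split_once('-') {
        Some((c, _)) => (c, true),
        None => (tag, false),
    };
    let mut it = core.split('.').map(|p| p.parse::<u64>().ok());
    let v = Version {
        major: it.next()??,
        minor: it.next()??,
        patch: it.next()??,
    };
    if it.next().is_some() {
        return None; // more than three components
    }
    Some((v, pre))
}

/// Apply only when idle, strictly newer, and allowed by the pre-release setting.
pub fn should_apply(current: &str, candidate: &str, busy: bool, allow_prerelease: bool) -> bool {
    let (Some((cur, _)), Some((cand, pre))) = (parse(current), parse(candidate)) else {
        return false; // unparseable tags never trigger an update
    };
    !busy && cand > cur && (allow_prerelease || !pre)
}
```

Checking `busy` inside the same predicate keeps the "never update mid-job" rule in one place rather than scattered across the polling loop.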
The worker batches log entries every second and pushes them as a
logBatch frame over the WS session. The DO ingests them into the
workerLogs D1 table; the studio LogViewer reads them from there.
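A once-per-second batcher can be sketched as a small buffer that releases its contents when the interval has elapsed. This is an illustrative model, not the worker's code: `LogBatcher` and its methods are hypothetical names, and entries are plain strings rather than the real log record type.

```rust
use std::time::{Duration, Instant};

/// Buffers log entries and hands them out at most once per `interval`.
pub struct LogBatcher {
    pending: Vec<String>,
    last_flush: Instant,
    interval: Duration,
}

impl LogBatcher {
    pub fn new(now: Instant, interval: Duration) -> Self {
        LogBatcher { pending: Vec::new(), last_flush: now, interval }
    }

    pub fn push(&mut self, entry: impl Into<String>) {
        self.pending.push(entry.into());
    }

    /// Returns the batch to send as one frame, or `None` when nothing is
    /// pending or the interval has not yet elapsed.
    pub fn take_due(&mut self, now: Instant) -> Option<Vec<String>> {
        if self.pending.is_empty() || now.duration_since(self.last_flush) < self.interval {
            return None;
        }
        self.last_flush = now;
        Some(std::mem::take(&mut self.pending))
    }
}
```

Passing `now` in explicitly, instead of calling `Instant::now()` inside, keeps the batcher testable without sleeping.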
The worker integrates with Sentry for crash + error reporting. Disabled by default — set the following env vars before launching to enable it:
| Env var | Purpose |
|---|---|
| `SENTRY_DSN` | The project DSN. Telemetry stays off when unset. |
| `SENTRY_ENVIRONMENT` | Optional environment tag (defaults to `production`). |
When enabled the worker:
- captures panics automatically (`sentry`'s default panic handler);
- forwards `tracing::error!` events as Sentry events;
- attaches preceding `tracing::warn!` events as breadcrumbs;
- tags every event with the worker's `release` (= `studio-worker@<crate version>`, the Sentry-conventional namespaced form) and hostname (`server_name`).
No DSN is baked into the binary, so the public repo never carries credentials. Performance tracing is intentionally off — Sentry is used purely for error/crash visibility.
```sh
cargo test                              # default (UI) build
cargo test --no-default-features        # headless core
cargo test --features all               # + llama.cpp + candle (needs cmake)
cargo clippy --tests -- -D warnings
cargo fmt --check

# Coverage gates the headless core (UI rendering isn't unit-testable):
cargo llvm-cov --workspace --no-default-features \
  --ignore-filename-regex 'src/main\.rs$|src/engine/sdcpp\.rs$|src/ws/session\.rs$' \
  --summary-only
```

Coverage CI enforces ≥ 90% line coverage on the headless core. Truly-untestable bits excluded from the gate:
- `src/main.rs`: the CLI bootstrap (all logic lives in `lib.rs`).
- `src/engine/sdcpp.rs`, `src/ws/session.rs`: subprocess / live-socket paths exercised by the dev loop, not unit tests.
- the `ui` feature (egui rendering + OS tray glue): not unit-testable; excluded by gating coverage on `--no-default-features`.
- `update::RealRunner::{download, run_installer}`: real network + process spawn (tested through the `UpdateRunner` trait with a fake).
- `update::restart_self`: calls `execvp`, never returns.
- `sys::detect_vram_gb` NVIDIA-specific branch: requires NVIDIA hardware.
Integration tests live under tests/:
- `tests/ws_wire.rs`: round-trip tests for every `WorkerInbound` / `WorkerOutbound` frame against the TS contract.
- `tests/ws_client_contract.rs`: the WS client against a live tokio-tungstenite server (upgrade headers, hello roundtrip, 401 → AuthFailed, close 4001 → AuthFailed, binary-frame rejection, close idempotency).
- `tests/ws_session_full_loop.rs`: end-to-end walk (hello → welcome → LLM offer → accept + completeJson → STT offer → accept + completeJson → clean close).
- `tests/http_contract.rs`: register + multipart `complete` (image + audio) against wiremock.
- `tests/http_errors.rs`: error-status paths for register + multipart `complete` plus the tracing-emission contract.
- `tests/multi_modal.rs`: every `TaskKind` round-trips through the synthetic engine + decoders.
- `tests/auto_update.rs`: release feed parsing + `apply_with` full flow.
- `tests/runtime_helpers.rs`: one-shot CLI helpers via wiremock.
- `tests/runtime_ticks.rs`: auto-update ticks + `run_returns_when_aborted` smoke test that exercises the AuthFailed exit path.
- PRs merge to `main` with conventional-commit titles (`feat:`, `fix:`, `docs:`, etc.; enforced by the Commit lint workflow).
- `release-please` opens a release PR that bumps the version and updates the changelog.
- Merging the release PR creates a git tag.
- The tag triggers the `release.yml` workflow (cargo-dist), which builds binaries for all supported targets and uploads them to the GitHub release alongside `installer.sh` + `installer.ps1` one-liners.
MIT. See LICENSE.
