feat: spawner app + native uv support + deploy lifecycle fix #52
Closed
dgokeeffe wants to merge 32 commits into datasciencemonkey:main from
Updated: Spawner admin bootstrap + self-service provisioning
New commits add the full spawner app with:
This was referenced Mar 11, 2026
datasciencemonkey added a commit that referenced this pull request, Mar 11, 2026
…nCode

OpenCode intermittently sends empty text content blocks in messages, which the Databricks Foundation Model API strictly rejects with "text content blocks must be non-empty" (OpenCode #5028). This adds a LiteLLM proxy running on localhost:4000 inside the container that strips these blocks before they reach the API.

Simpler alternative to PR #52's fork approach: no fork maintenance, proven fix via LiteLLM PR #20384, preserves full AI Gateway/MLflow/UC governance.

Changes:
- setup_litellm.py: new setup script, starts LiteLLM proxy with health check
- setup_opencode.py: route baseURL through localhost:4000 instead of direct
- app.py: add litellm setup step (sequential, before parallel agent setup)
- requirements.txt: add litellm>=1.60
- docs/plans: design document with analysis of PR #52 trade-offs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
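The filtering that the proxy is described as doing can be pictured with a small sketch. This is not the LiteLLM implementation; the function name `strip_empty_text_blocks` is hypothetical, the message shape (a `content` list of `{"type": "text", "text": ...}` blocks) is assumed, and treating whitespace-only text as empty is also an assumption.

```python
from typing import Any


def strip_empty_text_blocks(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return a copy of `messages` with empty text content blocks removed.

    Hypothetical helper: only list-style content is filtered; plain string
    content is passed through untouched.
    """
    cleaned = []
    for message in messages:
        content = message.get("content")
        if isinstance(content, list):
            kept = [
                block
                for block in content
                if not (
                    isinstance(block, dict)
                    and block.get("type") == "text"
                    # Assumption: whitespace-only text counts as empty.
                    and not (block.get("text") or "").strip()
                )
            ]
            # Build a new dict so the caller's message is not mutated.
            message = {**message, "content": kept}
        cleaned.append(message)
    return cleaned
```

A request body would pass through a filter like this once, just before being forwarded upstream; non-text blocks are left alone.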
datasciencemonkey added a commit that referenced this pull request, Mar 11, 2026
Socket.IO reports connected=true even when falling back to HTTP long-polling through the Databricks Apps reverse proxy. The app was prematurely stopping the poll-worker, leaving users with no data transport when true WebSocket wasn't available.

Now checks socket.io.engine.transport.name before deciding:
- 'websocket' → stop poll-worker, use WS as primary
- 'polling' → keep poll-worker active as primary transport
- Listen for late 'upgrade' event if transport upgrades later

Cherry-picked from PR #52 (dgokeeffe). Fixes #54

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
datasciencemonkey added a commit that referenced this pull request, Mar 11, 2026
… (#59)

Socket.IO reports connected=true even when falling back to HTTP long-polling through the Databricks Apps reverse proxy. The app was prematurely stopping the poll-worker, leaving users with no data transport when true WebSocket wasn't available.

Now checks socket.io.engine.transport.name before deciding:
- 'websocket' → stop poll-worker, use WS as primary
- 'polling' → keep poll-worker active as primary transport
- Listen for late 'upgrade' event if transport upgrades later

Cherry-picked from PR #52 (dgokeeffe). Fixes #54

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Install OpenCode from dgokeeffe/opencode fork with native Databricks provider (auto-discovers models, shares Claude Code skills)
- Add GitHub CLI (gh) setup with xterm.js-safe auth wrapper
- Reduce select() timeout 500ms→50ms and poll interval 100ms→50ms
- Add Makefile for deployment automation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace single global sessions_lock block in get_output_batch() with 3-step resolve/swap/join pattern matching get_output(). Snapshot session dict in cleanup_stale_sessions() to iterate with per-session locks. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
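The snapshot-then-per-session-lock idea can be sketched as follows. The names `sessions_lock` and `cleanup_stale_sessions` come from the commit message; the session layout (a dict holding `lock` and `last_seen`) and the parameters are assumptions made for illustration, not the PR's actual code.

```python
import threading
import time
from typing import Optional

sessions_lock = threading.Lock()   # guards the sessions dict itself
sessions: dict[str, dict] = {}     # assumed shape: {"lock": Lock, "last_seen": float}


def cleanup_stale_sessions(max_idle_seconds: float, now: Optional[float] = None) -> list[str]:
    """Remove sessions idle longer than `max_idle_seconds`; return their ids."""
    now = time.time() if now is None else now

    # Step 1: hold the global lock only long enough to copy the dict.
    with sessions_lock:
        snapshot = list(sessions.items())

    # Step 2: inspect each session under its own lock, so a slow session
    # does not block readers of unrelated sessions.
    stale = []
    for session_id, session in snapshot:
        with session["lock"]:
            if now - session["last_seen"] > max_idle_seconds:
                stale.append(session_id)

    # Step 3: retake the global lock briefly to drop the stale entries.
    with sessions_lock:
        for session_id in stale:
            sessions.pop(session_id, None)
    return stale
```

The point of the pattern is that the global lock is never held while a per-session lock is being waited on.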
Socket.IO sets connected=true even when falling back to its own long-polling (Databricks Apps proxy blocks WS upgrade). This stopped the fast poll-worker, routing all output through slow long-polling. Now checks socket.io.engine.transport.name and only stops poll-worker when transport is true 'websocket'. Also listens for late upgrades. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy prd-writer, test-generator, implementer, and build-feature agent definitions to ~/.claude/agents/ during setup. Stripped model overrides so agents inherit the Databricks model serving endpoint. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Admin token handles privileged ops (secret scopes, ACLs, deploy)
- User PAT creates app (ownership) + stored as runtime secret
- SCIM /Me resolves PAT owner to derive app name
- Secret resource included in app creation (no separate PATCH)
- Each PAT stored with unique UUID key
- /api/apps endpoint lists all spawned coding-agents apps
- Makefile for deploy/redeploy with run polling
- README documenting architecture and token model

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Switch to uv run gunicorn in app.yaml (10-100x faster installs)
- Delete requirements.txt, use pyproject.toml + uv.lock exclusively
- Require Python >=3.12, add missing deps (gunicorn, flask-socketio)
- Run setup scripts via uv run python for consistent Python 3.12 env
- Strip DATABRICKS_TOKEN whitespace at startup to fix auth failures
- Add ~/.local/bin to PATH in _run_step for uv/tool discovery

Co-authored-by: Isaac
Co-authored-by: Isaac
Co-authored-by: Isaac
Co-authored-by: Isaac
Co-authored-by: Isaac
Pipe-based workspace import writes empty content. Write to temp file first, then import with --file flag. Co-authored-by: Isaac
Co-authored-by: Isaac
New apps are UNAVAILABLE until first deploy, so waiting for RUNNING causes a deadlock. Retry the deploy call with backoff. Co-authored-by: Isaac
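A retry-with-backoff wrapper of the kind this commit describes might look like the sketch below. `DeployNotReady`, `retry_with_backoff`, and the delay values are hypothetical; the real deploy call and the error it raises are not shown in this PR page.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class DeployNotReady(Exception):
    """Hypothetical error meaning the app cannot accept a deployment yet."""


def retry_with_backoff(
    call: Callable[[], T],
    attempts: int = 6,
    base_delay: float = 2.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call` until it succeeds, doubling the wait after each failure."""
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except DeployNotReady:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(delay)
            delay = min(delay * 2, max_delay)  # exponential, capped
    raise RuntimeError("attempts must be at least 1")
```

Retrying the deploy call itself, instead of waiting for a RUNNING state that cannot be reached before the first deploy, is what avoids the deadlock described above. The `sleep` parameter exists only so the wrapper can be exercised without real waiting.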
Deploy API requires compute_status=ACTIVE (~80s after app creation). Gunicorn timeout bumped to 300s to handle the full provision flow. Co-authored-by: Isaac
d2fb8f5 to 98b68cf
Falls back to direct model serving when no AI Gateway is configured. Co-authored-by: Isaac
Co-authored-by: Isaac
Co-authored-by: Isaac
Prevents fitAddon.fit() from thrashing scroll position on every resize pixel. Adds explicit scrollback and scrollOnUserInput. Co-authored-by: Isaac
Previously skipped install if binary existed, leaving stale versions across redeployments. Co-authored-by: Isaac
Co-authored-by: Isaac
Random UUID secret keys caused re-provisions to store the PAT under a new key while the app still referenced the old one. Users had to enter their PAT twice because the first attempt's secret was orphaned. Co-authored-by: Isaac
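The commit message does not say which key scheme replaced the random UUIDs. One way to make re-provisions land on the same key is to derive it from the user's identity; the sketch below assumes that approach, and both the `secret_key_for` name and the `pat-` prefix are hypothetical.

```python
import uuid


def secret_key_for(user_name: str) -> str:
    """Derive a stable secret key from a user name (assumed scheme).

    uuid5 is deterministic: the same normalized name always maps to the
    same key, so a retry overwrites the earlier secret instead of
    orphaning it.
    """
    normalized = user_name.strip().lower()
    return f"pat-{uuid.uuid5(uuid.NAMESPACE_URL, normalized).hex}"
```

With a scheme like this, a second provision attempt for the same user writes to the key the app already references.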
Secret values stored via `echo | databricks secrets put-secret` include a trailing newline, causing invalid Authorization headers. Co-authored-by: Isaac
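The effect of the trailing newline, and the fix of stripping it before use, can be shown in a few lines. The helper name `bearer_header` is hypothetical, and the sketch assumes a standard `Bearer` Authorization header.

```python
def bearer_header(raw_token: str) -> dict[str, str]:
    """Build an Authorization header from a token read out of a secret.

    Values written with `echo | ...` carry a trailing newline; a newline
    inside a header value makes the header invalid, so strip first.
    """
    token = raw_token.strip()
    if not token:
        raise ValueError("token is empty after stripping whitespace")
    return {"Authorization": f"Bearer {token}"}
```

For example, `bearer_header("abc123\n")` yields `{"Authorization": "Bearer abc123"}`.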
The Databricks Apps API returns state under `app_status`, not `status`. This caused the early-exit check to never detect running apps, and the spawned apps table to always show UNKNOWN. Co-authored-by: Isaac
Provision runs in a background thread so the endpoint returns immediately. UI polls /api/provision-status every 3s showing step-by-step progress with checkmarks. Apps table auto-refreshes every 10s and shows in-flight provisions. Supports multiple concurrent provisions. Co-authored-by: Isaac
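A minimal sketch of the background-thread-plus-status-polling shape described here follows. The function names, the status dict layout, and the step list are assumptions; the spawner's real endpoint handlers and step names are not reproduced on this page.

```python
import threading
import uuid
from typing import Callable, Optional

_status_lock = threading.Lock()
_provisions: dict[str, dict] = {}   # job id -> status record (assumed layout)


def start_provision(steps: list[tuple[str, Callable[[], None]]]) -> str:
    """Start the steps on a background thread and return a job id at once."""
    job_id = uuid.uuid4().hex
    with _status_lock:
        _provisions[job_id] = {"state": "running", "completed": [], "error": None}

    def worker() -> None:
        try:
            for name, step in steps:
                step()
                with _status_lock:
                    _provisions[job_id]["completed"].append(name)
            with _status_lock:
                _provisions[job_id]["state"] = "done"
        except Exception as exc:  # record the failure for the poller
            with _status_lock:
                _provisions[job_id]["state"] = "failed"
                _provisions[job_id]["error"] = str(exc)

    threading.Thread(target=worker, daemon=True).start()
    return job_id


def provision_status(job_id: str) -> Optional[dict]:
    """Return a copy of the job's status, or None for an unknown id."""
    with _status_lock:
        job = _provisions.get(job_id)
        return None if job is None else {**job, "completed": list(job["completed"])}
```

Because each job has its own record keyed by id, several provisions can run at the same time, and a status endpoint only needs to return `provision_status(job_id)` on each poll.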
Co-authored-by: Isaac
The list apps endpoint doesn't return app_status. Derive state from compute_status and active_deployment.status instead. Co-authored-by: Isaac
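Deriving a display state from the fields named above could be sketched like this. The nested `state` keys and the mapping to labels such as `RUNNING` and `DEPLOYING` are assumptions for illustration; only the field names `app_status`, `compute_status`, and `active_deployment.status` come from the commit messages.

```python
def derive_app_state(app: dict) -> str:
    """Pick a display state for an app record (assumed payload shape)."""
    # Prefer app_status when the response includes it (single-app lookups).
    app_state = (app.get("app_status") or {}).get("state")
    if app_state:
        return app_state

    # Otherwise fall back to compute and deployment status (list responses).
    compute = (app.get("compute_status") or {}).get("state")
    deployment = ((app.get("active_deployment") or {}).get("status") or {}).get("state")

    if compute == "ACTIVE" and deployment == "SUCCEEDED":
        return "RUNNING"        # assumed label
    if compute == "ACTIVE" and deployment == "IN_PROGRESS":
        return "DEPLOYING"      # assumed label
    return compute or "UNKNOWN"
```

Falling back in this order keeps the table from showing UNKNOWN for every row when `app_status` is absent.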
Co-authored-by: Isaac
xterm.js intercepts Ctrl+V/C as raw control characters on non-Mac platforms. Added attachCustomKeyEventHandler to let the browser handle Ctrl+V (paste), Ctrl+C (copy when text selected), and Ctrl+Shift+C/V. Also added clipboard section to shortcuts help with platform-aware labels (Cmd on Mac, Ctrl on Windows) and upload toast for image paste. Co-authored-by: Isaac
Runtime image ships an older CLI (v0.251.0). Added a setup step that fetches the latest release from GitHub API and installs it to ~/.local/bin, same pattern as the GitHub CLI install. Co-authored-by: Isaac
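Selecting a download from a GitHub "latest release" response can be sketched without touching the network. The `tag_name`, `assets`, `name`, and `browser_download_url` fields are part of the GitHub REST API release object; the function name and the substring-matching rule for asset file names are assumptions, since the setup script's actual logic is not shown here.

```python
from typing import Optional


def pick_release_asset(release: dict, os_name: str, arch: str) -> Optional[str]:
    """Return the download URL of the first asset matching OS and arch.

    Assumption: asset file names contain the OS and architecture as
    substrings (for example "linux" and "amd64").
    """
    for asset in release.get("assets", []):
        name = asset.get("name", "").lower()
        if os_name.lower() in name and arch.lower() in name:
            return asset.get("browser_download_url")
    return None
```

The returned URL would then be downloaded and unpacked into ~/.local/bin by the setup step; returning `None` lets the caller keep the CLI already on the image when no asset matches.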
Co-authored-by: Isaac
Summary
- Spawner app (spawner/): One-click provisioning UI that creates personal coding-agents instances for any developer. Admin bootstraps once; users paste a PAT and get their own app deployed from a shared template.
- Native uv support: Replaced requirements.txt/pip with pyproject.toml + uv.lock for both the main app and spawner, enabling the platform's native uv sync + uv run flow.
- Deploy lifecycle fix: The deploy API requires compute_status == ACTIVE (~80s after app creation), not app_status == RUNNING. Added polling + bumped gunicorn timeout to 300s to handle the full provision flow.

Test plan
- … daveok workspace
- … /api/provision: all 6 steps complete in ~114s

This pull request was AI-assisted by Isaac.