Run multiple VS Code sandbox instances concurrently with nginx SSL reverse proxy,
groups-based user management, persistent workspaces, and optional local LLM inference
via Lemonade Server.
Features
Unified CLI: thon init, thon setup, thon run — one config file (thon.yaml) for everything
Multi-Instance: Multiple concurrent VS Code sandboxes from a single command
Groups-Based: Define users and groups in YAML or manage via dashboard
Web Dashboard: Streamlit dashboard for instance, group, Lemonade, and gateway management
REST API: FastAPI REST API with Swagger UI for programmatic access
SSL/TLS: Automatic nginx reverse proxy with mkcert or openssl certificates
Persistent Workspaces: PVC Docker volumes or host bind mounts for workspace persistence
Local LLM: Optional Lemonade Server integration for local inference (chat + embedding)
Semantic Indexing: Embedding model for Kilo Code's semantic code search
AI Gateway: Optional APISIX gateway with per-user or per-group rate limiting and API keys
Authentication: Local password for dashboard; OIDC/OAuth2 (GitHub, GitLab, LinkedIn) for REST API
Config Files: Store and manage groups YAML, kilo.jsonc, and VS Code settings in the database
Kilo Code Ready: Auto-generated config with experimental flags and indexing for Kilo Code
| Component | Role |
|-----------|------|
| Streamlit dashboard | Web UI for instance, group, Lemonade, and gateway management (:8501) |
| FastAPI REST API | Programmatic API for instances, groups, Lemonade, gateway, auth (:8100) |
| nginx | SSL termination + WebSocket proxy (per-port server blocks) |
| code-server | VS Code in the browser, runs HTTP inside each sandbox |
| Lemonade Server | Optional local LLM inference (chat + embedding models) |
| APISIX Gateway | Optional rate limiting with per-user or per-group API keys |
Network Modes (auto-detected)
| Mode | Endpoint Format | Detection |
|------|-----------------|-----------|
| Host | 127.0.0.1:8443 | No / after port |
| Bridge | 127.0.0.1:52322/proxy/8443 | /proxy/ in endpoint |
Auto-detected from the server-returned endpoint — not a CLI flag.
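The detection rule in the table can be sketched in a few lines. This is an illustrative reading of the two rows above, not the code main.py actually runs; the function name `detect_network_mode` is hypothetical.

```python
def detect_network_mode(endpoint: str) -> str:
    """Classify a server-returned endpoint as 'host' or 'bridge'.

    Hypothetical helper: it only encodes the two rules from the table.
    """
    # Bridge mode: the endpoint routes through a /proxy/<port> path.
    if "/proxy/" in endpoint:
        return "bridge"
    # Host mode: nothing follows the port (ignore an optional scheme).
    host_port = endpoint.split("://", 1)[-1]
    if "/" not in host_port:
        return "host"
    raise ValueError(f"unrecognised endpoint format: {endpoint!r}")


print(detect_network_mode("127.0.0.1:8443"))
print(detect_network_mode("127.0.0.1:52322/proxy/8443"))
```

Raising on anything that matches neither rule is a design choice of this sketch; the real tool may fall back to a default instead.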
Workspace Persistence
| Mode | Storage | Lifecycle |
|------|---------|-----------|
| PVC Volume | Docker named volume (thon-workspace-*) | Persists across instance recreations |
| Bind Mount | Host directory (--workspace-dir) | Persists on host filesystem |
| Ephemeral | Inside container | Lost when container is removed |
PVC volumes are created automatically when users are imported via the dashboard
or thon.yaml. When a sandbox is recreated, the same PVC volume is reattached.
SSL/TLS
mkcert (preferred): CA-trusted certs, filename includes IP hash
openssl (fallback): Self-signed certs with IP in SAN
Single shared cert for all instances on port 443
CA cert served at https://<ip>/ca.crt for remote clients
# All groups with nginx SSL (default)
python ./scripts/main.py --groups groups.yaml --external-ip 1.2.3.4
# Single group
python ./scripts/main.py --groups groups.yaml --group alpha --external-ip 1.2.3.4
# From database (uses PVC workspace volumes)
python ./scripts/main.py --from-db --external-ip 1.2.3.4
# Per-user passwords
python ./scripts/main.py --groups groups.yaml --secure --external-ip 1.2.3.4
# Persistent workspaces
python ./scripts/main.py --groups groups.yaml --workspace-dir /thon-workspace --external-ip 1.2.3.4
# Direct HTTP (no nginx)
python ./scripts/main.py --groups groups.yaml --no-nginx
# Single instance (no groups)
python ./scripts/main.py
# With Lemonade LLM inference
python ./scripts/main.py --groups groups.yaml --external-ip 1.2.3.4 --lemonade kilo.jsonc
# With AI Gateway (per-user rate limiting)
python ./scripts/main.py --groups groups.yaml --external-ip 1.2.3.4 --gateway
# With AI Gateway (per-group shared API keys)
python ./scripts/main.py --groups groups.yaml --external-ip 1.2.3.4 --gateway --gateway-per-group
# With AI Gateway + Redis rate limiting
python ./scripts/main.py --groups groups.yaml --external-ip 1.2.3.4 --gateway --gateway-redis-host 127.0.0.1
# With custom VS Code settings
python ./scripts/main.py --groups groups.yaml --external-ip 1.2.3.4 --vscode-settings vscode-settings.jsonc
# Cleanup nginx configs
python ./scripts/main.py --cleanup
Dashboard
THON includes a Streamlit-based web dashboard for managing VS Code sandbox instances,
groups, Lemonade Server, and AI Gateway. The FastAPI REST API provides Swagger UI
for programmatic access.
Quick Start
# Install dependencies
pip install streamlit pandas
# Run the dashboard
streamlit run dashboard/streamlit_app.py --server.port 8501
# Dashboard at http://localhost:8501
Optionally run the FastAPI REST API for programmatic access:
python -m app.main
# API docs at http://localhost:8100/docs
Pages
| Page | Features |
|------|----------|
| Instances | List, search/filter, create, pause/resume/kill, bulk actions, recreate with PVC volume |
| Groups | CRUD groups/users, transfer users, start per-user/group instances with PVC workspaces |
| Lemonade Server | Status, health, performance, slots, system info, available models |
| Settings | External IP, configuration file management (upload/edit/delete from DB) |
Configuration Files in Database
The Settings page stores config files in the database. When main.py runs without
CLI flags, it reads these from the database. Priority: CLI flag > database > none.
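The priority order (CLI flag > database > none) amounts to a simple fallback chain. The sketch below assumes the database is exposed as a plain mapping from config key to file content; `resolve_config` and that mapping are illustrative stand-ins, not the project's real storage layer.

```python
from typing import Mapping, Optional


def resolve_config(
    key: str,
    cli_value: Optional[str],
    db: Mapping[str, str],
) -> Optional[str]:
    """Return the config to use for `key`: CLI flag first, then database, else None."""
    if cli_value is not None:
        return cli_value      # an explicit CLI flag always wins
    return db.get(key)        # stored value, or None when nothing is stored


# Assumed in-memory stand-in for the database rows.
stored = {"config_groups_yaml": "groups: {}"}

print(resolve_config("config_groups_yaml", None, stored))           # from database
print(resolve_config("config_groups_yaml", "groups.yaml", stored))  # CLI flag wins
print(resolve_config("config_kilo_json", None, stored))             # nothing configured
```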
| Config Key | Description |
|------------|-------------|
| config_groups_yaml | Groups and users definition |
| config_kilo_json | Kilo Code provider config |
| config_vscode_settings | VS Code settings for each sandbox |
Lemonade Server (Local LLM Inference)
Provides an OpenAI-compatible API endpoint for VS Code extensions (Kilo Code, Continue, Cline)
inside sandbox containers. Runs as a systemd service on the host. Supports both chat
and embedding models for semantic code search.
Setup
# Full setup (install + configure + API keys + pull model + kilo.jsonc)
bash ./scripts/setup-lemonade.sh \
--groups groups.yaml --generate-keys --external-ip 1.2.3.4
Or use the Python wrapper:
python ./lemonade_server.py run \
--groups groups.yaml --generate-keys --external-ip 1.2.3.4
The embedding model enables Kilo Code's semantic code search. Enabled by default;
disable with --no-embedding. When enabled, max_loaded_models is automatically
set to 2 (1 chat + 1 embedding).
Per-User Scaling
When --groups groups.yaml is passed, context size and parallel slots scale automatically:
| Parameter | Chat Model | Embedding Model |
|-----------|------------|-----------------|
| ctx_size | 262144 per user | 32768 per user |
| -np | num_users | num_users |
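As a worked example of the scaling table, the sketch below computes the values for a given user count. It assumes that "per user" means the total ctx_size is the per-user figure multiplied by the number of users, so that each parallel slot keeps the full per-user context; that reading, and the names `ModelScaling` and `scale_for_users`, are assumptions rather than something the setup script is confirmed to do.

```python
from dataclasses import dataclass
from typing import Dict

CHAT_CTX_PER_USER = 262144       # from the table above
EMBEDDING_CTX_PER_USER = 32768   # from the table above


@dataclass(frozen=True)
class ModelScaling:
    ctx_size: int         # total context size passed to the model
    parallel_slots: int   # value for -np


def scale_for_users(num_users: int) -> Dict[str, ModelScaling]:
    """Compute ctx_size and -np for both models (assumed: total = per-user * users)."""
    if num_users < 1:
        raise ValueError("num_users must be at least 1")
    return {
        "chat": ModelScaling(CHAT_CTX_PER_USER * num_users, num_users),
        "embedding": ModelScaling(EMBEDDING_CTX_PER_USER * num_users, num_users),
    }


for name, scaling in scale_for_users(4).items():
    print(name, scaling)
```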
Lemonade-managed args (reserved, must NOT appear in llamacpp_args):
--ctx-size, -c, -ngl, --gpu-layers, --n-gpu-layers, --jinja, --no-jinja,
--model, -m, --port, --embedding, --embeddings, --mmproj*, --rerank*
Builds llama.cpp with ROCm/HIP for gfx942 and installs to /usr/local. The Lemonade
config uses prefer_system: true with rocm_bin: /usr/local/bin/llama-server by default.
Kilo Code Integration
setup-lemonade.sh --generate-keys creates API keys and writes kilo.jsonc
kilo.jsonc contains: provider (lemonade), base URL, API key, model ID (user.gemma-4-31b-it),
experimental flags, and indexing config for semantic code search
Base URL resolution: --external-ip > Docker bridge gateway > localhost
main.py --lemonade kilo.jsonc injects config into each sandbox at /home/vscode/.config/kilo/config.json
Kilo Code reads the config and connects to the Lemonade server
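The base URL resolution order (--external-ip > Docker bridge gateway > localhost) can be sketched as a fallback over optional values. Only the host part is resolved here, because the port and path of the Lemonade endpoint are not stated in this section; `resolve_base_host` is a hypothetical name, and how the bridge gateway address is discovered is left to the caller.

```python
from typing import Optional


def resolve_base_host(
    external_ip: Optional[str] = None,
    bridge_gateway: Optional[str] = None,
) -> str:
    """Pick the host sandboxes should use to reach the Lemonade server."""
    if external_ip:
        return external_ip       # highest priority: --external-ip
    if bridge_gateway:
        return bridge_gateway    # next: Docker bridge gateway address
    return "localhost"           # last resort


print(resolve_base_host(external_ip="1.2.3.4", bridge_gateway="172.17.0.1"))
print(resolve_base_host(bridge_gateway="172.17.0.1"))
print(resolve_base_host())
```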
Full Workflow
# Terminal 1: Set up Lemonade server with groups-based scaling
bash ./scripts/setup-lemonade.sh --groups groups.yaml --generate-keys --external-ip 1.2.3.4
# Terminal 2: Start VS Code sandboxes with Lemonade inference
python ./scripts/main.py --groups groups.yaml --external-ip 1.2.3.4 --lemonade kilo.jsonc
AI Gateway (APISIX Rate Limiting)
An optional APISIX API Gateway provides token-based rate limiting and per-consumer API keys
for LLM endpoints. Creates two routes: /v1/chat/completions (ai-proxy-multi) and
/v1/embeddings (upstream proxy for semantic indexing).
Consumer Modes
| Mode | Description | Best For |
|------|-------------|----------|
| per-user (default) | Each user gets own API key and rate limit | Individual accountability |
| per-group | Each group shares one API key with combined limit (rate_limit × num_users) | |
# Per-user: each user gets their own API key and rate limit
python ./scripts/main.py --groups groups.yaml --external-ip 1.2.3.4 --gateway
# Per-group: shared API key per group
python ./scripts/main.py --groups groups.yaml --external-ip 1.2.3.4 --gateway --gateway-per-group
# With Redis-backed rate limiting
python ./scripts/main.py --groups groups.yaml --external-ip 1.2.3.4 --gateway --gateway-redis-host 127.0.0.1
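To make the difference between the two consumer modes concrete, the sketch below derives the consumers and their limits from a groups mapping. The shape of `groups` (group name to list of user names), the unit of `rate_limit`, and the function `plan_consumers` are all assumptions for illustration; only the formula rate_limit × num_users comes from the table above.

```python
from typing import Dict, List


def plan_consumers(
    groups: Dict[str, List[str]],
    rate_limit: int,
    per_group: bool = False,
) -> Dict[str, int]:
    """Map each gateway consumer name to its rate limit."""
    if per_group:
        # One shared consumer per group, limit scaled by group size.
        return {group: rate_limit * len(users) for group, users in groups.items()}
    # One consumer per user, each with the base limit.
    return {user: rate_limit for users in groups.values() for user in users}


groups = {"alpha": ["ann", "bob"], "beta": ["cy"]}
print(plan_consumers(groups, rate_limit=100))
print(plan_consumers(groups, rate_limit=100, per_group=True))
```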
Rate Limiting Modes
| Mode | Redis Host | Policy | Scope |
|------|------------|--------|-------|
| Local | (not set) | local | Per-gateway-instance counters |
| Redis | 127.0.0.1 | redis | Shared across all gateway instances |
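The table implies that the policy follows from whether a Redis host was supplied. A minimal sketch of that selection is shown below; the returned dictionary keys mirror the table columns and are not claimed to match the exact plugin configuration that main.py sends to APISIX.

```python
from typing import Dict, Optional


def rate_limit_backend(redis_host: Optional[str] = None) -> Dict[str, Optional[str]]:
    """Choose the rate limiting policy from the (optional) Redis host."""
    if redis_host:
        # Counters live in Redis and are shared by all gateway instances.
        return {"policy": "redis", "redis_host": redis_host}
    # Counters are kept in memory, separately per gateway instance.
    return {"policy": "local", "redis_host": None}


print(rate_limit_backend())
print(rate_limit_backend("127.0.0.1"))
```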
When enabled, main.py generates a gateway-aware kilo.jsonc that points to the
gateway instead of directly to Lemonade. In per-group mode, all users in the same
group receive the same kilo.jsonc with the shared group API key.
Security
Sandbox Instances
| Flag | code-server auth | Password |
|------|------------------|----------|
| (default) | --auth none | None |
| --secure | --auth password | Auto-generated per-user (24-char token) |
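One way to produce a 24-character per-user token is Python's `secrets` module. This is a sketch of the idea only; the source does not say how main.py generates its tokens, and `generate_instance_password` is a hypothetical name.

```python
import secrets


def generate_instance_password() -> str:
    """Return a random 24-character URL-safe token.

    18 random bytes encode to exactly 24 base64url characters with no padding.
    """
    return secrets.token_urlsafe(18)


token = generate_instance_password()
print(len(token))
```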
Dashboard Authentication
Two independent mechanisms:
| Method | Scope | Mechanism |
|--------|-------|-----------|
| Local Password | Streamlit dashboard | AUTH_LOCAL_PASSWORD=mysecret (single shared password) |
| OIDC/OAuth2 | FastAPI REST API | GitHub, GitLab, or LinkedIn via PKCE flow |
# Local password for dashboard
AUTH_LOCAL_PASSWORD=mysecret streamlit run dashboard/streamlit_app.py --server.port 8501
# OIDC for REST API
AUTH_ENABLED=true \
AUTH_SESSION_SECRET=$(openssl rand -hex 32) \
AUTH_GITHUB_CLIENT_ID=xxx \
AUTH_GITHUB_CLIENT_SECRET=xxx \
python -m app.main
REST API Endpoints
The FastAPI REST API on port 8100 provides Swagger UI at /docs.
Instances
| Method | Path | Description |
|--------|------|-------------|
| GET | /api/instances | List instances (filter by state, paginate) |
| POST | /api/instances | Create new instance |
| POST | /api/instances/{id}/pause | Pause instance |
| POST | /api/instances/{id}/resume | Resume instance |
| DELETE | /api/instances/{id} | Terminate instance |
| POST | /api/instances/{id}/renew | Extend TTL |
| POST | /api/instances/bulk/pause | Bulk pause |
| POST | /api/instances/bulk/resume | Bulk resume |
| POST | /api/instances/bulk/kill | Bulk terminate |
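A client for these endpoints can be sketched with the standard library alone. The example below only builds the requests and does not send them. The query parameter name `state` is an assumption (the table says instances can be filtered by state but does not name the parameter), as are the helper names and the placeholder instance ID; the Swagger UI at /docs is the authoritative reference for the real parameters.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

API_BASE = "http://localhost:8100"  # REST API port from this README


def list_instances_request(state: Optional[str] = None, base: str = API_BASE) -> Request:
    """Build GET /api/instances, optionally filtered by an assumed `state` parameter."""
    url = f"{base}/api/instances"
    if state:
        url += "?" + urlencode({"state": state})
    return Request(url, method="GET")


def pause_instance_request(instance_id: str, base: str = API_BASE) -> Request:
    """Build POST /api/instances/{id}/pause."""
    return Request(f"{base}/api/instances/{instance_id}/pause", method="POST")


req = list_instances_request(state="paused")
print(req.get_method(), req.full_url)
req = pause_instance_request("abc123")  # "abc123" is a placeholder ID
print(req.get_method(), req.full_url)
```

Passing either request to `urllib.request.urlopen` would send it; when AUTH_ENABLED is set, the session handling is not covered by this sketch.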
Groups
| Method | Path | Description |
|--------|------|-------------|
| GET | /api/groups | List all groups with users |
| POST | /api/groups | Create a new group |
| GET | /api/groups/export | Export groups as YAML dict |
| PUT | /api/groups/{group_id} | Rename a group |
| DELETE | /api/groups/{group_id} | Delete a group and its users |
| POST | /api/groups/{group_id}/users | Add a user to a group |
| DELETE | /api/groups/{group_id}/users/{user_id} | Delete a user |
| POST | /api/groups/{group_id}/users/{user_id}/transfer | Transfer user to another group |
Config Files
| Method | Path | Description |
|--------|------|-------------|
| GET | /api/config-files | List all config file slots |
| GET | /api/config-files/{key} | Get config file content |
| PUT | /api/config-files/{key} | Update config file content |
| POST | /api/config-files/{key}/upload | Upload a config file |
| DELETE | /api/config-files/{key} | Delete a config file |
Lemonade
| Method | Path | Description |
|--------|------|-------------|
| GET | /api/lemonade/status | Server status |
| GET | /api/lemonade/models | Available models |
| GET | /api/lemonade/health | Proxy: server health |
| GET | /api/lemonade/stats | Proxy: performance stats |
| GET | /api/lemonade/slots | Proxy: slot states |
| POST | /api/lemonade/pull | Proxy: pull a model |
| POST | /api/lemonade/load | Proxy: load a model |
| POST | /api/lemonade/unload | Proxy: unload a model |
Gateway
| Method | Path | Description |
|--------|------|-------------|
| GET | /api/gateway/status | Gateway status |
| GET | /api/gateway/consumers | List consumers |
| POST | /api/gateway/consumers | Create consumer |
| DELETE | /api/gateway/consumers/{username} | Delete consumer |
| POST | /api/gateway/setup | Full setup |
| POST | /api/gateway/cleanup | Remove all consumers and routes |
Auth
| Method | Path | Description |
|--------|------|-------------|
| GET | /api/auth/providers | List enabled OIDC/OAuth providers |
| GET | /api/auth/login/{provider} | Start OAuth flow |
| GET | /api/auth/callback/{provider} | OAuth callback |
| POST | /api/auth/logout | End session |
| GET | /api/auth/me | Current user info |
Troubleshooting
Service Worker SSL Error
SecurityError: Failed to register a ServiceWorker — An SSL certificate error occurred
Fix: Use mkcert CA-trusted certs. Remote clients must download and import the
CA root from https://<ip>/ca.crt.
Bad Gateway (502)
Caused by passing --base-path to code-server or by including an upstream path in proxy_pass.
Do NOT use --base-path, and make sure the proxy_pass URL ends with a bare / and no further path.
Model Not Found (404)
The user. prefix is required for user-registered models. Kilo Code should send
user.gemma-4-31b-it as the model name, not gemma-4-31b-it.
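If model names are assembled programmatically, a small guard can add the prefix when it is missing. This is an illustrative helper (`qualify_model_name` is a hypothetical name), and it assumes every model it is given is a user-registered one.

```python
USER_MODEL_PREFIX = "user."


def qualify_model_name(name: str) -> str:
    """Ensure a user-registered model name carries the required 'user.' prefix."""
    if name.startswith(USER_MODEL_PREFIX):
        return name
    return USER_MODEL_PREFIX + name


print(qualify_model_name("gemma-4-31b-it"))
print(qualify_model_name("user.gemma-4-31b-it"))
```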
Reserved llama.cpp Arguments
Lemonade manages these arguments internally and rejects them in llamacpp_args:
-ngl, --jinja, --ctx-size, -c, -m, --port, --mmproj*, --rerank*
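A pre-flight check can catch reserved arguments before Lemonade rejects them. The sketch below uses the full reserved list from the Lemonade Server section and assumes llamacpp_args is a single shell-style string; `find_reserved_args` is a hypothetical helper, not part of Lemonade or THON.

```python
import shlex
from typing import List

# Exact flags reserved by Lemonade (from the list in this README).
RESERVED_EXACT = {
    "--ctx-size", "-c", "-ngl", "--gpu-layers", "--n-gpu-layers",
    "--jinja", "--no-jinja", "--model", "-m", "--port",
    "--embedding", "--embeddings",
}
# Wildcard entries (--mmproj*, --rerank*) are matched by prefix.
RESERVED_PREFIXES = ("--mmproj", "--rerank")


def find_reserved_args(llamacpp_args: str) -> List[str]:
    """Return the reserved flags that appear in an llamacpp_args string."""
    found = []
    for token in shlex.split(llamacpp_args):
        flag = token.split("=", 1)[0]  # also handle the --flag=value form
        if flag in RESERVED_EXACT or flag.startswith(RESERVED_PREFIXES):
            found.append(flag)
    return found


print(find_reserved_args("--threads 8 -ngl 99 --mmproj-url x"))
print(find_reserved_args("--threads 8 --flash-attn"))
```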
Embedding Model Not Loading
Check max_loaded_models is at least 2 in config.json
Streamlit dashboard: 5 pages with sidebar navigation
dashboard/streamlit_styles.py
Dark theme CSS injection
Config
| File | Purpose |
|------|---------|
| config/groups.yaml.example | Groups and users configuration template |
| config/kilo.jsonc.example | Kilo Code config template |
| config/vscode-settings.jsonc.example | VS Code settings template |
Dockerfile
Sandbox image: python:3.12-slim + code-server
About
The Hackathon Organizer Node (THON) -- Run multiple VS Code sandbox instances concurrently with nginx SSL reverse proxy, groups-based user management, persistent workspaces, and optional local LLM inference via Lemonade Server.