A browser-based dashboard for monitoring remote GPU clusters over SSH. Supports two cluster types — RunAI (Salk, password auth) and DSMLP (UCSD, SSH key + Duo 2FA) — connectable independently or simultaneously. View GPU/CPU/RAM metrics, running processes, run terminal commands, and launch Claude Code sessions from a single browser window.
- Dual-mode login — Dropdown selector to connect RunAI, DSMLP, or both; only connected clusters appear in the dashboard
- Overview — All connected clusters at a glance with GPU utilization, memory, CPU load, RAM, and disk usage
- GPU Detail — Per-GPU stats from
nvidia-smi: utilization, memory, temperature, power, and running processes - Process Viewer — Sortable, filterable process table (
ps aux) per cluster - Interactive Terminal — Tabbed terminal emulator (xterm.js) with one tab per connected cluster, auto-cd to project directory and
screen -d -rto attach sessions - Claude Code — Dedicated tab to launch Claude Code on any connected cluster with a file explorer sidebar and markdown viewer
- File Explorer — Browse remote project files with rendered markdown previews
- Dark / Light Mode — VS Code-inspired themes, saved across sessions
RunAI: browser → FastAPI → SSH (password) → cluster
DSMLP: browser → FastAPI → SSH (key + Duo) → jump box → kubectl exec → pod
jump box → kubesh → pod (terminal)
- Backend: Python FastAPI with paramiko for SSH
- Frontend: Vanilla HTML/CSS/JS with xterm.js and marked.js loaded from CDN (no npm, no build step)
- Auth: RunAI uses SSH password (held in memory only); DSMLP uses SSH key + Duo 2FA to a jump box
conda env create -f environment.yml
conda activate gpu-dashboarduvicorn app:app --reload --host 0.0.0.0 --port 8000Open http://localhost:8000. Use the dropdown to select RunAI, DSMLP, or Both, then click Connect. For DSMLP, approve the Duo push on your phone. The dashboard shows only the clusters you connected to.
All settings live in config.yaml:
project:
directory: /home/jovyan/vast/kaiwen/track-mjx # file explorer root, terminal auto-cd
screen_session: train-vqvae # auto-attach in Terminal tab (screen -d -r)
claude_screen_session: claude # screen session for Claude tab
claude_user: devuser # su to this user for Claude
clusters:
my-cluster:
host: 10.0.0.1
port: 22
username: rootproject.directory— file explorer root, terminal auto-cd, and Claude tab working directoryproject.screen_session— auto-attached viascreen -d -rin Terminal tab if runningclusters— SSH targets (host, port, username), connected via password entered at login
DSMLP uses SSH key + Duo 2FA to a jump box (dsmlp-login.ucsd.edu), then manages Kubernetes pods for GPU access.
dsmlp:
host: dsmlp-login.ucsd.edu
port: 22
username: kbian
key_path: ~/.ssh/id_ed25519 # SSH key for jump box
launch_command: "launch-scipy-ml.sh -W DSC180A_FA25_A00 -g 4 -p low -c 8 -m 64 -n 31 -b"
project:
directory: private/MOSAIC # relative to home, or absolute path
screen_session: train-vqvae
claude_screen_session: claude
claude_user: devuser
auth_guide: "https://github.com/KevinBian107/DSC180-Toolbox"key_path— path to SSH private key (Ed25519)launch_command— command to launch a new pod if none is runningdsmlp.project— same fields as RunAIproject, but for the DSMLP pod environmentdirectory— can be relative (e.g.private/MOSAIC) or absolute
SSH key setup: See the DSC180 Toolbox guide for generating and registering your SSH key with DSMLP.
Pod lifecycle:
- Connect — SSH key auth + Duo 2FA (auto-sends push) to the jump box
- Auto-detect — Checks
kubectl get podsfor an existing Running pod - Launch — If no pod exists, shows "Launch Pod" button; runs the configured
launch_commandand polls until Running - Access — Commands run via
kubectl exec, terminals viakubesh
config.yaml # All configuration (clusters, project, server, dsmlp)
config.py # Loads config.yaml into Python (exports SERVER, PROJECT, CLUSTERS, DSMLP)
app.py # FastAPI app — REST + WebSocket endpoints for both RunAI and DSMLP
ssh_manager.py # SSH connection pool: RunAI (password) + DSMLP (key/Duo/kubectl)
environment.yml # Conda environment (fastapi, uvicorn, paramiko, pyyaml)
static/
index.html # Single-page dashboard UI with dropdown login
style.css # VS Code-inspired dark/light theme styles
app.js # Frontend: dual-mode login, metrics, terminals, file explorer
docs/
vqvae.md # VQVAE training quick reference