v2026.10.43 — VRAM Orchestrator + NVIDIA GPU Skills
What's New
VRAM Orchestrator (v2026.10.43)
Automatic GPU VRAM management for NVIDIA GPUs (tested on RTX 5090, 32GB).
- nvidia-smi polling — periodic GPU state monitoring (VRAM usage, temp, utilization)
- Model auto-swap — when GPU services need VRAM, the orchestrator downgrades the LLM to a smaller model, then upgrades back when done
- Time-bounded leases — services reserve VRAM with auto-expiry, preventing memory hogging
- Async mutex — all VRAM operations serialized, no race conditions
- Emergency OOM — auto-unloads all models if free VRAM drops below the critical threshold
- 3 agent tools: vram_status, vram_acquire, vram_release
- 4 API endpoints: GET /api/vram, POST /api/vram/acquire, POST /api/vram/release, GET /api/vram/check
- Config schema: vram.* section with services budget map, thresholds, auto-upgrade
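To illustrate how time-bounded leases and an async mutex can work together, here is a minimal sketch. It is not the orchestrator's actual code: the class names (AsyncMutex, VramLeaseManager), the method signatures, and the megabyte-based budget are all assumptions made for the example.

```typescript
// Illustrative sketch only. All names and signatures here are hypothetical,
// not the project's real API.

type Lease = { mb: number; expiresAt: number };

/** Promise-chain mutex: each task starts only after the previous one settles. */
class AsyncMutex {
  private tail: Promise<void> = Promise.resolve();

  runExclusive<T>(fn: () => Promise<T>): Promise<T> {
    const result = this.tail.then(fn);
    // Swallow the outcome so one failed task does not block later ones.
    this.tail = result.then(() => undefined, () => undefined);
    return result;
  }
}

class VramLeaseManager {
  private leases = new Map<string, Lease>();
  private mutex = new AsyncMutex();
  private totalMb: number;
  private now: () => number;

  constructor(totalMb: number, now: () => number = Date.now) {
    this.totalMb = totalMb;
    this.now = now; // injectable clock, useful for tests
  }

  /** Drop expired leases. Call only while holding the mutex. */
  private sweep(): void {
    const t = this.now();
    this.leases.forEach((lease, service) => {
      if (lease.expiresAt <= t) this.leases.delete(service);
    });
  }

  private reservedMb(): number {
    let sum = 0;
    this.leases.forEach((lease) => { sum += lease.mb; });
    return sum;
  }

  /** Reserve `mb` for `service` for at most `ttlMs`. Resolves to false if it does not fit. */
  acquire(service: string, mb: number, ttlMs: number): Promise<boolean> {
    return this.mutex.runExclusive(async () => {
      this.sweep();
      // A repeat acquire by the same service replaces (renews) its lease.
      const alreadyHeld = this.leases.get(service)?.mb ?? 0;
      if (this.reservedMb() - alreadyHeld + mb > this.totalMb) return false;
      this.leases.set(service, { mb, expiresAt: this.now() + ttlMs });
      return true;
    });
  }

  /** Release a lease early. Resolves to false if none was held. */
  release(service: string): Promise<boolean> {
    return this.mutex.runExclusive(async () => this.leases.delete(service));
  }

  /** Unreserved budget after expired leases are swept. */
  freeMb(): Promise<number> {
    return this.mutex.runExclusive(async () => {
      this.sweep();
      return this.totalMb - this.reservedMb();
    });
  }
}
```

Because every operation goes through runExclusive, two concurrent acquire calls cannot both pass the budget check against the same free space. Expiry is enforced lazily on the next operation, so a service that never calls release still gives its reservation back once the lease runs out.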
NVIDIA GPU Skills (v2026.10.42)
- cuOpt — GPU-accelerated vehicle routing optimization via NVIDIA cuOpt v26.02 async API (tested live, VRP solved in 74ms)
- AI-Q Research — Deep research via Nemotron Super 49B NIM API with citation extraction
- OpenShell Sandbox — K3s-based secure code execution with declarative YAML policies
- All gated behind TITAN_NVIDIA=1 env var
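As a rough illustration of env-var gating, the sketch below registers the NVIDIA skills only when TITAN_NVIDIA is set to 1. Only the variable name and value come from these notes. The helper names, the skill identifiers, and the choice to treat every value other than "1" as disabled are assumptions.

```typescript
// Illustrative sketch only. Helper names and skill identifiers are hypothetical.

type Env = Record<string, string | undefined>;

// Placeholder identifiers for the three skills listed above.
const NVIDIA_SKILLS: string[] = ["cuopt", "aiq-research", "openshell-sandbox"];

/** True only when TITAN_NVIDIA is exactly "1" (assumed strict match). */
function nvidiaSkillsEnabled(env: Env): boolean {
  return env["TITAN_NVIDIA"] === "1";
}

/** Return the base skills, plus the NVIDIA skills when the gate is open. */
function visibleSkills(baseSkills: string[], env: Env): string[] {
  return nvidiaSkillsEnabled(env) ? baseSkills.concat(NVIDIA_SKILLS) : baseSkills;
}
```

In a Node process, process.env could be passed as the env argument. Taking the environment as a parameter keeps the gate easy to test.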
Fixes (v2026.10.41-42)
- Voice mic leak — VoiceOverlay cleanup
- Tool visibility — security.allowedTools default
- OpenAI-compat keepModelPrefix bug
- 6 TypeScript type errors
- Voice system prompt rewrite (500 tokens vs 3000+)
Stats
- 4,321 tests across 135 files (all passing)
- 0 TypeScript errors, 0 ESLint errors
- ~155 tools across 100+ skills