v2026.10.43 — VRAM Orchestrator + NVIDIA GPU Skills

Djtony707 released this 17 Mar 04:25


1efb194

What's New

VRAM Orchestrator (v2026.10.43)

Automatic GPU VRAM management for NVIDIA GPUs (tested on an RTX 5090 with 32 GB of VRAM).

nvidia-smi polling — periodic GPU state monitoring (VRAM usage, temp, utilization)
Model auto-swap — when GPU services need VRAM, the orchestrator downgrades the LLM to a smaller model, then upgrades back when done
Time-bounded leases — services reserve VRAM with auto-expiry, preventing memory hogging
Async mutex — all VRAM operations serialized, no race conditions
Emergency OOM — auto-unloads all models if free VRAM drops below the critical threshold
3 agent tools: vram_status, vram_acquire, vram_release
4 API endpoints: GET /api/vram, POST /api/vram/acquire, POST /api/vram/release, GET /api/vram/check
Config schema: vram.* section with services budget map, thresholds, auto-upgrade
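
To illustrate how time-bounded leases and the async mutex could fit together, here is a minimal sketch. It is not the orchestrator's actual code: the names `AsyncMutex`, `VramLeases`, `budgetMb`, and `ttlMs` are hypothetical, and the real `vram.*` config and tool signatures may differ. The sketch assumes a single VRAM budget in megabytes and treats a lease as expired once its deadline has passed.

```typescript
// Hypothetical sketch: none of these names are taken from the release itself.

/** Serializes critical sections by chaining each task onto one promise. */
class AsyncMutex {
  private tail: Promise<void> = Promise.resolve();

  runExclusive<T>(task: () => T | Promise<T>): Promise<T> {
    const result = this.tail.then<T>(() => task());
    // Keep the chain usable even if this task rejects.
    this.tail = result.then(
      () => undefined,
      () => undefined,
    );
    return result;
  }
}

interface Lease {
  megabytes: number;
  expiresAt: number; // epoch milliseconds
}

/** Tracks per-service VRAM reservations that expire on their own. */
class VramLeases {
  private leases = new Map<string, Lease>();
  private mutex = new AsyncMutex();
  private budgetMb: number;
  private now: () => number;

  constructor(budgetMb: number, now: () => number = () => Date.now()) {
    this.budgetMb = budgetMb;
    this.now = now;
  }

  /** Removes every lease whose deadline has passed. */
  private purgeExpired(): void {
    const t = this.now();
    this.leases.forEach((lease, service) => {
      if (lease.expiresAt <= t) this.leases.delete(service);
    });
  }

  /** Resolves to true if the reservation fits in the remaining budget. */
  acquire(service: string, megabytes: number, ttlMs: number): Promise<boolean> {
    return this.mutex.runExclusive(() => {
      this.purgeExpired();
      let reserved = 0;
      this.leases.forEach((lease, owner) => {
        // A service renewing its own lease does not count against itself.
        if (owner !== service) reserved += lease.megabytes;
      });
      if (reserved + megabytes > this.budgetMb) return false;
      this.leases.set(service, { megabytes, expiresAt: this.now() + ttlMs });
      return true;
    });
  }

  /** Resolves to true if the service held a lease that was removed. */
  release(service: string): Promise<boolean> {
    return this.mutex.runExclusive(() => this.leases.delete(service));
  }
}
```

Because every `acquire` and `release` runs through `runExclusive`, two services that request VRAM at the same moment are evaluated one after the other against a consistent lease table. The injectable `now` clock is a design choice for testability and is not implied by the release notes.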

NVIDIA GPU Skills (v2026.10.42)

cuOpt — GPU-accelerated vehicle routing optimization via NVIDIA cuOpt v26.02 async API (tested live, VRP solved in 74ms)
AI-Q Research — Deep research via Nemotron Super 49B NIM API with citation extraction
OpenShell Sandbox — K3s-based secure code execution with declarative YAML policies
All gated behind TITAN_NVIDIA=1 env var
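
As a rough illustration of what gating behind an environment variable can look like, the sketch below hides a set of skills unless `TITAN_NVIDIA` is set to exactly `1`. The skill identifiers and the function names `nvidiaSkillsEnabled` and `visibleSkills` are hypothetical, and the project may treat other values of the variable differently.

```typescript
// Hypothetical sketch: skill ids and function names are illustrative only.

type Env = Record<string, string | undefined>;

// Assumed identifiers for the three NVIDIA skills listed above.
const NVIDIA_SKILLS = ["cuopt", "aiq-research", "openshell-sandbox"];

/** Returns true only when TITAN_NVIDIA is set to exactly "1". */
function nvidiaSkillsEnabled(env: Env): boolean {
  return env["TITAN_NVIDIA"] === "1";
}

/** Filters out the NVIDIA skills unless the flag is enabled. */
function visibleSkills(all: string[], env: Env): string[] {
  if (nvidiaSkillsEnabled(env)) return all;
  return all.filter((skill) => NVIDIA_SKILLS.indexOf(skill) === -1);
}
```

In a Node.js process the `env` argument would typically be `process.env`; passing it in explicitly keeps the check easy to exercise without touching the real environment.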

Fixes (v2026.10.41-42)

Voice mic leak — VoiceOverlay cleanup
Tool visibility — security.allowedTools default
OpenAI-compat keepModelPrefix bug
6 TypeScript type errors
Voice system prompt rewrite (500 tokens, down from 3,000+)

Stats

4,321 tests across 135 files (all passing)
0 TypeScript errors, 0 ESLint errors
~155 tools across 100+ skills
