v2026.10.43 — VRAM Orchestrator + NVIDIA GPU Skills

Djtony707 released this 17 Mar 04:25
· 736 commits to main since this release

What's New

VRAM Orchestrator (v2026.10.43)

Automatic GPU VRAM management for NVIDIA GPUs (tested on an RTX 5090 with 32 GB).

  • nvidia-smi polling: periodic GPU state monitoring (VRAM usage, temperature, utilization)
  • Model auto-swap: when GPU services need VRAM, the orchestrator downgrades the LLM to a smaller model, then upgrades it again once the services are done
  • Time-bounded leases: services reserve VRAM with automatic expiry, so a stalled service cannot hold memory indefinitely
  • Async mutex: all VRAM operations are serialized, so concurrent requests cannot race
  • Emergency OOM: auto-unloads all models if free VRAM drops below the critical threshold
  • 3 agent tools: vram_status, vram_acquire, vram_release
  • 4 API endpoints: GET /api/vram, POST /api/vram/acquire, POST /api/vram/release, GET /api/vram/check
  • Config schema: vram.* section with services budget map, thresholds, auto-upgrade
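As a rough illustration of how time-bounded leases and a serializing async mutex can fit together, here is a minimal sketch. It is not the orchestrator's actual code: `AsyncMutex`, `LeaseTable`, the megabyte units, and the injectable clock are all assumptions made for this example.

```typescript
// Hypothetical sketch; none of these names come from the project's API.

/** Runs async tasks one at a time, in the order they were submitted. */
class AsyncMutex {
  private tail: Promise<void> = Promise.resolve();

  runExclusive<T>(task: () => Promise<T> | T): Promise<T> {
    const result = this.tail.then(task);
    // Keep the chain usable even if a task rejects.
    this.tail = result.then(() => undefined, () => undefined);
    return result;
  }
}

interface Lease {
  megabytes: number;
  expiresAt: number; // epoch milliseconds
}

/** Tracks per-service VRAM reservations that expire on their own. */
class LeaseTable {
  private leases = new Map<string, Lease>();
  private totalMb: number;
  private now: () => number;

  constructor(totalMb: number, now: () => number = Date.now) {
    this.totalMb = totalMb;
    this.now = now; // injectable clock, convenient for tests
  }

  /** Sum of all reservations that have not yet expired. */
  reservedMb(): number {
    const t = this.now();
    let sum = 0;
    this.leases.forEach((lease, service) => {
      if (lease.expiresAt <= t) this.leases.delete(service); // expired
      else sum += lease.megabytes;
    });
    return sum;
  }

  /** Records the lease and returns true only if it fits in the budget. */
  acquire(service: string, megabytes: number, ttlMs: number): boolean {
    const reserved = this.reservedMb(); // also drops expired leases
    // A renewal replaces the service's own lease, so do not count it twice.
    const own = this.leases.get(service)?.megabytes ?? 0;
    if (reserved - own + megabytes > this.totalMb) return false;
    this.leases.set(service, { megabytes, expiresAt: this.now() + ttlMs });
    return true;
  }

  release(service: string): boolean {
    return this.leases.delete(service);
  }
}

// All mutations go through one mutex, so two concurrent acquire calls
// cannot both pass the budget check before either is recorded.
const mutex = new AsyncMutex();
const table = new LeaseTable(32 * 1024); // 32 GB expressed in MB

function acquireVram(service: string, mb: number, ttlMs: number): Promise<boolean> {
  return mutex.runExclusive(() => table.acquire(service, mb, ttlMs));
}
```

In this sketch, expiry is checked lazily on each call rather than with timers, which keeps the table deterministic under a fake clock. A real orchestrator would also need to trigger the model swap when a lease is granted or released.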

NVIDIA GPU Skills (v2026.10.42)

  • cuOpt — GPU-accelerated vehicle routing optimization via NVIDIA cuOpt v26.02 async API (tested live, VRP solved in 74ms)
  • AI-Q Research — Deep research via Nemotron Super 49B NIM API with citation extraction
  • OpenShell Sandbox — K3s-based secure code execution with declarative YAML policies
  • All gated behind TITAN_NVIDIA=1 env var
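The environment-variable gate could look roughly like the sketch below. Only the `TITAN_NVIDIA=1` variable comes from these notes; the `SkillEntry` shape, the `requiresNvidia` flag, and the skill ids are hypothetical and chosen for illustration.

```typescript
// Hypothetical sketch of env-var gating; not the project's actual registry.

type Env = Record<string, string | undefined>;

interface SkillEntry {
  id: string;
  requiresNvidia: boolean; // assumed flag marking GPU-only skills
}

/** True only when the variable is set to exactly "1". */
function nvidiaSkillsEnabled(env: Env): boolean {
  return env["TITAN_NVIDIA"] === "1";
}

/** Filters out NVIDIA-only skills unless the gate is open. */
function visibleSkills(all: SkillEntry[], env: Env): SkillEntry[] {
  const enabled = nvidiaSkillsEnabled(env);
  return all.filter((skill) => enabled || !skill.requiresNvidia);
}

// Illustrative ids only.
const exampleSkills: SkillEntry[] = [
  { id: "cuopt", requiresNvidia: true },
  { id: "aiq_research", requiresNvidia: true },
  { id: "openshell_sandbox", requiresNvidia: true },
  { id: "web_search", requiresNvidia: false },
];
```

In a Node.js process the `env` argument would typically be `process.env`. Passing it in explicitly keeps the check easy to test.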

Fixes (v2026.10.41-42)

  • Voice mic leak — VoiceOverlay cleanup
  • Tool visibility — security.allowedTools default
  • OpenAI-compat keepModelPrefix bug
  • 6 TypeScript type errors
  • Voice system prompt rewrite (about 500 tokens, down from 3000+)

Stats

  • 4,321 tests across 135 files (all passing)
  • 0 TypeScript errors, 0 ESLint errors
  • ~155 tools across 100+ skills