v2.7.5
OpenGUI — Windows-native desktop automation for LLM agents
OpenGUI is a .NET 9 Windows service that gives LLM agents (Claude, GPT, DeepSeek, Hermes) direct control over desktop applications. Agents communicate with the service over a named pipe using JSON commands.
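As a rough illustration of the transport, the sketch below frames a command as one line of JSON and writes it to a Windows named pipe using only the Python standard library. The pipe name `\\.\pipe\OpenGUI`, the `{"command": ...}` message shape, and the newline-delimited framing are assumptions made for this example, not documented OpenGUI behavior; the `reliable_bridge` module shown under Connect is the client the project itself uses.

```python
import json

# Hypothetical pipe name; the real one is defined by the OpenGUI service.
PIPE_PATH = r"\\.\pipe\OpenGUI"


def encode_command(command: str, **params) -> bytes:
    """Serialize a command as one newline-terminated JSON line (assumed framing)."""
    message = {"command": command, **params}
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def decode_reply(raw: bytes) -> dict:
    """Parse one JSON reply line from the service."""
    return json.loads(raw.decode("utf-8"))


def send_command(command: str, pipe_path: str = PIPE_PATH, **params) -> dict:
    """Open the pipe, write one command, and read one reply line (Windows only)."""
    # On Windows, a named-pipe client can open the pipe like a regular file.
    with open(pipe_path, "r+b", buffering=0) as pipe:
        pipe.write(encode_command(command, **params))
        return decode_reply(pipe.readline())
```

Keeping encoding and decoding separate from the pipe I/O means the message format can be checked on any platform, while only `send_command` needs a running service.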
Key features:
- Named-pipe IPC — sub-millisecond dispatch, no TCP overhead
- UIA tree perception — CacheRequest-optimized, ~11ms for modal scan
- SendInput execution — keyboard, mouse, hotkeys with timing control
- Divergence detection — ActionTruth verification after every action
- Phase 3 agent loop — 7/7 automated tests (open Notepad → type → save → verify → close)
- Crash recovery — external watchdog with challenge-response heartbeat + Job Object isolation
- WinRT OCR — Windows-native OCR with Tesseract fallback
- Overlay feedback — real-time desktop highlights showing agent intent
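The challenge-response heartbeat in the crash-recovery feature can be pictured with a generic HMAC scheme: the watchdog sends a fresh random nonce, and the service proves it is still alive by returning a keyed digest of that nonce. This is a sketch of the general technique, not OpenGUI's actual protocol; the shared key, the function names, and the choice of HMAC-SHA256 are all assumptions.

```python
import hashlib
import hmac
import secrets


def make_challenge() -> bytes:
    """Watchdog side: generate a fresh random nonce for each heartbeat."""
    return secrets.token_bytes(16)


def answer_challenge(key: bytes, challenge: bytes) -> bytes:
    """Service side: prove liveness by signing the nonce with the shared key."""
    return hmac.new(key, challenge, hashlib.sha256).digest()


def verify_answer(key: bytes, challenge: bytes, answer: bytes) -> bool:
    """Watchdog side: compare the reply to the expected digest in constant time."""
    expected = hmac.new(key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, answer)


# Hypothetical shared secret, e.g. handed to the service when it is launched.
key = secrets.token_bytes(32)
challenge = make_challenge()

# A reply to the current challenge verifies.
alive = verify_answer(key, challenge, answer_challenge(key, challenge))

# A reply computed for an older challenge does not verify against a new one.
stale = verify_answer(key, make_challenge(), answer_challenge(key, challenge))
```

Because every heartbeat uses a new nonce, a replayed or cached reply fails verification, which is what distinguishes this from a plain ping.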
v2.7.5 updates
- CacheRequest in DetectModals — 12,000ms timeout → 11ms (1080x improvement)
- All UIA tree walks use el.Cached instead of el.Current
- detect-modals bounded test: 5/5 PASS, 11ms
Test suite: 36/36 passing
- Hardening: 14/14
- Timeout Governor: 5/5
- Phase 2 stress: 17/17
Install
powershell -ExecutionPolicy Bypass -File dist\OpenGUI\install.ps1
Connect
from reliable_bridge import send
print(send('status'))
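If the service is not running yet, opening a named pipe typically fails, so callers may want a small retry loop around the bridge call. The helper below is a hypothetical sketch: it takes the send function as an argument so it makes no claims about `reliable_bridge` internals, and it assumes connection failures surface as `OSError`.

```python
import time
from typing import Callable


def send_with_retry(send: Callable[[str], object], command: str,
                    attempts: int = 3, delay_s: float = 0.5) -> object:
    """Call send(command), retrying on OSError up to `attempts` times."""
    last_error = None
    for attempt in range(attempts):
        try:
            return send(command)
        except OSError as error:  # assumed failure mode, not documented
            last_error = error
            if attempt < attempts - 1:
                time.sleep(delay_s)  # wait before the next attempt
    raise ConnectionError(f"no reply after {attempts} attempts") from last_error


# Usage (assumes reliable_bridge is importable):
# from reliable_bridge import send
# print(send_with_retry(send, 'status'))
```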