v0.7.0 — Autonomous Windows Agent (6 phases)

@sandraschi sandraschi released this 15 Jun 17:22

v0.7.0 - Phases 1-6 Complete

Windows Computer Use for AI Agents - click, screenshot, type, drag, OCR, and verify native Windows UI via MCP. 22 portmanteau tools.

Phase 1: Autonomous Mission Engine

  • automation_mission(run=...) - takes a natural-language goal, decomposes it into steps via an LLM, executes each step with retry and verification, and returns pass/fail with evidence
  • Outcome verification (verify=True) built into every click and set_text
  • Unified RetryPolicy with strategy chain (refocus, wait_stable, fallback, escalate)
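
To illustrate the strategy-chain idea, here is a minimal sketch, not the project's actual RetryPolicy. It assumes an action is a callable that returns True once its outcome verifies, and that each strategy is a recovery step run before the next attempt. RetrySketch, flaky_click, and the no-op recoveries are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical types: these are illustrative, not the project's API.
Action = Callable[[], bool]    # returns True when the outcome verifies
Recovery = Callable[[], None]  # side effect run before the next attempt


@dataclass
class RetrySketch:
    # Ordered (name, recovery) pairs; one is consumed per failed attempt.
    chain: List[Tuple[str, Recovery]] = field(default_factory=list)

    def run(self, action: Action) -> Tuple[bool, List[str]]:
        """Try the action, then retry once after each recovery in order."""
        used: List[str] = []
        if action():
            return True, used
        for name, recover in self.chain:
            recover()
            used.append(name)
            if action():
                return True, used
        return False, used


# Usage: a fake click that only verifies on the third attempt.
attempts = {"n": 0}


def flaky_click() -> bool:
    attempts["n"] += 1
    return attempts["n"] >= 3


policy = RetrySketch(chain=[
    ("refocus", lambda: None),
    ("wait_stable", lambda: None),
    ("fallback", lambda: None),
    ("escalate", lambda: None),
])
ok, used = policy.run(flaky_click)
```

In this sketch the chain stops at the first strategy after which the action verifies, so later and presumably more expensive strategies only run when the earlier ones did not help.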

Phase 2: Adaptive Intelligence

  • Adaptive element location - auto-cascades through title, auto_id, control_id, class+type, OCR region scan
  • Telemetry SQLite store - every action logged; query failure patterns per tool
  • automation_system(telemetry=True) - aggregate stats and top failure patterns
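
The cascading lookup can be pictured roughly as follows. This is a sketch under stated assumptions: elements are plain dictionaries rather than live UI objects, only the first three stages of the cascade are modelled (class+type and the OCR region scan are omitted), and by_field, CASCADE, and locate are hypothetical names.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical element record; the real locator works on live UI trees.
Element = Dict[str, str]
Finder = Callable[[List[Element], str], Optional[Element]]


def by_field(field_name: str) -> Finder:
    """Build a finder that matches the query against one element field."""
    def find(elements: List[Element], query: str) -> Optional[Element]:
        for el in elements:
            if el.get(field_name) == query:
                return el
        return None
    return find


# Ordered from the most specific lookup to the least, as in the list above.
CASCADE: List[Tuple[str, Finder]] = [
    ("title", by_field("title")),
    ("auto_id", by_field("auto_id")),
    ("control_id", by_field("control_id")),
]


def locate(elements: List[Element], query: str) -> Optional[Tuple[str, Element]]:
    """Return (strategy_name, element) for the first stage that matches."""
    for name, finder in CASCADE:
        hit = finder(elements, query)
        if hit is not None:
            return name, hit
    return None


# Usage: the title lookup misses, so the cascade falls through to auto_id.
elements = [{"title": "OK", "auto_id": "btnOk", "control_id": "1"}]
found = locate(elements, "btnOk")
```

Returning the name of the stage that matched is a design choice in this sketch: it gives a telemetry store something concrete to log per action.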

Phase 3: Macros and Workflows

  • automation_macro - record, stop, replay, replay_with_verify, list
  • automation_mission(workflow=True) - explicit multi-app steps with timeout and cross-step data chaining

Phase 4: Smart Discovery

  • automation_smart - discover (scan all windows), list_apps, list_controls, click (intent-based across all windows with LLM disambiguation)

Phase 5: Self-Healing and Event-Driven

  • Self-healing missions - checks that the target window is alive before each step, relaunches dead apps, and aborts after 5 consecutive failures
  • Cross-app data flow - store_as and dollar-ref key references between workflow steps
  • automation_watch - background thread watchers for window_appears, window_closes, text_appears, element_appears
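
As a rough picture of cross-step data chaining, the sketch below stores a step's result under its store_as key and substitutes dollar-prefixed keys into later step arguments. The exact reference syntax is not spelled out in these notes, so the `$key` form, the step dictionary layout, and the names resolve, run_workflow, and fake_execute are all assumptions.

```python
import re
from typing import Callable, Dict, List

# Assumed reference syntax: "$" followed by a word-character key.
_REF = re.compile(r"\$(\w+)")


def resolve(value: str, store: Dict[str, str]) -> str:
    """Replace each $key with its stored value; unknown keys stay as-is."""
    return _REF.sub(lambda m: store.get(m.group(1), m.group(0)), value)


def run_workflow(
    steps: List[dict],
    execute: Callable[[str, Dict[str, str]], str],
) -> List[str]:
    """Run steps in order, chaining results through a shared store."""
    store: Dict[str, str] = {}
    results: List[str] = []
    for step in steps:
        args = {k: resolve(v, store) for k, v in step.get("args", {}).items()}
        result = execute(step["action"], args)
        results.append(result)
        if "store_as" in step:
            store[step["store_as"]] = result
    return results


# Usage with a stand-in executor (hypothetical; no real UI is touched).
def fake_execute(action: str, args: Dict[str, str]) -> str:
    return "42.50" if action == "read" else args["text"]


steps = [
    {"action": "read", "args": {"source": "invoice"}, "store_as": "total"},
    {"action": "type", "args": {"text": "Total: $total"}},
]
results = run_workflow(steps, fake_execute)
```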

Phase 6: Closing the Loop

  • Telemetry-driven strategy selection - queries best historical strategy before each mission step
  • Updated agent instructions (CLAUDE.md, AGENTS.md, SKILL.md)
  • just smoke - quick import and register verification
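
Telemetry-driven strategy selection could look something like the sketch below, which uses Python's standard sqlite3 module. The table schema, the in-memory database, and the function names open_store, log_action, and best_strategy are assumptions made for illustration; the project's actual store may be organised differently.

```python
import sqlite3
from typing import Optional


def open_store() -> sqlite3.Connection:
    """Create an in-memory telemetry table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE actions (tool TEXT, strategy TEXT, success INTEGER)"
    )
    return conn


def log_action(conn: sqlite3.Connection, tool: str, strategy: str, success: bool) -> None:
    """Record one action outcome."""
    conn.execute(
        "INSERT INTO actions VALUES (?, ?, ?)", (tool, strategy, int(success))
    )


def best_strategy(conn: sqlite3.Connection, tool: str) -> Optional[str]:
    """Pick the strategy with the highest success rate for this tool."""
    row = conn.execute(
        "SELECT strategy FROM actions WHERE tool = ? "
        "GROUP BY strategy "
        "ORDER BY AVG(success) DESC, COUNT(*) DESC LIMIT 1",
        (tool,),
    ).fetchone()
    return row[0] if row else None


# Usage: "title" succeeds 1 of 3 times, "auto_id" 2 of 2.
conn = open_store()
for strategy, success in [
    ("title", False), ("title", False), ("title", True),
    ("auto_id", True), ("auto_id", True),
]:
    log_action(conn, "click", strategy, success)
choice = best_strategy(conn, "click")
```

Ties on success rate are broken here by sample count, so a strategy with more supporting history wins over one that succeeded only once.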

Installer and Tooling

  • Tesseract OCR auto-install (NSIS checkbox or just install-tesseract)
  • Playwright e2e suite - 5 spec files, 19 tests, all routes plus REST API
  • Help page overhauled with 6 horizontal tabs mirroring the README stack
  • CUA-NSIS certified - 12 of 12 phases passed