Make MauiDevFlow CLI AI-Agent Friendly #28

@Redth

Plan: Make MauiDevFlow CLI AI-Agent Friendly

Context

Based on Justin Poehnelt's article "Rewrite Your CLI for AI Agents", we evaluated the MauiDevFlow CLI against agent-first design principles. The plan was then reviewed by GPT-5.3-Codex, Gemini 3 Pro, and Claude Opus 4.5 — their feedback is synthesized below.

Key insight from reviewers: MauiDevFlow is a debugging/automation tool, not a REST API wrapper. Many article patterns (MCP, JSON input payloads, deep schema introspection) are over-engineered for our use case. The highest-impact changes are: consistent JSON output, context window discipline, and reducing agent round-trips.


Current State — What We Already Do Well ✅

  • --json flag on logs, network, wait, batch
  • JSONL streaming on logs --follow, network --json, batch
  • Comprehensive skill files (SKILL.md + 7 reference docs)
  • URL-escaping of all user inputs in AgentClient
  • Visual tree introspection via tree, element, query, hittest
  • Pagination via --limit / --skip on logs and network

Implementation Plan (Priority-Ordered)

Phase 1: Consistent Machine-Readable Output (P0)

Goal: Every command produces structured JSON when requested. Raw data on stdout, errors on stderr. Use exit codes for success/failure.

1a. Global --json flag + TTY auto-detection

  • Add --json as a global option (like --agent-port) for all commands
  • When Console.IsOutputRedirected is true (piped), default to JSON
  • Environment variable MAUIDEVFLOW_OUTPUT=json as override
  • --no-json to force human output when piped
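The four inputs above can conflict, so some precedence is needed. A minimal sketch of one possible ordering follows, written in Python purely for illustration (the real CLI is .NET). The precedence (explicit flags, then environment variable, then redirect detection) and the function name `resolve_output_mode` are assumptions, not decisions recorded in this plan.

```python
from typing import Mapping, Sequence


def resolve_output_mode(
    argv: Sequence[str],
    env: Mapping[str, str],
    stdout_redirected: bool,
) -> str:
    """Return "json" or "human" for one CLI invocation (hypothetical helper)."""
    # 1. Explicit flags win. --no-json is the opt-out for piped human output.
    if "--no-json" in argv:
        return "human"
    if "--json" in argv:
        return "json"
    # 2. Environment override. Only the value "json" is named in the plan,
    #    so any other value is treated as unset here (assumption).
    if env.get("MAUIDEVFLOW_OUTPUT", "").strip().lower() == "json":
        return "json"
    # 3. Fall back to redirect detection; in .NET this input would come
    #    from Console.IsOutputRedirected.
    return "json" if stdout_redirected else "human"
```

Passing `argv`, `env`, and the redirect state in as parameters keeps the decision testable without touching the real console.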

1b. Output convention (NO envelope wrapper)

All three reviewers agreed: do NOT use a {success, data, error} envelope. Instead:

Success (exit 0): stdout contains raw data directly

[{"id": "abc123", "type": "Button", "text": "Submit"}]

This allows clean piping: maui-devflow MAUI tree --json | jq '.[0].id'

Failure (exit non-zero): stderr contains structured error:

{
  "error": "Element 'btn123' not found",
  "type": "RuntimeError",
  "retryable": false,
  "suggestions": ["Run 'MAUI tree' to refresh element IDs", "Found similar: 'btn124' (Button)"]
}

Error types distinguish InvocationError (bad flags/args) vs RuntimeError (app not running, element not found) — this helps agents decide "fix the command" vs "restart the app."
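To make the error shape concrete, here is a small Python sketch that builds the stderr payload for a missing element, including a "similar ID" suggestion. It is illustrative only: the helper names are hypothetical, and using `difflib` similarity to pick the suggestion is an assumption about how "Found similar" could be computed.

```python
import difflib
import json
from typing import Iterable


def element_not_found_error(missing_id: str, known_ids: Iterable[str]) -> dict:
    """Build the structured error for an unknown element ID (hypothetical helper)."""
    suggestions = ["Run 'MAUI tree' to refresh element IDs"]
    # get_close_matches ranks candidates by similarity (default cutoff 0.6);
    # n=1 keeps at most the single best match.
    for candidate in difflib.get_close_matches(missing_id, list(known_ids), n=1):
        suggestions.append(f"Found similar: '{candidate}'")
    return {
        "error": f"Element '{missing_id}' not found",
        "type": "RuntimeError",  # the app answered, but the target is missing
        "retryable": False,
        "suggestions": suggestions,
    }


def invocation_error(message: str) -> dict:
    """Build the structured error for bad flags or arguments."""
    return {
        "error": message,
        "type": "InvocationError",  # the agent should fix the command itself
        "retryable": False,
        "suggestions": [],
    }


def render_error(payload: dict) -> str:
    """Serialize a payload as the single JSON document written to stderr."""
    return json.dumps(payload)
```

An agent can branch on `type` alone: `InvocationError` means rewrite the command, `RuntimeError` means inspect or restart the app.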

1c. Commands that need JSON output

  • MAUI status → JSON status object
  • MAUI tree → JSON array of ElementInfo
  • MAUI element → JSON ElementInfo
  • MAUI query → JSON array of ElementInfo
  • MAUI hittest → JSON ElementInfo
  • MAUI tap/fill/clear/focus/navigate/scroll/set-property → JSON result
  • MAUI property → JSON {"value": "..."}
  • MAUI screenshot → JSON {"path": "...", "size": 12345}
  • MAUI alert detect/dismiss/tree → JSON
  • list → JSON array of agents
  • broker status → JSON

1d. Central OutputWriter abstraction

Create a shared OutputWriter class to eliminate ad-hoc Console.WriteLine branches across all handlers. This enforces consistency and makes adding new formats trivial.
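One possible shape for such a writer, sketched in Python for brevity (the actual class would be C#). The method names, the `human` formatter callback, and the specific exit code `1` are assumptions; the plan only requires raw data on stdout, structured errors on stderr, and a non-zero exit on failure.

```python
import json
import sys
from typing import Any, Callable, Sequence, TextIO


class OutputWriter:
    """Single place where handlers emit results and errors (illustrative)."""

    def __init__(self, json_mode: bool, out: TextIO = sys.stdout, err: TextIO = sys.stderr):
        self.json_mode = json_mode
        self.out = out
        self.err = err

    def result(self, data: Any, human: Callable[[Any], str] = str) -> int:
        """Write a successful result to stdout and return exit code 0."""
        text = json.dumps(data) if self.json_mode else human(data)
        self.out.write(text + "\n")
        return 0

    def error(
        self,
        message: str,
        error_type: str = "RuntimeError",
        retryable: bool = False,
        suggestions: Sequence[str] = (),
    ) -> int:
        """Write an error to stderr only and return a non-zero exit code."""
        if self.json_mode:
            payload = {
                "error": message,
                "type": error_type,
                "retryable": retryable,
                "suggestions": list(suggestions),
            }
            self.err.write(json.dumps(payload) + "\n")
        else:
            self.err.write(f"error: {message}\n")
        return 1  # assumption: the plan says only "non-zero"
```

A handler would then end with `return writer.result(elements)` or `return writer.error(...)` instead of branching on the output mode itself.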


Phase 2: Context Window Discipline (P0)

Goal: Reduce token waste. This is where debugging CLIs differ most from API wrappers — screenshots and full tree dumps are the real context hogs.

2a. Screenshot optimization

  • --max-dimension <px> — auto-resize to cap width/height (default: unlimited)
  • --quality <1-100> — JPEG compression (default: 80)
  • --format png|jpeg — format selection (JPEG much smaller)
  • Consider defaulting to element-level screenshots when --id is specified
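The resize rule behind `--max-dimension` can be sketched as a pure function. This Python version assumes the cap applies to the longer side and that aspect ratio is preserved; neither detail is spelled out above, and `capped_size` is a hypothetical name.

```python
from typing import Optional, Tuple


def capped_size(width: int, height: int, max_dimension: Optional[int]) -> Tuple[int, int]:
    """Scale (width, height) down so the longer side fits max_dimension."""
    if max_dimension is None:
        return width, height  # default in the plan: unlimited
    if max_dimension <= 0:
        raise ValueError("max_dimension must be positive")
    longest = max(width, height)
    if longest <= max_dimension:
        return width, height  # never upscale
    scale = max_dimension / longest
    # Round to whole pixels, but keep each side at least 1 px.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For example, a 2000x1000 capture with a cap of 1000 becomes 1000x500, a quarter of the original pixel count.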

2b. Tree output optimization

  • --depth N already exists — document that agents should always use --depth 3 or similar
  • Add --fields for client-side projection: MAUI tree --json --fields "id,type,text,automationId"
  • Add --format compact that returns only id, type, text, automationId, bounds (omits full property bags)
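As an illustration of client-side projection, the Python sketch below trims a tree to the requested fields and depth. It assumes each node is a JSON object with an optional `children` array and that depth 1 means "top-level nodes only"; the actual ElementInfo layout and depth semantics are not specified here, and both function names are hypothetical.

```python
from typing import Any, Dict, List, Optional, Sequence


def parse_fields(raw: str) -> List[str]:
    """Split a --fields value such as "id,type,text" into field names."""
    return [name.strip() for name in raw.split(",") if name.strip()]


def project_tree(
    nodes: Sequence[Dict[str, Any]],
    fields: Sequence[str],
    max_depth: Optional[int] = None,
) -> List[Dict[str, Any]]:
    """Keep only the requested fields, descending at most max_depth levels."""
    projected = []
    for node in nodes:
        # Copy only the requested fields that this node actually has.
        slim = {name: node[name] for name in fields if name in node}
        children = node.get("children", [])  # assumed nesting key
        if children and (max_depth is None or max_depth > 1):
            next_depth = None if max_depth is None else max_depth - 1
            slim["children"] = project_tree(children, fields, next_depth)
        projected.append(slim)
    return projected
```

Dropping full property bags this way is where most of the token savings would come from; the depth limit mainly bounds the worst case.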

2c. --wait-until semantics (eliminate polling loops)

Agents waste enormous tokens polling for UI state changes. Add wait semantics:

MAUI query --automationId "ResultsList" --wait-until exists --timeout 10
MAUI query --automationId "Spinner" --wait-until gone --timeout 30

This eliminates the find→sleep→find→sleep→find pattern that burns tokens and round-trips.
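Internally, `--wait-until` moves the polling loop from the agent into the CLI, so the agent pays for one invocation instead of many. A minimal Python sketch of that loop is below; `probe` is a hypothetical callback standing in for one query against the running app, and the 0.25 s poll interval is an assumption.

```python
import time
from typing import Callable


def wait_until(
    probe: Callable[[], bool],
    condition: str,
    timeout: float,
    interval: float = 0.25,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll probe() until the condition holds; return False on timeout."""
    if condition not in ("exists", "gone"):
        raise ValueError(f"unknown condition: {condition}")
    want_present = condition == "exists"
    deadline = clock() + timeout
    while True:
        # probe() reports whether the element is currently present.
        if bool(probe()) == want_present:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The injectable `clock` and `sleep` exist only so the loop can be tested without real delays. On timeout the CLI would presumably exit non-zero with a structured error, following the output convention in Phase 1.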


Phase 3: Reduce Agent Round-Trips (P1)

Goal: Compound operations that eliminate multi-command sequences.

3a. Implicit element resolution on mutating commands

Instead of forcing query-then-act:

MAUI tap --automationId "LoginButton"        # Resolve and tap in one call
MAUI fill --automationId "Username" --value "admin"  # Find by automationId, fill
MAUI tap --type Button --index 0             # First button

A single command replaces two, which halves the round-trips for each action and eliminates stale-ID races.
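The resolution step could look roughly like the Python sketch below, which picks one element from a flat list by `automationId` or by type and index. Treating duplicate automation IDs as an error is an assumption, as are the `ResolutionError` type and the flat-list input.

```python
from typing import Any, Dict, Optional, Sequence


class ResolutionError(Exception):
    """Raised when a selector matches no element or is ambiguous."""


def resolve_element(
    elements: Sequence[Dict[str, Any]],
    automation_id: Optional[str] = None,
    type_name: Optional[str] = None,
    index: int = 0,
) -> Dict[str, Any]:
    """Resolve --automationId, or --type plus --index, to a single element."""
    if automation_id is not None:
        matches = [e for e in elements if e.get("automationId") == automation_id]
        if len(matches) > 1:
            # Assumption: refuse to guess rather than act on the wrong element.
            raise ResolutionError(f"automationId '{automation_id}' is ambiguous")
        index = 0
    elif type_name is not None:
        matches = [e for e in elements if e.get("type") == type_name]
    else:
        raise ResolutionError("no selector given")
    if not 0 <= index < len(matches):
        raise ResolutionError("no element matches the selector")
    return matches[index]
```

Because resolution and the action happen in the same invocation, the resolved ID never leaves the process and cannot go stale between two agent calls.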

3b. Post-action verification flags

MAUI tap abc123 --and-screenshot --output after-tap.png
MAUI tap abc123 --and-tree --depth 2

Avoids the tap→screenshot→parse round-trip.

3c. Assertion commands

MAUI assert --id abc123 --property Text --equals "Welcome!"
MAUI assert --id abc123 --property IsVisible --equals true

Returns {"passed": true} or {"passed": false, "expected": "Welcome!", "actual": "Loading..."}.
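A sketch of the comparison behind `assert`, in Python. Since `--equals` arrives as a string, this version compares against a string form of the actual value, with booleans lowercased so that `--equals true` matches a boolean property; that normalization and the helper names are assumptions.

```python
from typing import Any, Dict, Optional


def _to_cli_text(value: Any) -> Optional[str]:
    """Render a property value the way it would be typed on the command line."""
    if value is None:
        return None
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def assert_property(element: Dict[str, Any], prop: str, expected: str) -> Dict[str, Any]:
    """Return the pass/fail object for one property assertion."""
    actual = _to_cli_text(element.get(prop))
    if actual == expected:
        return {"passed": True}
    # On failure, report both sides so the agent needs no follow-up query.
    return {"passed": False, "expected": expected, "actual": actual}
```

Including `actual` in the failure object is what saves the extra `property` call an agent would otherwise make to find out what went wrong.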


Phase 4: Input Hardening (P2)

Goal: Defense-in-depth, but scoped appropriately. This is a local debugging tool, not a public SaaS API — don't over-sandbox.

4a. Targeted validation (not blanket rejection)

  • Control characters in element IDs, routes, property names — reject chars below ASCII 0x20
  • Query fragments in IDs — reject ?, # in element IDs (likely hallucinated embedded query params)
  • Double-encoding detection — warn on % in element IDs
  • Don't sandbox file paths to CWD — agents/users legitimately need /tmp, workspace paths, etc.
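The three ID checks above are simple enough to express directly. The Python sketch below returns errors and warnings separately so that only the first two rules reject the input; the function name and message wording are hypothetical.

```python
from typing import List, Tuple


def validate_element_id(value: str) -> Tuple[List[str], List[str]]:
    """Return (errors, warnings) for a user-supplied element ID."""
    errors: List[str] = []
    warnings: List[str] = []
    # Reject control characters (code points below 0x20).
    if any(ord(ch) < 0x20 for ch in value):
        errors.append("control character (below 0x20) in element ID")
    # Reject query fragments, which are likely hallucinated query params.
    for ch in "?#":
        if ch in value:
            errors.append(f"'{ch}' is not valid in an element ID")
    # Warn, but do not reject, on possible double encoding.
    if "%" in value:
        warnings.append("'%' in element ID; the value may already be URL-encoded")
    return errors, warnings
```

Errors would map to an `InvocationError` under the Phase 1 convention, since the fix is to change the command rather than the app.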

4b. Output file safety

  • Add --overwrite flag for screenshot/recording commands
  • Default to fail-on-existing to prevent accidental clobbering
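Fail-on-existing is best enforced at open time rather than with a separate existence check, which would leave a race window. A minimal Python sketch using exclusive-create mode follows; `write_output_file` is a hypothetical helper, and the .NET equivalent would likely be `FileMode.CreateNew`.

```python
def write_output_file(path: str, data: bytes, overwrite: bool = False) -> int:
    """Write data to path and return the number of bytes written.

    Without overwrite, an existing file raises FileExistsError.
    """
    # "xb" creates the file exclusively and fails if it already exists;
    # "wb" truncates and replaces, which is what --overwrite asks for.
    mode = "wb" if overwrite else "xb"
    with open(path, mode) as handle:
        return handle.write(data)
```

The caller would turn `FileExistsError` into a structured error whose suggestion is to pass `--overwrite` or choose another path.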

Phase 5: Lightweight Schema Discovery (P2)

Goal: Let agents discover command structure at runtime without consuming skill-file tokens.

5a. maui-devflow commands --json

Simple command listing with descriptions:

[
  {"command": "MAUI tree", "description": "Dump visual element tree", "mutating": false},
  {"command": "MAUI tap", "description": "Tap a UI element", "mutating": true},
  ...
]
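One way to keep this listing from drifting is to generate it from the same registry that defines the commands. The Python sketch below shows the idea with a hand-written registry; in the real CLI the entries would presumably come from the System.CommandLine command tree, and the `mutating` flag would need to be declared per command. All names here are illustrative.

```python
import json
from dataclasses import asdict, dataclass
from typing import Sequence


@dataclass(frozen=True)
class CommandInfo:
    command: str
    description: str
    mutating: bool


# Illustrative registry containing only the two entries shown above.
REGISTRY = [
    CommandInfo("MAUI tree", "Dump visual element tree", False),
    CommandInfo("MAUI tap", "Tap a UI element", True),
]


def commands_json(registry: Sequence[CommandInfo]) -> str:
    """Serialize the registry in the shape of `commands --json`."""
    return json.dumps([asdict(info) for info in registry], indent=2)
```

An agent can then filter on `mutating` to decide which commands are safe to run speculatively.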

5b. --help-json on any command

Dump System.CommandLine's built-in metadata as JSON for a specific command. Not a full schema system — just machine-readable help.

Note: All three reviewers agreed that full schema introspection (Phase 2a in original plan) is over-engineered. The skill file is the real schema for MauiDevFlow. A lightweight commands --json listing is sufficient.


Phase 6: Enhanced Skill Files (P2)

Goal: Encode agent-specific invariants that can't be intuited from --help.

6a. Add operational guidance

  • "Always use --json or rely on TTY auto-detection for machine-readable output"
  • "Always use --depth 3 for tree commands to avoid context overflow"
  • "Prefer query --automationId over full tree traversal"
  • "Element IDs are ephemeral — re-query after navigation or state changes"
  • "Use --wait-until instead of polling loops"
  • "Don't trust stale element IDs — refresh with tree or query after actions"

6b. Add canonical agent recipes

Document common multi-step workflows as compact patterns:

  • Login flow: wait → query → fill → tap → wait-until → screenshot
  • Element inspection: query → element → property
  • State verification: tap → wait-until → assert

Descoped / Removed Items

Based on reviewer consensus, these were removed from the plan:

| Item | Reason |
| --- | --- |
| MCP Server | Over-engineered. CLI subprocess with JSON output already works well. Agents (Claude, Copilot) use shell commands constantly without MCP. Revisit only if shell-escaping failures become systematic. |
| Full Schema Introspection | Skill file serves this purpose. ~25 stable commands don't warrant dynamic schema generation. |
| JSON Input Payloads | Commands are flat (tap, fill, clear). No deeply nested structures that benefit from JSON input. |
| Dry-Run (broad) | Debugging ops are not destructive. Tapping wrong button is visible and reversible. Only network clear warrants dry-run. |
| Response Envelope | Adds token bloat, breaks jq composition. Use exit codes + raw data on stdout instead. |

Reviewer Highlights

GPT-5.3-Codex

  • Reframe around "machine-stable outputs + state-assertive operations + low-token inspection defaults"
  • Prefer --expect-visible / --expect-enabled assertive flags over broad dry-run
  • Central OutputWriter abstraction to enforce consistency
  • Auto-JSON on redirect needs careful opt-out to not break existing scripts

Gemini 3 Pro

  • Drop the envelope anti-pattern — raw data on stdout, errors on stderr
  • Move TTY detection to P0 (it enables everything else)
  • MCP makes describe redundant — but both are over-engineered for this tool
  • Input hardening is over-valued for a local debugging tool
  • Distinguish InvocationError vs RuntimeError in error responses

Claude Opus 4.5

  • Missing: error recovery semantics (suggestions, retryable flag)
  • Missing: screenshot size optimization (the real context hog)
  • Missing: --wait-until to eliminate polling loops (huge token savings)
  • Missing: implicit element resolution (tap --automationId X) to halve round-trips
  • Missing: compound operations (--and-screenshot, --and-tree)
  • Over-engineered: MCP, full schema, JSON input, broad dry-run
  • Pitfall: TTY detection can break | grep and CI pipelines — prefer env var

Summary

| Phase | Description | Effort | Impact |
| --- | --- | --- | --- |
| 1 | Consistent JSON output + TTY detection + error semantics | Medium | Critical |
| 2 | Context discipline: screenshot optimization, field masks, --wait-until | Medium | High |
| 3 | Reduce round-trips: implicit resolution, compound ops, assertions | Medium | High |
| 4 | Input hardening (targeted) | Low | Medium |
| 5 | Lightweight schema discovery | Low | Medium |
| 6 | Enhanced skill files | Low | Medium |

Labels: enhancement
