
Make MauiDevFlow CLI AI-Agent Friendly #29

Merged
Redth merged 12 commits into main from feature/ai-agent-friendly-cli on Mar 8, 2026

Redth commented Mar 8, 2026

Summary

Implements Issue #28 — making the MauiDevFlow CLI optimized for AI agent consumption.

Changes

Phase 1-2: Structured Output & JSON Mode

  • OutputWriter — central output abstraction supporting JSON and human-readable modes
  • --json / --no-json flags with TTY auto-detection (JSON when piped, human when interactive)
  • --fields projection and --format compact for reduced token usage
  • --wait-until exists|gone polling with configurable timeout
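
The `--wait-until exists|gone` polling can be pictured with the minimal sketch below. It is written in Python purely to illustrate the logic; the CLI itself is C#, and `wait_until`, `element_exists`, and the polling interval are hypothetical names and values, not taken from the PR.

```python
import time
from typing import Callable


def wait_until(condition: str, element_exists: Callable[[], bool],
               timeout_s: float = 5.0, interval_s: float = 0.05) -> bool:
    """Poll until the element exists (or is gone). Returns False on timeout."""
    if condition not in ("exists", "gone"):
        raise ValueError(f"unsupported condition: {condition!r}")
    want_present = condition == "exists"
    deadline = time.monotonic() + timeout_s
    while True:
        # Check first so an already-satisfied condition returns immediately.
        if element_exists() == want_present:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Simulated query: the element shows up on the third poll.
polls = iter([False, False, True])
print(wait_until("exists", lambda: next(polls), timeout_s=1.0, interval_s=0.01))
```

Running the loop inside the command means the agent issues one call and reads one result, rather than re-querying and re-parsing output on each attempt.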

Phase 3-6: Implicit Resolution, Assertions & Schema Discovery

  • Implicit element resolution (--automationId, --type, --text, --index) eliminates the query→act round-trip
  • Post-action flags (--and-screenshot, --and-tree) for single-call verification
  • MAUI assert command for property value assertions (exit code 0/1)
  • commands --json for runtime schema discovery
  • SKILL.md comprehensive AI agent guidance

Screenshot Improvements

  • --max-width server-side resize via SkiaSharp for HiDPI displays
  • Auto-DPI scaling — screenshots automatically scaled to 1x using per-window display density
  • Platform-specific GetWindowDisplayDensity(IWindow?) (iOS UIScreen.Scale, Android DisplayMetrics, Windows RasterizationScale, macOS BackingScaleFactor, GTK GetScaleFactor)

Comprehensive Scroll Support

  • 6-priority scroll resolution: itemIndex → elementId+ItemsView → elementId+ScrollView → scrollable element → native scroll → first scrollable on page
  • --item-index / --group-index / --position for virtualized CollectionView/ListView
  • TryNativeScroll platform overrides (iOS UIScrollView, Android RecyclerView, Windows ScrollViewer, GTK ScrolledWindow)
  • Searches self → subviews → ancestors for native scroll views (CollectionView wraps scroll view inside container)
  • itemCount in tree metadata for ItemsView elements
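
The 6-priority resolution above is a fallback chain: try each strategy in order and stop at the first one that handles the request. The Python sketch below shows only that control flow. The strategy names come from the list above, while `resolve_scroll` and the handler callables are hypothetical stand-ins for the agent's C# logic.

```python
from typing import Callable, Optional

# Strategy names in the priority order given in the PR description.
PRIORITY = [
    "itemIndex",
    "elementId+ItemsView",
    "elementId+ScrollView",
    "scrollable element",
    "native scroll",
    "first scrollable on page",
]


def resolve_scroll(handlers: dict[str, Callable[[], bool]]) -> Optional[str]:
    """Run handlers in priority order; return the name of the first that succeeds."""
    for name in PRIORITY:
        handler = handlers.get(name)
        if handler is not None and handler():
            return name
    return None  # nothing on the page could be scrolled


# Example: item-index scrolling does not apply, native scroll succeeds.
used = resolve_scroll({
    "itemIndex": lambda: False,
    "native scroll": lambda: True,
    "first scrollable on page": lambda: True,
})
print(used)  # native scroll
```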

Bug Fixes

  • Fixed --and-screenshot triggering on every action: SetDefaultValue(null) on ZeroOrOne arity options caused FindResultFor() to always return non-null
  • Fixed no-element scroll using wrong page — uses Shell.CurrentPage instead of root page, prioritizes ItemsView over ScrollView
  • Fixed iOS dialog detection regression — alert commands now use global broker-aware --agent-port instead of hardcoded port 9223; ResolveAlertPlatformAsync intelligently detects platform from udid/pid/agent/booted-sim

Cross-Platform Validation

  • ✅ iOS Simulator (iPhone 11, iOS 26.2) — all scroll, screenshot, dialog operations verified
  • ✅ Android Emulator (API 35) — all scroll operations verified
  • ✅ Mac Catalyst — all scroll operations verified
  • ✅ macOS AppKit — broker port assignment verified (app crashes on unrelated Platform.Maui.MacOS bug)

Files Changed (13 files, +1762/-299)

  • src/MauiDevFlow.CLI/Program.cs — Major refactor: OutputWriter, implicit resolution, assert, scroll, alert fix
  • src/MauiDevFlow.CLI/OutputWriter.cs — New: structured output abstraction
  • src/MauiDevFlow.Agent.Core/DevFlowAgentService.cs — HandleScroll rewrite, auto-DPI screenshots
  • src/MauiDevFlow.Agent/DevFlowAgentService.cs — Platform-specific TryNativeScroll + GetWindowDisplayDensity
  • src/MauiDevFlow.Agent.Gtk/GtkAgentService.cs — GTK TryNativeScroll + GetWindowDisplayDensity
  • src/MauiDevFlow.Agent.Core/VisualTreeWalker.cs — itemCount for ItemsView
  • src/MauiDevFlow.Driver/AgentClient.cs — ScrollAsync + ScreenshotAsync params
  • .claude/skills/maui-ai-debugging/SKILL.md — Comprehensive AI agent guidance

Closes #28

Redth and others added 12 commits March 5, 2026 15:42
- Add central OutputWriter abstraction for consistent JSON/human output
- Add global --json and --no-json flags (available on all commands)
- TTY auto-detection: default to JSON when stdout is piped/redirected
- Support MAUIDEVFLOW_OUTPUT=json environment variable override
- Structured error output on stderr with error type, retryable flag, and suggestions
- Add JSON output to all MAUI commands: status, tree, query, element, hittest,
  tap, fill, clear, focus, navigate, scroll, resize, property, set-property,
  screenshot, alert detect/dismiss/tree
- Add JSON output to: list, broker status, network detail/clear, logs
- Migrate existing per-command --json options (logs, network, wait, webviews)
  to use the global --json flag
- Add --fields option to tree and query for client-side field projection
  (e.g. --fields "id,type,text,automationId")
- Add --format compact option for tree/query (minimal field set)
- Add --wait-until exists|gone with --timeout to query command
  (eliminates agent polling loops, saves tokens)
- Error types distinguish InvocationError vs RuntimeError
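
As an illustration of the client-side `--fields` projection listed above, here is a minimal Python sketch (the CLI is C#). The `children` key, the function names, and the contents of `COMPACT_FIELDS` are assumptions; the PR does not state which fields `--format compact` keeps.

```python
from typing import Any

# Assumed minimal field set for "--format compact"; not confirmed by the PR.
COMPACT_FIELDS = ["id", "type", "text", "automationId"]


def parse_fields(arg: str) -> list[str]:
    """Split a value such as "id,type,text,automationId" into field names."""
    return [f.strip() for f in arg.split(",") if f.strip()]


def project(node: dict[str, Any], fields: list[str]) -> dict[str, Any]:
    """Keep only the requested fields, recursing into an assumed 'children' list."""
    out = {k: node[k] for k in fields if k in node}
    if "children" in node:
        out["children"] = [project(c, fields) for c in node["children"]]
    return out


tree = {
    "id": "a1", "type": "Button", "text": "OK", "bounds": [0, 0, 10, 10],
    "children": [{"id": "a2", "type": "Label", "opacity": 1.0}],
}
print(project(tree, parse_fields("id, type")))
```

Dropping unrequested properties such as bounds or opacity is what reduces the token count of tree and query output.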

Closes #28
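
The output-mode rules in this commit (global `--json` / `--no-json`, the `MAUIDEVFLOW_OUTPUT=json` override, and TTY auto-detection) could be combined roughly as in the Python sketch below. The precedence order (explicit flag, then environment variable, then TTY check) and the error on conflicting flags are assumptions; the commit message does not spell them out, and the real implementation is C#.

```python
import os
import sys
from typing import Optional


def resolve_json_mode(json_flag: bool, no_json_flag: bool,
                      env_value: Optional[str], stdout_is_tty: bool) -> bool:
    """Return True when output should be JSON. Precedence here is assumed."""
    if json_flag and no_json_flag:
        raise ValueError("--json and --no-json are mutually exclusive")
    if json_flag:
        return True
    if no_json_flag:
        return False
    if env_value is not None and env_value.strip().lower() == "json":
        return True
    # Piped or redirected stdout defaults to JSON; an interactive TTY does not.
    return not stdout_is_tty


use_json = resolve_json_mode(
    json_flag=False,
    no_json_flag=False,
    env_value=os.environ.get("MAUIDEVFLOW_OUTPUT"),
    stdout_is_tty=sys.stdout.isatty(),
)
print("json" if use_json else "human")
```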

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…t guidance (Issue #28, Phases 3-6)

Phase 3 — Reduce Agent Round-Trips:
- Implicit element resolution: tap/fill/clear/focus accept --automationId,
  --type, --text, --index to resolve and act in one command
- Post-action flags: --and-screenshot, --and-tree, --and-tree-depth on
  tap/fill/clear to verify actions without extra round-trips
- MAUI assert command: quick property assertions with pass/fail result
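
Implicit element resolution amounts to filtering the tree by the supplied criteria and picking one match by index. The following Python sketch shows that idea only; exact-match comparison, a default index of 0, and the `children` key are assumptions, and none of the function names come from the CLI's C# code.

```python
from typing import Any, Iterator, Optional


def flatten(node: dict[str, Any]) -> Iterator[dict[str, Any]]:
    """Yield the node and all of its descendants, depth-first."""
    yield node
    for child in node.get("children", []):
        yield from flatten(child)


def resolve_element(tree: dict[str, Any], automation_id: Optional[str] = None,
                    type_name: Optional[str] = None, text: Optional[str] = None,
                    index: int = 0) -> dict[str, Any]:
    """Return the index-th element matching every criterion that was given."""
    criteria = {"automationId": automation_id, "type": type_name, "text": text}
    active = {k: v for k, v in criteria.items() if v is not None}
    if not active:
        raise ValueError("at least one of automationId, type, text is required")
    matches = [n for n in flatten(tree)
               if all(n.get(k) == v for k, v in active.items())]
    if not 0 <= index < len(matches):
        raise LookupError(f"index {index} out of range for {len(matches)} match(es)")
    return matches[index]


page = {"id": "p1", "type": "ContentPage", "children": [
    {"id": "b1", "type": "Button", "text": "OK"},
    {"id": "b2", "type": "Button", "text": "Cancel"},
]}
print(resolve_element(page, type_name="Button", index=1)["id"])  # b2
```

Resolving and acting in one command removes the separate query call whose only purpose was to obtain an element ID.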

Phase 4 — Input Hardening:
- Validate element IDs: reject control chars (<0x20), ? and # (hallucinated
  query params), warn on % (double-encoding)
- Screenshot --overwrite flag: default fail-on-existing to prevent clobbering
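
The element ID rules above can be expressed as a small validator. This Python sketch mirrors the three stated rules; the function name, messages, and the choice to return warnings as a list are illustrative assumptions rather than the CLI's actual C# behavior.

```python
def validate_element_id(element_id: str) -> list[str]:
    """Raise ValueError for rejected IDs; otherwise return a list of warnings."""
    for ch in element_id:
        if ord(ch) < 0x20:
            raise ValueError(f"control character U+{ord(ch):04X} in element ID")
        if ch in "?#":
            # Typically a hallucinated query string or fragment appended to the ID.
            raise ValueError(f"{ch!r} is not allowed in an element ID")
    warnings = []
    if "%" in element_id:
        warnings.append("ID contains '%'; it may have been URL-encoded twice")
    return warnings


print(validate_element_id("abc123"))   # []
print(validate_element_id("abc%20d"))  # one warning
```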

Phase 5 — Schema Discovery:
- commands subcommand: lists all ~50 commands with descriptions and mutating
  flag as JSON for runtime schema discovery

Phase 6 — Enhanced Skill Files:
- Agent best practices section in SKILL.md: output format, token reduction,
  round-trip elimination, element ID lifecycle
- Canonical workflow recipes: login flow, element inspection, state verification

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Update command reference table with all new CLI options:
  --json/--no-json global flags, --fields, --format compact,
  --wait-until, --overwrite, implicit resolution (--automationId,
  --type, --text, --index), post-action flags (--and-screenshot,
  --and-tree)
- Add missing commands: assert, commands --json
- Fix incorrect --text flag in fill examples (text is positional)
- Document implicit element resolution and post-action flags
  prominently above command table
- Update typical inspection/interaction flows with new features
- Add element ID lifecycle guidance to command reference
- Remove per-command --json options (now global)

Eval results: improved skill 100% pass rate vs 93.3% old skill
across 3 test scenarios (16 assertions).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
MAUI visual trees are deeply nested — a simple control is typically
at depth 10-15, not 3. Updated all depth recommendations from 3 to
15, and added an 'Adaptive Depth Learning' section encouraging
agents to observe where controls appear in their first tree dump
and reuse that depth for subsequent calls.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
HiDPI displays (2x, 3x) produce screenshots 2-3x larger than needed
for AI grounding. Add --max-width option that resizes the captured
PNG server-side (in the agent via SkiaSharp) before transfer.

- Agent: parse maxWidth query param, resize via SKBitmap.Resize()
- Driver: pass maxWidth through AgentClient.ScreenshotAsync()
- CLI: add --max-width option to screenshot command
- SKILL.md: recommend --max-width 800 for AI agents, add
  'Screenshot Size Reduction' guidance section
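
The target dimensions for a `--max-width` resize follow from preserving the aspect ratio. The Python sketch below shows only that arithmetic; the agent performs the actual resize with SKBitmap.Resize(), and the no-upscaling and rounding behavior here are assumptions.

```python
def fit_to_max_width(width: int, height: int, max_width: int) -> tuple[int, int]:
    """Scale (width, height) down so width <= max_width, keeping aspect ratio."""
    if max_width <= 0:
        raise ValueError("max_width must be positive")
    if width <= max_width:
        return (width, height)  # assumed: never upscale
    scale = max_width / width
    return (max_width, max(1, round(height * scale)))


print(fit_to_max_width(1320, 2868, 800))  # (800, 1738)
```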

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Screenshots are now automatically scaled to 1x logical resolution by default.
The agent detects each window's display density via platform-specific APIs:
- iOS/Mac Catalyst: UIWindow.Screen.Scale
- Android: Activity.Resources.DisplayMetrics.Density
- Windows: XamlRoot.RasterizationScale
- macOS AppKit: NSWindow.BackingScaleFactor
- GTK: Widget.GetScaleFactor()

This reduces a 3x iPhone screenshot from 1320x2868 (277KB) to 440x956 (52KB)
without any CLI flags needed. Use --scale native for full resolution.
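
The size reduction quoted above is the pixel size divided by the display density. A minimal Python sketch of that arithmetic follows; `logical_size` and the `"auto"` mode name are hypothetical (only `--scale native` appears in the commit message), and the rounding rule is an assumption.

```python
def logical_size(px_width: int, px_height: int, density: float,
                 scale_mode: str = "auto") -> tuple[int, int]:
    """Return the 1x logical size, or the native size when scale_mode is 'native'."""
    if scale_mode == "native" or density <= 1.0:
        return (px_width, px_height)
    return (round(px_width / density), round(px_height / density))


print(logical_size(1320, 2868, 3.0))            # (440, 956)
print(logical_size(1320, 2868, 3.0, "native"))  # (1320, 2868)
```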

Also adds displayDensity to the /api/status response and updates SKILL.md
to remove the --max-width 800 recommendation (no longer needed).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Removed SetDefaultValue(null) from andScreenshotOption. With the default
value set, System.CommandLine's FindResultFor() always returned non-null,
causing HandlePostActionFlags to take a screenshot after every action
regardless of whether --and-screenshot was specified.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…e to SKILL.md

Based on ControlGallery testing hurdles (H5-H8, H12-H13):
- Shell routes are case-sensitive, discovered via AppShell.xaml
- Flyout items use generated IDs, dismissal via FlyoutIsPresented
- CollectionView items must be tapped via container, not inner elements
- scroll command doesn't work with CollectionView (native scrolling)
- --text resolution searches entire tree including hidden pages
- Added Shell navigation canonical workflow

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Completely rewrites HandleScroll with multi-tier resolution:
1. Item-index scrolling: --item-index N scrolls to a specific item in
   CollectionView/ListView, even virtualized off-screen items
2. Smart scroll-into-view: elements inside an ItemsView find their data
   item via BindingContext and scroll to the correct index
3. Platform-native delta scroll: TryNativeScroll() virtual method with
   overrides for iOS (UIScrollView), Android (RecyclerView),
   Windows (ScrollViewer), and GTK (ScrolledWindow)
4. ScrollView fallback: existing behavior preserved

Also adds:
- itemCount in tree metadata for ItemsView elements (via NativeProperties)
- --item-index, --group-index, --position CLI options
- ScrollToPosition support (MakeVisible/Start/Center/End)
- Updated SKILL.md with CollectionView scrolling guidance

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
CollectionView's handler wraps the actual UICollectionView (UIScrollView)
inside a container UIView. The native scroll search must check the view
itself, then search subviews (descendants), then ancestors.

Applied the same fix to Android (RecyclerView may be a child of the
handler's platform view) and Windows (already had descendant search
via FindWinUIScrollViewerInChildren, now consolidated to generic
FindWinUIDescendant<T>).
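
The search order described here (the view itself, then descendants, then ancestors) is sketched below in Python over a toy view tree. The `View` class and function names are hypothetical; the real code walks UIView, Android View, and WinUI trees in C#.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class View:
    name: str
    scrollable: bool = False
    parent: Optional["View"] = field(default=None, repr=False)
    children: list["View"] = field(default_factory=list)

    def add(self, child: "View") -> "View":
        child.parent = self
        self.children.append(child)
        return child


def find_scrollable_descendant(view: View) -> Optional[View]:
    """Depth-first search below the view (the view itself is not checked)."""
    for child in view.children:
        if child.scrollable:
            return child
        hit = find_scrollable_descendant(child)
        if hit is not None:
            return hit
    return None


def find_native_scroll(view: View) -> Optional[View]:
    """Check self, then descendants, then ancestors."""
    if view.scrollable:
        return view
    hit = find_scrollable_descendant(view)
    if hit is not None:
        return hit
    ancestor = view.parent
    while ancestor is not None:
        if ancestor.scrollable:
            return ancestor
        ancestor = ancestor.parent
    return None


# A CollectionView-style handler: a plain container wrapping the real scroll view.
container = View("container")
inner = container.add(View("UICollectionView", scrollable=True))
print(find_native_scroll(container).name)  # UICollectionView
```

Checking descendants before ancestors is what lets the container case resolve to the wrapped scroll view instead of some unrelated scrollable parent.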

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When no element is specified for delta scroll, the handler searched the
entire Shell page tree (all tabs) for scrollable views. This found
ScrollViews on hidden pages (e.g., Home) instead of the CollectionView
on the active page.

Fix: Use Shell.CurrentPage to scope the search to the visible page,
and check ItemsView before ScrollView since CollectionView/ListView
are more common scroll targets in modern MAUI apps.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Alert commands (detect, dismiss, tree) previously defined their own
--agent-port and --platform options with hardcoded defaults (port 9223,
platform 'auto'). With the broker-based port assignment, port 9223 could
belong to a different agent (e.g. Mac Catalyst), causing platform
auto-detection to return 'maccatalyst' instead of 'ios-simulator'.

Changes:
- Alert commands now use global --agent-host and --agent-port options
  (broker-aware port resolution)
- ResolveAlertPlatformAsync no longer takes a platform parameter;
  instead it intelligently detects based on: udid → iOS simulator,
  pid → Mac Catalyst/Windows, agent status, booted simulator check
- When the connected agent reports MacCatalyst but a booted iOS
  simulator exists, prefer iOS simulator (the common alert use case)
- Removed duplicate local option definitions (15 options → 6)
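
One way to read the detection order described above is the Python sketch below. It is an interpretation, not the actual ResolveAlertPlatformAsync: the parameter names, the `host_os` switch used to choose between Mac Catalyst and Windows for a pid, the `"windows"` platform string, and the final error are all assumptions. Only `'maccatalyst'` and `'ios-simulator'` appear in the commit message.

```python
from typing import Optional


def resolve_alert_platform(udid: Optional[str] = None, pid: Optional[int] = None,
                           host_os: str = "macos",
                           agent_platform: Optional[str] = None,
                           booted_simulator: bool = False) -> str:
    """Pick the alert platform from the strongest available hint."""
    if udid:
        return "ios-simulator"
    if pid is not None:
        # Assumed: a pid targets a desktop process on the current host OS.
        return "windows" if host_os == "windows" else "maccatalyst"
    if agent_platform == "maccatalyst" and booted_simulator:
        # Prefer the simulator, the common alert use case per the commit message.
        return "ios-simulator"
    if agent_platform:
        return agent_platform
    if booted_simulator:
        return "ios-simulator"
    raise LookupError("could not determine alert platform")


print(resolve_alert_platform(agent_platform="maccatalyst", booted_simulator=True))
```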

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Redth merged commit 003ea57 into main Mar 8, 2026
2 checks passed