Skip to content

CLI/MCP parity: daemon-based architecture with 8 new subcommands#113

Merged
obj-p merged 55 commits intomainfrom
cli-mcp-parity
Apr 19, 2026
Merged

CLI/MCP parity: daemon-based architecture with 8 new subcommands#113
obj-p merged 55 commits intomainfrom
cli-mcp-parity

Conversation

@obj-p
Copy link
Copy Markdown
Owner

@obj-p obj-p commented Apr 15, 2026

Summary

Brings every MCP tool to the CLI and unifies both surfaces behind a single long-running daemon. Before this branch, the CLI ran AppKit in-process per invocation and exposed only run / snapshot / variants / list. After, every subcommand is a thin daemon client and serve is the sole AppKit entry point.

User-visible changes

  • 8 new subcommands: configure, switch, elements, touch, simulators, stop, plus operational status / kill-daemon.
  • Magical session reuse: snapshot and variants reuse an existing session for the target file when one is running (seconds, not tens of seconds). Fall back to ephemeral capture when no session exists.
  • run --detach: start a session and hand back the sessionID on stdout; script-friendly.
  • Auto-started daemon: first CLI invocation spawns previewsmcp serve --daemon in the background. Unix socket at ~/.previewsmcp/serve.sock. ADB-style — users don't think about daemon lifecycle.

Architectural changes

  • Daemon foundation: NWListener on a Unix domain socket, per-connection MCP.Server, setsid() detachment, atomic PID file, flock-serialized startup.
  • Session resolver: --session <uuid> > --file <path> > sole-running-session. Shared across all session-scoped commands.
  • Client-side platform resolution: detects xcodeproj/xcworkspace up front so the daemon never runs swift package describe on non-SPM trees (fixes a previously-observed hang).
  • Atomic rename for setup dylib: build to .tmp, codesign, rename(2). Fixes a concurrent-link race where the setup dylib briefly didn't exist.
  • PreviewTraits.Field + clearing(_:): explicit mechanism to clear traits (distinct from "no change"). Fixes a silent-no-op bug on `configure --color-scheme ""`.
  • Bug fix: silent success on missing sessions. handlePreviewStop and handlePreviewSnapshot both silently succeeded (or failed with misleading errors) for typo'd sessionIDs. Both now return isError: true with "No session found for ".

Developer experience / refactors

  • Shared DaemonToolError: collapsed 8 copies of XxxCommandError.daemonError(String) into one type.
  • withDaemonClient helper: eliminates ~15 lines of connect/register/disconnect boilerplate per command, and enforces the correct ordering (register log handler before connect(), else handshake-phase notifications are dropped).
  • DaemonTestLock: cross-target flock file so CLIIntegrationTests and MCPIntegrationTests don't race on the daemon socket.
  • @Suite(.serialized) throughout and per-test cleanSlate() that kills any stale daemon before each scenario.

Stats

  • +4,709 / −1,032 lines net
  • 28 source files changed (across CLI, Core, macOS, iOS)
  • ~140 integration tests across CLI and MCP suites; all green locally

Scope explicitly deferred

  • Structured daemon responses (JSON-per-tool-result instead of human-readable prose + regex parsing). The CLI parses responses with regex today — e.g. [N] label: ERROR — <reason>. Works but fragile; we already shipped a Critical-severity fix for a label-substring mis-classification in Migrate variants to daemon client #109. Worth a dedicated follow-up PR.
  • iOS-session variants test, exit-code branch tests for variants (partial/total failure), and shared @OptionGroup for --session/--file.
  • Dead cases in PreviewHost.Mode.{interactive, snapshot} after the AppKit simplification.
  • README/docs update covering the daemon model and the new subcommands.

Commits on this branch

14 merged sub-PRs (#98, #102#109 for parity; #110#112 for refactor/audit).

Test plan

  • swift build
  • Full daemon-client test suite (59 tests across 9 *CommandTests suites) — green
  • MacOSMCPTests (3 tests) — green
  • BuildSystemTests xcodeproj/WORKSPACE regressions — green
  • Manual smoke: end-to-end ephemeral variant capture, multi-session stop --all, iOS touch round-trip against a real simulator

🤖 Generated with Claude Code

obj-p and others added 14 commits April 13, 2026 21:24
Introduces `previewsmcp serve --daemon`, which runs the MCP server on a
Unix domain socket at ~/.previewsmcp/serve.sock. Multiple CLI clients can
connect concurrently; each connection gets its own MCP.Server instance
sharing module-level state (IOSState, ConfigCache) for consistent session
visibility.

New commands:
- `serve --daemon` — starts the daemon listener
- `status` — reports daemon liveness via socket connect
- `kill-daemon` — graceful SIGTERM with stale PID cleanup

Supporting infrastructure:
- DaemonPaths — filesystem constants (~/.previewsmcp/{serve.sock, serve.pid})
- DaemonLifecycle — PID file management + signal handlers
- DaemonListener — NWListener on UDS, accepts connections, spawns per-conn servers

Existing `serve` (stdio mode, used by Claude/Cursor MCP integration) is
unchanged and its integration tests still pass.

First PR in the stack for the CLI/MCP parity spec (docs/cli-mcp-parity-spec.md).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Changes from self-review of PR 97:

- Critical: fix startup race where a second `serve --daemon` could unlink
  the first daemon's socket and clobber it. Previous check was PID-based,
  which misses the window between bind and PID-write (and also misses the
  case where someone deletes the PID file manually). Replaces with a
  socket `connect()` probe via new DaemonProbe module — authoritative
  since the kernel tracks socket-to-fd associations atomically.

- Share Compiler across daemon connections. `configureMCPServer` now
  accepts `sharedCompiler:`; DaemonListener builds one at startup and
  passes it to each accepted connection, avoiding ~seconds of per-client
  xcrun/SDK resolution cost. Stdio mode still creates its own.

- Extract `DaemonProbe.canConnect()` — shared by ServeCommand (startup
  guard) and StatusCommand (liveness report). Previously duplicated.

- Extract `DaemonLifecycle.daemonRunningPID()` — returns PID only if
  alive, replaces two-step readPID/isProcessAlive calls at three sites.

- Readability:
  - Remove unneeded @mainactor from DaemonListener.start
  - Replace clever AsyncStream-based ready-signaling with
    withCheckedThrowingContinuation
  - Remove dead runForever() (NSApp.run() handles this)

- Add regression test secondDaemonRefusesWithMissingPIDFile that would
  hang without the fix (second daemon rebinds and runs forever). Uses a
  bounded wait so it fails fast if the race returns.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Migrate snapshot to daemon client with magical session reuse

Third PR in the CLI/MCP parity stack. `snapshot` becomes a thin daemon
client that reuses an existing preview session when one is running for
the target file, and falls back to an ephemeral session otherwise
(create → capture → stop). No more in-process NSApplication + preview
compile from `snapshot` itself — all of that moves into the daemon.

The magical resolution:
- `--session <uuid>` wins if passed.
- `--file <path>` (or the positional file arg): if exactly one active
  session has that source file, reuse it; else error on ambiguity, or
  ephemeral on miss.
- No flag: if exactly one session is running, reuse it.

Changes:

- New SessionResolver: uses the new `session_list` MCP tool to map
  CLI flags to a session ID. Clear error messages on ambiguity or
  missing session.
- New `session_list` MCP tool: returns one tab-delimited line per
  active session (iOS + macOS). Used by SessionResolver; also useful
  for future discovery commands.
- SnapshotCommand rewrite: AsyncParsableCommand using DaemonClient.
  Handles both reuse and ephemeral paths. Traits are applied only on
  ephemeral start; reusing a session that was configured differently
  prints a note and snapshots with the session's current traits.
  Format is inferred from the output file extension (`.png` → PNG,
  `.jpg`/`.jpeg` → JPEG) unless `--quality` is explicit.
- Platform resolution moved client-side. Previously the daemon called
  `inferredPlatformAsync` when no `--platform` was passed, which could
  hang on xcodeproj sources: the walk-up for Package.swift falls
  through to the repo root, then `swift package describe` runs on a
  large package and can sit forever. New `findPackageDirectory`
  short-circuits when it crosses an xcodeproj/xcworkspace/WORKSPACE
  boundary before finding Package.swift, returning nil instead.
  SnapshotCommand resolves platform locally and always passes an
  explicit `platform` to the daemon.
- Fix MCP tool arg names: `project` → `projectPath`, `device` →
  `deviceUDID`. RunCommand had the same bug; fixed there too.
- PreviewsMacOS.PreviewHost exposes `allSessions` so the new
  `session_list` tool can enumerate macOS sessions alongside iOS.
- DaemonTestLock wraps every SnapshotCommandTests case, matching the
  pattern used by RunCommandTests and DaemonLifecycleTests — without
  it, the daemon state races across suites when they run in parallel.
- SnapshotCommand and RunCommand no longer live in the
  "needs NSApplication" branch of PreviewsMCPApp.main. They're pure
  daemon clients.

Test plan:
- 13 existing snapshot tests still pass (with daemon path instead of
  in-process).
- 243/243 non-iOS tests pass in the full concurrent sweep.
- Manual: `previewsmcp run --detach` then `previewsmcp snapshot <same
  file>` captures the live window in ~0.4s vs ~30s for a cold
  ephemeral.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Address review nits on PR #102

Five small improvements from the review:

1. Surface preview_stop failures on ephemeral teardown. Extract
   stopEphemeralSession helper so a failed stop logs a warning instead
   of being silently swallowed by `try?`. Leaked sessions in a
   long-lived daemon could otherwise confuse future reuse matching.

2. Remove the dead `if lines.isEmpty` branch in handleSessionList —
   `[].joined(separator:)` already produces "".

3. Extract SessionResolver.resolveAgainst as a pure (no MCP client)
   function expressing the resolution policy. Makes the three
   decision branches (file-matches / file-ambiguous / sole-session)
   explicit and testable in principle, even though the current test
   targets are structured such that we exercise them through the
   integration tests rather than import-based unit tests.

4. Regression tests for findPackageDirectory's non-SPM boundary
   detection: one for xcodeproj-sibling, one for Bazel WORKSPACE.
   Guards against reintroducing the hang we chased down in #102.

5. New integration test: "Snapshot reuses an already-running session
   instead of ephemeral". Starts a detached run, then snapshots the
   same file, and asserts the snapshot completes in under 2s —
   reuse is sub-second whereas ephemeral is 5+ seconds. Proves the
   magical resolution actually reuses sessions rather than silently
   falling through to ephemeral.

30/30 daemon-touching tests pass in the targeted sweep.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Add configure command for live trait changes

Fourth PR in the CLI/MCP parity stack. \`previewsmcp configure\`
forwards trait changes to the daemon's \`preview_configure\` MCP tool.

Session targeting mirrors \`snapshot\`:
- \`--session <uuid>\` — explicit
- \`--file <path>\` — resolve by source file
- no flag — use the sole running session

Unlike \`snapshot\`, there's no ephemeral fallback: configuring a
session that doesn't exist is an error (suggesting \`run --detach\`
as the remedy).

Supported traits:
- \`--color-scheme\` (light / dark)
- \`--dynamic-type-size\` (xSmall..accessibility5)
- \`--locale\` (BCP 47)
- \`--layout-direction\` (leftToRight / rightToLeft)
- \`--legibility-weight\` (regular / bold)

Pass an empty string to clear a trait (matches the daemon's signal
for reverting an earlier override).

Validation happens client-side via \`PreviewTraits.validated\` so bad
values fail before an RPC round-trip. Empty strings are allowed
through as the clear-signal.

Tests (5): validation rejects missing traits, validation rejects
invalid values, no-session error path, dark-mode round-trip
(snapshot before + after verifies the change actually took effect),
and --session explicit targeting.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Make "empty string clears trait" actually work in preview_configure

The review of PR #103 caught a real bug: every layer advertised that
passing an empty string to preview_configure would clear a trait
override, but the implementation silently preserved the existing value.

Root cause: parseTraits normalized empty strings to nil, then
PreviewSession.reconfigure used PreviewTraits.merged which does
`other.field ?? self.field` — nil in `other` keeps self's value.
There was no way to signal "clear this field" through the pipeline.

Fixes:

- Add PreviewTraits.Field enum + PreviewTraits.clearing(_:) that
  nulls out the listed fields. clearing is the only way to revert a
  previously-set trait; merged alone can't do it.

- PreviewSession.reconfigure / IOSPreviewSession.reconfigure now
  take an optional clearing: Set<PreviewTraits.Field> and apply
  both merge and clear: `self.traits = self.traits.merged(with: traits).clearing(clearing)`.

- parseTraits now returns a three-tuple (traits, clearedFields,
  error?). clearedFields is populated by inspecting the raw MCP
  params for explicit empty-string values, since
  PreviewTraits.validated normalizes them away before the handler
  can see them.

- handlePreviewConfigure passes the clearedFields set through to
  reconfigure, and considers a call "no-op" only when both traits
  and clearedFields are empty.

- handlePreviewStart explicitly ignores clearedFields — starting a
  session has no existing traits to clear.

New test: configureClearsTrait does a set-then-clear round trip and
asserts the daemon's response summary lists colorScheme=dark after
setting, and has no colorScheme= entry after clearing. Uses the
summary rather than pixel diff because SwiftUI without an explicit
colorScheme falls back to the OS setting, which is dark on a
dark-mode Mac — so a pixel diff would spuriously pass even without
the fix.

All 6 ConfigureCommandTests pass; 235 total non-iOS tests still
green.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Add switch command for changing active #Preview in a live session

Fifth PR in the CLI/MCP parity stack. \`previewsmcp switch <index>\`
forwards to the daemon's \`preview_switch\` MCP tool — recompiles the
session with a different #Preview block selected. @State is reset;
traits (color scheme, dynamic type, locale, …) persist across the
switch.

Session targeting mirrors \`configure\` and \`snapshot\`:
- \`--session <uuid>\` — explicit
- \`--file <path>\` — resolve by source file
- no flag — use the sole running session

As with \`configure\`, switching a session that doesn't exist is an
error (no ephemeral fallback — a switch only makes sense against a
live session).

Client-side validation rejects negative indices before the RPC round
trip. Out-of-range positive indices surface the daemon's "No preview
at index N" error.

Tests (4): negative-index validation, no-session error path, happy-
path round trip (snapshot preview 0, switch to 1, snapshot, assert
different bytes), and out-of-range-index error.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Extract textFromContent helper to a shared extension

The review of PR #104 flagged that textFromContent was now duplicated
across SessionResolver, SnapshotCommand, ConfigureCommand, and
SwitchCommand (4 files) — past the rule-of-three, worth extracting.

Adds MCPContentHelpers.swift with an extension on [Tool.Content]:

    extension Array where Element == Tool.Content {
        func joinedText() -> String { ... }
    }

Call sites become `response.content.joinedText()` instead of
`textFromContent(response.content)`. Reads slightly better at the
call site as a property-like accessor, matches the shape future
commands (elements, touch, stop, simulators) will want.

No behavior change — the four private helpers were byte-identical.
All 35 daemon-touching tests still pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Add elements command for dumping iOS accessibility trees

Forwards to the daemon's preview_elements MCP tool and prints the
resulting JSON to stdout so callers can pipe into jq or consume
from scripts. Session targeting mirrors configure/switch: --session,
--file, or the sole running session.

Also ignores .claude/scheduled_tasks.lock so it stops surfacing in
git status.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Address review feedback: tighten macOS-rejection assertion, guard empty stdout

- The macOS-rejection test matched "ios" in lowercased stderr, which
  could incidentally pass on unrelated log lines. Pin it to the
  daemon's exact iOS-only message from MCPServer.swift:871.
- Skip printing a trailing newline when the daemon returns an empty
  payload so downstream jq parsers don't see a stray \n.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Add touch command for injecting taps and swipes into iOS previews

Forwards to the daemon's preview_touch MCP tool. Defaults to a tap;
pass --to-x and --to-y together to perform a swipe, with an optional
--duration. Session targeting mirrors configure/switch/elements.

Validation for --to-x / --to-y pairing and --duration's swipe-only
semantics runs before opening the daemon connection so stray local
mistakes never involve the socket.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Address review feedback: reject non-positive --duration and tighten assertions

- Guard against `--duration 0` and negative values locally before the
  daemon call; WDA's behavior for non-positive durations is undefined.
- Pin the happy-path tap/swipe assertions to the full coordinate
  string (Tap sent at (120, 200), Swipe from (40,300) to (300,300))
  so a regression mis-wiring x/y or toX/toY can't slip past a loose
  substring match.
- Add a local-validation test for the non-positive duration guard.

Declined the reviewer's stdout-vs-stderr suggestion: the current
behavior (operational confirmations on stderr) matches Switch and
Configure. Only Elements uses stdout, and only because its payload
is JSON meant for piping.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Add simulators command for listing available iOS devices

Forwards to the daemon's simulator_list MCP tool. Writes one line
per available device to stdout in the form `<name> — <udid>
[BOOTED] (<runtime>)`, suitable for piping into grep or fzf to pick
a UDID for `--device` on run / snapshot.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Address review feedback: always surface simulator_list output

- simulator_list always returns either device lines or a
  "No available simulator devices found." sentinel. The previous
  `if !text.isEmpty` guard was copied from elements (where empty
  JSON is plausible) and silently dropped the sentinel. Drop the
  guard so bare-runner users see the daemon's reply.
- Broaden the integration test to run on every machine and branch
  the assertion on whether a simulator happens to be available. This
  exercises daemon auto-start on bare runners where the previous
  early-return made the test a no-op.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Add stop command for closing preview sessions

Forwards to the daemon's preview_stop MCP tool. Session targeting
mirrors configure/switch/elements (--session > --file > sole
running session). Adds --all to iterate over session_list and stop
every active session — useful before cleanly shutting down the
daemon.

--all is incompatible with --session/--file. --all on an empty
daemon is a no-op success. Individual stop failures inside --all
are reported but don't abort the sweep; the first error is thrown
after the loop completes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Address review: fix silent stop-nonexistent bug and tighten tests

Daemon:
- `handlePreviewStop` now verifies session existence (iOS via
  `iosState.getSession`, macOS via `App.host.allSessions`) before
  proceeding. Missing sessions return `isError: true` with a "No
  session found for <id>" message. Previously the macOS path called
  `App.host.closePreview` unconditionally — which uses
  `removeValue(forKey:)` — and silently succeeded on typos.
- Updated MacOSMCPTests' preview_stop-nonexistent assertion to
  match: now asserts isError == true rather than documenting the
  old "phantom closed" behavior.

CLI:
- `stop --session <typo>` now surfaces the daemon's error instead
  of succeeding silently.
- Removed the duplicate stderr print inside the --all loop; the
  thrown first-failure is surfaced once by ArgumentParser.
- Added a comment on the sequential-stop choice.

Tests:
- Strengthened `stopSoleSession` follow-up: calls `stop --all`
  (which consults session_list directly) to prove daemon-side
  removal, plus an explicit `--session <ghost-uuid>` check for the
  new isError path.
- Added an iOS-branch stop test that exercises
  `iosState.getSession` + `iosSession.stop()`, gated on simulator
  availability.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Migrate variants to daemon client — no more in-process AppKit

Rewrites VariantsCommand as an AsyncParsableCommand that forwards
to the daemon's preview_variants MCP tool. Matches SnapshotCommand's
"magical" session resolution: reuse an existing session for the
target file (or --session), otherwise spin up an ephemeral session
for the capture and tear it down afterwards.

The daemon returns variants as alternating text+image content
blocks (or a single ERROR text per failed variant). The CLI walks
the stream, decodes each base64 image, and writes it to
<outputDir>/<label>.<ext>. Exit codes match the legacy tool: 0 all
success, 1 partial / validation error, 2 total failure.

As a consequence `serve` is now the only subcommand that drives
NSApplication in-process. Dropped the VariantsCommand special case
from PreviewsMCPApp.swift and simplified the mode/host wiring.

Existing variants tests continue to pass — wrapped them in
DaemonTestLock since they now hit the daemon — plus a new test
confirms the magical reuse branch preserves an active session.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Address review: fix ERROR-in-label misclassification and related issues

Critical fix:
- The response-stream parser used a loose `text.contains("ERROR")`
  check that mis-bucketed any successful variant whose label
  contained "ERROR" (e.g. `{"label":"ERROR_STATE"}`) as a failure,
  silently dropping the image. Replaced with two strict regexes:
  parseSuccessPreamble matches `[N] <label>:` anchored to end of
  string; parseFailurePreamble matches `[N] <label>: ERROR — `
  anchored to the literal daemon separator. Regression test added.

Other review items:
- Reject `--format jpeg --quality 1.0` upfront. Previously the
  value propagated to the daemon where `quality >= 1.0` flips the
  output to PNG — writing PNG bytes into a .jpg file.
- Warn on orphan image blocks (image with no preceding preamble)
  instead of silently dropping them, so a daemon protocol
  regression is visible rather than invisible.
- Dropped unused `isEphemeral` parameter from `captureVariants`.
- Inlined `PreviewHost(mode: .serve)` since serve is now the only
  subcommand that reaches that branch. `PreviewHost.Mode.{interactive,
  snapshot}` cases are now unused — left in place for a followup
  cleanup since they're public API.

Deferred: partial-failure and total-failure exit-code tests. Both
require forcing the daemon to report per-variant failures
deterministically, which isn't cheap to construct. Filed mentally
as a followup — the `summarize` logic is straightforward enough
that the regression window is small.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
#110)

Every daemon-client command had its own single-case
XxxCommandError.daemonError(String) with an identical description
implementation — 8 copies of the same enum across ~80 lines. Replace
them with one shared DaemonToolError in DaemonToolError.swift:

- ElementsCommandError, SwitchCommandError, ConfigureCommandError,
  TouchCommandError, SimulatorsCommandError, StopCommandError,
  VariantsCommandError: deleted. All throw sites now use
  DaemonToolError.daemonError.
- SnapshotCommandError keeps its two snapshot-specific cases
  (invalidImageData, noImageContent). Its daemon-error throws are
  redirected through DaemonToolError.

No behavior change — the thrown messages are byte-for-byte identical
because both enums just surface the daemon's text verbatim. All 41
tests across the 7 daemon-client suites pass.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…erplate (#111)

Every daemon-client command repeated the same 15-line dance:
  let client = try await DaemonClient.connect(...) { client in
      await client.onNotification(LogMessageNotification.self) { ... }
  }
  do { ...work...; await client.disconnect() }
  catch { await client.disconnect(); throw error }

All three parts are load-bearing:
  * The log forwarder MUST be registered before `connect()` or
    handshake-phase notifications are dropped (a footgun we hit
    early in the stack).
  * `disconnect()` must run on both success AND error paths or the
    transport leaks.
  * The log-to-stderr bridge is identical across all commands.

Add `DaemonClient.withDaemonClient(name:configure:body:)` that
enforces the shape once:
  * Connects with the clientName.
  * Registers the stderr log-forwarder by default.
  * Runs `body`, disconnects on success.
  * Disconnects + rethrows on failure.
  * Extra handlers can still be registered via `configure`.

Refactored all 9 daemon-client commands: Run, Snapshot, Configure,
Switch, Elements, Touch, Simulators, Stop, Variants. Each command
loses ~12 lines of boilerplate and gains safety — the
disconnect-on-every-path invariant is now a property of the helper,
not something each command has to remember.

All 59 daemon-client tests pass.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Follow-up to PR #108's fix for `handlePreviewStop`. I walked every
MCP handler that accepts a sessionID to find other places that
would silently succeed (or fail with a misleading error) for an
unknown UUID.

Findings:
- preview_stop: already fixed in PR #108
- preview_elements, preview_touch: iOS-only, already guard with
  `iosState.getSession` + isError
- preview_configure, preview_variants, preview_switch: already
  guard both branches with explicit isError
- preview_snapshot: HAD THE HOLE. The macOS path threw
  `SnapshotError.captureFailed` from `window(for:)` on unknown
  sessionIDs, which surfaced as a misleading "capture failed"
  error instead of "No session found".

Fix mirrors handlePreviewStop: check existence via
`App.host.allSessions[sessionID] != nil` upfront, return
`isError: true` with "No session found for <id>." if missing.

Added an MCP-level regression test alongside the existing
preview_stop nonexistent assertion so both invariants are pinned.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…Mode (#114)

Two cleanups to the CLI/MCP-parity refactor work that were deferred
on the main PR.

SessionTargetingOptions:
- Introduce a shared @OptionGroup with --session <uuid> and
  --file <path>. Adopted by configure, switch, elements, touch, and
  stop — 5 commands whose session-resolution flags were literally
  duplicated with identical help text.
- VariantsCommand and SnapshotCommand keep their positional `file`
  argument (it doubles as source-for-ephemeral-session) and their
  own --session Option, since the @OptionGroup's --file Option
  would collide with the positional.

PreviewHost.Mode cleanup:
- `.interactive` and `.snapshot` became unreachable after the
  variants migration in #109; only `.serve` was ever constructed.
- Delete the Mode enum entirely. PreviewHost is now a zero-argument
  init. `headless` is a constant true; `keepAliveWithoutWindows`
  collapses into `applicationShouldTerminateAfterLastWindowClosed`
  returning false unconditionally.
- Updated the one remaining test call site to use `PreviewHost()`.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
obj-p added a commit that referenced this pull request Apr 15, 2026
Two gaps called out on PR #113's deferred list.

iOS variants:
- All existing happy-path variants tests run against a macOS session
  (App.host.session(for:) + AppKit render path). Added a
  simulator-gated test that exercises the iOS branch of
  handlePreviewVariants — iosState.getSession +
  iosSession.screenshot + trait-restore loop. Passes in ~17s on
  machines with an available simulator.

exitCode unit tests:
- VariantsCommand's documented exit-code semantics (0 all success,
  1 partial, 2 total failure) were untested because forcing
  deterministic per-variant daemon failures is expensive to
  construct. Extracted the pure mapping into a static
  `VariantsCommand.exitCode(successCount:failCount:)` so it can be
  unit-tested without a daemon round-trip.
- Added a new PreviewsCLITests target that @testable imports
  PreviewsCLI (SPM supports this on executable targets) and pins
  all three branches plus a defensive 0/0 case.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two gaps called out on PR #113's deferred list.

iOS variants:
- All existing happy-path variants tests run against a macOS session
  (App.host.session(for:) + AppKit render path). Added a
  simulator-gated test that exercises the iOS branch of
  handlePreviewVariants — iosState.getSession +
  iosSession.screenshot + trait-restore loop. Passes in ~17s on
  machines with an available simulator.

exitCode unit tests:
- VariantsCommand's documented exit-code semantics (0 all success,
  1 partial, 2 total failure) were untested because forcing
  deterministic per-variant daemon failures is expensive to
  construct. Extracted the pure mapping into a static
  `VariantsCommand.exitCode(successCount:failCount:)` so it can be
  unit-tested without a daemon round-trip.
- Added a new PreviewsCLITests target that @testable imports
  PreviewsCLI (SPM supports this on executable targets) and pins
  all three branches plus a defensive 0/0 case.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
obj-p added a commit that referenced this pull request Apr 15, 2026
Follow-up audit from PR #113's deferred items. Walked every MCP
handler looking for concurrent-modification races. Summary:

No new bugs found — PreviewSession is an actor so within-session
state transitions are serialized, and the `preview_snapshot` hole
for missing sessions is already fixed in #112.

One defensive improvement worth landing: when a concurrent
`preview_stop` fires mid-variants-loop, the trait-restore-at-end
block would fire against a now-stopped session, producing a
misleading "Warning: failed to restore original traits" message
for a user who explicitly asked for the stop. Skip the restore
when the session is no longer in the registry (iosState for iOS,
App.host.allSessions for macOS). No user-visible change for the
happy path; the spurious warning goes away for the concurrent-stop
case.

Documented the remaining concurrent-modification caveat on
`handlePreviewVariants`: a second client mutating the same session
via preview_configure / preview_switch while variants is mid-loop
will interleave its trait change into our capture stream. The
daemon intentionally does not hold a per-session lock across tool
calls (that's a more invasive architectural change); callers that
want deterministic variants should own the session for the
duration.

All 14 variants integration tests pass unchanged.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
obj-p and others added 9 commits April 15, 2026 20:01
Follow-up audit from PR #113's deferred items. Walked every MCP
handler looking for concurrent-modification races. Summary:

No new bugs found — PreviewSession is an actor so within-session
state transitions are serialized, and the `preview_snapshot` hole
for missing sessions is already fixed in #112.

One defensive improvement worth landing: when a concurrent
`preview_stop` fires mid-variants-loop, the trait-restore-at-end
block would fire against a now-stopped session, producing a
misleading "Warning: failed to restore original traits" message
for a user who explicitly asked for the stop. Skip the restore
when the session is no longer in the registry (iosState for iOS,
App.host.allSessions for macOS). No user-visible change for the
happy path; the spurious warning goes away for the concurrent-stop
case.

Documented the remaining concurrent-modification caveat on
`handlePreviewVariants`: a second client mutating the same session
via preview_configure / preview_switch while variants is mid-loop
will interleave its trait change into our capture stream. The
daemon intentionally does not hold a per-session lock across tool
calls (that's a more invasive architectural change); callers that
want deterministic variants should own the session for the
duration.

All 14 variants integration tests pass unchanged.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The MCP Swift SDK exposes CallTool.Result.structuredContent: Value?
alongside the traditional [Tool.Content] blocks. Populate it on every
tool handler that returns non-trivial data so clients (CLI scripts
via --json, MCP agents via structured consumption) can consume a
typed payload instead of regex-parsing prose.

Shared DTOs live in Sources/PreviewsCLI/DaemonProtocol.swift — a
caseless-enum namespace with nested Codable structs. Kept separate
from domain types (e.g. PreviewInfoDTO vs PreviewsCore.PreviewInfo)
so the wire contract can drift independently.

Migrated handlers:
- preview_start → PreviewStartResult { sessionID, platform,
  sourceFilePath, deviceUDID?, pid?, traits?, previews[], activeIndex,
  setupWarning? }
- preview_variants → VariantsResult { variants[], successCount,
  failCount }. Each variant carries status="ok"|"error" plus
  imageIndex pointing into the sibling content array (for success) or
  an error string (for failure).
- preview_switch → SwitchResult { sessionID, activeIndex, traits?,
  previews[] }. previews[].active replaces the " <- active" marker.
- preview_list → PreviewListResult { file, previews[] }.
- simulator_list → SimulatorListResult { simulators[] }.
- session_list → SessionListResult { sessions[] }. The tab-delimited
  text body is now rendered from the DTO list.
- preview_elements → { sessionID, elements: <tree> } as a raw Value.
  The accessibility tree is opaque WDA JSON, round-tripped natively
  rather than mirrored field-by-field.

Text content blocks are preserved verbatim on every migrated handler
so MCP clients without structured-content support keep working.

Additive change — the CLI still regex-parses text for now. CLI
migration to decodeStructured + --json lands in the follow-up PR.

Tests:
- New MacOSMCPTests.structuredContentPayloadsDecode exercises
  every migrated tool against the real daemon and decodes each
  response through the canonical DTO. Image-index references are
  validated against the sibling content array.
- MCPIntegrationTests now depends on PreviewsCLI so @testable can
  import DaemonProtocol.
- MCPTestServer.callToolResult + decodeStructured helpers added.

All 4 MacOSMCPTests pass (~56s).

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…118)

Wave 2 PR 2/2. The daemon started emitting `structuredContent` in
#117; this PR wires the CLI to decode those payloads and adds the
`--json` output mode per the plan.

Shared helpers in `MCPContentHelpers.swift`:
- `Client.callToolStructured(name:arguments:)` — returns the full
  `CallTool.Result` including `structuredContent`. The SDK's
  tuple-returning `callTool` overload drops that field.
- `Value.decode(_:)` — decode an MCP `Value` into a `Codable`.
- `emitJSON(_:)` — write a pretty-printed, sorted-keys JSON
  document to stdout. Used by every `--json` mode.

Regex parsers deleted (3 call sites):
- `RunCommand.extractSessionID`
- `SnapshotCommand.extractSessionID`
- `VariantsCommand.extractSessionID`
- `VariantsCommand.parseSuccessPreamble`
- `VariantsCommand.parseFailurePreamble`
Every site that needed the sessionID now decodes
`PreviewStartResult.sessionID` from `structuredContent`.

`VariantsCommand.captureVariants` rewrite: instead of walking the
text/image content interleave with regex, decode `VariantsResult`
and use each `VariantOutcomeDTO.imageIndex` to index into the
sibling content array. The Critical ERROR-in-label bug from #109 is
structurally unreachable now — outcome classification comes from a
typed `status` field, not a substring match on prose.

`--json` added to 7 read-oriented commands (per plan):
- `run --detach --json` → full `PreviewStartResult`.
- `snapshot --json` → `{ sessionID, outputPath, format, bytes }`
  synthesized client-side.
- `variants --json` → `{ variants[], successCount, failCount }` with
  per-variant status/path/error. Files still written to disk.
- `list --json` → `PreviewListResult` (client-side only; list does
  not hit the daemon).
- `status --json` → `{ state, running, pid, socketPath }`
  synthesized client-side.
- `simulators --json` → daemon's `SimulatorListResult` passthrough.
- `elements --json` → `{ sessionID, elements }` envelope instead of
  the bare tree.

Skipped on imperative commands (`stop`, `touch`, `configure`,
`switch`, `kill-daemon`) per plan — nothing worth structuring on
stdout.

All 44 tests across 7 suites green. Smoke-tested end-to-end:
- `run --detach --json` → emits full PreviewStartResult
- `snapshot --json` → emits outputPath+bytes+format
- `variants --json` → emits per-variant status/path
- `status --json`, `list --json`, `simulators --json` work offline

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…119)

* Update README for CLI/MCP parity: daemon model, new commands, --json

The CLI section hadn't been updated since the parity stack landed.
Expands it to cover:

- All 12 subcommands grouped by purpose (previewing, snapshotting,
  inspecting, session management, enumeration/diagnostics).
- The daemon model (auto-start, socket location, persistence,
  kill-daemon).
- Magical session reuse for snapshot/variants.
- --session / --file / sole-session resolution convention.
- --json flag on 7 read-oriented commands with jq examples.
- Separated "Project config" into its own section for scannability.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Address staleness audit: AGENTS.md, help text, --json tests

Subagent audit found several stale items after the parity migration.

AGENTS.md:
- Updated test count (~34 → ~100+) and added daemon-touching
  test examples.
- Added "Daemon model" section documenting the architecture, key
  files, and how PreviewsMCPApp routes commands.
- Added "CLI subcommands" table listing all 13 commands with
  purpose and daemon/local classification.
- Added "Structured output (--json)" section documenting the
  convention and the structuredContent wire format.
- Updated MCP tools list to include session_list.
- Added note about structuredContent on tool responses.

Help text:
- SnapshotCommand and VariantsCommand had "(ephemeral session only)"
  qualifiers on --width, --height, --preview that were misleading
  post-migration. Changed to "(new session only; ignored when
  reusing a live session)" which is accurate.

--json tests:
- Added SimulatorsCommandTests.simulatorsJSON — asserts stdout is
  valid JSON with a "simulators" array.
- Added ListCommandTests.listJSON — asserts stdout is valid JSON
  with "file" and "previews" fields, correct preview count.
- These were the two commands with zero --json test coverage.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
"only used by serve" undersells the module — PreviewsMacOS is the
rendering engine for all macOS sessions. It runs inside the daemon
process (which is serve), and every CLI command talks to the daemon.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Eliminate App.host global — inject PreviewHost as a parameter

The `App.host` force-unwrapped global in PreviewsMCPApp.swift was
referenced across 3 files (MCPServer.swift: 18 sites,
BuildHelpers.swift: 3 sites, PreviewsMCPApp.swift: 1 site). This
coupled MCPServer's handler functions to a cross-file global that
any code could read or write.

Replace with explicit injection:
- `configureMCPServer(host:sharedCompiler:)` receives the
  PreviewHost as a parameter from its callers (ServeCommand,
  DaemonListener).
- MCPServer.swift stores it in a file-private `mcpHost` var, set
  once per daemon lifetime before any tool calls arrive. Handler
  functions reference `mcpHost` instead of `App.host`.
- BuildHelpers' `launchMacOSPreview` and `launchIOSPreview` take
  `host:` as a parameter.
- `ServeCommand.sharedHost` is the minimal static handoff between
  PreviewsMCPApp (which creates the host) and ServeCommand.run()
  (which passes it to configureMCPServer). ParsableCommand.run()
  can't take parameters, so this is the narrowest possible shared
  state.
- The `App` enum is deleted.

This is PR 0 of the package rearchitecture: purely mechanical, no
target changes, no behavior change. Prepares for PR 1 (extract
PreviewsEngine) by making the host dependency explicit and scoped.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Rename mcpHost → host for consistency

The file-private var was named mcpHost to avoid shadowing the
configureMCPServer parameter. Since the parameter already uses an
internal label (previewHost), the file-private var can just be host
— matching BuildHelpers' parameter name and reading naturally
everywhere.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Address review: guard host assignment, delete dead launch functions

- Guard `host = previewHost` with `if host == nil` so the comment
  ("set once per daemon lifetime") matches behavior. Previously
  re-assigned on every per-connection configureMCPServer call.
- Delete launchMacOSPreview and launchIOSPreview from
  BuildHelpers.swift — zero callers since the daemon migration.
  Removes unused AppKit and PreviewsMacOS imports from that file.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Extract PreviewsEngine library target from PreviewsCLI monolith

Creates a new `PreviewsEngine` library containing the business logic
that has no MCP or ArgumentParser dependency:

Moved from MCPServer.swift:
- IOSState → IOSSessionManager (public actor)
- ConfigCache (public actor)
- traitsSummary, formatPreviewList (public functions)
- cleanupStaleTempDirs (public function)

Moved from BuildHelpers.swift (entire file):
- StderrProgressReporter, loadProjectConfig, buildSetupFromConfig,
  detectAndBuild, resolveDeviceUDID

PreviewsEngine depends on PreviewsCore, PreviewsMacOS, PreviewsIOS.
It does NOT depend on MCP or ArgumentParser — it can be consumed
independently for embedding preview capabilities in any Swift app.

Injection changes:
- configureMCPServer now takes (host:iosManager:configCache:
  sharedCompiler:) — all engine instances are injected rather than
  created internally.
- DaemonListener.start creates ONE shared IOSSessionManager and
  ConfigCache for all connections (preserves session persistence
  across CLI invocations).
- ServeCommand.runStdio creates its own instances (single-connection
  mode, no sharing needed).
- MCPServer.swift stores the injected instances in
  nonisolated(unsafe) file-private vars (set-once-before-use
  pattern, Swift 6 requires the annotation).

MCPServer.swift is ~100 lines shorter. Everything that remains in
it depends on MCP types (CallTool.Parameters, Server, Value).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Address review: LocalizedError, drop PreviewsMacOS dep, stale comments

Critical:
- NoSimulatorError now conforms to LocalizedError with
  errorDescription (was CustomStringConvertible, which causes
  localizedDescription to return the generic "operation could not
  be completed" per AGENTS.md convention).

Important:
- Removed PreviewsMacOS from PreviewsEngine's Package.swift
  dependencies — no engine file imports it. Engine stays
  platform-agnostic.
- Deleted orphaned doc comment in MCPServer.swift (leftover from
  cleanupStaleTempDirs extraction).
- Updated DaemonListener doc comment: IOSState → IOSSessionManager,
  module-level → shared instances.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Auto-formatted 12 files: removed extra blank lines, fixed trailing
whitespace, added required line breaks, wrapped long line in
SnapshotCommand help text.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Critical fix:
- RunCommand sent "preview" but the daemon expects "previewIndex",
  silently ignoring the --preview flag (always used index 0). Fixed
  key name. Verified: `run --preview 1 --json` now shows
  activeIndex: 1.
- Also removed dead "config" key that the daemon schema doesn't
  accept.

Security:
- Set ~/.previewsmcp/ directory permissions to 0700 so the Unix
  socket is restricted to the current user on shared machines.

Concurrency:
- Replaced nonisolated(unsafe) on iosState/configCache with
  @mainactor — writes are now in the same MainActor.run block as
  the host assignment. Eliminates the theoretical data race if
  configureMCPServer were ever called concurrently.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
obj-p and others added 16 commits April 17, 2026 10:00
The CLI integration tests step had no timeout — defaulting to
GitHub Actions' 6-hour limit. If tests hang (orphan daemons,
resource pressure from parallel compilation), the job runs until
cancelled. Add a 30-minute timeout as a safety net.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three RunCommandTests spawned Process() without setting
PREVIEWSMCP_SOCKET_DIR, so the daemon and CLI talked to different
sockets. Added CLIRunner.applySocketDir(to:) helper and called it
before every direct proc.run().

Also fixed hardcoded ~/.previewsmcp paths in RunCommandTests:
- socketPath now reads from CLIRunner.socketDir
- PID file lookups derive from socketPath, not hardcoded home dir

64/65 CLI integration tests pass. The 1 remaining failure is an
iOS touch test flake (simulator session evicted between tap and
swipe during parallel execution) — pre-existing, not caused by
this change.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pure parallelism overwhelmed CI runners (3-4 cores) with concurrent
swiftc compilations — multiple daemons compiling previews
simultaneously caused resource contention and timeouts.

Hybrid approach: flock serializes test execution (no parallel
compilation overload) while per-test socket directories provide
clean isolation (no shared daemon state between tests). This
eliminates both the old "kill/restart shared daemon per test"
overhead AND the new "parallel compilation" contention.

Results: 65/65 CLI integration tests pass in ~4.5 min serialized.
Zero orphan daemons. No flaky parallel failures.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The DaemonTestLock files had indentation violations from manual
editing. Auto-formatted.

Also bumped CLI integration tests timeout from 30 to 60 minutes.
Locally the serialized suite takes ~4.5 min, but CI runners are
significantly slower (~10x for compilation-heavy tests). The
previous 30-min limit was hit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Per-test socket isolation created a new daemon per test (~65 daemon
startups). On CI runners this took 60+ min. The original pattern
(one shared daemon, flock for serialization, cleanSlate between
suites) is ~10x faster because it reuses the daemon within each
suite.

Reverted:
- DaemonTestLock back to pure flock (no per-test temp dirs)
- cleanSlate back to kill-daemon + remove socket/pid files
- CLIRunner: removed socketDir/@TaskLocal/applySocketDir
- RunCommandTests: removed socketDir-aware paths

Kept from the exploration:
- PREVIEWSMCP_SOCKET_DIR env var in DaemonPaths (useful for
  future per-suite isolation when CI gets faster runners)
- timeout-minutes: 60 on CI step
- 10-minute per-test time limits

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The ios-tests job runs on the same CI runner and leaves an orphan
daemon at ~/.previewsmcp/serve.sock. When build-and-test's CLI
integration tests start, the first test's cleanSlate tries to
connect to this stale daemon — which may be from a different
binary version — and hangs. The test binary produces zero output
for 60 minutes until the timeout kills it.

Fix: add explicit kill-daemon steps before both CLI and MCP
integration test steps.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ention

Root cause of CI hangs: build-and-test and ios-tests run
concurrently on the same macOS runner, both spawning daemons at
~/.previewsmcp/serve.sock. The DaemonTestLock flock only
serializes within a single swift-test process — it can't prevent
two separate CI jobs from stomping on each other's daemon.

Fix: each CI job sets a unique PREVIEWSMCP_SOCKET_DIR. The daemon
reads this env var and binds to an isolated socket path.
build-and-test uses /tmp/previewsmcp-ci-build, ios-tests uses
/tmp/previewsmcp-ci-ios. No cross-job contention.

Removed the kill-daemon steps added in the previous attempt —
socket isolation makes them unnecessary.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Every test on CI fails with "daemon did not become ready" — the
10-second startup timeout is too short for the CI runner. Increased
to 30 seconds.

Also redirect daemon stderr to DaemonPaths.logFile (serve.log)
instead of /dev/null so startup failures on CI are diagnosable.

Simplified cleanSlate to just kill-daemon (no hardcoded path
cleanup — DaemonPaths resolves the correct directory via the env
var).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The non-blocking flock + Task.sleep polling loop starved Swift's
cooperative thread pool on CI runners with small pools (~3-4
threads). 10 suites started concurrently, each polling the lock via
Task.sleep(50ms). The 9 waiting tasks consumed all cooperative
threads, preventing the lock-holding task's subprocess completion
handlers (CLIRunner.runProcess continuation.resume) from firing.
Result: zero test output for 60 minutes, then timeout.

Fix: acquire the flock on a DispatchQueue.global thread via
withCheckedThrowingContinuation. The blocking flock(LOCK_EX) sleeps
in the kernel — not on a cooperative thread. When the lock becomes
available, the continuation resumes on the cooperative pool with
only one task active, leaving threads free for subprocess callbacks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The test binary hangs on CI producing zero output. We can't SSH
into the runner. Add a background monitor that logs the process
tree every 30s so we can see whether swift-test, swiftpm-testing,
or previewsmcp processes are alive and what state they're in.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Root cause: the daemon subprocess spawned via setsid() on CI
runners can't connect to the Quartz window server.
NSApplication.shared blocks indefinitely waiting for
CGSDefaultConnection, the daemon never binds its socket, and every
test waits 30s then times out — producing zero test output for 60
minutes.

Fix: check CGMainDisplayID() before NSApplication.shared. If it
returns 0, the window server is unavailable — exit immediately
with a clear error instead of hanging. The DaemonClient sees the
daemon exit code and reports startup failure in <1s instead of
blocking for 30s per test.

Also added a diagnostic CI step to log CGMainDisplayID() so we
can confirm the hypothesis.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The python Quartz module isn't available on the CI runner. Use
swift with CoreGraphics directly. Also added continue-on-error
so the diagnostic doesn't block the test step.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The process monitor revealed the tests weren't hanging — they were
running iOS simulator tests (boot simulator, compile iOS dylib,
install host app) which take 5-10 min each on CI. The --skip
snapshotIOS only skipped 1 of 5 iOS tests. Added skips for all:
stopIOSSession, touchIOSHappyPath, capturesIOSVariants,
elementsReturnsJSONTree.

These are already covered by the ios-tests job. The build-and-test
job should only run macOS-path tests.

Also removed the CGMainDisplayID guard — the diagnostic confirmed
the window server IS available (CGMainDisplayID: 1). Removed the
process monitor (no longer needed). Removed CoreGraphics import.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
On CI, the job-level env var overrides the daemon's socket
directory, but DaemonLifecycleTests.socketPath only checked the
@TaskLocal (set inside DaemonTestLock.run) and fell back to the
hardcoded ~/.previewsmcp/ path. The daemon bound to
/tmp/previewsmcp-ci-build/serve.sock but the test looked for
~/.previewsmcp/serve.sock.

Fix: read ProcessInfo.processInfo.environment as an intermediate
fallback between the @TaskLocal and the hardcoded default.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The 4 iOS CLI integration tests (stopIOSSession, touchIOSHappyPath,
capturesIOSVariants, elementsReturnsJSONTree) were skipped in
build-and-test but not run anywhere else. Added an "iOS CLI
integration tests" step to the ios-tests job so these code paths
are covered.

Also removed the Check window server diagnostic step — it confirmed
CGMainDisplayID: 1 (window server available) and is no longer needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Added hard-won knowledge from debugging the CI hang:
- DaemonTestLock must use blocking flock on GCD, not polling
  with Task.sleep (cooperative thread pool starvation)
- Concurrent CI jobs need PREVIEWSMCP_SOCKET_DIR isolation
- iOS CLI tests are split across jobs — add --skip when adding new ones
- Daemon requires window server; check CGMainDisplayID if tests hang
- Daemon startup is 10-20x slower on CI than locally

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@obj-p obj-p enabled auto-merge (squash) April 18, 2026 20:20
obj-p and others added 4 commits April 18, 2026 18:02
DaemonTestLock's flock path was hardcoded to /tmp, so concurrent CI jobs
(build-and-test and ios-tests) contended on the same lock file — if one
job's test hung while holding the lock, the other blocked with zero
output until timeout.

- DaemonTestLock (both targets): lockPath now reads PREVIEWSMCP_SOCKET_DIR
  so each CI job gets its own lock file
- RunCommandTests: replace hardcoded ~/.previewsmcp with daemonDir that
  respects the env var (socket and PID assertions were wrong on CI)
- ci.yml: remove duplicate snapshotIOS from iOS CLI integration tests
  step (already runs in the preceding iOS CLI snapshot test step)
- DaemonProbe: use defer for lock release consistency
- AGENTS.md: document stdout-for-data / stderr-for-side-effects convention

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
On CI, PREVIEWSMCP_SOCKET_DIR points to /tmp/previewsmcp-ci-build which
may not exist yet. open(O_CREAT) creates the file but not intermediate
directories — add createDirectory before acquiring the flock.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The "iOS CLI integration tests" step has never passed — it hangs with
zero output. Root causes:
- Previous step (snapshotIOS) leaves a daemon running; the next step's
  tests may contend with it before cleanSlate() fires
- No NSUnbufferedIO, so Swift Testing output is invisible during hangs
- 20-minute timeout too tight for 4 iOS tests plus a ~4-min build phase

Kill stale daemon between steps, enable unbuffered IO for visibility,
and bump timeout to 30 minutes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The 4 iOS tests run in parallel suites but serialize on DaemonTestLock.
Touch (266s) and variants (562s) complete first; elements and stop wait
~560s for their turn then exceed the 10-minute time limit before their
own execution finishes. 20 minutes accommodates the worst-case queue
wait plus execution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@obj-p obj-p disabled auto-merge April 19, 2026 02:02
obj-p and others added 6 commits April 18, 2026 23:13
Each iOS test independently paid ~100s for daemon + compile + simulator
setup, then serialized on the flock. Combined: one setup, then touch →
elements → variants → stop in sequence. Expected ~5 min vs ~18 min.

Also simplified the build-and-test skip list since the individual iOS
test names no longer exist.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The daemon fails to start with no visible error. Dump serve.log and
socket directory listing on failure to diagnose why.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The step hung for 60 min with zero output on a slow runner. Adding
unbuffered IO so future hangs show test progress for diagnosis. The
ios-tests steps already have this set.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
snapshotIOS hung with zero output for 10 min on a bad runner. Set
NSUnbufferedIO: YES on every swift test step so hangs always show
progress for diagnosis.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
On slow free-tier GitHub runners, flock queue wait (~5 min) plus slow
test execution (~30-40s each) pushes later tests past the 10-minute
per-test limit. Three macOS tests hit this on the last run. 20 minutes
gives enough headroom for the worst-case queue + execution on slow
runners while still catching genuinely stuck tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Document that DaemonTestLock uses PREVIEWSMCP_SOCKET_DIR for lock path
- Update iOS CLI test split docs: individual tests replaced by
  IOSCLIWorkflowTests.iosCLIWorkflow

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@obj-p obj-p merged commit 3c4f0b6 into main Apr 19, 2026
4 checks passed
@obj-p obj-p deleted the cli-mcp-parity branch April 19, 2026 14:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant