Skip to content

CLI test summary reports false-positive errors from xcodebuild's NSError dictionary dump #383

@cameroncooke

Description

@cameroncooke

Summary

The xcodebuildmcp simulator test CLI parser emits a false-positive entry in the Errors (N): section of its final summary whenever xcodebuild prints a multi-line Cocoa NSError dictionary to stdout — even when xcodebuild itself reports ** TEST EXECUTE SUCCEEDED ** and every test case passes. This produces summaries like

✅ 12 tests passed, 0 skipped (⏱️ 105.5s)
…
Errors (1):
  ✗ } (error = Error Domain=FBSOpenApplicationServiceErrorDomain Code=1 …)

which are internally contradictory (success + error in the same summary) and have a junk-looking entry name (}) caused by parsing the closing brace of the dictionary dump as the start of an error.

Three independent issues compound here:

  1. The error-line regex matches error = inside a Cocoa NSError description, not just error: compiler diagnostics.
  2. When the regex matches, the entire raw line (including the leading } from the dictionary dump's closing brace) is stored verbatim as the error message.
  3. state.errors accumulates these fragments unconditionally; nothing reconciles them against ** TEST EXECUTE SUCCEEDED ** or the absence of test-case failures.

Reproducing run (real evidence from the user's machine)

CLI invocation, in ~/Developer/XcodeBuildMCP/example_projects/Weather:

xcodebuildmcp simulator test

xcodebuild ran with parallel test execution across simulator clones, and one xctrunner launch on Clone 3 was rejected by SpringBoard (transient state, likely from a prior wedged session). xcodebuild rebalanced the work to Clones 1 and 2, finished cleanly, and emitted ** TEST EXECUTE SUCCEEDED **. All 12 test cases passed. Excerpts from the build log (~/Library/Developer/XcodeBuildMCP/logs/test_sim_2026-05-01T08-38-50-203Z_pid54049.log):

Lines 265–305 — the offending xcodebuild stdout (one multi-line NSError dictionary dump):

2026-05-01 09:40:07.637 xcodebuild[…]  iOSSimulator: 33AA6A40-…: Failed to launch app with identifier: com.sentry.weather.WeatherUITests.xctrunner and options: {
    "activate_suspended" = 1;
    arguments =     (
    );
    environment =     {
        …
    };
    stderr = "/dev/ttys016";
    stdout = "/dev/ttys016";
    "terminate_running_process" = 1;
    "wait_for_debugger" = 1;
} (error = Error Domain=FBSOpenApplicationServiceErrorDomain Code=1 "Simulator device failed to launch …" UserInfo={… SimCallingSelector=launchApplicationWithID:options:pid:error:, BSErrorCodeDescription=RequestDenied})

Line 317:

** TEST EXECUTE SUCCEEDED **

Lines 320–334 — actual test cases, all passing across 3 sim clones:

Test suite 'WeatherTests' started on 'Clone 1 of iPhone 17 Pro - Weather'
Test case 'WeatherTests/emptySearchReturnsNoResults()' passed …
Test case 'WeatherTests/temperatureFormattingMatchesPrototypeRules()' passed …
…
Test case 'WeatherUITests.testLaunchPerformance()' passed on 'Clone 2 of iPhone 17 Pro - WeatherUITests-Runner' (31.398 seconds)
…
Test case 'WeatherUITests.testSettingsSheetOpens()' passed on 'Clone 2 of iPhone 17 Pro - WeatherUITests-Runner' (7.063 seconds)

Total: 4 in WeatherTests + 4 in WeatherUITests + 4 in WeatherUITestsLaunchTests = 12. The 12 tests passed count is correct. The Errors (1) block is the bug.

Root-cause walkthrough

1. Overly loose regex misclassifies NSError description as a compiler diagnostic

src/utils/xcodebuild-line-parsers.ts:52

const BUILD_ERROR_DIAGNOSTIC_PATTERN = /(?:^|[\s:])(?:fatal error|error):\s*\S/iu;

The intent is to recognise compiler-diagnostic lines such as File.swift:42:10: error: missing semicolon or xcodebuild: error: foo. But the case-insensitive pattern fires on a substring inside the dictionary dump on log line 305:

… SimCallingSelector=launchApplicationWithID:options:pid:error:, …

pid:error:, matches: the : before error satisfies [\s:], the literal error: matches, and the , immediately following satisfies \s*\S. So the regex returns true even though there is no compiler diagnostic in this line at all — the substring is just a Cocoa selector name terminating in error: followed by a punctuation character.

2. Fallback path stores the entire raw line as the error message

src/utils/xcodebuild-line-parsers.ts:150-184

export function parseBuildErrorDiagnostic(line: string): ParsedBuildError | null {
  // file:line:col error: message  — does not match (line begins with `}`)
  // /path: error: message          — does not match
  // prefix: error: message         — does not match (line begins with `}`)

  if (!isBuildErrorDiagnosticLine(line)) {
    return null;
  }
  return { message: line, renderedLine: line };  // <-- line 184: whole line as message
}

None of the three structured patterns (lines 152, 163, 174) match the offending line, so we fall through to the catch-all on line 180. With the regex returning true, line 184 stores the entire raw stdout line — } (error = Error Domain=… RequestDenied) — as message.

3. Fragment is added to state.errors unconditionally; never reconciled with TEST EXECUTE SUCCEEDED

src/utils/xcodebuild-event-parser.ts:383-392 emits a compiler-diagnostic event with severity: 'error'.

src/utils/xcodebuild-run-state.ts:173-180

if (severity === 'error') {
  acceptDedupedDiagnostic(fragment, state.errors);
}

state.errors is purely additive; nothing inspects ** TEST EXECUTE SUCCEEDED ** (no grep hits for that marker in either xcodebuild-event-parser.ts or xcodebuild-run-state.ts) to gate or clear it, and there is no cross-check that says "if the test-case stream completed successfully, drop entries that came from xcodebuild's NSError dump".

4. The renderer faithfully prints what it was given

src/utils/renderers/domain-result-text.ts:238-247

if (diagnostics.errors.length > 0) {
  sections.push(
    createSection(
      `Errors (${diagnostics.errors.length}):`,
      createMarkedDiagnosticLines(diagnostics.errors, '✗'),
      

createMarkedDiagnosticLines (lines 207-219) prepends to the first line of entry.message. Since the message starts with }, the output reads ✗ } (error = …), which looks like a test named } failed. There is no separate "test name" field — the } is just the literal first character of the captured stdout line.

Test-count side note (this part is correct)

The ✅ 12 tests passed figure is computed at src/utils/xcodebuild-domain-results.ts:375-377 from the parsed Test case '…' passed/failed/skipped stream and matches reality (12 passes, 0 failures across 3 sim clones). The bug is only in the Errors (N): aggregator. So the inconsistency the user sees — green tick alongside an error block — is the parser inventing an error that has no corresponding test failure or compiler diagnostic.

Suggested fixes (owner's call)

  1. Tighten BUILD_ERROR_DIAGNOSTIC_PATTERN so it does not match error: followed by punctuation (e.g., require \s or end-of-line after the colon, or require a non-empty trailing message that does not begin with ,)};). Specifically, exclude the case where what follows the colon is a Cocoa-selector terminator.
  2. Detect Cocoa NSError dumps explicitly and either suppress them entirely or emit them as warnings, not errors. xcodebuild logs them in a recognisable shape: a line ending in and options: { opens the dump, and a line beginning with } (error = Error Domain= closes it. Treat the whole block as one informational event.
  3. Reconcile state.errors against ** TEST EXECUTE SUCCEEDED **. If the marker is observed and state.testFailures.length === 0, drop or downgrade error entries that came from the catch-all parseBuildErrorDiagnostic fallback (line 184) rather than from a structured pattern (lines 152/163/174). The structured matches are real compiler errors; the fallback is best-effort and is the one producing the false positives.
  4. Cosmetic: never emit an error entry whose first character is structural punctuation (}, ], ) etc.) — that is always a parser-misalignment signal.

Related UX problem worth noting

During the 105 s test run the CLI prints no progress output between Test\n Scheme: … and the final summary — only a single static glyph. Users routinely interpret this as a hang (the user who reported this issue assumed it was hung and asked for a process sample after ~2 minutes of silence). Even minimal incremental output ("Building tests…", "Launching test runner…", "Running tests…", "Collecting results…") would prevent that misread. Possibly a separate issue — flagging here for context.

Bonus observation: log-directory growth

~/Library/Developer/XcodeBuildMCP/logs/ currently contains 55,371 files going back to 2026-04-11. There is no apparent rotation or cap — every run leaves multiple .log files behind. Worth a separate housekeeping issue.

Environment

  • macOS 26.3.1 (25D2128)
  • Xcode 26.4.0
  • xcodebuildmcp via /opt/homebrew/bin/xcodebuildmcp (global npm install, latest)
  • Project: example_projects/Weather from getsentry/XcodeBuildMCP main
  • iOS Simulator UDID: A2C64636-37E9-4B68-B872-E7F0A82A5670 (iPhone 17 Pro, iOS 26.4)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions