Skip to content

fix(ios-tests): wait for STOPPING before signalling node exit#59

Merged
gmaclennan merged 2 commits into
mainfrom
claude/fix-flaky-stop-shutdown-test
May 7, 2026
Merged

fix(ios-tests): wait for STOPPING before signalling node exit#59
gmaclennan merged 2 commits into
mainfrom
claude/fix-flaky-stop-shutdown-test

Conversation

@gmaclennan
Copy link
Copy Markdown
Member

Summary

Fixes a race in testStopSendsShutdownMessageOverIPC that caused intermittent macOS Swift Package Test failures (e.g. on the #58 CI runs).

The race

The test dispatched service.stop(timeout: 2) to a background queue and then immediately called signalExit() on the main thread:

DispatchQueue.global().async {
    service.stop(timeout: 2)
    stopFinished.fulfill()
}
signalExit()

When the runner was loaded enough that the dispatched stop() hadn't started executing by the time signalExit() ran, the mock node thread's semaphore.wait() unblocked and the closure returned before stop() had set stopRequested = true. The state derivation rule Runtime exited unexpectedly → ERROR then fired (since the exit looked Unexpected), and when stop() finally ran it hit the guard:

guard state == .started || state == .starting else {
    log("Cannot stop: state is \(state.rawValue)")  // "ERROR"
    return
}

…and bailed without sending the shutdown frame. Result: service.state == .error and backend.receivedShutdown == false, both of which the test asserts against.

The fix

Register a STOPPING expectation alongside the existing STARTED one, and wait(for: [stoppingExpectation]) between dispatching stop() and calling signalExit(). The STOPPING transition is emitted inside stop() immediately after applyAndEmit { stopRequested = true }, so observing it guarantees stopRequested is true before the runtime exits — the exit is then classified as Requested, not Unexpected, and the state derives correctly to STOPPED.

Test plan

  • swift test --filter NodeJSServiceTests/testStopSendsShutdownMessageOverIPC × 50 — all pass, durations cluster 53–63 ms
  • swift test (full suite, 67 tests) — clean

🤖 Generated with Claude Code

`testStopSendsShutdownMessageOverIPC` dispatched `service.stop()` to a
background queue and then immediately called `signalExit()` on the main
thread. When the bg dispatch was scheduled late, the mock node's
`semaphore.wait()` unblocked before `stop()` had a chance to set
`stopRequested`, the runtime exit was classified as Unexpected, and
the service derived to ERROR — which made `stop()` bail on its
`state == .started || .starting` guard, leaving `receivedShutdown`
false. The CI failure in #58 was this race.

Fix: register a STOPPING expectation alongside STARTED, and wait for
that transition (which happens inside `stop()` after `stopRequested`
is set) before signalling exit. Verified with 50 consecutive runs of
the test and a full-suite run; all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the iOS Swift test suite to remove a race in NodeJSServiceTests/testStopSendsShutdownMessageOverIPC by ensuring stop() has begun (and the service has entered STOPPING) before the mock Node runtime is signalled to exit, preventing intermittent .error outcomes during CI.

Changes:

  • Add a STOPPING expectation in testStopSendsShutdownMessageOverIPC.
  • Replace the initial broad waitForExpectations with an explicit wait for STARTED, then wait for STOPPING before calling signalExit().

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread ios/Tests/NodeJSServiceTests.swift Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@gmaclennan gmaclennan enabled auto-merge (squash) May 7, 2026 15:23
@gmaclennan gmaclennan merged commit ddee1aa into main May 7, 2026
7 checks passed
@gmaclennan gmaclennan deleted the claude/fix-flaky-stop-shutdown-test branch May 7, 2026 15:35
gmaclennan added a commit that referenced this pull request May 19, 2026
…ycleStateTransitions (#71)

## Summary

Fixes a race in
[`testFullLifecycleStateTransitions`](ios/Tests/NodeJSServiceTests.swift#L582)
that intermittently fails the macOS Swift Package Tests job ([example
failure on
#70](https://github.com/digidem/comapeo-core-react-native/actions/runs/26056852629/job/76606978112)):

```
NodeJSServiceTests.swift:624: error:
  -[ComapeoCoreTests.NodeJSServiceTests testFullLifecycleStateTransitions] :
  XCTAssertLessThan failed: ("3") is not less than ("2") -
  STOPPING should come before STOPPED
```

The recorded `transitions` array ended up `[starting, started, stopped,
stopping]` rather than the expected `[starting, started, stopping,
stopped]`.

## The race

Same shape as PR #59, slightly different underlying mechanic. PR #59
fixed the sister test `testStopSendsShutdownMessageOverIPC`; this test
was introduced in PR #47 and wasn't covered by that fix.

`NodeJSService.applyAndEmit` releases the service lock *before* invoking
`onStateChange?(derived)`
([NodeJSService.swift:277](ios/NodeJSService.swift#L277), then
[:319](ios/NodeJSService.swift#L319)) — a deliberate requirement of
[`testObserverCanReenterLockedMethodFromCallback`](ios/Tests/NodeJSServiceTests.swift#L634),
which asserts the callback can re-enter locked methods like `cleanup()`
without deadlocking. The state mutations are linearised under the lock,
but the post-unlock observer invocations are not.

The test calls `signalExit()` *before* `service.stop(timeout: 1)`:

```swift
// Stop
signalExit()
service.stop(timeout: 1)
```

That makes the node thread's `applyAndEmit` (writing `nodeRuntime =
.exited(.requested)` → derives STOPPED) race the main thread's
`applyAndEmit` from inside `stop()` (writing `stopRequested = true` →
derives STOPPING). Even when the locked state transitions run in the
right order, the post-unlock `onStateChange` callbacks can be reordered
by thread scheduling, so the recorded array sees STOPPED before
STOPPING. On a quiet runner the main thread usually wins; on a loaded
macOS-14-arm64 CI runner it sometimes loses.

## The fix

Identical pattern to PR #59: dispatch `stop()` to a background queue,
register a STOPPING expectation, and `wait(for: [stoppingExpectation])`
before calling `signalExit()`. That pins the observer ordering —
STOPPING is appended to `transitions` before the node thread is ever
unblocked, so the later STOPPED append from the node thread necessarily
lands at a higher index.

```swift
let stopFinished = expectation(description: "stop() returned")
DispatchQueue.global().async {
    service.stop(timeout: 1)
    stopFinished.fulfill()
}
wait(for: [stoppingExpectation], timeout: 5)
signalExit()
wait(for: [stopFinished], timeout: 5)
```

## Test plan

- [ ] `swift test --filter
NodeJSServiceTests/testFullLifecycleStateTransitions` × 50 — all pass
- [ ] `swift test` (full suite) — clean

(I'm on a Linux container so can't run `swift test` here; verifying via
CI on this PR.)

https://claude.ai/code/session_01BRgG9bgdFy9pZCov2uHyef

---
_Generated by [Claude
Code](https://claude.ai/code/session_01BRgG9bgdFy9pZCov2uHyef)_

Co-authored-by: Claude <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants