fix: replace blocking waitUntilExit with async terminationHandler to prevent thread starvation #249

Merged
pepicrft merged 2 commits into tuist:main from irena327:fix-async-deadlock on May 8, 2026

irena327 (Contributor) commented on May 8, 2026

Summary

Replace process.waitUntilExit() with process.terminationHandler + CheckedContinuation to avoid blocking the cooperative thread pool.

Problem

CommandRunner.run() executes inside a Task.detached, which runs on Swift's cooperative thread pool. The call to process.waitUntilExit() blocks that thread until the subprocess finishes. When multiple commands run concurrently (e.g., during manifest loading in a large monorepo), each blocked thread reduces pool capacity. Once all pool threads are occupied by waitUntilExit() calls, the stdoutTask and stderrTask pipe-reading tasks can never be scheduled. The subprocesses then fill their pipe buffers and block on write, while waitUntilExit() waits for processes that can never finish — a classic thread starvation deadlock.

This causes tuist generate and tuist test to hang indefinitely at "Loading and constructing the graph" in large repos.
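For readers who have not seen the pattern, the blocking shape described in this section can be reduced to something like the sketch below. It is an illustration only: `runBlocking` and `runDetached` are hypothetical names, and the real `CommandRunner.run()` also wires up pipes and reader tasks that are omitted here.

```swift
import Foundation

// Hypothetical reduction of the blocking pattern; not the actual CommandRunner code.
func runBlocking(_ executable: String, _ arguments: [String]) throws -> Int32 {
    let process = Process()
    process.executableURL = URL(fileURLWithPath: executable)
    process.arguments = arguments
    try process.run()
    // Blocks the calling thread until the child exits.
    process.waitUntilExit()
    return process.terminationStatus
}

// Calling the blocking helper from a detached task parks one cooperative-pool
// thread for the whole lifetime of the child process.
func runDetached(_ executable: String, _ arguments: [String]) async throws -> Int32 {
    try await Task.detached {
        try runBlocking(executable, arguments)
    }.value
}
```

A single call like this completes fine. The hang only appears when enough of these run at once to occupy every pool thread while the children are still waiting for their pipes to be drained.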

Fix

Replace the blocking waitUntilExit() with an async-friendly terminationHandler that suspends the task via CheckedContinuation. This frees the pool thread while waiting for the process to exit, allowing pipe-draining tasks to be scheduled normally.
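A minimal sketch of the approach, assuming a standalone helper rather than the actual `CommandRunner` code (`runAndAwaitExit` is a hypothetical name, and pipe handling is left out):

```swift
import Foundation

// Hypothetical helper: launches an executable and suspends, rather than
// blocks, until it exits. Returns the termination status.
func runAndAwaitExit(_ executable: String, _ arguments: [String]) async throws -> Int32 {
    let process = Process()
    process.executableURL = URL(fileURLWithPath: executable)
    process.arguments = arguments

    return try await withCheckedThrowingContinuation { continuation in
        // Install the handler before launching so a fast exit cannot be missed.
        process.terminationHandler = { finished in
            continuation.resume(returning: finished.terminationStatus)
        }
        do {
            try process.run()
        } catch {
            // The process never started; clear the handler so the
            // continuation is resumed exactly once, here.
            process.terminationHandler = nil
            continuation.resume(throwing: error)
        }
    }
}
```

While the task is suspended on the continuation it holds no pool thread, so other tasks (such as the pipe readers) can be scheduled in the meantime.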

Irena Lee and others added 2 commits May 7, 2026 20:13
fortmarek (Member) commented

Added a regression test in CommandRunnerRaceTests that exercises the pipe-buffer starvation case described in the PR.

What changed:

  • Added runsManyConcurrentCommandsWithLargeOutput_successfully.
  • The test starts 64 concurrent /bin/sh commands.
  • Each command writes 256 KiB to stdout and 256 KiB to stderr.
  • The test asserts that CommandRunner drains the full byte count from both streams.
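The shape of such a test can be sketched roughly as follows. This is not the test added in the PR: `StreamCounts`, `Collector`, `runNoisyCommand`, and `runMany` are hypothetical names, the shell script using `head -c` and `/dev/zero` is an assumption, and the sketch drains the pipes with `readabilityHandler` instead of going through `CommandRunner`.

```swift
import Foundation

/// Bytes drained from one command's output streams.
struct StreamCounts: Sendable {
    let stdout: Int
    let stderr: Int
}

/// Hypothetical helper: tallies bytes from both pipes and calls `onDone` once
/// stdout EOF, stderr EOF, and process exit have all been observed.
final class Collector: @unchecked Sendable {
    private let lock = NSLock()
    private var stdoutBytes = 0
    private var stderrBytes = 0
    private var pending = 3
    private let onDone: @Sendable (StreamCounts) -> Void

    init(onDone: @escaping @Sendable (StreamCounts) -> Void) { self.onDone = onDone }

    func add(stdout: Int = 0, stderr: Int = 0) {
        lock.lock()
        stdoutBytes += stdout
        stderrBytes += stderr
        lock.unlock()
    }

    func finishOne() {
        lock.lock()
        pending -= 1
        let finished = pending == 0
        let counts = StreamCounts(stdout: stdoutBytes, stderr: stderrBytes)
        lock.unlock()
        if finished { onDone(counts) }
    }
}

/// Runs one /bin/sh command that writes `bytesPerStream` bytes to each stream.
func runNoisyCommand(bytesPerStream: Int) async throws -> StreamCounts {
    let process = Process()
    process.executableURL = URL(fileURLWithPath: "/bin/sh")
    // Assumed script: N zero bytes to stdout, then N zero bytes to stderr.
    process.arguments = ["-c", "head -c \(bytesPerStream) /dev/zero; head -c \(bytesPerStream) /dev/zero 1>&2"]
    let outPipe = Pipe()
    let errPipe = Pipe()
    process.standardOutput = outPipe
    process.standardError = errPipe

    return try await withCheckedThrowingContinuation { continuation in
        let collector = Collector { continuation.resume(returning: $0) }
        outPipe.fileHandleForReading.readabilityHandler = { handle in
            let chunk = handle.availableData
            if chunk.isEmpty {  // an empty read signals EOF
                handle.readabilityHandler = nil
                collector.finishOne()
            } else {
                collector.add(stdout: chunk.count)
            }
        }
        errPipe.fileHandleForReading.readabilityHandler = { handle in
            let chunk = handle.availableData
            if chunk.isEmpty {
                handle.readabilityHandler = nil
                collector.finishOne()
            } else {
                collector.add(stderr: chunk.count)
            }
        }
        process.terminationHandler = { _ in collector.finishOne() }
        do {
            try process.run()
        } catch {
            // Launch failed: detach every handler and resume exactly once.
            outPipe.fileHandleForReading.readabilityHandler = nil
            errPipe.fileHandleForReading.readabilityHandler = nil
            process.terminationHandler = nil
            continuation.resume(throwing: error)
        }
    }
}

/// Starts `commands` noisy commands concurrently and collects their counts.
func runMany(commands: Int, bytesPerStream: Int) async throws -> [StreamCounts] {
    try await withThrowingTaskGroup(of: StreamCounts.self) { group in
        for _ in 0..<commands {
            group.addTask { try await runNoisyCommand(bytesPerStream: bytesPerStream) }
        }
        var results: [StreamCounts] = []
        for try await counts in group { results.append(counts) }
        return results
    }
}
```

With the parameters from the list above, the call would be `runMany(commands: 64, bytesPerStream: 256 * 1024)`, followed by an assertion that every result reports the full byte count on both streams. Each stream carries more than a typical pipe buffer holds, so a command can only finish if its pipes are actually being drained.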

Local verification:

  • Temporarily reverted the implementation back to process.run(); process.waitUntilExit() while keeping the new test. The build completed, the test started, and then it hung, reproducing the deadlock shape this PR is fixing.
  • Restored the async terminationHandler + continuation implementation from this PR and reran the same test successfully:
    • swift test --scratch-path /private/tmp/command-pr-249-regression-build --filter CommandRunnerRaceTests.runsManyConcurrentCommandsWithLargeOutput_successfully
    • Passed in 0.212s on the final run.

This gives us a TDD-style regression check: old behavior hangs, fixed behavior drains the pipes and completes.

@fortmarek fortmarek requested a review from pepicrft May 8, 2026 07:59
@pepicrft pepicrft merged commit 8e9ba18 into tuist:main May 8, 2026
9 checks passed

3 participants