Fix ToolTask hang when grandchild processes inherit pipe handles #13351
Merged
When a tool spawned by ToolTask creates child processes that inherit stdout/stderr pipe handles, Process.WaitForExit() blocks forever because the parameterless overload waits for pipe EOF via AsyncStreamReader.WaitUtilEOF(). The grandchild holds the pipe open indefinitely, causing the MSBuild node to hang and orphan worker processes with file locks.

The fix (behind ChangeWave 18.6) replaces the parameterless WaitForExit() with a two-step approach:

1. WaitForExit(Timeout.Infinite) - waits for process handle only, not pipe EOF
2. WaitAll on EOF sentinel events with bounded 2s timeout - waits for AsyncStreamReader to deliver Data=null (the EOF callback), which fires after all data including the final partial line has been flushed

This provides identical data guarantees to the original WaitForExit() in the normal case (EOF arrives within milliseconds), while preventing infinite hangs when grandchild processes hold pipe handles.

Fixes #2981
Contributor
Pull request overview
Addresses a long-standing ToolTask hang caused by Process.WaitForExit() waiting for stdout/stderr pipe EOF when grandchild processes inherit the redirected pipe handles, by switching to a bounded EOF-drain wait observed via DataReceived EOF sentinels (behind ChangeWave 18.6).
Changes:
- Add stdout/stderr EOF `ManualResetEvent`s and use them to wait (bounded) for async stream-drain completion after waiting for the process handle.
- Update the `DataReceived` handler to signal EOF events when `Data == null`.
- Add regression tests for the hang scenario and for preserving tool output; introduce ChangeWave 18.6 and document it.
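The EOF-event changes listed above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the code in `src/Utilities/ToolTask.cs`: `BoundedDrainSketch`, `Run`, `Record`, and the `eofTimeoutMs` parameter are hypothetical names, and the 2000 ms default and the `int.MaxValue` timeout simply mirror the values quoted in this PR's description.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

// Hypothetical sketch of the two-step wait; not MSBuild's ToolTask implementation.
static class BoundedDrainSketch
{
    public static (int ExitCode, bool SawEof, List<string> Lines) Run(
        string fileName, string arguments, int eofTimeoutMs = 2000)
    {
        var lines = new List<string>();

        // Deliberately not disposed: if a grandchild keeps the pipe open, the
        // EOF callback may still fire (and call Set) after this method returns.
        var stdoutEof = new ManualResetEvent(false);
        var stderrEof = new ManualResetEvent(false);

        using (var proc = new Process())
        {
            proc.StartInfo = new ProcessStartInfo(fileName, arguments)
            {
                RedirectStandardOutput = true,
                RedirectStandardError = true,
                UseShellExecute = false,
            };

            // Data == null is the EOF sentinel; it arrives after every line.
            proc.OutputDataReceived += (_, e) => Record(e.Data, stdoutEof, lines);
            proc.ErrorDataReceived += (_, e) => Record(e.Data, stderrEof, lines);

            proc.Start();
            proc.BeginOutputReadLine();
            proc.BeginErrorReadLine();

            // Step 1: wait on the process handle only. A finite timeout means
            // this overload does not also block on pipe EOF.
            proc.WaitForExit(int.MaxValue);

            // Step 2: bounded wait for both EOF sentinels.
            bool sawEof = WaitHandle.WaitAll(
                new WaitHandle[] { stdoutEof, stderrEof }, eofTimeoutMs);

            lock (lines)
            {
                return (proc.ExitCode, sawEof, new List<string>(lines));
            }
        }
    }

    private static void Record(string data, ManualResetEvent eof, List<string> lines)
    {
        if (data == null)
        {
            eof.Set(); // pipe reached EOF: all earlier lines were already delivered
            return;
        }

        lock (lines)
        {
            lines.Add(data);
        }
    }

    static void Main()
    {
        bool isWindows = Environment.OSVersion.Platform == PlatformID.Win32NT;
        var result = isWindows
            ? Run("cmd.exe", "/c echo hello")
            : Run("/bin/sh", "-c \"echo hello\"");

        Console.WriteLine(
            $"exit={result.ExitCode} eof={result.SawEof} lines={string.Join("|", result.Lines)}");
    }
}
```

With a tool that leaves a background grandchild holding the pipe, `SawEof` would be expected to come back `false` after roughly `eofTimeoutMs`, while any lines delivered before the timeout remain in `Lines`. The caller can treat that as "proceed without EOF" rather than as a failure.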
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/Utilities/ToolTask.cs | Implements Wave18.6 guarded hang fix by decoupling process-exit wait from pipe-EOF wait, using EOF callbacks with a bounded timeout. |
| src/Utilities.UnitTests/ToolTask_Tests.cs | Adds regression tests for the grandchild-inherited-pipe hang and for output capture behavior. |
| src/Framework/ChangeWaves.cs | Introduces Wave18_6 and adds it to AllWaves. |
| documentation/wiki/ChangeWaves.md | Documents the new 18.6 change wave and its associated feature. |
- Update remarks to clarify hang affects both .NET Framework and modern .NET
- Shorten ping duration from 120 to 10 in test (still exceeds 2s EOF timeout)
- Use Shouldly assertions (ShouldContain) instead of engine.AssertLogContains
adamsitnik
approved these changes
Mar 19, 2026
Member
adamsitnik
left a comment
LGTM @YuliiaKovalova ! I left some comments, but they are all just nits.
Co-authored-by: Adam Sitnik <adam.sitnik@gmail.com>
baronfel
reviewed
Mar 23, 2026
baronfel
approved these changes
Mar 23, 2026
Member
baronfel
left a comment
This looks targeted, tested, and protected by a change wave. Nice work!
This was referenced Mar 23, 2026
dfederm
pushed a commit
to dfederm/msbuild
that referenced
this pull request
Apr 9, 2026
…net#13351)

## Context

`ToolTask` can hang indefinitely when the tool it spawns creates grandchild processes that inherit stdout/stderr pipe handles. This is a long-standing issue reported in dotnet#2981 (opened 2018), with a previous fix attempt in dotnet#10297 that was **reverted** (dotnet#10395) because it caused output loss (dotnet#10378).

### Root Cause

On .NET Framework (and .NET Core), the parameterless `Process.WaitForExit()` internally calls `AsyncStreamReader.WaitUtilEOF()`, which blocks until **all** write handles to the stdout/stderr pipes are closed. When a tool like `cl.exe` spawns a grandchild process (e.g., `mspdbsrv.exe`) that inherits the pipe handles, EOF is never reached even though the tool itself has exited - causing an infinite hang.

### Why the previous fix (dotnet#10297) failed

PR dotnet#10297 changed `proc.WaitForExit()` to `proc.WaitForExit(int.MaxValue)`. The `int` overload skips `WaitUtilEOF()`, preventing the hang. **But it also skips waiting for all `DataReceived` callbacks to be delivered.** For fast tools like `command -v ls`, the `AsyncStreamReader` hadn't delivered its output before the drain calls ran, causing `ConsoleOutput` to be empty.

## Changes Made

The fix (behind **ChangeWave 18.6**) uses the `Data==null` EOF sentinel that `AsyncStreamReader` sends via `DataReceived` when each pipe reaches EOF:

### Step 1: `proc.WaitForExit(int.MaxValue)`

Waits for the **process handle only**, not pipe EOF. Since `_toolExited` already fired before `WaitForProcessExit` is called, the process is dead and this returns immediately.

### Step 2: `WaitHandle.WaitAll(eofEvents, 2000)`

Waits for our own `_standardOutputEOF` / `_standardErrorEOF` events, which are set by `ReceiveStandardErrorOrOutputData` when `Data==null` arrives from the `AsyncStreamReader`.

- **Normal case (no grandchild):** Pipe closes immediately after tool exits, EOF arrives within milliseconds, events fire, all data including final partial line is guaranteed delivered. This provides **identical guarantees** to the original `proc.WaitForExit()`.
- **Grandchild case:** Grandchild holds pipe open, EOF never arrives, events time out after 2 seconds, proceed. The tool's line-by-line output was already delivered during the `HandleToolNotifications` loop.

### Why this doesn't lose data (unlike dotnet#10297)

Our `_standardOutputEOF` fires **inside** `AsyncStreamReader.FlushMessageQueue()`, which is called **before** the internal `eofEvent.Set()`. By the time our event fires, the `AsyncStreamReader` has already:

1. Read all remaining bytes from the pipe buffer
2. Decoded them into characters
3. Flushed the final partial line from its `StringBuilder`
4. Delivered every line (including the final one) via `DataReceived` callbacks

This is functionally equivalent to `WaitUtilEOF` - just observed from our callback instead of the internal event.

## Testing

### ToolTaskDoesNotHangWhenGrandchildInheritsPipeHandles

Spawns `cmd.exe /c echo hello & start /b ping -n 120 127.0.0.1 > nul`:

- `cmd.exe` writes "hello" and exits immediately
- `ping` inherits pipe handles and runs for 120 seconds
- **With fix:** Test completes in ~2 seconds, "hello" is captured
- **Without fix (MSBUILDDISABLEFEATURESFROMVERSION=18.6):** Test hangs until 30s timeout

### ToolTaskCapturesAllOutputWithFix

Spawns `cmd.exe /c echo line1 & echo line2 & echo line3` (no grandchild):

- Verifies all three lines are captured - regression test for dotnet#10378

---------

Co-authored-by: Adam Sitnik <adam.sitnik@gmail.com>