Fix Process.Kill(entireProcessTree: true) when intermediate child has KillOnParentExit#128598
Fix Process.Kill(entireProcessTree: true) when intermediate child has KillOnParentExit#128598Copilot wants to merge 7 commits into
Conversation
|
Tagging subscribers to this area: @dotnet/area-system-diagnostics-process |
…te process Adds a regression test to KillOnParentExitTests.cs that mimics issue #107992: when an intermediate child process has KillOnParentExit (job object with JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE), calling Kill(entireProcessTree: true) on the root should kill grandchildren not in the job object. The test creates a 3-level process hierarchy: - Process A (root) -> Process B (KillOnParentExit=true) -> Process C (long-running) The test is marked with [ActiveIssue] since the underlying bug is not yet fixed. Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/44753eb7-cbae-41b4-b37c-592baab16686 Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
The test should pass on every OS supported by the test class and should be executed by CI. Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
adamsitnik
left a comment
There was a problem hiding this comment.
@copilot please address my feedback
Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
Converted to |
Fixed in the latest commit. The issue was that the original The fix separates tree collection from killing:
This ensures the entire tree is discovered while all processes are stopped, before any kill signals trigger cascading deaths. All 12 |
adamsitnik
left a comment
There was a problem hiding this comment.
@copilot please address my feedback, push the changes after ensuring all process tests are passing locally and THEN invoke the code-review skill and post your analysis/results as a comment on this PR
…licitly to Win32Exception - Use `[]` collection expression instead of `new List<Process>()` - Use `GetLastErrorInfo()` and pass `RawErrno` explicitly to `Win32Exception` ctor to avoid potential last-error corruption from intermediate allocations Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
Applied both suggestions in commit aaddffb. All process tests pass locally:
🤖 Copilot Code Review — PR #128598Holistic AssessmentMotivation: The fix addresses a real bug where Approach: Separating tree enumeration (SIGSTOP + collect) from killing (SIGKILL all) is the correct fix — it prevents cascading Summary: ✅ LGTM. The Unix fix is sound, the test is well-structured, and both Detailed Findings✅ Correctness — Two-phase stop-then-kill approachThe refactoring correctly addresses the root cause: killing a parent triggers ✅ Error handling — Explicit
|
adamsitnik
left a comment
There was a problem hiding this comment.
⚠️ Windows parity — Follow-up needed (non-blocking)The Windows
KillTreestill has the original bug described in issue #107992 (kills Process A first, then enumerates children of intermediate Process B which may already be dead from job object termination). This is out of scope for this PR but should be tracked separately.
Process.Kill(entireProcessTree: true)fails to kill grandchild processes when an intermediate child is in a job object withJOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE(Windows) or hasPR_SET_PDEATHSIGset (Unix, exposed viaProcessStartInfo.KillOnParentExit). Killing the root immediately terminates the intermediate child (via job object handle close on Windows, or parent death signal on Unix) before the tree traversal can discover its descendants.
This is not true. The test that we have added here is passing on Windows, which means that we can't reproduce bug described in issue #107992. It's because on Windows every process inherits Job objects from the parent process, unless they escape the Job.
@copilot please update the PR description to reflect the fact that the newly added test has discovered a bug on Linux (which also got fixed here) and was not able to reproduce the bug reported in #107992 because the Job got inherited by grand child process. Do not modify the source code.
Updated the PR description to reflect that:
|
There was a problem hiding this comment.
Pull request overview
This PR adjusts Unix process-tree termination in System.Diagnostics.Process to improve correctness when Kill(entireProcessTree: true) is used in the presence of intermediate processes that may terminate immediately on parent death (e.g., KillOnParentExit / parent-death signal scenarios). It also adds a regression test that builds a 3-level hierarchy and asserts the grandchild is terminated.
Changes:
- Refactors Unix
KillTreeto collect (and stop) the full process tree before issuingSIGKILL. - Adds a regression test covering a root → child (optionally
KillOnParentExit) → grandchild hierarchy. - Improves errno propagation in the Unix tree-kill path by using
GetLastErrorInfo().RawErrnowhen recording exceptions.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/libraries/System.Diagnostics.Process/src/System/Diagnostics/Process.Unix.cs | Refactors Unix KillTree into a stop/collect phase followed by a kill phase to avoid missing descendants. |
| src/libraries/System.Diagnostics.Process/tests/KillOnParentExitTests.cs | Adds a regression test to validate killing a tree also terminates a grandchild when the intermediate child may be killed on parent exit. |
adamsitnik
left a comment
There was a problem hiding this comment.
@copilot please address the feedback
…ally, fix test cleanup Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
The newly added test discovered a bug on Linux where
Process.Kill(entireProcessTree: true)fails to kill grandchild processes when an intermediate child hasPR_SET_PDEATHSIGset (exposed viaProcessStartInfo.KillOnParentExit). Killing the root immediately terminates the intermediate child (via parent death signal) before the tree traversal can discover its descendants.The test passes on Windows without any code changes — on Windows every process inherits Job objects from the parent process (unless they explicitly escape the Job), so the grandchild is already part of the same Job and gets terminated along with its parent. The bug described in issue #107992 could not be reproduced.
Changes
Fix: Unix
KillTreeinProcess.Unix.csRefactored the
KillTreemethod to separate tree collection from killing into two phases:StopTree: Recursively SIGSTOPs all processes depth-first and collects them into a flat list. This captures the entire tree while all processes are alive and stopped. Returnstrueif the process was successfully stopped and added to the list,falseif it exited early or SIGSTOP failed (in which case the caller disposes it).finallyblock so it always runs even ifStopTreethrows partway through.Previously, the algorithm killed each process immediately after enumerating its direct children. This caused processes with
PR_SET_PDEATHSIG(KillOnParentExit) to be killed by the kernel before their own children could be discovered, leaving grandchild processes orphaned and running.Test: Regression test in
KillOnParentExitTests.csAdds a
[ConditionalTheory]parameterized with[InlineData(true)]and[InlineData(false)]that creates a 3-level process hierarchy:Kill(entireProcessTree: true)on Process AKillOnParentExitis enabled on the intermediate child