Replace WMI with NtQueryInformationProcess for process command line retrieval#13539
Replace WMI with NtQueryInformationProcess for process command line retrieval#13539JanProvaznik wants to merge 1 commit intodotnet:mainfrom
Conversation
|
cc @JeremyKuhne who has some stuff coming down the pipe for this as well |
fe76728 to
bf09eee
Compare
|
/review |
|
❌ Expert Code Review (command) failed. Please review the logs for details. Review complete. The expert-reviewer sub-agent posted inline comments and a final COMMENT verdict on PR #13539. No additional action needed from the orchestrator. |
There was a problem hiding this comment.
Pull request overview
This PR updates MSBuild’s Windows implementation of ProcessExtensions.TryGetCommandLine to avoid WMI/COM queries (which can hang indefinitely when WMI is unhealthy) by switching to a direct NtQueryInformationProcess call using ProcessCommandLineInformation (info class 60).
Changes:
- Replaced Windows command-line retrieval from WMI (
Win32_Process) toNtQueryInformationProcess(ProcessCommandLineInformation). - Removed the large set of WMI COM interop declarations/constants previously required for the query.
- Added
SafeProcessHandleinterop forOpenProcess.
…etrieval The Windows implementation of TryGetCommandLine used WMI (Win32_Process query via COM) with WBEM_INFINITE timeout, which could hang indefinitely when the WMI service was unresponsive, corrupted, or overloaded by third-party providers. This blocked the main thread during node reuse scanning, freezing the entire build. Replace the WMI approach with NtQueryInformationProcess using ProcessCommandLineInformation (info class 60, available since Windows 8.1). This is a direct kernel query with no service dependency, matching the pattern used by Linux (/proc/pid/cmdline) and macOS (sysctl KERN_PROCARGS2). Benefits: - Eliminates infinite hang risk (no WMI service dependency) - Faster: single syscall vs cross-process RPC - Lower privilege: PROCESS_QUERY_LIMITED_INFORMATION vs WMI impersonation - Removes ~450 lines of COM interface definitions - Consistent with Linux/macOS: direct OS query, no service dependency Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
bf09eee to
5a1c4da
Compare
There was a problem hiding this comment.
Review Summary: WMI → NtQueryInformationProcess migration
Overall: Solid improvement 🟢 — no blocking issues found.
Replacing WMI COM interop with NtQueryInformationProcess is a clear win: eliminates ~400 lines of COM interface boilerplate, removes the problematic process-wide CoInitializeSecurity call, and is dramatically simpler and faster. The memory management is correct (no leak paths), the UNICODE_STRING pointer interpretation is correct, and SafeProcessHandle is properly used.
Findings by Severity
| Severity | Count | Summary |
|---|---|---|
| BLOCKING | 0 | — |
| MAJOR | 0 | — |
| MINOR | 3 | LibraryImport consistency, initial buffer size, returnLength guard |
| NIT | 1 | Documentation of API stability / version requirement |
Technical Correctness Verified
- ✅ UNICODE_STRING interpretation: Buffer pointer is an absolute address pointing into the caller's allocation.
Marshal.PtrToStructure+Marshal.PtrToStringUnireads this correctly. - ✅ Memory management:
Marshal.AllocHGlobal/FreeHGlobalproperly paired viafinallyblock. No leak on resize path (old buffer freed before new allocation, OOM propagates safely). - ✅ Handle management:
SafeProcessHandlewithusingdeclaration ensuresCloseHandleon all paths. - ✅ x86/x64 struct layout:
LayoutKind.Sequentialwithout explicitPackcorrectly handles the 4-byte padding on x64 betweenMaximumLengthandBuffer. - ✅ Error handling: All failure paths return
null, caller hascatch { return false; }, so any unexpected exception is gracefully handled.
Existing Precedent
NativeMethods.cs already P/Invokes NtQueryInformationProcess (with ProcessBasicInformation for parent PID retrieval), so this is not the first use of ntdll.dll APIs in the codebase.
Test Coverage
Existing tests in ProcessExtensions_Tests.cs cover the TryGetCommandLine path with live processes on Windows, verifying both executable name and argument retrieval. These should exercise the new code path on Windows CI.
Note
🔒 Integrity filter blocked 2 items
The following items were blocked because they don't meet the GitHub integrity level.
- #13539
pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved". - #13539
pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | noneGenerated by Expert Code Review (command) for issue #13539 · ● 4.4M
| [DllImport("kernel32.dll", SetLastError = true)] | ||
| private static extern SafeProcessHandle OpenProcess(int dwDesiredAccess, [MarshalAs(UnmanagedType.Bool)] bool bInheritHandle, int dwProcessId); | ||
|
|
||
| [DllImport("ntdll.dll")] |
There was a problem hiding this comment.
MINOR — Style: DllImport → LibraryImport
The BSD class (line 290) uses source-generated LibraryImport, but the new Windows class uses the older runtime-interpreted DllImport. For consistency within this file and to benefit from compile-time source generation (avoids runtime marshaling stubs), consider migrating to LibraryImport here. The class would need to become partial and the methods static partial:
private static partial class Windows
{
[LibraryImport("kernel32.dll", SetLastError = true)]
[return: MarshalAs(UnmanagedType.Bool)]
private static partial SafeProcessHandle OpenProcess(int dwDesiredAccess, [MarshalAs(UnmanagedType.Bool)] bool bInheritHandle, int dwProcessId);
[LibraryImport("ntdll.dll")]
private static partial int NtQueryInformationProcess(
SafeProcessHandle processHandle,
int processInformationClass,
IntPtr processInformation,
int processInformationLength,
out int returnLength);
}Not blocking — DllImport works correctly — but the inconsistency stands out since the file already uses the modern pattern.
JeremyKuhne
left a comment
There was a problem hiding this comment.
I'd love to see this change, but we need to find official documentation to be able to use this. @baronfel if we can't find it, we can ask Windows to document.
|
Note that I have a commit that I'm creating that moves to direct COM usage. That doesn't fix the WMI dependency but does remove the COM interop overhead. (Again, would much rather not use WMI.) |
I tried to replace the WMI with IDebugClient: #13560 |
Summary
Replace the Windows implementation of
TryGetCommandLineinProcessExtensions.csto useNtQueryInformationProcesswithProcessCommandLineInformation(info class 60) instead of WMI COM queries.Fixes #13522
Problem
The previous implementation queried WMI (
Win32_Process) via COM withWBEM_INFINITEtimeout to retrieve process command lines during node reuse scanning. When the WMI service was unresponsive, corrupted, or overloaded (common with third-party endpoint protection/monitoring software), this caused an indefinite hang on the calling thread, freezing the entire build.The hang was observed in a dump with the main thread blocked at
ZwAlpcSendWaitReceivePortwaiting for the WMI RPC response that never came.Fix
Use
NtQueryInformationProcesswithProcessCommandLineInformation— a direct kernel query with no service dependency. This matches the pattern already used by the Linux (/proc/pid/cmdline) and macOS (sysctl KERN_PROCARGS2) implementations in the same file.Benefits
PROCESS_QUERY_LIMITED_INFORMATIONvs WMI impersonationTesting
TryGetCommandLine_RunningProcess_ContainsExpectedExecutable— passesTryGetCommandLine_RunningProcess_ContainsArguments— passesContext
The WMI-based code path is gated behind ChangeWave 18.5 (node mode filtering). Users hitting the hang can work around it by setting
MSBUILDDISABLEFEATURESFROMVERSION=18.5, but this fix eliminates the root cause.