superpmi: Refactor parallelsuperpmi.cpp to eliminate scattered #ifdef blocks and remove PAL usage#126263
superpmi: Refactor parallelsuperpmi.cpp to eliminate scattered #ifdef blocks and remove PAL usage#126263
Conversation
…nto Win32 and POSIX implementations Co-authored-by: jkoritzinsky <1571408+jkoritzinsky@users.noreply.github.com> Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/a7b9077c-a304-4dde-8045-c76ef5b134f6
…elper functions Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/b7cff633-e704-4f70-8b66-f255e08455c0 Co-authored-by: jkoritzinsky <1571408+jkoritzinsky@users.noreply.github.com>
…T_UNIX block Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/a47830ed-8a61-4ae1-810b-924ae375a48b Co-authored-by: jkoritzinsky <1571408+jkoritzinsky@users.noreply.github.com>
|
This PR removes one of the 2 last usages of |
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
|
@copilot please update the description of the PR based on the initial prompt, not just the last bit of work. |
Updated the PR description to cover all three commits: splitting |
|
@copilot please rewrite the PR title and description based on the whole set of changes in this PR. |
Updated the PR description to cover all the changes across the three commits: splitting Note: the PR title can't be updated through the tooling available to me — a suggested replacement is: |
There was a problem hiding this comment.
Pull request overview
Refactors parallelsuperpmi.cpp to separate platform-specific behavior (process creation, Ctrl+C handling, temp paths, worker IO, and waiting) into clean Unix vs Windows helper implementations, removing scattered preprocessor guards from doParallelSuperPMI and avoiding PAL/Win32 APIs on the POSIX path.
Changes:
- Split worker process spawning into standalone Win32 (
CreateProcess) and POSIX (fork/execv) implementations. - Implement POSIX Ctrl+C handling and consolidate remaining platform-dependent logic behind helper functions.
- Replace POSIX worker waiting/output file handling with
waitpidandopen/close(O_CLOEXEC) respectively.
| #ifdef TARGET_UNIX | ||
| #include <fcntl.h> | ||
| #include <signal.h> | ||
| #include <sys/wait.h> | ||
| #include <unistd.h> | ||
| #include <minipal/getexepath.h> | ||
| #endif // TARGET_UNIX |
There was a problem hiding this comment.
On POSIX this file now uses errno, strerror, and fork error paths, but the TARGET_UNIX include block doesn’t include <errno.h>. On some toolchains this will fail to compile (errno undeclared). Add <errno.h> (and keep <string.h> via the PCH) under the #ifdef TARGET_UNIX includes.
| volatile bool closeRequested = false; // global variable to communicate CTRL+C between threads. | ||
|
|
There was a problem hiding this comment.
closeRequested is written from a POSIX signal handler. Using volatile bool is not async-signal-safe and can be undefined behavior; the portable type for signal handlers is volatile sig_atomic_t (or another async-signal-safe mechanism). Consider changing the flag type accordingly and keep the handler’s work limited to a single store.
| return false; | ||
| } | ||
|
|
||
| *hProcess = pi.hProcess; |
There was a problem hiding this comment.
Windows StartProcess leaks the thread handle returned in PROCESS_INFORMATION (pi.hThread) because it’s never closed. This can accumulate when spawning many workers. Close pi.hThread after successful CreateProcess (and ensure pi.hProcess remains returned).
| *hProcess = pi.hProcess; | |
| *hProcess = pi.hProcess; | |
| CloseHandle(pi.hThread); |
| // Count tokens | ||
| int count = 0; | ||
| char* p = commandLine; | ||
| bool inArg = false; | ||
| while (*p) | ||
| { | ||
| if (*p == ' ' || *p == '\t') | ||
| { | ||
| inArg = false; | ||
| } | ||
| else if (!inArg) | ||
| { | ||
| inArg = true; | ||
| count++; | ||
| } | ||
| p++; | ||
| } | ||
|
|
||
| char** argv = new char*[count + 1]; | ||
| *argc = count; | ||
| int idx = 0; | ||
| p = commandLine; | ||
| while (*p) | ||
| { | ||
| while (*p == ' ' || *p == '\t') | ||
| p++; | ||
| if (*p == '\0') | ||
| break; | ||
| char* start = p; | ||
| while (*p && *p != ' ' && *p != '\t') | ||
| p++; | ||
| size_t len = (size_t)(p - start); | ||
| argv[idx] = new char[len + 1]; | ||
| memcpy(argv[idx], start, len); | ||
| argv[idx][len] = '\0'; | ||
| idx++; | ||
| } | ||
| argv[idx] = nullptr; |
There was a problem hiding this comment.
ParseCommandLineToArgv splits only on spaces/tabs and ignores quoting/escaping. Many arguments in this tool (paths like -details <path>, -reproName <path>, etc.) can legitimately contain spaces, which will break execv on POSIX. Consider building an argv vector directly (without round-tripping through a command-line string) or implement proper shell/Windows-style quoting rules consistently with how the command line is constructed.
| // Count tokens | |
| int count = 0; | |
| char* p = commandLine; | |
| bool inArg = false; | |
| while (*p) | |
| { | |
| if (*p == ' ' || *p == '\t') | |
| { | |
| inArg = false; | |
| } | |
| else if (!inArg) | |
| { | |
| inArg = true; | |
| count++; | |
| } | |
| p++; | |
| } | |
| char** argv = new char*[count + 1]; | |
| *argc = count; | |
| int idx = 0; | |
| p = commandLine; | |
| while (*p) | |
| { | |
| while (*p == ' ' || *p == '\t') | |
| p++; | |
| if (*p == '\0') | |
| break; | |
| char* start = p; | |
| while (*p && *p != ' ' && *p != '\t') | |
| p++; | |
| size_t len = (size_t)(p - start); | |
| argv[idx] = new char[len + 1]; | |
| memcpy(argv[idx], start, len); | |
| argv[idx][len] = '\0'; | |
| idx++; | |
| } | |
| argv[idx] = nullptr; | |
| // Implement simple shell-like parsing: | |
| // - Space and tab separate arguments, unless inside single or double quotes. | |
| // - Single and double quotes are not included in the resulting arguments. | |
| // - Backslash escapes the next character, allowing spaces, quotes, and | |
| // backslashes to be included literally. | |
| std::vector<std::string> args; | |
| std::string current; | |
| bool inSingleQuote = false; | |
| bool inDoubleQuote = false; | |
| for (char* p = commandLine; *p != '\0'; ++p) | |
| { | |
| char c = *p; | |
| if (c == '\\') | |
| { | |
| // Escape the next character if present. | |
| ++p; | |
| if (*p == '\0') | |
| { | |
| // Trailing backslash: treat as a literal backslash. | |
| current.push_back('\\'); | |
| break; | |
| } | |
| current.push_back(*p); | |
| continue; | |
| } | |
| if (!inSingleQuote && c == '"') | |
| { | |
| inDoubleQuote = !inDoubleQuote; | |
| continue; | |
| } | |
| if (!inDoubleQuote && c == '\'') | |
| { | |
| inSingleQuote = !inSingleQuote; | |
| continue; | |
| } | |
| if (!inSingleQuote && !inDoubleQuote && (c == ' ' || c == '\t')) | |
| { | |
| if (!current.empty()) | |
| { | |
| args.push_back(current); | |
| current.clear(); | |
| } | |
| continue; | |
| } | |
| current.push_back(c); | |
| } | |
| if (!current.empty()) | |
| { | |
| args.push_back(current); | |
| } | |
| *argc = static_cast<int>(args.size()); | |
| char** argv = new char*[*argc + 1]; | |
| for (int i = 0; i < *argc; i++) | |
| { | |
| const std::string& arg = args[static_cast<size_t>(i)]; | |
| size_t len = arg.size(); | |
| argv[i] = new char[len + 1]; | |
| memcpy(argv[i], arg.c_str(), len); | |
| argv[i][len] = '\0'; | |
| } | |
| argv[*argc] = nullptr; |
| #endif // !TARGET_UNIX | ||
| #else // TARGET_UNIX | ||
| static void PosixCtrlHandler(int signum) | ||
| { |
There was a problem hiding this comment.
PosixCtrlHandler doesn’t use signum, which is likely to trigger -Wunused-parameter warnings (often treated as errors). Either mark it unused (e.g., cast to void) or annotate appropriately.
| { | |
| { | |
| (void)signum; |
| LogDebug("stdout %i=%s", workerIndex, wd.stdOutputPath); | ||
| wd.hStdOutput = open(wd.stdOutputPath, O_WRONLY | O_CREAT | O_TRUNC | O_CLOEXEC, 0666); | ||
| if (wd.hStdOutput == -1) | ||
| { | ||
| LogError("Unable to open '%s'. errno=%d", wd.stdOutputPath, errno); | ||
| return false; | ||
| } | ||
|
|
||
| LogDebug("stderr %i=%s", workerIndex, wd.stdErrorPath); | ||
| wd.hStdError = open(wd.stdErrorPath, O_WRONLY | O_CREAT | O_TRUNC | O_CLOEXEC, 0666); | ||
| if (wd.hStdError == -1) | ||
| { | ||
| LogError("Unable to open '%s'. errno=%d", wd.stdErrorPath, errno); | ||
| return false; | ||
| } |
There was a problem hiding this comment.
POSIX OpenWorkerOutputFiles: if opening stderr fails after stdout succeeded, the already-open stdout fd is leaked. Close wd.hStdOutput before returning false (and consider resetting to -1).
| LogDebug("stdout %i=%s", workerIndex, wd.stdOutputPath); | ||
| wd.hStdOutput = CreateFileA(wd.stdOutputPath, GENERIC_WRITE, FILE_SHARE_READ, &sa, CREATE_ALWAYS, | ||
| FILE_ATTRIBUTE_NORMAL, NULL); | ||
| if (wd.hStdOutput == INVALID_HANDLE_VALUE) | ||
| { | ||
| LogError("Unable to open '%s'. GetLastError()=%u", wd.stdOutputPath, GetLastError()); | ||
| return false; | ||
| } | ||
|
|
||
| // If we ever execute on a machine which has more than MAXIMUM_WAIT_OBJECTS(64) CPU cores | ||
| // we still can't spawn more than the max supported by WaitForMultipleObjects() | ||
| if (o.workerCount > MAXIMUM_WAIT_OBJECTS) | ||
| o.workerCount = MAXIMUM_WAIT_OBJECTS; | ||
| LogDebug("stderr %i=%s", workerIndex, wd.stdErrorPath); | ||
| wd.hStdError = CreateFileA(wd.stdErrorPath, GENERIC_WRITE, FILE_SHARE_READ, &sa, CREATE_ALWAYS, | ||
| FILE_ATTRIBUTE_NORMAL, NULL); | ||
| if (wd.hStdError == INVALID_HANDLE_VALUE) | ||
| { | ||
| LogError("Unable to open '%s'. GetLastError()=%u", wd.stdErrorPath, GetLastError()); | ||
| return false; | ||
| } |
There was a problem hiding this comment.
Windows OpenWorkerOutputFiles: if opening stderr fails after stdout succeeds, wd.hStdOutput is leaked. Close the stdout handle before returning false (and consider resetting handles to INVALID_HANDLE_VALUE).
| int status; | ||
| if (waitpid(handles[i], &status, 0) == -1) | ||
| { | ||
| LogError("waitpid failed for child %d: %s", i, strerror(errno)); | ||
| exitCodes[i] = -1; | ||
| } | ||
| else if (WIFEXITED(status)) | ||
| { | ||
| exitCodes[i] = WEXITSTATUS(status); | ||
| } | ||
| else | ||
| { | ||
| // Terminated by a signal (e.g., OOM killer). | ||
| exitCodes[i] = -1; | ||
| } | ||
| } |
There was a problem hiding this comment.
POSIX WaitForWorkerProcesses treats any waitpid failure as permanent. If waitpid is interrupted (errno==EINTR), this will mark the worker as failed and, worse, may skip reaping the child (leaving it running/zombie). Loop/retry on EINTR, and only treat other errno values as failure.
| // Returns nullptr; exit codes are retrieved lazily via GetWorkerExitCode / GetExitCodeProcess. | ||
| static int* WaitForWorkerProcesses(SpmiProcessHandle* handles, int count) | ||
| { | ||
| WaitForMultipleObjects(count, handles, true, INFINITE); |
There was a problem hiding this comment.
Windows WaitForWorkerProcesses ignores the return value of WaitForMultipleObjects. If it fails (e.g., invalid handle), the code will proceed as though workers completed and then attempt to read exit codes. Check for WAIT_FAILED/WAIT_TIMEOUT and log/return an error state.
| WaitForMultipleObjects(count, handles, true, INFINITE); | |
| DWORD waitResult = WaitForMultipleObjects(count, handles, TRUE, INFINITE); | |
| if (waitResult == WAIT_FAILED) | |
| { | |
| LogError("WaitForMultipleObjects failed. GetLastError()=%u", GetLastError()); | |
| return nullptr; | |
| } | |
| if (waitResult == WAIT_TIMEOUT) | |
| { | |
| LogError("WaitForMultipleObjects timed out unexpectedly."); | |
| return nullptr; | |
| } | |
| if ((waitResult < WAIT_OBJECT_0) || (waitResult >= WAIT_OBJECT_0 + static_cast<DWORD>(count))) | |
| { | |
| LogError("WaitForMultipleObjects returned unexpected result: %u", waitResult); | |
| return nullptr; | |
| } |
| HRESULT hr = E_FAIL; | ||
| SimpleTimer st; | ||
| st.Start(); | ||
| signal(SIGINT, PosixCtrlHandler); |
There was a problem hiding this comment.
POSIX RegisterCtrlHandler doesn’t validate whether signal(SIGINT, ...) succeeded (SIG_ERR). Consider checking the return value and failing the registration if it can’t be installed.
| signal(SIGINT, PosixCtrlHandler); | |
| if (signal(SIGINT, PosixCtrlHandler) == SIG_ERR) | |
| { | |
| LogError("Failed to register SIGINT handler."); | |
| return false; | |
| } |
parallelsuperpmi.cppwas littered with#ifdef TARGET_UNIXguards scattered throughout function bodies, mixing platform logic with control flow and making the code hard to read. Additionally, Ctrl+C signal handling was missing on POSIX (left asTODO-Porting), and the POSIX code path was calling Win32/PAL APIs such asWaitForMultipleObjectsandCreateFileA/CloseHandle. This refactor reorganizes all platform-specific code into clean, separated implementations and removes PAL API usage from the POSIX path.Description
Process spawning
StartProcessinto two complete, standalone implementations — one for Win32 (usingCreateProcess) and one for POSIX (usingfork/execv) — replacing the original single function body that was threaded with interleaved#ifndef TARGET_UNIX/#else/#endifguards.ParseCommandLineToArgvhelper for the POSIX path to tokenize the command-line string into anargvarray forexecv.SpmiProcessHandletypedef (pid_ton POSIX,HANDLEon Windows) to unify the process handle array type used throughout the file.POSIX Ctrl+C handling (previously unimplemented)
PosixCtrlHandlersignal handler usingsignal(SIGINT, ...), implementing the previously skippedTODO-Portingfor graceful Ctrl+C handling on Unix.closeRequestedvolatilefor correct signal-handler access.Elimination of scattered
#ifdef TARGET_UNIXguards indoParallelSuperPMI#ifdef TARGET_UNIXguards fromdoParallelSuperPMIinto named helper functions:RegisterCtrlHandler,GetTempFolderPath,GetDefaultWorkerCount,GetCurrentExePath,OpenWorkerOutputFiles,CloseWorkerOutputFiles,WaitForWorkerProcesses,GetWorkerExitCode,FreeWorkerExitCodes.#ifdef TARGET_UNIXblock: all POSIX implementations appear first, then all Windows implementations — no#ifdefappears inside any function body.doParallelSuperPMIitself now contains zero preprocessor guards.PAL/Win32 API removal from POSIX path
WaitForMultipleObjectswith awaitpidloop in the POSIXWaitForWorkerProcessesimplementation, removing one of the last usages ofWaitForMultipleObjectswith multiple objects on non-Windows.open()/close()withO_CLOEXECinstead ofCreateFileA/CloseHandle.Note
This PR description was generated with AI assistance.
💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.