Description
I am developing a new tool based on LLVM. I use ExecuteAndWait to execute a program, and an error occurs during execution:
llvm-project/llvm/lib/Support/Program.cpp
Lines 32 to 54 in 11b0cf8
int sys::ExecuteAndWait(StringRef Program, ArrayRef<StringRef> Args,
                        std::optional<ArrayRef<StringRef>> Env,
                        ArrayRef<std::optional<StringRef>> Redirects,
                        unsigned SecondsToWait, unsigned MemoryLimit,
                        std::string *ErrMsg, bool *ExecutionFailed,
                        std::optional<ProcessStatistics> *ProcStat,
                        BitVector *AffinityMask) {
  assert(Redirects.empty() || Redirects.size() == 3);
  ProcessInfo PI;
  if (Execute(PI, Program, Args, Env, Redirects, MemoryLimit, ErrMsg,
              AffinityMask, /*DetachProcess=*/false)) {
    if (ExecutionFailed)
      *ExecutionFailed = false;
    ProcessInfo Result = Wait(
        PI, SecondsToWait == 0 ? std::nullopt : std::optional(SecondsToWait),
        ErrMsg, ProcStat);
    return Result.ReturnCode;
  }
  if (ExecutionFailed)
    *ExecutionFailed = true;
  return -1;
I get the error message Couldn't execute program 'clang++.exe': �������� (0x57). This is clearly an encoding error, as my console is UTF-8 encoded. After reviewing the relevant code, I've located the source of the error.
llvm-project/llvm/lib/Support/Windows/Program.inc
Lines 111 to 128 in 11b0cf8
bool MakeErrMsg(std::string *ErrMsg, const std::string &prefix) {
  if (!ErrMsg)
    return true;
  char *buffer = NULL;
  DWORD LastError = GetLastError();
  DWORD R = FormatMessageA(FORMAT_MESSAGE_ALLOCATE_BUFFER |
                               FORMAT_MESSAGE_FROM_SYSTEM |
                               FORMAT_MESSAGE_MAX_WIDTH_MASK,
                           NULL, LastError, 0, (LPSTR)&buffer, 1, NULL);
  if (R)
    *ErrMsg = prefix + ": " + buffer;
  else
    *ErrMsg = prefix + ": Unknown error";
  *ErrMsg += " (0x" + llvm::utohexstr(LastError) + ")";
  LocalFree(buffer);
  return R != 0;
}
This function uses the system's default ANSI codepage for formatting error messages. For my region (Simplified Chinese), this codepage is GBK, which results in encoding errors when the output is consumed by UTF-8 applications.
Ideally, I would like to retrieve the error message in UTF-8 encoding.
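To illustrate what retrieving the message in UTF-8 could look like, here is a minimal sketch. It assumes the wide-character FormatMessageW is used instead of FormatMessageA and that the UTF-16 result is converted with WideCharToMultiByte(CP_UTF8, ...). The helper name systemErrorMessageUTF8 is hypothetical and not part of LLVM; the non-Windows branch exists only so the snippet compiles everywhere.

```cpp
#include <string>
#include <system_error>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper (not an LLVM API): returns the system message for
// `Code` encoded as UTF-8, or an empty string if no message is available.
std::string systemErrorMessageUTF8(unsigned long Code) {
#ifdef _WIN32
  wchar_t *Wide = nullptr;
  // The W variant yields UTF-16, independent of the ANSI code page.
  DWORD Len = FormatMessageW(FORMAT_MESSAGE_ALLOCATE_BUFFER |
                                 FORMAT_MESSAGE_FROM_SYSTEM |
                                 FORMAT_MESSAGE_IGNORE_INSERTS |
                                 FORMAT_MESSAGE_MAX_WIDTH_MASK,
                             nullptr, Code, 0,
                             reinterpret_cast<LPWSTR>(&Wide), 0, nullptr);
  if (Len == 0)
    return std::string();

  std::string Utf8;
  // First call computes the required size in bytes, second call converts.
  int Size = WideCharToMultiByte(CP_UTF8, 0, Wide, static_cast<int>(Len),
                                 nullptr, 0, nullptr, nullptr);
  if (Size > 0) {
    Utf8.resize(static_cast<size_t>(Size));
    WideCharToMultiByte(CP_UTF8, 0, Wide, static_cast<int>(Len), &Utf8[0],
                        Size, nullptr, nullptr);
  }
  LocalFree(Wide);
  return Utf8;
#else
  // Fallback for other platforms, only to keep the sketch portable.
  return std::system_category().message(static_cast<int>(Code));
#endif
}
```

With something along these lines, MakeErrMsg could append the converted string instead of the raw FormatMessageA buffer, so callers would receive UTF-8 regardless of the system locale.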
Since the prefix of the error message is already in English, should we force the returned system message to always be in English as well, to avoid these encoding issues?
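For reference, a rough sketch of the English-only option, assuming the dwLanguageId argument of FormatMessageA is set to US English via MAKELANGID. If the English message resources are not available, the call fails, so this sketch retries with language 0 (the system default), which may again produce text in the ANSI code page. The name systemErrorMessageEnglish is hypothetical, and the non-Windows branch is only there to keep the snippet portable.

```cpp
#include <string>
#include <system_error>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper (not an LLVM API): prefers the US English system
// message for `Code` and falls back to the default language otherwise.
std::string systemErrorMessageEnglish(unsigned long Code) {
#ifdef _WIN32
  const DWORD Flags = FORMAT_MESSAGE_ALLOCATE_BUFFER |
                      FORMAT_MESSAGE_FROM_SYSTEM |
                      FORMAT_MESSAGE_IGNORE_INSERTS |
                      FORMAT_MESSAGE_MAX_WIDTH_MASK;
  // Try US English first, then 0 (default language lookup order).
  const DWORD Languages[] = {MAKELANGID(LANG_ENGLISH, SUBLANG_ENGLISH_US), 0};
  for (DWORD Lang : Languages) {
    char *Buffer = nullptr;
    DWORD Len = FormatMessageA(Flags, nullptr, Code, Lang,
                               reinterpret_cast<LPSTR>(&Buffer), 0, nullptr);
    if (Len != 0) {
      std::string Msg(Buffer, Len);
      LocalFree(Buffer);
      return Msg;
    }
  }
  return std::string();
#else
  // Fallback for other platforms, only to keep the sketch portable.
  return std::system_category().message(static_cast<int>(Code));
#endif
}
```

English system messages are plain ASCII, so they would display correctly in both GBK and UTF-8 consoles; the trade-off is that users lose the localized text.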
Alternatively, perhaps we could add a new parameter to allow users to request the error message in UTF-8 specifically?
My main motivation is to avoid writing platform-specific code in my own project. As a user of this library, I want to avoid explicitly checking for Windows (e.g., #ifdef _WIN32), including <Windows.h>, and handling the encoding conversion myself.