-
Notifications
You must be signed in to change notification settings - Fork 15.3k
[clang-tidy] Fix OOB access in FormatStringConverter with signed chars
#169215
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@llvm/pr-subscribers-github-workflow @llvm/pr-subscribers-clang-tools-extra Author: mitchell (zeyi2) Changes
The negative values were passed to This closes #169198 Full diff: https://github.com/llvm/llvm-project/pull/169215.diff 1 Files Affected:
diff --git a/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp b/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp
index 23dae04916e9b..a3af9504e6542 100644
--- a/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp
+++ b/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp
@@ -700,6 +700,7 @@ void FormatStringConverter::finalizeFormatText() {
/// Append literal parts of the format text, reinstating escapes as required.
void FormatStringConverter::appendFormatText(const StringRef Text) {
for (const char Ch : Text) {
+ const unsigned char UCh = static_cast<unsigned char>(Ch);
if (Ch == '\a')
StandardFormatString += "\\a";
else if (Ch == '\b')
@@ -724,10 +725,10 @@ void FormatStringConverter::appendFormatText(const StringRef Text) {
} else if (Ch == '}') {
StandardFormatString += "}}";
FormatStringNeededRewriting = true;
- } else if (Ch < 32) {
+ } else if (UCh < 32) {
StandardFormatString += "\\x";
- StandardFormatString += llvm::hexdigit(Ch >> 4, true);
- StandardFormatString += llvm::hexdigit(Ch & 0xf, true);
+ StandardFormatString += llvm::hexdigit(UCh >> 4, true);
+ StandardFormatString += llvm::hexdigit(UCh & 0xf, true);
} else
StandardFormatString += Ch;
}
|
|
@llvm/pr-subscribers-clang-tidy Author: mitchell (zeyi2) Changes
The negative values were passed to This closes #169198 Full diff: https://github.com/llvm/llvm-project/pull/169215.diff 1 Files Affected:
diff --git a/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp b/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp
index 23dae04916e9b..a3af9504e6542 100644
--- a/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp
+++ b/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp
@@ -700,6 +700,7 @@ void FormatStringConverter::finalizeFormatText() {
/// Append literal parts of the format text, reinstating escapes as required.
void FormatStringConverter::appendFormatText(const StringRef Text) {
for (const char Ch : Text) {
+ const unsigned char UCh = static_cast<unsigned char>(Ch);
if (Ch == '\a')
StandardFormatString += "\\a";
else if (Ch == '\b')
@@ -724,10 +725,10 @@ void FormatStringConverter::appendFormatText(const StringRef Text) {
} else if (Ch == '}') {
StandardFormatString += "}}";
FormatStringNeededRewriting = true;
- } else if (Ch < 32) {
+ } else if (UCh < 32) {
StandardFormatString += "\\x";
- StandardFormatString += llvm::hexdigit(Ch >> 4, true);
- StandardFormatString += llvm::hexdigit(Ch & 0xf, true);
+ StandardFormatString += llvm::hexdigit(UCh >> 4, true);
+ StandardFormatString += llvm::hexdigit(UCh & 0xf, true);
} else
StandardFormatString += Ch;
}
|
|
✅ With the latest revision this PR passed the C/C++ code linter. |
|
Please add tests and release note |
|
|
||
| void printf_utf8_text() { | ||
| // Non-ASCII UTF-8 in format string should not crash. | ||
| printf("你好世界\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For clarity: "你好世界" is the Chinese equivalent of "Hello, World".
I skip the testcase on Windows, now the CI passes. Also, there may be another issue with CI: https://github.com/llvm/llvm-project/actions/runs/19613463479?pr=169215 The |
|
|
||
| void printf_utf8_text() { | ||
| // Hex encodes U+4F60 U+597D U+4E16 U+754C (你好世界) in UTF-8 | ||
| printf("\xE4\xBD\xA0\xE5\xA5\xBD\xE4\xB8\x96\xE7\x95\x8C\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be verified using: python3 -c 'print(b"\xE4\xBD\xA0\xE5\xA5\xBD\xE4\xB8\x96\xE7\x95\x8C\n".decode("utf-8"), end="")'
5dd20c6 to
88e6348
Compare
FormatStringConverter::appendFormatTextincorrectly treated non-ASCII characters (e.g. UTF-8) as negative values when using signed chars. This caused them to pass the< 32check for control characters.The negative values were passed to
llvm::hexdigit, resulting in an OOB access and a crash.This closes #169198