-
Notifications
You must be signed in to change notification settings - Fork 15.1k
[libc] Migrate ctype_utils to use char instead of int where applicable. #166225
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Functions like isalpha / tolower can operate on chars internally. This allows us to get rid of unnecessary casts and open a way to creating wchar_t overloads with the same names (e.g. for isalpha), that would simplify templated code for conversion functions (see 315dfe5). Add the int->char converstion to public entrypoints implementation instead. We also need to introduce bounds check on the input argument values - these functions' behavior is unspecified if the argument is neither EOF nor fits in "unsigned char" range, but the tests we've had verified that they always return false for small negative values. To preserve this behavior, cover it explicitly.
|
@llvm/pr-subscribers-libc Author: Alexey Samsonov (vonosmas) ChangesFunctions like isalpha / tolower can operate on chars internally. This allows us to get rid of unnecessary casts and open a way to creating wchar_t overloads with the same names (e.g. for isalpha), that would simplify templated code for conversion functions (see 315dfe5). Add the int->char converstion to public entrypoints implementation instead. We also need to introduce bounds check on the input argument values - these functions' behavior is unspecified if the argument is neither EOF nor fits in "unsigned char" range, but the tests we've had verified that they always return false for small negative values. To preserve this behavior, cover it explicitly. Patch is 38.38 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/166225.diff 37 Files Affected:
diff --git a/libc/src/__support/ctype_utils.h b/libc/src/__support/ctype_utils.h
index be0f25330af9e..61b7a0aeb5b67 100644
--- a/libc/src/__support/ctype_utils.h
+++ b/libc/src/__support/ctype_utils.h
@@ -27,7 +27,7 @@ namespace internal {
// as well as a way to support non-ASCII character encodings.
// Similarly, do not change these functions to use case ranges. e.g.
-// bool islower(int ch) {
+// bool islower(char ch) {
// switch(ch) {
// case 'a'...'z':
// return true;
@@ -37,7 +37,7 @@ namespace internal {
// EBCDIC. Technically we could use some smaller ranges, but that's even harder
// to read.
-LIBC_INLINE static constexpr bool islower(int ch) {
+LIBC_INLINE static constexpr bool islower(char ch) {
switch (ch) {
case 'a':
case 'b':
@@ -71,7 +71,7 @@ LIBC_INLINE static constexpr bool islower(int ch) {
}
}
-LIBC_INLINE static constexpr bool isupper(int ch) {
+LIBC_INLINE static constexpr bool isupper(char ch) {
switch (ch) {
case 'A':
case 'B':
@@ -105,7 +105,7 @@ LIBC_INLINE static constexpr bool isupper(int ch) {
}
}
-LIBC_INLINE static constexpr bool isdigit(int ch) {
+LIBC_INLINE static constexpr bool isdigit(char ch) {
switch (ch) {
case '0':
case '1':
@@ -123,7 +123,7 @@ LIBC_INLINE static constexpr bool isdigit(int ch) {
}
}
-LIBC_INLINE static constexpr int tolower(int ch) {
+LIBC_INLINE static constexpr char tolower(char ch) {
switch (ch) {
case 'A':
return 'a';
@@ -182,7 +182,7 @@ LIBC_INLINE static constexpr int tolower(int ch) {
}
}
-LIBC_INLINE static constexpr int toupper(int ch) {
+LIBC_INLINE static constexpr char toupper(char ch) {
switch (ch) {
case 'a':
return 'A';
@@ -241,7 +241,7 @@ LIBC_INLINE static constexpr int toupper(int ch) {
}
}
-LIBC_INLINE static constexpr bool isalpha(int ch) {
+LIBC_INLINE static constexpr bool isalpha(char ch) {
switch (ch) {
case 'a':
case 'b':
@@ -301,7 +301,7 @@ LIBC_INLINE static constexpr bool isalpha(int ch) {
}
}
-LIBC_INLINE static constexpr bool isalnum(int ch) {
+LIBC_INLINE static constexpr bool isalnum(char ch) {
switch (ch) {
case 'a':
case 'b':
@@ -371,7 +371,7 @@ LIBC_INLINE static constexpr bool isalnum(int ch) {
}
}
-LIBC_INLINE static constexpr int b36_char_to_int(int ch) {
+LIBC_INLINE static constexpr int b36_char_to_int(char ch) {
switch (ch) {
case '0':
return 0;
@@ -476,7 +476,7 @@ LIBC_INLINE static constexpr int b36_char_to_int(int ch) {
}
}
-LIBC_INLINE static constexpr int int_to_b36_char(int num) {
+LIBC_INLINE static constexpr char int_to_b36_char(int num) {
// Can't actually use LIBC_ASSERT here because it depends on integer_to_string
// which depends on this.
@@ -559,7 +559,7 @@ LIBC_INLINE static constexpr int int_to_b36_char(int num) {
}
}
-LIBC_INLINE static constexpr bool isspace(int ch) {
+LIBC_INLINE static constexpr bool isspace(char ch) {
switch (ch) {
case ' ':
case '\t':
@@ -574,7 +574,7 @@ LIBC_INLINE static constexpr bool isspace(int ch) {
}
// not yet encoding independent.
-LIBC_INLINE static constexpr bool isgraph(int ch) {
+LIBC_INLINE static constexpr bool isgraph(char ch) {
return 0x20 < ch && ch < 0x7f;
}
diff --git a/libc/src/__support/integer_to_string.h b/libc/src/__support/integer_to_string.h
index 29449bd739730..5e7369de00962 100644
--- a/libc/src/__support/integer_to_string.h
+++ b/libc/src/__support/integer_to_string.h
@@ -378,9 +378,8 @@ template <typename T, typename Fmt = radix::Dec> class IntegerToString {
using UNSIGNED_T = make_integral_or_big_int_unsigned_t<T>;
LIBC_INLINE static char digit_char(uint8_t digit) {
- const int result = internal::int_to_b36_char(digit);
- return static_cast<char>(Fmt::IS_UPPERCASE ? internal::toupper(result)
- : result);
+ const char result = internal::int_to_b36_char(digit);
+ return Fmt::IS_UPPERCASE ? internal::toupper(result) : result;
}
LIBC_INLINE static void
diff --git a/libc/src/ctype/CMakeLists.txt b/libc/src/ctype/CMakeLists.txt
index 8830c1bccf9ea..68e982bd4529e 100644
--- a/libc/src/ctype/CMakeLists.txt
+++ b/libc/src/ctype/CMakeLists.txt
@@ -6,6 +6,7 @@ add_entrypoint_object(
isalnum.h
DEPENDS
libc.include.ctype
+ libc.src.__support.CPP.limits
libc.src.__support.ctype_utils
)
@@ -16,6 +17,7 @@ add_entrypoint_object(
HDRS
isalpha.h
DEPENDS
+ libc.src.__support.CPP.limits
libc.src.__support.ctype_utils
)
@@ -50,6 +52,7 @@ add_entrypoint_object(
HDRS
isdigit.h
DEPENDS
+ libc.src.__support.CPP.limits
libc.src.__support.ctype_utils
)
@@ -60,6 +63,7 @@ add_entrypoint_object(
HDRS
isgraph.h
DEPENDS
+ libc.src.__support.CPP.limits
libc.src.__support.ctype_utils
)
@@ -70,6 +74,7 @@ add_entrypoint_object(
HDRS
islower.h
DEPENDS
+ libc.src.__support.CPP.limits
libc.src.__support.ctype_utils
)
@@ -88,6 +93,7 @@ add_entrypoint_object(
HDRS
ispunct.h
DEPENDS
+ libc.src.__support.CPP.limits
libc.src.__support.ctype_utils
)
@@ -97,6 +103,9 @@ add_entrypoint_object(
isspace.cpp
HDRS
isspace.h
+ DEPENDS
+ libc.src.__support.CPP.limits
+ libc.src.__support.ctype_utils
)
add_entrypoint_object(
@@ -106,6 +115,7 @@ add_entrypoint_object(
HDRS
isupper.h
DEPENDS
+ libc.src.__support.CPP.limits
libc.src.__support.ctype_utils
)
@@ -116,6 +126,7 @@ add_entrypoint_object(
HDRS
isxdigit.h
DEPENDS
+ libc.src.__support.CPP.limits
libc.src.__support.ctype_utils
)
@@ -126,6 +137,7 @@ add_entrypoint_object(
HDRS
tolower.h
DEPENDS
+ libc.src.__support.CPP.limits
libc.src.__support.ctype_utils
)
@@ -144,6 +156,7 @@ add_entrypoint_object(
HDRS
toupper.h
DEPENDS
+ libc.src.__support.CPP.limits
libc.src.__support.ctype_utils
)
@@ -160,6 +173,7 @@ add_entrypoint_object(
isalnum_l.h
DEPENDS
libc.include.ctype
+ libc.src.__support.CPP.limits
libc.src.__support.ctype_utils
libc.hdr.types.locale_t
)
@@ -171,6 +185,7 @@ add_entrypoint_object(
HDRS
isalpha_l.h
DEPENDS
+ libc.src.__support.CPP.limits
libc.src.__support.ctype_utils
libc.hdr.types.locale_t
)
@@ -202,6 +217,7 @@ add_entrypoint_object(
HDRS
isdigit_l.h
DEPENDS
+ libc.src.__support.CPP.limits
libc.src.__support.ctype_utils
libc.hdr.types.locale_t
)
@@ -224,6 +240,7 @@ add_entrypoint_object(
HDRS
islower_l.h
DEPENDS
+ libc.src.__support.CPP.limits
libc.src.__support.ctype_utils
libc.hdr.types.locale_t
)
@@ -257,6 +274,8 @@ add_entrypoint_object(
isspace_l.h
DEPENDS
libc.hdr.types.locale_t
+ libc.src.__support.CPP.limits
+ libc.src.__support.ctype_utils
)
add_entrypoint_object(
@@ -266,6 +285,7 @@ add_entrypoint_object(
HDRS
isupper_l.h
DEPENDS
+ libc.src.__support.CPP.limits
libc.src.__support.ctype_utils
libc.hdr.types.locale_t
)
@@ -277,6 +297,7 @@ add_entrypoint_object(
HDRS
isxdigit_l.h
DEPENDS
+ libc.src.__support.CPP.limits
libc.src.__support.ctype_utils
libc.hdr.types.locale_t
)
@@ -288,6 +309,7 @@ add_entrypoint_object(
HDRS
tolower_l.h
DEPENDS
+ libc.src.__support.CPP.limits
libc.src.__support.ctype_utils
libc.hdr.types.locale_t
)
@@ -299,6 +321,7 @@ add_entrypoint_object(
HDRS
toupper_l.h
DEPENDS
+ libc.src.__support.CPP.limits
libc.src.__support.ctype_utils
libc.hdr.types.locale_t
)
diff --git a/libc/src/ctype/isalnum.cpp b/libc/src/ctype/isalnum.cpp
index 54a3e35748879..102b5e79e4a18 100644
--- a/libc/src/ctype/isalnum.cpp
+++ b/libc/src/ctype/isalnum.cpp
@@ -7,15 +7,18 @@
//===----------------------------------------------------------------------===//
#include "src/ctype/isalnum.h"
-#include "src/__support/ctype_utils.h"
+#include "src/__support/CPP/limits.h"
#include "src/__support/common.h"
+#include "src/__support/ctype_utils.h"
#include "src/__support/macros/config.h"
namespace LIBC_NAMESPACE_DECL {
LLVM_LIBC_FUNCTION(int, isalnum, (int c)) {
- return static_cast<int>(internal::isalnum(static_cast<unsigned>(c)));
+ if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+ return 0;
+ return static_cast<int>(internal::isalnum(static_cast<char>(c)));
}
} // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/ctype/isalnum_l.cpp b/libc/src/ctype/isalnum_l.cpp
index 671d9b75c4c33..173e1c174121e 100644
--- a/libc/src/ctype/isalnum_l.cpp
+++ b/libc/src/ctype/isalnum_l.cpp
@@ -7,15 +7,18 @@
//===----------------------------------------------------------------------===//
#include "src/ctype/isalnum_l.h"
-#include "src/__support/ctype_utils.h"
+#include "src/__support/CPP/limits.h"
#include "src/__support/common.h"
+#include "src/__support/ctype_utils.h"
#include "src/__support/macros/config.h"
namespace LIBC_NAMESPACE_DECL {
LLVM_LIBC_FUNCTION(int, isalnum_l, (int c, locale_t)) {
- return static_cast<int>(internal::isalnum(static_cast<unsigned>(c)));
+ if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+ return 0;
+ return static_cast<int>(internal::isalnum(static_cast<char>(c)));
}
} // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/ctype/isalpha.cpp b/libc/src/ctype/isalpha.cpp
index 78b26f6a486ea..8b6ecf192e129 100644
--- a/libc/src/ctype/isalpha.cpp
+++ b/libc/src/ctype/isalpha.cpp
@@ -8,6 +8,7 @@
#include "src/ctype/isalpha.h"
+#include "src/__support/CPP/limits.h"
#include "src/__support/common.h"
#include "src/__support/ctype_utils.h"
#include "src/__support/macros/config.h"
@@ -15,7 +16,9 @@
namespace LIBC_NAMESPACE_DECL {
LLVM_LIBC_FUNCTION(int, isalpha, (int c)) {
- return static_cast<int>(internal::isalpha(static_cast<unsigned>(c)));
+ if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+ return 0;
+ return static_cast<int>(internal::isalpha(static_cast<char>(c)));
}
} // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/ctype/isalpha_l.cpp b/libc/src/ctype/isalpha_l.cpp
index 0619d979bedf2..982bcc569faaf 100644
--- a/libc/src/ctype/isalpha_l.cpp
+++ b/libc/src/ctype/isalpha_l.cpp
@@ -8,6 +8,7 @@
#include "src/ctype/isalpha_l.h"
+#include "src/__support/CPP/limits.h"
#include "src/__support/common.h"
#include "src/__support/ctype_utils.h"
#include "src/__support/macros/config.h"
@@ -15,7 +16,9 @@
namespace LIBC_NAMESPACE_DECL {
LLVM_LIBC_FUNCTION(int, isalpha_l, (int c, locale_t)) {
- return static_cast<int>(internal::isalpha(static_cast<unsigned>(c)));
+ if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+ return 0;
+ return static_cast<int>(internal::isalpha(static_cast<char>(c)));
}
} // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/ctype/isdigit.cpp b/libc/src/ctype/isdigit.cpp
index 1f711943861f8..43553c794a2f3 100644
--- a/libc/src/ctype/isdigit.cpp
+++ b/libc/src/ctype/isdigit.cpp
@@ -7,6 +7,8 @@
//===----------------------------------------------------------------------===//
#include "src/ctype/isdigit.h"
+
+#include "src/__support/CPP/limits.h"
#include "src/__support/common.h"
#include "src/__support/ctype_utils.h"
#include "src/__support/macros/config.h"
@@ -14,7 +16,9 @@
namespace LIBC_NAMESPACE_DECL {
LLVM_LIBC_FUNCTION(int, isdigit, (int c)) {
- return static_cast<int>(internal::isdigit(static_cast<unsigned>(c)));
+ if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+ return 0;
+ return static_cast<int>(internal::isdigit(static_cast<char>(c)));
}
} // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/ctype/isdigit_l.cpp b/libc/src/ctype/isdigit_l.cpp
index ca981362bfe83..40b5618906dac 100644
--- a/libc/src/ctype/isdigit_l.cpp
+++ b/libc/src/ctype/isdigit_l.cpp
@@ -7,6 +7,8 @@
//===----------------------------------------------------------------------===//
#include "src/ctype/isdigit_l.h"
+
+#include "src/__support/CPP/limits.h"
#include "src/__support/common.h"
#include "src/__support/ctype_utils.h"
#include "src/__support/macros/config.h"
@@ -14,7 +16,9 @@
namespace LIBC_NAMESPACE_DECL {
LLVM_LIBC_FUNCTION(int, isdigit_l, (int c, locale_t)) {
- return static_cast<int>(internal::isdigit(static_cast<unsigned>(c)));
+ if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+ return 0;
+ return static_cast<int>(internal::isdigit(static_cast<char>(c)));
}
} // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/ctype/isgraph.cpp b/libc/src/ctype/isgraph.cpp
index 74bb2e75d138e..86ac0cb0d2714 100644
--- a/libc/src/ctype/isgraph.cpp
+++ b/libc/src/ctype/isgraph.cpp
@@ -8,6 +8,7 @@
#include "src/ctype/isgraph.h"
+#include "src/__support/CPP/limits.h"
#include "src/__support/common.h"
#include "src/__support/ctype_utils.h"
#include "src/__support/macros/config.h"
@@ -15,7 +16,9 @@
namespace LIBC_NAMESPACE_DECL {
LLVM_LIBC_FUNCTION(int, isgraph, (int c)) {
- return static_cast<int>(internal::isgraph(static_cast<unsigned>(c)));
+ if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+ return 0;
+ return static_cast<int>(internal::isgraph(static_cast<char>(c)));
}
} // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/ctype/isgraph_l.cpp b/libc/src/ctype/isgraph_l.cpp
index cbef6df148aed..dddcb9be4f80c 100644
--- a/libc/src/ctype/isgraph_l.cpp
+++ b/libc/src/ctype/isgraph_l.cpp
@@ -8,6 +8,7 @@
#include "src/ctype/isgraph_l.h"
+#include "src/__support/CPP/limits.h"
#include "src/__support/common.h"
#include "src/__support/ctype_utils.h"
#include "src/__support/macros/config.h"
@@ -15,7 +16,9 @@
namespace LIBC_NAMESPACE_DECL {
LLVM_LIBC_FUNCTION(int, isgraph_l, (int c, locale_t)) {
- return static_cast<int>(internal::isgraph(static_cast<unsigned>(c)));
+ if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+ return 0;
+ return static_cast<int>(internal::isgraph(static_cast<char>(c)));
}
} // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/ctype/islower.cpp b/libc/src/ctype/islower.cpp
index 831aad32d3a22..920bfc1cc1a59 100644
--- a/libc/src/ctype/islower.cpp
+++ b/libc/src/ctype/islower.cpp
@@ -7,15 +7,18 @@
//===----------------------------------------------------------------------===//
#include "src/ctype/islower.h"
-#include "src/__support/ctype_utils.h"
+#include "src/__support/CPP/limits.h"
#include "src/__support/common.h"
+#include "src/__support/ctype_utils.h"
#include "src/__support/macros/config.h"
namespace LIBC_NAMESPACE_DECL {
LLVM_LIBC_FUNCTION(int, islower, (int c)) {
- return static_cast<int>(internal::islower(static_cast<unsigned>(c)));
+ if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+ return 0;
+ return static_cast<int>(internal::islower(static_cast<char>(c)));
}
} // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/ctype/islower_l.cpp b/libc/src/ctype/islower_l.cpp
index b9be6acc81c99..da97026dc59a7 100644
--- a/libc/src/ctype/islower_l.cpp
+++ b/libc/src/ctype/islower_l.cpp
@@ -7,15 +7,18 @@
//===----------------------------------------------------------------------===//
#include "src/ctype/islower_l.h"
-#include "src/__support/ctype_utils.h"
+#include "src/__support/CPP/limits.h"
#include "src/__support/common.h"
+#include "src/__support/ctype_utils.h"
#include "src/__support/macros/config.h"
namespace LIBC_NAMESPACE_DECL {
LLVM_LIBC_FUNCTION(int, islower_l, (int c, locale_t)) {
- return static_cast<int>(internal::islower(static_cast<unsigned>(c)));
+ if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+ return 0;
+ return static_cast<int>(internal::islower(static_cast<char>(c)));
}
} // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/ctype/ispunct.cpp b/libc/src/ctype/ispunct.cpp
index 0635294220b9c..4950036e9b81f 100644
--- a/libc/src/ctype/ispunct.cpp
+++ b/libc/src/ctype/ispunct.cpp
@@ -8,6 +8,7 @@
#include "src/ctype/ispunct.h"
+#include "src/__support/CPP/limits.h"
#include "src/__support/common.h"
#include "src/__support/ctype_utils.h"
#include "src/__support/macros/config.h"
@@ -15,7 +16,9 @@
namespace LIBC_NAMESPACE_DECL {
LLVM_LIBC_FUNCTION(int, ispunct, (int c)) {
- const unsigned ch = static_cast<unsigned>(c);
+ if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+ return 0;
+ const char ch = static_cast<char>(c);
return static_cast<int>(!internal::isalnum(ch) && internal::isgraph(ch));
}
diff --git a/libc/src/ctype/ispunct_l.cpp b/libc/src/ctype/ispunct_l.cpp
index e825fbe2001b0..79cd47b6a214d 100644
--- a/libc/src/ctype/ispunct_l.cpp
+++ b/libc/src/ctype/ispunct_l.cpp
@@ -8,6 +8,7 @@
#include "src/ctype/ispunct_l.h"
+#include "src/__support/CPP/limits.h"
#include "src/__support/common.h"
#include "src/__support/ctype_utils.h"
#include "src/__support/macros/config.h"
@@ -15,7 +16,9 @@
namespace LIBC_NAMESPACE_DECL {
LLVM_LIBC_FUNCTION(int, ispunct_l, (int c, locale_t)) {
- const unsigned ch = static_cast<unsigned>(c);
+ if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+ return 0;
+ const char ch = static_cast<char>(c);
return static_cast<int>(!internal::isalnum(ch) && internal::isgraph(ch));
}
diff --git a/libc/src/ctype/isspace.cpp b/libc/src/ctype/isspace.cpp
index 005bf460fc103..908cc11960739 100644
--- a/libc/src/ctype/isspace.cpp
+++ b/libc/src/ctype/isspace.cpp
@@ -7,15 +7,18 @@
//===----------------------------------------------------------------------===//
#include "src/ctype/isspace.h"
-#include "src/__support/ctype_utils.h"
+#include "src/__support/CPP/limits.h"
#include "src/__support/common.h"
+#include "src/__support/ctype_utils.h"
#include "src/__support/macros/config.h"
namespace LIBC_NAMESPACE_DECL {
LLVM_LIBC_FUNCTION(int, isspace, (int c)) {
- return static_cast<int>(internal::isspace(static_cast<unsigned>(c)));
+ if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+ return 0;
+ return static_cast<int>(internal::isspace(static_cast<char>(c)));
}
} // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/ctype/isspace_l.cpp b/libc/src/ctype/isspace_l.cpp
index 5c46dd6805126..dd287c633b5c4 100644
--- a/libc/src/ctype/isspace_l.cpp
+++ b/libc/src/ctype/isspace_l.cpp
@@ -7,15 +7,18 @@
//===----------------------------------------------------------------------===//
#include "src/ctype/isspace_l.h"
-#include "src/__support/ctype_utils.h"
+#include "src/__support/CPP/limits.h"
#include "src/__support/common.h"
+#include "src/__support/ctype_utils.h"
#include "src/__support/macros/config.h"
namespace LIBC_NAMESPACE_DECL {
LLVM_LIBC_FUNCTION(int, isspace_l, (int c, locale_t)) {
- return static_cast<int>(internal::isspace(static_cast<unsigned>(c)));
+ if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+ return 0;
+ return static_cast<int>(internal::isspace(static_cast<char>(c)));
}
} // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/ctype/isupper.cpp b/libc/src/ctype/isupper.cpp
index 965fa336b28b4..c5c3dbd5d7d4a 100644
--- a/libc/src/ctype/isupper.cpp
+++ b/libc/src/ctype/isupper.cpp
@@ -7,15 +7,18 @@
//===----------------------------------------------------------------------===//
#include "src/ctype/isupper.h"
-#include "src/__support/ctype_utils.h"
+#include "src/__support/CPP/limits.h"
#include "src/__support/common.h"
+#include "src/__support/ctype_utils.h"
#include "src/__support/macros/config.h"
namespace LIBC_NAMESPACE_DECL {
LLVM_LIBC_FUNCTION(int, isupper, (int c)) {
- return static_cast<int>(internal::isupper(static_cast<unsigned>(c)));
+ if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+ return 0;
+ return static_cast<int>(internal::isupper(static_cast<char>(c)));
}
} // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/ctype/isupper_l.cpp b/libc/src/ctype/isupper_l.cpp
index 358990261d603..44ed9dab90a16 100644
--- a/libc/src/ctype/isupper_l.cpp
+++ b/libc/src/ctype/isupper_l.cpp
@@ -7,15 +7,18 @@
//===----------------------------------------------------------------------===//
#include "src/ctype/isupper_l.h"
-#include "src/__support/ctype_utils.h"
+#include "src/__support/CPP/limits.h"
#include "src/__support/common.h"
+#include "src/__support/ctype_utils.h"
#include "src/__support/macros/config.h"
namespace LIBC_NAMESPACE_DECL {
LLVM_LIBC_FUNCTION(int, isupper_l, (int c, locale_t)) {...
[truncated]
|
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
This is a counterpart of llvm#166225 but for wctype_utils (which are not yet widely used). For now, I'm just changing the types from wint_t to wchar_t to match the regular ctype_utils change. The next change may rename most of the functions to match the name of ctype_utils variants, so that we could be calling them from the templated code operating on "const char*" and "const wchar_t*" strings, and the right function signature would be picked up.
michaelrj-google
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for the cleanup
|
Thank you for the review! I've checked that the code is not is seemingly not pessimized by this change too much: https://godbolt.org/z/he31Mvfch (lookup table is still there). |
This is a counterpart of #166225 but for wctype_utils (which are not yet widely used). For now, I'm just changing the types from wint_t to wchar_t to match the regular ctype_utils change. The next change may rename most of the functions to match the name of ctype_utils variants, so that we could be calling them from the templated code operating on "const char*" and "const wchar_t*" strings, and the right function signature would be picked up.
…e. (#166234) This is a counterpart of llvm/llvm-project#166225 but for wctype_utils (which are not yet widely used). For now, I'm just changing the types from wint_t to wchar_t to match the regular ctype_utils change. The next change may rename most of the functions to match the name of ctype_utils variants, so that we could be calling them from the templated code operating on "const char*" and "const wchar_t*" strings, and the right function signature would be picked up.
Functions like isalpha / tolower can operate on chars internally. This allows us to get rid of unnecessary casts and open a way to creating wchar_t overloads with the same names (e.g. for isalpha), that would simplify templated code for conversion functions (see 315dfe5).
Add the int->char converstion to public entrypoints implementation instead. We also need to introduce bounds check on the input argument values - these functions' behavior is unspecified if the argument is neither EOF nor fits in "unsigned char" range, but the tests we've had verified that they always return false for small negative values. To preserve this behavior, cover it explicitly.