Skip to content

Conversation

@vonosmas
Copy link
Contributor

@vonosmas vonosmas commented Nov 3, 2025

Functions like isalpha / tolower can operate on chars internally. This allows us to get rid of unnecessary casts and open a way to creating wchar_t overloads with the same names (e.g. for isalpha), that would simplify templated code for conversion functions (see 315dfe5).

Add the int->char converstion to public entrypoints implementation instead. We also need to introduce bounds check on the input argument values - these functions' behavior is unspecified if the argument is neither EOF nor fits in "unsigned char" range, but the tests we've had verified that they always return false for small negative values. To preserve this behavior, cover it explicitly.

Functions like isalpha / tolower can operate on chars internally.
This allows us to get rid of unnecessary casts and open a way to
creating wchar_t overloads with the same names (e.g. for isalpha),
that would simplify templated code for conversion functions
(see 315dfe5).

Add the int->char converstion to public entrypoints implementation
instead. We also need to introduce bounds check on the input
argument values - these functions' behavior is unspecified if the
argument is neither EOF nor fits in "unsigned char" range, but the tests
we've had verified that they always return false for small negative values.
To preserve this behavior, cover it explicitly.
@llvmbot llvmbot added libc bazel "Peripheral" support tier build system: utils/bazel labels Nov 3, 2025
@llvmbot
Copy link
Member

llvmbot commented Nov 3, 2025

@llvm/pr-subscribers-libc

Author: Alexey Samsonov (vonosmas)

Changes

Functions like isalpha / tolower can operate on chars internally. This allows us to get rid of unnecessary casts and open a way to creating wchar_t overloads with the same names (e.g. for isalpha), that would simplify templated code for conversion functions (see 315dfe5).

Add the int->char converstion to public entrypoints implementation instead. We also need to introduce bounds check on the input argument values - these functions' behavior is unspecified if the argument is neither EOF nor fits in "unsigned char" range, but the tests we've had verified that they always return false for small negative values. To preserve this behavior, cover it explicitly.


Patch is 38.38 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/166225.diff

37 Files Affected:

  • (modified) libc/src/__support/ctype_utils.h (+12-12)
  • (modified) libc/src/__support/integer_to_string.h (+2-3)
  • (modified) libc/src/ctype/CMakeLists.txt (+23)
  • (modified) libc/src/ctype/isalnum.cpp (+5-2)
  • (modified) libc/src/ctype/isalnum_l.cpp (+5-2)
  • (modified) libc/src/ctype/isalpha.cpp (+4-1)
  • (modified) libc/src/ctype/isalpha_l.cpp (+4-1)
  • (modified) libc/src/ctype/isdigit.cpp (+5-1)
  • (modified) libc/src/ctype/isdigit_l.cpp (+5-1)
  • (modified) libc/src/ctype/isgraph.cpp (+4-1)
  • (modified) libc/src/ctype/isgraph_l.cpp (+4-1)
  • (modified) libc/src/ctype/islower.cpp (+5-2)
  • (modified) libc/src/ctype/islower_l.cpp (+5-2)
  • (modified) libc/src/ctype/ispunct.cpp (+4-1)
  • (modified) libc/src/ctype/ispunct_l.cpp (+4-1)
  • (modified) libc/src/ctype/isspace.cpp (+5-2)
  • (modified) libc/src/ctype/isspace_l.cpp (+5-2)
  • (modified) libc/src/ctype/isupper.cpp (+5-2)
  • (modified) libc/src/ctype/isupper_l.cpp (+5-2)
  • (modified) libc/src/ctype/isxdigit.cpp (+5-2)
  • (modified) libc/src/ctype/isxdigit_l.cpp (+5-2)
  • (modified) libc/src/ctype/tolower.cpp (+9-2)
  • (modified) libc/src/ctype/tolower_l.cpp (+7-2)
  • (modified) libc/src/ctype/toupper.cpp (+9-2)
  • (modified) libc/src/ctype/toupper_l.cpp (+7-2)
  • (modified) libc/src/stdio/printf_core/float_dec_converter_limited.h (+2-2)
  • (modified) libc/src/stdio/printf_core/float_hex_converter.h (+3-4)
  • (modified) libc/src/stdlib/l64a.cpp (+2-4)
  • (modified) libc/src/string/strcasestr.cpp (+2-2)
  • (modified) libc/src/strings/strcasecmp.cpp (+2-2)
  • (modified) libc/src/strings/strcasecmp_l.cpp (+2-2)
  • (modified) libc/src/strings/strncasecmp.cpp (+2-2)
  • (modified) libc/src/strings/strncasecmp_l.cpp (+2-2)
  • (modified) libc/test/UnitTest/MemoryMatcher.cpp (+2-2)
  • (modified) libc/test/src/ctype/islower_test.cpp (+1-1)
  • (modified) libc/test/src/stdlib/StrtolTest.h (+12-12)
  • (modified) utils/bazel/llvm-project-overlay/libc/BUILD.bazel (+11)
diff --git a/libc/src/__support/ctype_utils.h b/libc/src/__support/ctype_utils.h
index be0f25330af9e..61b7a0aeb5b67 100644
--- a/libc/src/__support/ctype_utils.h
+++ b/libc/src/__support/ctype_utils.h
@@ -27,7 +27,7 @@ namespace internal {
 // as well as a way to support non-ASCII character encodings.
 
 // Similarly, do not change these functions to use case ranges. e.g.
-//  bool islower(int ch) {
+//  bool islower(char ch) {
 //    switch(ch) {
 //    case 'a'...'z':
 //      return true;
@@ -37,7 +37,7 @@ namespace internal {
 // EBCDIC. Technically we could use some smaller ranges, but that's even harder
 // to read.
 
-LIBC_INLINE static constexpr bool islower(int ch) {
+LIBC_INLINE static constexpr bool islower(char ch) {
   switch (ch) {
   case 'a':
   case 'b':
@@ -71,7 +71,7 @@ LIBC_INLINE static constexpr bool islower(int ch) {
   }
 }
 
-LIBC_INLINE static constexpr bool isupper(int ch) {
+LIBC_INLINE static constexpr bool isupper(char ch) {
   switch (ch) {
   case 'A':
   case 'B':
@@ -105,7 +105,7 @@ LIBC_INLINE static constexpr bool isupper(int ch) {
   }
 }
 
-LIBC_INLINE static constexpr bool isdigit(int ch) {
+LIBC_INLINE static constexpr bool isdigit(char ch) {
   switch (ch) {
   case '0':
   case '1':
@@ -123,7 +123,7 @@ LIBC_INLINE static constexpr bool isdigit(int ch) {
   }
 }
 
-LIBC_INLINE static constexpr int tolower(int ch) {
+LIBC_INLINE static constexpr char tolower(char ch) {
   switch (ch) {
   case 'A':
     return 'a';
@@ -182,7 +182,7 @@ LIBC_INLINE static constexpr int tolower(int ch) {
   }
 }
 
-LIBC_INLINE static constexpr int toupper(int ch) {
+LIBC_INLINE static constexpr char toupper(char ch) {
   switch (ch) {
   case 'a':
     return 'A';
@@ -241,7 +241,7 @@ LIBC_INLINE static constexpr int toupper(int ch) {
   }
 }
 
-LIBC_INLINE static constexpr bool isalpha(int ch) {
+LIBC_INLINE static constexpr bool isalpha(char ch) {
   switch (ch) {
   case 'a':
   case 'b':
@@ -301,7 +301,7 @@ LIBC_INLINE static constexpr bool isalpha(int ch) {
   }
 }
 
-LIBC_INLINE static constexpr bool isalnum(int ch) {
+LIBC_INLINE static constexpr bool isalnum(char ch) {
   switch (ch) {
   case 'a':
   case 'b':
@@ -371,7 +371,7 @@ LIBC_INLINE static constexpr bool isalnum(int ch) {
   }
 }
 
-LIBC_INLINE static constexpr int b36_char_to_int(int ch) {
+LIBC_INLINE static constexpr int b36_char_to_int(char ch) {
   switch (ch) {
   case '0':
     return 0;
@@ -476,7 +476,7 @@ LIBC_INLINE static constexpr int b36_char_to_int(int ch) {
   }
 }
 
-LIBC_INLINE static constexpr int int_to_b36_char(int num) {
+LIBC_INLINE static constexpr char int_to_b36_char(int num) {
   // Can't actually use LIBC_ASSERT here because it depends on integer_to_string
   // which depends on this.
 
@@ -559,7 +559,7 @@ LIBC_INLINE static constexpr int int_to_b36_char(int num) {
   }
 }
 
-LIBC_INLINE static constexpr bool isspace(int ch) {
+LIBC_INLINE static constexpr bool isspace(char ch) {
   switch (ch) {
   case ' ':
   case '\t':
@@ -574,7 +574,7 @@ LIBC_INLINE static constexpr bool isspace(int ch) {
 }
 
 // not yet encoding independent.
-LIBC_INLINE static constexpr bool isgraph(int ch) {
+LIBC_INLINE static constexpr bool isgraph(char ch) {
   return 0x20 < ch && ch < 0x7f;
 }
 
diff --git a/libc/src/__support/integer_to_string.h b/libc/src/__support/integer_to_string.h
index 29449bd739730..5e7369de00962 100644
--- a/libc/src/__support/integer_to_string.h
+++ b/libc/src/__support/integer_to_string.h
@@ -378,9 +378,8 @@ template <typename T, typename Fmt = radix::Dec> class IntegerToString {
     using UNSIGNED_T = make_integral_or_big_int_unsigned_t<T>;
 
     LIBC_INLINE static char digit_char(uint8_t digit) {
-      const int result = internal::int_to_b36_char(digit);
-      return static_cast<char>(Fmt::IS_UPPERCASE ? internal::toupper(result)
-                                                 : result);
+      const char result = internal::int_to_b36_char(digit);
+      return Fmt::IS_UPPERCASE ? internal::toupper(result) : result;
     }
 
     LIBC_INLINE static void
diff --git a/libc/src/ctype/CMakeLists.txt b/libc/src/ctype/CMakeLists.txt
index 8830c1bccf9ea..68e982bd4529e 100644
--- a/libc/src/ctype/CMakeLists.txt
+++ b/libc/src/ctype/CMakeLists.txt
@@ -6,6 +6,7 @@ add_entrypoint_object(
     isalnum.h
   DEPENDS
     libc.include.ctype
+    libc.src.__support.CPP.limits
     libc.src.__support.ctype_utils
 )
 
@@ -16,6 +17,7 @@ add_entrypoint_object(
   HDRS
     isalpha.h
   DEPENDS
+    libc.src.__support.CPP.limits
     libc.src.__support.ctype_utils
 )
 
@@ -50,6 +52,7 @@ add_entrypoint_object(
   HDRS
     isdigit.h
   DEPENDS
+    libc.src.__support.CPP.limits
     libc.src.__support.ctype_utils
 )
 
@@ -60,6 +63,7 @@ add_entrypoint_object(
   HDRS
     isgraph.h
   DEPENDS
+    libc.src.__support.CPP.limits
     libc.src.__support.ctype_utils
 )
 
@@ -70,6 +74,7 @@ add_entrypoint_object(
   HDRS
     islower.h
   DEPENDS
+    libc.src.__support.CPP.limits
     libc.src.__support.ctype_utils
 )
 
@@ -88,6 +93,7 @@ add_entrypoint_object(
   HDRS
     ispunct.h
   DEPENDS
+    libc.src.__support.CPP.limits
     libc.src.__support.ctype_utils
 )
 
@@ -97,6 +103,9 @@ add_entrypoint_object(
     isspace.cpp
   HDRS
     isspace.h
+  DEPENDS
+    libc.src.__support.CPP.limits
+    libc.src.__support.ctype_utils
 )
 
 add_entrypoint_object(
@@ -106,6 +115,7 @@ add_entrypoint_object(
   HDRS
     isupper.h
   DEPENDS
+    libc.src.__support.CPP.limits  
     libc.src.__support.ctype_utils
 )
 
@@ -116,6 +126,7 @@ add_entrypoint_object(
   HDRS
     isxdigit.h
   DEPENDS
+    libc.src.__support.CPP.limits
     libc.src.__support.ctype_utils
 )
 
@@ -126,6 +137,7 @@ add_entrypoint_object(
   HDRS
     tolower.h
   DEPENDS
+    libc.src.__support.CPP.limits
     libc.src.__support.ctype_utils
 )
 
@@ -144,6 +156,7 @@ add_entrypoint_object(
   HDRS
     toupper.h
   DEPENDS
+    libc.src.__support.CPP.limits
     libc.src.__support.ctype_utils
 )
 
@@ -160,6 +173,7 @@ add_entrypoint_object(
     isalnum_l.h
   DEPENDS
     libc.include.ctype
+    libc.src.__support.CPP.limits
     libc.src.__support.ctype_utils
     libc.hdr.types.locale_t
 )
@@ -171,6 +185,7 @@ add_entrypoint_object(
   HDRS
     isalpha_l.h
   DEPENDS
+    libc.src.__support.CPP.limits
     libc.src.__support.ctype_utils
     libc.hdr.types.locale_t
 )
@@ -202,6 +217,7 @@ add_entrypoint_object(
   HDRS
     isdigit_l.h
   DEPENDS
+    libc.src.__support.CPP.limits
     libc.src.__support.ctype_utils
     libc.hdr.types.locale_t
 )
@@ -224,6 +240,7 @@ add_entrypoint_object(
   HDRS
     islower_l.h
   DEPENDS
+    libc.src.__support.CPP.limits
     libc.src.__support.ctype_utils
     libc.hdr.types.locale_t
 )
@@ -257,6 +274,8 @@ add_entrypoint_object(
     isspace_l.h
   DEPENDS
     libc.hdr.types.locale_t
+    libc.src.__support.CPP.limits
+    libc.src.__support.ctype_utils
 )
 
 add_entrypoint_object(
@@ -266,6 +285,7 @@ add_entrypoint_object(
   HDRS
     isupper_l.h
   DEPENDS
+    libc.src.__support.CPP.limits  
     libc.src.__support.ctype_utils
     libc.hdr.types.locale_t
 )
@@ -277,6 +297,7 @@ add_entrypoint_object(
   HDRS
     isxdigit_l.h
   DEPENDS
+    libc.src.__support.CPP.limits
     libc.src.__support.ctype_utils
     libc.hdr.types.locale_t
 )
@@ -288,6 +309,7 @@ add_entrypoint_object(
   HDRS
     tolower_l.h
   DEPENDS
+    libc.src.__support.CPP.limits
     libc.src.__support.ctype_utils
     libc.hdr.types.locale_t
 )
@@ -299,6 +321,7 @@ add_entrypoint_object(
   HDRS
     toupper_l.h
   DEPENDS
+    libc.src.__support.CPP.limits
     libc.src.__support.ctype_utils
     libc.hdr.types.locale_t
 )
diff --git a/libc/src/ctype/isalnum.cpp b/libc/src/ctype/isalnum.cpp
index 54a3e35748879..102b5e79e4a18 100644
--- a/libc/src/ctype/isalnum.cpp
+++ b/libc/src/ctype/isalnum.cpp
@@ -7,15 +7,18 @@
 //===----------------------------------------------------------------------===//
 
 #include "src/ctype/isalnum.h"
-#include "src/__support/ctype_utils.h"
 
+#include "src/__support/CPP/limits.h"
 #include "src/__support/common.h"
+#include "src/__support/ctype_utils.h"
 #include "src/__support/macros/config.h"
 
 namespace LIBC_NAMESPACE_DECL {
 
 LLVM_LIBC_FUNCTION(int, isalnum, (int c)) {
-  return static_cast<int>(internal::isalnum(static_cast<unsigned>(c)));
+  if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+    return 0;
+  return static_cast<int>(internal::isalnum(static_cast<char>(c)));
 }
 
 } // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/ctype/isalnum_l.cpp b/libc/src/ctype/isalnum_l.cpp
index 671d9b75c4c33..173e1c174121e 100644
--- a/libc/src/ctype/isalnum_l.cpp
+++ b/libc/src/ctype/isalnum_l.cpp
@@ -7,15 +7,18 @@
 //===----------------------------------------------------------------------===//
 
 #include "src/ctype/isalnum_l.h"
-#include "src/__support/ctype_utils.h"
 
+#include "src/__support/CPP/limits.h"
 #include "src/__support/common.h"
+#include "src/__support/ctype_utils.h"
 #include "src/__support/macros/config.h"
 
 namespace LIBC_NAMESPACE_DECL {
 
 LLVM_LIBC_FUNCTION(int, isalnum_l, (int c, locale_t)) {
-  return static_cast<int>(internal::isalnum(static_cast<unsigned>(c)));
+  if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+    return 0;
+  return static_cast<int>(internal::isalnum(static_cast<char>(c)));
 }
 
 } // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/ctype/isalpha.cpp b/libc/src/ctype/isalpha.cpp
index 78b26f6a486ea..8b6ecf192e129 100644
--- a/libc/src/ctype/isalpha.cpp
+++ b/libc/src/ctype/isalpha.cpp
@@ -8,6 +8,7 @@
 
 #include "src/ctype/isalpha.h"
 
+#include "src/__support/CPP/limits.h"
 #include "src/__support/common.h"
 #include "src/__support/ctype_utils.h"
 #include "src/__support/macros/config.h"
@@ -15,7 +16,9 @@
 namespace LIBC_NAMESPACE_DECL {
 
 LLVM_LIBC_FUNCTION(int, isalpha, (int c)) {
-  return static_cast<int>(internal::isalpha(static_cast<unsigned>(c)));
+  if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+    return 0;  
+  return static_cast<int>(internal::isalpha(static_cast<char>(c)));
 }
 
 } // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/ctype/isalpha_l.cpp b/libc/src/ctype/isalpha_l.cpp
index 0619d979bedf2..982bcc569faaf 100644
--- a/libc/src/ctype/isalpha_l.cpp
+++ b/libc/src/ctype/isalpha_l.cpp
@@ -8,6 +8,7 @@
 
 #include "src/ctype/isalpha_l.h"
 
+#include "src/__support/CPP/limits.h"
 #include "src/__support/common.h"
 #include "src/__support/ctype_utils.h"
 #include "src/__support/macros/config.h"
@@ -15,7 +16,9 @@
 namespace LIBC_NAMESPACE_DECL {
 
 LLVM_LIBC_FUNCTION(int, isalpha_l, (int c, locale_t)) {
-  return static_cast<int>(internal::isalpha(static_cast<unsigned>(c)));
+  if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+    return 0;
+  return static_cast<int>(internal::isalpha(static_cast<char>(c)));
 }
 
 } // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/ctype/isdigit.cpp b/libc/src/ctype/isdigit.cpp
index 1f711943861f8..43553c794a2f3 100644
--- a/libc/src/ctype/isdigit.cpp
+++ b/libc/src/ctype/isdigit.cpp
@@ -7,6 +7,8 @@
 //===----------------------------------------------------------------------===//
 
 #include "src/ctype/isdigit.h"
+
+#include "src/__support/CPP/limits.h"
 #include "src/__support/common.h"
 #include "src/__support/ctype_utils.h"
 #include "src/__support/macros/config.h"
@@ -14,7 +16,9 @@
 namespace LIBC_NAMESPACE_DECL {
 
 LLVM_LIBC_FUNCTION(int, isdigit, (int c)) {
-  return static_cast<int>(internal::isdigit(static_cast<unsigned>(c)));
+  if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+    return 0;
+  return static_cast<int>(internal::isdigit(static_cast<char>(c)));
 }
 
 } // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/ctype/isdigit_l.cpp b/libc/src/ctype/isdigit_l.cpp
index ca981362bfe83..40b5618906dac 100644
--- a/libc/src/ctype/isdigit_l.cpp
+++ b/libc/src/ctype/isdigit_l.cpp
@@ -7,6 +7,8 @@
 //===----------------------------------------------------------------------===//
 
 #include "src/ctype/isdigit_l.h"
+
+#include "src/__support/CPP/limits.h"
 #include "src/__support/common.h"
 #include "src/__support/ctype_utils.h"
 #include "src/__support/macros/config.h"
@@ -14,7 +16,9 @@
 namespace LIBC_NAMESPACE_DECL {
 
 LLVM_LIBC_FUNCTION(int, isdigit_l, (int c, locale_t)) {
-  return static_cast<int>(internal::isdigit(static_cast<unsigned>(c)));
+  if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+    return 0;
+  return static_cast<int>(internal::isdigit(static_cast<char>(c)));
 }
 
 } // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/ctype/isgraph.cpp b/libc/src/ctype/isgraph.cpp
index 74bb2e75d138e..86ac0cb0d2714 100644
--- a/libc/src/ctype/isgraph.cpp
+++ b/libc/src/ctype/isgraph.cpp
@@ -8,6 +8,7 @@
 
 #include "src/ctype/isgraph.h"
 
+#include "src/__support/CPP/limits.h"
 #include "src/__support/common.h"
 #include "src/__support/ctype_utils.h"
 #include "src/__support/macros/config.h"
@@ -15,7 +16,9 @@
 namespace LIBC_NAMESPACE_DECL {
 
 LLVM_LIBC_FUNCTION(int, isgraph, (int c)) {
-  return static_cast<int>(internal::isgraph(static_cast<unsigned>(c)));
+  if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+    return 0;  
+  return static_cast<int>(internal::isgraph(static_cast<char>(c)));
 }
 
 } // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/ctype/isgraph_l.cpp b/libc/src/ctype/isgraph_l.cpp
index cbef6df148aed..dddcb9be4f80c 100644
--- a/libc/src/ctype/isgraph_l.cpp
+++ b/libc/src/ctype/isgraph_l.cpp
@@ -8,6 +8,7 @@
 
 #include "src/ctype/isgraph_l.h"
 
+#include "src/__support/CPP/limits.h"
 #include "src/__support/common.h"
 #include "src/__support/ctype_utils.h"
 #include "src/__support/macros/config.h"
@@ -15,7 +16,9 @@
 namespace LIBC_NAMESPACE_DECL {
 
 LLVM_LIBC_FUNCTION(int, isgraph_l, (int c, locale_t)) {
-  return static_cast<int>(internal::isgraph(static_cast<unsigned>(c)));
+  if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+    return 0;
+  return static_cast<int>(internal::isgraph(static_cast<char>(c)));
 }
 
 } // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/ctype/islower.cpp b/libc/src/ctype/islower.cpp
index 831aad32d3a22..920bfc1cc1a59 100644
--- a/libc/src/ctype/islower.cpp
+++ b/libc/src/ctype/islower.cpp
@@ -7,15 +7,18 @@
 //===----------------------------------------------------------------------===//
 
 #include "src/ctype/islower.h"
-#include "src/__support/ctype_utils.h"
 
+#include "src/__support/CPP/limits.h"
 #include "src/__support/common.h"
+#include "src/__support/ctype_utils.h"
 #include "src/__support/macros/config.h"
 
 namespace LIBC_NAMESPACE_DECL {
 
 LLVM_LIBC_FUNCTION(int, islower, (int c)) {
-  return static_cast<int>(internal::islower(static_cast<unsigned>(c)));
+  if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+    return 0;
+  return static_cast<int>(internal::islower(static_cast<char>(c)));
 }
 
 } // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/ctype/islower_l.cpp b/libc/src/ctype/islower_l.cpp
index b9be6acc81c99..da97026dc59a7 100644
--- a/libc/src/ctype/islower_l.cpp
+++ b/libc/src/ctype/islower_l.cpp
@@ -7,15 +7,18 @@
 //===----------------------------------------------------------------------===//
 
 #include "src/ctype/islower_l.h"
-#include "src/__support/ctype_utils.h"
 
+#include "src/__support/CPP/limits.h"
 #include "src/__support/common.h"
+#include "src/__support/ctype_utils.h"
 #include "src/__support/macros/config.h"
 
 namespace LIBC_NAMESPACE_DECL {
 
 LLVM_LIBC_FUNCTION(int, islower_l, (int c, locale_t)) {
-  return static_cast<int>(internal::islower(static_cast<unsigned>(c)));
+  if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+    return 0;
+  return static_cast<int>(internal::islower(static_cast<char>(c)));
 }
 
 } // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/ctype/ispunct.cpp b/libc/src/ctype/ispunct.cpp
index 0635294220b9c..4950036e9b81f 100644
--- a/libc/src/ctype/ispunct.cpp
+++ b/libc/src/ctype/ispunct.cpp
@@ -8,6 +8,7 @@
 
 #include "src/ctype/ispunct.h"
 
+#include "src/__support/CPP/limits.h"
 #include "src/__support/common.h"
 #include "src/__support/ctype_utils.h"
 #include "src/__support/macros/config.h"
@@ -15,7 +16,9 @@
 namespace LIBC_NAMESPACE_DECL {
 
 LLVM_LIBC_FUNCTION(int, ispunct, (int c)) {
-  const unsigned ch = static_cast<unsigned>(c);
+  if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+    return 0;
+  const char ch = static_cast<char>(c);
   return static_cast<int>(!internal::isalnum(ch) && internal::isgraph(ch));
 }
 
diff --git a/libc/src/ctype/ispunct_l.cpp b/libc/src/ctype/ispunct_l.cpp
index e825fbe2001b0..79cd47b6a214d 100644
--- a/libc/src/ctype/ispunct_l.cpp
+++ b/libc/src/ctype/ispunct_l.cpp
@@ -8,6 +8,7 @@
 
 #include "src/ctype/ispunct_l.h"
 
+#include "src/__support/CPP/limits.h"
 #include "src/__support/common.h"
 #include "src/__support/ctype_utils.h"
 #include "src/__support/macros/config.h"
@@ -15,7 +16,9 @@
 namespace LIBC_NAMESPACE_DECL {
 
 LLVM_LIBC_FUNCTION(int, ispunct_l, (int c, locale_t)) {
-  const unsigned ch = static_cast<unsigned>(c);
+  if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+    return 0;
+  const char ch = static_cast<char>(c);
   return static_cast<int>(!internal::isalnum(ch) && internal::isgraph(ch));
 }
 
diff --git a/libc/src/ctype/isspace.cpp b/libc/src/ctype/isspace.cpp
index 005bf460fc103..908cc11960739 100644
--- a/libc/src/ctype/isspace.cpp
+++ b/libc/src/ctype/isspace.cpp
@@ -7,15 +7,18 @@
 //===----------------------------------------------------------------------===//
 
 #include "src/ctype/isspace.h"
-#include "src/__support/ctype_utils.h"
 
+#include "src/__support/CPP/limits.h"
 #include "src/__support/common.h"
+#include "src/__support/ctype_utils.h"
 #include "src/__support/macros/config.h"
 
 namespace LIBC_NAMESPACE_DECL {
 
 LLVM_LIBC_FUNCTION(int, isspace, (int c)) {
-  return static_cast<int>(internal::isspace(static_cast<unsigned>(c)));
+  if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+    return 0;  
+  return static_cast<int>(internal::isspace(static_cast<char>(c)));
 }
 
 } // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/ctype/isspace_l.cpp b/libc/src/ctype/isspace_l.cpp
index 5c46dd6805126..dd287c633b5c4 100644
--- a/libc/src/ctype/isspace_l.cpp
+++ b/libc/src/ctype/isspace_l.cpp
@@ -7,15 +7,18 @@
 //===----------------------------------------------------------------------===//
 
 #include "src/ctype/isspace_l.h"
-#include "src/__support/ctype_utils.h"
 
+#include "src/__support/CPP/limits.h"
 #include "src/__support/common.h"
+#include "src/__support/ctype_utils.h"
 #include "src/__support/macros/config.h"
 
 namespace LIBC_NAMESPACE_DECL {
 
 LLVM_LIBC_FUNCTION(int, isspace_l, (int c, locale_t)) {
-  return static_cast<int>(internal::isspace(static_cast<unsigned>(c)));
+  if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+    return 0;  
+  return static_cast<int>(internal::isspace(static_cast<char>(c)));
 }
 
 } // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/ctype/isupper.cpp b/libc/src/ctype/isupper.cpp
index 965fa336b28b4..c5c3dbd5d7d4a 100644
--- a/libc/src/ctype/isupper.cpp
+++ b/libc/src/ctype/isupper.cpp
@@ -7,15 +7,18 @@
 //===----------------------------------------------------------------------===//
 
 #include "src/ctype/isupper.h"
-#include "src/__support/ctype_utils.h"
 
+#include "src/__support/CPP/limits.h"
 #include "src/__support/common.h"
+#include "src/__support/ctype_utils.h"
 #include "src/__support/macros/config.h"
 
 namespace LIBC_NAMESPACE_DECL {
 
 LLVM_LIBC_FUNCTION(int, isupper, (int c)) {
-  return static_cast<int>(internal::isupper(static_cast<unsigned>(c)));
+  if (c < 0 || c > cpp::numeric_limits<unsigned char>::max())
+    return 0;
+  return static_cast<int>(internal::isupper(static_cast<char>(c)));
 }
 
 } // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/ctype/isupper_l.cpp b/libc/src/ctype/isupper_l.cpp
index 358990261d603..44ed9dab90a16 100644
--- a/libc/src/ctype/isupper_l.cpp
+++ b/libc/src/ctype/isupper_l.cpp
@@ -7,15 +7,18 @@
 //===----------------------------------------------------------------------===//
 
 #include "src/ctype/isupper_l.h"
-#include "src/__support/ctype_utils.h"
 
+#include "src/__support/CPP/limits.h"
 #include "src/__support/common.h"
+#include "src/__support/ctype_utils.h"
 #include "src/__support/macros/config.h"
 
 namespace LIBC_NAMESPACE_DECL {
 
 LLVM_LIBC_FUNCTION(int, isupper_l, (int c, locale_t)) {...
[truncated]

@github-actions
Copy link

github-actions bot commented Nov 3, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

vonosmas added a commit to vonosmas/llvm-project that referenced this pull request Nov 3, 2025
This is a counterpart of llvm#166225
but for wctype_utils (which are not yet widely used). For now, I'm just
changing the types from wint_t to wchar_t to match the regular ctype_utils
change. The next change may rename most of the functions to match
the name of ctype_utils variants, so that we could be calling them from
the templated code operating on "const char*" and "const wchar_t*"
strings, and the right function signature would be picked up.
Copy link
Contributor

@michaelrj-google michaelrj-google left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the cleanup

@vonosmas
Copy link
Contributor Author

vonosmas commented Nov 5, 2025

Thank you for the review!

I've checked that the code is not is seemingly not pessimized by this change too much: https://godbolt.org/z/he31Mvfch (lookup table is still there).

@vonosmas vonosmas merged commit 120689e into llvm:main Nov 5, 2025
21 checks passed
@vonosmas vonosmas deleted the libc-ctype-type-fix branch November 5, 2025 19:02
vonosmas added a commit that referenced this pull request Nov 5, 2025
This is a counterpart of
#166225 but for wctype_utils
(which are not yet widely used). For now, I'm just changing the types
from wint_t to wchar_t to match the regular ctype_utils change. The next
change may rename most of the functions to match the name of ctype_utils
variants, so that we could be calling them from the templated code
operating on "const char*" and "const wchar_t*" strings, and the right
function signature would be picked up.
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Nov 5, 2025
…e. (#166234)

This is a counterpart of
llvm/llvm-project#166225 but for wctype_utils
(which are not yet widely used). For now, I'm just changing the types
from wint_t to wchar_t to match the regular ctype_utils change. The next
change may rename most of the functions to match the name of ctype_utils
variants, so that we could be calling them from the templated code
operating on "const char*" and "const wchar_t*" strings, and the right
function signature would be picked up.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bazel "Peripheral" support tier build system: utils/bazel libc

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants