Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix clang to recognize new C23 modifiers %w and %wf when printing and scanning #71771

Draft
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

ZijunZhaoCCK
Copy link
Contributor

No description provided.

Copy link

github-actions bot commented Nov 9, 2023

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 4d4c30a37c75a4c41bcc70be1431651704624b6a a5b2d771fc51df5018765896207b76cfee88e39b -- clang/include/clang/AST/FormatString.h clang/include/clang/Basic/TargetInfo.h clang/lib/AST/FormatString.cpp clang/lib/AST/PrintfFormatString.cpp clang/lib/AST/ScanfFormatString.cpp clang/lib/Basic/TargetInfo.cpp clang/lib/Sema/SemaChecking.cpp clang/test/Sema/format-strings-ms.c
View the diff from clang-format here.
diff --git a/clang/include/clang/AST/FormatString.h b/clang/include/clang/AST/FormatString.h
index 7add829ee1..673fd0d91e 100644
--- a/clang/include/clang/AST/FormatString.h
+++ b/clang/include/clang/AST/FormatString.h
@@ -81,9 +81,10 @@ public:
     AsLongDouble, // 'L'
     AsAllocate,   // for '%as', GNU extension to C90 scanf
     AsMAllocate,  // for '%ms', GNU extension to scanf
-    AsWide,       // 'w' (1. MSVCRT, like l but only for c, C, s, S, or Z on windows
-                  // 2. for b, d, i, o, u, x, or X when a size followed(like 8, 16, 32 or 64)
-    AsWideFast,   // 'wf' (for b, d, i, o, u, x, or X)
+    AsWide, // 'w' (1. MSVCRT, like l but only for c, C, s, S, or Z on windows
+            // 2. for b, d, i, o, u, x, or X when a size followed(like 8, 16, 32
+            // or 64)
+    AsWideFast,          // 'wf' (for b, d, i, o, u, x, or X)
     AsWideChar = AsLong, // for '%ls', only makes sense for printf
   };
 
@@ -424,8 +425,8 @@ protected:
 
 public:
   FormatSpecifier(bool isPrintf)
-    : CS(isPrintf), VectorNumElts(false),
-      UsesPositionalArg(false), argIndex(0), ExplicitlyFixedSizeValid(true) {}
+      : CS(isPrintf), VectorNumElts(false), UsesPositionalArg(false),
+        argIndex(0), ExplicitlyFixedSizeValid(true) {}
 
   void setLengthModifier(LengthModifier lm) {
     LM = lm;
@@ -465,13 +466,9 @@ public:
     FieldWidth = Amt;
   }
 
-  void setExplicitlyFixedSize(unsigned S) {
-    ExplicitlyFixedSize = S;
-  }
+  void setExplicitlyFixedSize(unsigned S) { ExplicitlyFixedSize = S; }
 
-  unsigned getExplicitlyFixedSize() const {
-    return ExplicitlyFixedSize;
-  }
+  unsigned getExplicitlyFixedSize() const { return ExplicitlyFixedSize; }
 
   bool usesPositionalArg() const { return UsesPositionalArg; }
 
diff --git a/clang/include/clang/Basic/TargetInfo.h b/clang/include/clang/Basic/TargetInfo.h
index c1e79b5d75..36018ae14c 100644
--- a/clang/include/clang/Basic/TargetInfo.h
+++ b/clang/include/clang/Basic/TargetInfo.h
@@ -443,8 +443,8 @@ public:
                                          bool IsSigned) const;
 
   /// Return the fastest integer type with at least the specified width.
-  virtual IntType getFastIntTypeByWidth(unsigned BitWidth,
-                                        bool IsSigned, bool Fast) const;
+  virtual IntType getFastIntTypeByWidth(unsigned BitWidth, bool IsSigned,
+                                        bool Fast) const;
 
   /// Return floating point type with specified width. On PPC, there are
   /// three possible types for 128-bit floating point: "PPC double-double",
diff --git a/clang/lib/AST/FormatString.cpp b/clang/lib/AST/FormatString.cpp
index 39359c2b18..a05f6563f2 100644
--- a/clang/lib/AST/FormatString.cpp
+++ b/clang/lib/AST/FormatString.cpp
@@ -287,7 +287,8 @@ clang::analyze_format_string::ParseLengthModifier(FormatSpecifier &FS,
       break;
     case 'w':
       ++I;
-      if (I == E) return false;
+      if (I == E)
+        return false;
       if (*I == 'f') {
         lmKind = LengthModifier::AsWideFast;
         ++I;
@@ -295,7 +296,8 @@ clang::analyze_format_string::ParseLengthModifier(FormatSpecifier &FS,
         lmKind = LengthModifier::AsWide;
       }
 
-      if (I == E) return false;
+      if (I == E)
+        return false;
       int s = 0;
       bool MSVCRT = true;
       while (unsigned(*I - '0') <= 9) {
@@ -304,10 +306,11 @@ clang::analyze_format_string::ParseLengthModifier(FormatSpecifier &FS,
         ++I;
       }
 
-      // MSVCRT == true is MSVCRT case, like l but only for c, C, s, S, or Z on windows
-      // MSVCRT == false for b, d, i, o, u, x, or X when a size followed(like 8, 16, 32 or 64)
+      // MSVCRT == true is MSVCRT case, like l but only for c, C, s, S, or Z on
+      // windows MSVCRT == false for b, d, i, o, u, x, or X when a size
+      // followed(like 8, 16, 32 or 64)
       if (!MSVCRT) {
-        std::set<int> supported_list {8, 16, 32, 64};
+        std::set<int> supported_list{8, 16, 32, 64};
         if (supported_list.count(s) == 0)
           FS.setExplicitlyFixedSizeValid(false);
         FS.setExplicitlyFixedSize(s);
@@ -1009,19 +1012,19 @@ bool FormatSpecifier::hasValidLengthModifier(const TargetInfo &Target,
           return true;
         default:
           return false;
-      }
+        }
     case LengthModifier::AsWideFast:
       switch (CS.getKind()) {
-        case ConversionSpecifier::bArg:
-        case ConversionSpecifier::dArg:
-        case ConversionSpecifier::iArg:
-        case ConversionSpecifier::oArg:
-        case ConversionSpecifier::uArg:
-        case ConversionSpecifier::xArg:
-        case ConversionSpecifier::XArg:
-          return true;
-        default:
-          return false;
+      case ConversionSpecifier::bArg:
+      case ConversionSpecifier::dArg:
+      case ConversionSpecifier::iArg:
+      case ConversionSpecifier::oArg:
+      case ConversionSpecifier::uArg:
+      case ConversionSpecifier::xArg:
+      case ConversionSpecifier::XArg:
+        return true;
+      default:
+        return false;
       }
   }
   llvm_unreachable("Invalid LengthModifier Kind!");
diff --git a/clang/lib/AST/PrintfFormatString.cpp b/clang/lib/AST/PrintfFormatString.cpp
index f41fba78b5..6298481743 100644
--- a/clang/lib/AST/PrintfFormatString.cpp
+++ b/clang/lib/AST/PrintfFormatString.cpp
@@ -484,27 +484,27 @@ bool clang::analyze_format_string::parseFormatStringHasFormattingSpecifiers(
   return false;
 }
 
-ArgType clang::analyze_format_string::wToArgType(
-    int Size, bool IsSigned, bool Fast, ASTContext &C) {
+ArgType clang::analyze_format_string::wToArgType(int Size, bool IsSigned,
+                                                 bool Fast, ASTContext &C) {
   switch (C.getTargetInfo().getFastIntTypeByWidth(Size, IsSigned, Fast)) {
-    case TargetInfo::SignedChar:
-      return C.SignedCharTy;
-    case TargetInfo::UnsignedChar:
-      return C.UnsignedCharTy;
-    case TargetInfo::SignedShort:
-      return C.ShortTy;
-    case TargetInfo::UnsignedShort:
-      return C.UnsignedShortTy;
-    case TargetInfo::SignedInt:
-      return C.IntTy;
-    case TargetInfo::UnsignedInt:
-      return C.UnsignedIntTy;
-    case TargetInfo::SignedLongLong:
-      return C.LongLongTy;
-    case TargetInfo::UnsignedLongLong:
-      return C.UnsignedLongLongTy;
-    default:
-      return ArgType::Invalid();
+  case TargetInfo::SignedChar:
+    return C.SignedCharTy;
+  case TargetInfo::UnsignedChar:
+    return C.UnsignedCharTy;
+  case TargetInfo::SignedShort:
+    return C.ShortTy;
+  case TargetInfo::UnsignedShort:
+    return C.UnsignedShortTy;
+  case TargetInfo::SignedInt:
+    return C.IntTy;
+  case TargetInfo::UnsignedInt:
+    return C.UnsignedIntTy;
+  case TargetInfo::SignedLongLong:
+    return C.LongLongTy;
+  case TargetInfo::UnsignedLongLong:
+    return C.UnsignedLongLongTy;
+  default:
+    return ArgType::Invalid();
   }
 }
 
diff --git a/clang/lib/AST/ScanfFormatString.cpp b/clang/lib/AST/ScanfFormatString.cpp
index 52bb1de117..e030590427 100644
--- a/clang/lib/AST/ScanfFormatString.cpp
+++ b/clang/lib/AST/ScanfFormatString.cpp
@@ -313,8 +313,10 @@ ArgType ScanfSpecifier::getArgType(ASTContext &Ctx) const {
         case LengthModifier::AsWideFast:
           int S = getExplicitlyFixedSize();
           bool Fast = LM.getKind() == LengthModifier::AsWideFast ? true : false;
-          if (CS.getKind() == ConversionSpecifier::uArg or CS.getKind() == ConversionSpecifier::UArg)
-            return clang::analyze_format_string::wToArgType(S, false, Fast, Ctx);
+          if (CS.getKind() == ConversionSpecifier::uArg or
+              CS.getKind() == ConversionSpecifier::UArg)
+            return clang::analyze_format_string::wToArgType(S, false, Fast,
+                                                            Ctx);
           return clang::analyze_format_string::wToArgType(S, true, Fast, Ctx);
       }
       llvm_unreachable("Unsupported LengthModifier Type");
diff --git a/clang/lib/Basic/TargetInfo.cpp b/clang/lib/Basic/TargetInfo.cpp
index fbc842d492..89a0a73bd8 100644
--- a/clang/lib/Basic/TargetInfo.cpp
+++ b/clang/lib/Basic/TargetInfo.cpp
@@ -313,11 +313,12 @@ TargetInfo::IntType TargetInfo::getLeastIntTypeByWidth(unsigned BitWidth,
 }
 
 TargetInfo::IntType TargetInfo::getFastIntTypeByWidth(unsigned BitWidth,
-                                                      bool IsSigned, bool Fast)
-                                                      const {
-  IntType SignedFastType = getTriple().isArch64Bit() ? SignedLongLong : SignedInt;
-  IntType UnSignedFastType = getTriple().isArch64Bit() ?
-                             UnsignedLongLong : UnsignedInt;
+                                                      bool IsSigned,
+                                                      bool Fast) const {
+  IntType SignedFastType =
+      getTriple().isArch64Bit() ? SignedLongLong : SignedInt;
+  IntType UnSignedFastType =
+      getTriple().isArch64Bit() ? UnsignedLongLong : UnsignedInt;
   if (getCharWidth() == BitWidth)
     return IsSigned ? SignedChar : UnsignedChar;
   if (getShortWidth() == BitWidth) {
diff --git a/clang/lib/Sema/SemaChecking.cpp b/clang/lib/Sema/SemaChecking.cpp
index 7e2f5dc670..dc0a439240 100644
--- a/clang/lib/Sema/SemaChecking.cpp
+++ b/clang/lib/Sema/SemaChecking.cpp
@@ -11681,13 +11681,12 @@ bool CheckPrintfHandler::HandlePrintfSpecifier(
     HandleNonStandardConversionSpecifier(CS, startSpecifier, specifierLen);
 
   // Check the explicitly fixed size is supported
-  if (!FS.isExplicitlyFixedSizeSupported()){
-    EmitFormatDiagnostic(S.PDiag(
-                         diag::warn_format_conversion_size_unsupported) 
-                         << FS.getLengthModifier().toString()
-                         << FS.getExplicitlyFixedSize(),
+  if (!FS.isExplicitlyFixedSizeSupported()) {
+    EmitFormatDiagnostic(S.PDiag(diag::warn_format_conversion_size_unsupported)
+                             << FS.getLengthModifier().toString()
+                             << FS.getExplicitlyFixedSize(),
                          getLocationOfByte(startSpecifier),
-                         /*IsStringLocation*/true,
+                         /*IsStringLocation*/ true,
                          getSpecifierRange(startSpecifier, specifierLen));
   }
 
@@ -12300,15 +12299,14 @@ bool CheckScanfHandler::HandleScanfSpecifier(
   else if (!FS.hasStandardLengthConversionCombination())
     HandleInvalidLengthModifier(FS, CS, startSpecifier, specifierLen,
                                 diag::warn_format_non_standard_conversion_spec);
-  
+
   // Check the explicitly fixed size is supported
-  if (!FS.isExplicitlyFixedSizeSupported()){
-    EmitFormatDiagnostic(S.PDiag(
-                         diag::warn_format_conversion_size_unsupported) 
-                         << FS.getLengthModifier().toString()
-                         << FS.getExplicitlyFixedSize(),
+  if (!FS.isExplicitlyFixedSizeSupported()) {
+    EmitFormatDiagnostic(S.PDiag(diag::warn_format_conversion_size_unsupported)
+                             << FS.getLengthModifier().toString()
+                             << FS.getExplicitlyFixedSize(),
                          getLocationOfByte(startSpecifier),
-                         /*IsStringLocation*/true,
+                         /*IsStringLocation*/ true,
                          getSpecifierRange(startSpecifier, specifierLen));
   }
 

clang/include/clang/AST/FormatString.h Outdated Show resolved Hide resolved
clang/include/clang/AST/FormatString.h Outdated Show resolved Hide resolved
clang/lib/AST/FormatString.cpp Show resolved Hide resolved
@@ -484,6 +484,26 @@ bool clang::analyze_format_string::parseFormatStringHasFormattingSpecifiers(
return false;
}

ArgType clang::analyze_format_string::wToArgType(
int size, bool fast, ASTContext &C) {
ArgType fastType = C.getTargetInfo().getTriple().isArch64Bit() ? C.LongLongTy : C.IntTy;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also seems incorrect -- we should be looking at the underlying type for the specified fastest (or exact) width integer. But again, not entirely clear how to do that from this interface...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's actually quite a mess right now...

We currently have TargetInfo getIntTypeByWidth/getLeastIntTypeByWidth, but, no getFastIntTypeByWidth.

And this is a real problem...e.g. clang's stdint.h builtin-header goes through a ton of complexity to obfuscate that in fact it just ends up defining int_leastN_t and int_fastN_t as aliases to intN_t, on all platforms we support (which have all of 8/16/32/64-bit types available.)

But, clang's stdint.h defers to the libc's stdint.h if that exists, and e.g. on glibc, int_fast{16|32}_t are instead defined as int (32bit) or long (64bit).

OK, fine...Except -- we ALSO define a bunch of preprocessor macros for the fast types in InitPreprocessor (e.g. __INT_FAST16_TYPE__. Which assume that fast == least. And, of course, these don't match the libc stdint.h's definitions...

And then there's the problem of which of 'long' vs 'long long' is used on various platforms, when both are 64-bit types.

I think we need to fix all that as a prerequisite. At that point, this code simply calls getFastIntTypeByWidth.

clang/test/Sema/format-strings-ms.c Show resolved Hide resolved
@ZijunZhaoCCK ZijunZhaoCCK self-assigned this Nov 14, 2023
clang/include/clang/AST/FormatString.h Outdated Show resolved Hide resolved
clang/lib/AST/FormatString.cpp Outdated Show resolved Hide resolved
clang/lib/AST/FormatString.cpp Outdated Show resolved Hide resolved
clang/lib/AST/PrintfFormatString.cpp Outdated Show resolved Hide resolved
@ZijunZhaoCCK ZijunZhaoCCK changed the title Fix clang to recognize new C23 modifiers %w and %wf when printing Fix clang to recognize new C23 modifiers %w and %wf when printing and scanning Dec 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants