Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[clang] Improved isSimpleTypeSpecifier #79037

Merged
merged 2 commits into from
Jan 31, 2024

Conversation

carlos4242
Copy link
Contributor

@carlos4242 carlos4242 commented Jan 22, 2024

Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be
notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write
permissions for the repository. In which case you can instead tag reviewers by
name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review
by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate
is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

Copy link

github-actions bot commented Jan 22, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@carlos4242 carlos4242 force-pushed the improved-isSimpleTypeSpecifier2 branch from c3de969 to 61a06b7 Compare January 22, 2024 20:09
@carlos4242
Copy link
Contributor Author

@owenca ... is this better for you?

@owenca
Copy link
Contributor

owenca commented Jan 25, 2024

@carlos4242 carlos4242 force-pushed the improved-isSimpleTypeSpecifier2 branch from b2f5cd3 to fdc1e0d Compare January 26, 2024 15:35
@carlos4242 carlos4242 marked this pull request as ready for review January 26, 2024 15:39
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Jan 26, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Jan 26, 2024

@llvm/pr-subscribers-clang

Author: Carl Peto (carlos4242)

Changes
  • Sema::isSimpleTypeSpecifier return true for _Bool in c99 (currently returns false for _Bool, regardless of C dialect). (Fixes #72203)
  • move simple type decision code into shared location (IdentifierInfo)
  • replace the logic with a check for simple types and a proper check for a valid keyword in the appropriate dialect
  • change all call sites to match the above new API

Full diff: https://github.com/llvm/llvm-project/pull/79037.diff

5 Files Affected:

  • (modified) clang/include/clang/Lex/Token.h (+7)
  • (modified) clang/include/clang/Sema/Sema.h (+1-1)
  • (modified) clang/lib/Parse/ParseExpr.cpp (+1-1)
  • (modified) clang/lib/Parse/ParseObjc.cpp (+1-1)
  • (modified) clang/lib/Sema/SemaDecl.cpp (+13-13)
diff --git a/clang/include/clang/Lex/Token.h b/clang/include/clang/Lex/Token.h
index 1409e2c58b550df..3c707d469322139 100644
--- a/clang/include/clang/Lex/Token.h
+++ b/clang/include/clang/Lex/Token.h
@@ -196,6 +196,13 @@ class Token {
     PtrData = (void*) II;
   }
 
+  bool hasIdentifierInfo() {
+    if (is(tok::raw_identifier) || isAnnotation() || isLiteral() ||
+        is(tok::eof))
+      return false;
+    return true;
+  }
+
   const void *getEofData() const {
     assert(is(tok::eof));
     return reinterpret_cast<const void *>(PtrData);
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index 1f1cbd11ff73581..dc8797b2f12106f 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -2636,7 +2636,7 @@ class Sema final {
 
   void DiagnoseUseOfUnimplementedSelectors();
 
-  bool isSimpleTypeSpecifier(tok::TokenKind Kind) const;
+  bool isSimpleTypeSpecifier(Token &Tok) const;
 
   ParsedType getTypeName(const IdentifierInfo &II, SourceLocation NameLoc,
                          Scope *S, CXXScopeSpec *SS = nullptr,
diff --git a/clang/lib/Parse/ParseExpr.cpp b/clang/lib/Parse/ParseExpr.cpp
index e862856a08ca117..034d56c6b0e300a 100644
--- a/clang/lib/Parse/ParseExpr.cpp
+++ b/clang/lib/Parse/ParseExpr.cpp
@@ -1597,7 +1597,7 @@ ExprResult Parser::ParseCastExpression(CastParseKind ParseKind,
       if (TryAnnotateTypeOrScopeToken())
         return ExprError();
 
-      if (!Actions.isSimpleTypeSpecifier(Tok.getKind()))
+      if (!Actions.isSimpleTypeSpecifier(Tok))
         // We are trying to parse a simple-type-specifier but might not get such
         // a token after error recovery.
         return ExprError();
diff --git a/clang/lib/Parse/ParseObjc.cpp b/clang/lib/Parse/ParseObjc.cpp
index 849fd1ac95a442e..4771b69eadb34b9 100644
--- a/clang/lib/Parse/ParseObjc.cpp
+++ b/clang/lib/Parse/ParseObjc.cpp
@@ -2971,7 +2971,7 @@ bool Parser::ParseObjCXXMessageReceiver(bool &IsExpr, void *&TypeOrExpr) {
                   tok::annot_cxxscope))
     TryAnnotateTypeOrScopeToken();
 
-  if (!Actions.isSimpleTypeSpecifier(Tok.getKind())) {
+  if (!Actions.isSimpleTypeSpecifier(Tok)) {
     //   objc-receiver:
     //     expression
     // Make sure any typos in the receiver are corrected or diagnosed, so that
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index f9bf1d14bdc4f68..4c7b12a53768c01 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -128,10 +128,15 @@ class TypeNameValidatorCCC final : public CorrectionCandidateCallback {
 } // end anonymous namespace
 
 /// Determine whether the token kind starts a simple-type-specifier.
-bool Sema::isSimpleTypeSpecifier(tok::TokenKind Kind) const {
+bool Sema::isSimpleTypeSpecifier(Token &Tok) const {
+  auto Kind = Tok.getKind();
+  auto LangOpts = getLangOpts();
+
   switch (Kind) {
-  // FIXME: Take into account the current language when deciding whether a
-  // token kind is a valid type specifier
+  case tok::annot_typename:
+  case tok::annot_decltype:
+    return true;
+
   case tok::kw_short:
   case tok::kw_long:
   case tok::kw___int64:
@@ -156,24 +161,19 @@ bool Sema::isSimpleTypeSpecifier(tok::TokenKind Kind) const {
 #define TRANSFORM_TYPE_TRAIT_DEF(_, Trait) case tok::kw___##Trait:
 #include "clang/Basic/TransformTypeTraits.def"
   case tok::kw___auto_type:
-    return true;
-
-  case tok::annot_typename:
+  case tok::kw__Bool:
   case tok::kw_char16_t:
   case tok::kw_char32_t:
   case tok::kw_typeof:
-  case tok::annot_decltype:
   case tok::kw_decltype:
-    return getLangOpts().CPlusPlus;
-
   case tok::kw_char8_t:
-    return getLangOpts().Char8;
+    if (!Tok.hasIdentifierInfo())
+      return false;
+    return Tok.getIdentifierInfo()->isKeyword(LangOpts);
 
   default:
-    break;
+    return false;
   }
-
-  return false;
 }
 
 namespace {

clang/lib/Sema/SemaDecl.cpp Outdated Show resolved Hide resolved
clang/include/clang/Sema/Sema.h Outdated Show resolved Hide resolved
clang/lib/Sema/SemaDecl.cpp Outdated Show resolved Hide resolved
clang/lib/Sema/SemaDecl.cpp Outdated Show resolved Hide resolved
clang/include/clang/Lex/Token.h Outdated Show resolved Hide resolved
@carlos4242
Copy link
Contributor Author

I think unit testing the bug fix to this function (the copy that is in Sema) is going to be nearly impossible.

To prove the point, I tried hard coding the wrong return value for many types just now...

  switch (Kind) {
  case tok::kw_bool:
  case tok::kw_short:
  case tok::kw_long:
  case tok::kw___int64:
  case tok::kw___int128:
  case tok::kw_signed:
  case tok::kw_unsigned:
  case tok::kw_char:
  case tok::kw_int:
  case tok::kw_half:
  case tok::kw_float:
  case tok::kw_double:
  case tok::kw___bf16:
  case tok::kw__Float16:
  case tok::kw___float128:
  case tok::kw___ibm128:
  case tok::kw_wchar_t:
    return false;

...Then re-ran the full clang/tests suite (18,000+ tests) and all passed as normal. So there's basically very little unit test coverage on this function as it stands. It's not just a question of improving or adding to existing tests. Whoever first wrote this function seems to have not added really enough unit test coverage at the time, and it's only called in two obscure places in Sema. The duplicate copy function in Format is more widely used and would probably be easier to test. But for this PR I think I'll make the changes you suggest, then leave it as is... either it gets approved without new tests or it'll get culled one day as an abandoned PR. I'll leave that up to the maintainers.

@owenca
Copy link
Contributor

owenca commented Jan 28, 2024

LGTM, but please rebase and resolve conflicts.

    - Sema::isSimpleTypeSpecifier return true for _Bool in c99 (currently returns false for _Bool, regardless of C dialect). (Fixes llvm#72203)
    - replace the logic with a check for simple types and a proper check for a valid keyword in the appropriate dialect
    - PR feedback fixes for owenca
@carlos4242 carlos4242 force-pushed the improved-isSimpleTypeSpecifier2 branch from 9863244 to a785453 Compare January 29, 2024 11:27
@carlos4242
Copy link
Contributor Author

LGTM, but please rebase and resolve conflicts.

done, rebased, tests pass locally

@carlos4242
Copy link
Contributor Author

note: build kite is passing on linux but failing windows tests with an error like "mt.exe : general error c101008d: Failed to write the updated manifest to the resource of file "bin\llvm-symbolizer.exe". Operation did not complete successfully because the file contains a virus or potentially unwanted software" ... I'm a bit unsure how to proceed with that. Does it look like a genuine failure I should address or a spurious build quirk?

@owenca
Copy link
Contributor

owenca commented Jan 30, 2024

Does it look like a genuine failure I should address or a spurious build quirk?

The latter.

Copy link
Collaborator

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM; I can't see a reasonable way for us to test this change and it's almost an NFC change, so I think it's fine to land without tests given that we're at the very start of the 19.x cycle.

@AaronBallman
Copy link
Collaborator

Once precommit CI comes back green, I'll merge on your behalf.

@owenca owenca merged commit b4d832c into llvm:main Jan 31, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Sema::isSimpleTypeSpecifier should return true for _Bool from C99 onwards
4 participants