Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ClangFormat] Fix formatting bugs. #76245

Merged
merged 4 commits into from Jan 11, 2024
Merged

[ClangFormat] Fix formatting bugs. #76245

merged 4 commits into from Jan 11, 2024

Conversation

r4nt
Copy link
Collaborator

@r4nt r4nt commented Dec 22, 2023

  1. There are multiple calls to addFakeParenthesis; move the guard to not
    assign fake parenthesis into the function to make sure we cover all
    calls.
  2. MustBreakBefore can be set on a token in two cases: either during
    unwrapped line parsing, or later, during token annotation. We must
    keep the latter, but reset the former.
  3. Added a test to document that the intended behavior of preferring not to
    break between a return type and a function identifier.
    For example, with MOCK_METHOD(r, n, a)=r n a, the code
    MOCK_METHOD(void, f, (int a, int b)) should prefer the same breaks as
    the expanded void f(int a, int b).

1. There are multiple calls to addFakeParenthesis; move the guard to not
   assign fake parenthesis into the function to make sure we cover all
   calls.
2. MustBreakBefore can be set on a token in two cases: either during
   unwrapped line parsing, or later, during token annotation. We must
   keep the latter, but reset the former.
3. Added a test to document that the intended behavior of preferring not to
   break between a return type and a function identifier.
   For example, with MOCK_METHOD(r, n, a)=r n a, the code
   MOCK_METHOD(void, f, (int a, int b)) should prefer the same breaks as
   the expanded void f(int a, int b).
@llvmbot
Copy link
Collaborator

llvmbot commented Dec 22, 2023

@llvm/pr-subscribers-clang-format

Author: None (r4nt)

Changes
  1. There are multiple calls to addFakeParenthesis; move the guard to not
    assign fake parenthesis into the function to make sure we cover all
    calls.
  2. MustBreakBefore can be set on a token in two cases: either during
    unwrapped line parsing, or later, during token annotation. We must
    keep the latter, but reset the former.
  3. Added a test to document that the intended behavior of preferring not to
    break between a return type and a function identifier.
    For example, with MOCK_METHOD(r, n, a)=r n a, the code
    MOCK_METHOD(void, f, (int a, int b)) should prefer the same breaks as
    the expanded void f(int a, int b).

Full diff: https://github.com/llvm/llvm-project/pull/76245.diff

5 Files Affected:

  • (modified) clang/lib/Format/FormatToken.h (+18-8)
  • (modified) clang/lib/Format/TokenAnnotator.cpp (+6-7)
  • (modified) clang/lib/Format/UnwrappedLineFormatter.cpp (+6-4)
  • (modified) clang/lib/Format/UnwrappedLineParser.cpp (+2)
  • (modified) clang/unittests/Format/FormatTestMacroExpansion.cpp (+38)
diff --git a/clang/lib/Format/FormatToken.h b/clang/lib/Format/FormatToken.h
index 3f9664f8f78a3e..b1e3ae8ab303d6 100644
--- a/clang/lib/Format/FormatToken.h
+++ b/clang/lib/Format/FormatToken.h
@@ -275,14 +275,15 @@ class AnnotatedLine;
 struct FormatToken {
   FormatToken()
       : HasUnescapedNewline(false), IsMultiline(false), IsFirst(false),
-        MustBreakBefore(false), IsUnterminatedLiteral(false),
-        CanBreakBefore(false), ClosesTemplateDeclaration(false),
-        StartsBinaryExpression(false), EndsBinaryExpression(false),
-        PartOfMultiVariableDeclStmt(false), ContinuesLineCommentSection(false),
-        Finalized(false), ClosesRequiresClause(false),
-        EndsCppAttributeGroup(false), BlockKind(BK_Unknown),
-        Decision(FD_Unformatted), PackingKind(PPK_Inconclusive),
-        TypeIsFinalized(false), Type(TT_Unknown) {}
+        MustBreakBefore(false), MustBreakBeforeFinalized(false),
+        IsUnterminatedLiteral(false), CanBreakBefore(false),
+        ClosesTemplateDeclaration(false), StartsBinaryExpression(false),
+        EndsBinaryExpression(false), PartOfMultiVariableDeclStmt(false),
+        ContinuesLineCommentSection(false), Finalized(false),
+        ClosesRequiresClause(false), EndsCppAttributeGroup(false),
+        BlockKind(BK_Unknown), Decision(FD_Unformatted),
+        PackingKind(PPK_Inconclusive), TypeIsFinalized(false),
+        Type(TT_Unknown) {}
 
   /// The \c Token.
   Token Tok;
@@ -318,6 +319,10 @@ struct FormatToken {
   /// before the token.
   unsigned MustBreakBefore : 1;
 
+  /// Whether MustBreakBefore is finalized during parsing and must not
+  /// be reset between runs.
+  unsigned MustBreakBeforeFinalized : 1;
+
   /// Set to \c true if this token is an unterminated literal.
   unsigned IsUnterminatedLiteral : 1;
 
@@ -416,10 +421,15 @@ struct FormatToken {
   /// to another one please use overwriteFixedType, or even better remove the
   /// need to reassign the type.
   void setFinalizedType(TokenType T) {
+    if (MacroCtx && MacroCtx->Role == MR_UnexpandedArg)
+      return;
+
     Type = T;
     TypeIsFinalized = true;
   }
   void overwriteFixedType(TokenType T) {
+    if (MacroCtx && MacroCtx->Role == MR_UnexpandedArg)
+      return;
     TypeIsFinalized = false;
     setType(T);
   }
diff --git a/clang/lib/Format/TokenAnnotator.cpp b/clang/lib/Format/TokenAnnotator.cpp
index f3551af3424396..c26b248a3b2d40 100644
--- a/clang/lib/Format/TokenAnnotator.cpp
+++ b/clang/lib/Format/TokenAnnotator.cpp
@@ -2769,13 +2769,6 @@ class ExpressionParser {
       // Consume operators with higher precedence.
       parse(Precedence + 1);
 
-      // Do not assign fake parenthesis to tokens that are part of an
-      // unexpanded macro call. The line within the macro call contains
-      // the parenthesis and commas, and we will not find operators within
-      // that structure.
-      if (Current && Current->MacroParent)
-        break;
-
       int CurrentPrecedence = getCurrentPrecedence();
 
       if (Precedence == CurrentPrecedence && Current &&
@@ -2919,6 +2912,12 @@ class ExpressionParser {
 
   void addFakeParenthesis(FormatToken *Start, prec::Level Precedence,
                           FormatToken *End = nullptr) {
+    // Do not assign fake parenthesis to tokens that are part of an
+    // unexpanded macro call. The line within the macro call contains
+    // the parenthesis and commas, and we will not find operators within
+    // that structure.
+    if (Start->MacroParent) return;
+
     Start->FakeLParens.push_back(Precedence);
     if (Precedence > prec::Unknown)
       Start->StartsBinaryExpression = true;
diff --git a/clang/lib/Format/UnwrappedLineFormatter.cpp b/clang/lib/Format/UnwrappedLineFormatter.cpp
index 56077499c39d53..27983a330ac40a 100644
--- a/clang/lib/Format/UnwrappedLineFormatter.cpp
+++ b/clang/lib/Format/UnwrappedLineFormatter.cpp
@@ -954,13 +954,15 @@ static void markFinalized(FormatToken *Tok) {
       // will be modified as unexpanded arguments (as part of the macro call
       // formatting) in the next pass.
       Tok->MacroCtx->Role = MR_UnexpandedArg;
-      // Reset whether spaces are required before this token, as that is context
-      // dependent, and that context may change when formatting the macro call.
-      // For example, given M(x) -> 2 * x, and the macro call M(var),
-      // the token 'var' will have SpacesRequiredBefore = 1 after being
+      // Reset whether spaces or a line break are required before this token, as
+      // that is context dependent, and that context may change when formatting
+      // the macro call.  For example, given M(x) -> 2 * x, and the macro call
+      // M(var), the token 'var' will have SpacesRequiredBefore = 1 after being
       // formatted as part of the expanded macro, but SpacesRequiredBefore = 0
       // for its position within the macro call.
       Tok->SpacesRequiredBefore = 0;
+      if (!Tok->MustBreakBeforeFinalized)
+        Tok->MustBreakBefore = 0;
     } else {
       Tok->Finalized = true;
     }
diff --git a/clang/lib/Format/UnwrappedLineParser.cpp b/clang/lib/Format/UnwrappedLineParser.cpp
index 684609747a5513..942aaff3c456df 100644
--- a/clang/lib/Format/UnwrappedLineParser.cpp
+++ b/clang/lib/Format/UnwrappedLineParser.cpp
@@ -4675,6 +4675,7 @@ void UnwrappedLineParser::readToken(int LevelDifference) {
         conditionalCompilationEnd();
       FormatTok = Tokens->getNextToken();
       FormatTok->MustBreakBefore = true;
+      FormatTok->MustBreakBeforeFinalized = true;
     }
 
     auto IsFirstNonCommentOnLine = [](bool FirstNonCommentOnLine,
@@ -4891,6 +4892,7 @@ void UnwrappedLineParser::pushToken(FormatToken *Tok) {
   Line->Tokens.push_back(UnwrappedLineNode(Tok));
   if (MustBreakBeforeNextToken) {
     Line->Tokens.back().Tok->MustBreakBefore = true;
+    Line->Tokens.back().Tok->MustBreakBeforeFinalized = true;
     MustBreakBeforeNextToken = false;
   }
 }
diff --git a/clang/unittests/Format/FormatTestMacroExpansion.cpp b/clang/unittests/Format/FormatTestMacroExpansion.cpp
index 68250234f4201e..afc88f0eb75673 100644
--- a/clang/unittests/Format/FormatTestMacroExpansion.cpp
+++ b/clang/unittests/Format/FormatTestMacroExpansion.cpp
@@ -255,6 +255,44 @@ TEST_F(FormatTestMacroExpansion,
                          Style);
 }
 
+TEST_F(FormatTestMacroExpansion, CommaAsOperator) {
+  FormatStyle Style = getGoogleStyle();
+  Style.Macros.push_back("ASSIGN_OR_RETURN(a, b, c)=a=(b); if(x) c");
+  verifyFormat(R"(ASSIGN_OR_RETURN(auto reader,
+                 logs::proxy::Reader::NewReader(log_filename, access_options,
+                                                reader_options),
+                 _ << aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa << aaaaaa << aaaaaaaaa
+                   << aaaaaaaaaaaa << aaaaaaaaaaa << aaaaaaaaaaaa);
+)", Style);
+}
+
+TEST_F(FormatTestMacroExpansion, ForcedBreakDiffers) {
+  FormatStyle Style = getGoogleStyle();
+  Style.Macros.push_back("ASSIGN_OR_RETURN(a, b)=a=(b)");
+  EXPECT_EQ(R"(
+//
+ASSIGN_OR_RETURN(const lllllllllllllllllllllllllll xxxxxxxxxxxxxxx,
+                 aaaaaaaaaaaaaaaaaaaaaa(
+                     bbbbbbbbb, ccccccccccccccccccccccccccccc, dddddddddddd));
+)", format(R"(
+//
+ASSIGN_OR_RETURN(const lllllllllllllllllllllllllll xxxxxxxxxxxxxxx,
+                 aaaaaaaaaaaaaaaaaaaaaa(
+                     bbbbbbbbb, ccccccccccccccccccccccccccccc, dddddddddddd));
+)", Style));
+}
+
+TEST_F(FormatTestMacroExpansion, PreferNotBreakingBetweenReturnTypeAndFunction) {
+  FormatStyle Style = getGoogleStyle();
+  Style.Macros.push_back("MOCK_METHOD(r, n, a, s)=r n a s");
+  // In the expanded code, we parse a full function signature, and afterwards
+  // know that we prefer not to break before the function name.
+  verifyFormat(R"(MOCK_METHOD(
+    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa, bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb,
+    (cccccccccccccccccccccccccccccc), (const, override));
+)", Style);
+}
+
 } // namespace
 } // namespace test
 } // namespace format

clang/lib/Format/FormatToken.h Outdated Show resolved Hide resolved
clang/lib/Format/TokenAnnotator.cpp Outdated Show resolved Hide resolved
clang/unittests/Format/FormatTestMacroExpansion.cpp Outdated Show resolved Hide resolved
clang/unittests/Format/FormatTestMacroExpansion.cpp Outdated Show resolved Hide resolved
Copy link
Collaborator Author

@r4nt r4nt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed review comments.

clang/unittests/Format/FormatTestMacroExpansion.cpp Outdated Show resolved Hide resolved
clang/unittests/Format/FormatTestMacroExpansion.cpp Outdated Show resolved Hide resolved
clang/unittests/Format/FormatTestMacroExpansion.cpp Outdated Show resolved Hide resolved
clang/unittests/Format/FormatTestMacroExpansion.cpp Outdated Show resolved Hide resolved
clang/unittests/Format/FormatTestMacroExpansion.cpp Outdated Show resolved Hide resolved
@r4nt r4nt merged commit b7770be into llvm:main Jan 11, 2024
3 of 4 checks passed
@r4nt r4nt deleted the 2023-12-22-fix-bugs branch January 11, 2024 12:29
justinfargnoli pushed a commit to justinfargnoli/llvm-project that referenced this pull request Jan 28, 2024
1. There are multiple calls to addFakeParenthesis; move the guard to not
   assign fake parenthesis into the function to make sure we cover all
   calls.
2. MustBreakBefore can be set on a token in two cases: either during
   unwrapped line parsing, or later, during token annotation. We must
   keep the latter, but reset the former.
3. Added a test to document that the intended behavior of preferring not
   to break between a return type and a function identifier.
   For example, with MOCK_METHOD(r, n, a)=r n a, the code
   MOCK_METHOD(void, f, (int a, int b)) should prefer the same breaks as
   the expanded void f(int a, int b).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants