Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[clang] Fix a crash from nested ArrayInitLoopExpr #67722

Merged
merged 6 commits into from
Sep 29, 2023

Conversation

isuckatcs
Copy link
Member

@isuckatcs isuckatcs commented Sep 28, 2023

For the following snippets clang performs the same steps:

  S s[2][2];

  auto [a1,a2] = s;
void crash() {
  S s[2][2];

  int arr[4];

  arr[0] = [s] { return s[0][0].i; }();
}

From the two however the latter crashes. The source of the crash is that we create a temporary variable for the source array expression, the ArrayInitLoop is performed on.

-ArrayInitLoopExpr ... 'S[2][2]'
 |-OpaqueValueExpr ...
 | `-DeclRefExpr ... 'S[2][2]' lvalue Var ... 's' 'S[2][2]'
 `-ArrayInitLoopExpr ... 'S[2]' 
   |-OpaqueValueExpr ... <-- a temporary is created for this during every iteration
   | `<SourceArray> 
   `-...

The problem is that we are unable to differentiate between the temporaries that are created in the different iterations, as we only use the pointer to the node as the key, which stays the same.

As for why only the second snippet crashes, the reason is that after evaluating the first iteration, we note a failure and would return, however during analyzing the second snippet, the CheckingForUndefinedBehavior flag is set, probably because we assign to an array member and clang wants us to keep going.

This patch changes the described behaviour, as if we keep going the exact same expression would be evaluated again and it would fail again, so there's no point of doing that. A different solution would be to wrap the temporary in a FullExpressionRAII so that the temporary is cleaned up before it is created once again.

Fixes #57135

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Sep 28, 2023
@llvmbot
Copy link
Collaborator

llvmbot commented Sep 28, 2023

@llvm/pr-subscribers-clang

Changes

For the following snippets clang performs the same steps:

  S s[2][2];

  auto [a1,a2] = s;
void crash() {
  S s[2][2];

  int arr[4];

  arr[0] = [s] { return s[0][0].i; }();
}

From the two however the latter crashes. The source of the crash is that we create a temporary variable for the source expression, the ArrayInitLoop is performed on.

-ArrayInitLoopExpr ... 'S[2][2]'
 |-OpaqueValueExpr ...
 | `-DeclRefExpr ... 'S[2][2]' lvalue Var ... 's' 'S[2][2]'
 `-ArrayInitLoopExpr ... 'S[2]' &lt;-- a temporary is created for this during every iteration
   |-...
   | `...
   `-...

The problem is that we are unable to differentiate between the temporaries that are created in the different iterations, as we only use the pointer to the node as the key, which stays the same.

As for why only the second snippet crashes, the reason is that after evaluating the first iteration, we note a failure and would return, however during analyzing the second snippet, the CheckingForUndefinedBehavior flag is set, probably because we assign to an array member and clang wants us to keep going.

This patch changes the described behaviour, as if we keep going the exact same expression would be evaluated again and it would fail again, so there's no point of doing that. A different solution would be to wrap the temporary in a FullExpressionRAII so that the temporary is cleaned up before it is created once again.

Fixes #57135


Full diff: https://github.com/llvm/llvm-project/pull/67722.diff

2 Files Affected:

  • (modified) clang/lib/AST/ExprConstant.cpp (+4-4)
  • (added) clang/test/AST/nested-array-init-loop-in-lambda-capture.cpp (+13)
diff --git a/clang/lib/AST/ExprConstant.cpp b/clang/lib/AST/ExprConstant.cpp
index fea06b97259fe31..3a4ef81672911ba 100644
--- a/clang/lib/AST/ExprConstant.cpp
+++ b/clang/lib/AST/ExprConstant.cpp
@@ -10967,19 +10967,19 @@ bool ArrayExprEvaluator::VisitArrayInitLoopExpr(const ArrayInitLoopExpr *E) {
   LValue Subobject = This;
   Subobject.addArray(Info, E, CAT);
 
-  bool Success = true;
   for (EvalInfo::ArrayInitLoopIndex Index(Info); Index != Elements; ++Index) {
     if (!EvaluateInPlace(Result.getArrayInitializedElt(Index),
                          Info, Subobject, E->getSubExpr()) ||
         !HandleLValueArrayAdjustment(Info, E, Subobject,
                                      CAT->getElementType(), 1)) {
-      if (!Info.noteFailure())
+      
+      // There's no need to try and evaluate the rest, as those are the exact same expressions.
+      std::ignore = Info.noteFailure();
         return false;
-      Success = false;
     }
   }
 
-  return Success;
+  return true;
 }
 
 bool ArrayExprEvaluator::VisitCXXConstructExpr(const CXXConstructExpr *E) {
diff --git a/clang/test/AST/nested-array-init-loop-in-lambda-capture.cpp b/clang/test/AST/nested-array-init-loop-in-lambda-capture.cpp
new file mode 100644
index 000000000000000..82d27380b637d03
--- /dev/null
+++ b/clang/test/AST/nested-array-init-loop-in-lambda-capture.cpp
@@ -0,0 +1,13 @@
+// RUN: %clang_cc1 -fexperimental-new-constant-interpreter -std=c++17 -verify %s
+// RUN: %clang_cc1 -std=c++17 -verify=ref %s
+
+// ref-no-diagnostics
+// expected-no-diagnostics
+
+void used_to_crash() {
+  int s[2][2];
+
+  int arr[4];
+
+  arr[0] = [s] { return s[0][0]; }();
+}

@isuckatcs isuckatcs force-pushed the clang-crash branch 2 times, most recently from 264c0fa to fb0bf2f Compare September 28, 2023 18:54
@github-actions
Copy link

github-actions bot commented Sep 28, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@isuckatcs
Copy link
Member Author

I decided to push the solution where the temporary is put inside a FullExpressionRAII instead of not letting clang keep going in case of an evaluation failure.

@shafik
Copy link
Collaborator

shafik commented Sep 28, 2023

Thank you for the fix!

I don't think I understand the bug based on your description. You say

the reason is that after evaluating the first iteration, we note a failure and would return, however during analyzing the second snippet

You start saying first iteration and then refer to second snippet did you mean second iteration?

You say

and would return

Can you link to the line?

Can you also point to the line we are checking CheckingForUndefinedBehavior?

clang/lib/AST/ExprConstant.cpp Outdated Show resolved Hide resolved
@isuckatcs
Copy link
Member Author

I don't think I understand the bug based on your description.

I'll try my best to explain it a bit better.

So, this is the function that runs and triggers the assertion failure at one point.

bool ArrayExprEvaluator::VisitArrayInitLoopExpr(const ArrayInitLoopExpr *E) {
  LValue CommonLV;
  if (E->getCommonExpr() &&
      !Evaluate(Info.CurrentCall->createTemporary( // assertion failure triggered from createTemporary()
                    E->getCommonExpr(),
                    getStorageType(Info.Ctx, E->getCommonExpr()),
                    ScopeKind::FullExpression, CommonLV),
                Info, E->getCommonExpr()->getSourceExpr()))
    return false;

  auto *CAT = cast<ConstantArrayType>(E->getType()->castAsArrayTypeUnsafe());

  uint64_t Elements = CAT->getSize().getZExtValue();
  Result = APValue(APValue::UninitArray(), Elements, Elements);

  LValue Subobject = This;
  Subobject.addArray(Info, E, CAT);

  bool Success = true;
  for (EvalInfo::ArrayInitLoopIndex Index(Info); Index != Elements; ++Index) {
    if (!EvaluateInPlace(Result.getArrayInitializedElt(Index),
                         Info, Subobject, E->getSubExpr()) ||
        !HandleLValueArrayAdjustment(Info, E, Subobject,
                                     CAT->getElementType(), 1)) {
      if (!Info.noteFailure())
        return false;
      Success = false;
    }
  }
  return Success;
}

Let's see how it runs on the following snippet and how the AST looks like:

  S s[2][2];

  auto [a1,a2] = s; // the ast is only shown for this line
`-DeclStmt ...
  `-DecompositionDecl ... used 'S[2][2]' cinit
    |-ArrayInitLoopExpr... 'S[2][2]'
    | |-OpaqueValueExpr ... 'S[2][2]' lvalue
    | | `-DeclRefExpr ... 'S[2][2]' lvalue Var ... 's' 'S[2][2]'
    | `-ArrayInitLoopExpr ... 'S[2]'
    |   |-OpaqueValueExpr ... 'S[2]' lvalue
    |   | `-ArraySubscriptExpr ... 'S[2]' lvalue
    |   |   |-ImplicitCastExpr ... 'S (*)[2]' <ArrayToPointerDecay>
    |   |   | `-OpaqueValueExpr ... 'S[2][2]' lvalue
    |   |   |   `-DeclRefExpr ... 'S[2][2]' lvalue Var ... 's' 'S[2][2]'
    |   |   `-ArrayInitIndexExpr 'unsigned long'
    |   `-<initializer used for each array member>
    |-BindingDecl ... a1 'S[2]'
    | `-...
    `-BindingDecl ... a2 'S[2]'
      `-...

A quick anatomy guide for ArrayInitLoopExpr:

ArrayInitLoopExpr
|-OpaqueValueExpr // getCommonExpr() returns this
| `-DeclRefExpr <source_array> // always wrapped inside an OpaqueValueExpr
`-<initializer_used_for_each_element> // getSubExpr() returns this
  `-ArrayInitIndexExpr // a placeholder node denoting the "current" index of the element being initialized

For more details please check the clang documentation of ArrayInitLoopExpr and ArrayInitIndexExpr

So, let's see the step by step execution of VisitArrayInitLoopExpr().

-ArrayInitLoopExpr... 'S[2][2]' // <-- we are here
 |-OpaqueValueExpr ... 'S[2][2]' lvalue
 | `-DeclRefExpr ... 'S[2][2]' lvalue Var ... 's' 'S[2][2]'
 `-ArrayInitLoopExpr ... 'S[2]'
  1. We create a temporary region for the OpaqueValueExpr, returned by E->getCommonExpr() and try to evaluate the DeclRefExpr inside this temporary.
if (
    E->getCommonExpr() &&
    !Evaluate(Info.CurrentCall->createTemporary(E->getCommonExpr(), ...), Info, E->getCommonExpr()->getSourceExpr())
)
    return false;

I say we create it for the OpaqueValueExpr, because that will be the key of the region inside the stack frame.

// Key is passed from `E->getCommonExpr()`.
template<typename KeyT>
APValue &CallStackFrame::createTemporary(const KeyT *Key, QualType T,
                                         ScopeKind Scope, LValue &LV) {
  unsigned Version = getTempVersion();
  APValue::LValueBase Base(Key, Index, Version);
  LV.set(Base);
  return createLocal(Base, Key, T, Scope);
}
APValue &CallStackFrame::createLocal(APValue::LValueBase Base, const void *Key, ...) {
  ...
  unsigned Version = Base.getVersion();
  APValue &Result = Temporaries[MapKeyTy(Key, Version)];
  assert(Result.isAbsent() && "local created multiple times");
  ...
}

So, from what we know, currently the stack frame looks like this:

{
  OpaqueValueExpr ... 'S[2][2]' (temporary)
}
  1. We start iterating over the initializer that is used for each element, returned by getSubExpr() as many times, as many elements the source array has and attempt to analyze it.
for (EvalInfo::ArrayInitLoopIndex Index(Info); Index != Elements; ++Index) {
    if (!EvaluateInPlace(Result.getArrayInitializedElt(Index),
                         Info, Subobject, E->getSubExpr()) ||
        !HandleLValueArrayAdjustment(Info, E, Subobject,
                                     CAT->getElementType(), 1)) {
      if (!Info.noteFailure())
        return false;
      Success = false;
    }
  }
  1. Since in the current case, this initializer also happens to be an ArrayInitLoopExpr, we go back to step 1 again.
-ArrayInitLoopExpr... 'S[2][2]'
 |-OpaqueValueExpr ... 'S[2][2]' lvalue
 | `-DeclRefExpr ... 'S[2][2]' lvalue Var ... 's' 'S[2][2]'
 `-ArrayInitLoopExpr ... 'S[2]' // <-- we are here
   |-OpaqueValueExpr ... 'S[2]' lvalue // <-- temporary created for this
   | `-ArraySubscriptExpr ... 'S[2]' lvalue
   |   |-ImplicitCastExpr ... 'S (*)[2]' <ArrayToPointerDecay>
   |   | `-OpaqueValueExpr ... 'S[2][2]' lvalue
   |   |   `-DeclRefExpr ... 'S[2][2]' lvalue Var ... 's' 'S[2][2]'
   |   `-ArrayInitIndexExpr 'unsigned long'
   `-<initializer used for each array member>

After creating the the new temporary, the stack frame looks like this:

{
  OpaqueValueExpr ... 'S[2]' (temporary)
  OpaqueValueExpr ... 'S[2][2]' (temporary)
}

Now let's pretend that the loop presented in step 2 finished execution inside ArrayInitLoopExpr ... 'S[2]' and jump back to the point where we were before step 3.

  1. We finished evaluating the ArrayInitLoopExpr ... 'S[2]' once, so !Info.noteFailure() is checked.
-ArrayInitLoopExpr... 'S[2][2]' // <-- we are here
 |-OpaqueValueExpr ... 'S[2][2]' lvalue
 | `-DeclRefExpr ... 'S[2][2]' lvalue Var ... 's' 'S[2][2]'
 `-ArrayInitLoopExpr ... 'S[2]' // <-- we evaluated this in step 3
for (EvalInfo::ArrayInitLoopIndex Index(Info); Index != Elements; ++Index) {
    if (!EvaluateInPlace(Result.getArrayInitializedElt(Index),
                         Info, Subobject, E->getSubExpr()) ||
        !HandleLValueArrayAdjustment(Info, E, Subobject,
                                     CAT->getElementType(), 1)) {
      if (!Info.noteFailure()) // <-- we are here
        return false;
      Success = false;
    }
  }

In the snippet that we're currently analyzing, Info.noteFailure() will return false, so we don't loop over the same Expr once again (note that the array has 2 elements and because of that the loop would iterate twice), but return false immediately, so there is no assertion failure.

Note that the stack frame still looks like this:

{
  OpaqueValueExpr ... 'S[2]' (temporary) <-- yes, this is not cleaned up
  OpaqueValueExpr ... 'S[2][2]' (temporary)
}

Now let's see the snippet that hits the assertion failure and why it actually happens, although I think I spoiled it already.

S s[2][2];

int res[4];

res[0] = [s] { return s[0][0].i; }();

Everything up until step 4 is exactly the same as it was with the previous snippet. The difference comes in step 4, where !Info.noteFailure() return true, so we attempt to evaluate the same ArrayInitLoopExpr ... 'S[2]' once again.

Remember that when we enter that Expr, we are at step 1 and create a temporary, however as I noted previously an OpaqueValueExpr ... 'S[2]' (temporary) is already in the stack frame and we attempt to create it once again since the key expression is the same. Creating the same local variable twice is not allowed, so we fail the assertion.

APValue &CallStackFrame::createLocal(...) {
  ...
  unsigned Version = Base.getVersion();
  APValue &Result = Temporaries[MapKeyTy(Key, Version)];
  assert(Result.isAbsent() && "local created multiple times"); // <- it is not absent
  ...
}

And the reason for why !Info.noteFailure() returns true for this snippet is that CheckingForUndefinedBehavior is true, which is returned like this:

    [[nodiscard]] bool noteFailure() {
      bool KeepGoing = keepEvaluatingAfterFailure();
      EvalStatus.HasSideEffects |= KeepGoing;
      return KeepGoing;
    }
    bool keepEvaluatingAfterFailure() const override {
      if (!StepsLeft)
        return false;

      switch (EvalMode) {
      case EM_ConstantExpression:
      case EM_ConstantExpressionUnevaluated:
      case EM_ConstantFold:
      case EM_IgnoreSideEffects:
        return checkingPotentialConstantExpression() ||
               checkingForUndefinedBehavior();
      }
      llvm_unreachable("Missed EvalMode case");
    }
    /// Are we checking an expression for overflow?
    // FIXME: We should check for any kind of undefined or suspicious behavior
    // in such constructs, not just overflow.
    bool checkingForUndefinedBehavior() const override {
      return CheckingForUndefinedBehavior;
    }

I didn't investigate where this gets set to true or should it be checked at all, since I thought the root cause of the problem was that the temporary is not cleaned up, so I wanted to address that.

Copy link
Collaborator

@zygoloid zygoloid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ExprConstant changes LGTM. I don't have state on the Interp changes; someone else should review that.

@tbaederr
Copy link
Contributor

Remove the Interp/ changes please, I'll have a look at that later.

@isuckatcs
Copy link
Member Author

Removed Interp/ changes.

@tbaederr
Copy link
Contributor

I've pushed 38018ec since, so you could remove the CUR_INTERP stuff from there and get a new test case for free.

@cor3ntin
Copy link
Contributor

Does this fixes #67317? #67688?
If so, might as well add the test cases and mark these as being fixed too.

You are going to need at add a release note in clang/docs/ReleaseNotes.rst

When visiting an ArrayInitLoopExpr, clang creates a temporary
variable for the source array, however in a nested AILE this
variable is created multiple times, in every iteration as they
have the same AST node and clang is unable to differentiate them.

Fixes llvm#57135
@isuckatcs
Copy link
Member Author

It doesn't fix #67317 and #67688 is already marked as a duplicate of this and closed. I rebased and removed CUR_INTERP and documented the changes in ReleaseNotes.rst.

Copy link
Contributor

@cor3ntin cor3ntin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM except for a small nit in the changelog

@isuckatcs isuckatcs merged commit c496aa3 into llvm:main Sep 29, 2023
3 checks passed
Copy link
Collaborator

@shafik shafik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the post commit review, just have one question.

@@ -10977,6 +10987,9 @@ bool ArrayExprEvaluator::VisitArrayInitLoopExpr(const ArrayInitLoopExpr *E) {
return false;
Success = false;
}

// Make sure we run the destructors too.
Scope.destroy();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we don't have to check the return value? It looks like every other use does not discard the return value.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The result of the initializer expression should be placed inside a temporary array, which is created on line 10967. We only destroy whatever is left on the stack and we shouldn't need anything from there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[clang] Assertion failure in case of nested ArrayInitLoopExpr using -std=c++17
6 participants