Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[clang][dataflow] Simplify flow conditions displayed in HTMLLogger. #70848

Merged
merged 2 commits into from
Nov 7, 2023

Conversation

martinboehme
Copy link
Contributor

This can make the flow condition significantly easier to interpret; see below
for an example.

I had hoped that adding the simplification as a preprocessing step before the
SAT solver (in DataflowAnalysisContext::querySolver()) would also speed up SAT
solving and maybe even eliminate SAT solver timeouts, but in my testing, this
actually turns out to be a pessimization. It appears that these simplifications
are easy enough for the SAT solver to perform itself.

Nevertheless, the improvement in debugging alone makes this a worthwhile change.

Example of flow condition output with these changes:

Flow condition token: V37
Constraints:
(V16 = (((V15 & (V19 = V12)) & V22) & V25))
(V15 = ((V12 & ((V14 = V9) | (V14 = V4))) & (V13 = V14)))
True atoms: (V0, V1, V2, V5, V6, V7, V29, V30, V32, V34, V35, V37)
False atoms: (V3, V8, V17)
Equivalent atoms:
(V11, V15)

Flow condition constraints before simplification:
V37
((!V3 & !V8) & !V17)
(V37 = V34)
(V34 = (V29 & (V35 = V30)))
(V29 = (((V16 | V2) & V32) & (V30 = V32)))
(V16 = (((V15 & (V19 = V12)) & V22) & V25))
(V15 = V11)
(V11 = ((((V7 | V2) & V12) & ((V7 & (V14 = V9)) | (V2 & (V14 = V4)))) & (V13 = V14)))
(V2 = V1)
(V1 = V0)
V0
(V7 = V6)
(V6 = V5)
(V5 = V2)

This can make the flow condition significantly easier to interpret; see below
for an example.

I had hoped that adding the simplification as a preprocessing step before the
SAT solver (in `DataflowAnalysisContext::querySolver()`) would also speed up SAT
solving and maybe even eliminate SAT solver timeouts, but in my testing, this
actually turns out to be a pessimization. It appears that these simplifications
are easy enough for the SAT solver to perform itself.

Nevertheless, the improvement in debugging alone makes this a worthwhile change.

Example of flow condition output with these changes:

```
Flow condition token: V37
Constraints:
(V16 = (((V15 & (V19 = V12)) & V22) & V25))
(V15 = ((V12 & ((V14 = V9) | (V14 = V4))) & (V13 = V14)))
True atoms: (V0, V1, V2, V5, V6, V7, V29, V30, V32, V34, V35, V37)
False atoms: (V3, V8, V17)
Equivalent atoms:
(V11, V15)

Flow condition constraints before simplification:
V37
((!V3 & !V8) & !V17)
(V37 = V34)
(V34 = (V29 & (V35 = V30)))
(V29 = (((V16 | V2) & V32) & (V30 = V32)))
(V16 = (((V15 & (V19 = V12)) & V22) & V25))
(V15 = V11)
(V11 = ((((V7 | V2) & V12) & ((V7 & (V14 = V9)) | (V2 & (V14 = V4)))) & (V13 = V14)))
(V2 = V1)
(V1 = V0)
V0
(V7 = V6)
(V6 = V5)
(V5 = V2)
```
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:dataflow Clang Dataflow Analysis framework - https://clang.llvm.org/docs/DataFlowAnalysisIntro.html clang:analysis labels Oct 31, 2023
@llvmbot
Copy link
Collaborator

llvmbot commented Oct 31, 2023

@llvm/pr-subscribers-clang-analysis

@llvm/pr-subscribers-clang

Author: None (martinboehme)

Changes

This can make the flow condition significantly easier to interpret; see below
for an example.

I had hoped that adding the simplification as a preprocessing step before the
SAT solver (in DataflowAnalysisContext::querySolver()) would also speed up SAT
solving and maybe even eliminate SAT solver timeouts, but in my testing, this
actually turns out to be a pessimization. It appears that these simplifications
are easy enough for the SAT solver to perform itself.

Nevertheless, the improvement in debugging alone makes this a worthwhile change.

Example of flow condition output with these changes:

Flow condition token: V37
Constraints:
(V16 = (((V15 & (V19 = V12)) & V22) & V25))
(V15 = ((V12 & ((V14 = V9) | (V14 = V4))) & (V13 = V14)))
True atoms: (V0, V1, V2, V5, V6, V7, V29, V30, V32, V34, V35, V37)
False atoms: (V3, V8, V17)
Equivalent atoms:
(V11, V15)

Flow condition constraints before simplification:
V37
((!V3 & !V8) & !V17)
(V37 = V34)
(V34 = (V29 & (V35 = V30)))
(V29 = (((V16 | V2) & V32) & (V30 = V32)))
(V16 = (((V15 & (V19 = V12)) & V22) & V25))
(V15 = V11)
(V11 = ((((V7 | V2) & V12) & ((V7 & (V14 = V9)) | (V2 & (V14 = V4)))) & (V13 = V14)))
(V2 = V1)
(V1 = V0)
V0
(V7 = V6)
(V6 = V5)
(V5 = V2)

Patch is 20.36 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/70848.diff

9 Files Affected:

  • (added) clang/include/clang/Analysis/FlowSensitive/SimplifyConstraints.h (+55)
  • (modified) clang/lib/Analysis/FlowSensitive/CMakeLists.txt (+1)
  • (modified) clang/lib/Analysis/FlowSensitive/DataflowAnalysisContext.cpp (+39-1)
  • (modified) clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp (+1-1)
  • (added) clang/lib/Analysis/FlowSensitive/SimplifyConstraints.cpp (+183)
  • (modified) clang/unittests/Analysis/FlowSensitive/CMakeLists.txt (+1)
  • (added) clang/unittests/Analysis/FlowSensitive/SimplifyConstraintsTest.cpp (+177)
  • (modified) llvm/utils/gn/secondary/clang/lib/Analysis/FlowSensitive/BUILD.gn (+1)
  • (modified) llvm/utils/gn/secondary/clang/unittests/Analysis/FlowSensitive/BUILD.gn (+1)
diff --git a/clang/include/clang/Analysis/FlowSensitive/SimplifyConstraints.h b/clang/include/clang/Analysis/FlowSensitive/SimplifyConstraints.h
new file mode 100644
index 000000000000000..a6068d14df33c7d
--- /dev/null
+++ b/clang/include/clang/Analysis/FlowSensitive/SimplifyConstraints.h
@@ -0,0 +1,55 @@
+//===-- SimplifyConstraints.h -----------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+//  This file defines a DataflowAnalysisContext class that owns objects that
+//  encompass the state of a program and stores context that is used during
+//  dataflow analysis.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_CLANG_ANALYSIS_FLOWSENSITIVE_SIMPLIFYCONSTRAINTS_H
+#define LLVM_CLANG_ANALYSIS_FLOWSENSITIVE_SIMPLIFYCONSTRAINTS_H
+
+#include "clang/Analysis/FlowSensitive/Arena.h"
+#include "clang/Analysis/FlowSensitive/Formula.h"
+#include "llvm/ADT/SetVector.h"
+
+namespace clang {
+namespace dataflow {
+
+/// Information on the way a set of constraints was simplified.
+struct SimplifyConstraintsInfo {
+  /// List of equivalence classes of atoms. For each equivalence class, the
+  /// original constraints imply that all atoms in it must be equivalent.
+  /// Simplification replaces all occurrences of atoms in an equivalence class
+  /// with a single representative atom from the class.
+  /// Does not contain equivalence classes with just one member or atoms
+  /// contained in `TrueAtoms` or `FalseAtoms`.
+  llvm::SmallVector<llvm::SmallVector<Atom>> EquivalentAtoms;
+  /// Atoms that the original constraints imply must be true.
+  /// Simplification replaces all occurrences of these atoms by a true literal
+  /// (which may enable additional simplifications).
+  llvm::SmallVector<Atom> TrueAtoms;
+  /// Atoms that the original constraints imply must be false.
+  /// Simplification replaces all occurrences of these atoms by a false literal
+  /// (which may enable additional simplifications).
+  llvm::SmallVector<Atom> FalseAtoms;
+};
+
+/// Simplifies a set of constraints (implicitly connected by "and") in a way
+/// that does not change satisfiability of the constraints. This does _not_ mean
+/// that the set of solutions is the same before and after simplification.
+/// `Info`, if non-null, will be populated with information about the
+/// simplifications that were made to the formula (e.g. to display to the user).
+void simplifyConstraints(llvm::SetVector<const Formula *> &Constraints,
+                         Arena &arena, SimplifyConstraintsInfo *Info = nullptr);
+
+} // namespace dataflow
+} // namespace clang
+
+#endif // LLVM_CLANG_ANALYSIS_FLOWSENSITIVE_SIMPLIFYCONSTRAINTS_H
diff --git a/clang/lib/Analysis/FlowSensitive/CMakeLists.txt b/clang/lib/Analysis/FlowSensitive/CMakeLists.txt
index 3394c9f0299e414..5af4ecfc9efa5da 100644
--- a/clang/lib/Analysis/FlowSensitive/CMakeLists.txt
+++ b/clang/lib/Analysis/FlowSensitive/CMakeLists.txt
@@ -7,6 +7,7 @@ add_clang_library(clangAnalysisFlowSensitive
   HTMLLogger.cpp
   Logger.cpp
   RecordOps.cpp
+  SimplifyConstraints.cpp
   Transfer.cpp
   TypeErasedDataflowAnalysis.cpp
   Value.cpp
diff --git a/clang/lib/Analysis/FlowSensitive/DataflowAnalysisContext.cpp b/clang/lib/Analysis/FlowSensitive/DataflowAnalysisContext.cpp
index 40de6cdf3a69e28..9c15c8756e9dc15 100644
--- a/clang/lib/Analysis/FlowSensitive/DataflowAnalysisContext.cpp
+++ b/clang/lib/Analysis/FlowSensitive/DataflowAnalysisContext.cpp
@@ -17,6 +17,7 @@
 #include "clang/Analysis/FlowSensitive/DebugSupport.h"
 #include "clang/Analysis/FlowSensitive/Formula.h"
 #include "clang/Analysis/FlowSensitive/Logger.h"
+#include "clang/Analysis/FlowSensitive/SimplifyConstraints.h"
 #include "clang/Analysis/FlowSensitive/Value.h"
 #include "llvm/ADT/SetOperations.h"
 #include "llvm/ADT/SetVector.h"
@@ -205,13 +206,50 @@ void DataflowAnalysisContext::addTransitiveFlowConditionConstraints(
   }
 }
 
+static void printAtomList(const llvm::SmallVector<Atom> &Atoms,
+                          llvm::raw_ostream &OS) {
+  OS << "(";
+  for (size_t i = 0; i < Atoms.size(); ++i) {
+    OS << Atoms[i];
+    if (i + 1 < Atoms.size())
+      OS << ", ";
+  }
+  OS << ")\n";
+}
+
 void DataflowAnalysisContext::dumpFlowCondition(Atom Token,
                                                 llvm::raw_ostream &OS) {
   llvm::SetVector<const Formula *> Constraints;
   Constraints.insert(&arena().makeAtomRef(Token));
   addTransitiveFlowConditionConstraints(Token, Constraints);
 
-  for (const auto *Constraint : Constraints) {
+  OS << "Flow condition token: " << Token << "\n";
+  SimplifyConstraintsInfo Info;
+  llvm::SetVector<const Formula *> OriginalConstraints = Constraints;
+  simplifyConstraints(Constraints, arena(), &Info);
+  if (!Constraints.empty()) {
+    OS << "Constraints:\n";
+    for (const auto *Constraint : Constraints) {
+      Constraint->print(OS);
+      OS << "\n";
+    }
+  }
+  if (!Info.TrueAtoms.empty()) {
+    OS << "True atoms: ";
+    printAtomList(Info.TrueAtoms, OS);
+  }
+  if (!Info.FalseAtoms.empty()) {
+    OS << "False atoms: ";
+    printAtomList(Info.FalseAtoms, OS);
+  }
+  if (!Info.EquivalentAtoms.empty()) {
+    OS << "Equivalent atoms:\n";
+    for (const llvm::SmallVector<Atom> Class : Info.EquivalentAtoms)
+      printAtomList(Class, OS);
+  }
+
+  OS << "\nFlow condition constraints before simplification:\n";
+  for (const auto *Constraint : OriginalConstraints) {
     Constraint->print(OS);
     OS << "\n";
   }
diff --git a/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp b/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
index cf1e9eb7b1fd7f2..2b000eb7b370868 100644
--- a/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
+++ b/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
@@ -961,7 +961,7 @@ void Environment::dump(raw_ostream &OS) const {
     OS << "  [" << L << ", " << V << ": " << *V << "]\n";
   }
 
-  OS << "FlowConditionToken:\n";
+  OS << "\n";
   DACtx->dumpFlowCondition(FlowConditionToken, OS);
 }
 
diff --git a/clang/lib/Analysis/FlowSensitive/SimplifyConstraints.cpp b/clang/lib/Analysis/FlowSensitive/SimplifyConstraints.cpp
new file mode 100644
index 000000000000000..d43b008c59c4dff
--- /dev/null
+++ b/clang/lib/Analysis/FlowSensitive/SimplifyConstraints.cpp
@@ -0,0 +1,183 @@
+//===-- SimplifyConstraints.cpp ---------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+//  This file defines a DataflowAnalysisContext class that owns objects that
+//  encompass the state of a program and stores context that is used during
+//  dataflow analysis.
+//
+//===----------------------------------------------------------------------===//
+
+#include "clang/Analysis/FlowSensitive/SimplifyConstraints.h"
+#include "llvm/ADT/EquivalenceClasses.h"
+
+namespace clang {
+namespace dataflow {
+
+// Substitute all occurrences of a given atom in `F` by a given formula and
+// returns the resulting formula.
+static const Formula &
+substitute(const Formula &F,
+           const llvm::DenseMap<Atom, const Formula *> &Substitutions,
+           Arena &arena) {
+  switch (F.kind()) {
+  case Formula::AtomRef:
+    if (auto iter = Substitutions.find(F.getAtom());
+        iter != Substitutions.end())
+      return *iter->second;
+    return F;
+  case Formula::Literal:
+    return F;
+  case Formula::Not:
+    return arena.makeNot(substitute(*F.operands()[0], Substitutions, arena));
+  case Formula::And:
+    return arena.makeAnd(substitute(*F.operands()[0], Substitutions, arena),
+                         substitute(*F.operands()[1], Substitutions, arena));
+  case Formula::Or:
+    return arena.makeOr(substitute(*F.operands()[0], Substitutions, arena),
+                        substitute(*F.operands()[1], Substitutions, arena));
+  case Formula::Implies:
+    return arena.makeImplies(
+        substitute(*F.operands()[0], Substitutions, arena),
+        substitute(*F.operands()[1], Substitutions, arena));
+  case Formula::Equal:
+    return arena.makeEquals(substitute(*F.operands()[0], Substitutions, arena),
+                            substitute(*F.operands()[1], Substitutions, arena));
+  }
+}
+
+// Returns the result of replacing atoms in `Atoms` with the leader of their
+// equivalence class in `EquivalentAtoms`.
+// Atoms that don't have an equivalence class in `EquivalentAtoms` are inserted
+// into it as single-member equivalence classes.
+static llvm::DenseSet<Atom>
+projectToLeaders(const llvm::DenseSet<Atom> &Atoms,
+                 llvm::EquivalenceClasses<Atom> &EquivalentAtoms) {
+  llvm::DenseSet<Atom> Result;
+
+  for (Atom Atom : Atoms)
+    Result.insert(EquivalentAtoms.getOrInsertLeaderValue(Atom));
+
+  return Result;
+}
+
+// Returns the atoms in the equivalence class for the leader identified by
+// `LeaderIt`.
+static llvm::SmallVector<Atom>
+atomsInEquivalenceClass(const llvm::EquivalenceClasses<Atom> &EquivalentAtoms,
+                        llvm::EquivalenceClasses<Atom>::iterator LeaderIt) {
+  llvm::SmallVector<Atom> Result;
+  for (auto MemberIt = EquivalentAtoms.member_begin(LeaderIt);
+       MemberIt != EquivalentAtoms.member_end(); ++MemberIt)
+    Result.push_back(*MemberIt);
+  return Result;
+}
+
+void simplifyConstraints(llvm::SetVector<const Formula *> &Constraints,
+                         Arena &arena, SimplifyConstraintsInfo *Info) {
+  auto contradiction = [&]() {
+    Constraints.clear();
+    Constraints.insert(&arena.makeLiteral(false));
+  };
+
+  llvm::EquivalenceClasses<Atom> EquivalentAtoms;
+  llvm::DenseSet<Atom> TrueAtoms;
+  llvm::DenseSet<Atom> FalseAtoms;
+
+  while (true) {
+    for (const auto *Constraint : Constraints) {
+      switch (Constraint->kind()) {
+      case Formula::AtomRef:
+        TrueAtoms.insert(Constraint->getAtom());
+        break;
+      case Formula::Not:
+        if (Constraint->operands()[0]->kind() == Formula::AtomRef)
+          FalseAtoms.insert(Constraint->operands()[0]->getAtom());
+        break;
+      case Formula::Equal:
+        if (Constraint->operands()[0]->kind() == Formula::AtomRef &&
+            Constraint->operands()[1]->kind() == Formula::AtomRef) {
+          EquivalentAtoms.unionSets(Constraint->operands()[0]->getAtom(),
+                                    Constraint->operands()[1]->getAtom());
+        }
+        break;
+      default:
+        break;
+      }
+    }
+
+    TrueAtoms = projectToLeaders(TrueAtoms, EquivalentAtoms);
+    FalseAtoms = projectToLeaders(FalseAtoms, EquivalentAtoms);
+
+    llvm::DenseMap<Atom, const Formula *> Substitutions;
+    for (auto It = EquivalentAtoms.begin(); It != EquivalentAtoms.end(); ++It) {
+      Atom TheAtom = It->getData();
+      Atom Leader = EquivalentAtoms.getLeaderValue(TheAtom);
+      if (TrueAtoms.contains(Leader)) {
+        if (FalseAtoms.contains(Leader)) {
+          contradiction();
+          return;
+        }
+        Substitutions.insert({TheAtom, &arena.makeLiteral(true)});
+      } else if (FalseAtoms.contains(Leader)) {
+        Substitutions.insert({TheAtom, &arena.makeLiteral(false)});
+      } else if (TheAtom != Leader) {
+        Substitutions.insert({TheAtom, &arena.makeAtomRef(Leader)});
+      }
+    }
+
+    llvm::SetVector<const Formula *> NewConstraints;
+    for (const auto *Constraint : Constraints) {
+      const Formula &NewConstraint =
+          substitute(*Constraint, Substitutions, arena);
+      if (&NewConstraint == &arena.makeLiteral(true))
+        continue;
+      if (&NewConstraint == &arena.makeLiteral(false)) {
+        contradiction();
+        return;
+      }
+      if (NewConstraint.kind() == Formula::And) {
+        NewConstraints.insert(NewConstraint.operands()[0]);
+        NewConstraints.insert(NewConstraint.operands()[1]);
+        continue;
+      }
+      NewConstraints.insert(&NewConstraint);
+    }
+
+    if (NewConstraints == Constraints)
+      break;
+    Constraints = NewConstraints;
+  }
+
+  if (Info) {
+    for (auto It = EquivalentAtoms.begin(), End = EquivalentAtoms.end();
+         It != End; ++It) {
+      if (!It->isLeader())
+        continue;
+      Atom At = *EquivalentAtoms.findLeader(It);
+      if (TrueAtoms.contains(At) || FalseAtoms.contains(At))
+        continue;
+      llvm::SmallVector<Atom> Atoms =
+          atomsInEquivalenceClass(EquivalentAtoms, It);
+      if (Atoms.size() == 1)
+        continue;
+      std::sort(Atoms.begin(), Atoms.end());
+      Info->EquivalentAtoms.push_back(std::move(Atoms));
+    }
+    for (Atom At : TrueAtoms)
+      Info->TrueAtoms.append(atomsInEquivalenceClass(
+          EquivalentAtoms, EquivalentAtoms.findValue(At)));
+    std::sort(Info->TrueAtoms.begin(), Info->TrueAtoms.end());
+    for (Atom At : FalseAtoms)
+      Info->FalseAtoms.append(atomsInEquivalenceClass(
+          EquivalentAtoms, EquivalentAtoms.findValue(At)));
+    std::sort(Info->FalseAtoms.begin(), Info->FalseAtoms.end());
+  }
+}
+
+} // namespace dataflow
+} // namespace clang
diff --git a/clang/unittests/Analysis/FlowSensitive/CMakeLists.txt b/clang/unittests/Analysis/FlowSensitive/CMakeLists.txt
index a9c07d930cdd071..94160d949637cfa 100644
--- a/clang/unittests/Analysis/FlowSensitive/CMakeLists.txt
+++ b/clang/unittests/Analysis/FlowSensitive/CMakeLists.txt
@@ -17,6 +17,7 @@ add_clang_unittest(ClangAnalysisFlowSensitiveTests
   MultiVarConstantPropagationTest.cpp
   RecordOpsTest.cpp
   SignAnalysisTest.cpp
+  SimplifyConstraintsTest.cpp
   SingleVarConstantPropagationTest.cpp
   SolverTest.cpp
   TestingSupport.cpp
diff --git a/clang/unittests/Analysis/FlowSensitive/SimplifyConstraintsTest.cpp b/clang/unittests/Analysis/FlowSensitive/SimplifyConstraintsTest.cpp
new file mode 100644
index 000000000000000..1f34ae076d5ed51
--- /dev/null
+++ b/clang/unittests/Analysis/FlowSensitive/SimplifyConstraintsTest.cpp
@@ -0,0 +1,177 @@
+//===- unittests/Analysis/FlowSensitive/SimplifyConstraintsTest.cpp -------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "clang/Analysis/FlowSensitive/SimplifyConstraints.h"
+#include "TestingSupport.h"
+#include "clang/Analysis/FlowSensitive/Arena.h"
+#include "gmock/gmock.h"
+#include "gtest/gtest.h"
+
+namespace {
+
+using namespace clang;
+using namespace dataflow;
+
+using ::testing::ElementsAre;
+using ::testing::IsEmpty;
+
+class SimplifyConstraintsTest : public ::testing::Test {
+protected:
+  llvm::SetVector<const Formula *> parse(StringRef Lines) {
+    std::vector<const Formula *> formulas = test::parseFormulas(A, Lines);
+    llvm::SetVector<const Formula *> Constraints(formulas.begin(),
+                                                 formulas.end());
+    return Constraints;
+  }
+
+  llvm::SetVector<const Formula *> simplify(StringRef Lines,
+                                            SimplifyConstraintsInfo &Info) {
+    llvm::SetVector<const Formula *> Constraints = parse(Lines);
+    simplifyConstraints(Constraints, A, &Info);
+    return Constraints;
+  }
+
+  Arena A;
+};
+
+void printConstraints(const llvm::SetVector<const Formula *> &Constraints,
+                      raw_ostream &OS) {
+  if (Constraints.empty()) {
+    OS << "empty";
+    return;
+  }
+  for (const auto *Constraint : Constraints) {
+    Constraint->print(OS);
+    OS << "\n";
+  }
+}
+
+std::string
+constraintsToString(const llvm::SetVector<const Formula *> &Constraints) {
+  std::string Str;
+  llvm::raw_string_ostream OS(Str);
+  printConstraints(Constraints, OS);
+  return Str;
+}
+
+MATCHER_P(EqualsConstraints, Constraints,
+          "constraints are: " + constraintsToString(Constraints)) {
+  if (arg == Constraints)
+    return true;
+
+  if (result_listener->stream()) {
+    llvm::raw_os_ostream OS(*result_listener->stream());
+    OS << "constraints are: ";
+    printConstraints(arg, OS);
+  }
+  return false;
+}
+
+TEST_F(SimplifyConstraintsTest, TriviallySatisfiable) {
+  SimplifyConstraintsInfo Info;
+  EXPECT_THAT(simplify(R"(
+     V0
+  )",
+                       Info),
+              EqualsConstraints(parse("")));
+  EXPECT_THAT(Info.EquivalentAtoms, IsEmpty());
+  EXPECT_THAT(Info.TrueAtoms, ElementsAre(Atom(0)));
+  EXPECT_THAT(Info.FalseAtoms, IsEmpty());
+}
+
+TEST_F(SimplifyConstraintsTest, SimpleContradiction) {
+  SimplifyConstraintsInfo Info;
+  EXPECT_THAT(simplify(R"(
+     V0
+     !V0
+  )",
+                       Info),
+              EqualsConstraints(parse("false")));
+  EXPECT_THAT(Info.EquivalentAtoms, IsEmpty());
+  EXPECT_THAT(Info.TrueAtoms, IsEmpty());
+  EXPECT_THAT(Info.FalseAtoms, IsEmpty());
+}
+
+TEST_F(SimplifyConstraintsTest, ContradictionThroughEquivalence) {
+  SimplifyConstraintsInfo Info;
+  EXPECT_THAT(simplify(R"(
+     (V0 = V1)
+     V0
+     !V1
+  )",
+                       Info),
+              EqualsConstraints(parse("false")));
+  EXPECT_THAT(Info.EquivalentAtoms, IsEmpty());
+  EXPECT_THAT(Info.TrueAtoms, IsEmpty());
+  EXPECT_THAT(Info.FalseAtoms, IsEmpty());
+}
+
+TEST_F(SimplifyConstraintsTest, EquivalenceChain) {
+  SimplifyConstraintsInfo Info;
+  EXPECT_THAT(simplify(R"(
+     (V0 | V3)
+     (V1 = V2)
+     (V2 = V3)
+  )",
+                       Info),
+              EqualsConstraints(parse("(V0 | V1)")));
+  EXPECT_THAT(Info.EquivalentAtoms,
+              ElementsAre(ElementsAre(Atom(1), Atom(2), Atom(3))));
+  EXPECT_THAT(Info.TrueAtoms, IsEmpty());
+  EXPECT_THAT(Info.FalseAtoms, IsEmpty());
+}
+
+TEST_F(SimplifyConstraintsTest, TrueAndFalseAtomsSimplifyOtherExpressions) {
+  SimplifyConstraintsInfo Info;
+  EXPECT_THAT(simplify(R"(
+    V0
+    !V1
+    (V0 & (V2 => V3))
+    (V1 | (V4 => V5))
+  )",
+                       Info),
+              EqualsConstraints(parse(R"(
+    (V2 => V3)
+    (V4 => V5)
+  )")));
+  EXPECT_THAT(Info.EquivalentAtoms, IsEmpty());
+  EXPECT_THAT(Info.TrueAtoms, ElementsAre(Atom(0)));
+  EXPECT_THAT(Info.FalseAtoms, ElementsAre(Atom(1)));
+}
+
+TEST_F(SimplifyConstraintsTest, TrueAtomUnlocksEquivalenceChain) {
+  SimplifyConstraintsInfo Info;
+  EXPECT_THAT(simplify(R"(
+     V0
+     (V0 & (V1 = V2))
+     (V0 & (V2 = V3))
+  )",
+                       Info),
+              EqualsConstraints(parse("")));
+  EXPECT_THAT(Info.EquivalentAtoms,
+              ElementsAre(ElementsAre(Atom(1), Atom(2), Atom(3))));
+  EXPECT_THAT(Info.TrueAtoms, ElementsAre(Atom(0)));
+  EXPECT_THAT(Info.FalseAtoms, IsEmpty());
+}
+
+TEST_F(SimplifyConstraintsTest, TopLevelAndSplitIntoMultipleConstraints) {
+  SimplifyConstraintsInfo Info;
+  EXPECT_THAT(simplify(R"(
+     ((V0 => V1) & (V2 => V3))
+  )",
+                       Info),
+              EqualsConstraints(parse(R"(
+    (V0 => V1)
+    (V2 => V3)
+  )")));
+  EXPECT_THAT(Info.EquivalentAtoms, IsEmpty());
+  EXPECT_THAT(Info.TrueAtoms, IsEmpty());
+  EXPECT_THAT(Info.FalseAtoms, IsEmpty());
+}
+
+} // namespace
diff --git a/llvm/utils/gn/secondary/clang/lib/Analysis/FlowSensitive/BUILD.gn b/llvm/utils/gn/secondary/clang/lib/Analysis/FlowSensitive/BUILD.gn
index 60b5e0b4497632b..e69567c2d9c652a 100644
--- a/llvm/utils/gn/secondary/clang/lib/Analysis/FlowSensitive/BUILD.gn
+++ b/llvm/utils/gn/secondary/clang/lib/Analysis/FlowSensitive/BUILD.gn
@@ -32,6 +32,7 @@ static_library("FlowSensitive") {
     "HTMLLogger.cpp",
     "Logger.cpp",
     "RecordOps.cpp",
+    "SimplifyConstraints.cpp",
     "Transfer.cpp",
     "TypeErasedDataflowAnalysis.cpp",
     "Value.cpp",
diff --git a/llvm/utils/gn/secondary/clang/unittests/Analysis/FlowSensitive/BUILD.gn b/llvm/utils/gn/secondary/clang/unittests/Analysis/FlowSensitive/BUILD.gn
index 2fe4fe854c57653..df5b4587bf1c197 100644
--- a/llvm/utils/gn/secondary/clang/unittests/Analysis/FlowSensitive/BUILD.gn
+++ ...
[truncated]

Copy link
Collaborator

@ymand ymand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice!

Comment on lines 9 to 11
// This file defines a DataflowAnalysisContext class that owns objects that
// encompass the state of a program and stores context that is used during
// dataflow analysis.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace/delete.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops... thanks for catching. Done.

//
//===----------------------------------------------------------------------===//
//
// This file defines a DataflowAnalysisContext class that owns objects that
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace or delete.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops again. Done.

namespace clang {
namespace dataflow {

// Substitute all occurrences of a given atom in `F` by a given formula and
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Substitute all occurrences of a given atom in `F` by a given formula and
// Substitutes all occurrences of a given atom in `F` by a given formula and

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

FalseAtoms.insert(Constraint->operands()[0]->getAtom());
break;
case Formula::Equal:
if (Constraint->operands()[0]->kind() == Formula::AtomRef &&
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: bind Constraint->operands() to a variable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. Done.

Comment on lines +143 to +147
if (NewConstraint.kind() == Formula::And) {
NewConstraints.insert(NewConstraint.operands()[0]);
NewConstraints.insert(NewConstraint.operands()[1]);
continue;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do this step here rather than at the outset (for loop on line 92)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't do it in the loop on line 92 -- because I already want that loop to "see" the "split conjunctions". So I'd have to add yet another loop. That might potentially be more efficient in some cases -- but it's not even clear to me that it would be a definite performance win (I'd be reallocating the Constraints array another time), and the code is simpler this way. So I think I'll keep it this way for now.


if (NewConstraints == Constraints)
break;
Constraints = NewConstraints;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::move?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course. Thanks for catching! Done.

for (Atom At : TrueAtoms)
Info->TrueAtoms.append(atomsInEquivalenceClass(
EquivalentAtoms, EquivalentAtoms.findValue(At)));
std::sort(Info->TrueAtoms.begin(), Info->TrueAtoms.end());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why sort? for output readability?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly.

@martinboehme martinboehme merged commit 7c63672 into llvm:main Nov 7, 2023
2 of 3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:analysis clang:dataflow Clang Dataflow Analysis framework - https://clang.llvm.org/docs/DataFlowAnalysisIntro.html clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants