Skip to content

Commit

Permalink
[clang][dataflow] Simplify flow conditions displayed in HTMLLogger. (#…
Browse files Browse the repository at this point in the history
…70848)

This can make the flow condition significantly easier to interpret; see
below
for an example.

I had hoped that adding the simplification as a preprocessing step
before the
SAT solver (in `DataflowAnalysisContext::querySolver()`) would also
speed up SAT
solving and maybe even eliminate SAT solver timeouts, but in my testing,
this
actually turns out to be a pessimization. It appears that these
simplifications
are easy enough for the SAT solver to perform itself.

Nevertheless, the improvement in debugging alone makes this a worthwhile
change.

Example of flow condition output with these changes:

```
Flow condition token: V37
Constraints:
(V16 = (((V15 & (V19 = V12)) & V22) & V25))
(V15 = ((V12 & ((V14 = V9) | (V14 = V4))) & (V13 = V14)))
True atoms: (V0, V1, V2, V5, V6, V7, V29, V30, V32, V34, V35, V37)
False atoms: (V3, V8, V17)
Equivalent atoms:
(V11, V15)

Flow condition constraints before simplification:
V37
((!V3 & !V8) & !V17)
(V37 = V34)
(V34 = (V29 & (V35 = V30)))
(V29 = (((V16 | V2) & V32) & (V30 = V32)))
(V16 = (((V15 & (V19 = V12)) & V22) & V25))
(V15 = V11)
(V11 = ((((V7 | V2) & V12) & ((V7 & (V14 = V9)) | (V2 & (V14 = V4)))) & (V13 = V14)))
(V2 = V1)
(V1 = V0)
V0
(V7 = V6)
(V6 = V5)
(V5 = V2)
```
  • Loading branch information
martinboehme committed Nov 7, 2023
1 parent 9b24391 commit 7c63672
Show file tree
Hide file tree
Showing 9 changed files with 449 additions and 2 deletions.
49 changes: 49 additions & 0 deletions clang/include/clang/Analysis/FlowSensitive/SimplifyConstraints.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
//===-- SimplifyConstraints.h -----------------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef LLVM_CLANG_ANALYSIS_FLOWSENSITIVE_SIMPLIFYCONSTRAINTS_H
#define LLVM_CLANG_ANALYSIS_FLOWSENSITIVE_SIMPLIFYCONSTRAINTS_H

#include "clang/Analysis/FlowSensitive/Arena.h"
#include "clang/Analysis/FlowSensitive/Formula.h"
#include "llvm/ADT/SetVector.h"

namespace clang {
namespace dataflow {

/// Information on the way a set of constraints was simplified.
struct SimplifyConstraintsInfo {
/// List of equivalence classes of atoms. For each equivalence class, the
/// original constraints imply that all atoms in it must be equivalent.
/// Simplification replaces all occurrences of atoms in an equivalence class
/// with a single representative atom from the class.
/// Does not contain equivalence classes with just one member or atoms
/// contained in `TrueAtoms` or `FalseAtoms`.
llvm::SmallVector<llvm::SmallVector<Atom>> EquivalentAtoms;
/// Atoms that the original constraints imply must be true.
/// Simplification replaces all occurrences of these atoms by a true literal
/// (which may enable additional simplifications).
llvm::SmallVector<Atom> TrueAtoms;
/// Atoms that the original constraints imply must be false.
/// Simplification replaces all occurrences of these atoms by a false literal
/// (which may enable additional simplifications).
llvm::SmallVector<Atom> FalseAtoms;
};

/// Simplifies a set of constraints (implicitly connected by "and") in a way
/// that does not change satisfiability of the constraints. This does _not_ mean
/// that the set of solutions is the same before and after simplification.
/// `Info`, if non-null, will be populated with information about the
/// simplifications that were made to the formula (e.g. to display to the user).
void simplifyConstraints(llvm::SetVector<const Formula *> &Constraints,
Arena &arena, SimplifyConstraintsInfo *Info = nullptr);

} // namespace dataflow
} // namespace clang

#endif // LLVM_CLANG_ANALYSIS_FLOWSENSITIVE_SIMPLIFYCONSTRAINTS_H
1 change: 1 addition & 0 deletions clang/lib/Analysis/FlowSensitive/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@ add_clang_library(clangAnalysisFlowSensitive
HTMLLogger.cpp
Logger.cpp
RecordOps.cpp
SimplifyConstraints.cpp
Transfer.cpp
TypeErasedDataflowAnalysis.cpp
Value.cpp
Expand Down
40 changes: 39 additions & 1 deletion clang/lib/Analysis/FlowSensitive/DataflowAnalysisContext.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@
#include "clang/Analysis/FlowSensitive/DebugSupport.h"
#include "clang/Analysis/FlowSensitive/Formula.h"
#include "clang/Analysis/FlowSensitive/Logger.h"
#include "clang/Analysis/FlowSensitive/SimplifyConstraints.h"
#include "clang/Analysis/FlowSensitive/Value.h"
#include "llvm/ADT/SetOperations.h"
#include "llvm/ADT/SetVector.h"
Expand Down Expand Up @@ -205,13 +206,50 @@ void DataflowAnalysisContext::addTransitiveFlowConditionConstraints(
}
}

static void printAtomList(const llvm::SmallVector<Atom> &Atoms,
llvm::raw_ostream &OS) {
OS << "(";
for (size_t i = 0; i < Atoms.size(); ++i) {
OS << Atoms[i];
if (i + 1 < Atoms.size())
OS << ", ";
}
OS << ")\n";
}

void DataflowAnalysisContext::dumpFlowCondition(Atom Token,
llvm::raw_ostream &OS) {
llvm::SetVector<const Formula *> Constraints;
Constraints.insert(&arena().makeAtomRef(Token));
addTransitiveFlowConditionConstraints(Token, Constraints);

for (const auto *Constraint : Constraints) {
OS << "Flow condition token: " << Token << "\n";
SimplifyConstraintsInfo Info;
llvm::SetVector<const Formula *> OriginalConstraints = Constraints;
simplifyConstraints(Constraints, arena(), &Info);
if (!Constraints.empty()) {
OS << "Constraints:\n";
for (const auto *Constraint : Constraints) {
Constraint->print(OS);
OS << "\n";
}
}
if (!Info.TrueAtoms.empty()) {
OS << "True atoms: ";
printAtomList(Info.TrueAtoms, OS);
}
if (!Info.FalseAtoms.empty()) {
OS << "False atoms: ";
printAtomList(Info.FalseAtoms, OS);
}
if (!Info.EquivalentAtoms.empty()) {
OS << "Equivalent atoms:\n";
for (const llvm::SmallVector<Atom> Class : Info.EquivalentAtoms)
printAtomList(Class, OS);
}

OS << "\nFlow condition constraints before simplification:\n";
for (const auto *Constraint : OriginalConstraints) {
Constraint->print(OS);
OS << "\n";
}
Expand Down
2 changes: 1 addition & 1 deletion clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -961,7 +961,7 @@ void Environment::dump(raw_ostream &OS) const {
OS << " [" << L << ", " << V << ": " << *V << "]\n";
}

OS << "FlowConditionToken:\n";
OS << "\n";
DACtx->dumpFlowCondition(FlowConditionToken, OS);
}

Expand Down
179 changes: 179 additions & 0 deletions clang/lib/Analysis/FlowSensitive/SimplifyConstraints.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,179 @@
//===-- SimplifyConstraints.cpp ---------------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#include "clang/Analysis/FlowSensitive/SimplifyConstraints.h"
#include "llvm/ADT/EquivalenceClasses.h"

namespace clang {
namespace dataflow {

// Substitutes all occurrences of a given atom in `F` by a given formula and
// returns the resulting formula.
static const Formula &
substitute(const Formula &F,
const llvm::DenseMap<Atom, const Formula *> &Substitutions,
Arena &arena) {
switch (F.kind()) {
case Formula::AtomRef:
if (auto iter = Substitutions.find(F.getAtom());
iter != Substitutions.end())
return *iter->second;
return F;
case Formula::Literal:
return F;
case Formula::Not:
return arena.makeNot(substitute(*F.operands()[0], Substitutions, arena));
case Formula::And:
return arena.makeAnd(substitute(*F.operands()[0], Substitutions, arena),
substitute(*F.operands()[1], Substitutions, arena));
case Formula::Or:
return arena.makeOr(substitute(*F.operands()[0], Substitutions, arena),
substitute(*F.operands()[1], Substitutions, arena));
case Formula::Implies:
return arena.makeImplies(
substitute(*F.operands()[0], Substitutions, arena),
substitute(*F.operands()[1], Substitutions, arena));
case Formula::Equal:
return arena.makeEquals(substitute(*F.operands()[0], Substitutions, arena),
substitute(*F.operands()[1], Substitutions, arena));
}
}

// Returns the result of replacing atoms in `Atoms` with the leader of their
// equivalence class in `EquivalentAtoms`.
// Atoms that don't have an equivalence class in `EquivalentAtoms` are inserted
// into it as single-member equivalence classes.
static llvm::DenseSet<Atom>
projectToLeaders(const llvm::DenseSet<Atom> &Atoms,
llvm::EquivalenceClasses<Atom> &EquivalentAtoms) {
llvm::DenseSet<Atom> Result;

for (Atom Atom : Atoms)
Result.insert(EquivalentAtoms.getOrInsertLeaderValue(Atom));

return Result;
}

// Returns the atoms in the equivalence class for the leader identified by
// `LeaderIt`.
static llvm::SmallVector<Atom>
atomsInEquivalenceClass(const llvm::EquivalenceClasses<Atom> &EquivalentAtoms,
llvm::EquivalenceClasses<Atom>::iterator LeaderIt) {
llvm::SmallVector<Atom> Result;
for (auto MemberIt = EquivalentAtoms.member_begin(LeaderIt);
MemberIt != EquivalentAtoms.member_end(); ++MemberIt)
Result.push_back(*MemberIt);
return Result;
}

void simplifyConstraints(llvm::SetVector<const Formula *> &Constraints,
Arena &arena, SimplifyConstraintsInfo *Info) {
auto contradiction = [&]() {
Constraints.clear();
Constraints.insert(&arena.makeLiteral(false));
};

llvm::EquivalenceClasses<Atom> EquivalentAtoms;
llvm::DenseSet<Atom> TrueAtoms;
llvm::DenseSet<Atom> FalseAtoms;

while (true) {
for (const auto *Constraint : Constraints) {
switch (Constraint->kind()) {
case Formula::AtomRef:
TrueAtoms.insert(Constraint->getAtom());
break;
case Formula::Not:
if (Constraint->operands()[0]->kind() == Formula::AtomRef)
FalseAtoms.insert(Constraint->operands()[0]->getAtom());
break;
case Formula::Equal: {
ArrayRef<const Formula *> operands = Constraint->operands();
if (operands[0]->kind() == Formula::AtomRef &&
operands[1]->kind() == Formula::AtomRef) {
EquivalentAtoms.unionSets(operands[0]->getAtom(),
operands[1]->getAtom());
}
break;
}
default:
break;
}
}

TrueAtoms = projectToLeaders(TrueAtoms, EquivalentAtoms);
FalseAtoms = projectToLeaders(FalseAtoms, EquivalentAtoms);

llvm::DenseMap<Atom, const Formula *> Substitutions;
for (auto It = EquivalentAtoms.begin(); It != EquivalentAtoms.end(); ++It) {
Atom TheAtom = It->getData();
Atom Leader = EquivalentAtoms.getLeaderValue(TheAtom);
if (TrueAtoms.contains(Leader)) {
if (FalseAtoms.contains(Leader)) {
contradiction();
return;
}
Substitutions.insert({TheAtom, &arena.makeLiteral(true)});
} else if (FalseAtoms.contains(Leader)) {
Substitutions.insert({TheAtom, &arena.makeLiteral(false)});
} else if (TheAtom != Leader) {
Substitutions.insert({TheAtom, &arena.makeAtomRef(Leader)});
}
}

llvm::SetVector<const Formula *> NewConstraints;
for (const auto *Constraint : Constraints) {
const Formula &NewConstraint =
substitute(*Constraint, Substitutions, arena);
if (&NewConstraint == &arena.makeLiteral(true))
continue;
if (&NewConstraint == &arena.makeLiteral(false)) {
contradiction();
return;
}
if (NewConstraint.kind() == Formula::And) {
NewConstraints.insert(NewConstraint.operands()[0]);
NewConstraints.insert(NewConstraint.operands()[1]);
continue;
}
NewConstraints.insert(&NewConstraint);
}

if (NewConstraints == Constraints)
break;
Constraints = std::move(NewConstraints);
}

if (Info) {
for (auto It = EquivalentAtoms.begin(), End = EquivalentAtoms.end();
It != End; ++It) {
if (!It->isLeader())
continue;
Atom At = *EquivalentAtoms.findLeader(It);
if (TrueAtoms.contains(At) || FalseAtoms.contains(At))
continue;
llvm::SmallVector<Atom> Atoms =
atomsInEquivalenceClass(EquivalentAtoms, It);
if (Atoms.size() == 1)
continue;
std::sort(Atoms.begin(), Atoms.end());
Info->EquivalentAtoms.push_back(std::move(Atoms));
}
for (Atom At : TrueAtoms)
Info->TrueAtoms.append(atomsInEquivalenceClass(
EquivalentAtoms, EquivalentAtoms.findValue(At)));
std::sort(Info->TrueAtoms.begin(), Info->TrueAtoms.end());
for (Atom At : FalseAtoms)
Info->FalseAtoms.append(atomsInEquivalenceClass(
EquivalentAtoms, EquivalentAtoms.findValue(At)));
std::sort(Info->FalseAtoms.begin(), Info->FalseAtoms.end());
}
}

} // namespace dataflow
} // namespace clang
1 change: 1 addition & 0 deletions clang/unittests/Analysis/FlowSensitive/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@ add_clang_unittest(ClangAnalysisFlowSensitiveTests
MultiVarConstantPropagationTest.cpp
RecordOpsTest.cpp
SignAnalysisTest.cpp
SimplifyConstraintsTest.cpp
SingleVarConstantPropagationTest.cpp
SolverTest.cpp
TestingSupport.cpp
Expand Down

2 comments on commit 7c63672

@hctim
Copy link
Collaborator

@hctim hctim commented on 7c63672 Nov 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, looks like this broke various buildbots with -Werror (https://lab.llvm.org/buildbot/#/builders/37/builds/27176):

[6/103] Building CXX object tools/clang/lib/Analysis/FlowSensitive/CMakeFiles/obj.clangAnalysisFlowSensitive.dir/DataflowAnalysisContext.cpp.o
FAILED: tools/clang/lib/Analysis/FlowSensitive/CMakeFiles/obj.clangAnalysisFlowSensitive.dir/DataflowAnalysisContext.cpp.o 
CCACHE_CPP2=yes CCACHE_HASHDIR=yes /usr/bin/ccache /b/sanitizer-x86_64-linux/build/llvm_build0/bin/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/b/sanitizer-x86_64-linux/build/build_symbolizer/tools/clang/lib/Analysis/FlowSensitive -I/b/sanitizer-x86_64-linux/build/llvm-project/clang/lib/Analysis/FlowSensitive -I/b/sanitizer-x86_64-linux/build/llvm-project/clang/include -I/b/sanitizer-x86_64-linux/build/build_symbolizer/tools/clang/include -I/b/sanitizer-x86_64-linux/build/build_symbolizer/include -I/b/sanitizer-x86_64-linux/build/llvm-project/llvm/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -fno-common -Woverloaded-virtual -Wno-nested-anon-types -O3 -DNDEBUG  -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -std=c++17 -MD -MT tools/clang/lib/Analysis/FlowSensitive/CMakeFiles/obj.clangAnalysisFlowSensitive.dir/DataflowAnalysisContext.cpp.o -MF tools/clang/lib/Analysis/FlowSensitive/CMakeFiles/obj.clangAnalysisFlowSensitive.dir/DataflowAnalysisContext.cpp.o.d -o tools/clang/lib/Analysis/FlowSensitive/CMakeFiles/obj.clangAnalysisFlowSensitive.dir/DataflowAnalysisContext.cpp.o -c /b/sanitizer-x86_64-linux/build/llvm-project/clang/lib/Analysis/FlowSensitive/DataflowAnalysisContext.cpp
/b/sanitizer-x86_64-linux/build/llvm-project/clang/lib/Analysis/FlowSensitive/DataflowAnalysisContext.cpp:247:40: error: loop variable 'Class' creates a copy from type 'const llvm::SmallVector<Atom>' [-Werror,-Wrange-loop-construct]
  247 |     for (const llvm::SmallVector<Atom> Class : Info.EquivalentAtoms)
      |                                        ^
/b/sanitizer-x86_64-linux/build/llvm-project/clang/lib/Analysis/FlowSensitive/DataflowAnalysisContext.cpp:247:10: note: use reference type 'const llvm::SmallVector<Atom> &' to prevent copying
  247 |     for (const llvm::SmallVector<Atom> Class : Info.EquivalentAtoms)
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                                        &
1 error generated.

@hctim
Copy link
Collaborator

@hctim hctim commented on 7c63672 Nov 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see this was fixed in a737a33. Thanks!

Please sign in to comment.