Skip to content

Commit

Permalink
Reland "[dataflow] Add dedicated representation of boolean formulas"
Browse files Browse the repository at this point in the history
This reverts commit 7a72ce9.

Test problems were due to unspecified order of function arg evaluation.

Reland "[dataflow] Replace most BoolValue subclasses with references to Formula (and AtomicBoolValue => Atom and BoolValue => Formula where appropriate)"

This properly frees the Value hierarchy from managing boolean formulas.

We still distinguish AtomicBoolValue; this type is used in client code.
However we expect to convert such uses to BoolValue (where the
distinction is not needed) or Atom (where atomic identity is intended),
and then fold AtomicBoolValue into FormulaBoolValue.

We also distinguish TopBoolValue; this has distinct rules for
widen/join/equivalence, and top-ness is not represented in Formula.
It'd be nice to find a cleaner representation (e.g. the absence of a
formula), but no immediate plans.

For now, BoolValues with the same Formula are deduplicated. This doesn't
seem desirable, as Values are mutable by their creators (properties).
We can probably drop this for FormulaBoolValue immediately (not in this
patch, to isolate changes). For AtomicBoolValue we first need to update
clients to stop using value pointers for atom identity.

The data structures around flow conditions are updated:
- flow condition tokens are Atom, rather than AtomicBoolValue*
- conditions are Formula, rather than BoolValue
Most APIs were changed directly, some with many clients had a
new version added and the existing one deprecated.

The factories for BoolValues in Environment keep their existing
signatures for now (e.g. makeOr(BoolValue, BoolValue) => BoolValue)
and are not deprecated. These have very many clients and finding the
most ergonomic API & migration path still needs some thought.

Differential Revision: https://reviews.llvm.org/D153469
  • Loading branch information
sam-mccall committed Jul 7, 2023
1 parent 74eac85 commit fc9821a
Show file tree
Hide file tree
Showing 24 changed files with 896 additions and 1,427 deletions.
116 changes: 66 additions & 50 deletions clang/include/clang/Analysis/FlowSensitive/Arena.h
Original file line number Diff line number Diff line change
Expand Up @@ -8,19 +8,18 @@
#ifndef LLVM_CLANG_ANALYSIS_FLOWSENSITIVE__ARENA_H
#define LLVM_CLANG_ANALYSIS_FLOWSENSITIVE__ARENA_H

#include "clang/Analysis/FlowSensitive/Formula.h"
#include "clang/Analysis/FlowSensitive/StorageLocation.h"
#include "clang/Analysis/FlowSensitive/Value.h"
#include <vector>

namespace clang::dataflow {

/// The Arena owns the objects that model data within an analysis.
/// For example, `Value` and `StorageLocation`.
/// For example, `Value`, `StorageLocation`, `Atom`, and `Formula`.
class Arena {
public:
Arena()
: TrueVal(create<AtomicBoolValue>()),
FalseVal(create<AtomicBoolValue>()) {}
Arena() : True(makeAtom()), False(makeAtom()) {}
Arena(const Arena &) = delete;
Arena &operator=(const Arena &) = delete;

Expand Down Expand Up @@ -56,74 +55,91 @@ class Arena {
.get());
}

/// Returns a boolean value that represents the conjunction of `LHS` and
/// `RHS`. Subsequent calls with the same arguments, regardless of their
/// order, will return the same result. If the given boolean values represent
/// the same value, the result will be the value itself.
BoolValue &makeAnd(BoolValue &LHS, BoolValue &RHS);

/// Returns a boolean value that represents the disjunction of `LHS` and
/// `RHS`. Subsequent calls with the same arguments, regardless of their
/// order, will return the same result. If the given boolean values represent
/// the same value, the result will be the value itself.
BoolValue &makeOr(BoolValue &LHS, BoolValue &RHS);

/// Returns a boolean value that represents the negation of `Val`. Subsequent
/// calls with the same argument will return the same result.
BoolValue &makeNot(BoolValue &Val);

/// Returns a boolean value that represents `LHS => RHS`. Subsequent calls
/// with the same arguments, will return the same result. If the given boolean
/// values represent the same value, the result will be a value that
/// represents the true boolean literal.
BoolValue &makeImplies(BoolValue &LHS, BoolValue &RHS);

/// Returns a boolean value that represents `LHS <=> RHS`. Subsequent calls
/// with the same arguments, regardless of their order, will return the same
/// result. If the given boolean values represent the same value, the result
/// will be a value that represents the true boolean literal.
BoolValue &makeEquals(BoolValue &LHS, BoolValue &RHS);
/// Creates a BoolValue wrapping a particular formula.
///
/// Passing in the same formula will result in the same BoolValue.
/// FIXME: Interning BoolValues but not other Values is inconsistent.
/// Decide whether we want Value interning or not.
BoolValue &makeBoolValue(const Formula &);

/// Creates a fresh atom and wraps in in an AtomicBoolValue.
/// FIXME: For now, identical-address AtomicBoolValue <=> identical atom.
/// Stop relying on pointer identity and remove this guarantee.
AtomicBoolValue &makeAtomValue() {
return cast<AtomicBoolValue>(makeBoolValue(makeAtomRef(makeAtom())));
}

/// Creates a fresh Top boolean value.
TopBoolValue &makeTopValue() {
// No need for deduplicating: there's no way to create aliasing Tops.
return create<TopBoolValue>(makeAtomRef(makeAtom()));
}

/// Returns a symbolic integer value that models an integer literal equal to
/// `Value`. These literals are the same every time.
/// Integer literals are not typed; the type is determined by the `Expr` that
/// an integer literal is associated with.
IntegerValue &makeIntLiteral(llvm::APInt Value);

/// Returns a symbolic boolean value that models a boolean literal equal to
/// `Value`. These literals are the same every time.
AtomicBoolValue &makeLiteral(bool Value) const {
return Value ? TrueVal : FalseVal;
// Factories for boolean formulas.
// Formulas are interned: passing the same arguments return the same result.
// For commutative operations like And/Or, interning ignores order.
// Simplifications are applied: makeOr(X, X) => X, etc.

/// Returns a formula for the conjunction of `LHS` and `RHS`.
const Formula &makeAnd(const Formula &LHS, const Formula &RHS);

/// Returns a formula for the disjunction of `LHS` and `RHS`.
const Formula &makeOr(const Formula &LHS, const Formula &RHS);

/// Returns a formula for the negation of `Val`.
const Formula &makeNot(const Formula &Val);

/// Returns a formula for `LHS => RHS`.
const Formula &makeImplies(const Formula &LHS, const Formula &RHS);

/// Returns a formula for `LHS <=> RHS`.
const Formula &makeEquals(const Formula &LHS, const Formula &RHS);

/// Returns a formula for the variable A.
const Formula &makeAtomRef(Atom A);

/// Returns a formula for a literal true/false.
const Formula &makeLiteral(bool Value) {
return makeAtomRef(Value ? True : False);
}

/// Returns a new atomic boolean variable, distinct from any other.
Atom makeAtom() { return static_cast<Atom>(NextAtom++); };

/// Creates a fresh flow condition and returns a token that identifies it. The
/// token can be used to perform various operations on the flow condition such
/// as adding constraints to it, forking it, joining it with another flow
/// condition, or checking implications.
AtomicBoolValue &makeFlowConditionToken() {
return create<AtomicBoolValue>();
}
Atom makeFlowConditionToken() { return makeAtom(); }

private:
llvm::BumpPtrAllocator Alloc;

// Storage for the state of a program.
std::vector<std::unique_ptr<StorageLocation>> Locs;
std::vector<std::unique_ptr<Value>> Vals;

// Indices that are used to avoid recreating the same integer literals and
// composite boolean values.
llvm::DenseMap<llvm::APInt, IntegerValue *> IntegerLiterals;
llvm::DenseMap<std::pair<BoolValue *, BoolValue *>, ConjunctionValue *>
ConjunctionVals;
llvm::DenseMap<std::pair<BoolValue *, BoolValue *>, DisjunctionValue *>
DisjunctionVals;
llvm::DenseMap<BoolValue *, NegationValue *> NegationVals;
llvm::DenseMap<std::pair<BoolValue *, BoolValue *>, ImplicationValue *>
ImplicationVals;
llvm::DenseMap<std::pair<BoolValue *, BoolValue *>, BiconditionalValue *>
BiconditionalVals;

AtomicBoolValue &TrueVal;
AtomicBoolValue &FalseVal;
using FormulaPair = std::pair<const Formula *, const Formula *>;
llvm::DenseMap<FormulaPair, const Formula *> Ands;
llvm::DenseMap<FormulaPair, const Formula *> Ors;
llvm::DenseMap<const Formula *, const Formula *> Nots;
llvm::DenseMap<FormulaPair, const Formula *> Implies;
llvm::DenseMap<FormulaPair, const Formula *> Equals;
llvm::DenseMap<Atom, const Formula *> AtomRefs;

llvm::DenseMap<const Formula *, BoolValue *> FormulaValues;
unsigned NextAtom = 0;

Atom True, False;
};

} // namespace clang::dataflow
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -138,33 +138,31 @@ class DataflowAnalysisContext {
PointerValue &getOrCreateNullPointerValue(QualType PointeeType);

/// Adds `Constraint` to the flow condition identified by `Token`.
void addFlowConditionConstraint(AtomicBoolValue &Token,
BoolValue &Constraint);
void addFlowConditionConstraint(Atom Token, const Formula &Constraint);

/// Creates a new flow condition with the same constraints as the flow
/// condition identified by `Token` and returns its token.
AtomicBoolValue &forkFlowCondition(AtomicBoolValue &Token);
Atom forkFlowCondition(Atom Token);

/// Creates a new flow condition that represents the disjunction of the flow
/// conditions identified by `FirstToken` and `SecondToken`, and returns its
/// token.
AtomicBoolValue &joinFlowConditions(AtomicBoolValue &FirstToken,
AtomicBoolValue &SecondToken);
Atom joinFlowConditions(Atom FirstToken, Atom SecondToken);

/// Returns true if and only if the constraints of the flow condition
/// identified by `Token` imply that `Val` is true.
bool flowConditionImplies(AtomicBoolValue &Token, BoolValue &Val);
bool flowConditionImplies(Atom Token, const Formula &);

/// Returns true if and only if the constraints of the flow condition
/// identified by `Token` are always true.
bool flowConditionIsTautology(AtomicBoolValue &Token);
bool flowConditionIsTautology(Atom Token);

/// Returns true if `Val1` is equivalent to `Val2`.
/// Note: This function doesn't take into account constraints on `Val1` and
/// `Val2` imposed by the flow condition.
bool equivalentBoolValues(BoolValue &Val1, BoolValue &Val2);
bool equivalentFormulas(const Formula &Val1, const Formula &Val2);

LLVM_DUMP_METHOD void dumpFlowCondition(AtomicBoolValue &Token,
LLVM_DUMP_METHOD void dumpFlowCondition(Atom Token,
llvm::raw_ostream &OS = llvm::dbgs());

/// Returns the `ControlFlowContext` registered for `F`, if any. Otherwise,
Expand All @@ -181,7 +179,7 @@ class DataflowAnalysisContext {
/// included in `Constraints` to provide contextually-accurate results, e.g.
/// if any definitions or relationships of the values in `Constraints` have
/// been stored in flow conditions.
Solver::Result querySolver(llvm::SetVector<BoolValue *> Constraints);
Solver::Result querySolver(llvm::SetVector<const Formula *> Constraints);

private:
friend class Environment;
Expand Down Expand Up @@ -209,12 +207,12 @@ class DataflowAnalysisContext {
/// to track tokens of flow conditions that were already visited by recursive
/// calls.
void addTransitiveFlowConditionConstraints(
AtomicBoolValue &Token, llvm::SetVector<BoolValue *> &Constraints,
llvm::DenseSet<AtomicBoolValue *> &VisitedTokens);
Atom Token, llvm::SetVector<const Formula *> &Constraints,
llvm::DenseSet<Atom> &VisitedTokens);

/// Returns true if the solver is able to prove that there is no satisfying
/// assignment for `Constraints`
bool isUnsatisfiable(llvm::SetVector<BoolValue *> Constraints) {
bool isUnsatisfiable(llvm::SetVector<const Formula *> Constraints) {
return querySolver(std::move(Constraints)).getStatus() ==
Solver::Result::Status::Unsatisfiable;
}
Expand Down Expand Up @@ -253,9 +251,8 @@ class DataflowAnalysisContext {
// Flow conditions depend on other flow conditions if they are created using
// `forkFlowCondition` or `joinFlowConditions`. The graph of flow condition
// dependencies is stored in the `FlowConditionDeps` map.
llvm::DenseMap<AtomicBoolValue *, llvm::DenseSet<AtomicBoolValue *>>
FlowConditionDeps;
llvm::DenseMap<AtomicBoolValue *, BoolValue *> FlowConditionConstraints;
llvm::DenseMap<Atom, llvm::DenseSet<Atom>> FlowConditionDeps;
llvm::DenseMap<Atom, const Formula *> FlowConditionConstraints;

llvm::DenseMap<const FunctionDecl *, ControlFlowContext> FunctionContexts;

Expand Down
35 changes: 23 additions & 12 deletions clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
Original file line number Diff line number Diff line change
Expand Up @@ -456,77 +456,86 @@ class Environment {
template <typename T, typename... Args>
std::enable_if_t<std::is_base_of<Value, T>::value, T &>
create(Args &&...args) {
return DACtx->arena().create<T>(std::forward<Args>(args)...);
return arena().create<T>(std::forward<Args>(args)...);
}

/// Returns a symbolic integer value that models an integer literal equal to
/// `Value`
IntegerValue &getIntLiteralValue(llvm::APInt Value) const {
return DACtx->arena().makeIntLiteral(Value);
return arena().makeIntLiteral(Value);
}

/// Returns a symbolic boolean value that models a boolean literal equal to
/// `Value`
AtomicBoolValue &getBoolLiteralValue(bool Value) const {
return DACtx->arena().makeLiteral(Value);
return cast<AtomicBoolValue>(
arena().makeBoolValue(arena().makeLiteral(Value)));
}

/// Returns an atomic boolean value.
BoolValue &makeAtomicBoolValue() const {
return DACtx->arena().create<AtomicBoolValue>();
return arena().makeAtomValue();
}

/// Returns a unique instance of boolean Top.
BoolValue &makeTopBoolValue() const {
return DACtx->arena().create<TopBoolValue>();
return arena().makeTopValue();
}

/// Returns a boolean value that represents the conjunction of `LHS` and
/// `RHS`. Subsequent calls with the same arguments, regardless of their
/// order, will return the same result. If the given boolean values represent
/// the same value, the result will be the value itself.
BoolValue &makeAnd(BoolValue &LHS, BoolValue &RHS) const {
return DACtx->arena().makeAnd(LHS, RHS);
return arena().makeBoolValue(
arena().makeAnd(LHS.formula(), RHS.formula()));
}

/// Returns a boolean value that represents the disjunction of `LHS` and
/// `RHS`. Subsequent calls with the same arguments, regardless of their
/// order, will return the same result. If the given boolean values represent
/// the same value, the result will be the value itself.
BoolValue &makeOr(BoolValue &LHS, BoolValue &RHS) const {
return DACtx->arena().makeOr(LHS, RHS);
return arena().makeBoolValue(
arena().makeOr(LHS.formula(), RHS.formula()));
}

/// Returns a boolean value that represents the negation of `Val`. Subsequent
/// calls with the same argument will return the same result.
BoolValue &makeNot(BoolValue &Val) const {
return DACtx->arena().makeNot(Val);
return arena().makeBoolValue(arena().makeNot(Val.formula()));
}

/// Returns a boolean value represents `LHS` => `RHS`. Subsequent calls with
/// the same arguments, will return the same result. If the given boolean
/// values represent the same value, the result will be a value that
/// represents the true boolean literal.
BoolValue &makeImplication(BoolValue &LHS, BoolValue &RHS) const {
return DACtx->arena().makeImplies(LHS, RHS);
return arena().makeBoolValue(
arena().makeImplies(LHS.formula(), RHS.formula()));
}

/// Returns a boolean value represents `LHS` <=> `RHS`. Subsequent calls with
/// the same arguments, regardless of their order, will return the same
/// result. If the given boolean values represent the same value, the result
/// will be a value that represents the true boolean literal.
BoolValue &makeIff(BoolValue &LHS, BoolValue &RHS) const {
return DACtx->arena().makeEquals(LHS, RHS);
return arena().makeBoolValue(
arena().makeEquals(LHS.formula(), RHS.formula()));
}

/// Returns the token that identifies the flow condition of the environment.
AtomicBoolValue &getFlowConditionToken() const { return *FlowConditionToken; }
Atom getFlowConditionToken() const { return FlowConditionToken; }

/// Adds `Val` to the set of clauses that constitute the flow condition.
void addToFlowCondition(const Formula &);
LLVM_DEPRECATED("Use Formula version instead", "")
void addToFlowCondition(BoolValue &Val);

/// Returns true if and only if the clauses that constitute the flow condition
/// imply that `Val` is true.
bool flowConditionImplies(const Formula &) const;
LLVM_DEPRECATED("Use Formula version instead", "")
bool flowConditionImplies(BoolValue &Val) const;

/// Returns the `DeclContext` of the block being analysed, if any. Otherwise,
Expand All @@ -547,6 +556,8 @@ class Environment {
/// Returns the `DataflowAnalysisContext` used by the environment.
DataflowAnalysisContext &getDataflowAnalysisContext() const { return *DACtx; }

Arena &arena() const { return DACtx->arena(); }

LLVM_DUMP_METHOD void dump() const;
LLVM_DUMP_METHOD void dump(raw_ostream &OS) const;

Expand Down Expand Up @@ -617,7 +628,7 @@ class Environment {
std::pair<StructValue *, const ValueDecl *>>
MemberLocToStruct;

AtomicBoolValue *FlowConditionToken;
Atom FlowConditionToken;
};

/// Returns the storage location for the implicit object of a
Expand Down

0 comments on commit fc9821a

Please sign in to comment.