Skip to content

Commit

Permalink
[analyzer][NFC] Rework SVal kind representation (#71039)
Browse files Browse the repository at this point in the history
The goal of this patch is to refine how the `SVal` base and sub-kinds
are represented by forming one unified enum describing the possible
SVals. This means that the `unsigned SVal::Kind` and the attached
bit-packing semantics would be replaced by a single unified enum. This
is more conventional and leads to a better debugging experience by
default. This eases the need of using debug pretty-printers, or the use
of runtime functions doing the printing for us like we do today by
calling `Val.dump()` whenever we inspect the values.

Previously, the first 2 bits of the `unsigned SVal::Kind` discriminated
the following quartet: `UndefinedVal`, `UnknownVal`, `Loc`, or `NonLoc`.
The rest of the upper bits represented the sub-kind, where the value
represented the index among only the `Loc`s or `NonLoc`s, effectively
attaching 2 meanings of the upper bits depending on the base-kind. We
don't need to pack these bits, as we have plenty even if we would use
just a plan-old `unsigned char`.

Consequently, in this patch, I propose to lay out all the (non-abstract)
`SVal` kinds into a single enum, along with some metadata (`BEGIN_Loc`,
`END_Loc`, `BEGIN_NonLoc`, `END_NonLoc`) artificial enum values, similar
how we do with the `MemRegions`.

Note that in the unified `SVal::Kind` enum, to differentiate
`nonloc::ConcreteInt` from `loc::ConcreteInt`, I had to prefix them with
`Loc` and `NonLoc` to resolve this ambiguity.
This should not surface in general, because I'm replacing the
`nonloc::Kind` enum items with `inline constexpr` global constants to
mimic the original behavior - and offer nicer spelling to these enum
values.

Some `SVal` constructors were not marked explicit, which I now mark as
such to follow best practices, and marked others as `/*implicit*/` to
clarify the intent.
During refactoring, I also found at least one function not marked
`LLVM_ATTRIBUTE_RETURNS_NONNULL`, so I did that.

The `TypeRetrievingVisitor` visitor had some accidental dead code,
namely: `VisitNonLocConcreteInt` and `VisitLocConcreteInt`.

Previously, the `SValVisitor` expected visit handlers of
`VisitNonLocXXXXX(nonloc::XXXXX)` and `VisitLocXXXXX(loc::XXXXX)`, where
I felt that envoding `NonLoc` and `Loc` in the name is not necessary as
the type of the parameter would select the right overload anyways, so I
simplified the naming of those visit functions.

The rest of the diff is a lot of times just formatting, because
`getKind()` by nature, frequently appears in switches, which means that
the whole switch gets automatically reformatted. I could probably undo
the formatting, but I didn't want to deviate from the rule unless
explicitly requested.
  • Loading branch information
steakhal committed Nov 4, 2023
1 parent b3eac1a commit bde5717
Show file tree
Hide file tree
Showing 9 changed files with 219 additions and 304 deletions.
10 changes: 5 additions & 5 deletions clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h
Original file line number Diff line number Diff line change
Expand Up @@ -65,7 +65,7 @@ class SValExplainer : public FullSValVisitor<SValExplainer, std::string> {
return "undefined value";
}

std::string VisitLocMemRegionVal(loc::MemRegionVal V) {
std::string VisitMemRegionVal(loc::MemRegionVal V) {
const MemRegion *R = V.getRegion();
// Avoid the weird "pointer to pointee of ...".
if (auto SR = dyn_cast<SymbolicRegion>(R)) {
Expand All @@ -76,19 +76,19 @@ class SValExplainer : public FullSValVisitor<SValExplainer, std::string> {
return "pointer to " + Visit(R);
}

std::string VisitLocConcreteInt(loc::ConcreteInt V) {
std::string VisitConcreteInt(loc::ConcreteInt V) {
const llvm::APSInt &I = V.getValue();
std::string Str;
llvm::raw_string_ostream OS(Str);
OS << "concrete memory address '" << I << "'";
return Str;
}

std::string VisitNonLocSymbolVal(nonloc::SymbolVal V) {
std::string VisitSymbolVal(nonloc::SymbolVal V) {
return Visit(V.getSymbol());
}

std::string VisitNonLocConcreteInt(nonloc::ConcreteInt V) {
std::string VisitConcreteInt(nonloc::ConcreteInt V) {
const llvm::APSInt &I = V.getValue();
std::string Str;
llvm::raw_string_ostream OS(Str);
Expand All @@ -97,7 +97,7 @@ class SValExplainer : public FullSValVisitor<SValExplainer, std::string> {
return Str;
}

std::string VisitNonLocLazyCompoundVal(nonloc::LazyCompoundVal V) {
std::string VisitLazyCompoundVal(nonloc::LazyCompoundVal V) {
return "lazily frozen compound value of " + Visit(V.getRegion());
}

Expand Down
55 changes: 23 additions & 32 deletions clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValVisitor.h
Original file line number Diff line number Diff line change
Expand Up @@ -14,9 +14,9 @@
#ifndef LLVM_CLANG_STATICANALYZER_CORE_PATHSENSITIVE_SVALVISITOR_H
#define LLVM_CLANG_STATICANALYZER_CORE_PATHSENSITIVE_SVALVISITOR_H

#include "clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/SVals.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h"

namespace clang {

Expand All @@ -25,49 +25,40 @@ namespace ento {
/// SValVisitor - this class implements a simple visitor for SVal
/// subclasses.
template <typename ImplClass, typename RetTy = void> class SValVisitor {
public:

#define DISPATCH(NAME, CLASS) \
return static_cast<ImplClass *>(this)->Visit ## NAME(V.castAs<CLASS>())
ImplClass &derived() { return *static_cast<ImplClass *>(this); }

public:
RetTy Visit(SVal V) {
// Dispatch to VisitFooVal for each FooVal.
// Take namespaces (loc:: and nonloc::) into account.
switch (V.getBaseKind()) {
#define BASIC_SVAL(Id, Parent) case SVal::Id ## Kind: DISPATCH(Id, Id);
switch (V.getKind()) {
#define BASIC_SVAL(Id, Parent) \
case SVal::Id##Kind: \
return derived().Visit##Id(V.castAs<Id>());
#define LOC_SVAL(Id, Parent) \
case SVal::Loc##Id##Kind: \
return derived().Visit##Id(V.castAs<loc::Id>());
#define NONLOC_SVAL(Id, Parent) \
case SVal::NonLoc##Id##Kind: \
return derived().Visit##Id(V.castAs<nonloc::Id>());
#include "clang/StaticAnalyzer/Core/PathSensitive/SVals.def"
case SVal::LocKind:
switch (V.getSubKind()) {
#define LOC_SVAL(Id, Parent) \
case loc::Id ## Kind: DISPATCH(Loc ## Id, loc :: Id);
#include "clang/StaticAnalyzer/Core/PathSensitive/SVals.def"
}
llvm_unreachable("Unknown Loc sub-kind!");
case SVal::NonLocKind:
switch (V.getSubKind()) {
#define NONLOC_SVAL(Id, Parent) \
case nonloc::Id ## Kind: DISPATCH(NonLoc ## Id, nonloc :: Id);
#include "clang/StaticAnalyzer/Core/PathSensitive/SVals.def"
}
llvm_unreachable("Unknown NonLoc sub-kind!");
}
llvm_unreachable("Unknown SVal kind!");
}

#define BASIC_SVAL(Id, Parent) \
RetTy Visit ## Id(Id V) { DISPATCH(Parent, Id); }
#define ABSTRACT_SVAL(Id, Parent) \
BASIC_SVAL(Id, Parent)
#define LOC_SVAL(Id, Parent) \
RetTy VisitLoc ## Id(loc::Id V) { DISPATCH(Parent, Parent); }
#define NONLOC_SVAL(Id, Parent) \
RetTy VisitNonLoc ## Id(nonloc::Id V) { DISPATCH(Parent, Parent); }
// Dispatch to the more generic handler as a default implementation.
#define BASIC_SVAL(Id, Parent) \
RetTy Visit##Id(Id V) { return derived().Visit##Parent(V.castAs<Id>()); }
#define ABSTRACT_SVAL(Id, Parent) BASIC_SVAL(Id, Parent)
#define LOC_SVAL(Id, Parent) \
RetTy Visit##Id(loc::Id V) { return derived().VisitLoc(V.castAs<Loc>()); }
#define NONLOC_SVAL(Id, Parent) \
RetTy Visit##Id(nonloc::Id V) { \
return derived().VisitNonLoc(V.castAs<NonLoc>()); \
}
#include "clang/StaticAnalyzer/Core/PathSensitive/SVals.def"

// Base case, ignore it. :)
RetTy VisitSVal(SVal V) { return RetTy(); }

#undef DISPATCH
};

/// SymExprVisitor - this class implements a simple visitor for SymExpr
Expand Down
38 changes: 18 additions & 20 deletions clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.def
Original file line number Diff line number Diff line change
Expand Up @@ -6,28 +6,24 @@
//
//===----------------------------------------------------------------------===//
//
// The list of symbolic values (SVal kinds and sub-kinds) used in the Static
// Analyzer. The distinction between loc:: and nonloc:: SVal namespaces is
// The list of symbolic values (SVal kinds) used in the Static Analyzer.
// The distinction between `loc::` and `nonloc::` SVal namespaces is
// currently hardcoded, because it is too peculiar and explicit to be handled
// uniformly. In order to use this information, users of this file must define
// one or more of the following macros:
//
// BASIC_SVAL(Id, Parent) - for specific SVal sub-kinds, which are
// neither in loc:: nor in nonloc:: namespace; these classes occupy
// their own base kind IdKind.
// BASIC_SVAL(Id, Parent) - for specific SVal kinds, which are
// neither in `loc::` nor in `nonloc::` namespace.
//
// ABSTRACT_SVAL(Id, Parent) - for abstract SVal classes which are
// neither in loc:: nor in nonloc:: namespace,
// neither in `loc::` nor in `nonloc::` namespace,
//
// ABSTRACT_SVAL_WITH_KIND(Id, Parent) - for SVal classes which are also
// neither in loc:: nor in nonloc:: namespace, but occupy a whole base kind
// identifier IdKind, much like BASIC_SVALs.
// LOC_SVAL(Id, Parent) - for values in `loc::` namespace.
//
// LOC_SVAL(Id, Parent) - for values in loc:: namespace, which occupy a sub-kind
// loc::IdKind.
// NONLOC_SVAL(Id, Parent) - for values in `nonloc::` namespace.
//
// NONLOC_SVAL(Id, Parent) - for values in nonloc:: namespace, which occupy a
// sub-kind nonloc::IdKind.
// SVAL_RANGE(Id, First, Last) - for defining range of subtypes of
// the abstract class `Id`.
//
//===----------------------------------------------------------------------===//

Expand All @@ -39,10 +35,6 @@
#define ABSTRACT_SVAL(Id, Parent)
#endif

#ifndef ABSTRACT_SVAL_WITH_KIND
#define ABSTRACT_SVAL_WITH_KIND(Id, Parent) ABSTRACT_SVAL(Id, Parent)
#endif

#ifndef LOC_SVAL
#define LOC_SVAL(Id, Parent)
#endif
Expand All @@ -51,24 +43,30 @@
#define NONLOC_SVAL(Id, Parent)
#endif

#ifndef SVAL_RANGE
#define SVAL_RANGE(Id, First, Last)
#endif

BASIC_SVAL(UndefinedVal, SVal)
ABSTRACT_SVAL(DefinedOrUnknownSVal, SVal)
BASIC_SVAL(UnknownVal, DefinedOrUnknownSVal)
ABSTRACT_SVAL(DefinedSVal, DefinedOrUnknownSVal)
ABSTRACT_SVAL_WITH_KIND(Loc, DefinedSVal)
ABSTRACT_SVAL(Loc, DefinedSVal)
LOC_SVAL(ConcreteInt, Loc)
LOC_SVAL(GotoLabel, Loc)
LOC_SVAL(MemRegionVal, Loc)
ABSTRACT_SVAL_WITH_KIND(NonLoc, DefinedSVal)
SVAL_RANGE(Loc, ConcreteInt, MemRegionVal)
ABSTRACT_SVAL(NonLoc, DefinedSVal)
NONLOC_SVAL(CompoundVal, NonLoc)
NONLOC_SVAL(ConcreteInt, NonLoc)
NONLOC_SVAL(LazyCompoundVal, NonLoc)
NONLOC_SVAL(LocAsInteger, NonLoc)
NONLOC_SVAL(SymbolVal, NonLoc)
NONLOC_SVAL(PointerToMember, NonLoc)
SVAL_RANGE(NonLoc, CompoundVal, PointerToMember)

#undef SVAL_RANGE
#undef NONLOC_SVAL
#undef LOC_SVAL
#undef ABSTRACT_SVAL_WITH_KIND
#undef ABSTRACT_SVAL
#undef BASIC_SVAL

0 comments on commit bde5717

Please sign in to comment.