Skip to content

Commit

Permalink
[analyzer] Add std::variant checker (#66481)
Browse files Browse the repository at this point in the history
As my BSc thesis I've implemented a checker for std::variant and
std::any, and in the following weeks I'll upload a revised version of
them here.

# Prelude

@Szelethus and I sent out an email with our initial plans here:
https://discourse.llvm.org/t/analyzer-new-checker-for-std-any-as-a-bsc-thesis/65613/2
We also created a stub checker patch here:
https://reviews.llvm.org/D142354.

Upon the recommendation of @haoNoQ , we explored an option where instead
of writing a checker, we tried to improve on how the analyzer natively
inlined the methods of std::variant and std::any. Our attempt is in this
patch https://reviews.llvm.org/D145069, but in a nutshell, this is what
happened: The analyzer was able to model much of what happened inside
those classes, but our false positive suppression machinery erroneously
suppressed it. After months of trying, we could not find a satisfying
enhancement on the heuristic without introducing an allowlist/denylist
of which functions to not suppress.

As a result (and partly on the encouragement of @Xazax-hun) I wrote a
dedicated checker!

The advantage of the checker is that it is not dependent on the
standard's implementation and won't put warnings in the standard library
definitions. Also without the checker it would be difficult to create
nice user-friendly warnings and NoteTags -- as per the standard's
specification, the analysis is sinked by an exception, which we don't
model well now.

# Design ideas

The working of the checker is straightforward: We find the creation of
an std::variant instance, store the type of the variable we want to
store in it, then save this type for the instance. When retrieving type
from the instance we check what type we want to retrieve as, and compare
it to the actual type. If the two don't march we emit an error.

Distinguishing variants by instance (e.g. MemRegion *) is not the most
optimal way. Other checkers, like MallocChecker uses a symbol-to-trait
map instead of region-to-trait. The upside of using symbols (which would
be the value of a variant, not the variant itself itself) is that the
analyzer would take care of modeling copies, moves, invalidation, etc,
out of the box. The problem is that for compound types, the analyzer
doesn't create a symbol as a result of a constructor call that is fit
for this job. MallocChecker in contrast manipulates simple pointers.

My colleges and I considered the option of making adjustments directly
to the memory model of the analyzer, but for the time being decided
against it, and go with the bit more cumbersome, but immediately viable
option of simply using MemRegions.

# Current state and review plan

This patch contains an already working checker that can find and report
certain variant/any misuses, but still lands it in alpha. I plan to
upload the rest of the checker in later patches.

The full checker is also able to "follow" the symbolic value held by the
std::variant and updates the program state whenever we assign the value
stored in the variant. I have also built a library that is meant to
model union-like types similar to variant, hence some functions being a
bit more multipurpose then is immediately needed.

I also intend to publish my std::any checker in a later commit.

---------

Co-authored-by: Gabor Spaits <gabor.spaits@ericsson.com>
Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>
  • Loading branch information
3 people committed Nov 21, 2023
1 parent 8336bfb commit 527fcb8
Show file tree
Hide file tree
Showing 9 changed files with 953 additions and 50 deletions.
4 changes: 4 additions & 0 deletions clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
Original file line number Diff line number Diff line change
Expand Up @@ -318,6 +318,10 @@ def C11LockChecker : Checker<"C11Lock">,
Dependencies<[PthreadLockBase]>,
Documentation<HasDocumentation>;

def StdVariantChecker : Checker<"StdVariant">,
HelpText<"Check for bad type access for std::variant.">,
Documentation<HasDocumentation>;

} // end "alpha.core"

//===----------------------------------------------------------------------===//
Expand Down
99 changes: 50 additions & 49 deletions clang/include/clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h
Original file line number Diff line number Diff line change
Expand Up @@ -78,7 +78,7 @@ enum CallEventKind {

class CallEvent;

template<typename T = CallEvent>
template <typename T = CallEvent>
class CallEventRef : public IntrusiveRefCntPtr<const T> {
public:
CallEventRef(const T *Call) : IntrusiveRefCntPtr<const T>(Call) {}
Expand All @@ -94,8 +94,7 @@ class CallEventRef : public IntrusiveRefCntPtr<const T> {

// Allow implicit conversions to a superclass type, since CallEventRef
// behaves like a pointer-to-const.
template <typename SuperT>
operator CallEventRef<SuperT> () const {
template <typename SuperT> operator CallEventRef<SuperT>() const {
return this->get();
}
};
Expand Down Expand Up @@ -124,9 +123,9 @@ class RuntimeDefinition {

public:
RuntimeDefinition() = default;
RuntimeDefinition(const Decl *InD): D(InD) {}
RuntimeDefinition(const Decl *InD) : D(InD) {}
RuntimeDefinition(const Decl *InD, bool Foreign) : D(InD), Foreign(Foreign) {}
RuntimeDefinition(const Decl *InD, const MemRegion *InR): D(InD), R(InR) {}
RuntimeDefinition(const Decl *InD, const MemRegion *InR) : D(InD), R(InR) {}

const Decl *getDecl() { return D; }
bool isForeign() const { return Foreign; }
Expand Down Expand Up @@ -207,8 +206,9 @@ class CallEvent {

/// Used to specify non-argument regions that will be invalidated as a
/// result of this call.
virtual void getExtraInvalidatedValues(ValueList &Values,
RegionAndSymbolInvalidationTraits *ETraits) const {}
virtual void
getExtraInvalidatedValues(ValueList &Values,
RegionAndSymbolInvalidationTraits *ETraits) const {}

public:
CallEvent &operator=(const CallEvent &) = delete;
Expand All @@ -231,14 +231,10 @@ class CallEvent {
void setForeign(bool B) const { Foreign = B; }

/// The state in which the call is being evaluated.
const ProgramStateRef &getState() const {
return State;
}
const ProgramStateRef &getState() const { return State; }

/// The context in which the call is being evaluated.
const LocationContext *getLocationContext() const {
return LCtx;
}
const LocationContext *getLocationContext() const { return LCtx; }

const CFGBlock::ConstCFGElementRef &getCFGElementRef() const {
return ElemRef;
Expand Down Expand Up @@ -270,7 +266,7 @@ class CallEvent {
SourceLocation Loc = D->getLocation();
if (Loc.isValid()) {
const SourceManager &SM =
getState()->getStateManager().getContext().getSourceManager();
getState()->getStateManager().getContext().getSourceManager();
return SM.isInSystemHeader(D->getLocation());
}

Expand Down Expand Up @@ -324,9 +320,7 @@ class CallEvent {
// NOTE: The exact semantics of this are still being defined!
// We don't really want a list of hardcoded exceptions in the long run,
// but we don't want duplicated lists of known APIs in the short term either.
virtual bool argumentsMayEscape() const {
return hasNonZeroCallbackArg();
}
virtual bool argumentsMayEscape() const { return hasNonZeroCallbackArg(); }

/// Returns true if the callee is an externally-visible function in the
/// top-level namespace, such as \c malloc.
Expand Down Expand Up @@ -456,6 +450,14 @@ class CallEvent {
/// can be retrieved from its construction context.
std::optional<SVal> getReturnValueUnderConstruction() const;

// Returns the CallEvent representing the caller of this function
const CallEventRef<> getCaller() const;

// Returns true if the function was called from a standard library function.
// If not or could not get the caller (it may be a top level function)
// returns false.
bool isCalledFromSystemHeader() const;

// Iterator access to formal parameters and their types.
private:
struct GetTypeFn {
Expand Down Expand Up @@ -579,8 +581,9 @@ class BlockCall : public CallEvent {

void cloneTo(void *Dest) const override { new (Dest) BlockCall(*this); }

void getExtraInvalidatedValues(ValueList &Values,
RegionAndSymbolInvalidationTraits *ETraits) const override;
void getExtraInvalidatedValues(
ValueList &Values,
RegionAndSymbolInvalidationTraits *ETraits) const override;

public:
const CallExpr *getOriginExpr() const override {
Expand Down Expand Up @@ -650,14 +653,12 @@ class BlockCall : public CallEvent {
// the block body and analyze the operator() method on the captured lambda.
const VarDecl *LambdaVD = getRegionStoringCapturedLambda()->getDecl();
const CXXRecordDecl *LambdaDecl = LambdaVD->getType()->getAsCXXRecordDecl();
CXXMethodDecl* LambdaCallOperator = LambdaDecl->getLambdaCallOperator();
CXXMethodDecl *LambdaCallOperator = LambdaDecl->getLambdaCallOperator();

return RuntimeDefinition(LambdaCallOperator);
}

bool argumentsMayEscape() const override {
return true;
}
bool argumentsMayEscape() const override { return true; }

void getInitialStackFrameContents(const StackFrameContext *CalleeCtx,
BindingsTy &Bindings) const override;
Expand All @@ -684,8 +685,9 @@ class CXXInstanceCall : public AnyFunctionCall {
: AnyFunctionCall(D, St, LCtx, ElemRef) {}
CXXInstanceCall(const CXXInstanceCall &Other) = default;

void getExtraInvalidatedValues(ValueList &Values,
RegionAndSymbolInvalidationTraits *ETraits) const override;
void getExtraInvalidatedValues(
ValueList &Values,
RegionAndSymbolInvalidationTraits *ETraits) const override;

public:
/// Returns the expression representing the implicit 'this' object.
Expand Down Expand Up @@ -843,7 +845,9 @@ class CXXDestructorCall : public CXXInstanceCall {

CXXDestructorCall(const CXXDestructorCall &Other) = default;

void cloneTo(void *Dest) const override {new (Dest) CXXDestructorCall(*this);}
void cloneTo(void *Dest) const override {
new (Dest) CXXDestructorCall(*this);
}

public:
SourceRange getSourceRange() const override { return Location; }
Expand Down Expand Up @@ -880,8 +884,9 @@ class AnyCXXConstructorCall : public AnyFunctionCall {
Data = Target;
}

void getExtraInvalidatedValues(ValueList &Values,
RegionAndSymbolInvalidationTraits *ETraits) const override;
void getExtraInvalidatedValues(
ValueList &Values,
RegionAndSymbolInvalidationTraits *ETraits) const override;

void getInitialStackFrameContents(const StackFrameContext *CalleeCtx,
BindingsTy &Bindings) const override;
Expand Down Expand Up @@ -921,7 +926,9 @@ class CXXConstructorCall : public AnyCXXConstructorCall {

CXXConstructorCall(const CXXConstructorCall &Other) = default;

void cloneTo(void *Dest) const override { new (Dest) CXXConstructorCall(*this); }
void cloneTo(void *Dest) const override {
new (Dest) CXXConstructorCall(*this);
}

public:
const CXXConstructExpr *getOriginExpr() const override {
Expand Down Expand Up @@ -1040,7 +1047,9 @@ class CXXAllocatorCall : public AnyFunctionCall {
: AnyFunctionCall(E, St, LCtx, ElemRef) {}
CXXAllocatorCall(const CXXAllocatorCall &Other) = default;

void cloneTo(void *Dest) const override { new (Dest) CXXAllocatorCall(*this); }
void cloneTo(void *Dest) const override {
new (Dest) CXXAllocatorCall(*this);
}

public:
const CXXNewExpr *getOriginExpr() const override {
Expand Down Expand Up @@ -1154,11 +1163,7 @@ class CXXDeallocatorCall : public AnyFunctionCall {
//
// Note to maintainers: OCM_Message should always be last, since it does not
// need to fit in the Data field's low bits.
enum ObjCMessageKind {
OCM_PropertyAccess,
OCM_Subscript,
OCM_Message
};
enum ObjCMessageKind { OCM_PropertyAccess, OCM_Subscript, OCM_Message };

/// Represents any expression that calls an Objective-C method.
///
Expand All @@ -1180,8 +1185,9 @@ class ObjCMethodCall : public CallEvent {

void cloneTo(void *Dest) const override { new (Dest) ObjCMethodCall(*this); }

void getExtraInvalidatedValues(ValueList &Values,
RegionAndSymbolInvalidationTraits *ETraits) const override;
void getExtraInvalidatedValues(
ValueList &Values,
RegionAndSymbolInvalidationTraits *ETraits) const override;

/// Check if the selector may have multiple definitions (may have overrides).
virtual bool canBeOverridenInSubclass(ObjCInterfaceDecl *IDecl,
Expand All @@ -1196,9 +1202,7 @@ class ObjCMethodCall : public CallEvent {
return getOriginExpr()->getMethodDecl();
}

unsigned getNumArgs() const override {
return getOriginExpr()->getNumArgs();
}
unsigned getNumArgs() const override { return getOriginExpr()->getNumArgs(); }

const Expr *getArgExpr(unsigned Index) const override {
return getOriginExpr()->getArg(Index);
Expand All @@ -1212,9 +1216,7 @@ class ObjCMethodCall : public CallEvent {
return getOriginExpr()->getMethodFamily();
}

Selector getSelector() const {
return getOriginExpr()->getSelector();
}
Selector getSelector() const { return getOriginExpr()->getSelector(); }

SourceRange getSourceRange() const override;

Expand Down Expand Up @@ -1262,7 +1264,7 @@ class ObjCMethodCall : public CallEvent {
void getInitialStackFrameContents(const StackFrameContext *CalleeCtx,
BindingsTy &Bindings) const override;

ArrayRef<ParmVarDecl*> parameters() const override;
ArrayRef<ParmVarDecl *> parameters() const override;

Kind getKind() const override { return CE_ObjCMessage; }
StringRef getKindAsString() const override { return "ObjCMethodCall"; }
Expand Down Expand Up @@ -1336,8 +1338,8 @@ class CallEventManager {
CallEventManager(llvm::BumpPtrAllocator &alloc) : Alloc(alloc) {}

/// Gets an outside caller given a callee context.
CallEventRef<>
getCaller(const StackFrameContext *CalleeCtx, ProgramStateRef State);
CallEventRef<> getCaller(const StackFrameContext *CalleeCtx,
ProgramStateRef State);

/// Gets a call event for a function call, Objective-C method call,
/// a 'new', or a 'delete' call.
Expand Down Expand Up @@ -1433,11 +1435,10 @@ inline void CallEvent::Release() const {
namespace llvm {

// Support isa<>, cast<>, and dyn_cast<> for CallEventRef.
template<class T> struct simplify_type< clang::ento::CallEventRef<T>> {
template <class T> struct simplify_type<clang::ento::CallEventRef<T>> {
using SimpleType = const T *;

static SimpleType
getSimplifiedValue(clang::ento::CallEventRef<T> Val) {
static SimpleType getSimplifiedValue(clang::ento::CallEventRef<T> Val) {
return Val.get();
}
};
Expand Down
1 change: 1 addition & 0 deletions clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -108,6 +108,7 @@ add_clang_library(clangStaticAnalyzerCheckers
SmartPtrModeling.cpp
StackAddrEscapeChecker.cpp
StdLibraryFunctionsChecker.cpp
StdVariantChecker.cpp
STLAlgorithmModeling.cpp
StreamChecker.cpp
StringChecker.cpp
Expand Down
Loading

0 comments on commit 527fcb8

Please sign in to comment.