Skip to content

Commit

Permalink
[analyzer][NFC] Move the data structures from CheckerRegistry to the …
Browse files Browse the repository at this point in the history
…Core library

If you were around the analyzer for a while now, you must've seen a lot of
patches that awkwardly puts code from one library to the other:

* D75360 moves the constructors of CheckerManager, which lies in the Core
  library, to the Frontend library. Most the patch itself was a struggle along
  the library lines.
* D78126 had to be reverted because dependency information would be utilized
  in the Core library, but the actual data lied in the frontend.
  D78126#inline-751477 touches on this issue as well.

This stems from the often mentioned problem: the Frontend library depends on
Core and Checkers, Checkers depends on Core. The checker registry functions
(`registerMallocChecker`, etc) lie in the Checkers library in order to keep each
checker its own module. What this implies is that checker registration cannot
take place in the Core, but the Core might still want to use the data that
results from it (which checker/package is enabled, dependencies, etc).

D54436 was the patch that initiated this. Back in the days when CheckerRegistry
was super dumb and buggy, it implemented a non-documented solution to this
problem by keeping the data in the Core, and leaving the logic in the Frontend.
At the time when the patch landed, the merger to the Frontend made sense,
because the data hadn't been utilized anywhere, and the whole workaround without
any documentation made little sense to me.

So, lets put the data back where it belongs, in the Core library. This patch
introduces `CheckerRegistryData`, and turns `CheckerRegistry` into a short lived
wrapper around this data that implements the logic of checker registration. The
data is tied to CheckerManager because it is required to parse it.

Side note: I can't help but cringe at the fact how ridiculously awkward the
library lines are. I feel like I'm thinking too much inside the box, but I guess
this is just the price of keeping the checkers so modularized.

Differential Revision: https://reviews.llvm.org/D82585
  • Loading branch information
Szelethus committed Jul 4, 2020
1 parent 4f2e7f6 commit b6cbe6c
Show file tree
Hide file tree
Showing 10 changed files with 577 additions and 503 deletions.
11 changes: 7 additions & 4 deletions clang/include/clang/StaticAnalyzer/Core/CheckerManager.h
Original file line number Diff line number Diff line change
Expand Up @@ -43,6 +43,7 @@ class CallEvent;
class CheckerBase;
class CheckerContext;
class CheckerRegistry;
struct CheckerRegistryData;
class ExplodedGraph;
class ExplodedNode;
class ExplodedNodeSet;
Expand Down Expand Up @@ -130,7 +131,7 @@ class CheckerManager {
const Preprocessor *PP = nullptr;
CheckerNameRef CurrentCheckerName;
DiagnosticsEngine &Diags;
std::unique_ptr<CheckerRegistry> Registry;
std::unique_ptr<CheckerRegistryData> RegistryData;

public:
// These constructors are defined in the Frontend library, because
Expand All @@ -152,8 +153,8 @@ class CheckerManager {
: CheckerManager(Context, AOptions, PP, {}, {}) {}

/// Constructs a CheckerManager without requiring an AST. No checker
/// registration will take place. Only useful for retrieving the
/// CheckerRegistry and print for help flags where the AST is unavalaible.
/// registration will take place. Only useful when one needs to print the
/// help flags through CheckerRegistryData, and the AST is unavalaible.
CheckerManager(AnalyzerOptions &AOptions, const LangOptions &LangOpts,
DiagnosticsEngine &Diags, ArrayRef<std::string> plugins);

Expand All @@ -172,7 +173,9 @@ class CheckerManager {
assert(PP);
return *PP;
}
const CheckerRegistry &getCheckerRegistry() const { return *Registry; }
const CheckerRegistryData &getCheckerRegistryData() const {
return *RegistryData;
}
DiagnosticsEngine &getDiagnostics() const { return Diags; }
ASTContext &getASTContext() const {
assert(Context);
Expand Down
226 changes: 226 additions & 0 deletions clang/include/clang/StaticAnalyzer/Core/CheckerRegistryData.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,226 @@
//===- CheckerRegistryData.h ------------------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file contains the data structures to which the TableGen file Checkers.td
// maps to, as well as what was parsed from the the specific invocation (whether
// a checker/package is enabled, their options values, etc).
//
// The parsing of the invocation is done by CheckerRegistry, which is found in
// the Frontend library. This allows the Core and Checkers libraries to utilize
// this information, such as enforcing rules on checker dependency bug emission,
// ensuring all checker options were queried, etc.
//
//===----------------------------------------------------------------------===//

#ifndef LLVM_CLANG_STATICANALYZER_CORE_CHECKERREGISTRYDATA_H
#define LLVM_CLANG_STATICANALYZER_CORE_CHECKERREGISTRYDATA_H

#include "clang/Basic/LLVM.h"
#include "llvm/ADT/SetVector.h"
#include "llvm/ADT/StringMap.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/raw_ostream.h"

namespace clang {

class AnalyzerOptions;

namespace ento {

class CheckerManager;

/// Initialization functions perform any necessary setup for a checker.
/// They should include a call to CheckerManager::registerChecker.
using RegisterCheckerFn = void (*)(CheckerManager &);
using ShouldRegisterFunction = bool (*)(const CheckerManager &);

/// Specifies a command line option. It may either belong to a checker or a
/// package.
struct CmdLineOption {
StringRef OptionType;
StringRef OptionName;
StringRef DefaultValStr;
StringRef Description;
StringRef DevelopmentStatus;
bool IsHidden;

CmdLineOption(StringRef OptionType, StringRef OptionName,
StringRef DefaultValStr, StringRef Description,
StringRef DevelopmentStatus, bool IsHidden)
: OptionType(OptionType), OptionName(OptionName),
DefaultValStr(DefaultValStr), Description(Description),
DevelopmentStatus(DevelopmentStatus), IsHidden(IsHidden) {

assert((OptionType == "bool" || OptionType == "string" ||
OptionType == "int") &&
"Unknown command line option type!");

assert((OptionType != "bool" ||
(DefaultValStr == "true" || DefaultValStr == "false")) &&
"Invalid value for boolean command line option! Maybe incorrect "
"parameters to the addCheckerOption or addPackageOption method?");

int Tmp;
assert((OptionType != "int" || !DefaultValStr.getAsInteger(0, Tmp)) &&
"Invalid value for integer command line option! Maybe incorrect "
"parameters to the addCheckerOption or addPackageOption method?");
(void)Tmp;

assert((DevelopmentStatus == "alpha" || DevelopmentStatus == "beta" ||
DevelopmentStatus == "released") &&
"Invalid development status!");
}

LLVM_DUMP_METHOD void dump() const { dumpToStream(llvm::errs()); }
LLVM_DUMP_METHOD void dumpToStream(llvm::raw_ostream &Out) const;
};

using CmdLineOptionList = llvm::SmallVector<CmdLineOption, 0>;

struct CheckerInfo;

using CheckerInfoList = std::vector<CheckerInfo>;
using CheckerInfoListRange = llvm::iterator_range<CheckerInfoList::iterator>;
using ConstCheckerInfoList = llvm::SmallVector<const CheckerInfo *, 0>;
using CheckerInfoSet = llvm::SetVector<const CheckerInfo *>;

/// Specifies a checker. Note that this isn't what we call a checker object,
/// it merely contains everything required to create one.
struct CheckerInfo {
enum class StateFromCmdLine {
// This checker wasn't explicitly enabled or disabled.
State_Unspecified,
// This checker was explicitly disabled.
State_Disabled,
// This checker was explicitly enabled.
State_Enabled
};

RegisterCheckerFn Initialize = nullptr;
ShouldRegisterFunction ShouldRegister = nullptr;
StringRef FullName;
StringRef Desc;
StringRef DocumentationUri;
CmdLineOptionList CmdLineOptions;
bool IsHidden = false;
StateFromCmdLine State = StateFromCmdLine::State_Unspecified;

ConstCheckerInfoList Dependencies;
ConstCheckerInfoList WeakDependencies;

bool isEnabled(const CheckerManager &mgr) const {
return State == StateFromCmdLine::State_Enabled && ShouldRegister(mgr);
}

bool isDisabled(const CheckerManager &mgr) const {
return State == StateFromCmdLine::State_Disabled || !ShouldRegister(mgr);
}

// Since each checker must have a different full name, we can identify
// CheckerInfo objects by them.
bool operator==(const CheckerInfo &Rhs) const {
return FullName == Rhs.FullName;
}

CheckerInfo(RegisterCheckerFn Fn, ShouldRegisterFunction sfn, StringRef Name,
StringRef Desc, StringRef DocsUri, bool IsHidden)
: Initialize(Fn), ShouldRegister(sfn), FullName(Name), Desc(Desc),
DocumentationUri(DocsUri), IsHidden(IsHidden) {}

// Used for lower_bound.
explicit CheckerInfo(StringRef FullName) : FullName(FullName) {}

LLVM_DUMP_METHOD void dump() const { dumpToStream(llvm::errs()); }
LLVM_DUMP_METHOD void dumpToStream(llvm::raw_ostream &Out) const;
};

using StateFromCmdLine = CheckerInfo::StateFromCmdLine;

/// Specifies a package. Each package option is implicitly an option for all
/// checkers within the package.
struct PackageInfo {
StringRef FullName;
CmdLineOptionList CmdLineOptions;

// Since each package must have a different full name, we can identify
// CheckerInfo objects by them.
bool operator==(const PackageInfo &Rhs) const {
return FullName == Rhs.FullName;
}

explicit PackageInfo(StringRef FullName) : FullName(FullName) {}

LLVM_DUMP_METHOD void dump() const { dumpToStream(llvm::errs()); }
LLVM_DUMP_METHOD void dumpToStream(llvm::raw_ostream &Out) const;
};

using PackageInfoList = llvm::SmallVector<PackageInfo, 0>;

namespace checker_registry {

template <class T> struct FullNameLT {
bool operator()(const T &Lhs, const T &Rhs) {
return Lhs.FullName < Rhs.FullName;
}
};

using PackageNameLT = FullNameLT<PackageInfo>;
using CheckerNameLT = FullNameLT<CheckerInfo>;

template <class CheckerOrPackageInfoList>
std::conditional_t<std::is_const<CheckerOrPackageInfoList>::value,
typename CheckerOrPackageInfoList::const_iterator,
typename CheckerOrPackageInfoList::iterator>
binaryFind(CheckerOrPackageInfoList &Collection, StringRef FullName) {

using CheckerOrPackage = typename CheckerOrPackageInfoList::value_type;
using CheckerOrPackageFullNameLT = FullNameLT<CheckerOrPackage>;

assert(llvm::is_sorted(Collection, CheckerOrPackageFullNameLT{}) &&
"In order to efficiently gather checkers/packages, this function "
"expects them to be already sorted!");

return llvm::lower_bound(Collection, CheckerOrPackage(FullName),
CheckerOrPackageFullNameLT{});
}
} // namespace checker_registry

struct CheckerRegistryData {
public:
CheckerInfoSet EnabledCheckers;

CheckerInfoList Checkers;
PackageInfoList Packages;
/// Used for counting how many checkers belong to a certain package in the
/// \c Checkers field. For convenience purposes.
llvm::StringMap<size_t> PackageSizes;

/// Contains all (FullName, CmdLineOption) pairs. Similarly to dependencies,
/// we only modify the actual CheckerInfo and PackageInfo objects once all
/// of them have been added.
llvm::SmallVector<std::pair<StringRef, CmdLineOption>, 0> PackageOptions;
llvm::SmallVector<std::pair<StringRef, CmdLineOption>, 0> CheckerOptions;

llvm::SmallVector<std::pair<StringRef, StringRef>, 0> Dependencies;
llvm::SmallVector<std::pair<StringRef, StringRef>, 0> WeakDependencies;

CheckerInfoListRange getMutableCheckersForCmdLineArg(StringRef CmdLineArg);

/// Prints the name and description of all checkers in this registry.
/// This output is not intended to be machine-parseable.
void printCheckerWithDescList(const AnalyzerOptions &AnOpts, raw_ostream &Out,
size_t MaxNameChars = 30) const;
void printEnabledCheckerList(raw_ostream &Out) const;
void printCheckerOptionList(const AnalyzerOptions &AnOpts,
raw_ostream &Out) const;
};

} // namespace ento
} // namespace clang

#endif // LLVM_CLANG_STATICANALYZER_CORE_CHECKERREGISTRYDATA_H
Loading

0 comments on commit b6cbe6c

Please sign in to comment.