Skip to content

Commit

Permalink
[clang-tidy] Add support for determining constness of more expression…
Browse files Browse the repository at this point in the history
…s. (#82617)

This uses a more systematic approach for determining whcich
`DeclRefExpr`s mutate the underlying object: Instead of using a few
matchers, we walk up the AST until we find a parent that we can prove
cannot change the underlying object.

This allows us to handle most address taking and dereference, bindings
to value and const& variables, and track constness of pointee (see
changes in DeclRefExprUtilsTest.cpp).

This allows supporting more patterns in
`performance-unnecessary-copy-initialization`.

Those two patterns are relatively common:

```
const auto e = (*vector_ptr)[i]
```

and

```
const auto e = vector_ptr->at(i);
```

In our codebase, we have around 25% additional findings from
`performance-unnecessary-copy-initialization` with this change. I did
not see any additional false positives.
  • Loading branch information
legrosbuffle committed Feb 26, 2024
1 parent 56b63e0 commit 94ca854
Show file tree
Hide file tree
Showing 6 changed files with 394 additions and 183 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -86,16 +86,22 @@ AST_MATCHER_FUNCTION_P(StatementMatcher, isConstRefReturningMethodCall,
const auto MethodDecl =
cxxMethodDecl(returns(hasCanonicalType(matchers::isReferenceToConst())))
.bind(MethodDeclId);
const auto ReceiverExpr = declRefExpr(to(varDecl().bind(ObjectArgId)));
const auto ReceiverExpr =
ignoringParenImpCasts(declRefExpr(to(varDecl().bind(ObjectArgId))));
const auto OnExpr = anyOf(
// Direct reference to `*this`: `a.f()` or `a->f()`.
ReceiverExpr,
// Access through dereference, typically used for `operator[]`: `(*a)[3]`.
unaryOperator(hasOperatorName("*"), hasUnaryOperand(ReceiverExpr)));
const auto ReceiverType =
hasCanonicalType(recordType(hasDeclaration(namedDecl(
unless(matchers::matchesAnyListedName(ExcludedContainerTypes))))));

return expr(anyOf(
cxxMemberCallExpr(callee(MethodDecl), on(ReceiverExpr),
thisPointerType(ReceiverType)),
cxxOperatorCallExpr(callee(MethodDecl), hasArgument(0, ReceiverExpr),
hasArgument(0, hasType(ReceiverType)))));
return expr(
anyOf(cxxMemberCallExpr(callee(MethodDecl), on(OnExpr),
thisPointerType(ReceiverType)),
cxxOperatorCallExpr(callee(MethodDecl), hasArgument(0, OnExpr),
hasArgument(0, hasType(ReceiverType)))));
}

AST_MATCHER_FUNCTION(StatementMatcher, isConstRefReturningFunctionCall) {
Expand Down Expand Up @@ -136,10 +142,11 @@ AST_MATCHER_FUNCTION_P(StatementMatcher, initializerReturnsReferenceToConst,
static bool isInitializingVariableImmutable(
const VarDecl &InitializingVar, const Stmt &BlockStmt, ASTContext &Context,
const std::vector<StringRef> &ExcludedContainerTypes) {
if (!isOnlyUsedAsConst(InitializingVar, BlockStmt, Context))
QualType T = InitializingVar.getType().getCanonicalType();
if (!isOnlyUsedAsConst(InitializingVar, BlockStmt, Context,
T->isPointerType() ? 1 : 0))
return false;

QualType T = InitializingVar.getType().getCanonicalType();
// The variable is a value type and we know it is only used as const. Safe
// to reference it and avoid the copy.
if (!isa<ReferenceType, PointerType>(T))
Expand Down Expand Up @@ -273,7 +280,9 @@ void UnnecessaryCopyInitialization::check(
VarDeclStmt.isSingleDecl() && !NewVar.getLocation().isMacroID();
const bool IsVarUnused = isVariableUnused(NewVar, BlockStmt, *Result.Context);
const bool IsVarOnlyUsedAsConst =
isOnlyUsedAsConst(NewVar, BlockStmt, *Result.Context);
isOnlyUsedAsConst(NewVar, BlockStmt, *Result.Context,
// `NewVar` is always of non-pointer type.
0);
const CheckContext Context{
NewVar, BlockStmt, VarDeclStmt, *Result.Context,
IssueFix, IsVarUnused, IsVarOnlyUsedAsConst};
Expand Down
210 changes: 163 additions & 47 deletions clang-tools-extra/clang-tidy/utils/DeclRefExprUtils.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,9 @@
#include "Matchers.h"
#include "clang/AST/ASTContext.h"
#include "clang/AST/DeclCXX.h"
#include "clang/AST/ExprCXX.h"
#include "clang/ASTMatchers/ASTMatchFinder.h"
#include <cassert>

namespace clang::tidy::utils::decl_ref_expr {

Expand All @@ -34,69 +36,183 @@ void extractNodesByIdTo(ArrayRef<BoundNodes> Matches, StringRef ID,
Nodes.insert(Match.getNodeAs<Node>(ID));
}

// A matcher that matches DeclRefExprs that are used in ways such that the
// underlying declaration is not modified.
// If the declaration is of pointer type, `Indirections` specifies the level
// of indirection of the object whose mutations we are tracking.
//
// For example, given:
// ```
// int i;
// int* p;
// p = &i; // (A)
// *p = 3; // (B)
// ```
//
// `declRefExpr(to(varDecl(hasName("p"))), doesNotMutateObject(0))` matches
// (B), but `declRefExpr(to(varDecl(hasName("p"))), doesNotMutateObject(1))`
// matches (A).
//
AST_MATCHER_P(DeclRefExpr, doesNotMutateObject, int, Indirections) {
// We walk up the parents of the DeclRefExpr recursively until we end up on a
// parent that cannot modify the underlying object. There are a few kinds of
// expressions:
// - Those that cannot be used to mutate the underlying object. We can stop
// recursion there.
// - Those that can be used to mutate the underlying object in analyzable
// ways (such as taking the address or accessing a subobject). We have to
// examine the parents.
// - Those that we don't know how to analyze. In that case we stop there and
// we assume that they can mutate the underlying expression.

struct StackEntry {
StackEntry(const Expr *E, int Indirections)
: E(E), Indirections(Indirections) {}
// The expression to analyze.
const Expr *E;
// The number of pointer indirections of the object being tracked (how
// many times an address was taken).
int Indirections;
};

llvm::SmallVector<StackEntry, 4> Stack;
Stack.emplace_back(&Node, Indirections);
ASTContext &Ctx = Finder->getASTContext();

while (!Stack.empty()) {
const StackEntry Entry = Stack.back();
Stack.pop_back();

// If the expression type is const-qualified at the appropriate indirection
// level then we can not mutate the object.
QualType Ty = Entry.E->getType().getCanonicalType();
for (int I = 0; I < Entry.Indirections; ++I) {
assert(Ty->isPointerType());
Ty = Ty->getPointeeType().getCanonicalType();
}
if (Ty.isConstQualified())
continue;

// Otherwise we have to look at the parents to see how the expression is
// used.
const DynTypedNodeList Parents = Ctx.getParents(*Entry.E);
// Note: most nodes have a single parents, but there exist nodes that have
// several parents, such as `InitListExpr` that have semantic and syntactic
// forms.
for (const auto &Parent : Parents) {
if (Parent.get<CompoundStmt>()) {
// Unused block-scope statement.
continue;
}
const Expr *const P = Parent.get<Expr>();
if (P == nullptr) {
// `Parent` is not an expr (e.g. a `VarDecl`).
// The case of binding to a `const&` or `const*` variable is handled by
// the fact that there is going to be a `NoOp` cast to const below the
// `VarDecl`, so we're not even going to get there.
// The case of copying into a value-typed variable is handled by the
// rvalue cast.
// This triggers only when binding to a mutable reference/ptr variable.
// FIXME: When we take a mutable reference we could keep checking the
// new variable for const usage only.
return false;
}
// Cosmetic nodes.
if (isa<ParenExpr>(P) || isa<MaterializeTemporaryExpr>(P)) {
Stack.emplace_back(P, Entry.Indirections);
continue;
}
if (const auto *const Cast = dyn_cast<CastExpr>(P)) {
switch (Cast->getCastKind()) {
// NoOp casts are used to add `const`. We'll check whether adding that
// const prevents modification when we process the cast.
case CK_NoOp:
// These do nothing w.r.t. to mutability.
case CK_BaseToDerived:
case CK_DerivedToBase:
case CK_UncheckedDerivedToBase:
case CK_Dynamic:
case CK_BaseToDerivedMemberPointer:
case CK_DerivedToBaseMemberPointer:
Stack.emplace_back(Cast, Entry.Indirections);
continue;
case CK_ToVoid:
case CK_PointerToBoolean:
// These do not mutate the underlying variable.
continue;
case CK_LValueToRValue: {
// An rvalue is immutable.
if (Entry.Indirections == 0)
continue;
Stack.emplace_back(Cast, Entry.Indirections);
continue;
}
default:
// Bail out on casts that we cannot analyze.
return false;
}
}
if (const auto *const Member = dyn_cast<MemberExpr>(P)) {
if (const auto *const Method =
dyn_cast<CXXMethodDecl>(Member->getMemberDecl())) {
if (!Method->isConst()) {
// The method can mutate our variable.
return false;
}
continue;
}
Stack.emplace_back(Member, 0);
continue;
}
if (const auto *const Op = dyn_cast<UnaryOperator>(P)) {
switch (Op->getOpcode()) {
case UO_AddrOf:
Stack.emplace_back(Op, Entry.Indirections + 1);
continue;
case UO_Deref:
assert(Entry.Indirections > 0);
Stack.emplace_back(Op, Entry.Indirections - 1);
continue;
default:
// Bail out on unary operators that we cannot analyze.
return false;
}
}

// Assume any other expression can modify the underlying variable.
return false;
}
}

// No parent can modify the variable.
return true;
}

} // namespace

// Finds all DeclRefExprs where a const method is called on VarDecl or VarDecl
// is the a const reference or value argument to a CallExpr or CXXConstructExpr.
SmallPtrSet<const DeclRefExpr *, 16>
constReferenceDeclRefExprs(const VarDecl &VarDecl, const Stmt &Stmt,
ASTContext &Context) {
auto DeclRefToVar =
declRefExpr(to(varDecl(equalsNode(&VarDecl)))).bind("declRef");
auto MemberExprOfVar = memberExpr(hasObjectExpression(DeclRefToVar));
auto DeclRefToVarOrMemberExprOfVar =
stmt(anyOf(DeclRefToVar, MemberExprOfVar));
auto ConstMethodCallee = callee(cxxMethodDecl(isConst()));
// Match method call expressions where the variable is referenced as the this
// implicit object argument and operator call expression for member operators
// where the variable is the 0-th argument.
auto Matches = match(
findAll(expr(anyOf(
cxxMemberCallExpr(ConstMethodCallee,
on(DeclRefToVarOrMemberExprOfVar)),
cxxOperatorCallExpr(ConstMethodCallee,
hasArgument(0, DeclRefToVarOrMemberExprOfVar))))),
Stmt, Context);
ASTContext &Context, int Indirections) {
auto Matches = match(findAll(declRefExpr(to(varDecl(equalsNode(&VarDecl))),
doesNotMutateObject(Indirections))
.bind("declRef")),
Stmt, Context);
SmallPtrSet<const DeclRefExpr *, 16> DeclRefs;
extractNodesByIdTo(Matches, "declRef", DeclRefs);
auto ConstReferenceOrValue =
qualType(anyOf(matchers::isReferenceToConst(),
unless(anyOf(referenceType(), pointerType(),
substTemplateTypeParmType()))));
auto ConstReferenceOrValueOrReplaced = qualType(anyOf(
ConstReferenceOrValue,
substTemplateTypeParmType(hasReplacementType(ConstReferenceOrValue))));
auto UsedAsConstRefOrValueArg = forEachArgumentWithParam(
DeclRefToVarOrMemberExprOfVar,
parmVarDecl(hasType(ConstReferenceOrValueOrReplaced)));
Matches = match(findAll(invocation(UsedAsConstRefOrValueArg)), Stmt, Context);
extractNodesByIdTo(Matches, "declRef", DeclRefs);
// References and pointers to const assignments.
Matches = match(
findAll(declStmt(has(varDecl(
hasType(qualType(matchers::isReferenceToConst())),
hasInitializer(ignoringImpCasts(DeclRefToVarOrMemberExprOfVar)))))),
Stmt, Context);
extractNodesByIdTo(Matches, "declRef", DeclRefs);
Matches = match(findAll(declStmt(has(varDecl(
hasType(qualType(matchers::isPointerToConst())),
hasInitializer(ignoringImpCasts(unaryOperator(
hasOperatorName("&"),
hasUnaryOperand(DeclRefToVarOrMemberExprOfVar)))))))),
Stmt, Context);
extractNodesByIdTo(Matches, "declRef", DeclRefs);

return DeclRefs;
}

bool isOnlyUsedAsConst(const VarDecl &Var, const Stmt &Stmt,
ASTContext &Context) {
ASTContext &Context, int Indirections) {
// Collect all DeclRefExprs to the loop variable and all CallExprs and
// CXXConstructExprs where the loop variable is used as argument to a const
// reference parameter.
// If the difference is empty it is safe for the loop variable to be a const
// reference.
auto AllDeclRefs = allDeclRefExprs(Var, Stmt, Context);
auto ConstReferenceDeclRefs = constReferenceDeclRefExprs(Var, Stmt, Context);
auto ConstReferenceDeclRefs =
constReferenceDeclRefExprs(Var, Stmt, Context, Indirections);
return isSetDifferenceEmpty(AllDeclRefs, ConstReferenceDeclRefs);
}

Expand Down
33 changes: 23 additions & 10 deletions clang-tools-extra/clang-tidy/utils/DeclRefExprUtils.h
Original file line number Diff line number Diff line change
Expand Up @@ -15,15 +15,6 @@

namespace clang::tidy::utils::decl_ref_expr {

/// Returns true if all ``DeclRefExpr`` to the variable within ``Stmt``
/// do not modify it.
///
/// Returns ``true`` if only const methods or operators are called on the
/// variable or the variable is a const reference or value argument to a
/// ``callExpr()``.
bool isOnlyUsedAsConst(const VarDecl &Var, const Stmt &Stmt,
ASTContext &Context);

/// Returns set of all ``DeclRefExprs`` to ``VarDecl`` within ``Stmt``.
llvm::SmallPtrSet<const DeclRefExpr *, 16>
allDeclRefExprs(const VarDecl &VarDecl, const Stmt &Stmt, ASTContext &Context);
Expand All @@ -34,9 +25,31 @@ allDeclRefExprs(const VarDecl &VarDecl, const Decl &Decl, ASTContext &Context);

/// Returns set of all ``DeclRefExprs`` to ``VarDecl`` within ``Stmt`` where
/// ``VarDecl`` is guaranteed to be accessed in a const fashion.
///
/// If ``VarDecl`` is of pointer type, ``Indirections`` specifies the level
/// of indirection of the object whose mutations we are tracking.
///
/// For example, given:
/// ```
/// int i;
/// int* p;
/// p = &i; // (A)
/// *p = 3; // (B)
/// ```
///
/// - `constReferenceDeclRefExprs(P, Stmt, Context, 0)` returns the reference
// to `p` in (B): the pointee is modified, but the pointer is not;
/// - `constReferenceDeclRefExprs(P, Stmt, Context, 1)` returns the reference
// to `p` in (A): the pointee is modified, but the pointer is not;
llvm::SmallPtrSet<const DeclRefExpr *, 16>
constReferenceDeclRefExprs(const VarDecl &VarDecl, const Stmt &Stmt,
ASTContext &Context);
ASTContext &Context, int Indirections);

/// Returns true if all ``DeclRefExpr`` to the variable within ``Stmt``
/// do not modify it.
/// See `constReferenceDeclRefExprs` for the meaning of ``Indirections``.
bool isOnlyUsedAsConst(const VarDecl &Var, const Stmt &Stmt,
ASTContext &Context, int Indirections);

/// Returns ``true`` if ``DeclRefExpr`` is the argument of a copy-constructor
/// call expression within ``Decl``.
Expand Down
6 changes: 6 additions & 0 deletions clang-tools-extra/docs/ReleaseNotes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -183,6 +183,12 @@ Changes in existing checks
<clang-tidy/checks/modernize/use-override>` check to also remove any trailing
whitespace when deleting the ``virtual`` keyword.

- Improved :doc:`performance-unnecessary-copy-initialization
<clang-tidy/checks/performance/unnecessary-copy-initialization>` check by
detecting more cases of constant access. In particular, pointers can be
analyzed, se the check now handles the common patterns
`const auto e = (*vector_ptr)[i]` and `const auto e = vector_ptr->at(i);`.

- Improved :doc:`readability-implicit-bool-conversion
<clang-tidy/checks/readability/implicit-bool-conversion>` check to provide
valid fix suggestions for ``static_cast`` without a preceding space and
Expand Down

0 comments on commit 94ca854

Please sign in to comment.