Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[clang-tidy] support query based custom check #131804

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

HerrCai0907
Copy link
Contributor

@HerrCai0907 HerrCai0907 commented Mar 18, 2025

For better discussion, I summary all divergence here:

  • configuration format (flattening or structured)
  • diagnostic level (whether we need ERROR)

Inherit from #123734
RFC: https://discourse.llvm.org/t/support-query-based-clang-tidy-external-check/85331

this patch introduce query based custom check by CustomChecks options.

The major improvement compared with #123734:

  1. don't need to define config yaml file in command line
  2. all behavior including InheritFromParantConfig is same as other config.
  3. change configuration schema from KV structured to List structured to make extend new function easier.
  4. Split bind string and diag message to two field to give more freedom to design query.

example:

Checks: -*,custom-call-main-function
CustomChecks:
  - Name: call-main-function
    Query: |
        match callExpr(
          callee(
            functionDecl(isMain()).bind("fn")
          )
        ).bind("callee")
    Diagnostic:
      - BindName: fn
        Message: main function.
        Level: Note
      - BindName: callee
        Message: call to main function.
        Level: Warning
int main(); // note: main function.

void bar() {
  main(); // warning: call to main function. [custom-call-main-function]
}

To make this PR don't do too much things that hard to review, here are some possible features not included in this PR

  • support language
  • support traverse
  • easier used template string in diagnostics message

Copy link
Contributor Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

@HerrCai0907 HerrCai0907 marked this pull request as ready for review March 18, 2025 13:56
@HerrCai0907 HerrCai0907 changed the title origin pr [clang-tidy] support query based custom check Mar 18, 2025
@llvmbot
Copy link
Member

llvmbot commented Mar 18, 2025

@llvm/pr-subscribers-clang-tidy

@llvm/pr-subscribers-clang-tools-extra

Author: Congcong Cai (HerrCai0907)

Changes

Patch is 31.84 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/131804.diff

25 Files Affected:

  • (modified) clang-tools-extra/clang-tidy/CMakeLists.txt (+2)
  • (modified) clang-tools-extra/clang-tidy/ClangTidy.cpp (+8-1)
  • (modified) clang-tools-extra/clang-tidy/ClangTidyForceLinker.h (+5)
  • (modified) clang-tools-extra/clang-tidy/ClangTidyModule.h (+2)
  • (modified) clang-tools-extra/clang-tidy/ClangTidyOptions.cpp (+53-3)
  • (modified) clang-tools-extra/clang-tidy/ClangTidyOptions.h (+15)
  • (added) clang-tools-extra/clang-tidy/custom/CMakeLists.txt (+20)
  • (added) clang-tools-extra/clang-tidy/custom/CustomTidyModule.cpp (+50)
  • (added) clang-tools-extra/clang-tidy/custom/QueryCheck.cpp (+102)
  • (added) clang-tools-extra/clang-tidy/custom/QueryCheck.h (+42)
  • (modified) clang-tools-extra/clang-tidy/tool/ClangTidyMain.cpp (+2)
  • (modified) clang-tools-extra/docs/ReleaseNotes.rst (+3)
  • (added) clang-tools-extra/docs/clang-tidy/QueryBasedCustomChecks.rst (+57)
  • (modified) clang-tools-extra/docs/clang-tidy/index.rst (+3)
  • (added) clang-tools-extra/test/clang-tidy/checkers/custom/Inputs/clang-tidy.yml (+22)
  • (added) clang-tools-extra/test/clang-tidy/checkers/custom/Inputs/incorrect-clang-tidy.yml (+10)
  • (added) clang-tools-extra/test/clang-tidy/checkers/custom/query-incorrect-query.cpp (+3)
  • (added) clang-tools-extra/test/clang-tidy/checkers/custom/query-partially-active-check.cpp (+5)
  • (added) clang-tools-extra/test/clang-tidy/checkers/custom/query.cpp (+7)
  • (added) clang-tools-extra/test/clang-tidy/infrastructure/Inputs/custom-query-check/append-clang-tidy.yml (+8)
  • (added) clang-tools-extra/test/clang-tidy/infrastructure/Inputs/custom-query-check/empty-clang-tidy.yml (+1)
  • (added) clang-tools-extra/test/clang-tidy/infrastructure/Inputs/custom-query-check/override-clang-tidy.yml (+11)
  • (added) clang-tools-extra/test/clang-tidy/infrastructure/Inputs/custom-query-check/root-clang-tidy.yml (+11)
  • (added) clang-tools-extra/test/clang-tidy/infrastructure/Inputs/custom-query-check/vfsoverlay.yaml (+44)
  • (added) clang-tools-extra/test/clang-tidy/infrastructure/custom-query-check.cpp (+45)
diff --git a/clang-tools-extra/clang-tidy/CMakeLists.txt b/clang-tools-extra/clang-tidy/CMakeLists.txt
index 93117cf1d6373..90efd2ef1f451 100644
--- a/clang-tools-extra/clang-tidy/CMakeLists.txt
+++ b/clang-tools-extra/clang-tidy/CMakeLists.txt
@@ -58,6 +58,7 @@ add_subdirectory(bugprone)
 add_subdirectory(cert)
 add_subdirectory(concurrency)
 add_subdirectory(cppcoreguidelines)
+add_subdirectory(custom)
 add_subdirectory(darwin)
 add_subdirectory(fuchsia)
 add_subdirectory(google)
@@ -85,6 +86,7 @@ set(ALL_CLANG_TIDY_CHECKS
   clangTidyCERTModule
   clangTidyConcurrencyModule
   clangTidyCppCoreGuidelinesModule
+  clangTidyCustomModule
   clangTidyDarwinModule
   clangTidyFuchsiaModule
   clangTidyGoogleModule
diff --git a/clang-tools-extra/clang-tidy/ClangTidy.cpp b/clang-tools-extra/clang-tidy/ClangTidy.cpp
index d99847a82d168..f2a69e01a32c5 100644
--- a/clang-tools-extra/clang-tidy/ClangTidy.cpp
+++ b/clang-tools-extra/clang-tidy/ClangTidy.cpp
@@ -57,6 +57,11 @@ LLVM_INSTANTIATE_REGISTRY(clang::tidy::ClangTidyModuleRegistry)
 
 namespace clang::tidy {
 
+namespace custom {
+extern void registerCustomChecks(ClangTidyOptions const &O,
+                                 ClangTidyCheckFactories &Factories);
+} // namespace custom
+
 namespace {
 #if CLANG_TIDY_ENABLE_STATIC_ANALYZER
 static const char *AnalyzerCheckNamePrefix = "clang-analyzer-";
@@ -346,6 +351,7 @@ ClangTidyASTConsumerFactory::ClangTidyASTConsumerFactory(
     IntrusiveRefCntPtr<llvm::vfs::OverlayFileSystem> OverlayFS)
     : Context(Context), OverlayFS(std::move(OverlayFS)),
       CheckFactories(new ClangTidyCheckFactories) {
+  custom::registerCustomChecks(Context.getOptions(), *CheckFactories);
   for (ClangTidyModuleRegistry::entry E : ClangTidyModuleRegistry::entries()) {
     std::unique_ptr<ClangTidyModule> Module = E.instantiate();
     Module->addCheckFactories(*CheckFactories);
@@ -416,7 +422,7 @@ ClangTidyASTConsumerFactory::createASTConsumer(
                         .getCurrentWorkingDirectory();
   if (WorkingDir)
     Context.setCurrentBuildDirectory(WorkingDir.get());
-
+  custom::registerCustomChecks(Context.getOptions(), *CheckFactories);
   std::vector<std::unique_ptr<ClangTidyCheck>> Checks =
       CheckFactories->createChecksForLanguage(&Context);
 
@@ -659,6 +665,7 @@ getAllChecksAndOptions(bool AllowEnablingAnalyzerAlphaCheckers) {
       std::make_unique<DefaultOptionsProvider>(ClangTidyGlobalOptions(), Opts),
       AllowEnablingAnalyzerAlphaCheckers);
   ClangTidyCheckFactories Factories;
+  custom::registerCustomChecks(Context.getOptions(), Factories);
   for (const ClangTidyModuleRegistry::entry &Module :
        ClangTidyModuleRegistry::entries()) {
     Module.instantiate()->addCheckFactories(Factories);
diff --git a/clang-tools-extra/clang-tidy/ClangTidyForceLinker.h b/clang-tools-extra/clang-tidy/ClangTidyForceLinker.h
index adde9136ff1dd..50ac6e138df18 100644
--- a/clang-tools-extra/clang-tidy/ClangTidyForceLinker.h
+++ b/clang-tools-extra/clang-tidy/ClangTidyForceLinker.h
@@ -54,6 +54,11 @@ extern volatile int CppCoreGuidelinesModuleAnchorSource;
 static int LLVM_ATTRIBUTE_UNUSED CppCoreGuidelinesModuleAnchorDestination =
     CppCoreGuidelinesModuleAnchorSource;
 
+// This anchor is used to force the linker to link the CustomModule.
+extern volatile int CustomModuleAnchorSource;
+static int LLVM_ATTRIBUTE_UNUSED CustomModuleAnchorDestination =
+    CustomModuleAnchorSource;
+
 // This anchor is used to force the linker to link the DarwinModule.
 extern volatile int DarwinModuleAnchorSource;
 static int LLVM_ATTRIBUTE_UNUSED DarwinModuleAnchorDestination =
diff --git a/clang-tools-extra/clang-tidy/ClangTidyModule.h b/clang-tools-extra/clang-tidy/ClangTidyModule.h
index 28f54331755a7..6f0b2bf32a291 100644
--- a/clang-tools-extra/clang-tidy/ClangTidyModule.h
+++ b/clang-tools-extra/clang-tidy/ClangTidyModule.h
@@ -62,6 +62,8 @@ class ClangTidyCheckFactories {
                          });
   }
 
+  void erase(llvm::StringRef CheckName) { Factories.erase(CheckName); }
+
   /// Create instances of checks that are enabled.
   std::vector<std::unique_ptr<ClangTidyCheck>>
   createChecks(ClangTidyContext *Context) const;
diff --git a/clang-tools-extra/clang-tidy/ClangTidyOptions.cpp b/clang-tools-extra/clang-tidy/ClangTidyOptions.cpp
index 8bac6f161fa05..acedbd8d41faa 100644
--- a/clang-tools-extra/clang-tidy/ClangTidyOptions.cpp
+++ b/clang-tools-extra/clang-tidy/ClangTidyOptions.cpp
@@ -8,12 +8,12 @@
 
 #include "ClangTidyOptions.h"
 #include "ClangTidyModuleRegistry.h"
+#include "clang/Basic/DiagnosticIDs.h"
 #include "clang/Basic/LLVM.h"
 #include "llvm/ADT/SmallString.h"
+#include "llvm/ADT/StringExtras.h"
 #include "llvm/Support/Debug.h"
-#include "llvm/Support/Errc.h"
 #include "llvm/Support/ErrorOr.h"
-#include "llvm/Support/FileSystem.h"
 #include "llvm/Support/MemoryBufferRef.h"
 #include "llvm/Support/Path.h"
 #include "llvm/Support/YAMLTraits.h"
@@ -72,7 +72,8 @@ struct NOptionMap {
   NOptionMap(IO &, const ClangTidyOptions::OptionMap &OptionMap) {
     Options.reserve(OptionMap.size());
     for (const auto &KeyValue : OptionMap)
-      Options.emplace_back(std::string(KeyValue.getKey()), KeyValue.getValue().Value);
+      Options.emplace_back(std::string(KeyValue.getKey()),
+                           KeyValue.getValue().Value);
   }
   ClangTidyOptions::OptionMap denormalize(IO &) {
     ClangTidyOptions::OptionMap Map;
@@ -126,6 +127,52 @@ void yamlize(IO &IO, ClangTidyOptions::OptionMap &Val, bool,
   }
 }
 
+namespace {
+struct MultiLineString {
+  std::string &S;
+};
+} // namespace
+
+template <> struct BlockScalarTraits<MultiLineString> {
+  static void output(const MultiLineString &S, void *Ctxt, raw_ostream &OS) {
+    OS << S.S;
+  }
+  static StringRef input(StringRef Str, void *Ctxt, MultiLineString &S) {
+    S.S = Str;
+    return "";
+  }
+};
+
+template <> struct ScalarEnumerationTraits<clang::DiagnosticIDs::Level> {
+  static void enumeration(IO &IO, clang::DiagnosticIDs::Level &Level) {
+    IO.enumCase(Level, "Error", clang::DiagnosticIDs::Level::Error);
+    IO.enumCase(Level, "Warning", clang::DiagnosticIDs::Level::Warning);
+    IO.enumCase(Level, "Note", clang::DiagnosticIDs::Level::Note);
+  }
+};
+template <> struct SequenceElementTraits<ClangTidyOptions::CustomCheckDiag> {
+  static const bool flow = false;
+};
+template <> struct MappingTraits<ClangTidyOptions::CustomCheckDiag> {
+  static void mapping(IO &IO, ClangTidyOptions::CustomCheckDiag &D) {
+    IO.mapRequired("BindName", D.BindName);
+    MultiLineString MLS{D.Message};
+    IO.mapRequired("Message", MLS);
+    IO.mapOptional("Level", D.Level);
+  }
+};
+template <> struct SequenceElementTraits<ClangTidyOptions::CustomCheckValue> {
+  static const bool flow = false;
+};
+template <> struct MappingTraits<ClangTidyOptions::CustomCheckValue> {
+  static void mapping(IO &IO, ClangTidyOptions::CustomCheckValue &V) {
+    IO.mapRequired("Name", V.Name);
+    MultiLineString MLS{V.Query};
+    IO.mapRequired("Query", MLS);
+    IO.mapRequired("Diagnostic", V.Diags);
+  }
+};
+
 struct ChecksVariant {
   std::optional<std::string> AsString;
   std::optional<std::vector<std::string>> AsVector;
@@ -181,6 +228,7 @@ template <> struct MappingTraits<ClangTidyOptions> {
     IO.mapOptional("InheritParentConfig", Options.InheritParentConfig);
     IO.mapOptional("UseColor", Options.UseColor);
     IO.mapOptional("SystemHeaders", Options.SystemHeaders);
+    IO.mapOptional("CustomChecks", Options.CustomChecks);
   }
 };
 
@@ -249,6 +297,8 @@ ClangTidyOptions &ClangTidyOptions::mergeWith(const ClangTidyOptions &Other,
         ClangTidyValue(KeyValue.getValue().Value,
                        KeyValue.getValue().Priority + Order));
   }
+  mergeVectors(CustomChecks, Other.CustomChecks);
+
   return *this;
 }
 
diff --git a/clang-tools-extra/clang-tidy/ClangTidyOptions.h b/clang-tools-extra/clang-tidy/ClangTidyOptions.h
index dd78c570d25d9..2a64ee8fdf7ea 100644
--- a/clang-tools-extra/clang-tidy/ClangTidyOptions.h
+++ b/clang-tools-extra/clang-tidy/ClangTidyOptions.h
@@ -9,6 +9,7 @@
 #ifndef LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_CLANGTIDYOPTIONS_H
 #define LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_CLANGTIDYOPTIONS_H
 
+#include "clang/Basic/DiagnosticIDs.h"
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
 #include "llvm/ADT/SmallString.h"
 #include "llvm/ADT/StringMap.h"
@@ -17,6 +18,7 @@
 #include "llvm/Support/MemoryBufferRef.h"
 #include "llvm/Support/VirtualFileSystem.h"
 #include <functional>
+#include <map>
 #include <optional>
 #include <string>
 #include <system_error>
@@ -129,6 +131,19 @@ struct ClangTidyOptions {
   /// Key-value mapping used to store check-specific options.
   OptionMap CheckOptions;
 
+  struct CustomCheckDiag {
+    std::string BindName;
+    std::string Message;
+    std::optional<DiagnosticIDs::Level> Level;
+  };
+  struct CustomCheckValue {
+    std::string Name;
+    std::string Query;
+    llvm::SmallVector<CustomCheckDiag> Diags;
+  };
+  using CustomCheckValueList = llvm::SmallVector<CustomCheckValue>;
+  std::optional<CustomCheckValueList> CustomChecks;
+
   using ArgList = std::vector<std::string>;
 
   /// Add extra compilation arguments to the end of the list.
diff --git a/clang-tools-extra/clang-tidy/custom/CMakeLists.txt b/clang-tools-extra/clang-tidy/custom/CMakeLists.txt
new file mode 100644
index 0000000000000..40387b73b5253
--- /dev/null
+++ b/clang-tools-extra/clang-tidy/custom/CMakeLists.txt
@@ -0,0 +1,20 @@
+set(LLVM_LINK_COMPONENTS
+  support
+  )
+
+add_clang_library(clangTidyCustomModule STATIC
+  CustomTidyModule.cpp
+  QueryCheck.cpp
+
+  LINK_LIBS
+  clangTidy
+  clangTidyUtils
+
+  DEPENDS
+  ClangDriverOptions
+  )
+
+clang_target_link_libraries(clangTidyCustomModule
+  PRIVATE
+  clangQuery
+  )
diff --git a/clang-tools-extra/clang-tidy/custom/CustomTidyModule.cpp b/clang-tools-extra/clang-tidy/custom/CustomTidyModule.cpp
new file mode 100644
index 0000000000000..e11a39f1a4ccf
--- /dev/null
+++ b/clang-tools-extra/clang-tidy/custom/CustomTidyModule.cpp
@@ -0,0 +1,50 @@
+#include "../ClangTidy.h"
+#include "../ClangTidyModule.h"
+#include "../ClangTidyModuleRegistry.h"
+#include "../ClangTidyOptions.h"
+#include "QueryCheck.h"
+#include "llvm/ADT/SmallSet.h"
+#include "llvm/ADT/SmallString.h"
+#include "llvm/ADT/StringRef.h"
+#include <memory>
+
+namespace clang::tidy {
+namespace custom {
+
+class CustomModule : public ClangTidyModule {
+public:
+  void addCheckFactories(ClangTidyCheckFactories &CheckFactories) override {}
+};
+
+// FIXME: could be clearer to add parameter of addCheckFactories to pass
+// Options?
+extern void registerCustomChecks(ClangTidyOptions const &Options,
+                                 ClangTidyCheckFactories &Factories) {
+  static llvm::SmallSet<llvm::SmallString<32>, 8> CustomCheckNames{};
+  if (!Options.CustomChecks.has_value() || Options.CustomChecks->empty())
+    return;
+  for (llvm::SmallString<32> const &Name : CustomCheckNames)
+    Factories.erase(Name);
+  for (const ClangTidyOptions::CustomCheckValue &V :
+       Options.CustomChecks.value()) {
+    llvm::SmallString<32> Name = llvm::StringRef{"custom-" + V.Name};
+    Factories.registerCheckFactory(
+        // add custom- prefix to avoid conflicts with builtin checks
+        Name, [&V](llvm::StringRef Name, ClangTidyContext *Context) {
+          return std::make_unique<custom::QueryCheck>(Name, V, Context);
+        });
+    CustomCheckNames.insert(std::move(Name));
+  }
+}
+
+} // namespace custom
+
+// Register the AlteraTidyModule using this statically initialized variable.
+static ClangTidyModuleRegistry::Add<custom::CustomModule>
+    X("custom-module", "Adds custom query lint checks.");
+
+// This anchor is used to force the linker to link in the generated object file
+// and thus register the AlteraModule.
+volatile int CustomModuleAnchorSource = 0;
+
+} // namespace clang::tidy
diff --git a/clang-tools-extra/clang-tidy/custom/QueryCheck.cpp b/clang-tools-extra/clang-tidy/custom/QueryCheck.cpp
new file mode 100644
index 0000000000000..c69e76918f7ed
--- /dev/null
+++ b/clang-tools-extra/clang-tidy/custom/QueryCheck.cpp
@@ -0,0 +1,102 @@
+//===--- QueryCheck.cpp - clang-tidy --------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "QueryCheck.h"
+#include "../../clang-query/Query.h"
+#include "../../clang-query/QueryParser.h"
+#include "clang/ASTMatchers/ASTMatchers.h"
+#include "clang/ASTMatchers/Dynamic/VariantValue.h"
+#include "clang/Basic/DiagnosticIDs.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/StringRef.h"
+#include <string>
+
+using namespace clang::ast_matchers;
+
+namespace clang::tidy::custom {
+
+QueryCheck::QueryCheck(llvm::StringRef Name,
+                       const ClangTidyOptions::CustomCheckValue &V,
+                       ClangTidyContext *Context)
+    : ClangTidyCheck(Name, Context) {
+  for (const ClangTidyOptions::CustomCheckDiag &D : V.Diags) {
+    auto DiagnosticIdIt =
+        Diags
+            .try_emplace(D.Level.value_or(DiagnosticIDs::Warning),
+                         llvm::StringMap<llvm::SmallVector<std::string>>{})
+            .first;
+    auto DiagMessageIt =
+        DiagnosticIdIt->getSecond()
+            .try_emplace(D.BindName, llvm::SmallVector<std::string>{})
+            .first;
+    DiagMessageIt->second.emplace_back(D.Message);
+  }
+
+  clang::query::QuerySession QS({});
+  llvm::StringRef QueryStringRef{V.Query};
+  while (!QueryStringRef.empty()) {
+    query::QueryRef Q = query::QueryParser::parse(QueryStringRef, QS);
+    switch (Q->Kind) {
+    case query::QK_Match: {
+      const auto &MatchQuerry = llvm::cast<query::MatchQuery>(*Q);
+      Matchers.push_back(MatchQuerry.Matcher);
+      break;
+    }
+    case query::QK_Let: {
+      const auto &LetQuerry = llvm::cast<query::LetQuery>(*Q);
+      LetQuerry.run(llvm::errs(), QS);
+      break;
+    }
+    case query::QK_Invalid: {
+      const auto &InvalidQuerry = llvm::cast<query::InvalidQuery>(*Q);
+      Context->configurationDiag(InvalidQuerry.ErrStr);
+      break;
+    }
+    // FIXME: TODO
+    case query::QK_File:
+    case query::QK_DisableOutputKind:
+    case query::QK_EnableOutputKind:
+    case query::QK_SetOutputKind:
+    case query::QK_SetTraversalKind:
+    case query::QK_Help:
+    case query::QK_NoOp:
+    case query::QK_Quit:
+    case query::QK_SetBool: {
+      Context->configurationDiag("unsupported querry kind");
+    }
+    }
+    QueryStringRef = Q->RemainingContent;
+  }
+}
+
+void QueryCheck::registerMatchers(MatchFinder *Finder) {
+  for (const ast_matchers::dynamic::DynTypedMatcher &M : Matchers)
+    Finder->addDynamicMatcher(M, this);
+}
+
+void QueryCheck::check(const MatchFinder::MatchResult &Result) {
+  auto Emit = [this](DiagMaps const &DiagMaps, std::string const &BindName,
+                     DynTypedNode const &Node, DiagnosticIDs::Level Level) {
+    if (!DiagMaps.contains(Level))
+      return;
+    auto &DiagMap = DiagMaps.at(Level);
+    if (!DiagMap.contains(BindName))
+      return;
+    for (const std::string &Message : DiagMap.at(BindName)) {
+      diag(Node.getSourceRange().getBegin(), Message, Level);
+    }
+  };
+  for (auto &[Name, Node] : Result.Nodes.getMap())
+    Emit(Diags, Name, Node, DiagnosticIDs::Error);
+  for (auto &[Name, Node] : Result.Nodes.getMap())
+    Emit(Diags, Name, Node, DiagnosticIDs::Warning);
+  // place Note last, otherwise it will not be emitted
+  for (auto &[Name, Node] : Result.Nodes.getMap())
+    Emit(Diags, Name, Node, DiagnosticIDs::Note);
+}
+} // namespace clang::tidy::custom
diff --git a/clang-tools-extra/clang-tidy/custom/QueryCheck.h b/clang-tools-extra/clang-tidy/custom/QueryCheck.h
new file mode 100644
index 0000000000000..ded4cad4e3459
--- /dev/null
+++ b/clang-tools-extra/clang-tidy/custom/QueryCheck.h
@@ -0,0 +1,42 @@
+//===--- QueryCheck.h - clang-tidy ------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_CUSTOM_QUERYCHECK_H
+#define LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_CUSTOM_QUERYCHECK_H
+
+#include "../ClangTidyCheck.h"
+#include "clang/ASTMatchers/Dynamic/VariantValue.h"
+#include "clang/Basic/DiagnosticIDs.h"
+#include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/StringMap.h"
+
+namespace clang::tidy::custom {
+
+/// FIXME: Write a short description.
+///
+/// For the user-facing documentation see:
+/// http://clang.llvm.org/extra/clang-tidy/checks/custom/query.html
+class QueryCheck : public ClangTidyCheck {
+public:
+  QueryCheck(llvm::StringRef Name, const ClangTidyOptions::CustomCheckValue &V,
+             ClangTidyContext *Context);
+  void registerMatchers(ast_matchers::MatchFinder *Finder) override;
+  void check(const ast_matchers::MatchFinder::MatchResult &Result) override;
+
+private:
+  llvm::SmallVector<ast_matchers::dynamic::DynTypedMatcher> Matchers{};
+  using DiagMaps =
+      llvm::DenseMap<DiagnosticIDs::Level,
+                     llvm::StringMap<llvm::SmallVector<std::string>>>;
+  DiagMaps Diags;
+};
+
+} // namespace clang::tidy::custom
+
+#endif // LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_CUSTOM_QUERYCHECK_H
diff --git a/clang-tools-extra/clang-tidy/tool/ClangTidyMain.cpp b/clang-tools-extra/clang-tidy/tool/ClangTidyMain.cpp
index fa8887e4639b4..5784b05d2281d 100644
--- a/clang-tools-extra/clang-tidy/tool/ClangTidyMain.cpp
+++ b/clang-tools-extra/clang-tidy/tool/ClangTidyMain.cpp
@@ -60,6 +60,8 @@ Configuration files:
   Checks                       - Same as '--checks'. Additionally, the list of
                                  globs can be specified as a list instead of a
                                  string.
+  CustomChecks                 - Array of user defined checks based on
+                                 clang-query syntax.
   ExcludeHeaderFilterRegex     - Same as '--exclude-header-filter'.
   ExtraArgs                    - Same as '--extra-arg'.
   ExtraArgsBefore              - Same as '--extra-arg-before'.
diff --git a/clang-tools-extra/docs/ReleaseNotes.rst b/clang-tools-extra/docs/ReleaseNotes.rst
index 2252efb498c2c..6d22f83f2248b 100644
--- a/clang-tools-extra/docs/ReleaseNotes.rst
+++ b/clang-tools-extra/docs/ReleaseNotes.rst
@@ -96,6 +96,9 @@ Improvements to clang-tidy
   `SystemHeaders` option is enabled.
   Note: this may lead to false negatives; downstream users may need to adjust
   their checks to preserve existing behavior.
+- :program:`clang-tidy` now supports query based custom checks by `CustomChecks`
+  configuration option.
+  :doc:`Query Based Custom Check Document <clang-tidy/QueryBasedCustomChecks>`
 
 New checks
 ^^^^^^^^^^
diff --git a/clang-tools-extra/docs/clang-tidy/QueryBasedCustomChecks.rst b/clang-tools-extra/docs/clang-tidy/QueryBasedCustomChecks.rst
new file mode 100644
index 0000000000000..6efd8fe6797df
--- /dev/null
+++ b/clang-tools-extra/docs/clang-tidy/QueryBasedCustomChecks.rst
@@ -0,0 +1,57 @@
+====================================
+Query Based Custom Clang-Tidy Checks
+====================================
+
+Introduction
+============
+
+This page provides examples of how to add query based custom checks for
+:program:`clang-tidy`.
+
+Custom checks are based on clang-query syntax. Every custom checks will be
+registered in `custom` module to avoid name conflict. They can be enabled or
+disabled by the checks option like the builtin checks.
+
+Custom checks support inheritance from parent configurations like other
+configuration items.
+
+Configuration
+=============
+
+`CustomChecks` is a list of custom checks. Each check must contain
+  - Name: check name can been used in `-checks` option.
+  - Query: query string
+  - Diagnostic: list of diagnostics to be reported.
+     - BindName: name of the node to be bound in `Query`.
+     - Message: message to be reported.
+     - Level: severity of the diagnostic, the possible val...
[truncated]

@HerrCai0907 HerrCai0907 force-pushed the users/ccc/clang-tidy/query-check branch from c6e485e to 64c45dc Compare March 18, 2025 14:05
@HerrCai0907
Copy link
Contributor Author

--dump-config will create incorrect YAML due to #131694.

HerrCai0907 and others added 2 commits March 19, 2025 09:58
Co-authored-by: Baranov Victor <bar.victor.2002@gmail.com>
@carlosgalvezp
Copy link
Contributor

Sorry for being late for the review. This is a major new feature of clang-tidy, so I believe it would be good to have an RFC before looking at the detailed implementation. What problem does it solve, why should clang-tidy solve it, what the API towards the user should be, etc. Should we do that?

@olologin
Copy link

olologin commented Mar 19, 2025

@carlosgalvezp Hi, I tried to explain everything here: #107680
@HerrCai0907 maybe you can reference that issue in the first post in this PR.

@olologin
Copy link

I like it if it does not break anything :)
One small nitpick: Maybe word "custom" is too broad for this. Maybe something that mentions "clang-query" is more specific.
But it is just my opinion and I am not related to clang-tidy, so whatever you guys decide.

Copy link
Member

@PiotrZSL PiotrZSL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not like this "CustomChecks"

What I would like to have in config:

custom-call-main-function.Engine: Query
custom-call-main-function.Language: C
custom-call-main-function.Traverse: AsIs
custom-call-main-function.Code: 'match callExpr(callee(functionDecl(isMain()).bind("fn"))).bind("callee")'
custom-call-main-function.Diagnostic.Warning.BindName: callee
custom-call-main-function.Diagnostic.Warning.Message: "call to {fn} function"
custom-call-main-function.Diagnostic.Note.1.BindName: fn
custom-call-main-function.Diagnostic.Note.1.Message: "fn is defined here"
custom-call-main-function.Diagnostic.Note.2.BindName: fn
custom-call-main-function.Diagnostic.Note.2.Message: "fn here is not nice"

Main reason is simple, easy to modify check and add new check in subconfigs.
I do not see need for errors, always someone can use warnings as errors.
If you do not want to split this into separate options, then at least move configs into invidual checks.


clang_target_link_libraries(clangTidyCustomModule
PRIVATE
clangQuery
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now clang-tidy is made to be dependent on clangQuery, which is a major change. Is this OK? How are build times affected?

It would be great to have @AaronBallman's opinion on this one :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably would be fine, would be good to have cmake options, to enable custom checks, and then if not enabled then support for them woudn't be included in binary

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, something similar to what's done with the static-analyzer checks

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the suggestion from @PiotrZSL is a reasonable compromise.

I'm presuming it's not an issue that the clangQuery library links against a bunch of clang libraries that clang-tidy already links against, but it would be good to double-check that we're not accidentally getting two copies of those libraries in a static link build.

@carlosgalvezp
Copy link
Contributor

Hi, I tried to explain everything here:

Thanks, I missed that! For RFCs it's preferable to use Discourse, as there is a lot less traffic and can more easily catch the eyes of relevant people. I did not see much discussion in the original RFC though, which may hint that not many people were aware.

My understanding is that the goal is to avoid rebuilding clang-tidy/plugins to be able to add new simple checks via .clang-tidy file. I see the benefit of this, however I'm not sure it justifies introducing this dependency from clang-tidy to clang-query in the build system.

Do you know how much build times and binary size are affected by this change? Are there any drawbacks of this now tight coupling between clang-tidy and clang-query, from an architectural/design point of view?

I tagged @AaronBallman in the code comment but maybe it's easier to bring it up in the top-level discussion (sorry for the double ping!)

@vbvictor
Copy link
Contributor

Do you know how much build times and binary size are affected by this change?

Binary size comparison was addressed in discussion on previous PR, here is comment for reference #123734 (comment).
However, this may have changed now because of new architecture.

@HerrCai0907
Copy link
Contributor Author

However, this may have changed now because of new architecture.

It will be too much differences because dependences are the same. only some wrapper code in clangtidy changed.

@HerrCai0907
Copy link
Contributor Author

Main reason is simple, easy to modify check and add new check in subconfigs.

It is also support to add new check in subconfigs, to modify check is thing I did not implement yet. (There are FIXME in merge config part of code)
In theory, these two configurations are equivalent. I think flattening the configuration items will only reduce readability, because we cannot define a YAML schema for it.

@HerrCai0907
Copy link
Contributor Author

I do not see need for errors, always someone can use warnings as errors.

Agree. I will remove it if no one against it.

@PiotrZSL
Copy link
Member

PiotrZSL commented Mar 19, 2025

I think flattening the configuration items will only reduce readability, because we cannot define a YAML schema for it.

If we put that entire YAML under "custom-call-main-function.Config", that would also be fine. Thing is that with this there could be a way to make all chances in "custom" module without modify tool core. In such case we could have;
custom-call-main-function.Config and custom-call-main-function.Engine, and in such case entire config parsing would be actually inside a check code, and creating different base checks could also be possible.

@HerrCai0907
Copy link
Contributor Author

If we put that entire YAML under "custom-call-main-function.Config", that would also be fine.

I don't want to introduce a new config search mechanism. so everything is in.clang-tidy files. we still merge all config from file to root including custom check field.
if we split config to difference files, how can we find it from original .clang-tidy file?

Thing is that with this there could be a way to make all chances in "custom" module without modify tool core.

we cannot do everything in custom check inside. at least checkfactories needs to know custom check to correct them. I think custom check should be same as built-in check instead of give a check maybe custom-query-base and wrapper everything inside. it will make --list-checks and nolint unfriend.

@olologin
Copy link

olologin commented Mar 19, 2025

Splitting config into multiple files is potentially bad because caching wrappers need to be adjusted to take into account this new type of input file (they only know about .clang-tidy file now).

@PiotrZSL
Copy link
Member

checkfactories could read name of checks from an options config.

@HerrCai0907
Copy link
Contributor Author

checkfactories could read name of checks from an options config.

Then everything go back. checkfactories or some other part in core will parse the options and create check instance. It is similar as what this pr did.

Put everything in config also have some disadvantages, because there are some array structured data, in check, we need to guess all possible key and try them. It waste CPU when there are lots of custom checks and looks like hack.

@AaronBallman
Copy link
Collaborator

Hi, I tried to explain everything here:

Thanks, I missed that! For RFCs it's preferable to use Discourse, as there is a lot less traffic and can more easily catch the eyes of relevant people. I did not see much discussion in the original RFC though, which may hint that not many people were aware.

Yeah, that Discourse thread is really too recent to know whether the community wants the extension or not. I think the RFC should run longer before we move forward.

My understanding is that the goal is to avoid rebuilding clang-tidy/plugins to be able to add new simple checks via .clang-tidy file. I see the benefit of this, however I'm not sure it justifies introducing this dependency from clang-tidy to clang-query in the build system.

+1; it sounds like the goal is somewhat less about clang-tidy checks and more about running code transformations more easily. clang-query doesn't really lend itself to "find this pattern, replace with this other pattern", and clang-tidy requires enough boilerplate for checks that it's a heavy-handed solution. It might make more sense to design a tool specific for code transformations -- something that can be easily run over a compile_commands.json file, for example.

@olologin
Copy link

olologin commented Mar 20, 2025

it sounds like the goal is somewhat less about clang-tidy checks and more about running code transformations more easily.

Original goal was to have just clang-tidy checks. We use it on our CI to guard main branch from people accidentally merging prohibited stuff. If CI informs devs their PR is bad - they are smart enough to follow the guidelines and fix their problem themselves without relying on automatic transformers. Or to use "NOLINT" as a way to override checks.
Extending this functionality with transformers is optional and can be done later in another PR, if someone is willing.

Also, it is a bit unclear when RFC is considered "popular enough". Don't get me wrong, I understand you don't want to merge every shiny PR people find cool, but I don't want this PR to be stuck for 1 year and then no one will want to rebase it.

@AaronBallman
Copy link
Collaborator

Extending this functionality with transformers is optional and can be done later in another PR, if someone is willing.

Yes and no. If the goal is just to write checks, then clang-tidy is a good home for the functionality. But I had the impression that for some people this is also about code migrations, where clang-tidy may be a less appropriate home and a dedicated tool is a potentially better approach. We don't want to walk down the path of adding it to clang-tidy only to add similar-but-extended functionality in another tool as that increases our maintenance burdens and users have to work harder to figure out which tool to reach for or migrate between. (This isn't to say clang-tidy isn't still a reasonable home for this. Just bringing it up so other maintainers are thinking about the long-term picture of where we want to be.)

Also, just my opinion: I cannot force people into merging this PR, but if it does not break anything and if nobody has significant concerns - I don't see any reason in slowing it down and postponing the merge, because the feature is already implemented and the time is spent.

That's only part of the decision. Once we accept a feature, we have to support it basically forever. So there are long-term maintenance costs, and each feature we add increases the expense of adding new features later due to the complexity of interactions between them. It's not a simple matter of "someone already wrote it", it's also "we need to review it, we need to maintain it forever, what else do we lose if we take it, etc."

This feature will obviously find its users. You can see a lot of public repos (last time I saw it in Firefox) extending clang-tidy with their simple custom checks, some of them are on the level of "Let's not allow people to use LoadLibraryEx directly).
Also, it is a bit unclear when RFC is considered "popular enough". Don't get me wrong, I understand you don't want to merge every shiny PR people find cool, but I don't want this PR to be stuck for 1 year and then no one will want to rebase it.

We're in the process of improving our governance to make it more clear what next steps there are for RFCs which aren't reaching obvious consensus. There's no hard rule for "popular enough" (and clang-tools-extra extensions are a bit different from clang ones in terms of how much the community is likely to weigh in), but we do expect the RFC to run for a reasonable period of time so people have the opportunity to express their support or concerns.

@HerrCai0907
Copy link
Contributor Author

clang-query doesn't really lend itself to "find this pattern, replace with this other pattern", and clang-tidy requires enough boilerplate for checks that it's a heavy-handed solution. It might make more sense to design a tool specific for code transformations -- something that can be easily run over a compile_commands.json file, for example.

I think transforming code and detecting code are totally different things. If someone want a tools to help him automatically replace code, it should be done by other tools.

Here the only thing I want to support is let adding custom checks for special projects possible. They are lots of projects have some code guidelines which is not suitable for every projects. so they cannot be supported in official clangtidy. For example, in our projects, we cannot use static variable because linker script cannot place them correct, but it definitely cannot be merged in mainline.
If we limit the usage as a checker. then clangtidy is the best way to support it. we can reuse the config, nolint and so on.

@carlosgalvezp
Copy link
Contributor

My main concern about this patch is putting the implementation into the config file. We must be very careful with what we add to the config file, because it must remain there unchanged forever. We must have an upfront design that we believe will hold for the next N years without modification.

This is because clang-tidy not only analyzes our projects' own .clang-tidy file (which we can change), but it also analyzes the .clang-tidy files of the third-party projects we include (which we cannot change). readability-identifier-naming picks up any third-party .clang-tidy file anytime you include a third-party header (regardless of whether you later filter out those warnings or not). If the config file changes in a non-backwards compatible way, clang-tidy bails out and refuses to analyze anything, which is a big problem.

This feature request requires adding a lot of things in the config file. And once it's in, people will want more things in it (eventually they will want exactly the same functionality as can be achieved by regular checks). So I see this feature growing without possibility of refactoring and cleaning, which can add a lot of complexity and maintenance work.

It would feel safer if the custom check was passed via command-line instead, similar to --config. Then, we can be more aggressive with backwards-incompatible changes, because people running clang-tidy are in control of their CLI command and can adjust it upon breaking changes. Or simply another file entirely that does not have the problems I described.

@HerrCai0907
Copy link
Contributor Author

It would feel safer if the custom check was passed via command-line instead, similar to --config

good point. should we only support command line for this feature? for me it is enough.

@olologin
Copy link

olologin commented Mar 26, 2025

It would feel safer if the custom check was passed via command-line instead, similar to --config

But then you will not be able to properly disable such checks for subfolders (modules) of some huge project, right?

@HerrCai0907
Copy link
Contributor Author

It would feel safer if the custom check was passed via command-line instead, similar to --config

But then you will not be able to properly disable such checks for subfolders (modules) of some huge project, right?

it can be done by checks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants