[clang][test] add testing for the AST matcher reference #94248
base: users/5chmidti/rm_not_needed_run_overload_in_BoundNodesCallback
@llvm/pr-subscribers-clang

Author: Julian Schmidt (5chmidti)

Changes

Previously, the examples in the AST matcher reference, which gets generated by the doxygen comments in `ASTMatchers.h`, were untested and best effort. This patch introduces a simple DSL around doxygen commands to enable testing the AST matcher documentation in a way that should be relatively easy.

This patch rewrites/extends the documentation such that all matchers have a documented example. The current statistics emitted by the parser are:

The tests are generated during building and the script will only print something if it found an issue (compile failure, parsing issues, the expected and actual number of failures differs).

DSL for generating the tests from documentation.

TLDR:
  \header{a.h}
  \code
  \matcher{expr()} <- one or more matchers in succession
  \matcher{varDecl()} <- new matcher resets the context, the above

The above block can be repeated inside a doxygen command for multiple code examples.

Language Grammar:
  compile_args ::= \compile_args{[<compile_arg>;]<compile_arg>}

The 'std' tag and '\compile_args' support specifying a specific language version, a whole language and all of its versions, and thresholds (implies ranges). Multiple arguments are passed with a ',' separator. For a language and version to execute a tested matcher, it has to match the specified '\compile_args' for the code, and the 'std' tag for the matcher. Predicates for the 'std' compiler flag are used with disjunction between languages (e.g. 'c || c++') and conjunction for all predicates specific to each language (e.g. 'c++11-or-later && c++23-or-earlier').

Examples:

Tags:

Type: Matcher types are used to mark matchers as submatchers with 'sub' or as

Count:

Std:

Fixes #57607

Patch is 899.09 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/94248.diff

8 Files Affected:
diff --git a/clang/docs/LibASTMatchersReference.html b/clang/docs/LibASTMatchersReference.html
index a16b9c44ef0ea..baf39befd796a 100644
--- a/clang/docs/LibASTMatchersReference.html
+++ b/clang/docs/LibASTMatchersReference.html
@@ -586,28 +586,36 @@ <h2 id="decl-matchers">Node Matchers</h2>
#pragma omp declare simd
int min();
-attr()
- matches "nodiscard", "nonnull", "noinline", and the whole "#pragma" line.
+
+The matcher attr()
+matches nodiscard, nonnull, noinline, and
+declare simd.
</pre></td></tr>
<tr><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1CXXBaseSpecifier.html">CXXBaseSpecifier</a>></td><td class="name" onclick="toggle('cxxBaseSpecifier0')"><a name="cxxBaseSpecifier0Anchor">cxxBaseSpecifier</a></td><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1CXXBaseSpecifier.html">CXXBaseSpecifier</a>>...</td></tr>
<tr><td colspan="4" class="doc" id="cxxBaseSpecifier0"><pre>Matches class bases.
-Examples matches public virtual B.
+Given
class B {};
class C : public virtual B {};
+
+The matcher cxxRecordDecl(hasDirectBase(cxxBaseSpecifier()))
+matches C.
</pre></td></tr>
<tr><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1CXXCtorInitializer.html">CXXCtorInitializer</a>></td><td class="name" onclick="toggle('cxxCtorInitializer0')"><a name="cxxCtorInitializer0Anchor">cxxCtorInitializer</a></td><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1CXXCtorInitializer.html">CXXCtorInitializer</a>>...</td></tr>
<tr><td colspan="4" class="doc" id="cxxCtorInitializer0"><pre>Matches constructor initializers.
-Examples matches i(42).
+Given
class C {
C() : i(42) {}
int i;
};
+
+The matcher cxxCtorInitializer()
+matches i(42).
</pre></td></tr>
@@ -619,17 +627,22 @@ <h2 id="decl-matchers">Node Matchers</h2>
public:
int a;
};
-accessSpecDecl()
- matches 'public:'
+
+The matcher accessSpecDecl()
+matches public:.
</pre></td></tr>
<tr><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1Decl.html">Decl</a>></td><td class="name" onclick="toggle('bindingDecl0')"><a name="bindingDecl0Anchor">bindingDecl</a></td><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1BindingDecl.html">BindingDecl</a>>...</td></tr>
<tr><td colspan="4" class="doc" id="bindingDecl0"><pre>Matches binding declarations
-Example matches foo and bar
-(matcher = bindingDecl()
- auto [foo, bar] = std::make_pair{42, 42};
+Given
+ struct pair { int x; int y; };
+ pair make(int, int);
+ auto [foo, bar] = make(42, 42);
+
+The matcher bindingDecl()
+matches foo and bar.
</pre></td></tr>
@@ -642,14 +655,18 @@ <h2 id="decl-matchers">Node Matchers</h2>
myFunc(^(int p) {
printf("%d", p);
})
+
</pre></td></tr>
<tr><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1Decl.html">Decl</a>></td><td class="name" onclick="toggle('classTemplateDecl0')"><a name="classTemplateDecl0Anchor">classTemplateDecl</a></td><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1ClassTemplateDecl.html">ClassTemplateDecl</a>>...</td></tr>
<tr><td colspan="4" class="doc" id="classTemplateDecl0"><pre>Matches C++ class template declarations.
-Example matches Z
+Given
template<class T> class Z {};
+
+The matcher classTemplateDecl()
+matches Z.
</pre></td></tr>
@@ -660,13 +677,14 @@ <h2 id="decl-matchers">Node Matchers</h2>
template<class T1, class T2, int I>
class A {};
- template<class T, int I>
- class A<T, T*, I> {};
+ template<class T, int I> class A<T, T*, I> {};
template<>
class A<int, int, 1> {};
-classTemplatePartialSpecializationDecl()
- matches the specialization A<T,T*,I> but not A<int,int,1>
+
+The matcher classTemplatePartialSpecializationDecl()
+matches template<class T, int I> class A<T, T*, I> {},
+but does not match A<int, int, 1>.
</pre></td></tr>
@@ -677,87 +695,128 @@ <h2 id="decl-matchers">Node Matchers</h2>
template<typename T> class A {};
template<> class A<double> {};
A<int> a;
-classTemplateSpecializationDecl()
- matches the specializations A<int> and A<double>
+
+The matcher classTemplateSpecializationDecl()
+matches class A<int>
+and class A<double>.
</pre></td></tr>
<tr><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1Decl.html">Decl</a>></td><td class="name" onclick="toggle('conceptDecl0')"><a name="conceptDecl0Anchor">conceptDecl</a></td><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1ConceptDecl.html">ConceptDecl</a>>...</td></tr>
<tr><td colspan="4" class="doc" id="conceptDecl0"><pre>Matches concept declarations.
-Example matches integral
- template<typename T>
- concept integral = std::is_integral_v<T>;
+Given
+ template<typename T> concept my_concept = true;
+
+
+The matcher conceptDecl()
+matches template<typename T>
+concept my_concept = true.
</pre></td></tr>
<tr><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1Decl.html">Decl</a>></td><td class="name" onclick="toggle('cxxConstructorDecl0')"><a name="cxxConstructorDecl0Anchor">cxxConstructorDecl</a></td><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1CXXConstructorDecl.html">CXXConstructorDecl</a>>...</td></tr>
<tr><td colspan="4" class="doc" id="cxxConstructorDecl0"><pre>Matches C++ constructor declarations.
-Example matches Foo::Foo() and Foo::Foo(int)
+Given
class Foo {
public:
Foo();
Foo(int);
int DoSomething();
};
+
+ struct Bar {};
+
+
+The matcher cxxConstructorDecl()
+matches Foo() and Foo(int).
</pre></td></tr>
<tr><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1Decl.html">Decl</a>></td><td class="name" onclick="toggle('cxxConversionDecl0')"><a name="cxxConversionDecl0Anchor">cxxConversionDecl</a></td><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1CXXConversionDecl.html">CXXConversionDecl</a>>...</td></tr>
<tr><td colspan="4" class="doc" id="cxxConversionDecl0"><pre>Matches conversion operator declarations.
-Example matches the operator.
+Given
class X { operator int() const; };
+
+
+The matcher cxxConversionDecl()
+matches operator int() const.
</pre></td></tr>
<tr><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1Decl.html">Decl</a>></td><td class="name" onclick="toggle('cxxDeductionGuideDecl0')"><a name="cxxDeductionGuideDecl0Anchor">cxxDeductionGuideDecl</a></td><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1CXXDeductionGuideDecl.html">CXXDeductionGuideDecl</a>>...</td></tr>
<tr><td colspan="4" class="doc" id="cxxDeductionGuideDecl0"><pre>Matches user-defined and implicitly generated deduction guide.
-Example matches the deduction guide.
+Given
template<typename T>
- class X { X(int) };
+ class X { X(int); };
X(int) -> X<int>;
+
+
+The matcher cxxDeductionGuideDecl()
+matches the written deduction guide
+auto (int) -> X<int>,
+the implicit copy deduction guide auto (int) -> X<T>
+and the implicitly declared deduction guide
+auto (X<T>) -> X<T>.
</pre></td></tr>
<tr><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1Decl.html">Decl</a>></td><td class="name" onclick="toggle('cxxDestructorDecl0')"><a name="cxxDestructorDecl0Anchor">cxxDestructorDecl</a></td><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1CXXDestructorDecl.html">CXXDestructorDecl</a>>...</td></tr>
<tr><td colspan="4" class="doc" id="cxxDestructorDecl0"><pre>Matches explicit C++ destructor declarations.
-Example matches Foo::~Foo()
+Given
class Foo {
public:
virtual ~Foo();
};
+
+ struct Bar {};
+
+
+The matcher cxxDestructorDecl()
+matches virtual ~Foo().
</pre></td></tr>
<tr><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1Decl.html">Decl</a>></td><td class="name" onclick="toggle('cxxMethodDecl0')"><a name="cxxMethodDecl0Anchor">cxxMethodDecl</a></td><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1CXXMethodDecl.html">CXXMethodDecl</a>>...</td></tr>
<tr><td colspan="4" class="doc" id="cxxMethodDecl0"><pre>Matches method declarations.
-Example matches y
+Given
class X { void y(); };
+
+
+The matcher cxxMethodDecl()
+matches void y().
</pre></td></tr>
<tr><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1Decl.html">Decl</a>></td><td class="name" onclick="toggle('cxxRecordDecl0')"><a name="cxxRecordDecl0Anchor">cxxRecordDecl</a></td><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1CXXRecordDecl.html">CXXRecordDecl</a>>...</td></tr>
<tr><td colspan="4" class="doc" id="cxxRecordDecl0"><pre>Matches C++ class declarations.
-Example matches X, Z
+Given
class X;
template<class T> class Z {};
+
+The matcher cxxRecordDecl()
+matches X and Z.
</pre></td></tr>
<tr><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1Decl.html">Decl</a>></td><td class="name" onclick="toggle('decl0')"><a name="decl0Anchor">decl</a></td><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1Decl.html">Decl</a>>...</td></tr>
<tr><td colspan="4" class="doc" id="decl0"><pre>Matches declarations.
-Examples matches X, C, and the friend declaration inside C;
+Given
void X();
class C {
- friend X;
+ friend void X();
};
+
+The matcher decl()
+matches void X(), C
+and friend void X().
</pre></td></tr>
@@ -767,40 +826,49 @@ <h2 id="decl-matchers">Node Matchers</h2>
Given
class X { int y; };
-declaratorDecl()
- matches int y.
+
+The matcher declaratorDecl()
+matches int y.
</pre></td></tr>
<tr><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1Decl.html">Decl</a>></td><td class="name" onclick="toggle('decompositionDecl0')"><a name="decompositionDecl0Anchor">decompositionDecl</a></td><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1DecompositionDecl.html">DecompositionDecl</a>>...</td></tr>
<tr><td colspan="4" class="doc" id="decompositionDecl0"><pre>Matches decomposition-declarations.
-Examples matches the declaration node with foo and bar, but not
-number.
-(matcher = declStmt(has(decompositionDecl())))
-
+Given
+ struct pair { int x; int y; };
+ pair make(int, int);
int number = 42;
- auto [foo, bar] = std::make_pair{42, 42};
+ auto [foo, bar] = make(42, 42);
+
+The matcher decompositionDecl()
+matches auto [foo, bar] = make(42, 42),
+but does not match number.
</pre></td></tr>
<tr><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1Decl.html">Decl</a>></td><td class="name" onclick="toggle('enumConstantDecl0')"><a name="enumConstantDecl0Anchor">enumConstantDecl</a></td><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1EnumConstantDecl.html">EnumConstantDecl</a>>...</td></tr>
<tr><td colspan="4" class="doc" id="enumConstantDecl0"><pre>Matches enum constants.
-Example matches A, B, C
+Given
enum X {
A, B, C
};
+The matcher enumConstantDecl()
+matches A, B and C.
</pre></td></tr>
<tr><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1Decl.html">Decl</a>></td><td class="name" onclick="toggle('enumDecl0')"><a name="enumDecl0Anchor">enumDecl</a></td><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1EnumDecl.html">EnumDecl</a>>...</td></tr>
<tr><td colspan="4" class="doc" id="enumDecl0"><pre>Matches enum declarations.
-Example matches X
+Given
enum X {
A, B, C
};
+
+The matcher enumDecl()
+matches the enum X.
</pre></td></tr>
@@ -808,9 +876,14 @@ <h2 id="decl-matchers">Node Matchers</h2>
<tr><td colspan="4" class="doc" id="fieldDecl0"><pre>Matches field declarations.
Given
- class X { int m; };
-fieldDecl()
- matches 'm'.
+ int a;
+ struct Foo {
+ int x;
+ };
+ void bar(int val);
+
+The matcher fieldDecl()
+matches int x.
</pre></td></tr>
@@ -819,16 +892,20 @@ <h2 id="decl-matchers">Node Matchers</h2>
Given
class X { friend void foo(); };
-friendDecl()
- matches 'friend void foo()'.
+
+The matcher friendDecl()
+matches friend void foo().
</pre></td></tr>
<tr><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1Decl.html">Decl</a>></td><td class="name" onclick="toggle('functionDecl0')"><a name="functionDecl0Anchor">functionDecl</a></td><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1FunctionDecl.html">FunctionDecl</a>>...</td></tr>
<tr><td colspan="4" class="doc" id="functionDecl0"><pre>Matches function declarations.
-Example matches f
+Given
void f();
+
+The matcher functionDecl()
+matches void f().
</pre></td></tr>
@@ -837,6 +914,10 @@ <h2 id="decl-matchers">Node Matchers</h2>
Example matches f
template<class T> void f(T t) {}
+
+
+The matcher functionTemplateDecl()
+matches template<class T> void f(T t) {}.
</pre></td></tr>
@@ -845,8 +926,8 @@ <h2 id="decl-matchers">Node Matchers</h2>
Given
struct X { struct { int a; }; };
-indirectFieldDecl()
- matches 'a'.
+The matcher indirectFieldDecl()
+matches a.
</pre></td></tr>
@@ -854,10 +935,13 @@ <h2 id="decl-matchers">Node Matchers</h2>
<tr><td colspan="4" class="doc" id="labelDecl0"><pre>Matches a declaration of label.
Given
- goto FOO;
- FOO: bar();
-labelDecl()
- matches 'FOO:'
+ void bar();
+ void foo() {
+ goto FOO;
+ FOO: bar();
+ }
+The matcher labelDecl()
+matches FOO: bar().
</pre></td></tr>
@@ -866,8 +950,9 @@ <h2 id="decl-matchers">Node Matchers</h2>
Given
extern "C" {}
-linkageSpecDecl()
- matches "extern "C" {}"
+
+The matcher linkageSpecDecl()
+matches extern "C" {}.
</pre></td></tr>
@@ -875,12 +960,18 @@ <h2 id="decl-matchers">Node Matchers</h2>
<tr><td colspan="4" class="doc" id="namedDecl0"><pre>Matches a declaration of anything that could have a name.
Example matches X, S, the anonymous union type, i, and U;
+Given
typedef int X;
struct S {
union {
int i;
} U;
};
+The matcher namedDecl()
+matches typedef int X, S, int i
+ and U,
+with S matching twice in C++.
+Once for the injected class name and once for the declaration itself.
</pre></td></tr>
@@ -890,8 +981,10 @@ <h2 id="decl-matchers">Node Matchers</h2>
Given
namespace test {}
namespace alias = ::test;
-namespaceAliasDecl()
- matches "namespace alias" but not "namespace test"
+
+The matcher namespaceAliasDecl()
+matches alias,
+but does not match test.
</pre></td></tr>
@@ -901,8 +994,9 @@ <h2 id="decl-matchers">Node Matchers</h2>
Given
namespace {}
namespace test {}
-namespaceDecl()
- matches "namespace {}" and "namespace test {}"
+
+The matcher namespaceDecl()
+matches namespace {} and namespace test {}.
</pre></td></tr>
@@ -911,8 +1005,10 @@ <h2 id="decl-matchers">Node Matchers</h2>
Given
template <typename T, int N> struct C {};
-nonTypeTemplateParmDecl()
- matches 'N', but not 'T'.
+
+The matcher nonTypeTemplateParmDecl()
+matches int N,
+but does not match typename T.
</pre></td></tr>
@@ -922,6 +1018,7 @@ <h2 id="decl-matchers">Node Matchers</h2>
Example matches Foo (Additions)
@interface Foo (Additions)
@end
+
</pre></td></tr>
@@ -931,6 +1028,7 @@ <h2 id="decl-matchers">Node Matchers</h2>
Example matches Foo (Additions)
@implementation Foo (Additions)
@end
+
</pre></td></tr>
@@ -940,6 +1038,7 @@ <h2 id="decl-matchers">Node Matchers</h2>
Example matches Foo
@implementation Foo
@end
+
</pre></td></tr>
@@ -949,6 +1048,7 @@ <h2 id="decl-matchers">Node Matchers</h2>
Example matches Foo
@interface Foo
@end
+
</pre></td></tr>
@@ -960,6 +1060,7 @@ <h2 id="decl-matchers">Node Matchers</h2>
BOOL _enabled;
}
@end
+
</pre></td></tr>
@@ -974,6 +1075,7 @@ <h2 id="decl-matchers">Node Matchers</h2>
@implementation Foo
- (void)method {}
@end
+
</pre></td></tr>
@@ -984,6 +1086,7 @@ <h2 id="decl-matchers">Node Matchers</h2>
@interface Foo
@property BOOL enabled;
@end
+
</pre></td></tr>
@@ -993,6 +1096,7 @@ <h2 id="decl-matchers">Node Matchers</h2>
Example matches FooDelegate
@protocol FooDelegate
@end
+
</pre></td></tr>
@@ -1001,48 +1105,58 @@ <h2 id="decl-matchers">Node Matchers</h2>
Given
void f(int x);
-parmVarDecl()
- matches int x.
+The matcher parmVarDecl()
+matches int x.
</pre></td></tr>
<tr><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1Decl.html">Decl</a>></td><td class="name" onclick="toggle('recordDecl0')"><a name="recordDecl0Anchor">recordDecl</a></td><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1RecordDecl.html">RecordDecl</a>>...</td></tr>
<tr><td colspan="4" class="doc" id="recordDecl0"><pre>Matches class, struct, and union declarations.
-Example matches X, Z, U, and S
+Given
class X;
template<class T> class Z {};
struct S {};
union U {};
+
+The matcher recordDecl()
+matches X, Z,
+S and U.
</pre></td></tr>
<tr><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1Decl.html">Decl</a>></td><td class="name" onclick="toggle('staticAssertDecl0')"><a name="staticAssertDecl0Anchor">staticAssertDecl</a></td><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1StaticAssertDecl.html">StaticAssertDecl</a>>...</td></tr>
<tr><td colspan="4" class="doc" id="staticAssertDecl0"><pre>Matches a C++ static_assert declaration.
-Example:
- staticAssertDecl()
-matches
- static_assert(sizeof(S) == sizeof(int))
-in
+Given
struct S {
int x;
};
static_assert(sizeof(S) == sizeof(int));
+
+
+The matcher staticAssertDecl()
+matches static_assert(sizeof(S) == sizeof(int)).
</pre></td></tr>
<tr><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1Decl.html">Decl</a>></td><td class="name" onclick="toggle('tagDecl0')"><a name="tagDecl0Anchor">tagDecl</a></td><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1TagDecl.html">TagDecl</a>>...</td></tr>
<tr><td colspan="4" class="doc" id="tagDecl0"><pre>Matches tag declarations.
-Example matches X, Z, U, S, E
+Given
class X;
template<class T> class Z {};
struct S {};
union U {};
- enum E {
- A, B, C
- };
+ enum E { A, B, C };
+
+
+The matcher tagDecl()
+matches class X, class Z {}, the injected class name
+class Z, struct S {},
+the injected class name struct S, union U {},
+the injected class name union U
+and enum E { A, B, C }.
</pre></td></tr>
@@ -1051,8 +1165,10 @@ <h2 id="decl-matchers">Node Matchers</h2>
Given
template <template <typename> class Z, int N> struct C {};
-templateTypeParmDecl()
- matches 'Z', but not 'N'.
+
+The matcher templateTemplateParmDecl()
+matches template <typename> class Z,
+but does not match int N.
</pre></td></tr>
@@ -1061,8 +1177,10 @@ <h2 id="decl-matchers">Node Matchers</h2>
Given
template <typename T, int N> struct C {};
-templateTypeParmDecl()
- matches 'T', but not 'N'.
+
+The matcher templateTypeParmDecl()
+matches typename T,
+but does not match int N.
</pre></td></tr>
@@ -1072,10 +1190,12 @@ <h2 id="decl-matchers">Node Matchers</h2>
Given
int X;
namespace NS {
- int Y;
+ int Y;
} // namespace NS
-decl(hasDeclContext(translationUnitDecl()))
- matches "int X", but not "int Y".
+
+The matcher namedDecl(hasDeclContext(translationUnitDecl()))
+matches X and NS,
+but does not match Y.
</pre></td></tr>
@@ -1085,17 +1205,22 @@ <h2 id="decl-matchers">Node Matchers</h2>
Given
typedef int X;
using Y = int;
-typeAliasDecl()
- matches "using Y = int", but not "typedef int X"
+
+The matcher typeAliasDecl()
+matches using Y = int,
+but does not match typedef int X.
</pre></td></tr>
<tr><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1Decl.html">Decl</a>></td><td class="name" onclick="toggle('typeAliasTemplateDecl0')"><a name="typeAliasTemplateDecl0Anchor">typeAliasTemplateDecl</a></td><td>Matcher<<a href="https://clang.llvm.org/doxygen/classclang_1_1TypeAliasTemplateDecl.html">TypeAliasTemplateDecl</a>>...</td></tr>
<tr><td ...
[truncated]
CC @llvm/pr-subscribers-clang-tidy as stakeholders in matchers
0c53f15 to 615f30b
1b4b4e4 to 2e90b54
It looks like precommit CI found relevant failures:
FAILED: tools/clang/unittests/ASTMatchers/ASTMatchersDocTests.cpp
cmd.exe /C "cd /D C:\ws\src\build\tools\clang\unittests\ASTMatchers && C:\ws\src\clang\utils\generate_ast_matcher_doc_tests.py --input-file C:/ws/src/clang/include/clang/ASTMatchers/ASTMatchers.h --output-file C:/ws/src/build/tools/clang/unittests/ASTMatchers/ASTMatchersDocTests.cpp"
  File "C:\ws\src\clang\utils\generate_ast_matcher_doc_tests.py", line 613
    const StringRef Code = R"cpp(\n{"\t#include \"cuda.h\"\n" if has_cuda else ""}{self.code})cpp";\n"""
    ^
SyntaxError: f-string expression part cannot include a backslash
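For context on the failure above: before Python 3.12, a backslash may not appear inside the `{...}` expression part of an f-string, although it is allowed in the literal part. A common workaround is to compute the value outside the f-string. The sketch below only illustrates that pattern; `render_code_literal` and `cuda_include` are hypothetical names, not the code in `generate_ast_matcher_doc_tests.py`.

```python
def render_code_literal(code: str, has_cuda: bool) -> str:
    """Return a C++ raw-string definition wrapping `code`."""
    # Build the optional include outside the f-string: on Python < 3.12 the
    # {...} expression part of an f-string cannot contain a backslash.
    cuda_include = '\t#include "cuda.h"\n' if has_cuda else ""
    # Backslashes in the literal part of the f-string are fine.
    return f'const StringRef Code = R"cpp(\n{cuda_include}{code})cpp";\n'


if __name__ == "__main__":
    print(render_code_literal("int a = 42;", has_cuda=True))
```

Hoisting the conditional into a local variable keeps the script working on older interpreters, which matters when the build bots do not all run the same Python version.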
I wonder if we should add some documentation to ASTMatchers.h about the new special syntax for comments so that users who hit test failures with the new automatic tests have some more help getting to a solution.
/// matches "int X", but not "int Y".
/// \compile_args{-std=c++}
/// The matcher \matcher{namedDecl(hasDeclContext(translationUnitDecl()))}
/// matches \match{type=name$X} and \match{type=name$NS},
Under what circumstances do you need to use this special `type=name$foo` syntax?
Using `type=name` can generally be considered a style/readability/expressiveness choice if the AST node supports it. The `X` example would probably be better with the declaration spelled out, and the same goes for `Y` (probably remnants of the early days). There may be other trivial examples that could be spelled out.

There are for sure some more trivial cases which could be spelled out. I'll check on the documentation again tomorrow and provide some updates (also w.r.t. your other comment).

If we wanted to spell out the namespace, we could, but that would require writing the `NS` in a single line. That limitation is artificial; support for multi-line matches can probably be implemented in the script if we want to have the option.
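To make the tag syntax in this thread concrete: the body of a `\match{...}` command can be read as an optional `key=value;...` prefix, a `$` separator, and the match text. The sketch below is hypothetical. `parse_match` and `DEFAULT_MATCH_TAGS` are illustrative names rather than functions from the patch, and it assumes that `$` never occurs in a match that carries no tags. The defaults (`type=code`, `count=1`) are the ones stated in the PR description.

```python
from typing import Dict, Tuple

# Defaults as stated in the PR description: type is 'code', count is 1.
DEFAULT_MATCH_TAGS = {"type": "code", "count": "1"}


def parse_match(body: str) -> Tuple[Dict[str, str], str]:
    """Split the body of a \\match{...} command into (tags, match text)."""
    tags = dict(DEFAULT_MATCH_TAGS)
    prefix, sep, rest = body.partition("$")
    if not sep:
        # No '$' separator: the whole body is the match text.
        return tags, body
    # Tags are ';'-separated key=value pairs in front of the '$'.
    for item in prefix.split(";"):
        key, _, value = item.partition("=")
        tags[key.strip()] = value.strip()
    return tags, rest


if __name__ == "__main__":
    print(parse_match("type=name$NS"))
    print(parse_match("int a = 42"))
```

With this reading, `\match{type=name$NS}` compares the node's name against `NS`, while an untagged `\match{int a = 42}` compares against the node's source text.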
/// matches \match{void X()}, \match{type=name;count=2$C}
/// and \match{count=2$friend void X()}.
Can you explain `\match{type=name;count=2$C}`? I can see it matching `class C`, but I'm wondering what the second match is (and should we add a comment explaining that other match?).
|-FunctionDecl <line:1:1, col:8> col:6 X 'void ()'
`-CXXRecordDecl <line:2:1, line:4:1> line:2:7 class C definition
  |-DefinitionData pass_in_registers empty aggregate standard_layout trivially_copyable pod trivial literal has_constexpr_non_copy_move_ctor can_const_default_init
  | |-DefaultConstructor exists trivial constexpr needs_implicit defaulted_is_constexpr
  | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param
  | |-MoveConstructor exists simple trivial needs_implicit
  | |-CopyAssignment simple trivial has_const_param needs_implicit implicit_has_const_param
  | |-MoveAssignment exists simple trivial needs_implicit
  | `-Destructor simple irrelevant trivial needs_implicit
  |-CXXRecordDecl <col:1, col:7> col:7 implicit class C
  `-FriendDecl <line:3:5, col:19> col:17
    `-FunctionDecl parent 0xf23a388 prev 0xf284370 <col:5, col:19> col:17 friend X 'void ()'
> Can you explain `\match{type=name;count=2$C}`?

That is the `implicit class C` in the AST above. I couldn't access it from the top-level `C` and I couldn't find a way from the `implicit class C` back to the top-level one, so I don't know how to call it. I thought it would be a decl but not a definition, however, `getDefinition` returns a `nullptr` for the `implicit class C`.

> should we add a comment explaining that other match?

Certainly. I'll read the documentation again to see if there are more cases like this that could be improved as well.
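The `count=2` can be read directly off the AST dump in this reply. The following sketch models the declaration nodes from that dump as plain tuples and counts how many carry the name `C`. It is a simplified, hypothetical model for illustration only (the generated tests run real matchers, not this code), and `DECL_NODES` and `count_name_matches` are made-up names.

```python
from typing import List, Optional, Tuple

# (node kind, name) for each declaration node in the AST dump above.
# FriendDecl is listed without a name.
DECL_NODES: List[Tuple[str, Optional[str]]] = [
    ("FunctionDecl", "X"),    # void X();
    ("CXXRecordDecl", "C"),   # class C definition
    ("CXXRecordDecl", "C"),   # implicit class C inside the definition
    ("FriendDecl", None),     # friend void X();
    ("FunctionDecl", "X"),    # the function declared by the friend declaration
]


def count_name_matches(nodes: List[Tuple[str, Optional[str]]], name: str) -> int:
    """Count the nodes whose name equals `name`."""
    return sum(1 for _, node_name in nodes if node_name == name)


if __name__ == "__main__":
    # Both CXXRecordDecl nodes are named C, hence count=2 in the documentation.
    print(count_name_matches(DECL_NODES, "C"))
```

Because `decl()` matches every declaration node, the class definition and the implicit declaration nested inside it each produce a match named `C`.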
okay, thank you! I kind of figured it was the implicit class declaration.
> I wonder if we should add some documentation to ASTMatchers.h

I'll add one in a day or so. I updated the description with some more information, and I'll probably take parts of that as a basis for the comment in the header (and update the script comment as well).

> so that users who hit test failures with the new automatic tests have some more help getting to a solution.

There is now a `What if ...?` section in the PR description, which I will put into the header comment as well.
The changes generally LGTM, though I would appreciate a second set of eyes on the CMake and Python changes because I have a bit less confidence in my review abilities there.
Thank you for adding the documentation to the header file, I think that will help folks when working on their own matchers.
One question I have is: do you happen to know how this impacts build times for Clang itself? I'm assuming that if ASTMatchers.h isn't modified, CMake won't re-run `generate_ast_matcher_doc_tests.py` and so the compile time performance hit is only on full rebuilds or when changing the header?
Thanks.
The 'state' of the generated file is only checked when the
Excellent, thank you for the confirmation! That sounds reasonable to me. I think we've waited long enough for feedback on the cmake bits, so this is ready to land. We can address concerns post-commit. Do you need me to land the changes on your behalf?
615f30b to 51de627
c4a014e to fe4328c
rebase on trunk + rebased stack
51de627 to 26d5b03
Previously, the examples in the AST matcher reference, which gets generated by the doxygen comments in `ASTMatchers.h`, were untested and best effort. Some of the matchers had no or wrong examples of how to use the matcher.

This patch introduces a simple DSL around doxygen commands to enable testing the AST matcher documentation in a way that should be relatively easy.

In `ASTMatchers.h`, most matchers are documented with a doxygen comment. Most of these also have a code example that aims to show what the matcher will match, given a matcher somewhere in the documentation text. The documentation is tested by using doxygen's alias feature to declare custom aliases. These aliases forward to `<tt>text</tt>` (which is what doxygen's \c does, but for multiple words). Using the doxygen aliases was the obvious choice, because there are (now) four consumers:
- people reading the header/using signature help
- the doxygen generated documentation
- the generated html AST matcher reference
- (new) the generated matcher tests

This patch rewrites/extends the documentation such that all matchers have a documented example. The new `generate_ast_matcher_doc_tests.py` script will warn on any undocumented matchers (but not on matchers without a doxygen comment) and provides diagnostics and statistics about the matchers. Below is a file-level comment from the test generation script that describes how documenting matchers to be tested works on a slightly more technical level. In general, the new comments can be used as a reference for how to implement a tested documentation.

The current statistics emitted by the parser are:

```text
Statistics:
  doxygen_blocks : 519
  missing_tests : 10
  skipped_objc : 42
  code_snippets : 503
  matches : 820
  matchers : 580
  tested_matchers : 574
  none_type_matchers : 6
```

The tests are generated during building and the script will only print something if it found an issue (a compile failure, parsing issues, or a mismatch between the expected and actual number of failures).

DSL for generating the tests from documentation.

TLDR:
The order for a single code snippet example is:

  \header{a.h}
  \endheader     <- zero or more header

  \code
    int a = 42;
  \endcode
  \compile_args{-std=c++,c23-or-later} <- optional, supports std ranges and whole languages

  \matcher{expr()} <- one or more matchers in succession
  \match{42}       <- one or more matches in succession

  \matcher{varDecl()} <- new matcher resets the context, the above \match will not count for this new matcher(-group)
  \match{int a = 42}  <- only applies to the previous matcher (not the previous case)

The above block can be repeated inside of a doxygen command for multiple code examples.

Language Grammar:
  [] denotes an optional, and <> denotes user-input

  compile_args ::= \compile_args{[<compile_arg>;]<compile_arg>}
  matcher_tag_key ::= type
  match_tag_key ::= type || std || count
  matcher_tags ::= [matcher_tag_key=<value>;]matcher_tag_key=<value>
  match_tags ::= [match_tag_key=<value>;]match_tag_key=<value>
  matcher ::= \matcher{[matcher_tags$]<matcher>}
  matchers ::= [matcher] matcher
  match ::= \match{[match_tags$]<match>}
  matches ::= [match] match
  case ::= matchers matches
  cases ::= [case] case
  header-block ::= \header{<name>} <code> \endheader
  code-block ::= \code <code> \endcode
  testcase ::= code-block [compile_args] cases

The 'std' tag and '\compile_args' support specifying a specific language version, a whole language and all of its versions, and thresholds (implies ranges). Multiple arguments are passed with a ',' separator. For a language and version to execute a tested matcher, it has to match the specified '\compile_args' for the code, and the 'std' tag for the matcher. Predicates for the 'std' compiler flag are used with disjunction between languages (e.g. 'c || c++') and conjunction for all predicates specific to each language (e.g. 'c++11-or-later && c++23-or-earlier').

Examples:
- c                                    all available versions of C
- c++11                                only C++11
- c++11-or-later                       C++11 or later
- c++11-or-earlier                     C++11 or earlier
- c++11-or-later,c++23-or-earlier,c    all of C and C++ between 11 and 23 (inclusive)
- c++11-23,c                           same as above

Tags:

  Type:
    Match types are used to select where the string that is used to check if a node matches comes from. Available: code, name, typestr, typeofstr. The default is 'code'.
    Matcher types are used to mark matchers as submatchers with 'sub' or as deactivated using 'none'. Testing submatchers is not implemented.

  Count:
    Specifying a 'count=n' on a match will result in a test that requires that the specified match will be matched n times. Default is 1.

  Std:
    A match allows specifying if it matches only in specific language versions. This may be needed when the AST differs between language versions.

Fixes #57607
Fixes #63748
Problem Statement
Previously, the examples in the AST matcher reference, which gets generated by the doxygen comments in `ASTMatchers.h`, were untested and best effort. Some of the matchers had no or wrong examples of how to use the matcher.
Solution
This patch introduces a simple DSL around doxygen commands to enable testing the AST matcher documentation in a way that should be relatively easy to use.
In `ASTMatchers.h`, most matchers are documented with a doxygen comment. Most of these also have a code example that aims to show what the matcher will match, given a matcher somewhere in the documentation text. The documentation is tested by using doxygen's alias feature to declare custom aliases. These aliases forward to `<tt>text</tt>` (which is what doxygen's `\c` does, but for multiple words). Using the doxygen aliases is the obvious choice, because there are (now) four consumers:
- people reading the header/using signature help
- the doxygen generated documentation
- the generated html AST matcher reference
- (new) the generated matcher tests
This patch rewrites/extends the documentation such that all matchers have a documented example.
The new `generate_ast_matcher_doc_tests.py` script will warn on any undocumented matchers (but not on matchers without a doxygen comment) and provides diagnostics and statistics about the matchers.
The current statistics emitted by the parser are:
The tests are generated during building, and the script will only print something if it finds an issue (a compile failure, parsing issues, or a mismatch between the expected and actual number of failures).
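The "stay silent unless something is off" behaviour can be pictured with a small sketch. Only the name `expected_failure_statistics` and the statistic keys appear in this PR; the helper `diff_statistics` and the flat dictionary layout are assumptions made for illustration and are not taken from the actual script.

```python
def diff_statistics(expected: dict[str, int], actual: dict[str, int]) -> list[str]:
    """Return one diagnostic line per statistic that deviates (hypothetical helper)."""
    diagnostics = []
    for key in sorted(expected.keys() | actual.keys()):
        want = expected.get(key, 0)
        got = actual.get(key, 0)
        if want != got:
            diagnostics.append(f"{key}: expected {want}, got {got}")
    return diagnostics


# Assumed layout: a flat mapping from statistic name to the tolerated count.
expected_failure_statistics = {"missing_tests": 10, "skipped_objc": 42}

# Nothing is printed when the numbers agree; each deviation gets its own line.
for line in diff_statistics(expected_failure_statistics,
                            {"missing_tests": 11, "skipped_objc": 42}):
    print(line)
```

With this shape, accepting a deliberate change means updating the expected mapping, which mirrors the advice given for false positives in the "What if ...?" section.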
Description
DSL for generating the tests from documentation.
TLDR:
The above block can be repeated inside a doxygen command for multiple code examples for a single matcher.
The test generation script will only look for these annotations and ignore anything else, like `\c` or the sentences these annotations are embedded in: `The matcher \matcher{expr()} matches the number \match{42}.`
Language Grammar
[] denotes an optional, and <> denotes user-input
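As a rough illustration of the `\matcher{[matcher_tags$]<matcher>}` and `\match{[match_tags$]<match>}` forms from the grammar in the commit message, the sketch below pulls annotations out of a doc comment and splits off the optional `key=value;key=value$` prefix. The names `ANNOTATION`, `parse_annotation` and `extract` are hypothetical, the regular expression assumes the payload contains no `}`, and the real script may parse the comments quite differently.

```python
import re

# Matches \matcher{...} and \match{...}; assumes the payload contains no '}'.
ANNOTATION = re.compile(r"\\(matcher|match)\{([^}]*)\}")


def parse_annotation(payload: str) -> tuple[dict[str, str], str]:
    """Split 'key=value;key=value$text' into (tags, text)."""
    head, sep, rest = payload.partition("$")
    pairs = [item.partition("=") for item in head.split(";")]
    # Treat the prefix as tags only if every ';'-separated item is 'key=value'.
    if sep and all(eq and key.isidentifier() for key, eq, _ in pairs):
        return {key: value for key, _, value in pairs}, rest
    return {}, payload


def extract(doc: str) -> list[tuple[str, dict[str, str], str]]:
    """Collect (kind, tags, text) for every annotation, ignoring all other text."""
    return [(m.group(1), *parse_annotation(m.group(2)))
            for m in ANNOTATION.finditer(doc)]


# A doc comment body in the order described by the TLDR (illustrative only).
doc = r"""
\code
  int a = 42;
\endcode
\matcher{varDecl()}
\match{type=name;count=1$a}
"""
for kind, tags, text in extract(doc):
    print(kind, tags, text)
```

Because only the two annotation commands are matched, surrounding prose and other doxygen commands such as `\code` are skipped without special handling.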
Language Standard Versions
The 'std' tag and '\compile_args' support specifying a specific language version, a whole language and all of its versions, and thresholds (implies ranges). Multiple arguments are passed with a ',' separator. For a language and version to execute a tested matcher, it has to match the specified '\compile_args' for the code, and the 'std' tag for the matcher. Predicates for the 'std' compiler flag are used with disjunction between languages (e.g. 'c || c++') and conjunction for all predicates specific to each language (e.g. 'c++11-or-later && c++23-or-earlier').
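The disjunction/conjunction rule can be sketched as follows. This is a minimal model and not the script's implementation: the version lists in `VERSIONS`, the regular expression `SPEC`, and the functions `parse_spec` and `std_matches` are assumptions, and the input is the bare spec list without any `-std=` prefix.

```python
import re

# Assumed version lists, oldest first; the real script may know other versions.
VERSIONS = {
    "c": ["99", "11", "17", "23"],
    "c++": ["98", "11", "14", "17", "20", "23"],
}
SPEC = re.compile(r"(c\+\+|c)(?:(\d+)(?:-(or-later|or-earlier|\d+))?)?")


def parse_spec(spec: str) -> tuple[str, int, int]:
    """Turn one spec such as 'c++11-or-later' into (language, low, high) indices."""
    m = SPEC.fullmatch(spec)
    if m is None:
        raise ValueError(f"unsupported std spec: {spec!r}")
    lang, version, suffix = m.groups()
    known = VERSIONS[lang]
    low, high = 0, len(known) - 1  # a bare language covers all of its versions
    if version is not None:
        index = known.index(version)
        if suffix == "or-later":
            low = index
        elif suffix == "or-earlier":
            high = index
        elif suffix is None:
            low = high = index  # exactly one version
        else:
            low, high = index, known.index(suffix)  # explicit range, e.g. c++11-23
    return lang, low, high


def std_matches(specs: str, lang: str, version: str) -> bool:
    """OR across languages, AND across all specs that name the same language."""
    relevant = [(low, high)
                for name, low, high in map(parse_spec, specs.split(","))
                if name == lang]
    if not relevant:
        return False  # the language is not mentioned at all
    index = VERSIONS[lang].index(version)
    return all(low <= index <= high for low, high in relevant)


print(std_matches("c++11-or-later,c++23-or-earlier,c", "c++", "17"))
print(std_matches("c++11-or-later,c++23-or-earlier,c", "c++", "98"))
```

Versions are compared by their position in an ordered list instead of numerically, since C++98 precedes C++11 even though 98 is the larger number.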
Examples:
- `c`: all available versions of C
- `c++11`: only C++11
- `c++11-or-later`: C++11 or later
- `c++11-or-earlier`: C++11 or earlier
- `c++11-or-later,c++23-or-earlier,c`: all of C and C++ between 11 and 23 (inclusive)
- `c++11-23,c`: same as above
Tags
type:
Match types are used to select where the string that is used to check if a node matches comes from. Available: `code`, `name`, `typestr`, `typeofstr`. The default is `code`.
- `code`: Forwards to `tooling::fixit::getText(...)` and should be the preferred way to show what matches.
- `name`: Casts the match to a `NamedDecl` and returns the result of `getNameAsString`. Useful when the matched AST node is not easy to spell out (`code` type), e.g., namespaces or classes with many members.
- `typestr`: Returns the result of `QualType::getAsString` for the type derived from `Type` (otherwise, if it is derived from `Decl`, recurses with `Node->getTypeForDecl()`).
Matcher types are used to mark matchers as sub-matchers with 'sub' or as deactivated using 'none'. Testing sub-matchers is not implemented.
count:
Specifying a 'count=n' on a match will result in a test that requires that the specified match will be matched n times. Default is 1.
std:
A match allows specifying if it matches only in specific language versions. This may be needed when the AST differs between language versions.
sub:
The `sub` tag on a `\match` will indicate that the match is for a node of a bound sub-matcher. E.g., `\matcher{expr(expr().bind("inner"))}` has a sub-matcher that binds to `inner`, which is the value for the `sub` tag of the expected match for the sub-matcher `\match{sub=inner$...}`. Currently, sub-matchers are not tested in any way.
What if ...?
... I want to add a matcher?
Add a doxygen comment to the matcher with a code example, corresponding matchers and matches, that shows what the matcher is supposed to do. Specify the compile arguments/supported languages if required, and run `ninja check-clang-unit` to test the documentation.
... the example I wrote is wrong?
The test-generation script will try to compile your example code before it continues. This makes finding issues with your example code easier because the test-failures are much more verbose.
The test-failure output of the generated test file will provide information about
- [...] `ASTMatcher.h` the example is from
- [...] `type`s
- [...] `-target` flag (also in failure summary)
... I don't adhere to the required order of the syntax?
The script will diagnose any found issues, such as `matcher is missing an example`, with a `file:line:` prefix, which should provide enough information about the issue.
... the script diagnoses a false-positive issue with a doxygen comment?
It hopefully shouldn't, but if you, e.g., added some non-matcher code and documented it with doxygen, then the script will consider that as a matcher documentation. As a result, the script will print that it detected a mismatch between the actual and the expected number of failures. If the diagnostic truly is a false-positive, change the `expected_failure_statistics` at the top of the `generate_ast_matcher_doc_tests.py` file.
Fixes #57607
Fixes #63748