Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[clang][ExtractAPI] fix a couple crashes when used via libclang #132297

Merged
merged 2 commits into from
Mar 26, 2025

Conversation

QuietMisdreavus
Copy link
Contributor

This PR fixes two crashes in ExtractAPI that occur when decls are requested via libclang:

  • A null-dereference would sometimes happen in DeclarationFragmentsBuilder::getFragmentsForClassTemplateSpecialization when the template being processed was loaded indirectly via a typedef, with parameters filled in. The first commit loads the template parameter locations ahead of time to perform a null check before dereferencing.
  • An assertion (or another null-dereference) was happening in CXXRecordDecl::bases when processing a forward-declaration (i.e. a record without a definition). The second commit guards the use of bases in ExtractAPIVisitorBase::getBases by first checking that the decl in question has a complete definition.

The added test extract-api-cursor-cpp adds tests for these two scenarios to protect against the crash in the future.

Fixes rdar://140592475, fixes rdar://123430367

@llvmbot llvmbot added the clang Clang issues not falling into any other category label Mar 20, 2025
@llvmbot
Copy link
Member

llvmbot commented Mar 20, 2025

@llvm/pr-subscribers-clang

Author: QuietMisdreavus (QuietMisdreavus)

Changes

This PR fixes two crashes in ExtractAPI that occur when decls are requested via libclang:

  • A null-dereference would sometimes happen in DeclarationFragmentsBuilder::getFragmentsForClassTemplateSpecialization when the template being processed was loaded indirectly via a typedef, with parameters filled in. The first commit loads the template parameter locations ahead of time to perform a null check before dereferencing.
  • An assertion (or another null-dereference) was happening in CXXRecordDecl::bases when processing a forward-declaration (i.e. a record without a definition). The second commit guards the use of bases in ExtractAPIVisitorBase::getBases by first checking that the decl in question has a complete definition.

The added test extract-api-cursor-cpp adds tests for these two scenarios to protect against the crash in the future.

Fixes rdar://140592475, fixes rdar://123430367


Full diff: https://github.com/llvm/llvm-project/pull/132297.diff

3 Files Affected:

  • (modified) clang/include/clang/ExtractAPI/ExtractAPIVisitor.h (+22-20)
  • (modified) clang/lib/ExtractAPI/DeclarationFragments.cpp (+5-1)
  • (added) clang/test/Index/extract-api-cursor-cpp.cpp (+33)
diff --git a/clang/include/clang/ExtractAPI/ExtractAPIVisitor.h b/clang/include/clang/ExtractAPI/ExtractAPIVisitor.h
index e60440e14a9fe..13e51413017d5 100644
--- a/clang/include/clang/ExtractAPI/ExtractAPIVisitor.h
+++ b/clang/include/clang/ExtractAPI/ExtractAPIVisitor.h
@@ -175,28 +175,30 @@ class ExtractAPIVisitorBase : public RecursiveASTVisitor<Derived> {
   SmallVector<SymbolReference> getBases(const CXXRecordDecl *Decl) {
     // FIXME: store AccessSpecifier given by inheritance
     SmallVector<SymbolReference> Bases;
-    for (const auto &BaseSpecifier : Decl->bases()) {
-      // skip classes not inherited as public
-      if (BaseSpecifier.getAccessSpecifier() != AccessSpecifier::AS_public)
-        continue;
-      if (auto *BaseDecl = BaseSpecifier.getType()->getAsTagDecl()) {
-        Bases.emplace_back(createSymbolReferenceForDecl(*BaseDecl));
-      } else {
-        SymbolReference BaseClass;
-        BaseClass.Name = API.copyString(BaseSpecifier.getType().getAsString(
-            Decl->getASTContext().getPrintingPolicy()));
-
-        if (BaseSpecifier.getType().getTypePtr()->isTemplateTypeParmType()) {
-          if (auto *TTPTD = BaseSpecifier.getType()
-                                ->getAs<TemplateTypeParmType>()
-                                ->getDecl()) {
-            SmallString<128> USR;
-            index::generateUSRForDecl(TTPTD, USR);
-            BaseClass.USR = API.copyString(USR);
-            BaseClass.Source = API.copyString(getOwningModuleName(*TTPTD));
+    if (Decl->isCompleteDefinition()) {
+      for (const auto &BaseSpecifier : Decl->bases()) {
+        // skip classes not inherited as public
+        if (BaseSpecifier.getAccessSpecifier() != AccessSpecifier::AS_public)
+          continue;
+        if (auto *BaseDecl = BaseSpecifier.getType()->getAsTagDecl()) {
+          Bases.emplace_back(createSymbolReferenceForDecl(*BaseDecl));
+        } else {
+          SymbolReference BaseClass;
+          BaseClass.Name = API.copyString(BaseSpecifier.getType().getAsString(
+              Decl->getASTContext().getPrintingPolicy()));
+
+          if (BaseSpecifier.getType().getTypePtr()->isTemplateTypeParmType()) {
+            if (auto *TTPTD = BaseSpecifier.getType()
+                                  ->getAs<TemplateTypeParmType>()
+                                  ->getDecl()) {
+              SmallString<128> USR;
+              index::generateUSRForDecl(TTPTD, USR);
+              BaseClass.USR = API.copyString(USR);
+              BaseClass.Source = API.copyString(getOwningModuleName(*TTPTD));
+            }
           }
+          Bases.emplace_back(BaseClass);
         }
-        Bases.emplace_back(BaseClass);
       }
     }
     return Bases;
diff --git a/clang/lib/ExtractAPI/DeclarationFragments.cpp b/clang/lib/ExtractAPI/DeclarationFragments.cpp
index 66c03863293c2..480e33f607bb0 100644
--- a/clang/lib/ExtractAPI/DeclarationFragments.cpp
+++ b/clang/lib/ExtractAPI/DeclarationFragments.cpp
@@ -1225,6 +1225,10 @@ DeclarationFragments
 DeclarationFragmentsBuilder::getFragmentsForClassTemplateSpecialization(
     const ClassTemplateSpecializationDecl *Decl) {
   DeclarationFragments Fragments;
+  std::optional<ArrayRef<TemplateArgumentLoc>> TemplateArgumentLocs = {};
+  if (auto *TemplateArgs = Decl->getTemplateArgsAsWritten()) {
+    TemplateArgumentLocs = TemplateArgs->arguments();
+  }
   return Fragments
       .append("template", DeclarationFragments::FragmentKind::Keyword)
       .appendSpace()
@@ -1237,7 +1241,7 @@ DeclarationFragmentsBuilder::getFragmentsForClassTemplateSpecialization(
       .append("<", DeclarationFragments::FragmentKind::Text)
       .append(getFragmentsForTemplateArguments(
           Decl->getTemplateArgs().asArray(), Decl->getASTContext(),
-          Decl->getTemplateArgsAsWritten()->arguments()))
+          TemplateArgumentLocs))
       .append(">", DeclarationFragments::FragmentKind::Text)
       .appendSemicolon();
 }
diff --git a/clang/test/Index/extract-api-cursor-cpp.cpp b/clang/test/Index/extract-api-cursor-cpp.cpp
new file mode 100644
index 0000000000000..c3219ddece73a
--- /dev/null
+++ b/clang/test/Index/extract-api-cursor-cpp.cpp
@@ -0,0 +1,33 @@
+// Test is line- and column-sensitive. Run lines are below
+
+template <typename T>
+class basic_vector {
+public:
+    T x;
+    T y;
+};
+
+using my_vec = basic_vector<int>;
+
+class MyClass {
+    my_vec myVec;
+};
+
+struct OuterStruct {
+    struct InnerStruct;
+    int outer_field;
+};
+
+// RUN: c-index-test -single-symbol-sgf-at=%s:13:13 local %s | FileCheck --check-prefix=CHECK-MYVEC %s
+// CHECK-MYVEC: "parentContexts":[{"kind":"c++.class","name":"MyClass","usr":"c:@S@MyClass"},{"kind":"c++.property","name":"myVec","usr":"c:@S@MyClass@FI@myVec"}]
+// CHECK-MYVEC: "identifier":{"interfaceLanguage":"c++","precise":"c:@S@MyClass@FI@myVec"}
+// CHECK-MYVEC: "kind":{"displayName":"Instance Property","identifier":"c++.property"}
+// CHECK-MYVEC: "title":"myVec"
+// CHECK-MYVEC: "pathComponents":["MyClass","myVec"]
+
+// RUN: c-index-test -single-symbol-sgf-at=%s:17:17 local %s | FileCheck --check-prefix=CHECK-INNER %s
+// CHECK-INNER: "parentContexts":[{"kind":"c++.struct","name":"OuterStruct","usr":"c:@S@OuterStruct"},{"kind":"c++.struct","name":"InnerStruct","usr":"c:@S@OuterStruct@S@InnerStruct"}]
+// CHECK-INNER: "identifier":{"interfaceLanguage":"c++","precise":"c:@S@OuterStruct@S@InnerStruct"}
+// CHECK-INNER: "kind":{"displayName":"Structure","identifier":"c++.struct"}
+// CHECK-INNER: "title":"InnerStruct"
+// CHECK-INNER: "pathComponents":["OuterStruct","InnerStruct"]

Copy link
Member

@zixu-w zixu-w left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little bit worried that we are seeing incomplete decls and fixing by guarding with null-checks. The concern is that we could be losing information by just skipping over these null fields without knowing all probable causes. For example I've seen bugs with redecl chains and modules that leave decls in incomplete states (not saying that this applies to ExtractAPI, it very likely does not). And I'm not an expert in libclang to understand what to expect from an AST from it.

I guess if this is for single-line-symbol-graph, it probably doesn't matter that much if we are not getting the missing information? But then the visitor code is shared with the full ExtractAPI scan and it would be nice to understand if this change would impact that.


// RUN: c-index-test -single-symbol-sgf-at=%s:13:7 local %s | FileCheck --check-prefix=CHECK-VEC-TYPE %s
// CHECK-VEC-TYPE: "parentContexts":[{"kind":"c++.typealias","name":"my_vec","usr":"c:@my_vec"}]
// CHECK-VEC-TYPE: "declarationFragments":[{"kind":"keyword","spelling":"typedef"},{"kind":"text","spelling":" "},{"kind":"typeIdentifier","preciseIdentifier":"c:@ST>1#T@basic_vector","spelling":"basic_vector"},{"kind":"text","spelling":"<"},{"kind":"typeIdentifier","preciseIdentifier":"c:I","spelling":"int"},{"kind":"text","spelling":"> "},{"kind":"identifier","spelling":"my_vec"},{"kind":"text","spelling":";"}]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not an issue for this change, but I guess for further C++ support we might want to correctly record the spelled keyword using in the decl fragments. Right now this is using typedef for all type aliases.

Might be an editorial choice to use a consistent type alias keyword, but it might be better to preserve the actual spelling. Plus I think there are some minor distinctions and eligible contexts between the two.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely but this should definitely be handled in a follow up patch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, i agree that this is good, but we should track that separately.

@QuietMisdreavus
Copy link
Contributor Author

I rebased the PR to squash the review commits into their respective patches, but then i noticed that i can't do a merge that preserves the individual commits without pushing to main directly. 😅 Thanks for the review!

@QuietMisdreavus QuietMisdreavus merged commit 3386156 into llvm:main Mar 26, 2025
10 of 11 checks passed
@QuietMisdreavus QuietMisdreavus deleted the libclang-sgf-crashes branch March 26, 2025 23:46
QuietMisdreavus added a commit to swiftlang/llvm-project that referenced this pull request Mar 26, 2025
…#132297)

This PR fixes two crashes in ExtractAPI that occur when decls are
requested via libclang:

- A null-dereference would sometimes happen in
`DeclarationFragmentsBuilder::getFragmentsForClassTemplateSpecialization`
when the template being processed was loaded indirectly via a typedef,
with parameters filled in. The first commit loads the template parameter
locations ahead of time to perform a null check before dereferencing.
- An assertion (or another null-dereference) was happening in
`CXXRecordDecl::bases` when processing a forward-declaration (i.e. a
record without a definition). The second commit guards the use of
`bases` in `ExtractAPIVisitorBase::getBases` by first checking that the
decl in question has a complete definition.

The added test `extract-api-cursor-cpp` adds tests for these two
scenarios to protect against the crash in the future.

Fixes rdar://140592475, fixes rdar://123430367
QuietMisdreavus added a commit to swiftlang/llvm-project that referenced this pull request Mar 27, 2025
…#132297) (#10362)

This PR fixes two crashes in ExtractAPI that occur when decls are
requested via libclang:

- A null-dereference would sometimes happen in
`DeclarationFragmentsBuilder::getFragmentsForClassTemplateSpecialization`
when the template being processed was loaded indirectly via a typedef,
with parameters filled in. The first commit loads the template parameter
locations ahead of time to perform a null check before dereferencing.
- An assertion (or another null-dereference) was happening in
`CXXRecordDecl::bases` when processing a forward-declaration (i.e. a
record without a definition). The second commit guards the use of
`bases` in `ExtractAPIVisitorBase::getBases` by first checking that the
decl in question has a complete definition.

The added test `extract-api-cursor-cpp` adds tests for these two
scenarios to protect against the crash in the future.

Fixes rdar://140592475, fixes rdar://123430367
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants