Skip to content

Conversation

alejandro-alvarez-sonarsource
Copy link
Contributor

lookupRuntimeDefinition assumed that a process would handle only one TU. This is not true for unit tests, for instance. Multiple snippets of code get parsed, and their AST are unloaded each time.

Since the cache relies on pointers as keys, if the same address happens to be reused between runs, the cache would return a stale pointer, potentially causing a segmentation fault. This is not that unlikely if the snippets are similar, which would trigger similar allocation patterns.

CPP-4889

`lookupRuntimeDefinition` assumed that a process would handle only one
TU. This is not true for unit tests, for instance. Multiple snippets of
code would get parsed, and their AST would be unloaded each time.

Since the cache relies on pointers as keys, if the same address happens
to be reused between runs, the cache would return a stale pointer,
potentially causing a segmentation fault. This is not that unlikely if
the snippets are similar, which would trigger similar allocation
patterns.

CPP-4889
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:static analyzer labels Sep 30, 2025
@llvmbot
Copy link
Member

llvmbot commented Sep 30, 2025

@llvm/pr-subscribers-clang-static-analyzer-1

Author: Alejandro Álvarez Ayllón (alejandro-alvarez-sonarsource)

Changes

lookupRuntimeDefinition assumed that a process would handle only one TU. This is not true for unit tests, for instance. Multiple snippets of code get parsed, and their AST are unloaded each time.

Since the cache relies on pointers as keys, if the same address happens to be reused between runs, the cache would return a stale pointer, potentially causing a segmentation fault. This is not that unlikely if the snippets are similar, which would trigger similar allocation patterns.

CPP-4889


Full diff: https://github.com/llvm/llvm-project/pull/161327.diff

3 Files Affected:

  • (modified) clang/include/clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h (+1-1)
  • (modified) clang/lib/StaticAnalyzer/Core/CallEvent.cpp (+17-14)
  • (modified) clang/unittests/StaticAnalyzer/CallEventTest.cpp (+40)
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h
index 5dcf03f7a4648..c233ca1af0256 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h
@@ -1414,7 +1414,7 @@ class CallEventManager {
   }
 
 public:
-  CallEventManager(llvm::BumpPtrAllocator &alloc) : Alloc(alloc) {}
+  CallEventManager(llvm::BumpPtrAllocator &alloc);
 
   /// Gets an outside caller given a callee context.
   CallEventRef<> getCaller(const StackFrameContext *CalleeCtx,
diff --git a/clang/lib/StaticAnalyzer/Core/CallEvent.cpp b/clang/lib/StaticAnalyzer/Core/CallEvent.cpp
index 180056cf68b64..06ba01507fa4f 100644
--- a/clang/lib/StaticAnalyzer/Core/CallEvent.cpp
+++ b/clang/lib/StaticAnalyzer/Core/CallEvent.cpp
@@ -1254,6 +1254,15 @@ template <> struct DenseMapInfo<PrivateMethodKey> {
 };
 } // end namespace llvm
 
+// NOTE: This cache is a "global" variable, and it is cleared by
+// CallEventManager's constructor so we do not keep old entries when
+// loading/unloading ASTs. If we are worried about concurrency, we may  need to
+// revisit this someday. In terms of memory, this table stays around until clang
+// quits, which also may be bad if we need to release memory.
+using PrivateMethodCacheTy =
+    llvm::DenseMap<PrivateMethodKey, std::optional<const ObjCMethodDecl *>>;
+static PrivateMethodCacheTy PrivateMethodCache;
+
 static const ObjCMethodDecl *
 lookupRuntimeDefinition(const ObjCInterfaceDecl *Interface,
                         Selector LookupSelector, bool InstanceMethod) {
@@ -1262,21 +1271,8 @@ lookupRuntimeDefinition(const ObjCInterfaceDecl *Interface,
   // that repeated queries on the same ObjCIntefaceDecl and Selector
   // don't incur the same cost.  On some test cases, we can see the
   // same query being issued thousands of times.
-  //
-  // NOTE: This cache is essentially a "global" variable, but it
-  // only gets lazily created when we get here.  The value of the
-  // cache probably comes from it being global across ExprEngines,
-  // where the same queries may get issued.  If we are worried about
-  // concurrency, or possibly loading/unloading ASTs, etc., we may
-  // need to revisit this someday.  In terms of memory, this table
-  // stays around until clang quits, which also may be bad if we
-  // need to release memory.
-  using PrivateMethodCache =
-      llvm::DenseMap<PrivateMethodKey, std::optional<const ObjCMethodDecl *>>;
-
-  static PrivateMethodCache PMC;
   std::optional<const ObjCMethodDecl *> &Val =
-      PMC[{Interface, LookupSelector, InstanceMethod}];
+      PrivateMethodCache[{Interface, LookupSelector, InstanceMethod}];
 
   // Query lookupPrivateMethod() if the cache does not hit.
   if (!Val) {
@@ -1422,6 +1418,13 @@ void ObjCMethodCall::getInitialStackFrameContents(
   }
 }
 
+CallEventManager::CallEventManager(llvm::BumpPtrAllocator &alloc)
+    : Alloc(alloc) {
+  // Clear the method cache to avoid hits when multiple AST are loaded/unloaded
+  // within a single process. This can happen with unit tests, for instance.
+  PrivateMethodCache.clear();
+}
+
 CallEventRef<>
 CallEventManager::getSimpleCall(const CallExpr *CE, ProgramStateRef State,
                                 const LocationContext *LCtx,
diff --git a/clang/unittests/StaticAnalyzer/CallEventTest.cpp b/clang/unittests/StaticAnalyzer/CallEventTest.cpp
index 8b5289ea7472b..9015afdae87fc 100644
--- a/clang/unittests/StaticAnalyzer/CallEventTest.cpp
+++ b/clang/unittests/StaticAnalyzer/CallEventTest.cpp
@@ -84,6 +84,46 @@ TEST(CXXDeallocatorCall, SimpleDestructor) {
 #endif
 }
 
+TEST(PrivateMethodCache, NeverReturnDanglingPointersWithMultipleASTs) {
+  // Each iteration will load and unload an AST multiple times. Since the code is always the same,
+  // we increase the chance of hitting a bug in the private method cache, returning a dangling pointer and
+  // crashing the process. If the cache is properly cleared between runs, the test should pass.
+  for (int I = 0; I < 100; ++I) {
+    auto const *Code = R"(
+    typedef __typeof(sizeof(int)) size_t;
+
+    extern void *malloc(size_t size);
+    extern void *memcpy(void *dest, const void *src, size_t n);
+
+    @interface SomeMoreData {
+      char const* _buffer;
+      int _size;
+    }
+    @property(nonatomic, readonly) const char* buffer;
+    @property(nonatomic) int size;
+
+    - (void)appendData:(SomeMoreData*)other;
+
+    @end
+
+    @implementation SomeMoreData
+    @synthesize size = _size;
+    @synthesize buffer = _buffer;
+
+    - (void)appendData:(SomeMoreData*)other {
+      int const len = (_size + other.size); // implicit self._length
+      char* d = malloc(sizeof(char) * len);
+      memcpy(d + 20, other.buffer, len);
+    }
+
+    @end
+  )";
+    std::string Diags;
+    EXPECT_TRUE(runCheckerOnCodeWithArgs<addCXXDeallocatorChecker>(
+        Code, {"-x", "objective-c", "-Wno-objc-root-class"}, Diags));
+  }
+}
+
 } // namespace
 } // namespace ento
 } // namespace clang

@llvmbot
Copy link
Member

llvmbot commented Sep 30, 2025

@llvm/pr-subscribers-clang

Author: Alejandro Álvarez Ayllón (alejandro-alvarez-sonarsource)

Changes

lookupRuntimeDefinition assumed that a process would handle only one TU. This is not true for unit tests, for instance. Multiple snippets of code get parsed, and their AST are unloaded each time.

Since the cache relies on pointers as keys, if the same address happens to be reused between runs, the cache would return a stale pointer, potentially causing a segmentation fault. This is not that unlikely if the snippets are similar, which would trigger similar allocation patterns.

CPP-4889


Full diff: https://github.com/llvm/llvm-project/pull/161327.diff

3 Files Affected:

  • (modified) clang/include/clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h (+1-1)
  • (modified) clang/lib/StaticAnalyzer/Core/CallEvent.cpp (+17-14)
  • (modified) clang/unittests/StaticAnalyzer/CallEventTest.cpp (+40)
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h
index 5dcf03f7a4648..c233ca1af0256 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h
@@ -1414,7 +1414,7 @@ class CallEventManager {
   }
 
 public:
-  CallEventManager(llvm::BumpPtrAllocator &alloc) : Alloc(alloc) {}
+  CallEventManager(llvm::BumpPtrAllocator &alloc);
 
   /// Gets an outside caller given a callee context.
   CallEventRef<> getCaller(const StackFrameContext *CalleeCtx,
diff --git a/clang/lib/StaticAnalyzer/Core/CallEvent.cpp b/clang/lib/StaticAnalyzer/Core/CallEvent.cpp
index 180056cf68b64..06ba01507fa4f 100644
--- a/clang/lib/StaticAnalyzer/Core/CallEvent.cpp
+++ b/clang/lib/StaticAnalyzer/Core/CallEvent.cpp
@@ -1254,6 +1254,15 @@ template <> struct DenseMapInfo<PrivateMethodKey> {
 };
 } // end namespace llvm
 
+// NOTE: This cache is a "global" variable, and it is cleared by
+// CallEventManager's constructor so we do not keep old entries when
+// loading/unloading ASTs. If we are worried about concurrency, we may  need to
+// revisit this someday. In terms of memory, this table stays around until clang
+// quits, which also may be bad if we need to release memory.
+using PrivateMethodCacheTy =
+    llvm::DenseMap<PrivateMethodKey, std::optional<const ObjCMethodDecl *>>;
+static PrivateMethodCacheTy PrivateMethodCache;
+
 static const ObjCMethodDecl *
 lookupRuntimeDefinition(const ObjCInterfaceDecl *Interface,
                         Selector LookupSelector, bool InstanceMethod) {
@@ -1262,21 +1271,8 @@ lookupRuntimeDefinition(const ObjCInterfaceDecl *Interface,
   // that repeated queries on the same ObjCIntefaceDecl and Selector
   // don't incur the same cost.  On some test cases, we can see the
   // same query being issued thousands of times.
-  //
-  // NOTE: This cache is essentially a "global" variable, but it
-  // only gets lazily created when we get here.  The value of the
-  // cache probably comes from it being global across ExprEngines,
-  // where the same queries may get issued.  If we are worried about
-  // concurrency, or possibly loading/unloading ASTs, etc., we may
-  // need to revisit this someday.  In terms of memory, this table
-  // stays around until clang quits, which also may be bad if we
-  // need to release memory.
-  using PrivateMethodCache =
-      llvm::DenseMap<PrivateMethodKey, std::optional<const ObjCMethodDecl *>>;
-
-  static PrivateMethodCache PMC;
   std::optional<const ObjCMethodDecl *> &Val =
-      PMC[{Interface, LookupSelector, InstanceMethod}];
+      PrivateMethodCache[{Interface, LookupSelector, InstanceMethod}];
 
   // Query lookupPrivateMethod() if the cache does not hit.
   if (!Val) {
@@ -1422,6 +1418,13 @@ void ObjCMethodCall::getInitialStackFrameContents(
   }
 }
 
+CallEventManager::CallEventManager(llvm::BumpPtrAllocator &alloc)
+    : Alloc(alloc) {
+  // Clear the method cache to avoid hits when multiple AST are loaded/unloaded
+  // within a single process. This can happen with unit tests, for instance.
+  PrivateMethodCache.clear();
+}
+
 CallEventRef<>
 CallEventManager::getSimpleCall(const CallExpr *CE, ProgramStateRef State,
                                 const LocationContext *LCtx,
diff --git a/clang/unittests/StaticAnalyzer/CallEventTest.cpp b/clang/unittests/StaticAnalyzer/CallEventTest.cpp
index 8b5289ea7472b..9015afdae87fc 100644
--- a/clang/unittests/StaticAnalyzer/CallEventTest.cpp
+++ b/clang/unittests/StaticAnalyzer/CallEventTest.cpp
@@ -84,6 +84,46 @@ TEST(CXXDeallocatorCall, SimpleDestructor) {
 #endif
 }
 
+TEST(PrivateMethodCache, NeverReturnDanglingPointersWithMultipleASTs) {
+  // Each iteration will load and unload an AST multiple times. Since the code is always the same,
+  // we increase the chance of hitting a bug in the private method cache, returning a dangling pointer and
+  // crashing the process. If the cache is properly cleared between runs, the test should pass.
+  for (int I = 0; I < 100; ++I) {
+    auto const *Code = R"(
+    typedef __typeof(sizeof(int)) size_t;
+
+    extern void *malloc(size_t size);
+    extern void *memcpy(void *dest, const void *src, size_t n);
+
+    @interface SomeMoreData {
+      char const* _buffer;
+      int _size;
+    }
+    @property(nonatomic, readonly) const char* buffer;
+    @property(nonatomic) int size;
+
+    - (void)appendData:(SomeMoreData*)other;
+
+    @end
+
+    @implementation SomeMoreData
+    @synthesize size = _size;
+    @synthesize buffer = _buffer;
+
+    - (void)appendData:(SomeMoreData*)other {
+      int const len = (_size + other.size); // implicit self._length
+      char* d = malloc(sizeof(char) * len);
+      memcpy(d + 20, other.buffer, len);
+    }
+
+    @end
+  )";
+    std::string Diags;
+    EXPECT_TRUE(runCheckerOnCodeWithArgs<addCXXDeallocatorChecker>(
+        Code, {"-x", "objective-c", "-Wno-objc-root-class"}, Diags));
+  }
+}
+
 } // namespace
 } // namespace ento
 } // namespace clang

@balazs-benics-sonarsource
Copy link
Contributor

I remember debugging this. Fun times.
I wonder if we could do even better by smuggling in some reference to that helper function, so we could move that global into a field. E.g. as a sibling of the allocator pool.

@alejandro-alvarez-sonarsource
Copy link
Contributor Author

alejandro-alvarez-sonarsource commented Sep 30, 2025

I remember debugging this. Fun times. I wonder if we could do even better by smuggling in some reference to that helper function, so we could move that global into a field. E.g. as a sibling of the allocator pool.

I have been looking at this, and I am not sure is it worth? IIUC the allocator comes from ExprEngine.Engine.G, and the CallEventManager belongs to ProgramStateManager that also belongs to the ExprEngine. So they should be destroyed at the "same" time.
This said, instead of having a static, just having it be a member of CallEvenManager should be equivalent (and thread safe), shouldn't it?

Ah, but the content of the cache comes from a different allocator pool. Maybe creating it into ASTContext and rely on AddDeallocation would work and would keep the cache alive for longer.

It seems you did, so if you say it's not worth it, I'm not going to insist.

TBH I did after you proposed to consider it :)

@balazs-benics-sonarsource
Copy link
Contributor

I remember debugging this. Fun times. I wonder if we could do even better by smuggling in some reference to that helper function, so we could move that global into a field. E.g. as a sibling of the allocator pool.

I have been looking at this, and I am not sure is it worth? IIUC the allocator comes from ExprEngine.Engine.G, and the CallEventManager belongs to ProgramStateManager that also belongs to the ExprEngine. So they should be destroyed at the "same" time. This said, instead of having a static, just having it be a member of CallEvenManager should be equivalent (and thread safe), shouldn't it?

Technically the solution is correct. It's just a matter of taste of using globals. That said, I'm not opposing, I just wanted to know if you considered this at all. It seems you did, so if you say it's not worth it, I'm not going to insist.

@balazs-benics-sonarsource
Copy link
Contributor

LGTM btw. @NagyDonat

Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, with the disclaimer that I'm unfamiliar with the handling of Objective-C in the analyzer.

Obviously this approach with the global cache is not exactly "best practices", but the PR suggests a straightforward fix for a concrete issue, so there is no reason to oppose it.

@alejandro-alvarez-sonarsource alejandro-alvarez-sonarsource merged commit 5f5a84e into llvm:main Oct 2, 2025
9 checks passed
@alejandro-alvarez-sonarsource alejandro-alvarez-sonarsource deleted the aa/upstream/objc_method_cache branch October 2, 2025 08:02
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Oct 3, 2025
…61327)

`lookupRuntimeDefinition` assumed that a process would handle only one
TU. This is not true for unit tests, for instance. Multiple snippets of
code get parsed, and their AST are unloaded each time.

Since the cache relies on pointers as keys, if the same address happens
to be reused between runs, the cache would return a stale pointer,
potentially causing a segmentation fault. This is not that unlikely if
the snippets are similar, which would trigger similar allocation
patterns.

CPP-4889
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:static analyzer clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants