balazske
Collaborator

Symbols used for dynamic extent information of memory regions are now kept as live as long as the memory region exists.

@llvmbot added the clang and clang:static analyzer labels Oct 15, 2025
@llvmbot
Member

llvmbot commented Oct 15, 2025

@llvm/pr-subscribers-clang

Author: Balázs Kéri (balazske)

Changes

Symbols used for dynamic extent information of memory regions are now kept as live as long as the memory region exists.


Full diff: https://github.com/llvm/llvm-project/pull/163562.diff

4 Files Affected:

  • (modified) clang/include/clang/StaticAnalyzer/Core/PathSensitive/DynamicExtent.h (+2)
  • (modified) clang/lib/StaticAnalyzer/Core/DynamicExtent.cpp (+6)
  • (modified) clang/lib/StaticAnalyzer/Core/ExprEngine.cpp (+2)
  • (modified) clang/test/Analysis/ArrayBound/verbose-tests.c (-18)
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/DynamicExtent.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/DynamicExtent.h
index 1a9bef06b15a4..440603fb4d8c7 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/DynamicExtent.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/DynamicExtent.h
@@ -58,6 +58,8 @@ SVal getDynamicExtentWithOffset(ProgramStateRef State, SVal BufV);
 DefinedOrUnknownSVal getDynamicElementCountWithOffset(ProgramStateRef State,
                                                       SVal BufV, QualType Ty);
 
+void markAllDynamicExtentLive(ProgramStateRef State, SymbolReaper &SymReaper);
+
 } // namespace ento
 } // namespace clang
 
diff --git a/clang/lib/StaticAnalyzer/Core/DynamicExtent.cpp b/clang/lib/StaticAnalyzer/Core/DynamicExtent.cpp
index 34078dbce0b68..e436b186a2148 100644
--- a/clang/lib/StaticAnalyzer/Core/DynamicExtent.cpp
+++ b/clang/lib/StaticAnalyzer/Core/DynamicExtent.cpp
@@ -128,5 +128,11 @@ ProgramStateRef setDynamicExtent(ProgramStateRef State, const MemRegion *MR,
   return State->set<DynamicExtentMap>(MR->StripCasts(), Size);
 }
 
+void markAllDynamicExtentLive(ProgramStateRef State, SymbolReaper &SymReaper) {
+  for (const auto &I : State->get<DynamicExtentMap>())
+    if (SymbolRef Sym = I.second.getAsSymbol())
+      SymReaper.markLive(Sym);
+}
+
 } // namespace ento
 } // namespace clang
diff --git a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
index 785cdfa15bf04..d9ddc12c54985 100644
--- a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
@@ -1064,6 +1064,8 @@ void ExprEngine::removeDead(ExplodedNode *Pred, ExplodedNodeSet &Out,
       SymReaper.markLive(MR);
   }
 
+  markAllDynamicExtentLive(CleanedState, SymReaper);
+
   getCheckerManager().runCheckersForLiveSymbols(CleanedState, SymReaper);
 
   // Create a state in which dead bindings are removed from the environment
diff --git a/clang/test/Analysis/ArrayBound/verbose-tests.c b/clang/test/Analysis/ArrayBound/verbose-tests.c
index e3416886d13e5..9ee290ab6b5b8 100644
--- a/clang/test/Analysis/ArrayBound/verbose-tests.c
+++ b/clang/test/Analysis/ArrayBound/verbose-tests.c
@@ -381,30 +381,12 @@ int *symbolicExtent(int arg) {
     return 0;
   int *mem = (int*)malloc(arg);
 
-  // TODO: without the following reference to 'arg', the analyzer would discard
-  // the range information about (the symbolic value of) 'arg'. This is
-  // incorrect because while the variable itself is inaccessible, it becomes
-  // the symbolic extent of 'mem', so we still want to reason about its
-  // potential values.
-  (void)arg;
-
   mem[8] = -2;
   // expected-warning@-1 {{Out of bound access to memory after the end of the heap area}}
   // expected-note@-2 {{Access of 'int' element in the heap area at index 8}}
   return mem;
 }
 
-int *symbolicExtentDiscardedRangeInfo(int arg) {
-  // This is a copy of the case 'symbolicExtent' without the '(void)arg' hack.
-  // TODO: if the analyzer can detect the out-of-bounds access within this
-  // testcase, then remove this and the `(void)arg` hack from `symbolicExtent`.
-  if (arg >= 5)
-    return 0;
-  int *mem = (int*)malloc(arg);
-  mem[8] = -2;
-  return mem;
-}
-
 void symbolicIndex(int arg) {
   // expected-note@+2 {{Assuming 'arg' is >= 12}}
   // expected-note@+1 {{Taking true branch}}

@llvmbot
Member

llvmbot commented Oct 15, 2025

@llvm/pr-subscribers-clang-static-analyzer-1


Contributor

@NagyDonat NagyDonat left a comment

Thanks for working on this issue, I'm very happy to see this patch!

What happens when you have a symbolic region (e.g. allocated by malloc) with a dynamic extent, then both the region and its extent symbol become inaccessible at the same time? Will you prolong the lifetime of the extent symbol? Is it eventually "garbage-collected"?

My first instinct is that instead of markAllDynamicExtentLive you would need a loop that marks the dynamic extent live only if the region itself is live. (Or non-symbolic -- by the way, can non-symbolic regions have a symbolic dynamic extent?)
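The difference between the two marking strategies can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyReaper`, `ToyExtentMap`, `markExtents`, and `sweepToyState` are hypothetical stand-ins invented for illustration and are not the Clang Static Analyzer's `SymbolReaper` or `DynamicExtentMap` API. Regions and symbols are plain integers, and a missing extent symbol models a concrete (non-symbolic) extent.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <set>

// Hypothetical stand-ins for illustration only; not the Clang API.
using RegionId = int;
using SymbolId = int;

struct ToyReaper {
  std::set<RegionId> liveRegions;
  std::set<SymbolId> liveSymbols;
  bool isLiveRegion(RegionId r) const { return liveRegions.count(r) != 0; }
  bool isLive(SymbolId s) const { return liveSymbols.count(s) != 0; }
  void markLive(SymbolId s) { liveSymbols.insert(s); }
};

// Region -> extent symbol; std::nullopt models a concrete extent.
using ToyExtentMap = std::map<RegionId, std::optional<SymbolId>>;

// With onlyLiveRegions == false every symbolic extent is marked live.
// With onlyLiveRegions == true an extent is marked live only while its
// region is live, so it can be collected together with the region.
void markExtents(const ToyExtentMap &extents, ToyReaper &reaper,
                 bool onlyLiveRegions) {
  for (const auto &entry : extents) {
    if (!entry.second)
      continue; // concrete extent, nothing to keep alive
    if (onlyLiveRegions && !reaper.isLiveRegion(entry.first))
      continue; // region is dead, let its extent symbol die too
    reaper.markLive(*entry.second);
  }
}

// Scenario: region 1 is live with extent symbol 100, region 2 is dead
// with extent symbol 200, region 3 is live with a concrete extent.
ToyReaper sweepToyState(bool onlyLiveRegions) {
  ToyExtentMap extents;
  extents[1] = 100;
  extents[2] = 200;
  extents[3] = std::nullopt;

  ToyReaper reaper;
  reaper.liveRegions = {1, 3};
  markExtents(extents, reaper, onlyLiveRegions);
  assert(reaper.isLive(100)); // the live region's extent survives either way
  return reaper;
}
```

In this model, `sweepToyState(false)` keeps symbol 200 alive even though its region is dead, while `sweepToyState(true)` leaves it unmarked so that a later cleanup could collect it.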

@steakhal
Contributor

This patch would effectively leak all the extent symbols.
If that were the objective, we could achieve it by hardcoding that rule in the SymbolReaper.

There is also quite some history on Discourse around the liveness of extent symbols,
for example https://discourse.llvm.org/t/keeping-the-extent-symbol-alive/76676, and probably more.

I wish you could write up what we currently have and the options to fix this for extents (and possibly for any other symbols for that matter, because extents are not alone in this issue).

@balazske
Collaborator Author

With the current version, the extent symbol should become "dead" (no longer live) when the memory region becomes dead. The liveness of the extent symbol has no effect on the liveness of the memory region. If the leak of the memory region can be detected, the leak of the extent can be detected too (maybe at a later point than before the patch). Normally the extent symbol is not something that can leak, so this is a rare case anyway. The purpose of this patch is only to keep the information about the extent of the memory region available. If the original symbol would have been garbage-collected but we keep it live as in this patch, the applied constraints are not expected to change later. It should work like creating a new symbol for the extent and applying the constraints of the original symbol to it (without making the original live).
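Why extent liveness matters for the `symbolicExtent` test case can be sketched with a toy model. Everything here is an assumption made for illustration: `ToyState`, `dropDeadConstraints`, `provablyOutOfBounds`, and `warnsAfterCleanup` are hypothetical names, a single exclusive upper bound stands in for the analyzer's real range constraints, and the element size of 4 bytes for `int` is assumed.

```cpp
#include <cassert>
#include <map>
#include <set>

// Hypothetical toy model; not the analyzer's constraint manager.
using SymbolId = int;

struct ToyState {
  // symbol -> exclusive upper bound (the value is known to be < bound)
  std::map<SymbolId, long> upperBound;
};

// Models the cleanup step: constraints on symbols that are not live
// are dropped from the state.
ToyState dropDeadConstraints(const ToyState &state,
                             const std::set<SymbolId> &live) {
  ToyState cleaned;
  for (const auto &entry : state.upperBound)
    if (live.count(entry.first) != 0)
      cleaned.upperBound.insert(entry);
  return cleaned;
}

// True only if the access starts at or past the end of the region for
// every extent value that the constraints still allow.
bool provablyOutOfBounds(const ToyState &state, SymbolId extentSym,
                         long index, long elemSize) {
  assert(elemSize > 0);
  auto it = state.upperBound.find(extentSym);
  if (it == state.upperBound.end())
    return false; // nothing is known about the extent, so no report
  long maxExtent = it->second - 1; // extent < bound, in bytes
  return index * elemSize >= maxExtent;
}

// Mirrors the shape of the test: on the path past `if (arg >= 5) return 0;`
// the extent of `mem` is `arg` with `arg < 5`, then `mem[8]` is written.
bool warnsAfterCleanup(bool extentKeptLive) {
  const SymbolId arg = 1;
  ToyState state;
  state.upperBound[arg] = 5;

  std::set<SymbolId> live;
  if (extentKeptLive)
    live.insert(arg);

  ToyState cleaned = dropDeadConstraints(state, live);
  return provablyOutOfBounds(cleaned, arg, /*index=*/8, /*elemSize=*/4);
}
```

In this model the out-of-bounds write is reported only when the extent symbol survives the cleanup, which corresponds to the effect that the removed `(void)arg;` workaround had in the test.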

Contributor

@NagyDonat NagyDonat left a comment

Looks good to me, I'm satisfied with the logic introduced by this commit.

For extent symbols, this "if the region is alive, keep its extent alive" seems to be the logically correct behavior: this preserves the extent while it's relevant and ensures that the extent is garbage-collected after that point (assuming that the region itself is garbage-collected properly).


> I wish you could write up what we currently have and the options to fix this for extents (and possibly for any other symbols for that matter, because extents are not alone in this issue).

@steakhal What do you mean by this?

The old discussion at https://discourse.llvm.org/t/keeping-the-extent-symbol-alive/76676/3 seems to be a fairly complete write-up of the current situation (which hasn't changed since then) and of the options for fixing the lifetime issues of extent symbols.

I think that lifetime issues of other symbols are mostly irrelevant for this PR: we shouldn't postpone this concrete and clear solution to wait for vague hypothetical improvements in partially related areas.

Also, I strongly suspect that even a highly advanced symbol lifetime system will need the concrete logic for extent symbols (which is added here), so this commit will be a good step toward the complete solution (even if we don't see the complete solution yet).

return State->set<DynamicExtentMap>(MR->StripCasts(), Size);
}

void markAllDynamicExtentLive(ProgramStateRef State, SymbolReaper &SymReaper) {
Contributor

@NagyDonat NagyDonat Oct 21, 2025

Perhaps rename this to transferLivenessToDynamicExtent or something similar now that we no longer mark all the dynamic extent values as live?
