-
Notifications
You must be signed in to change notification settings - Fork 14.9k
[clang][analyzer] Extend lifetime of dynamic extent information #163562
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Symbols used for dynamic extent information of memory regions are now kept as live as long as the memory region exists.
@llvm/pr-subscribers-clang Author: Balázs Kéri (balazske) ChangesSymbols used for dynamic extent information of memory regions are now kept as live as long as the memory region exists. Full diff: https://github.com/llvm/llvm-project/pull/163562.diff 4 Files Affected:
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/DynamicExtent.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/DynamicExtent.h
index 1a9bef06b15a4..440603fb4d8c7 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/DynamicExtent.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/DynamicExtent.h
@@ -58,6 +58,8 @@ SVal getDynamicExtentWithOffset(ProgramStateRef State, SVal BufV);
DefinedOrUnknownSVal getDynamicElementCountWithOffset(ProgramStateRef State,
SVal BufV, QualType Ty);
+void markAllDynamicExtentLive(ProgramStateRef State, SymbolReaper &SymReaper);
+
} // namespace ento
} // namespace clang
diff --git a/clang/lib/StaticAnalyzer/Core/DynamicExtent.cpp b/clang/lib/StaticAnalyzer/Core/DynamicExtent.cpp
index 34078dbce0b68..e436b186a2148 100644
--- a/clang/lib/StaticAnalyzer/Core/DynamicExtent.cpp
+++ b/clang/lib/StaticAnalyzer/Core/DynamicExtent.cpp
@@ -128,5 +128,11 @@ ProgramStateRef setDynamicExtent(ProgramStateRef State, const MemRegion *MR,
return State->set<DynamicExtentMap>(MR->StripCasts(), Size);
}
+void markAllDynamicExtentLive(ProgramStateRef State, SymbolReaper &SymReaper) {
+ for (const auto &I : State->get<DynamicExtentMap>())
+ if (SymbolRef Sym = I.second.getAsSymbol())
+ SymReaper.markLive(Sym);
+}
+
} // namespace ento
} // namespace clang
diff --git a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
index 785cdfa15bf04..d9ddc12c54985 100644
--- a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
@@ -1064,6 +1064,8 @@ void ExprEngine::removeDead(ExplodedNode *Pred, ExplodedNodeSet &Out,
SymReaper.markLive(MR);
}
+ markAllDynamicExtentLive(CleanedState, SymReaper);
+
getCheckerManager().runCheckersForLiveSymbols(CleanedState, SymReaper);
// Create a state in which dead bindings are removed from the environment
diff --git a/clang/test/Analysis/ArrayBound/verbose-tests.c b/clang/test/Analysis/ArrayBound/verbose-tests.c
index e3416886d13e5..9ee290ab6b5b8 100644
--- a/clang/test/Analysis/ArrayBound/verbose-tests.c
+++ b/clang/test/Analysis/ArrayBound/verbose-tests.c
@@ -381,30 +381,12 @@ int *symbolicExtent(int arg) {
return 0;
int *mem = (int*)malloc(arg);
- // TODO: without the following reference to 'arg', the analyzer would discard
- // the range information about (the symbolic value of) 'arg'. This is
- // incorrect because while the variable itself is inaccessible, it becomes
- // the symbolic extent of 'mem', so we still want to reason about its
- // potential values.
- (void)arg;
-
mem[8] = -2;
// expected-warning@-1 {{Out of bound access to memory after the end of the heap area}}
// expected-note@-2 {{Access of 'int' element in the heap area at index 8}}
return mem;
}
-int *symbolicExtentDiscardedRangeInfo(int arg) {
- // This is a copy of the case 'symbolicExtent' without the '(void)arg' hack.
- // TODO: if the analyzer can detect the out-of-bounds access within this
- // testcase, then remove this and the `(void)arg` hack from `symbolicExtent`.
- if (arg >= 5)
- return 0;
- int *mem = (int*)malloc(arg);
- mem[8] = -2;
- return mem;
-}
-
void symbolicIndex(int arg) {
// expected-note@+2 {{Assuming 'arg' is >= 12}}
// expected-note@+1 {{Taking true branch}}
|
@llvm/pr-subscribers-clang-static-analyzer-1 Author: Balázs Kéri (balazske) ChangesSymbols used for dynamic extent information of memory regions are now kept as live as long as the memory region exists. Full diff: https://github.com/llvm/llvm-project/pull/163562.diff 4 Files Affected:
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/DynamicExtent.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/DynamicExtent.h
index 1a9bef06b15a4..440603fb4d8c7 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/DynamicExtent.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/DynamicExtent.h
@@ -58,6 +58,8 @@ SVal getDynamicExtentWithOffset(ProgramStateRef State, SVal BufV);
DefinedOrUnknownSVal getDynamicElementCountWithOffset(ProgramStateRef State,
SVal BufV, QualType Ty);
+void markAllDynamicExtentLive(ProgramStateRef State, SymbolReaper &SymReaper);
+
} // namespace ento
} // namespace clang
diff --git a/clang/lib/StaticAnalyzer/Core/DynamicExtent.cpp b/clang/lib/StaticAnalyzer/Core/DynamicExtent.cpp
index 34078dbce0b68..e436b186a2148 100644
--- a/clang/lib/StaticAnalyzer/Core/DynamicExtent.cpp
+++ b/clang/lib/StaticAnalyzer/Core/DynamicExtent.cpp
@@ -128,5 +128,11 @@ ProgramStateRef setDynamicExtent(ProgramStateRef State, const MemRegion *MR,
return State->set<DynamicExtentMap>(MR->StripCasts(), Size);
}
+void markAllDynamicExtentLive(ProgramStateRef State, SymbolReaper &SymReaper) {
+ for (const auto &I : State->get<DynamicExtentMap>())
+ if (SymbolRef Sym = I.second.getAsSymbol())
+ SymReaper.markLive(Sym);
+}
+
} // namespace ento
} // namespace clang
diff --git a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
index 785cdfa15bf04..d9ddc12c54985 100644
--- a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
@@ -1064,6 +1064,8 @@ void ExprEngine::removeDead(ExplodedNode *Pred, ExplodedNodeSet &Out,
SymReaper.markLive(MR);
}
+ markAllDynamicExtentLive(CleanedState, SymReaper);
+
getCheckerManager().runCheckersForLiveSymbols(CleanedState, SymReaper);
// Create a state in which dead bindings are removed from the environment
diff --git a/clang/test/Analysis/ArrayBound/verbose-tests.c b/clang/test/Analysis/ArrayBound/verbose-tests.c
index e3416886d13e5..9ee290ab6b5b8 100644
--- a/clang/test/Analysis/ArrayBound/verbose-tests.c
+++ b/clang/test/Analysis/ArrayBound/verbose-tests.c
@@ -381,30 +381,12 @@ int *symbolicExtent(int arg) {
return 0;
int *mem = (int*)malloc(arg);
- // TODO: without the following reference to 'arg', the analyzer would discard
- // the range information about (the symbolic value of) 'arg'. This is
- // incorrect because while the variable itself is inaccessible, it becomes
- // the symbolic extent of 'mem', so we still want to reason about its
- // potential values.
- (void)arg;
-
mem[8] = -2;
// expected-warning@-1 {{Out of bound access to memory after the end of the heap area}}
// expected-note@-2 {{Access of 'int' element in the heap area at index 8}}
return mem;
}
-int *symbolicExtentDiscardedRangeInfo(int arg) {
- // This is a copy of the case 'symbolicExtent' without the '(void)arg' hack.
- // TODO: if the analyzer can detect the out-of-bounds access within this
- // testcase, then remove this and the `(void)arg` hack from `symbolicExtent`.
- if (arg >= 5)
- return 0;
- int *mem = (int*)malloc(arg);
- mem[8] = -2;
- return mem;
-}
-
void symbolicIndex(int arg) {
// expected-note@+2 {{Assuming 'arg' is >= 12}}
// expected-note@+1 {{Taking true branch}}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this issue, I'm very happy to see this patch!
What happens when you have a symbolic region (e.g. allocated by malloc
) with a dynamic extent, then both the region and its extent symbol become inaccessible at the same time? Will you prolong the lifetime of the extent symbol? Is it eventually "garbage-collected"?
My first instinct is that instead of markAllDynamicExtentLive
you would need a loop that marks the dynamic extent live only if the region itself is live. (Or non-symbolic -- by the way, can non-symbolic regions have a symbolic dynamic extent?)
This patch would effectively leak all the extent symbols. There is also quite some history to the liveness subject around symbol extents on Discourse. I wish you could write up what do we currently have and options to fix this for extents (and possibly for any other symbols for that matter because extents are not alone in this issue). |
With the current version the extent symbol should become "dead" (not live any more) if the memory region becomes dead. The liveness of the extent symbol does not have effect on the liveness of the memory region. If the leak of the memory region can be detected, the leak of the extent can be detected too (maybe at a later point than before the patch). Normally the extent symbol is not something that can leak so this is anyway a rare case. Purpose of this patch is only to have the information about extent of the memory region available. If the original symbol would be garbage-collected but we keep it live like in this patch, it is not expected that the applied constraints change later. It should work like creating a new symbol for the extent and applying the constraints of the original symbol to it (and not making the original live). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, I'm satisfied with the logic introduced by this commit.
For extent symbols, this "if the region is alive, keep its extent alive" seems to be the logically correct behavior: this preserves the extent while it's relevant and ensures that the extent is garbage-collected after that point (assuming that the region itself is garbage-collected properly).
I wish you could write up what do we currently have and options to fix this for extents (and possibly for any other symbols for that matter because extents are not alone in this issue).
@steakhal What do you mean by this?
The old discussion at https://discourse.llvm.org/t/keeping-the-extent-symbol-alive/76676/3 seems to be a fairly complete write-up of the current situation (which hadn't changed since then) and the options for fixing the lifetime issues of extent symbols.
I think that lifetime issues of other symbols are mostly irrelevant for this PR: we shouldn't postpone this concrete and clear solution to wait for vague hypothetical improvements in partially related areas.
Also, I strongly suspect that even a highly advanced symbol lifetime system will need the concrete logic for extent symbols (which is added here), so this commit will be a good step toward the complete solution (even if we don't see the complete solution yet).
return State->set<DynamicExtentMap>(MR->StripCasts(), Size); | ||
} | ||
|
||
void markAllDynamicExtentLive(ProgramStateRef State, SymbolReaper &SymReaper) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps rename this to transferLivenessToDynamicExtent
or something similar now that we no longer mark all the dynamic extent values as live?
Symbols used for dynamic extent information of memory regions are now kept as live as long as the memory region exists.