-
Notifications
You must be signed in to change notification settings - Fork 45
Prioritize executions with identical trace to minimize stateBefore
#2371
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
480d6b4 to
e5d64ef
Compare
| .with(ValueProvider.of(relevantRepositories.map { SavedEntityValueProvider(defaultIdGenerator, it) })) | ||
| .with(ValueProvider.of(generatedValueFieldIds.map { FieldValueProvider(defaultIdGenerator, it) })) | ||
| }.let(transform) | ||
| val coverageToMinStateBeforeSize = mutableMapOf<Coverage, Int>() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this map is inefficient, because it requires at least O(n) to calculate hash and it stores as much coverages as program can generate. In bad cases it generates too many coverages because of loops and recursions. The Trie was created exactly for solving this problem. Still has O(n) to add value but has lower memory consumption.
Please, consider using Trie or another more efficient collection to store coverages (traces).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaced key type with Trie.Node<Instruction>, but as far as I can see the use of the Trie shouldn't reduce memory consumption, since reference to Coverage is stored in every emitted UtExecution anyway, that was my original motivation for not using the Trie in the first place.
| // so we don't know the actual coverage for such executions | ||
|
|
||
| val filteredExecutions = executions.filterIndexed { idx, _ -> idx !in unknownCoverageExecutions } | ||
| val filteredExecutions = filterOutDuplicateCoverages(executions - unknownCoverageExecutions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that this works as expected, because it compared indices (which are integers) then but now it compares UtExecutions for subtracting. But there's no hashCode and equals for UtExecution. Also, this implementation is less efficient when comparing values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should work as expected, because reference equality is used when hashCode and equals are not overridden.
For some context, I changed this part, because there was a bug in minimization. To be more exact, in the following code fragment indices referring to positions of UtExecutions in filteredExecutions list were saved into usedExecutionIndexes:
val (mapping, executionToPriorityMapping) = buildMapping(filteredExecutions)
val usedExecutionIndexes = (GreedyEssential.minimize(mapping, executionToPriorityMapping) + /* ... */).toSet()And later on these usedExecutionIndexes were used as if they were indices that refer to positions of UtExecutions in the whole executions list:
val usedMinimizedExecutions = executions.filterIndexed { idx, _ -> idx in usedExecutionIndexes }| } | ||
|
|
||
| private fun filterOutDuplicateCoverages(executions: List<UtExecution>): List<UtExecution> { | ||
| val (executionIdxToCoveredEdgesMap, _) = buildMapping(executions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This operation can be heavy when number of executions is huge. Could it be verified that it works OK for a huge number of executions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it doesn't work OK for a huge number of executions, then it should have already been noticed on functions with loops and recursions, beside that inspecting buildMapping function body we can see that it only iterates over instructions once.
Although, it worth noting that buildMapping may have some performance related issues, since it uses allCoveredEdges, that can get quite large and cause a lot of cache misses, but I still think it's premature to optimize it now without any evidence (nor even known incidents) of it being a bottleneck.
// (inst1, instr2) -> edge id --- edge represents as a pair of instructions, which are connected by this edge
val allCoveredEdges = mutableMapOf<Pair<Long, Long?>, Int>()By the way, on the topic of filtering out duplicate coverages, is that intended that fuzzer and minimization do it differently? Right now, when engine produces multiple executions with identical coverage, but throwing different exceptions, all these executions are preserved by minimization, however fuzzer itself only keeps one execuction.
Executions can throw different exceptions, while having identical coverage, if exceptions are thrown from within the JRE itself or from some large libraries (e.g. Spring) that are not transformed for trace collection.
Description
Fixes #2219
See this comment for more details.
How to test
Standalone testing is not required, just as a part of full Spring integration test generation testing.
Self-check list