Prioritize executions with identical trace to minimize `stateBefore` #2371

IlyaMuravjov · 2023-07-06T10:24:42Z

Description

Fixes #2219

See this comment for more details.

How to test

No standalone testing is required; the change only needs to be checked as part of full Spring integration test generation testing.

Self-check list

I've set the proper labels for my PR (at least, for category and component).
PR title and description are clear and intelligible.
I've added enough comments to my code, particularly in hard-to-understand areas.
The functionality I've repaired, changed or added is covered with automated tests.
Manual tests have been provided optionally.
The documentation for the functionality I've been working on is up-to-date.

Markoutte · 2023-07-07T10:12:30Z

utbot-framework/src/main/kotlin/org/utbot/engine/UtBotSymbolicEngine.kt

                    .with(ValueProvider.of(relevantRepositories.map { SavedEntityValueProvider(defaultIdGenerator, it) }))
                    .with(ValueProvider.of(generatedValueFieldIds.map { FieldValueProvider(defaultIdGenerator, it) }))
            }.let(transform)
+        val coverageToMinStateBeforeSize = mutableMapOf<Coverage, Int>()


This map looks inefficient: computing the hash of a Coverage key takes at least O(n), and the map stores as many coverages as the program can generate. In bad cases, loops and recursion produce too many coverages. The Trie was created precisely to solve this problem: adding a value is still O(n), but memory consumption is lower.

Please, consider using Trie or another more efficient collection to store coverages (traces).

Replaced the key type with Trie.Node<Instruction>. As far as I can see, though, using the Trie shouldn't reduce memory consumption, since a reference to Coverage is stored in every emitted UtExecution anyway. That was my original motivation for not using the Trie in the first place.
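To make the trade-off in this thread concrete, here is a minimal sketch of tracking the smallest stateBefore size per trace with a trie node as the map key. `TraceNode`, `TraceTrie`, and `MinStateBeforeTracker` are hypothetical names used only for illustration, not UTBot's actual Trie API, and instructions are modelled as plain `Long` ids.

```kotlin
// Hypothetical, simplified trie over instruction ids; NOT UTBot's actual Trie class.
class TraceNode(val parent: TraceNode?, val instruction: Long?) {
    val children = HashMap<Long, TraceNode>()
}

class TraceTrie {
    val root = TraceNode(parent = null, instruction = null)

    // Walks (and extends) the trie along the trace: O(n) in the trace length.
    // Identical traces always end in the same node instance.
    fun add(trace: List<Long>): TraceNode {
        var node = root
        for (instruction in trace) {
            node = node.children.getOrPut(instruction) { TraceNode(node, instruction) }
        }
        return node
    }
}

class MinStateBeforeTracker {
    private val trie = TraceTrie()

    // TraceNode does not override equals/hashCode, so the map compares keys by identity
    // and hashing a key does not depend on the trace length.
    private val minSizeByTrace = HashMap<TraceNode, Int>()

    // Returns true if this execution has a strictly smaller stateBefore
    // than anything seen so far for the same trace.
    fun offer(trace: List<Long>, stateBeforeSize: Int): Boolean {
        val key = trie.add(trace)
        val previous = minSizeByTrace[key]
        if (previous != null && previous <= stateBeforeSize) return false
        minSizeByTrace[key] = stateBeforeSize
        return true
    }
}
```

Traces that share a prefix share trie nodes, which is where the memory saving would come from if the traces were not also retained elsewhere.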

Markoutte · 2023-07-07T10:25:55Z

utbot-framework/src/main/kotlin/org/utbot/framework/minimization/Minimization.kt

    //     so we don't know the actual coverage for such executions

-    val filteredExecutions = executions.filterIndexed { idx, _ -> idx !in unknownCoverageExecutions }
+    val filteredExecutions = filterOutDuplicateCoverages(executions - unknownCoverageExecutions)


I'm not sure this works as expected: previously it compared indices (which are integers), but now the subtraction compares UtExecutions, and there is no hashCode and equals for UtExecution. This implementation is also less efficient when comparing values.

It should work as expected, because reference equality is used when hashCode and equals are not overridden.
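The point about reference equality can be checked with a small sketch. `Execution` is a hypothetical stand-in for UtExecution, assumed here to be a regular class that does not override equals or hashCode, and `dropUnknown` merely mirrors the shape of the subtraction in the diff.

```kotlin
// Hypothetical stand-in for UtExecution: a regular class without equals/hashCode overrides.
class Execution(val name: String)

// Mirrors the shape of `executions - unknownCoverageExecutions`.
// Iterable.minus relies on equals/hashCode, which fall back to identity here.
fun dropUnknown(executions: List<Execution>, unknown: Set<Execution>): List<Execution> =
    executions - unknown
```

Two instances with the same field values are still distinct under this comparison, so only the exact instances listed in `unknown` are removed.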

For some context, I changed this part, because there was a bug in minimization. To be more exact, in the following code fragment indices referring to positions of UtExecutions in filteredExecutions list were saved into usedExecutionIndexes:

    val (mapping, executionToPriorityMapping) = buildMapping(filteredExecutions)
    val usedExecutionIndexes = (GreedyEssential.minimize(mapping, executionToPriorityMapping) + /* ... */).toSet()

And later on these usedExecutionIndexes were used as if they were indices that refer to positions of UtExecutions in the whole executions list:

val usedMinimizedExecutions = executions.filterIndexed { idx, _ -> idx in usedExecutionIndexes }
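The index mismatch described above can be reproduced in isolation. In this sketch, executions are plain strings and `pick` is a hypothetical stand-in for the minimizer, which returns indices into the list it was given. `selectFixed` shows one possible way to keep the indices consistent, not necessarily the change made in this PR.

```kotlin
// Buggy shape: indices computed against the filtered list are applied to the full list.
fun selectBuggy(
    all: List<String>,
    unknown: Set<Int>,
    pick: (List<String>) -> Set<Int>, // hypothetical stand-in for the minimizer
): List<String> {
    val filtered = all.filterIndexed { idx, _ -> idx !in unknown }
    val usedIndexes = pick(filtered)
    return all.filterIndexed { idx, _ -> idx in usedIndexes }
}

// Consistent shape: indices are resolved against the same list they were computed from.
fun selectFixed(
    all: List<String>,
    unknown: Set<Int>,
    pick: (List<String>) -> Set<Int>,
): List<String> {
    val filtered = all.filterIndexed { idx, _ -> idx !in unknown }
    val usedIndexes = pick(filtered)
    return filtered.filterIndexed { idx, _ -> idx in usedIndexes }
}
```

With executions `e0..e3`, execution 0 excluded, and a minimizer that picks index 0 of its input, the buggy variant returns the excluded `e0`, while the consistent variant returns `e1`.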

Markoutte · 2023-07-07T10:44:47Z

utbot-framework/src/main/kotlin/org/utbot/framework/minimization/Minimization.kt

 }

+private fun filterOutDuplicateCoverages(executions: List<UtExecution>): List<UtExecution> {
+    val (executionIdxToCoveredEdgesMap, _) = buildMapping(executions)


This operation can be heavy when the number of executions is huge. Could you verify that it works OK for a huge number of executions?

If it didn't work OK for a huge number of executions, that would already have been noticed on functions with loops and recursion. Besides, inspecting the buildMapping function body shows that it only iterates over the instructions once.

That said, it is worth noting that buildMapping may have some performance issues, since it uses allCoveredEdges, which can get quite large and cause a lot of cache misses. I still think it's premature to optimize it now without any evidence (or even known incidents) of it being a bottleneck.

    // (inst1, instr2) -> edge id --- edge represents as a pair of instructions, which are connected by this edge
    val allCoveredEdges = mutableMapOf<Pair<Long, Long?>, Int>()
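For readers unfamiliar with this kind of mapping, the following is a rough single-pass sketch of assigning ids to covered edges. `buildEdgeMapping` is a hypothetical function and the real buildMapping in Minimization.kt may differ. In particular, treating a null second element as "last instruction of the trace" is only an assumption suggested by the `Long?` type.

```kotlin
// Hypothetical sketch; NOT the actual buildMapping implementation.
// Each trace is a list of instruction ids. An edge is (current, next),
// where next == null marks the end of a trace (an assumption, see the lead-in).
fun buildEdgeMapping(traces: List<List<Long>>): Map<Int, List<Int>> {
    val allCoveredEdges = mutableMapOf<Pair<Long, Long?>, Int>()
    val executionToEdgeIds = mutableMapOf<Int, List<Int>>()
    traces.forEachIndexed { executionIdx, trace ->
        val edgeIds = LinkedHashSet<Int>()
        for (i in trace.indices) {
            val edge = trace[i] to trace.getOrNull(i + 1)
            // A new edge gets the next free id; a known edge reuses its id.
            edgeIds += allCoveredEdges.getOrPut(edge) { allCoveredEdges.size }
        }
        executionToEdgeIds[executionIdx] = edgeIds.toList()
    }
    return executionToEdgeIds
}
```

Each instruction is visited once per trace, but every visit performs a hash lookup keyed by a boxed pair, which is the kind of access pattern the cache-miss concern refers to.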

By the way, on the topic of filtering out duplicate coverages: is it intended that the fuzzer and minimization do it differently? Right now, when the engine produces multiple executions with identical coverage but different thrown exceptions, minimization preserves all of them, whereas the fuzzer itself keeps only one execution.

Executions can throw different exceptions while having identical coverage if the exceptions are thrown from within the JRE itself or from large libraries (e.g. Spring) that are not transformed for trace collection.

IlyaMuravjov added the ctg-enhancement (New feature, improvement or change request) and comp-minimizer (Issue is related to Minimization phase) labels Jul 6, 2023

IlyaMuravjov requested a review from EgorkaKulikov July 6, 2023 10:24

Prioritize executions with identical trace to minimize stateBefore

e5d64ef

IlyaMuravjov force-pushed the ilya_m/minimize_state_before branch from 480d6b4 to e5d64ef Compare July 6, 2023 13:16

EgorkaKulikov requested a review from Markoutte July 7, 2023 09:53

Markoutte requested changes Jul 7, 2023

Use Trie.Node<Instruction> as key instead of Coverage

59aa0a1

IlyaMuravjov requested a review from Markoutte July 7, 2023 13:40

Markoutte approved these changes Jul 10, 2023

IlyaMuravjov merged commit f41d891 into main Jul 10, 2023

IlyaMuravjov deleted the ilya_m/minimize_state_before branch July 10, 2023 09:15
