@IlyaMuravjov
Collaborator

@IlyaMuravjov IlyaMuravjov commented Jul 6, 2023

Description

Fixes #2219

See this comment for more details.

How to test

Standalone testing is not required; it is enough to test this as part of full Spring integration test generation.

Self-check list

  • I've set the proper labels for my PR (at least, for category and component).
  • PR title and description are clear and intelligible.
  • I've added enough comments to my code, particularly in hard-to-understand areas.
  • The functionality I've repaired, changed or added is covered with automated tests.
  • Manual tests have been provided optionally.
  • The documentation for the functionality I've been working on is up-to-date.

@IlyaMuravjov IlyaMuravjov added ctg-enhancement New feature, improvement or change request comp-minimizer Issue is related to Minimization phase labels Jul 6, 2023
@IlyaMuravjov IlyaMuravjov force-pushed the ilya_m/minimize_state_before branch from 480d6b4 to e5d64ef Compare July 6, 2023 13:16
@EgorkaKulikov EgorkaKulikov requested a review from Markoutte July 7, 2023 09:53
.with(ValueProvider.of(relevantRepositories.map { SavedEntityValueProvider(defaultIdGenerator, it) }))
.with(ValueProvider.of(generatedValueFieldIds.map { FieldValueProvider(defaultIdGenerator, it) }))
}.let(transform)
val coverageToMinStateBeforeSize = mutableMapOf<Coverage, Int>()
Collaborator

It looks like this map is inefficient, because it takes at least O(n) to calculate a hash and it stores as many coverages as the program can generate. In bad cases the program generates too many coverages because of loops and recursion. The Trie was created to solve exactly this problem. Adding a value still takes O(n), but memory consumption is lower.

Please consider using the Trie or another more efficient collection to store coverages (traces).

Collaborator Author

Replaced the key type with Trie.Node<Instruction>. However, as far as I can see, using the Trie shouldn't reduce memory consumption, since a reference to Coverage is stored in every emitted UtExecution anyway. That was my original motivation for not using the Trie in the first place.
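The trade-off discussed in this thread can be illustrated with a minimal sketch. `TraceTrie` and `recordMinSize` are hypothetical names written for this example, not the project's `Trie` class, and the "minimum size per coverage" semantics is only inferred from the map's name. Traces that share a prefix share nodes, and the node where a trace ends can serve as a map key with a constant-time identity hash.

```kotlin
// Hypothetical minimal trie for illustration; NOT the Trie class from the project.
class TraceTrie<T> {
    // Node does not override equals/hashCode, so it is hashed by identity.
    class Node<T>(val depth: Int) {
        val children = HashMap<T, Node<T>>()
        var count = 0 // number of added traces that end exactly at this node
    }

    val root = Node<T>(0)
    var nodeCount = 1 // the root is counted as well
        private set

    // Walks the trace once (O(n)) and creates nodes only for the unseen part.
    // Returns the node that identifies the whole trace.
    fun add(trace: List<T>): Node<T> {
        var node = root
        for (step in trace) {
            node = node.children.getOrPut(step) {
                nodeCount++
                Node(node.depth + 1)
            }
        }
        node.count++
        return node
    }
}

// Assumed semantics: keep the smallest size seen for every distinct trace.
fun <T> recordMinSize(minSizes: MutableMap<TraceTrie.Node<T>, Int>, key: TraceTrie.Node<T>, size: Int) {
    minSizes[key] = minOf(minSizes[key] ?: size, size)
}
```

As the author notes, the memory benefit only materializes if the full traces are not retained elsewhere; the sketch does not change that.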

// so we don't know the actual coverage for such executions

- val filteredExecutions = executions.filterIndexed { idx, _ -> idx !in unknownCoverageExecutions }
+ val filteredExecutions = filterOutDuplicateCoverages(executions - unknownCoverageExecutions)
Collaborator

I'm not sure this works as expected: previously it compared indices (which are integers), but now it compares UtExecution instances when subtracting, and UtExecution does not define hashCode and equals. Also, this implementation is less efficient when comparing values.

Collaborator Author

It should work as expected, because reference equality is used when hashCode and equals are not overridden.
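A small sketch of the behavior described above. `Execution` is a hypothetical stand-in for UtExecution: a regular class that inherits `equals` and `hashCode` from `Any`, so `minus` removes only the very same instances. Converting the right-hand side to a set is an optional tweak, added here so that each membership check is hash-based.

```kotlin
// Hypothetical stand-in for UtExecution: a regular class (not a data class),
// so equals/hashCode come from Any and compare references.
class Execution(val name: String)

// Removes exactly the instances listed in `unknown`; an equal-looking but
// distinct instance is kept, because comparison is by reference.
fun withoutUnknown(all: List<Execution>, unknown: Collection<Execution>): List<Execution> =
    all - unknown.toSet()
```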

For some context, I changed this part because there was a bug in minimization. To be more exact, in the following code fragment, indices referring to positions of UtExecutions in the filteredExecutions list were saved into usedExecutionIndexes:

val (mapping, executionToPriorityMapping) = buildMapping(filteredExecutions)
val usedExecutionIndexes = (GreedyEssential.minimize(mapping, executionToPriorityMapping) +  /* ... */).toSet()

And later on these usedExecutionIndexes were used as if they were indices that refer to positions of UtExecutions in the whole executions list:

val usedMinimizedExecutions = executions.filterIndexed { idx, _ -> idx in usedExecutionIndexes }
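The index mismatch can be reproduced in isolation. In this sketch, strings stand in for UtExecutions and both function names are hypothetical. Once an element is filtered out, every later element shifts by one position, so an index computed against the filtered list selects a different element in the full list.

```kotlin
// Buggy variant: indices computed against the FILTERED list are applied
// to the WHOLE list.
fun pickFromWholeList(executions: List<String>, usedIdxInFiltered: Set<Int>): List<String> =
    executions.filterIndexed { idx, _ -> idx in usedIdxInFiltered }

// Consistent variant: the indices are applied to the same list they were
// computed against.
fun pickFromFilteredList(
    executions: List<String>,
    unknownIdx: Set<Int>,
    usedIdxInFiltered: Set<Int>,
): List<String> {
    val filtered = executions.filterIndexed { idx, _ -> idx !in unknownIdx }
    return filtered.filterIndexed { idx, _ -> idx in usedIdxInFiltered }
}
```

With executions `e0..e3`, `e0` filtered out, and index 1 selected by the minimizer, the buggy variant returns `e1` while the intended result is `e2`.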

}

private fun filterOutDuplicateCoverages(executions: List<UtExecution>): List<UtExecution> {
val (executionIdxToCoveredEdgesMap, _) = buildMapping(executions)
Collaborator

This operation can be heavy when the number of executions is huge. Could it be verified that it works OK for a huge number of executions?

Collaborator Author

If it didn't work OK for a huge number of executions, that should already have been noticed on functions with loops and recursion. Besides that, inspecting the body of buildMapping shows that it iterates over the instructions only once.

It is worth noting, though, that buildMapping may have some performance issues, since it uses allCoveredEdges, which can get quite large and cause a lot of cache misses. Still, I think it's premature to optimize it now without any evidence (or even known incidents) of it being a bottleneck.

// (inst1, instr2) -> edge id --- edge represents as a pair of instructions, which are connected by this edge
val allCoveredEdges = mutableMapOf<Pair<Long, Long?>, Int>()
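A rough sketch of a single-pass edge mapping of this kind. It is not the project's buildMapping: `assignEdgeIds` is a hypothetical name, instructions are reduced to `Long` ids, and treating `null` as "no next instruction" is an assumption suggested only by the `Long?` in the key type.

```kotlin
// Hypothetical sketch: assigns a dense integer id to every distinct edge
// (pair of consecutive instruction ids) while visiting each instruction once.
fun assignEdgeIds(
    traces: List<List<Long>>,
): Pair<Map<Int, List<Int>>, Map<Pair<Long, Long?>, Int>> {
    val allCoveredEdges = mutableMapOf<Pair<Long, Long?>, Int>()
    val traceIdxToEdgeIds = mutableMapOf<Int, List<Int>>()
    traces.forEachIndexed { traceIdx, instructions ->
        val edgeIds = LinkedHashSet<Int>() // distinct ids in first-seen order
        instructions.forEachIndexed { i, current ->
            // Assumption: null marks the edge leaving the last instruction.
            val next: Long? = instructions.getOrNull(i + 1)
            // The default is evaluated before insertion, so size is the next free id.
            edgeIds += allCoveredEdges.getOrPut(current to next) { allCoveredEdges.size }
        }
        traceIdxToEdgeIds[traceIdx] = edgeIds.toList()
    }
    return traceIdxToEdgeIds to allCoveredEdges
}
```

The shared map grows with the number of distinct edges across all traces, which is the part the comment above flags as a possible source of cache misses.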

By the way, on the topic of filtering out duplicate coverages: is it intended that the fuzzer and minimization do it differently? Right now, when the engine produces multiple executions with identical coverage that throw different exceptions, all of these executions are preserved by minimization, whereas the fuzzer itself keeps only one execution.

Executions can throw different exceptions, while having identical coverage, if exceptions are thrown from within the JRE itself or from some large libraries (e.g. Spring) that are not transformed for trace collection.
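The two policies can be contrasted with a short sketch. `Run` and both functions are hypothetical, and the sketch only mirrors the behavior described in this comment, not the actual fuzzer or minimizer code: one policy keys executions by coverage alone, the other by coverage together with the thrown exception.

```kotlin
// Hypothetical model of an execution: the covered edge ids plus the name of
// the thrown exception (null when the execution finished normally).
data class Run(val coveredEdges: Set<Int>, val exception: String?)

// Fuzzer-like policy as described above: one execution per distinct coverage.
fun distinctByCoverage(runs: List<Run>): List<Run> =
    runs.distinctBy { it.coveredEdges }

// Minimizer-like policy as described above: executions with the same coverage
// but different exceptions are all preserved.
fun distinctByCoverageAndException(runs: List<Run>): List<Run> =
    runs.distinctBy { it.coveredEdges to it.exception }
```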

@IlyaMuravjov IlyaMuravjov requested a review from Markoutte July 7, 2023 13:40
@IlyaMuravjov IlyaMuravjov merged commit f41d891 into main Jul 10, 2023
@IlyaMuravjov IlyaMuravjov deleted the ilya_m/minimize_state_before branch July 10, 2023 09:15
Labels

comp-minimizer Issue is related to Minimization phase ctg-enhancement New feature, improvement or change request

Development

Successfully merging this pull request may close these issues.

Improve Spring integration tests minimization to minimize database content

3 participants