-
Notifications
You must be signed in to change notification settings - Fork 256
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inlining overhaul for test generation #4255
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a great improvement. I have a couple of suggestions about error message wording, and a couple of questions, but nothing major.
@@ -26,7 +29,6 @@ public class Setup { | |||
|
|||
public static TheoryData<List<Action<DafnyOptions>>> OptionSettings() { | |||
var optionSettings = new TheoryData<List<Action<DafnyOptions>>>(); | |||
optionSettings.Add(new() { options => options.TypeEncodingMethod = CoreOptions.TypeEncoding.Arguments }); | |||
optionSettings.Add(new() { options => options.TypeEncodingMethod = CoreOptions.TypeEncoding.Predicates }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just leaving a note to have a record that we would like to figure out why it doesn't work well with Arguments
and move it over to that mode once we do that for Dafny more broadly.
Source/DafnyTestGeneration/Main.cs
Outdated
$"{coveredByTests} should be covered by tests " + | ||
$"(assuming no tests were found to be duplicates of each other). " + | ||
$"Moreover, {coveredByCounterexamples} locations have been found to be reachable " + | ||
$"(i.e. the verifier returned a counterexample and did not timeout). " + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder whether a word other than "counterexample" would be more useful here to folks who don't know as much about the inner workings of how test generation uses SMT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done - paraphrased to "...the verifier did not timeout and produced example inputs to reach these locations"
} | ||
|
||
internal void PreResolve(Program program) { | ||
AddByMethod(program.DefaultModule); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this skip other modules?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It shouldn't - all modules are submodules of the default module
namespace DafnyTestGeneration.Inlining; | ||
|
||
/// <summary> | ||
/// Separates the bodies of function-by-methods into separate methods so that translator will process them accordingly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's cool that you can do this. I can think of some other places it could be useful.
@@ -131,25 +156,35 @@ private enum Status { Success, Failure, Untested } | |||
return counterexampleLog; | |||
} | |||
var log = writer.ToString(); | |||
// If Dafny finds several assertion violations (e.g. because of inlining one trap assertion multiple times), | |||
// pick the first execution trace and extract the counterexample from it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this always the right trace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The choice of the trace shouldn't matter since I replace all native assertions with assumptions and so the only assertion that can fail is the one that is being targeted. One can imagine an optimization that picks a trace covering the most blocks but in terms correctness the choice should not affect the result.
$"e.g. if branching is conditional on the result of a trait instance " + | ||
$"method call."); | ||
$"for {duplicate.uniqueId} - this may occur if the code under test is non-deterministic, " + | ||
$"if a method/function is not inlined, or if test generation cannot extract a value from a counterexample."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment about "counterexample" as before.
Those two failures aren't your fault. One will be fixed by #4317 and the other is non-deterministic. |
Got it, thank you @atomb! |
This PR overhauls implementation of inlining, on which test generation relies to generate system-level tests, and which can in the future be used for other purposes, such as bounded model checking. Inlining, in this context, means adjusting the Dafny to Boogie translation in such a way that the resulting Boogie program, when you call Boogie's `Inline` method on it, merges all specified Dafny functions and methods, including all corresponding well-formedness checks, into a single procedure. The key AST transformation necessary to make this happen is removal of short circuiting expressions from the original program. Most of what this PR does has to do with lifting this short-circuit-removal pass from Boogie to Dafny. This achieves three things: - The new implementation is much more efficient. The old implementation can generate unit tests and inline one or two methods but not entire programs. Once Boogie 3.0 is merged into Dafny, the speed up should increase further. - The new implementation preserves Dafny line/column information to report coverage in terms of Dafny source rather than its Boogie representation. Making this change has been one of the main asks pertaining to test generation. - The new code should be, I think, easier to maintain going forward. It is split into disjoined transformations over Dafny/Boogie ASTs (here is the [main coordinating method](https://github.com/Dargones/dafny/blob/a5004d6d62d324433b57937363605b8ba975dca6/Source/DafnyTestGeneration/Inlining/InliningTranslator.cs#L22-L47)) and makes as few assumptions about the translation process as possible (whereas the previous implementation would rely very heavily on position and existence of various constructs that Dafny translator creates) By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license. --------- Co-authored-by: Aleksandr Fedchin <fedchina@amazon.com>
This PR overhauls implementation of inlining, on which test generation relies to generate system-level tests, and which can in the future be used for other purposes, such as bounded model checking. Inlining, in this context, means adjusting the Dafny to Boogie translation in such a way that the resulting Boogie program, when you call Boogie's
Inline
method on it, merges all specified Dafny functions and methods, including all corresponding well-formedness checks, into a single procedure. The key AST transformation necessary to make this happen is removal of short circuiting expressions from the original program. Most of what this PR does has to do with lifting this short-circuit-removal pass from Boogie to Dafny. This achieves three things:By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.