Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inlining overhaul for test generation #4255

Merged
merged 102 commits into from
Jul 26, 2023
Merged

Conversation

Dargones
Copy link
Collaborator

@Dargones Dargones commented Jul 6, 2023

This PR overhauls implementation of inlining, on which test generation relies to generate system-level tests, and which can in the future be used for other purposes, such as bounded model checking. Inlining, in this context, means adjusting the Dafny to Boogie translation in such a way that the resulting Boogie program, when you call Boogie's Inline method on it, merges all specified Dafny functions and methods, including all corresponding well-formedness checks, into a single procedure. The key AST transformation necessary to make this happen is removal of short circuiting expressions from the original program. Most of what this PR does has to do with lifting this short-circuit-removal pass from Boogie to Dafny. This achieves three things:

  • The new implementation is much more efficient. The old implementation can generate unit tests and inline one or two methods but not entire programs. Once Boogie 3.0 is merged into Dafny, the speed up should increase further.
  • The new implementation preserves Dafny line/column information to report coverage in terms of Dafny source rather than its Boogie representation. Making this change has been one of the main asks pertaining to test generation.
  • The new code should be, I think, easier to maintain going forward. It is split into disjoined transformations over Dafny/Boogie ASTs (here is the main coordinating method) and makes as few assumptions about the translation process as possible (whereas the previous implementation would rely very heavily on position and existence of various constructs that Dafny translator creates)

By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.

Copy link
Member

@atomb atomb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a great improvement. I have a couple of suggestions about error message wording, and a couple of questions, but nothing major.

@@ -26,7 +29,6 @@ public class Setup {

public static TheoryData<List<Action<DafnyOptions>>> OptionSettings() {
var optionSettings = new TheoryData<List<Action<DafnyOptions>>>();
optionSettings.Add(new() { options => options.TypeEncodingMethod = CoreOptions.TypeEncoding.Arguments });
optionSettings.Add(new() { options => options.TypeEncodingMethod = CoreOptions.TypeEncoding.Predicates });
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just leaving a note to have a record that we would like to figure out why it doesn't work well with Arguments and move it over to that mode once we do that for Dafny more broadly.

$"{coveredByTests} should be covered by tests " +
$"(assuming no tests were found to be duplicates of each other). " +
$"Moreover, {coveredByCounterexamples} locations have been found to be reachable " +
$"(i.e. the verifier returned a counterexample and did not timeout). " +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether a word other than "counterexample" would be more useful here to folks who don't know as much about the inner workings of how test generation uses SMT.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done - paraphrased to "...the verifier did not timeout and produced example inputs to reach these locations"

}

internal void PreResolve(Program program) {
AddByMethod(program.DefaultModule);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this skip other modules?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shouldn't - all modules are submodules of the default module

namespace DafnyTestGeneration.Inlining;

/// <summary>
/// Separates the bodies of function-by-methods into separate methods so that translator will process them accordingly.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's cool that you can do this. I can think of some other places it could be useful.

@@ -131,25 +156,35 @@ private enum Status { Success, Failure, Untested }
return counterexampleLog;
}
var log = writer.ToString();
// If Dafny finds several assertion violations (e.g. because of inlining one trap assertion multiple times),
// pick the first execution trace and extract the counterexample from it
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this always the right trace?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The choice of the trace shouldn't matter since I replace all native assertions with assumptions and so the only assertion that can fail is the one that is being targeted. One can imagine an optimization that picks a trace covering the most blocks but in terms correctness the choice should not affect the result.

$"e.g. if branching is conditional on the result of a trait instance " +
$"method call.");
$"for {duplicate.uniqueId} - this may occur if the code under test is non-deterministic, " +
$"if a method/function is not inlined, or if test generation cannot extract a value from a counterexample.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment about "counterexample" as before.

@atomb
Copy link
Member

atomb commented Jul 20, 2023

Those two failures aren't your fault. One will be fixed by #4317 and the other is non-deterministic.

atomb
atomb previously approved these changes Jul 20, 2023
@Dargones
Copy link
Collaborator Author

Got it, thank you @atomb!

@atomb atomb enabled auto-merge (squash) July 25, 2023 21:32
@atomb atomb merged commit 22d5422 into dafny-lang:master Jul 26, 2023
18 checks passed
keyboardDrummer added a commit to keyboardDrummer/dafny that referenced this pull request Sep 15, 2023
This PR overhauls implementation of inlining, on which test generation
relies to generate system-level tests, and which can in the future be
used for other purposes, such as bounded model checking. Inlining, in
this context, means adjusting the Dafny to Boogie translation in such a
way that the resulting Boogie program, when you call Boogie's `Inline`
method on it, merges all specified Dafny functions and methods,
including all corresponding well-formedness checks, into a single
procedure. The key AST transformation necessary to make this happen is
removal of short circuiting expressions from the original program. Most
of what this PR does has to do with lifting this short-circuit-removal
pass from Boogie to Dafny. This achieves three things:

- The new implementation is much more efficient. The old implementation
can generate unit tests and inline one or two methods but not entire
programs. Once Boogie 3.0 is merged into Dafny, the speed up should
increase further.
- The new implementation preserves Dafny line/column information to
report coverage in terms of Dafny source rather than its Boogie
representation. Making this change has been one of the main asks
pertaining to test generation.
- The new code should be, I think, easier to maintain going forward. It
is split into disjoined transformations over Dafny/Boogie ASTs (here is
the [main coordinating
method](https://github.com/Dargones/dafny/blob/a5004d6d62d324433b57937363605b8ba975dca6/Source/DafnyTestGeneration/Inlining/InliningTranslator.cs#L22-L47))
and makes as few assumptions about the translation process as possible
(whereas the previous implementation would rely very heavily on position
and existence of various constructs that Dafny translator creates)

By submitting this pull request, I confirm that my contribution
is made under the terms of the MIT license.

---------

Co-authored-by: Aleksandr Fedchin <fedchina@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants