Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add --test-assumptions option #3185

Merged
merged 5 commits into from
Dec 16, 2022

Conversation

keyboardDrummer
Copy link
Member

@keyboardDrummer keyboardDrummer commented Dec 14, 2022

Changes

Add the --test-assumptions option to all execution commands: run,build,translate,test

Note that I've written the description of the option to be aspirational, so that it indicates what we want this option to do, with the note that it's experimental and after that an explanation of what it currently does. I've also simplified the option so it's just an off/on boolean, instead of allowing using to choose under what conditions they want to add tests, since I don't yet see a use-case for the previous :TestExterns option. Feel free to discuss!

Testing

Add the new option in a littish test

By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.

@atomb
Copy link
Member

atomb commented Dec 14, 2022

I think this looks nice, though I think there will likely be a use for an argument for other modes. I agree that the current Externs option to /testContracts is likely the useful one, and I'm happy to leave out TestedExterns. However, I think having different modes, one that tests just externs and one that tests all assumptions, is still useful. The latter isn't implemented yet, but I've started working on it. I've also started working on the ability to test all contracts (at least those that are executable). I can understand more hesitation about including that last mode, but I do think it can be valuable at certain stages of development, allowing Dafny to work more like Eiffel, say. The anticipation of including that mode is why I named the legacy option /testContracts instead of /testAssumptions.

@robin-aws
Copy link
Member

robin-aws commented Dec 14, 2022

I think this looks nice, though I think there will likely be a use for an argument for other modes. I agree that the current Externs option to /testContracts is likely the useful one, and I'm happy to leave out TestedExterns. However, I think having different modes, one that tests just externs and one that tests all assumptions, is still useful. The latter isn't implemented yet, but I've started working on it. I've also started working on the ability to test all contracts (at least those that are executable). I can understand more hesitation about including that last mode, but I do think it can be valuable at certain stages of development, allowing Dafny to work more like Eiffel, say. The anticipation of including that mode is why I named the legacy option /testContracts instead of /testAssumptions.

One thing I'm excited about is a potential mode to even test assert statements that happen to be compilable. I think it would be an easy way to get more compilation coverage in the Dafny test suite.

Given that, I worry that "contracts" is too specific a noun for this, since for me it suggests only things like pre- and post-conditions. Since we talk about how everything Dafny verifies is ultimately some kind of assertion, perhaps --test-assertions is more accurate and general? The sub-options would then filter down which assertions are in scope, to just externs (which are just unverified assertions), or all contracts, or...

In fact I do wonder if this should be its own top-level verb, or an option on test? Never mind, you're right it needs to be on other verbs like translate as well. But perhaps it would be better to phrase it as configuring which kinds of assertions are checked at runtime, with a default value of "none".

@atomb
Copy link
Member

atomb commented Dec 14, 2022

I think this looks nice, though I think there will likely be a use for an argument for other modes. I agree that the current Externs option to /testContracts is likely the useful one, and I'm happy to leave out TestedExterns. However, I think having different modes, one that tests just externs and one that tests all assumptions, is still useful. The latter isn't implemented yet, but I've started working on it. I've also started working on the ability to test all contracts (at least those that are executable). I can understand more hesitation about including that last mode, but I do think it can be valuable at certain stages of development, allowing Dafny to work more like Eiffel, say. The anticipation of including that mode is why I named the legacy option /testContracts instead of /testAssumptions.

One thing I'm excited about is a potential mode to even test assert statements that happen to be compilable. I think it would be an easy way to get more compilation coverage in the Dafny test suite.

Given that, I worry that "contracts" is too specific a noun for this, since for me it suggests only things like pre- and post-conditions. Since we talk about how everything Dafny verifies is ultimately some kind of assertion, perhaps --test-assertions is more accurate and general? The sub-options would then filter down which assertions are in scope, to just externs (which are just unverified assertions), or all contracts, or...

In fact I do wonder if this should be its own top-level verb, or an option on test? Never mind, you're right it needs to be on other verbs like translate as well. But perhaps it would be better to phrase it as configuring which kinds of assertions are checked at runtime, with a default value of "none".

Yeah, I think maybe --test-assertions is a better name. It doesn't currently include assert statements, but in a "check everything" mode I think it should. And in a mode where it's just checking assumptions, it should check assert statements in any unverified code.

@keyboardDrummer
Copy link
Member Author

keyboardDrummer commented Dec 14, 2022

I'm confused about the suggestion to call it --test-assertions. Aren't assertions the things that the verifier already checks, and assumptions the things we're skipping to verify because they're either too difficult to prove or impossible because the implementation is not written in Dafny?

Do you call a postcondition on an externally implemented method an assertion? I would call that an assumption.

However, I think having different modes, one that tests just externs and one that tests all assumptions, is still useful.

What are the use-cases there? Isn't using externs exposing you to a similar risk as when you're using assume statements? Why would you want to differentiate between those two risks?

test all contracts

You mean even the ones that were statically verified? Interesting. Many of them won't be compilable though. I would put this under an option --test-assertions=none|compilable|all, where 'all' fails if any assertion cannot be compiled.

@atomb
Copy link
Member

atomb commented Dec 14, 2022

I'm confused about the suggestion to call it --test-assertions. Aren't assertions the things that the verifier already checks, and assumptions the things we're skipping to verify because they're either too difficult to prove or impossible because the implementation is not written in Dafny?

Do you call a postcondition on an externally implemented method an assertion? I would call that an assumption.

Postconditions, in general, are both assertions and assumptions, depending on what you're doing with them. Because external code isn't being verified, you don't use its postcondition as an assertion during verification. But that's exactly why you'd want to check it at runtime. So, yes, I'd call an extern postcondition an assertion in this contect.

However, I think having different modes, one that tests just externs and one that tests all assumptions, is still useful.

What are the use-cases there? Isn't using externs exposing you to a similar risk as when you're using assume statements? Why would you want to differentiate between those two risks?

The key use case I imagine is one where internal assumptions aren't effectively testable dynamically, because they heavily use non-executable specifications, but contracts on external code are effectively testable. Although both sorts of assumptions are, indeed, assumptions, I think that in practice they tend to have fairly different characteristics.

test all contracts

You mean even the ones that were statically verified? Interesting. Many of them won't be compilable though. I would put this under an option --test-assertions=none|compilable|all, where 'all' fails if any assertion cannot be compiled.

The use case I imagine here is that you'd disable verification entirely and then check all compilable assertions dynamically, warning about any that aren't. The goal is to provide a more gradual on-ramp to verification where you can start by simply stating what you think should be true, but don't have to put in any extra effort to verify it. And then, as development proceeds and you become more confident that your specifications are actually provable, you can switch to a mode where you prove everything except what's listed explicitly as an assumption (e.g., anything that's still too hard to prove).

@keyboardDrummer
Copy link
Member Author

keyboardDrummer commented Dec 14, 2022

The use case I imagine here is that you'd disable verification entirely and then check all compilable assertions dynamically, warning about any that aren't. The goal is to provide a more gradual on-ramp to verification where you can start by simply stating what you think should be true, but don't have to put in any extra effort to verify it. And then, as development proceeds and you become more confident that your specifications are actually provable, you can switch to a mode where you prove everything except what's listed explicitly as an assumption (e.g., anything that's still too hard to prove).

Sounds good

Postconditions, in general, are both assertions and assumptions, depending on what you're doing with them. Because external code isn't being verified, you don't use its postcondition as an assertion during verification. But that's exactly why you'd want to check it at runtime. So, yes, I'd call an extern postcondition an assertion in this context.

assert statements have conditions that are verified, while assume statements have conditions that are not verified, but afterwards are taken for granted by the verifier.

If you call an external method with a postcondition, then after the call the ensures clause is taken for granted by the verifier, even though it was never verified, so why would you not call the ensure clause an assumption in this case?

The key use case I imagine is one where internal assumptions aren't effectively testable dynamically, because they heavily use non-executable specifications, but contracts on external code are effectively testable. Although both sorts of assumptions are, indeed, assumptions, I think that in practice they tend to have fairly different characteristics.

Why would the characteristics be different? It could also be the other way around, that you have a compilable assume statement and an uncompilable ensures clause for an {:extern} method. Is there a reason why you want to dynamically test your extern clauses more than your assume statements? Is one less trustworthy than the other?

We could have a --test-assumptions=none|compilable|all option that does a best effort if you select compilable, so you could choose to skip the uncompilable things.

@atomb
Copy link
Member

atomb commented Dec 14, 2022

Postconditions, in general, are both assertions and assumptions, depending on what you're doing with them. Because external code isn't being verified, you don't use its postcondition as an assertion during verification. But that's exactly why you'd want to check it at runtime. So, yes, I'd call an extern postcondition an assertion in this context.

assert statements have conditions that are verified, while assume statements have conditions that are not verified, but afterwards are taken for granted by the verifier.

If you call an external method with a postcondition, then after the call the ensures clause is taken for granted by the verifier, even though it was never verified, so why would you not call the ensure clause an assumption in this case?

It's both. It's an assumption from the point of the caller, but it's an assertion from the point of the callee (the external code). Since it's not going to be verified and that external code isn't going to test it at runtime, though, it's nice to have the caller test instead.

The key use case I imagine is one where internal assumptions aren't effectively testable dynamically, because they heavily use non-executable specifications, but contracts on external code are effectively testable. Although both sorts of assumptions are, indeed, assumptions, I think that in practice they tend to have fairly different characteristics.

Why would the characteristics be different? It could also be the other way around, that you have a compilable assume statement and an uncompilable ensures clause for an {:extern} method. Is there a reason why you want to dynamically test your extern clauses more than your assume statements? Is one less trustworthy than the other?

The key difference is that contracts between internal code are often complex, because they can be (since there's a way to check them statically), and that's more likely to lead to introducing things like quantifiers and infinite mathematical objects. Due to the more limited support for checking, I tend to (and think others tend to) make contracts on externs as simple as possible, and design the code so that those contracts can be simple by avoiding as much coupling as possible.

@keyboardDrummer
Copy link
Member Author

It's both. It's an assumption from the point of the caller, but it's an assertion from the point of the callee (the external code).

I agree that from an implementation perspective that is true, since Dafny uses modular verification and when you do a call it won't verify the callee again just for that call.

However I don't think that's the case from a usage perspective. I would say that asserting a condition means asking Dafny to verify that it holds, while assuming a condition means asking Dafny to merely assume that it holds. The ensures clauses of the callee are verified to hold after the call, so in that sense they are asserted, not assumed, from the caller's perspective.

The key difference is that contracts between internal code are often complex, because they can be (since there's a way to check them statically), and that's more likely to lead to introducing things like quantifiers and infinite mathematical objects. Due to the more limited support for checking, I tend to (and think others tend to) make contracts on externs as simple as possible, and design the code so that those contracts can be simple by avoiding as much coupling as possible.

Makes sense, but these are only statistical differences. I think it would be better to allow customers to decide for which assume and {:externs} they want to generate runtime tests. I think we can re-use the {:axiom} annotation for this. {:axiom} will mean "I want to statically verify this, but I'm [currently] unable and I'm accepting that", and prevents a dynamic test from being generated. assumes without {:axiom} will generate a warning, as our users have requested, unless --test-assumptions is specified and they can be compiled.

How about we introduce --test-assumptions=none|all|compilable which does the following when using all:

  • assume without {:axiom}, error if it can not be compiled. Generates a regular warning unless --test-assumptions is specified.
  • assume with {:axiom}, no runtime test is generated.
  • calls to externally implemented {:extern} methods without {:axiom} with ensures clauses, error if they can not be compiled. Generates a regular warning unless --test-assumptions is specified.
  • calls to externally implemented {:extern} methods with {:axiom} with ensures clauses, no runtime test is generated. Generates a regular warning unless --test-assumptions is specified.
  • Dafny implemented {:extern} methods without {:axiom} with requires clauses, error if they can not be compiled.
  • Dafny implemented {:extern} methods with {:axiom} with requires clauses, no runtime test is generated.

When using compilable, no error is given when trying to create a runtime test for an not compilable condition.

@atomb
Copy link
Member

atomb commented Dec 15, 2022

How about we introduce --test-assumptions=none|all|compilable which does the following when using all:

* `assume` without `{:axiom}`, error if it can not be compiled.  Generates a regular warning unless `--test-assumptions` is specified.

* `assume` with `{:axiom}`, no runtime test is generated.

* calls to externally implemented `{:extern}` methods without {:axiom} with ensures clauses, error if they can not be compiled. Generates a regular warning unless `--test-assumptions` is specified.

* calls to externally implemented `{:extern}` methods with {:axiom} with ensures clauses, no runtime test is generated. Generates a regular warning unless `--test-assumptions` is specified.

* Dafny implemented `{:extern}` methods without {:axiom} with requires clauses, error if they can not be compiled.

* Dafny implemented `{:extern}` methods with {:axiom} with requires clauses, no runtime test is generated.

I think there's more design work we'll need to do to choose the precise set of arguments to include. For the purposes of this PR, I think the key conclusion is that we'll support more than one mode, so an argument is necessary. Since Externs is currently implemented, it seems like it makes sense to support externs and none (the default) in this PR, and add others later. What do you think?

@keyboardDrummer
Copy link
Member Author

keyboardDrummer commented Dec 16, 2022

I think there's more design work we'll need to do to choose the precise set of arguments to include. For the purposes of this PR, I think the key conclusion is that we'll support more than one mode, so an argument is necessary. Since Externs is currently implemented, it seems like it makes sense to support externs and none (the default) in this PR, and add others later. What do you think?

Well, I think we could go with --test-assumptions=true|false as well, and only show warnings not errors if assumptions are not compilable. That will allow the user to runtime test their assumptions while they're happily coding away.

Note that this option is marked as experimental, and we'll be free to change it after we add it, so I've made the change you suggested.

@@ -1,5 +1,6 @@
using System.Collections.Generic;
using System.CommandLine;
using System.Diagnostics.Tracing;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the code using this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope ^,^ would prefer not to have another review/build cycle again for this though. Maybe we can find a better way of catching this in the future, maybe through dotnet-format.

@@ -38,6 +38,7 @@ public interface ICommandSpec {
BoogieOptionBag.NoVerify,
CommonOptionBag.EnforceDeterminism,
CommonOptionBag.OptimizeErasableDatatypeWrapper,
CommonOptionBag.TestAssumptions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still slightly new the the modern options architecture we have here: will this option be available for dafny translate, too? I suspect it'll be used most often there, though also during dafny run and dafny test.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Execution options are available for run,build,translate and test

@keyboardDrummer keyboardDrummer merged commit 96042ff into dafny-lang:master Dec 16, 2022
@keyboardDrummer keyboardDrummer deleted the test-assumptions branch December 16, 2022 17:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants