Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Show steps for successful proof #30

Closed
treiher opened this issue Jun 17, 2020 · 6 comments · Fixed by #41
Closed

Show steps for successful proof #30

treiher opened this issue Jun 17, 2020 · 6 comments · Fixed by #41
Assignees
Labels
enhancement New feature or request
Milestone

Comments

@treiher
Copy link

treiher commented Jun 17, 2020

Is your feature request related to a problem? Please describe.
While reducing --timeout to improve the proof time is a valid option, it is not very reliable when GNATprove is executed on machines with very different computing power. A reproducible proof setting is especially important when GNATprove is regularly run on external machines with unknown resources (e.g., a CI provider). That is why I usually prefer to use --steps instead (or in combination with a high timeout). Unfortunately, SPAT doesn't analyze proof steps.

Describe the solution you'd like
The steps could be shown along with the time. For example:

RFLX.RFLX_Types.U64_Insert => 120.0 s/1.6 ks 87391 steps
`-VC_PRECONDITION rflx-rflx_generic_types.adb:221:39 => 120.0 s/491.0 s 46034 steps
  -altergo: 188.2 ms 46034 steps (Valid)
[...]

It should be also possible to sort the output by the number of steps, so that the most difficult units of the whole project can be easily seen.

@treiher treiher added the enhancement New feature or request label Jun 17, 2020
@Jellix
Copy link
Member

Jellix commented Jun 17, 2020

Indeed, I recently added the number of steps to the output (see xref_#26 branch), but no sorting yet.

The problem with steps is that the number is very prover specific, so you can't really just add them up to a total number or otherwise mix them, that was also the reason why I started focusing on the time values reported, they are far more reliable for comparisons even though they are very machine specific.

For example:

`-VC_LOOP_INVARIANT_PRESERV sparknacl-sign.adb:896:16 => 51.5 s/115.1 s
 `-CVC4: 51.5 s (Unknown (unknown), 505013 steps)
  -Z3: 8.2 s (Valid, 11655208 steps)
 `-CVC4: 51.1 s (Unknown (unknown), 505021 steps)
  -Z3: 4.3 s (Valid, 5928053 steps)

While Z3 is reporting about 11.6 M and 6 M steps (that's roundabout 1.4 M steps/s), CVC4 only reports 505 k each (so roughly 9 k steps/s), but the time taken can be magnitudes more than the number of steps let you believe.

If you just add them, the number of steps reported for Z3 would be 17.6 M, with the steps added from CVC4 the total number would increase to about 18.6 M, which - due to the fact that the steps/s value is vastly different - doesn't say much about the actual difference in computing time.

To solve that, I'd probably need to introduce some kind of scaling factor and report a number of "virtual steps" instead. But then again, you could just use the time value, because that's what it will be then, just recalculated as steps.

@treiher
Copy link
Author

treiher commented Jun 17, 2020

The problem with steps is that the number is very prover specific, so you can't really just add them up to a total number or otherwise mix them

Yes, it is unfortunate that the steps are very prover specific. I think it would still make sense to determine the maximum number of required steps over all provers. I suppose there will not always be an optimal order of provers for an unit, so that no prover fails. In such a case knowing the maximum number of steps allows to add a steps limit to the unit which may lead to an earlier failing prover.

For example:

`-VC_PRECONDITION test.adb:87:40 => 18.5 s/23.5 s 5847 steps
 `-CVC4: 18.5 s (Failure (steps:43390))
  -altergo: 4.7 s (Valid, 5847 steps)

Adding --steps=5847 to test.adb would lead to an earlier failing CVC4 and therefore less wasted time.

@Jellix Jellix added this to the V1.0.0 milestone Jun 17, 2020
@Jellix
Copy link
Member

Jellix commented Jun 17, 2020

I link that to #26 for now, I don't see why finding the optimal configuration should not take the steps into account. At the very least it could output the number of steps that need to be configured for any configuration it finds.

@Jellix Jellix mentioned this issue Jun 20, 2020
@Jellix Jellix linked a pull request Jun 20, 2020 that will close this issue
@treiher
Copy link
Author

treiher commented Jul 14, 2020

Is there a reason for showing the steps only with --details? It would be nice to have them already with --report-mode and --summary, and allow sorting them by steps.

@Jellix Jellix reopened this Jul 14, 2020
@Jellix
Copy link
Member

Jellix commented Jul 14, 2020

No real reason other than that I initially focused on reported times and added the steps later in development.

@Jellix
Copy link
Member

Jellix commented Jul 14, 2020

Issue #61 opened for the latest specific request to properly keep track of changes.

@Jellix Jellix closed this as completed Jul 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants