Show steps for successful proof #30

treiher · 2020-06-17T09:52:46Z

Is your feature request related to a problem? Please describe.
While reducing --timeout to improve the proof time is a valid option, it is not very reliable when GNATprove is executed on machines with very different computing power. A reproducible proof setting is especially important when GNATprove is regularly run on external machines with unknown resources (e.g., a CI provider). That is why I usually prefer to use --steps instead (or in combination with a high timeout). Unfortunately, SPAT doesn't analyze proof steps.

Describe the solution you'd like
The steps could be shown along with the time. For example:

RFLX.RFLX_Types.U64_Insert => 120.0 s/1.6 ks 87391 steps
`-VC_PRECONDITION rflx-rflx_generic_types.adb:221:39 => 120.0 s/491.0 s 46034 steps
  -altergo: 188.2 ms 46034 steps (Valid)
[...]

It should also be possible to sort the output by the number of steps, so that the most difficult units of the whole project can be easily seen.


Jellix · 2020-06-17T11:44:22Z

Indeed, I recently added the number of steps to the output (see xref_#26 branch), but no sorting yet.

The problem with steps is that the number is very prover-specific, so you can't really just add them up to a total or otherwise mix them. That was also the reason why I started focusing on the reported time values: they are far more reliable for comparisons, even though they are very machine-specific.

For example:

`-VC_LOOP_INVARIANT_PRESERV sparknacl-sign.adb:896:16 => 51.5 s/115.1 s
 `-CVC4: 51.5 s (Unknown (unknown), 505013 steps)
  -Z3: 8.2 s (Valid, 11655208 steps)
 `-CVC4: 51.1 s (Unknown (unknown), 505021 steps)
  -Z3: 4.3 s (Valid, 5928053 steps)

While Z3 is reporting about 11.6 M and 6 M steps (that's roughly 1.4 M steps/s), CVC4 only reports 505 k each (so just under 10 k steps/s), but the time taken can be orders of magnitude more than the number of steps would lead you to believe.

If you just add them, the number of steps reported for Z3 would be 17.6 M, with the steps added from CVC4 the total number would increase to about 18.6 M, which - due to the fact that the steps/s value is vastly different - doesn't say much about the actual difference in computing time.
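The arithmetic above can be reproduced with a small sketch. The figures are copied from the example output; the data layout and the helper name `totals_by_prover` are hypothetical and are not part of SPAT.

```python
# Figures copied from the example output above: (prover, seconds, steps).
attempts = [
    ("CVC4", 51.5, 505_013),
    ("Z3", 8.2, 11_655_208),
    ("CVC4", 51.1, 505_021),
    ("Z3", 4.3, 5_928_053),
]


def totals_by_prover(attempts):
    """Sum time and steps separately for each prover (hypothetical helper)."""
    totals = {}
    for prover, seconds, steps in attempts:
        t, s = totals.get(prover, (0.0, 0))
        totals[prover] = (t + seconds, s + steps)
    return totals


totals = totals_by_prover(attempts)
for prover, (seconds, steps) in totals.items():
    print(f"{prover}: {steps} steps in {seconds:.1f} s = {steps / seconds:,.0f} steps/s")

# Naively adding steps across provers hides where the time actually went.
naive_total = sum(steps for _, _, steps in attempts)
total_time = sum(t for t, _ in totals.values())
cvc4_time_share = totals["CVC4"][0] / total_time
cvc4_step_share = totals["CVC4"][1] / naive_total
print(f"CVC4: {cvc4_time_share:.0%} of the time, {cvc4_step_share:.0%} of the steps")
```

In this example CVC4 accounts for most of the elapsed time but only a small fraction of the summed steps, which is why a mixed total says little about computing time.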

To solve that, I'd probably need to introduce some kind of scaling factor and report a number of "virtual steps" instead. But then again, you could just use the time value, because that's what it will be then, just recalculated as steps.
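A minimal sketch of why such a scaling factor collapses back into the time value. The reference rate and the function name `virtual_steps` are assumptions made up for illustration; SPAT does not provide them.

```python
# Hypothetical reference rate: how many "virtual steps" count as one second.
REFERENCE_RATE = 1_000_000


def virtual_steps(steps, seconds, reference_rate=REFERENCE_RATE):
    """Rescale prover-specific steps to a common rate (illustrative only)."""
    prover_rate = steps / seconds          # steps per second for this attempt
    scale = reference_rate / prover_rate   # per-prover scaling factor
    return steps * scale                   # algebraically: seconds * reference_rate


# Values taken from the example above.
cvc4_virtual = virtual_steps(505_013, 51.5)
z3_virtual = virtual_steps(11_655_208, 8.2)
print(round(cvc4_virtual), round(z3_virtual))
```

Because `steps * (reference_rate / (steps / seconds))` simplifies to `seconds * reference_rate`, the result is just the measured time in different units.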

treiher · 2020-06-17T12:42:18Z

> The problem with steps is that the number is very prover specific, so you can't really just add them up to a total number or otherwise mix them

Yes, it is unfortunate that the steps are very prover-specific. I think it would still make sense to determine the maximum number of required steps over all provers. I suppose there will not always be an optimal order of provers for a unit such that no prover fails. In such a case, knowing the maximum number of steps allows adding a steps limit to the unit, which may cause a failing prover to give up earlier.

For example:

`-VC_PRECONDITION test.adb:87:40 => 18.5 s/23.5 s 5847 steps
 `-CVC4: 18.5 s (Failure (steps:43390))
  -altergo: 4.7 s (Valid, 5847 steps)

Adding --steps=5847 to test.adb would lead to an earlier failing CVC4 and therefore less wasted time.
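One way to derive such a limit could be sketched as follows. This is an illustration under stated assumptions, not how SPAT computes anything: the data layout and the function `suggested_step_limit` are hypothetical, and the rule used (for each VC take the cheapest successful prover, then take the maximum over all VCs) is one possible reading of the proposal.

```python
def suggested_step_limit(vcs):
    """Return a step limit that still lets every VC be proved, or None.

    vcs maps a VC name to a list of (prover, is_valid, steps) attempts.
    Hypothetical helper: each VC needs at least the steps of its cheapest
    successful prover; the unit needs the maximum of those values.
    """
    limit = 0
    for attempts in vcs.values():
        successful = [steps for _, is_valid, steps in attempts if is_valid]
        if not successful:
            return None  # this VC was never proved, so no limit helps
        limit = max(limit, min(successful))
    return limit


# Values taken from the example above.
vcs = {
    "VC_PRECONDITION test.adb:87:40": [
        ("CVC4", False, 43_390),
        ("altergo", True, 5_847),
    ],
}
suggestion = suggested_step_limit(vcs)
print(suggestion)
```

For the example above this yields 5847, the value proposed for `--steps`.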

Jellix · 2020-06-17T21:41:05Z

I'll link that to #26 for now. I don't see why finding the optimal configuration should not take the steps into account. At the very least, it could output the number of steps that need to be configured for any configuration it finds.

treiher · 2020-07-14T10:19:46Z

Is there a reason for showing the steps only with --details? It would be nice to have them already with --report-mode and --summary, and allow sorting them by steps.

Jellix · 2020-07-14T13:02:12Z

No real reason other than that I initially focused on reported times and added the steps later in development.

Jellix · 2020-07-14T13:07:57Z

Issue #61 opened for the latest specific request to properly keep track of changes.

treiher added the enhancement New feature or request label Jun 17, 2020

treiher assigned Jellix Jun 17, 2020

Jellix added this to the V1.0.0 milestone Jun 17, 2020

Jellix mentioned this issue Jun 20, 2020

--suggest switch #41

Merged

Jellix linked a pull request Jun 20, 2020 that will close this issue

--suggest switch #41

Merged

Jellix closed this as completed in #41 Jun 20, 2020

Jellix reopened this Jul 14, 2020

Jellix mentioned this issue Jul 14, 2020

Show max/success steps also in summary and less detailed report modes #61

Closed

Jellix closed this as completed Jul 14, 2020
