Print runnable scenario in case of test failure #204

n-osborne · 2024-02-22T15:05:02Z

This PR is the continuation of, and is based on, #202.
Please consider only the six last commits.

This PR makes the failure message print a runnable scenario with information about returned and expected values.

It uses the same mechanism as introduced in #202 for communicating with Qcheck.fail_reportf.

For printing the commands with their sut argument in the right place, only the generated show_cmd has been modified.

Printing the user-provided value for init_sut is quite straightforward, we just have to pass the information to the runtime.

Printing the expected value is a bit more complicated.
First, we look at the Gospel specifications to see if there is enough information to compute the expected returned value (some functions may be specified in a way that their contract doesn't include a computation of the returned value).
If there is, we wrap this value in a STM.res type and send it to the runtime through ortac_postond along all the other information.
It there is not, we send a dummy value, to inform the printer that the expected returned value is unknown.
A warning will be displayed to the user in the second case, but this will not affect whether the function is tested or not.

This PR only deals with normal behaviour, meaning that the excpetional branches in ortac_postond don't send an expected returned value to the runtime for now.
I believe that handling exceptional behaviour can be done in another PR (this is me trying to keep PRs short).

The ortac_runtime now handle this extra information to print a runnable scenario to the user in case of test failure.

For now, I've made the choice to print the function calls in an ignore and to print the returned value and the optional expected returned value in comments.
We've discussed elsewhere about placing the function calls in an assert, but we don't always have an expected returned value

If we introduce a bug in plugins/qcheck-stm/test/array.ml, we have now:

λ dune build @launchtests
File "plugins/qcheck-stm/test/dune.inc", line 49, characters 0-75:
49 | (rule
50 |  (alias launchtests)
51 |  (action
52 |   (run %{dep:array_stm_tests.exe} -v)))
random seed: 515405686
generated error fail pass / total     time test name
[✗]    1    0    1    0 / 1000     0.0s Array STM tests (generating)

--- Failure --------------------------------------------------------------------

Test Array STM tests failed (14 shrink steps):

   set sut (-2603392982468686691) '\190'
   to_list sut


+++ Messages ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Messages for test Array STM tests:

Gospel specification violation in function to_list

  File "array.mli", line 36, characters 12-26:
    l = t.contents
  
when executing the following sequence of operations:

  let sut = make 16 'a' in
  ignore (set sut (-2603392982468686691) '\190'); (* returned Error (Invalid_argument("index ou t of bounds")) *)
  ignore (to_list sut); (* returned ['a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a'; '\190'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a']
                           expected ['a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a' ; 'a'; 'a'; 'a'; 'a'] *)
  

================================================================================
failure (1 tests failed, 0 tests errored, ran 1 tests)
                                       
[1]

jmid · 2024-02-23T14:16:36Z

Nice work Nicolas 😃🎉 I particularly liked the (aligned) returned and expected results! ❤️

I suppose we could save a few characters (the shorter output the better IMO) by using

top-level let-forms

let _ = instead of ignore( ... ):

let sut = make 16 'a'
let _ = set sut (-2603392982468686691) '\190' (* returned Error (Invalid_argument("index ou t of bounds")) *)
let _ = to_list sut (* returned ['a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a'; '\190'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a']
                       expected ['a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a' ; 'a'; 'a'; 'a'; 'a'] *)

As a runnable program, the example seems to place itself between two chairs.
It is both informative and runnable, yet requires manual changes:

it will stop with an exception on line 2 before reaching the model-mismatch in line 3.
the mismatch in line 3 needs, e.g., an assert to express the expectation as runnable code,
otherwise it will just succeed silently.
as a very minor nit I prefixed it with open Array to directly copy-paste it into a top-level

Short-circuit booleans means the underlying QCheck property always stops at the first model-mismatch.
I think this means that generally

we can ignore all results up until the last line (annotating them as comments is however very informative!)
exception raising calls underway need some kind of try ... with or protect

only the last line needs an assert, e.g., by outputting a two-line thing:

let r = to_list sut
let _ = assert(r = ['a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a' ; 'a'; 'a'; 'a'; 'a']

These are just my immediate thoughts, YMMV. 🤷
We can also leave further improvements for the summer... (hint hint 😃)

jmid · 2024-02-23T14:22:12Z

Sorry if this didn't come across clearly above:
The output is already a nice improvement over the default qcheck-stm one! 👍

n-osborne · 2024-02-26T15:28:50Z

Thank you for your comments :-)

I've modified the PR so that it includes an open of the tested module at the beginning of the ouptuted scenario.

Regarding the use of an assert for the failing function call, it is worth noting that we don't always have an expected value to show: specifications can be written in a way that we can know when something goes wrong but we can't compute the expected value.

That simply means that we can't guarantee to the user that we output a failing scenario for a failing test.

shym

I must say this took quite some time to review, maybe some comments could help.
I wrote my comments on specific lines except for:

commit “Print cmds with their sut argument”: the message could mention that it also adds open X at the beginning of the generated log
commit “Collect term containing a computation of the returned value”: I must admit I had to read the code to understand the commit message; I would suggest adding a small comment to the function returned_value (maybe it would be better named returned_value_specification?)

shym · 2024-03-15T15:14:48Z

plugins/qcheck-stm/src/stm_of_ir.ml

+      match (ty.ptyp_desc, args) with
+      | Ptyp_arrow (_, l, r), xs when Cfg.is_sut config l ->
+          let* fmt, pps = aux r xs in
+          ok ("%s" :: fmt, estring "sut" :: pps)


If fmt directly contained sut instead of %s, we could keep pp_args unchanged.
On the other hand, maybe your proposition is an altogether better approach, to build fmt and pp_args at the same time.

We can indeed spare the %s and estring "sut"

shym · 2024-03-15T15:14:48Z

plugins/qcheck-stm/src/stm_of_ir.ml

+      | _, [] -> ok ([], [])
+      | _, _ ->
+          failwith
+            "shouldn't happen (list of arguments should be consistent with \


I wonder if we shouldn’t create an error value ShouldNotHappen (in a separate PR if at all), as it appears in many already monadic contexts.

plugins/qcheck-stm/src/ir_of_gospel.ml

shym · 2024-03-15T15:14:48Z

plugins/qcheck-stm/src/stm_of_ir.ml

+  in
+  let ty_show = to_option @@ exp_of_core_type value.inst ty_ret in
+  match (ty_show, ret_val) with
+  | Some ty_show, Some ret_values ->


ret_values: this is not a list, shouldn't this be singular?

shym · 2024-03-15T15:14:48Z

plugins/qcheck-stm/src/stm_of_ir.ml

+        ok @@ esome dummy
+    | Some e -> ok @@ esome e
+  in
+  let wrap_check ?(normal = false) t e =


What does normal really mean here? I see it set to true for (normal) postconditions and invariants, but anyway, I don’t have a clear understanding of the impact of setting it to true.

I've added some comments to explain.
Thanks!

This allow to output a runnable scenario in case of test failure. The runnable scenario will be of the form: ```ocaml open Module_under_test let sut = init_sut ... ```

These descriptions will be used to print the expected behaviour when possible. Functions returning tuples are still not supported, but in the same way that we've prepared the way by storing a list a returned values, we store a list of list of computation of returned value.

Also adapt the runtime so that the commit build. When a returned value can't be computed based on the information contained in the Gospel specification, a warning is displayed to the user at generation and a dummy one is placed to inform the message printer that the expected returned value is unknown.

n-osborne · 2024-03-18T09:09:16Z

Green light to merge has been given in another chat.

n-osborne force-pushed the print-runnable-scenario branch 3 times, most recently from 0d75a32 to efddee7 Compare February 23, 2024 09:33

n-osborne force-pushed the print-runnable-scenario branch 2 times, most recently from a39c209 to 25a80af Compare February 26, 2024 13:56

n-osborne force-pushed the print-runnable-scenario branch from 25a80af to b7af91c Compare February 27, 2024 15:36

n-osborne mentioned this pull request Mar 6, 2024

Add support for exceptional behaviour to printing runnable scenario #206

Merged

n-osborne added this to the 0.2 milestone Mar 6, 2024

n-osborne force-pushed the print-runnable-scenario branch 2 times, most recently from 7503aed to b59417f Compare March 14, 2024 10:25

shym reviewed Mar 15, 2024

View reviewed changes

n-osborne added 3 commits March 15, 2024 17:23

Print cmds with their sut argument

ac34a51

This allow to output a runnable scenario in case of test failure. The runnable scenario will be of the form: ```ocaml open Module_under_test let sut = init_sut ... ```

Add a warning for incomplete computation of returned value

145b440

n-osborne force-pushed the print-runnable-scenario branch from bb5ccb9 to 38e396c Compare March 15, 2024 16:29

n-osborne added 3 commits March 15, 2024 17:51

Print expression for init_sut in failure message

c6b13f6

Update Changelog

3c1d638

n-osborne force-pushed the print-runnable-scenario branch from 38e396c to 3c1d638 Compare March 15, 2024 16:52

n-osborne merged commit 83db7d2 into ocaml-gospel:main Mar 18, 2024
3 checks passed

n-osborne deleted the print-runnable-scenario branch March 18, 2024 09:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Print runnable scenario in case of test failure #204

Print runnable scenario in case of test failure #204

n-osborne commented Feb 22, 2024

jmid commented Feb 23, 2024

jmid commented Feb 23, 2024 •

edited

Loading

n-osborne commented Feb 26, 2024

shym left a comment

shym Mar 15, 2024

n-osborne Mar 15, 2024

shym Mar 15, 2024

shym Mar 15, 2024

shym Mar 15, 2024

n-osborne Mar 15, 2024

n-osborne commented Mar 18, 2024

Print runnable scenario in case of test failure #204

Print runnable scenario in case of test failure #204

Conversation

n-osborne commented Feb 22, 2024

jmid commented Feb 23, 2024

jmid commented Feb 23, 2024 • edited Loading

n-osborne commented Feb 26, 2024

shym left a comment

Choose a reason for hiding this comment

shym Mar 15, 2024

Choose a reason for hiding this comment

n-osborne Mar 15, 2024

Choose a reason for hiding this comment

shym Mar 15, 2024

Choose a reason for hiding this comment

shym Mar 15, 2024

Choose a reason for hiding this comment

shym Mar 15, 2024

Choose a reason for hiding this comment

n-osborne Mar 15, 2024

Choose a reason for hiding this comment

n-osborne commented Mar 18, 2024

jmid commented Feb 23, 2024 •

edited

Loading