Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sub-optimal QCheck2 function shrinkers #163

Open
jmid opened this issue Aug 18, 2021 · 0 comments
Open

Sub-optimal QCheck2 function shrinkers #163

jmid opened this issue Aug 18, 2021 · 0 comments

Comments

@jmid
Copy link
Collaborator

jmid commented Aug 18, 2021

Sub-optimal function shrinkers

A number of the function tests illustrate a difference in the function shrinker's output (terminal cuts off at width 160):

--- Failure -----------------------------------------------------------------	--- Failure -----------------------------------------------------------------

Test fail_pred_map_commute failed (127 shrink steps):			      |	Test fail_pred_map_commute failed (16 shrink steps):

([3], {_ -> 0}, {3 -> false; _ -> true})				      |	([2], {_ -> 0}, {1 -> false; 2 -> true; _ -> false})

...

--- Failure -----------------------------------------------------------------	--- Failure -----------------------------------------------------------------

Test fold_left fold_right failed (25 shrink steps):			      |	Test fold_left fold_right failed (22 shrink steps):

(0, [1], {(1, 0) -> 1; _ -> 0})						      |	(0, [1], {(1, 0) -> 1; (8, 0) -> 0; (8, 8) -> 0; (8, 93) -> 0; (7, 7) -> 0; (

...

--- Failure -----------------------------------------------------------------	--- Failure -----------------------------------------------------------------

Test fold_left fold_right uncurried failed (111 shrink steps):		      |	Test fold_left fold_right uncurried failed (325 shrink steps):

({(5, 7) -> 0; _ -> 7}, 0, [5; 0])					      |	({(23, 62) -> 0; (9, 42) -> 0; (8, 61) -> 0; (8, 5) -> 0; (30, 5) -> 0; (9, 6

--- Failure -----------------------------------------------------------------	--- Failure -----------------------------------------------------------------

Test fold_left fold_right uncurried fun last failed (26 shrink steps):	      |	Test fold_left fold_right uncurried fun last failed (25 shrink steps):

(0, [1], {(0, 1) -> 1; _ -> 0})						      |	(0, [1], {(0, 2) -> 0; (8, 80) -> 0; (93, 9) -> 0; (7, 24) -> 0; (8, 0) -> 0;

The steps spent vary between the two approaches. The last two are perhaps most saying.
First off, here's the counterexamples reported by QCheck2:

--- Failure --------------------------------------------------------------------

Test fold_left fold_right uncurried failed (325 shrink steps):

({(23, 62) -> 0; (9, 42) -> 0; (8, 61) -> 0; (8, 5) -> 0; (30, 5) -> 0; (9, 6) -> 0; (76, 6) -> 0; (19, 31) -> 0; (7, 62) -> 0; (0, 7) -> 1; (7, 1) -> 0; (78, 4) -> 0; (8, 2) -> 0; (78, 0) -> 0; (3, 47) -> 0; (4, 8) -> 0; (98, 9) -> 0; (1, 38) -> 0; (0, 26) -> 0; (1, 7) -> 0; (86, 3) -> 0; (9, 37) -> 0; (8, 1) -> 0; (79, 9) -> 0; (3, 5) -> 0; (56, 8) -> 0; (2, 5) -> 0; (8, 8) -> 0; (56, 67) -> 0; (5, 60) -> 0; (2, 31) -> 0; (61, 6) -> 0; (12, 5) -> 0; (76, 2) -> 0; (78, 8) -> 0; (1, 1) -> 0; (8, 9) -> 0; (7, 8) -> 0; (2, 9) -> 0; (29, 7) -> 0; (5, 8) -> 0; (28, 6) -> 0; (1, 4) -> 0; (9, 79) -> 0; (0, 1) -> 0; (1, 41) -> 0; (82, 98) -> 0; (6, 79) -> 0; (7, 6) -> 0; (4, 3) -> 0; (8, 12) -> 0; (5, 1) -> 0; (39, 1) -> 0; (3, 6) -> 0; (1, 2) -> 0; (76, 31) -> 0; (4, 1) -> 0; (6, 5) -> 0; (0, 8) -> 0; (8, 7) -> 0; (2, 6) -> 0; (52, 5) -> 0; (8, 47) -> 0; (5, 3) -> 0; (7, 9) -> 0; (13, 13) -> 0; (0, 87) -> 0; (82, 0) -> 0; (34, 8) -> 0; (1, 14) -> 0; (2, 71) -> 0; (52, 4) -> 0; (1, 3) -> 0; (85, 6) -> 0; (8, 19) -> 0; (3, 13) -> 0; (69, 1) -> 0; (5, 62) -> 0; (0, 15) -> 0; (34, 0) -> 0; (9, 4) -> 0; (0, 6) -> 0; (1, 8) -> 0; (86, 6) -> 0; (4, 5) -> 0; (3, 1) -> 0; (57, 2) -> 0; (3, 3) -> 0; (4, 0) -> 0; (30, 6) -> 0; (5, 34) -> 0; (0, 4) -> 0; (2, 3) -> 0; (5, 6) -> 0; (5, 7) -> 0; (5, 0) -> 0; (4, 4) -> 0; (7, 5) -> 0; (78, 2) -> 0; (9, 8) -> 0; (7, 70) -> 0; (35, 1) -> 0; (64, 7) -> 0; (60, 0) -> 0; (1, 9) -> 0; _ -> 0}, 0, [0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 7])

--- Failure --------------------------------------------------------------------

Test fold_left fold_right uncurried fun last failed (25 shrink steps):

(0, [1], {(0, 2) -> 0; (8, 80) -> 0; (93, 9) -> 0; (7, 24) -> 0; (8, 0) -> 0; (9, 7) -> 0; (0, 24) -> 0; (0, 7) -> 0; (7, 1) -> 0; (8, 9) -> 0; (24, 0) -> 0; (5, 8) -> 0; (1, 0) -> 1; (4, 8) -> 0; (7, 0) -> 0; (5, 7) -> 0; (8, 4) -> 0; (24, 5) -> 0; (0, 1) -> 0; (2, 8) -> 0; (9, 1) -> 0; (8, 8) -> 0; _ -> 0})

It is clear that generating functions last in a tuple rather than first (a hard-learned lesson from issue #8) is still a clear win for integrated shrinking. Still, there is quite a few more bindings reported in the above counterexample compared to QCheck.
Looking at the function generator and shrinker I can see a list of bindings is reduced using Tree.applicative_take which takes at 0-len bindings:

    let shrinks : (k, v) t Tree.t Seq.t = fun () ->
      (* This only gets evaluated *after* the test was run for [tbl], meaning it is correctly
         populated with bindings recorded during the test already *)
      let current_bindings : (k * v Tree.t) list = List.rev !(root.p_tree_bindings_rev) in
      let take_at_most_tree : int Tree.t = Tree.make_primitive (Shrink.int_towards 0) (List.length current_bindings) in
      let current_tree_bindings : (k * v) Tree.t list = List.map (fun (k, tree) -> Tree.map (fun v -> (k, v)) tree) current_bindings in
      let shrunk_bindings_tree : (k * v) list Tree.t = Tree.bind take_at_most_tree (fun take_at_most -> Tree.applicative_take take_at_most current_tree_bindings) in
      (* During shrinking, we don't want to record/add bindings, so [~extend:false]. *)
      let shrunk_poly_tbl_tree : (k, v) t Tree.t = Tree.map (fun bindings -> List.to_seq bindings |> T.of_seq |> make ~extend:false) shrunk_bindings_tree in
      (* [shrunk_poly_tbl_tree] is a bit misleading: its root *should* be the same as [root] but because of the required laziness
         induced by the mutation of bindings, we don't use it, only graft its children to the original [root]. *)
      Tree.children shrunk_poly_tbl_tree ()
    in
    Tree.Tree (root, shrinks)

This looks like a quick algorithmic list shrinker to me - also with opportunities for improvements.

It puzzled me that the take_at_most wasn't reducing the above list of bindings further. After all, the outlier binding (1, 0) -> 1 is in the middle!

I therefore added a shrink log of successful shrink attempts by replacing l.355 of QCheck2_expect_test.ml with:

  QCheck_base_runner.run_tests ~colors:false ~debug_shrink:(Some (open_out "funshrinklog.txt")) ~debug_shrink_list:["fold_left fold_right uncurried fun last"] ([

After rerunning the tests, this revealed the following in _build/default/test/core/funshrinklog.txt:


~~~ Shrink ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Test fold_left fold_right uncurried fun last sucessfully shrunk counter example (step 0) to:

(5, [7; 8; 9; 24], {(8, 80) -> 0; (7, 24) -> 7; (8, 9) -> 7; (7, 0) -> 3; (5, 7) -> 8; (24, 5) -> 1; (9, 1) -> 80; (8, 8) -> 8; _ -> 25})

~~~ Shrink ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Test fold_left fold_right uncurried fun last sucessfully shrunk counter example (step 1) to:

(0, [7; 8; 9; 24], {(8, 80) -> 0; (93, 9) -> 0; (7, 24) -> 7; (9, 7) -> 4; (0, 24) -> 1; (0, 7) -> 5; (7, 1) -> 8; (8, 9) -> 7; (24, 0) -> 7; (5, 8) -> 93; (7, 0) -> 3; (5, 7) -> 8; (8, 4) -> 1; (24, 5) -> 1; (9, 1) -> 80; (8, 8) -> 8; _ -> 25})

~~~ Shrink ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Test fold_left fold_right uncurried fun last sucessfully shrunk counter example (step 2) to:

(0, [2; 8], {(0, 2) -> 4; (8, 80) -> 0; (93, 9) -> 0; (7, 24) -> 7; (8, 0) -> 8; (9, 7) -> 4; (0, 24) -> 1; (0, 7) -> 5; (7, 1) -> 8; (8, 9) -> 7; (24, 0) -> 7; (5, 8) -> 93; (4, 8) -> 18; (7, 0) -> 3; (5, 7) -> 8; (8, 4) -> 1; (24, 5) -> 1; (2, 8) -> 1; (9, 1) -> 80; (8, 8) -> 8; _ -> 25})

~~~ Shrink ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Test fold_left fold_right uncurried fun last sucessfully shrunk counter example (step 3) to:

(0, [1], {(0, 2) -> 4; (8, 80) -> 0; (93, 9) -> 0; (7, 24) -> 7; (8, 0) -> 8; (9, 7) -> 4; (0, 24) -> 1; (0, 7) -> 5; (7, 1) -> 8; (8, 9) -> 7; (24, 0) -> 7; (5, 8) -> 93; (1, 0) -> 6; (4, 8) -> 18; (7, 0) -> 3; (5, 7) -> 8; (8, 4) -> 1; (24, 5) -> 1; (0, 1) -> 0; (2, 8) -> 1; (9, 1) -> 80; (8, 8) -> 8; _ -> 25})

~~~ Shrink ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Test fold_left fold_right uncurried fun last sucessfully shrunk counter example (step 4) to:

(0, [1], {(0, 2) -> 4; (8, 80) -> 0; (93, 9) -> 0; (7, 24) -> 7; (8, 0) -> 8; (9, 7) -> 4; (0, 24) -> 1; (0, 7) -> 5; (7, 1) -> 8; (8, 9) -> 7; (24, 0) -> 7; (5, 8) -> 93; (1, 0) -> 6; (4, 8) -> 18; (7, 0) -> 3; (5, 7) -> 8; (0, 0) -> 3; (8, 4) -> 1; (24, 5) -> 1; (0, 1) -> 0; (2, 8) -> 1; (9, 1) -> 80; (8, 8) -> 8; _ -> 0})

[...]

This looks fishy to me: while the first elements of the tuple are reduced, the last one (the function table) actually gets longer in each attempt!
I think this is actually due to how Gen.triple is currently defined (and thus shrinks). I'll therefore add some tests for it and report separately.

Originally posted by @jmid in #153 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant