Generator reproducability with function generators #236

jmid · 2022-04-08T10:23:22Z

One nice property of a QuickCheck setup is the ability to reproduce a failing test's input when providing the same seed.
We were just bitten by a situation where this doesn't hold under special circumstances that involve

a function generator
a non-deterministic property

When started from the same seed, a test sequence generating 10 triples may differ (note the difference in the last 4 triples):

(7,fun,50) (60,fun,9) (4,fun,8) (8,fun,5) (8,fun,7) (1,fun,7) (5,fun,0) (8,fun,1) (0,fun,1) (4,fun,6)
or
(7,fun,50) (60,fun,9) (4,fun,8) (8,fun,5) (8,fun,7) (1,fun,7) (4,fun,1) (2,fun,8) (3,fun,0) (0,fun,4)

Here is a minimal repro:

open QCheck

(* [apply_n f x n] computes f(f(...f(x))) with n applications of f *)
let rec apply_n f x n =
  if n=0
  then x
  else apply_n f (f x) (pred n)

let t =
  Test.make ~count:10 ~name:"t1"
    (set_shrink Shrink.nil (triple small_int (fun1 Observable.int small_int) small_int))
    (fun (i,f,j) ->
       Printf.printf "(%i,fun,%i) %!" i j;
       if 1 = Float.to_int (Unix.time ()) mod 2
       then
         (ignore(apply_n (Fn.apply f) i j > 0); true)
       else
         (ignore(apply_n (Fn.apply f) i i > 0); true)
    )

let sleep_time = 0.175;;
QCheck_runner.run_tests ~rand:(Random.State.make [|1234|]) [t];;
Unix.sleepf sleep_time;;
QCheck_runner.run_tests ~rand:(Random.State.make [|1234|]) [t];;
Unix.sleepf sleep_time;;
QCheck_runner.run_tests ~rand:(Random.State.make [|1234|]) [t];;
Unix.sleepf sleep_time;;
QCheck_runner.run_tests ~rand:(Random.State.make [|1234|]) [t];;
Unix.sleepf sleep_time;;
QCheck_runner.run_tests ~rand:(Random.State.make [|1234|]) [t];;

This (admittedly ridiculous) test generates a function and two small_ints, then tests a vacuously true property, that happens to perform one function application or another, depending on Unix.time.
As another side-effect in prints the generated triple - for our viewing pleasure.

When I simply #use the file repeatedly from utop I'm able to observe changes in the generated triples, despite starting the test from fresh RNGs seeded identically:

val apply_n : ('a -> 'a) -> 'a -> int -> 'a = <fun>
val t : Test.t = QCheck2.Test.Test <abstr>
val sleep_time : float = 0.175
(7,fun,50) (60,fun,9) (4,fun,8) (8,fun,5) (8,fun,7) (1,fun,7) (5,fun,0) (8,fun,1) (0,fun,1) (4,fun,6) ================================================================================
success (ran 1 tests)
- : int = 0
- : unit = ()
(7,fun,50) (60,fun,9) (4,fun,8) (8,fun,5) (8,fun,7) (1,fun,7) (5,fun,0) (8,fun,1) (0,fun,1) (4,fun,6) ================================================================================
success (ran 1 tests)
- : int = 0
- : unit = ()
(7,fun,50) (60,fun,9) (4,fun,8) (8,fun,5) (8,fun,7) (1,fun,7) (5,fun,0) (8,fun,1) (0,fun,1) (4,fun,6) ================================================================================
success (ran 1 tests)
- : int = 0
- : unit = ()
(7,fun,50) (60,fun,9) (4,fun,8) (8,fun,5) (8,fun,7) (1,fun,7) (4,fun,1) (2,fun,8) (3,fun,0) (0,fun,4) ================================================================================
success (ran 1 tests)
- : int = 0
- : unit = ()
(7,fun,50) (60,fun,9) (4,fun,8) (8,fun,5) (8,fun,7) (1,fun,7) (4,fun,1) (2,fun,8) (3,fun,0) (0,fun,4) ================================================================================
success (ran 1 tests)

In retrospect, given how we generate functions (lazily inhabiting an underlying function table by generation, IIUC), this is probably not too surprising.

Nevertheless, by doing so, we lose a key property of QuickCheck, namely that the test value generation is independent of the usage.
For testing multicore programs, where behavior can be non-deterministic depending on scheduling, we depend and build on that property and would therefore like to see it restored.

I have an initial, minimal fix at https://github.com/jmid/qcheck/tree/splittable-qcheck1
but this makes the test fail_pred_strings in the test suite succeed (it was supposed to fail).
We probably want to use a splittable PRNG to restore independence.

The text was updated successfully, but these errors were encountered:

jmid added a commit to jmid/qcheck that referenced this issue Apr 19, 2022

add test for issue c-cube#236

5d7fdae

This was referenced Apr 19, 2022

Fix function generation reproducability #240

Merged

Switch to QCheck w/splitting fun generator ocaml-multicore/multicoretests#48

Closed

jmid closed this as completed in #240 Apr 20, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Generator reproducability with function generators #236

Generator reproducability with function generators #236

jmid commented Apr 8, 2022

Generator reproducability with function generators #236

Generator reproducability with function generators #236

Comments

jmid commented Apr 8, 2022