You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
One nice property of a QuickCheck setup is the ability to reproduce a failing test's input when providing the same seed.
We were just bitten by a situation where this doesn't hold under special circumstances that involve
a function generator
a non-deterministic property
When started from the same seed, a test sequence generating 10 triples may differ (note the difference in the last 4 triples):
openQCheck(* [apply_n f x n] computes f(f(...f(x))) with n applications of f *)letrec apply_nfxn=if n=0then x
else apply_n f (f x) (pred n)
let t =Test.make ~count:10~name:"t1"
(set_shrink Shrink.nil (triple small_int (fun1 Observable.int small_int) small_int))
(fun (i,f,j) ->
Printf.printf "(%i,fun,%i) %!" i j;
if1=Float.to_int (Unix.time ()) mod2then
(ignore(apply_n (Fn.apply f) i j >0); true)
else
(ignore(apply_n (Fn.apply f) i i >0); true)
)
let sleep_time =0.175;;
QCheck_runner.run_tests ~rand:(Random.State.make [|1234|]) [t];;
Unix.sleepf sleep_time;;
QCheck_runner.run_tests ~rand:(Random.State.make [|1234|]) [t];;
Unix.sleepf sleep_time;;
QCheck_runner.run_tests ~rand:(Random.State.make [|1234|]) [t];;
Unix.sleepf sleep_time;;
QCheck_runner.run_tests ~rand:(Random.State.make [|1234|]) [t];;
Unix.sleepf sleep_time;;
QCheck_runner.run_tests ~rand:(Random.State.make [|1234|]) [t];;
This (admittedly ridiculous) test generates a function and two small_ints, then tests a vacuously true property, that happens to perform one function application or another, depending on Unix.time.
As another side-effect in prints the generated triple - for our viewing pleasure.
When I simply #use the file repeatedly from utop I'm able to observe changes in the generated triples, despite starting the test from fresh RNGs seeded identically:
val apply_n : ('a -> 'a) -> 'a -> int -> 'a = <fun>
val t : Test.t = QCheck2.Test.Test <abstr>
val sleep_time : float = 0.175
(7,fun,50) (60,fun,9) (4,fun,8) (8,fun,5) (8,fun,7) (1,fun,7) (5,fun,0) (8,fun,1) (0,fun,1) (4,fun,6) ================================================================================
success (ran 1 tests)
- : int = 0
- : unit = ()
(7,fun,50) (60,fun,9) (4,fun,8) (8,fun,5) (8,fun,7) (1,fun,7) (5,fun,0) (8,fun,1) (0,fun,1) (4,fun,6) ================================================================================
success (ran 1 tests)
- : int = 0
- : unit = ()
(7,fun,50) (60,fun,9) (4,fun,8) (8,fun,5) (8,fun,7) (1,fun,7) (5,fun,0) (8,fun,1) (0,fun,1) (4,fun,6) ================================================================================
success (ran 1 tests)
- : int = 0
- : unit = ()
(7,fun,50) (60,fun,9) (4,fun,8) (8,fun,5) (8,fun,7) (1,fun,7) (4,fun,1) (2,fun,8) (3,fun,0) (0,fun,4) ================================================================================
success (ran 1 tests)
- : int = 0
- : unit = ()
(7,fun,50) (60,fun,9) (4,fun,8) (8,fun,5) (8,fun,7) (1,fun,7) (4,fun,1) (2,fun,8) (3,fun,0) (0,fun,4) ================================================================================
success (ran 1 tests)
In retrospect, given how we generate functions (lazily inhabiting an underlying function table by generation, IIUC), this is probably not too surprising.
Nevertheless, by doing so, we lose a key property of QuickCheck, namely that the test value generation is independent of the usage.
For testing multicore programs, where behavior can be non-deterministic depending on scheduling, we depend and build on that property and would therefore like to see it restored.
I have an initial, minimal fix at https://github.com/jmid/qcheck/tree/splittable-qcheck1
but this makes the test fail_pred_strings in the test suite succeed (it was supposed to fail).
We probably want to use a splittable PRNG to restore independence.
The text was updated successfully, but these errors were encountered:
jmid
added a commit
to jmid/qcheck
that referenced
this issue
Apr 19, 2022
One nice property of a QuickCheck setup is the ability to reproduce a failing test's input when providing the same seed.
We were just bitten by a situation where this doesn't hold under special circumstances that involve
When started from the same seed, a test sequence generating 10 triples may differ (note the difference in the last 4 triples):
Here is a minimal repro:
This (admittedly ridiculous) test generates a function and two
small_int
s, then tests a vacuouslytrue
property, that happens to perform one function application or another, depending onUnix.time
.As another side-effect in prints the generated triple - for our viewing pleasure.
When I simply
#use
the file repeatedly fromutop
I'm able to observe changes in the generated triples, despite starting the test from fresh RNGs seeded identically:In retrospect, given how we generate functions (lazily inhabiting an underlying function table by generation, IIUC), this is probably not too surprising.
Nevertheless, by doing so, we lose a key property of QuickCheck, namely that the test value generation is independent of the usage.
For testing multicore programs, where behavior can be non-deterministic depending on scheduling, we depend and build on that property and would therefore like to see it restored.
I have an initial, minimal fix at https://github.com/jmid/qcheck/tree/splittable-qcheck1
but this makes the test
fail_pred_strings
in the test suite succeed (it was supposed to fail).We probably want to use a splittable PRNG to restore independence.
The text was updated successfully, but these errors were encountered: