Weak test revisions #365

jmid · 2023-06-09T16:33:28Z

This PR revises our Weak tests...

The Weak module depends on the state of the GC.
As such, we have to be extra careful to start from "as clean a starting point as possible".
Merely having run the sequential tests pollutes the heap state, so it is better to run a Gc.full_major () initially and a cheaper Gc.minor () in between attempts.
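
The GC discipline described above could be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `repeat_from_clean_heap` and its `prop` argument are hypothetical names, and only `Gc.full_major` and `Gc.minor` come from the standard library.

```ocaml
(* Hypothetical helper (not from the PR): run [prop] on [input] up to
   [attempts] times, stopping at the first failure. *)
let repeat_from_clean_heap ~attempts prop input =
  (* One expensive full major collection up front, so that garbage left
     behind by earlier tests is less likely to influence this one. *)
  Gc.full_major ();
  let rec loop n =
    if n <= 0 then true
    else begin
      (* A cheaper minor collection before each attempt, which also
         separates consecutive attempts from one another. *)
      Gc.minor ();
      prop input && loop (n - 1)
    end
  in
  loop attempts
```

The full major collection is paid once per test, whereas the minor collection is cheap enough to repeat for every attempt.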

Secondly, the PR switches away from strings to easier-to-shrink int64s.
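
To illustrate why an int64 is easy to shrink, here is a hand-rolled sketch of a halving shrinker. It is an assumption for illustration only: `shrink_int64` is a hypothetical name, and the tests in this PR rely on the shrinkers provided by the testing library rather than on this function.

```ocaml
(* Hypothetical shrinker (not from the PR): candidate replacements for [n],
   starting at 0 and then moving back toward [n] in halving steps. *)
let shrink_int64 (n : int64) : int64 list =
  let rec go acc d =
    (* [d] is the distance we step back from [n] toward 0. *)
    if Int64.equal d 0L then List.rev acc
    else go (Int64.sub n d :: acc) (Int64.div d 2L)
  in
  go [] n
```

A number has a single obvious direction to shrink in (toward 0), whereas a string can shrink both in length and in each of its characters, which gives the shrinker many more candidates to try.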

Finally, I think running Lin tests of Weak is misguided, as I don't think it makes sense to perform a fragile GC-dependent parallel run and compare its outputs against more fragile GC-dependent sequential runs.
Parallel Weak STM tests, however, work well as stress tests: we previously caught run-time bugs that way (e.g., #181).

Hopefully this will solve #299.

I can see a 5.2 s390x run even completed within the time limit with this: 🎉
https://ocaml-multicoretests.ci.dev:8100/job/2023-06-09/150708-ci-ocluster-build-44e4ee

shym

This looks very good! 😄

The only thing I would have done differently is the support for 5.0. I’ve used this in the past: shym/ortac@cecf0c5 which has the nice side-effect that it doesn’t override the standard module when not needed. But I’m not sure this makes such a difference.

jmid · 2023-06-09T17:00:41Z

Good point about overriding 👍
Actually, now that you mention it, I realize by reordering the include I would use the 5.1 and 5.2 bindings when they are available. I've previously used that trick in QCheck but forgotten about it: 😅
https://github.com/c-cube/qcheck/blob/9424d94c8fa682ef90ad8bf2d487c852d8c4f38d/src/core/QCheck.ml#L12-L19
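
A minimal sketch of the reordering idea, under stated assumptions: the fallback here is written with `Hashtbl.hash` purely for illustration, whereas the diff in this PR binds the `caml_hash` primitive directly. The warning attribute assumes a compiler that accepts warning mnemonics.

```ocaml
module Int64 = struct
  (* The fallback below goes unused whenever the include shadows it, so the
     warning is silenced for this module only. *)
  [@@@warning "-unused-value-declaration"]

  (* Illustrative fallback for compilers whose Int64 lacks [hash]. *)
  let hash (x : int64) : int = Hashtbl.hash x

  (* Included last: when Stdlib.Int64 provides [hash], that binding shadows
     the fallback above; otherwise the fallback remains visible. *)
  include Stdlib.Int64
end
```

Because the include comes after the fallback, callers of `Int64.hash` get the standard binding whenever it exists, and the rest of `Int64` stays available unchanged.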

shym · 2023-06-09T18:01:04Z

Nice trick 👍

jmid · 2023-06-12T07:58:34Z

The shadowing is handled in 9b85211.
I've also gone for removing the Lin tests entirely as I don't see us needing them in the future.
We may have to disable the parallel src/weak/stm_tests_hashset test on Windows bytecode where it doesn't seem to consistently trigger.

jmid · 2023-06-12T08:40:02Z

CI summary for 720028c

- a failure to trigger a parallel weak/stm_tests_hashset on Windows bytecode 5.1.0~alpha2
- a timeout due to excessive Lin Ephemeron shrinking on Cygwin part 1 5.1.0~alpha2
- a timeout due to excessive Lin Out_channel shrinking on Windows bytecode trunk 5.2
- a timeout due to threadomain livelock on Windows 5.1.0~alpha2 and on Windows trunk 5.2 ([ocaml5-issue] Windows failures on threadomain #203)
- a timeout on domain_spawntree with Atomic on Windows trunk 5.2 ([ocaml5-issue] Windows trunk bytecode domain_spawntree crash or deadlock #354)
- a timeout due to long-running int ref STM and Thread tests (several 1500+ secs) on Cygwin part 2 trunk 5.2
- a failure to trigger a Lin Out_channel test on Linux arm64 5.1 (Reliability of parallel Lin Out_channel tests #360)

Modulo the first one, none of this is related to this PR.

shym · 2023-06-12T08:46:58Z

Small suggestion for the last commit: rather than disabling the warning globally, it can be disabled only where it might trigger, which makes it easier to understand why it is required.

@@ -41,6 +41,7 @@ struct
 
   module Int64 =
   struct
+    [@@@warning "-unused-value-declaration"]
     (* support Int64.hash added in 5.1 *)
     external seeded_hash_param :
       int -> int -> int -> 'a -> int = "caml_hash" [@@noalloc]

jmid · 2023-06-12T21:13:13Z

CI summary of latest run

- 2 Windows bytecode runs on 5.1.0~alpha2 and 5.2 trunk timed out due to threadomain ([ocaml5-issue] Windows failures on threadomain #203)
- 2 Cygwin part 1 runs (1 on 5.1.0~alpha2 and 1 trunk) were cancelled before running 🤔 potentially getting confused by the midway push...

jmid · 2023-06-12T21:15:31Z

Small suggestion for the last commit: [...]

Fair enough. No need to silence more than necessary. Should be addressed in f3d0160 👍

jmid · 2023-06-13T15:12:16Z

CI summary for the latest run:

- 1 Windows bytecode 5.1.0~alpha2 threadomain timeout and 1 crash, with a new variant ([ocaml5-issue] Windows failures on threadomain #203)
- 1 Windows bytecode trunk 5.2 unexpected Thread Queue counterexample (Lin Thread Queue test reliability #363)
- 2 timeouts on Cygwin part 1 5.1 and 1 Linux s390x 5.1 due to several (primarily Lin) tests taking 1000+ seconds
- 1 macOS arm64 5.2 failure to trigger the STM int ref test parallel asymmetric ('STM _ ref test parallel asymmetric' failure to trigger #364)

I'll go ahead and merge.

jmid added 6 commits June 9, 2023 15:25

Revise src/weak/stm_tests_hashset.ml

066f6a3

- to use easier to shrink int64s
- doing a full major GC to have a clean starting point
- doing a minor GC in between attempts
- increasing the frequency of cmds that make us observe the state

Disable meaningless src/weak/lin_tests_dsl_hashset.ml Lin test

5d11dba

Similarly revise src/weak/stm_tests.ml

4a54073

- also adding minor and major GC calls for reproducibility
- increasing the frequency of state observing cmds
- adding a simple cmd shrinker

Disable src/weak/lin_tests_dsl too

b150273

Increase negative test count

9bd79ac

Support Weak Hashset test on OCaml5 too

720028c

shym approved these changes Jun 9, 2023

jmid added 2 commits June 12, 2023 09:21

Shadow Int64.hash work-around for OCaml 5.1+

9b85211

Remove Weak Lin tests

e24cf47

Disable unused hash error on OCaml 5.1+

f3d0160

jmid force-pushed the weak-hashset-predictable branch from d45824f to f3d0160 June 12, 2023 21:13

jmid merged commit c08b00a into main Jun 13, 2023

jmid deleted the weak-hashset-predictable branch June 13, 2023 15:12

jmid linked an issue Jun 13, 2023 that may be closed by this pull request

Lin HashSet tests shrinking taking very long #299

Closed

jmid mentioned this pull request Jun 13, 2023

Ephemeron test revision #367

Merged

jmid mentioned this pull request Jun 21, 2023

Switch STM_domain.agree_prop_par_asym to use Atomics #368

Merged

shym mentioned this pull request Jun 21, 2023

Add the possibility to compute statistics #369

Draft
