Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial tests of the Weak module #214

Merged
merged 14 commits into from Jan 31, 2023
Merged

Initial tests of the Weak module #214

merged 14 commits into from Jan 31, 2023

Conversation

jmid
Copy link
Collaborator

@jmid jmid commented Nov 23, 2022

I realized we didn't have tests of Weak yet, so I played last night with writing some.
Because the GC may kick in, the interface is generally not sequentially consistent (nor is it intended to be safe to use in parallel).

Nevertheless, this was producing strange, out-of-thin-air(?) values:

Messages for test Lin DSL Weak test with Domain:

  Results incompatible with sequential execution

                                               |                    
                                Weak.set t 9 Some (9) : Ok (())     
                                               |                    
                         .------------------------------------------.
                         |                                          |                    
            Weak.set t 9 None : Ok (())         Weak.get t 9 : Ok (Some (93946051950632))

@jmid jmid changed the title Initial tests of the low-level Weak interface Initial tests of the Weak module Nov 24, 2022
@jmid
Copy link
Collaborator Author

jmid commented Dec 13, 2022

Rebased on main and simplified src/weak/dune-rules to (test ...)s.
Curious to see if the new debug mode can help to reproduce ocaml/ocaml#11800 since the failing test case originates from this PR.

@jmid
Copy link
Collaborator Author

jmid commented Dec 21, 2022

Rebased on latest main

@jmid
Copy link
Collaborator Author

jmid commented Jan 13, 2023

In an attempt to get these into a mergable state,

  • I updated the Lin Hashset tests to allow exception throwing since most of these functions can raise Invalid_argument when used in parallel
  • I wrote STM tests of Weak Hashsets. Here I managed to extend @OlivierNicole's extensible ty type with a Tup6 constructor - and as a side-benefit figure out how we can add tuple combinators to STM 🙂

Locally both of these now succeed. I'm hoping this is also the case for the CI targets.

Finally we don't need to run both Lin and STM tests in the CI. Here I'm leaning towards the STM ones - as they test stricter properties. This means the Lin tests should be disabled before merging (assuming CI actually is green... 🤞)

@jmid
Copy link
Collaborator Author

jmid commented Jan 13, 2023

Ouch, STM tests trigger a BUS error on the Weak Hashset tests on macOS 5.0.0
https://github.com/ocaml-multicore/multicoretests/actions/runs/3913104092/jobs/6688588995

random seed: 470692489
generated error fail pass / total     time test name

[ ]    0    0    0    0 / 1000     0.0s STM Weak test sequential
[ ]    0    0    0    0 / 1000     0.0s STM Weak test sequential (generating)
[✓] 1000    0    0 1000 / 1000     0.3s STM Weak test sequential

[ ]    0    0    0    0 / 1000     0.0s STM Weak test parallel
[ ]   53    0    0   53 / 1000    63.2s STM Weak test parallel
[ ]  141    0    0  141 / 1000   124.4s STM Weak test parallel
[ ]  196    0    0  196 / 1000   186.8s STM Weak test parallel
[ ]  271    0    0  271 / 1000   252.0s STM Weak test parallel
[ ]  375    0    0  375 / 1000   315.7s STM Weak test parallel
[✓]  383    0    1  382 / 1000   336.4s STM Weak test parallel

[ ]    0    0    0    0 / 1000     0.0s STM Weak HashSet test sequential
[✓] 1000    0    0 1000 / 1000     0.4s STM Weak HashSet test sequential

[ ]    0    0    0    0 / 1000     0.0s STM Weak HashSet test parallel
[ ]   51    0    0   51 / 1000    38.9s STM Weak HashSet test parallel (shrinking:   19.0052)
[ ]   51    0    0   51 / 1000    98.9s STM Weak HashSet test parallel (shrinking:   23.0034)
[ ]   51    0    0   51 / 1000   159.0s STM Weak HashSet test parallel (shrinking:   25.0124)
[ ]   51    0    0   51 / 1000   219.1s STM Weak HashSet test parallel (shrinking:   27.0081)
[ ]   51    0    0   51 / 1000   279.1s STM Weak HashSet test parallel (shrinking:   29.0113)
[ ]   51    0    0   51 / 1000   339.2s STM Weak HashSet test parallel (shrinking:   32.0109)
[ ]   51    0    0   51 / 1000   399.3s STM Weak HashSet test parallel (shrinking:   35.0062)
[ ]   51    0    0   51 / 1000   459.3s STM Weak HashSet test parallel (shrinking:   38.0082)
[ ]   51    0    0   51 / 1000   519.5s STM Weak HashSet test parallel (shrinking:   41.0050)
File "src/weak/dune", line 4, characters 7-16:
4 |  (name stm_tests)
           ^^^^^^^^^
(cd _build/default/src/weak && ./stm_tests.exe --verbose)
Command got signal BUS.
[ ]   51    0    0   51 / 1000   579.6s STM Weak HashSet test parallel (shrinking:   44.0023)

@jmid
Copy link
Collaborator Author

jmid commented Jan 13, 2023

STM tests updated to reflect that find_all on Hashsets also may raise Invalid_argument

@jmid
Copy link
Collaborator Author

jmid commented Jan 16, 2023

The previous CI runs had time outs due to excessive shrinking.
7ea00cb therefore adjusts the string argument shrinker to only reduce the string length (not the individual chars).
Even that may be too much, as Weak Hashsets depend on the GC which the tests have no control over.
To further reduce shrinking time, this is rebased on the stm-reduce-retries branch from #282.

@jmid
Copy link
Collaborator Author

jmid commented Jan 17, 2023

Since Weak depends on the GC (which is out of our hands) the latest commit splits the test files in two, in an attempt to minimize the effects of a longer-running execution (with more GC calls).

@jmid
Copy link
Collaborator Author

jmid commented Jan 20, 2023

Focusing on the lower-level API didn't trigger any crashes.
It timed out due to excessively long runs. The winner was on a Linux bytecode trunk run
which took 325min - or almost 5.5 hours 😬
https://github.com/ocaml-multicore/multicoretests/actions/runs/3957933654/jobs/6778887520

Starting 2-th run

random seed: 429518331
generated error fail pass / total     time test name

[ ]    0    0    0    0 / 1000     0.0s STM Weak test sequential
[ ]    0    0    0    0 / 1000     0.0s STM Weak test sequential (generating)
[✓] 1000    0    0 1000 / 1000     2.1s STM Weak test sequential

[ ]    0    0    0    0 / 1000     0.0s STM Weak test parallel
[ ]    8    0    0    8 / 1000    84.8s STM Weak test parallel
[ ]   21    0    0   21 / 1000   146.4s STM Weak test parallel
[ ]   56    0    0   56 / 1000   208.0s STM Weak test parallel
[ ]   78    0    0   78 / 1000   268.7s STM Weak test parallel
[ ]   82    0    0   82 / 1000   358.6s STM Weak test parallel (shrinking:    0)
[ ]   82    0    0   82 / 1000  1410.1s STM Weak test parallel (shrinking:    0.0002)
[ ]   82    0    0   82 / 1000  2457.6s STM Weak test parallel (shrinking:    0.0003)
[ ]   82    0    0   82 / 1000  3505.2s STM Weak test parallel (shrinking:    0.0004)
[ ]   82    0    0   82 / 1000  3843.3s STM Weak test parallel (shrinking:    1)
[ ]   82    0    0   82 / 1000  4894.7s STM Weak test parallel (shrinking:    1.0002)
[ ]   82    0    0   82 / 1000  5141.1s STM Weak test parallel (shrinking:    2)
[ ]   82    0    0   82 / 1000  6192.7s STM Weak test parallel (shrinking:    2.0002)
[ ]   82    0    0   82 / 1000  6551.5s STM Weak test parallel (shrinking:    3)
[ ]   82    0    0   82 / 1000  7603.1s STM Weak test parallel (shrinking:    3.0002)
[ ]   82    0    0   82 / 1000  7786.8s STM Weak test parallel (shrinking:    4)
[ ]   82    0    0   82 / 1000  8838.3s STM Weak test parallel (shrinking:    4.0002)
[ ]   82    0    0   82 / 1000  9885.8s STM Weak test parallel (shrinking:    4.0003)
[ ]   82    0    0   82 / 1000 10933.1s STM Weak test parallel (shrinking:    4.0004)
[ ]   82    0    0   82 / 1000 11980.6s STM Weak test parallel (shrinking:    4.0005)
[ ]   82    0    0   82 / 1000 12623.3s STM Weak test parallel (shrinking:    5)
[ ]   82    0    0   82 / 1000 13675.0s STM Weak test parallel (shrinking:    5.0002)
[ ]   82    0    0   82 / 1000 14722.3s STM Weak test parallel (shrinking:    5.0003)
[ ]   82    0    0   82 / 1000 14926.9s STM Weak test parallel (shrinking:    6)
[ ]   82    0    0   82 / 1000 15978.6s STM Weak test parallel (shrinking:    6.0002)
[ ]   82    0    0   82 / 1000 17026.1s STM Weak test parallel (shrinking:    6.0003)
[ ]   82    0    0   82 / 1000 17163.9s STM Weak test parallel (shrinking:    7)
[ ]   82    0    0   82 / 1000 18215.4s STM Weak test parallel (shrinking:    7.0002)
[ ]   82    0    0   82 / 1000 18290.6s STM Weak test parallel (shrinking:    8)
[ ]   82    0    0   82 / 1000 19341.9s STM Weak test parallel (shrinking:    8.0002)
[ ]   82    0    0   82 / 1000 19521.4s STM Weak test parallel (shrinking:    9)
Error: The operation was canceled.

This indicates that the crashing is caused by the higher-level Weak Hashset API.
Since Weak.t values live in nested arrays there, they probably live longer - which may explain why the timing improvements from ocaml/ocaml#11743 make them less likely (or eliminate them).

@jmid
Copy link
Collaborator Author

jmid commented Jan 30, 2023

The parallel STM tests are not consistently triggering a failure.
I'm therefore trying a combination of sequential STM tests + parallel Lin tests.
(also rebased on latest main)

@jmid
Copy link
Collaborator Author

jmid commented Jan 31, 2023

OK, this seems to work reasonably. There were 2 Windows threadomain failures:

  • one trunk bytecode crash and
  • one 5.0.0 timeout

I'll clean up the commits, rerun the CI and then hope to merge 🤞

@jmid
Copy link
Collaborator Author

jmid commented Jan 31, 2023

CI is happy Weak-test wise, so I'll merge.

The CI is however failing with the following unrelated errors:

  • a parallel Dynlink Windows 5.0.0 crash
  • 2 threadomain timeouts on Windows trunk native+bytecode
  • a macOS trunk timeout on Out_channel tests

@jmid jmid merged commit 63a1752 into main Jan 31, 2023
@jmid jmid deleted the add-weak-tests branch January 31, 2023 15:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant