Lin.Internal Thread int CList test finds unexpected counterexample under bytecode #358

Closed
jmid opened this issue Jun 7, 2023 · 4 comments · Fixed by #435
Labels
test suite reliability Issue concerns tests that should behave more predictably

Comments

@jmid
Collaborator

jmid commented Jun 7, 2023

The test src/neg_tests/lin_tests_thread_conclist.ml, which uses Lin's Thread mode to exercise int CList, is expected not to trigger any issues.

However, under bytecode I have now observed more than once that it does find a counterexample.

Here's a run on Windows bytecode trunk from 2 days ago:
https://github.com/ocaml-multicore/multicoretests/actions/runs/5173787112/jobs/9319394100

random seed: 124760788
generated error fail pass / total     time test name

[ ]    0    0    0    0 / 1000     0.0s Lin CList int test with Thread
[ ]    0    0    0    0 / 1000     0.0s Lin CList int test with Thread (generating)
[ ]  269    0    0  269 / 1000    60.3s Lin CList int test with Thread (shrinking:    0.0073)
[✗]  270    0    1  269 / 1000   117.1s Lin CList int test with Thread

--- Failure --------------------------------------------------------------------

[...]

Messages for test Lin CList int test with Thread:

  Results incompatible with sequential execution

                                       |                   
                         Add_node (97) : RAdd_node true    
                           Member (2) : RMember false      
                          Member (79) : RMember false      
                         Add_node (88) : RAdd_node true    
                        Add_node (578) : RAdd_node true    
                          Member (665) : RMember false     
                         Add_node (54) : RAdd_node true    
                          Member (61) : RMember false      
                           Member (3) : RMember false      
                         Add_node (47) : RAdd_node true    
                           Member (97) : RMember true      
                         Add_node (21) : RAdd_node true    
                         Add_node (8) : RAdd_node true     
                          Member (429) : RMember false     
                         Add_node (5) : RAdd_node true     
                         Add_node (47) : RAdd_node true    
                         Add_node (8) : RAdd_node true     
                                       |                   
                    .------------------------------------.
                    |                                    |                   
        Member (6) : RMember false           Member (4) : RMember false      
        Member (6) : RMember false         Add_node (0) : RAdd_node true     
     Add_node (637) : RAdd_node true       Add_node (58) : RAdd_node true    
     Add_node (1312) : RAdd_node true       Member (560) : RMember false     
      Add_node (0) : RAdd_node true        Add_node (3) : RAdd_node true     
      Add_node (89) : RAdd_node true       Add_node (1) : RAdd_node true     
       Member (281) : RMember false          Member (6) : RMember false      
       Member (227) : RMember false          Member (6) : RMember false      
      Add_node (88) : RAdd_node true       Add_node (9) : RAdd_node true     
      Add_node (8) : RAdd_node true          Member (1) : RMember false      
      Add_node (1) : RAdd_node true          Member (6) : RMember false      

================================================================================
failure (1 tests failed, 0 tests errored, ran 1 tests)
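
For readers unfamiliar with the message "Results incompatible with sequential execution", here is a minimal sketch of the idea behind it: replay the sequential prefix against a sequential model, then search for some interleaving of the two parallel traces (each kept in program order) that the model can reproduce. This is not Lin's actual checker. The model (a set that only grows, with Add_node always answering true), the empty initial state, and the names run_model, replay, interleaves and linearizable are all assumptions made for illustration. The data is transcribed from the report above.

```ocaml
module IntSet = Set.Make (Int)

type cmd = Add_node of int | Member of int
type step = cmd * bool (* a command and the result the test observed *)

(* Assumed sequential model: a set that only grows. Add_node always
   answers true, Member answers whether the value is present. *)
let run_model state = function
  | Add_node x -> (IntSet.add x state, true)
  | Member x -> (state, IntSet.mem x state)

(* Replay a sequential trace; None if an observed result disagrees. *)
let rec replay state = function
  | [] -> Some state
  | (cmd, observed) :: rest ->
    let state', expected = run_model state cmd in
    if expected = observed then replay state' rest else None

(* Is there an interleaving of both traces, each in program order,
   whose results the model reproduces? Plain backtracking search. *)
let rec interleaves state left right =
  let try_first trace other =
    match trace with
    | [] -> false
    | (cmd, observed) :: rest ->
      let state', expected = run_model state cmd in
      expected = observed && interleaves state' rest other
  in
  match left, right with
  | [], [] -> true
  | _ -> try_first left right || try_first right left

let linearizable ~prefix ~left ~right =
  match replay IntSet.empty prefix with
  | None -> false
  | Some state -> interleaves state left right

(* Shorthand for transcribing the report. *)
let add x = (Add_node x, true)
let mem x r = (Member x, r)

let prefix =
  [ add 97; mem 2 false; mem 79 false; add 88; add 578; mem 665 false;
    add 54; mem 61 false; mem 3 false; add 47; mem 97 true; add 21;
    add 8; mem 429 false; add 5; add 47; add 8 ]

let left =
  [ mem 6 false; mem 6 false; add 637; add 1312; add 0; add 89;
    mem 281 false; mem 227 false; add 88; add 8; add 1 ]

let right =
  [ mem 4 false; add 0; add 58; mem 560 false; add 3; add 1;
    mem 6 false; mem 6 false; add 9; mem 1 false; mem 6 false ]

let () =
  Printf.printf "linearizable: %b\n" (linearizable ~prefix ~left ~right)
(* prints: linearizable: false *)
```

Under these assumptions no interleaving works, because the right-hand thread sees Member (1) answer false after its own Add_node (1) has completed.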

Here's another from last week on Linux bytecode trunk:
https://github.com/ocaml-multicore/multicoretests/actions/runs/5131082066/jobs/9230704532

random seed: 124224253
generated error fail pass / total     time test name

[ ]    0    0    0    0 / 1000     0.0s Lin CList int test with Thread
[ ]    0    0    0    0 / 1000     0.0s Lin CList int test with Thread (generating)
[ ]  224    0    0  224 / 1000    60.0s Lin CList int test with Thread (shrinking:    0.0032)
[ ]  224    0    0  224 / 1000   120.1s Lin CList int test with Thread (shrinking:    0.0112)
[✗]  225    0    1  224 / 1000   170.2s Lin CList int test with Thread

--- Failure --------------------------------------------------------------------

[...]

Messages for test Lin CList int test with Thread:

  Results incompatible with sequential execution

                                       |                   
                          Member (96) : RMember false      
                        Add_node (722) : RAdd_node true    
                         Add_node (0) : RAdd_node true     
                        Add_node (4820) : RAdd_node true   
                         Add_node (55) : RAdd_node true    
                         Add_node (98) : RAdd_node true    
                           Member (8) : RMember false      
                          Member (89) : RMember false      
                                       |                   
                    .------------------------------------.
                    |                                    |                   
     Add_node (627) : RAdd_node true         Member (8) : RMember false      
        Member (0) : RMember true          Add_node (2) : RAdd_node true     
     Add_node (143) : RAdd_node true        Member (910) : RMember false     
      Add_node (82) : RAdd_node true       Add_node (76) : RAdd_node true    
        Member (8) : RMember false         Add_node (24) : RAdd_node true    
      Add_node (84) : RAdd_node true       Add_node (6) : RAdd_node true     
      Add_node (11) : RAdd_node true        Member (93) : RMember false      
       Member (633) : RMember false        Add_node (1) : RAdd_node true     
       Member (83) : RMember false           Member (9) : RMember false      
      Add_node (10) : RAdd_node true       Add_node (94) : RAdd_node true    
        Member (4) : RMember false           Member (2) : RMember false      
      Add_node (94) : RAdd_node true      Add_node (924) : RAdd_node true    

================================================================================
failure (1 tests failed, 0 tests errored, ran 1 tests)
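
A cheaper way to spot what is wrong in a report like the one above is to scan a single thread for a Member that answers false after the same thread's Add_node of that value returned true. The sketch below assumes the list never removes elements (only Add_node and Member appear in these traces); the function name lost_adds is made up for illustration. The data is the right-hand thread of the report above.

```ocaml
type cmd = Add_node of int | Member of int
type step = cmd * bool (* a command and the result the test observed *)

(* Values x for which Member x answered false although an earlier
   Add_node x in the same thread had already returned true.
   Assumes nothing is ever removed from the list. *)
let lost_adds (trace : step list) : int list =
  let rec go added lost = function
    | [] -> List.rev lost
    | (Add_node x, true) :: rest -> go (x :: added) lost rest
    | (Member x, false) :: rest when List.mem x added ->
      go added (x :: lost) rest
    | _ :: rest -> go added lost rest
  in
  go [] [] trace

(* Right-hand thread of the Linux bytecode report. *)
let right_thread =
  [ (Member 8, false); (Add_node 2, true); (Member 910, false);
    (Add_node 76, true); (Add_node 24, true); (Add_node 6, true);
    (Member 93, false); (Add_node 1, true); (Member 9, false);
    (Add_node 94, true); (Member 2, false); (Add_node 924, true) ]

let () =
  Printf.printf "lost adds: %s\n"
    (String.concat ", " (List.map string_of_int (lost_adds right_thread)))
(* prints: lost adds: 2 *)
```

A non-empty result is sufficient to rule out a sequential explanation under the no-removal assumption, but an empty result proves nothing; the check ignores the other thread and the prefix entirely.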
@jmid
Collaborator Author

jmid commented Jun 16, 2023

The test unexpectedly failed under 32-bit Linux trunk 5.2:
https://github.com/ocaml-multicore/multicoretests/actions/runs/5283892318/jobs/9560725349

random seed: 13303944
generated error fail pass / total     time test name

[ ]    0    0    0    0 / 1000     0.0s Lin CList int test with Thread
[ ]    0    0    0    0 / 1000     0.0s Lin CList int test with Thread (generating)
[ ]  496    0    0  496 / 1000    63.6s Lin CList int test with Thread
[ ]  686    0    0  686 / 1000   123.6s Lin CList int test with Thread (shrinking:    0.0081)
[✗]  687    0    1  686 / 1000   168.3s Lin CList int test with Thread

--- Failure --------------------------------------------------------------------
File "src/neg_tests/dune", line 105, characters 29-54:
105 |  (names lin_tests_thread_ref lin_tests_thread_conclist)
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^
(cd _build/default/src/neg_tests && ./lin_tests_thread_conclist.exe --verbose)
Command exited with code 1.
File "src/neg_tests/dune", line 105, characters 29-54:
105 |  (names lin_tests_thread_ref lin_tests_thread_conclist)
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^
(cd _build/default/src/neg_tests && ./lin_tests_thread_conclist.exe --verbose)
Command exited with code 1.

Test Lin CList int test with Thread failed (0 shrink steps):

                           |             
                      Add_node (5)       
                      Add_node (6)       
                     Add_node (838)      
                     Add_node (21)       
                      Member (89)        
                      Member (98)        
                      Member (747)       
                      Add_node (8)       
                      Add_node (2)       
                      Member (81)        
                       Member (5)        
                      Member (609)       
                      Add_node (4)       
                      Member (11)        
                      Member (605)       
                           |             
                .---------------------.
                |                     |             
            Member (3)           Add_node (6)       
          Add_node (904)         Member (327)       
          Member (6581)           Member (4)        
            Member (5)          Add_node (845)      
           Add_node (9)           Member (1)        
          Add_node (331)        Member (3257)       
            Member (9)          Add_node (77)       
           Member (85)           Add_node (2)       
          Add_node (95)           Member (6)        
            Member (6)          Add_node (48)       
           Add_node (8)          Member (48)        
           Member (190)           Member (1)        


+++ Messages ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Messages for test Lin CList int test with Thread:

  Results incompatible with sequential execution

                                       |                   
                         Add_node (5) : RAdd_node true     
                         Add_node (6) : RAdd_node true     
                        Add_node (838) : RAdd_node true    
                        Add_node (21) : RAdd_node true     
                          Member (89) : RMember false      
                          Member (98) : RMember false      
                         Member (747) : RMember false      
                         Add_node (8) : RAdd_node true     
                         Add_node (2) : RAdd_node true     
                          Member (81) : RMember false      
                           Member (5) : RMember true       
                         Member (609) : RMember false      
                         Add_node (4) : RAdd_node true     
                          Member (11) : RMember false      
                         Member (605) : RMember false      
                                       |                   
                    .------------------------------------.
                    |                                    |                   
       Member (3) : RMember false          Add_node (6) : RAdd_node true     
     Add_node (904) : RAdd_node true       Member (327) : RMember false      
      Member (6581) : RMember false          Member (4) : RMember true       
        Member (5) : RMember true         Add_node (845) : RAdd_node true    
      Add_node (9) : RAdd_node true         Member (1) : RMember false       
     Add_node (331) : RAdd_node true       Member (3257) : RMember false     
        Member (9) : RMember true         Add_node (77) : RAdd_node true     
       Member (85) : RMember false         Add_node (2) : RAdd_node true     
     Add_node (95) : RAdd_node true          Member (6) : RMember true       
        Member (6) : RMember true         Add_node (48) : RAdd_node true     
      Add_node (8) : RAdd_node true         Member (48) : RMember false      
      Member (190) : RMember false          Member (1) : RMember false       

================================================================================
failure (1 tests failed, 0 tests errored, ran 1 tests)
Error: Process completed with exit code 1.
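
In the report above, the right-hand thread gets Add_node (48) : true immediately followed by Member (48) : false. One hypothetical way such a result can arise is a lost update: an add that first snapshots the list head and only later writes back the extended snapshot can overwrite a concurrent add. The sketch below replays one such interleaving deterministically in a single thread. Whether CList's add_node actually has this shape is not stated in this thread; snapshot, write_back and simulate are illustrative names, and the empty initial list is an assumption.

```ocaml
(* Shared list head, modelled as a plain reference because this is a
   single-threaded, deterministic replay of one interleaving. *)
let head : int list ref = ref []

(* Hypothetical two-step add: snapshot the head, then later write back
   the snapshot extended with x. Not taken from CList's source. *)
let snapshot () = !head
let write_back snap x = head := x :: snap

let member x = List.mem x !head

(* Both threads take their snapshot before either one writes. *)
let simulate () =
  head := [];
  let snap_left = snapshot () in   (* left thread begins Add_node 8 *)
  let snap_right = snapshot () in  (* right thread begins Add_node 48 *)
  write_back snap_right 48;        (* right thread completes first *)
  write_back snap_left 8;          (* left thread writes a stale list *)
  (member 48, member 8)

let () =
  let m48, m8 = simulate () in
  Printf.printf "Member 48 = %b, Member 8 = %b\n" m48 m8
(* prints: Member 48 = false, Member 8 = true *)
```

Such a window only matters if a thread switch lands between the two steps, which would fit with the counterexamples showing up only occasionally.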

@jmid
Collaborator Author

jmid commented Jun 16, 2023

The test also triggered a counterexample under Windows bytecode trunk 5.2:
https://github.com/ocaml-multicore/multicoretests/actions/runs/5283892319/jobs/9560725030

random seed: 495053076
generated error fail pass / total     time test name

[ ]    0    0    0    0 / 1000     0.0s Lin CList int test with Thread
[ ]    0    0    0    0 / 1000     0.0s Lin CList int test with Thread (generating)
[ ]  744    0    0  744 / 1000    60.4s Lin CList int test with Thread (shrinking:    0.0012)
[ ]  744    0    0  744 / 1000   120.6s Lin CList int test with Thread (shrinking:    0.0150)
[✗]  745    0    1  744 / 1000   125.0s Lin CList int test with Thread

--- Failure --------------------------------------------------------------------

Test Lin CList int test with Thread failed (0 shrink steps):

                            |            
                       Member (0)        
                      Add_node (4)       
                       Member (4)        
                      Add_node (85)      
                      Add_node (85)      
                      Add_node (8)       
                      Add_node (8)       
                      Add_node (5)       
                      Add_node (30)      
                       Member (3)        
                       Member (99)       
                      Add_node (8)       
                       Member (9)        
                       Member (6)        
                            |            
                 .---------------------.
                 |                     |            
           Member (5704)         Add_node (82)      
           Add_node (9)           Member (21)       
           Add_node (1)         Add_node (448)      
            Member (1)           Add_node (6)       
           Add_node (15)          Member (85)       
          Add_node (7460)         Member (7)        
          Add_node (600)         Add_node (58)      
            Member (7)           Add_node (56)      
           Add_node (8)           Member (6)        
           Add_node (2)         Add_node (968)      


+++ Messages ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Messages for test Lin CList int test with Thread:

  Results incompatible with sequential execution

                                       |                   
                           Member (0) : RMember true       
                         Add_node (4) : RAdd_node true     
                           Member (4) : RMember true       
                         Add_node (85) : RAdd_node true    
                         Add_node (85) : RAdd_node true    
                         Add_node (8) : RAdd_node true     
                         Add_node (8) : RAdd_node true     
                         Add_node (5) : RAdd_node true     
                         Add_node (30) : RAdd_node true    
                           Member (3) : RMember false      
                          Member (99) : RMember false      
                         Add_node (8) : RAdd_node true     
                           Member (9) : RMember false      
                           Member (6) : RMember false      
                                       |                   
                    .------------------------------------.
                    |                                    |                   
      Member (5704) : RMember false        Add_node (82) : RAdd_node true    
      Add_node (9) : RAdd_node true         Member (21) : RMember false      
      Add_node (1) : RAdd_node true       Add_node (448) : RAdd_node true    
        Member (1) : RMember false         Add_node (6) : RAdd_node true     
      Add_node (15) : RAdd_node true         Member (85) : RMember true      
     Add_node (7460) : RAdd_node true        Member (7) : RMember false      
     Add_node (600) : RAdd_node true       Add_node (58) : RAdd_node true    
        Member (7) : RMember false         Add_node (56) : RAdd_node true    
      Add_node (8) : RAdd_node true          Member (6) : RMember true       
      Add_node (2) : RAdd_node true       Add_node (968) : RAdd_node true    

================================================================================
failure (1 tests failed, 0 tests errored, ran 1 tests)
File "src/neg_tests/dune", line 105, characters 29-54:
105 |  (names lin_tests_thread_ref lin_tests_thread_conclist)
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^
(cd _build/default/src/neg_tests && ./lin_tests_thread_conclist.exe --verbose)
Command exited with code 1.

@jmid jmid added the test suite reliability Issue concerns tests that should behave more predictably label Jul 14, 2023
@jmid
Collaborator Author

jmid commented Aug 14, 2023

Observed again on Windows bytecode 5.0.0:
https://github.com/ocaml-multicore/multicoretests/actions/runs/5703723247/job/15456641444

random seed: 133807954
generated error fail pass / total     time test name

[ ]    0    0    0    0 / 1000     0.0s Lin CList int test with Thread
[ ]    0    0    0    0 / 1000     0.0s Lin CList int test with Thread (generating)
[ ]  826    0    0  826 / 1000    60.0s Lin CList int test with Thread
[✗]  863    0    1  862 / 1000    87.2s Lin CList int test with Thread

@jmid jmid changed the title Lin Thread int CList test under bytecode Lin Thread int CList test finds unexpected counterexample under bytecode Sep 20, 2023
@jmid jmid changed the title Lin Thread int CList test finds unexpected counterexample under bytecode Lin.Internal Thread int CList test finds unexpected counterexample under bytecode Nov 14, 2023
@jmid
Collaborator Author

jmid commented Nov 14, 2023

I updated the issue title to reflect the renaming in #396 and #408.
