Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
assignee = None
closed_at = <Date 2020-05-18.15:51:14.255>
created_at = <Date 2020-03-18.15:48:15.705>
labels = ['tests', '3.9']
title = 'test.regrtest: add an option to run test.bisect_cmd on failed tests, use it on Refleaks buildbots'
updated_at = <Date 2020-05-18.15:51:14.254>
user = 'https://github.com/vstinner'
There are some tests which fail randomly in general, but fail in a deterministic way on some specific buildbot workers.
bpo-39932 is a good example: test_multiprocessing_fork fails with "test_multiprocessing_fork leaked [0, 2, 0] file descriptors". The test fails when run in parallel, and it also fails when re-run sequentially. But when I connect to the buildbot worker and run it myself, it no longer fails.
test_multiprocessing_fork contains 356 test methods, the test file (Lib/test/_test_multiprocessing.py) has 5741 lines of Python code, and the multiprocessing package is made of 8149 lines of Python code and 1133 lines of C code. It's hard to audit that much code. multiprocessing uses multiple processes, pipes, signals, etc. It's really hard to debug.
I propose to add a --bisect-failed option to test.regrtest to run test.bisect_cmd on failed tests. We can start experimenting with it on Refleaks buildbots. Failures of regular tests (not Refleaks tests) are easier to reproduce in general.
It should speed up the analysis of reference leak and "altered environment" test failures. Having fewer test methods to audit is way simpler.
The implementation should be: at the end of regrtest, after failed tests are re-run, run each failed test in test.bisect_cmd with the same command line arguments as test.regrtest.
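To make the idea concrete, here is a minimal sketch of how the bisect command lines could be built. It is not the regrtest implementation: `build_bisect_commands` is a hypothetical helper, and it assumes the regrtest arguments can be forwarded to test.bisect_cmd unchanged (some options might need filtering in practice). It only builds the commands; running them is left to the caller.

```python
import sys


def build_bisect_commands(failed_tests, regrtest_args, python=sys.executable):
    """Return one test.bisect_cmd command line per failed test.

    Hypothetical helper: regrtest_args is assumed to be the list of
    options regrtest was started with (for example ["-R", "3:3"]).
    """
    commands = []
    for test_name in failed_tests:
        # bisect_cmd takes the regrtest options followed by the test name.
        cmd = [python, "-m", "test.bisect_cmd", *regrtest_args, test_name]
        commands.append(cmd)
    return commands


commands = build_bisect_commands(
    ["test_multiprocessing_fork"], ["-R", "3:3"], python="./python"
)
for cmd in commands:
    print(" ".join(cmd))
```

Each command could then be passed to `subprocess.run()` once the re-run step has finished.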
test.bisect_cmd uses 100 iterations by default. It's ok if the bisection fails to reduce the number of test methods. At least, it should reduce the list in some cases.
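For readers unfamiliar with the approach, the following is a simplified sketch of a random-subset bisection bounded by an iteration count. It is an illustration, not the code of test.bisect_cmd: `bisect_tests` and the `fails` callback are hypothetical names, and `fails` stands in for actually running the selected test methods.

```python
import random


def bisect_tests(tests, fails, max_iter=100, max_tests=1, rng=None):
    """Shrink `tests` while the failure still reproduces.

    fails(subset) is a hypothetical callback that returns True when
    running only `subset` still triggers the failure.
    """
    rng = rng or random.Random()
    tests = list(tests)
    iteration = 1
    while len(tests) > max_tests and iteration <= max_iter:
        # Try a random half of the remaining test methods.
        half = max(len(tests) // 2, 1)
        subset = rng.sample(tests, half)
        if fails(subset):
            # The failure reproduces with fewer tests: keep the subset.
            tests = subset
        iteration += 1
    return tests


all_tests = [f"test_method_{i}" for i in range(16)]
culprit = "test_method_11"
remaining = bisect_tests(
    all_tests, lambda subset: culprit in subset, rng=random.Random(0)
)
print(remaining)
```

If no subset ever reproduces the failure, the loop simply stops after `max_iter` iterations and returns the full list, which matches the "it's ok if the bisection fails" case.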