Currently the NCI tests in t/pmc/nci.t are fragile and are blocking the merging of the very important threads branch.
Originally it was thought that only the threads branch on Darwin/PPC broke the NCI tests, but it has come to light that the failures can be reproduced on various "slow" machines, as well as in clang builds with AddressSanitizer turned on.
The tests' current reliance on "sleep" is broken and fragile: it only works on "faster" machines.
For more details, see http://lists.parrot.org/pipermail/parrot-dev/2012-August/007106.html
The NCI tests should be refactored so they do not depend on peppering various "sleep" calls in each test.
(done in 56c96dd)
I'm with Andy that sleep() is the real problem here.
1st, the tests are way too fragile, and
2nd, sleep() itself got more unstable in the threads branch.
Esp. this section, which loops in a while. How can we ensure that the alarm signal
is ever delivered, and that the timer thread is not also waiting? Is sleep disturbed by another alarm?
thread.c- while (interp->wake_up == 0)
thread.c: COND_WAIT(interp->sleep_cond, interp->sleep_mutex);
thread.c- interp->wake_up = 0;
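To make the question above concrete, here is a minimal sketch of the same wait-on-a-flag pattern with a deadline added, so that a wakeup that never arrives turns into a timeout instead of a hang. It assumes COND_WAIT wraps pthread_cond_wait on POSIX platforms; `sleeper_t`, `sleeper_init`, `sleeper_wait` and `sleeper_wake` are hypothetical names for illustration, not Parrot's API, and this is not claimed to be the fix that went into the threads branch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical stand-in for the per-interpreter sleep state (not Parrot's struct). */
typedef struct {
    pthread_mutex_t sleep_mutex;
    pthread_cond_t  sleep_cond;
    int             wake_up;    /* predicate; only touched with sleep_mutex held */
} sleeper_t;

void sleeper_init(sleeper_t *s) {
    int rc = pthread_mutex_init(&s->sleep_mutex, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&s->sleep_cond, NULL);
    assert(rc == 0);
    (void)rc;
    s->wake_up = 0;
}

/* Wait until wake_up is set or timeout_ms has elapsed.
 * Returns 1 if woken, 0 on timeout. */
int sleeper_wait(sleeper_t *s, long timeout_ms) {
    struct timespec deadline;
    int woken;

    /* pthread_cond_timedwait takes an absolute CLOCK_REALTIME deadline by default. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec  += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&s->sleep_mutex);
    /* Re-check the predicate after every return: wakeups may be spurious. */
    while (s->wake_up == 0) {
        int rc = pthread_cond_timedwait(&s->sleep_cond, &s->sleep_mutex, &deadline);
        if (rc == ETIMEDOUT)
            break;
    }
    woken = s->wake_up;
    s->wake_up = 0;
    pthread_mutex_unlock(&s->sleep_mutex);
    return woken;
}

/* Set the flag and signal while holding the mutex, so a waiter cannot
 * miss the update between its predicate check and its wait. */
void sleeper_wake(sleeper_t *s) {
    pthread_mutex_lock(&s->sleep_mutex);
    s->wake_up = 1;
    pthread_cond_signal(&s->sleep_cond);
    pthread_mutex_unlock(&s->sleep_mutex);
}
```

Because the flag is checked before waiting, a `sleeper_wake` that runs before `sleeper_wait` is not lost. A timeout return would at least let a test fail with a diagnostic rather than deadlock.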
The backtraces of the sleep-alarm deadlock are:
E.g. if I remove the sleep 0.01 line in t/pmc/nci_37.pasm, the nci tests succeed.
This also reproduces on linux amd64 with tsan. See http://pastebin.com/gKRDTkh3
chmod +x tsan-r4356-amd64-linux-self-contained.sh
cd - # threads branch
tsan-r4356-amd64-linux-self-contained.sh ./parrot t/pmc/nci_37.pasm
tsan-r4356-amd64-linux-self-contained.sh ./parrot t/pmc/task.t
So the sleep implementation with threads is racy.
rurban assumes the sleep thread blocks signals which arrive during the sleep, and the sleep loop never finishes.
I think that there's a design limitation that only one signal can be accepted per thread.
#808 t/pmc/nci_37.pasm: Avoid sleep deadlock with parrot threads
[GH #808] Remove sleep calls in nci.t, because of signal deadlocks with parrot threads
Even without sleep calls the tests succeed. But since it loops until the result arrives, let
it busy loop a bit longer.
Note: This is a hack. sleep on threads should be fixed instead.
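For illustration, the "loop until the result arrives" idea from the commit message could look roughly like this in C. This is only a sketch under assumptions: the real tests are PASM/PIR, and `fire_callback`, `wait_for_callback` and `run_callback_test` are hypothetical names, with a plain thread standing in for the NCI callback.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

static atomic_int cb_done;    /* set to 1 once the callback has run */
static atomic_int cb_value;   /* result delivered by the callback */

/* Hypothetical callback, run on another thread. */
static void *fire_callback(void *arg) {
    atomic_store(&cb_value, *(int *)arg);
    atomic_store(&cb_done, 1);   /* publish the result last */
    return NULL;
}

/* Poll for the result instead of sleeping for a fixed time.
 * The spin count is bounded so a missing callback fails the test
 * instead of hanging it. Returns 1 if the result arrived. */
int wait_for_callback(long max_spins) {
    long i;
    for (i = 0; i < max_spins; i++) {
        if (atomic_load(&cb_done))
            return 1;
        sched_yield();           /* let the other thread make progress */
    }
    return atomic_load(&cb_done) != 0;
}

/* Returns the callback's value, or -1 if it never arrived. */
int run_callback_test(int payload) {
    pthread_t t;
    int rc;

    atomic_store(&cb_done, 0);
    rc = pthread_create(&t, NULL, fire_callback, &payload);
    assert(rc == 0);
    rc = wait_for_callback(100000000L);
    pthread_join(t, NULL);
    return rc ? atomic_load(&cb_value) : -1;
}
```

The test no longer depends on how fast the machine is: a slow machine just spins longer, and only a callback that never fires exhausts the bound.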
Remaining deadlock fixed with e1d4c06 in the threads branch.