Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
DTrace's process-control loop for victim processes spends most of its time sleeping on a waitpid(). When a proxy request comes in, the proxy_call code hits this waitpid() with a dedicated (realtime) signal to cause it to exit with -EINTR. But if a proxy request comes in while the thread is doing something other than waiting on a proxy_call, we want to know about it before we hit waitpid(). We do this by having the signal handler set a variable (in dt_proc_loop, waitpid_interrupted: passed to Pwaitpid() as a parameter, return_early) which is then checked before we enter waitpid() to tell if we need to go back and handle another proxy request before blocking in waitpid() again. At the top of the process-control loop, we set waitpid_interrupted back to 0 again because we're just about to handle whatever proxy request came in. Because we only send a signal from proxy_call, and proxy_call is only invoked under the dpr_lock, and all the proxy call machinery in dt_proc_loop *also* happens under the dpr_lock, it seemed safe to check that waitpid_interrupted was still zero before we entered Pwaitpid(). ... but it isn't. Proxy calling doesn't just hit the thread with *one* signal: it sets up a timer (the 'pinger') to hit it over and over again, just in case the first signal hit when we had checked waitpid_interrupted but not yet got far enough into waitpid() to return with -EINTR. So in fact waitpid_interrupted can get set at any time, even in the middle of another proxy call, and in fact if the proxy call is slow enough it'll get set *while we are processing the proxy request it relates to*. So tear the assertion out. It's harmless if the assertion fires anyway: all that will happen is that we'll get one spurious early exit from Pwaitpid(), one whip round the process control loop (with poll() confirming that there are in fact no proxy requests waiting for us), and then we'll block on waitpid() again. (When I added the assertion, its firing would have meant that we were about to block on read() forever, but I took that code out without ever committing it.) Survived 500 invocations of test/unittest/proc/tst.grab-exit.d so far: will give it another few thousand rounds overnight. Signed-off-by: Nick Alcock <nick.alcock@oracle.com> Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>
- Loading branch information