It seems like Stream.poll() did not retry select() in cases of signal interruptions
Handle EINTR in Stream.poll()
I think you forgot get_exc_errno. Btw at what constellations do you get EINTR?
I'm not sure what do you mean by "forgot get_exc_errno", I think it's pretty much there...
Anyways, this happened to me several times when running processes through RPyC connections. In these cases the SIGCHLD handler set by the server is called, and the select gets interrupted.
oh, i forget that get_exc_errno is already defined in rpyc. about SIGCHLD, i thought it was already handled by the forking server, but i guess i rewrote this code and forgot to cover it. thanks
here's the handling of EINTR:
it's strange that your get EINTR in clients, as i'm restoring the original SIGCHLD handler in
I have a specific traceback which leads to that select call specifically. From what I can tell I don't use signals in this specific example so it's pretty strange indeed. I still think it's good practice to handle EINTR. I vaguely remember more cases in which the behavior with signal handlers was a bit shady...
Ok - found the issue. It appears that once you set a signal handler (at least once) - python implicitly calls siginterrupt(1) for you (meaning system calls are not automatically restarted). It appears that setting SIG_IGN or SIG_DFL again does not undo that, which sucks.
It's also sort of implied from the documentation of the signal module, but looking at the source code (of PyOS_setsig in pythonrun.c) makes it clear... It seems the real solution is to call siginterrupt(False) after your handler restoration. I still think it's a good practice to catch EINTR but that's a side note...
#76: call signal.siginterrupt(False) in forked child
okay, i added a call to signal.siginterrupt in the forked child, so this should solve the issue.
i agree it's a good practice to handle the possible EINTR, but because it's in the "critical path", i'm a bit more cautious about it. besides, it's a bit complicated to cover all cases of EINTR, as almost any syscall can return with this errno. if the problem persists, i'll incorporate the patch, but if not, i think its best to leave it this way. thanks for the research!
i'm closing the issue now, let me now if the problem persists