Member

gpshead commented Dec 29, 2017

This change is only enabled in single-threaded processes, because it manipulates global signal state.

This PR is for discussion purposes. I suspect we do not want to go forward with it if we can create a better version using signalfd support (https://bugs.python.org/issue32443) that works regardless of threads.

Windows APIs for wait timeouts are already great.
macOS and the BSDs would each need their own set of improvements, as there is no standard API for this.
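
For readers following along, here is a rough sketch of what a non-polling wait timeout can look like on Linux in a single-threaded process. It is an illustration under assumptions, not necessarily what this PR's code does: it blocks `SIGCHLD` (the global signal state in question) and sleeps in `signal.sigtimedwait` until a child changes state or the timeout expires. The names `spawn_with_sigchld_blocked` and `wait_with_timeout` are hypothetical helpers invented for this example.

```python
import os
import signal
import sys
import time


def spawn_with_sigchld_blocked(argv):
    """Block SIGCHLD in this thread, then start the child (hypothetical helper).

    Blocking first means a SIGCHLD raised by a fast-exiting child stays
    pending instead of being discarded before we start waiting.
    """
    signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})
    return os.posix_spawn(argv[0], argv, os.environ)


def wait_with_timeout(pid, timeout):
    """Return the child's exit code, or None if the timeout expires first.

    Assumes Linux, a single-threaded process, and SIGCHLD already blocked.
    A negative return value means the child was killed by that signal number.
    """
    deadline = time.monotonic() + timeout
    while True:
        # Reap the child if it has already exited.
        reaped, status = os.waitpid(pid, os.WNOHANG)
        if reaped == pid:
            if os.WIFSIGNALED(status):
                return -os.WTERMSIG(status)
            return os.WEXITSTATUS(status)
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return None
        # Sleep until some child changes state or the time runs out.
        # A wakeup caused by an unrelated child just loops around again.
        signal.sigtimedwait({signal.SIGCHLD}, remaining)
```

A caller would use `pid = spawn_with_sigchld_blocked([sys.executable, "-c", "pass"])` followed by `wait_with_timeout(pid, 5.0)`. The sketch only works if no other thread can receive `SIGCHLD`, which is the single-threaded restriction described above.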

It only works on single threaded processes as it manipulates global
signal state.
gpshead added the DO-NOT-MERGE, type-feature (A feature request or enhancement), and performance (Performance or resource usage) labels Dec 29, 2017
Contributor

njsmith commented Jan 7, 2018

There is one way to put a timeout on waitpid: run waitpid in a dedicated thread, and if you get tired of waiting, use pthread_kill to send the waitpid thread a signal, so that it returns with EINTR. You do need to claim an otherwise unused signal for this purpose and register an empty signal handler. (Linux's signal(7) claims that SIGSTKFLT and SIGLOST aren't used...) So maybe that's a no-go. But this trick does allow multiple threads to call waitpid-with-timeout at the same time, and once you have a signal dedicated to this you can even use it to solve some other similar problems, like doing async I/O on stdin/out/err. (The obvious approach of setting stdin/out/err as non-blocking turns out to be highly problematic, because it also sets them non-blocking for all the other processes that share those fds.)

For extra fun, you do still need a polling timeout loop in this version, because there's a potential lost-wakeup race condition: if you try to wake up the sleeping thread just before it goes to sleep, the wakeup gets lost. This is not too hard to deal with though, just annoying – once you've decided to cancel the thread, you have to keep sending the signal until it confirms that it got it.
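
A minimal sketch of the trick described in the two paragraphs above, assuming Linux and that `SIGUSR1` is free to claim (the comment suggests other signals; `SIGUSR1` is just a stand-in here). It calls libc's `waitpid` through `ctypes`, because `os.waitpid` retries automatically on `EINTR` (PEP 475) and would hide the interruption. The function name `waitpid_with_timeout` and the `resend_interval` parameter are hypothetical.

```python
import ctypes
import errno
import os
import signal
import sys
import threading

# Assumption: SIGUSR1 is otherwise unused by the host program.
WAKE_SIGNAL = signal.SIGUSR1

# An empty handler, so the signal interrupts system calls instead of
# killing the process. This must run in the main thread.
signal.signal(WAKE_SIGNAL, lambda signum, frame: None)

# Raw libc waitpid: returns -1 with errno EINTR when interrupted.
_libc = ctypes.CDLL(None, use_errno=True)
_libc.waitpid.argtypes = [ctypes.c_int, ctypes.POINTER(ctypes.c_int), ctypes.c_int]
_libc.waitpid.restype = ctypes.c_int


def waitpid_with_timeout(pid, timeout, resend_interval=0.01):
    """Return the raw wait status, or None if the timeout expired first."""
    result = {}
    done = threading.Event()       # waiter has finished its work
    cancelled = threading.Event()  # caller gave up waiting
    release = threading.Event()    # waiter thread may now exit

    def waiter():
        status = ctypes.c_int(0)
        while not cancelled.is_set():
            rc = _libc.waitpid(pid, ctypes.byref(status), 0)
            if rc == pid:
                result["status"] = status.value
                break
            err = ctypes.get_errno()
            if err != errno.EINTR:
                result["errno"] = err
                break
        done.set()
        # Stay alive until the caller stops signalling, so pthread_kill
        # is never aimed at a thread that has already exited.
        release.wait()

    thread = threading.Thread(target=waiter, daemon=True)
    thread.start()
    if not done.wait(timeout):
        cancelled.set()
        # Lost-wakeup race: a signal that lands just before the thread
        # enters waitpid is wasted, so keep sending until it confirms.
        while not done.wait(resend_interval):
            signal.pthread_kill(thread.ident, WAKE_SIGNAL)
    release.set()
    thread.join()
    if "errno" in result:
        raise OSError(result["errno"], os.strerror(result["errno"]))
    return result.get("status")
```

Because each call uses its own thread and `pthread_kill` targets only that thread, several threads can run `waitpid_with_timeout` concurrently, which is the property the comment highlights. The resend loop is the polling that remains, but it only runs after the caller has already decided to cancel.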

Waiting for children is my absolute least favorite part of Linux's I/O APIs. I don't understand how this wasn't fixed years ago.

gpshead closed this Nov 14, 2018
gpshead deleted the wait_timeout_no_polling branch November 14, 2018 01:29
Labels
awaiting merge DO-NOT-MERGE performance Performance or resource usage type-feature A feature request or enhancement
4 participants