New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle EINTR when closing sockets #56
Comments
So originally I filed this bug in the other direction: that Looking even further, to be safe, we may need to do different things based on operating system...
Making things even trickier is glibc, see this comment https://code.google.com/p/chromium/issues/detail?id=269623#c13 |
On Mon, Feb 23, 2015 at 06:40:00PM -0800, daurnimator wrote:
I read this thread awhile ago
but forgot that OS X is complicated. |
ps. i have edited/improved the original post, check it out on github :) |
So that fixes the mac case; what about linux+glibc; should we use the raw syscall there? |
On Linux EINTR can be safely ignored. close will always close the descriptor as long as it's valid. See http://linux.derkeiler.com/Mailing-Lists/Kernel/2005-09/3000.html and the comments at http://austingroupbugs.net/view.php?id=529. AIX behaves the way as Linux. A comment in the Austin Group thread suggests that Solaris behaves like Linux, but the Solaris 11.1 man page says the state of the descriptor is unspecified. Regardless, except for the hack for OS X, there's no API on any platform that allows to work around the issue in a thread-safe manner. If the platform does the wrong thing, then we're forced to simply leak the descriptor. |
Yes.. however glibc wraps it to make it non-safe again ==> it calls |
I think there are two completely separate issues here. The first is what to do if close has been interrupted by a signal and returns EINTR. The second is properly handling thread cancellation. Note that in the second case close will never return to the application. For the first case, Linux and OS X already do the correct thing: if close returns EINTR to the application because of a signal then the file descriptor has already been closed. Even on OS X there's no need to use close_nocancel. For the second case, both glibc and OS X have bugs where if a thread is marked for cancellation, the cancellation might happen before or after the descriptor has actually been closed, which means it's impossible for applications to know whether the descriptor has already been closed when running their cancellation handlers. For glibc[1] this is entirely a userspace bug because glibc tests for cancellation before calling the close syscall; but cancellation could also happen after the kernel has closed the descriptor. For OS X it's mostly a kernel issue[2]: the kernel close first tests to see if the thread was cancelled, and if so returns EINTR before the descriptor was invalidated. Otherwise, if cancellation happens during the close operation, it will return EINTR after the descriptor was invalidated. In either event, upon seeing EINTR the userspace wrapper tests for cancellation. It seems many of the comments out in the wild are confusing the two issues. If and when we want to address the cancellation problem, the solution for both[3] platforms (and theoretically all) is easy: just disable cancellations with pthread_setcancelstate before calling close. But currently cqueues cannot handle cancellations at all. It doesn't setup cancellation handlers to destroy the Lua context, and it doesn't disable cancellations in critical sections that might call into a cancellation point. We're going to leak everything, anyhow. That's definitely a worthwhile feature, but we should open a whole new issue tracker for that. I think we can close this issue out because (AFAICT) on all the supported platforms we can safely ignore EINTR from close and assume the descriptor has been closed. [1] See this bug report, which describes everything in detail: https://sourceware.org/bugzilla/show_bug.cgi?id=12683 [2] It took me forever to track down exactly what OS X was doing. Here's the relevant code:
[3] Assuming the code is running in "conforming" mode (which is the default for 64-bit), where pthread_setcancelstate calls into the kernel (__pthread_canceled). Otherwise, we would have to use close_nocancel to avoid the race between closing the file descriptor and cancellation causing EINTR, where we can't know which happened first. |
Consider this closed then :) |
btw, this still exists in the (unused) kpoll library: https://github.com/wahern/cqueues/blob/master/src/lib/kpoll.c#L118 I suppose that's issue #24 though. |
Would it make sense to block all signals prior to any call to close and unblock the signals after close has returned? Would this be a perfect workaround to the EINTR possibility? |
In
cqs_closefd
,close
is retried onEINTR
.You should not do this (from man page):
Related links:
The text was updated successfully, but these errors were encountered: