Endless loop in EMFILE #690

Closed
bobrik opened this Issue · 3 comments

Ian Babrou

I have a node.js program that is spinning in the following state (strace output):

Process 4444 attached with 6 threads - interrupt to quit
[pid  4451] 21:32:44 futex(0x761d04, FUTEX_WAIT_PRIVATE, 67370356, NULL <unfinished ...>
[pid  4450] 21:32:44 futex(0x761d04, FUTEX_WAIT_PRIVATE, 67370356, NULL <unfinished ...>
[pid  4449] 21:32:44 futex(0x761d04, FUTEX_WAIT_PRIVATE, 67370356, NULL <unfinished ...>
[pid  4447] 21:32:44 futex(0x761d04, FUTEX_WAIT_PRIVATE, 67370356, NULL <unfinished ...>
[pid  4446] 21:32:44 futex(0x8378c8, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid  4444] 21:32:44 accept4(10, 0, NULL, SOCK_CLOEXEC|SOCK_NONBLOCK) = -1 EMFILE (Too many open files) <0.000008>
[pid  4444] 21:32:44 close(7)           = 0 <0.000007>
[pid  4444] 21:32:44 accept4(10, 0, NULL, SOCK_CLOEXEC|SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable) <0.000009>
[pid  4444] 21:32:44 open("/", O_RDONLY|O_CLOEXEC) = 7 <0.000007>
[pid  4444] 21:32:44 accept4(10, 0, NULL, SOCK_CLOEXEC|SOCK_NONBLOCK) = -1 EMFILE (Too many open files) <0.000006>
[pid  4444] 21:32:44 close(7)           = 0 <0.000006>
[pid  4444] 21:32:44 accept4(10, 0, NULL, SOCK_CLOEXEC|SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable) <0.000009>
[pid  4444] 21:32:44 open("/", O_RDONLY|O_CLOEXEC) = 7 <0.000005>
[pid  4444] 21:32:44 accept4(10, 0, NULL, SOCK_CLOEXEC|SOCK_NONBLOCK) = -1 EMFILE (Too many open files) <0.000006>
[pid  4444] 21:32:44 close(7)           = 0 <0.000007>
[pid  4444] 21:32:44 accept4(10, 0, NULL, SOCK_CLOEXEC|SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable) <0.000007>
[pid  4444] 21:32:44 open("/", O_RDONLY|O_CLOEXEC) = 7 <0.000007>
[pid  4444] 21:32:44 accept4(10, 0, NULL, SOCK_CLOEXEC|SOCK_NONBLOCK) = -1 EMFILE (Too many open files) <0.000007>
[pid  4444] 21:32:44 close(7)           = 0 <0.000006>
[pid  4444] 21:32:44 accept4(10, 0, NULL, SOCK_CLOEXEC|SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable) <0.000007>
[pid  4444] 21:32:44 open("/", O_RDONLY|O_CLOEXEC) = 7 <0.000008>
[pid  4444] 21:32:44 accept4(10, 0, NULL, SOCK_CLOEXEC|SOCK_NONBLOCK) = -1 EMFILE (Too many open files) <0.000007>
[pid  4444] 21:32:44 close(7)           = 0 <0.000006>
[pid  4444] 21:32:44 accept4(10, 0, NULL, SOCK_CLOEXEC|SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable) <0.000008>
[pid  4444] 21:32:44 open("/", O_RDONLY|O_CLOEXEC) = 7 <0.000006>
[pid  4444] 21:32:44 accept4(10, 0, NULL, SOCK_CLOEXEC|SOCK_NONBLOCK) = -1 EMFILE (Too many open files) <0.000006>
[pid  4444] 21:32:44 close(7)           = 0 <0.000006>
[pid  4444] 21:32:44 accept4(10, 0, NULL, SOCK_CLOEXEC|SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable) <0.000007>
[pid  4444] 21:32:44 open("/", O_RDONLY|O_CLOEXEC) = 7 <0.000006>
[pid  4444] 21:32:44 accept4(10, 0, NULL, SOCK_CLOEXEC|SOCK_NONBLOCK) = -1 EMFILE (Too many open files) <0.000005>

gdb says:

(gdb) bt
#0  0x00007fb0b9c0d439 in syscall () from /lib64/libc.so.6
#1  0x000000000048247b in uv__accept4 ()
#2  0x000000000046dfac in uv__accept ()
#3  0x000000000047c952 in uv__server_io ()
#4  0x0000000000471b62 in ev_invoke_pending ()
#5  0x000000000046d9ff in ?? ()
#6  0x000000000046dc70 in uv_run ()
#7  0x0000000000429b57 in node::Start(int, char**) ()
#8  0x00007fb0b9b5a3dd in __libc_start_main () from /lib64/libc.so.6
#9  0x0000000000420b8d in _start ()

uname:

Linux web326 3.3.1-gentoo #1 SMP Tue Apr 10 12:58:57 MSK 2012 x86_64 Intel(R) Xeon(R) CPU E31230 @ 3.20GHz GenuineIntel GNU/Linux

node.js version 0.9.2

It seems to be wrong behaviour to spin in kernel space instead of bubbling the error up to node.js.

Is this a known bug that has been fixed in newer versions?

Ben Noordhuis

Can you try this patch? It seems accept4() checks for EMFILE before EAGAIN.

diff --git a/src/unix/stream.c b/src/unix/stream.c
index f4ed002..fcaaf91 100644
--- a/src/unix/stream.c
+++ b/src/unix/stream.c
@@ -526,7 +526,7 @@ void uv__server_io(uv_loop_t* loop, uv__io_t* w, unsigned int events) {
         if (use_emfile_trick) {
           SAVE_ERRNO(r = uv__emfile_trick(loop, uv__stream_fd(stream)));
           if (r == 0)
-            continue;
+            return;
         }

         /* Fall through. */

> It seems to be wrong behaviour to spin in kernel space instead of bubbling the error up to node.js.

Maybe. There's not much that you can do in EMFILE situations: either you exit or you do what libuv attempts to do. See 4f5c8da for the rationale.

Ian Babrou

I've patched deps/uv in the node.js sources, compiled it, and deployed it to one of the problem servers. Let's see if it helps.

Saúl Ibarra Corretgé
Owner

No followup, closing.

Saúl Ibarra Corretgé saghul closed this