fdpoll-epoll.c:113: epoll_ctl(6, EPOLL_CTL_ADD, 3): 'File exists' #319

Closed
Borkason opened this Issue Mar 24, 2013 · 5 comments


Borkason commented Mar 24, 2013

Original author: ste...@konink.de (December 21, 2008 18:14:31)

I think Cherokee does not keep track of which file descriptors it has
already registered with epoll, and tries to add everything. When a
descriptor is already present, epoll_ctl fails with EEXIST ('File exists').
I wrote a short workaround, but it is a fix that is potentially a bug in
itself.
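The workaround itself is not shown in this thread. As a rough illustration of what a workaround of this kind could look like, the sketch below treats EEXIST from EPOLL_CTL_ADD as non-fatal and falls back to EPOLL_CTL_MOD. The helper name poll_add_or_modify and its return convention are hypothetical and are not Cherokee code.

```c
#include <assert.h>
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper, not taken from Cherokee: try to add fd to the epoll
 * set and, if it is already registered (EEXIST), update its event mask
 * instead. Returns 0 if the fd was newly added, 1 if it was already present
 * and has been modified, -1 on any other error (errno is left set). */
static int poll_add_or_modify(int epfd, int fd, unsigned events)
{
    struct epoll_event ev = { .events = events, .data.fd = fd };

    assert(epfd >= 0 && fd >= 0);

    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == 0)
        return 0;
    if (errno != EEXIST)
        return -1;
    /* Already registered: overwrite the existing registration. */
    if (epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev) == 0)
        return 1;
    return -1;
}
```

This silences the error but also hides the double registration: whichever caller issues EPOLL_CTL_DEL first removes the descriptor for the other caller as well, which may be the kind of hidden bug the report is worried about.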

I think Cherokee should maintain a list of polled resources and increase a
reference count each time an object is added, removing the resource from
the poll set only when its count drops to zero. I have no clue why Cherokee
adds the same resource twice, so I still wonder what the underlying problem is.
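The reference-counting idea described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the struct counted_poll type, its functions, and the fixed MAX_FDS table size are hypothetical and do not come from Cherokee's fdpoll code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

#define MAX_FDS 1024  /* assumed table size, matching the 1024 fd default */

/* Hypothetical wrapper: one epoll instance plus a per-fd reference count. */
struct counted_poll {
    int epfd;
    int refs[MAX_FDS];
};

static int counted_poll_init(struct counted_poll *p)
{
    assert(p != NULL);
    for (int i = 0; i < MAX_FDS; i++)
        p->refs[i] = 0;
    p->epfd = epoll_create1(0);
    return p->epfd < 0 ? -1 : 0;
}

/* Register fd with epoll on first use; later calls only bump the count,
 * so EPOLL_CTL_ADD is never issued twice for the same descriptor. */
static int counted_poll_add(struct counted_poll *p, int fd, unsigned events)
{
    if (fd < 0 || fd >= MAX_FDS) {
        errno = EBADF;
        return -1;
    }
    if (p->refs[fd] == 0) {
        struct epoll_event ev = { .events = events, .data.fd = fd };
        if (epoll_ctl(p->epfd, EPOLL_CTL_ADD, fd, &ev) < 0)
            return -1;
    }
    p->refs[fd]++;
    return 0;
}

/* Drop one reference; remove fd from epoll only when the count hits zero. */
static int counted_poll_del(struct counted_poll *p, int fd)
{
    if (fd < 0 || fd >= MAX_FDS || p->refs[fd] == 0) {
        errno = ENOENT;
        return -1;
    }
    if (--p->refs[fd] == 0)
        return epoll_ctl(p->epfd, EPOLL_CTL_DEL, fd, NULL);
    return 0;
}
```

The sketch is not thread-safe and ignores the event mask on repeated adds; a table shared between threads would need a lock around the counters.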

The bug reproduces most reliably when I start Cherokee under valgrind and
run the static benchmark.

Original issue: http://code.google.com/p/cherokee/issues/detail?id=285


Borkason commented Mar 24, 2013

From alobbs on December 23, 2008 09:16:25
After quite a few tries, I haven't been able to reproduce this issue.

Stefan, could you please summarize the facts around this issue?
As far as I understand, it seems to be related to file descriptor exhaustion.


Borkason commented Mar 24, 2013

From ste...@konink.de on December 23, 2008 10:27:19
Kernels currently in use and affected:

Linux 2.6.27-gentoo-r1 (Xeon / 2GB)
Linux 2.6.27-gentoo; SMP PREEMPT (AMD64X2 / 8GB)

cat /proc/sys/fs/file-nr
1165 0 1048576

cat /proc/sys/fs/file-nr
3064 0 757696

How I can reproduce it:
Start:

  • valgrind cherokee-worker
  • as soon as the banner is visible, start the static benchmark tool
  • after fewer than 150 requests Cherokee pukes.
  • the number of threads requested to get this far should be 1 higher than the
    number of connections
  • some 'load' seems to be required

The failure can be delayed by raising the 1024 fd limit with ulimit -n XXXX; the point
is that Cherokee sometimes complains about the system fd limit being too small.
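For reference, the limit that ulimit -n changes can also be inspected and raised from inside a process with getrlimit and setrlimit on RLIMIT_NOFILE. The sketch below is a minimal illustration; the helper names fd_soft_limit and raise_fd_limit are hypothetical and nothing here is taken from Cherokee.

```c
#include <assert.h>
#include <sys/resource.h>

/* Hypothetical helper: current soft limit on open file descriptors,
 * or -1 if it cannot be read. */
static long fd_soft_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    assert(rl.rlim_cur <= rl.rlim_max);  /* soft limit never exceeds hard */
    return (long)rl.rlim_cur;
}

/* Hypothetical helper: raise the soft limit toward `wanted`, capped at the
 * hard limit. Never lowers the limit. Returns 0 on success, -1 on error. */
static int raise_fd_limit(rlim_t wanted)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    if (wanted > rl.rlim_max)
        wanted = rl.rlim_max;      /* an unprivileged process stops here */
    if (wanted <= rl.rlim_cur)
        return 0;                  /* already high enough */
    rl.rlim_cur = wanted;
    return setrlimit(RLIMIT_NOFILE, &rl);
}
```

As the report notes, a higher limit only delays the failure; it does not explain why the same descriptor is added twice.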


Borkason commented Mar 24, 2013

From ste...@konink.de on December 23, 2008 10:57:54
Additional information: when I disable TLS, the bug does not occur.


Borkason commented Mar 24, 2013

From ste...@konink.de on December 23, 2008 13:36:03
==18957== by 0x5280006: start_thread (in /lib64/libpthread-2.9.so)
==18957== by 0x5D2334C: clone (in /lib64/libc-2.9.so)
==18957== Old state: shared-modified by threads #8, #10
==18957== New state: shared-modified by threads #8, #10
==18957== Reason: this thread, #10, holds no consistent locks
==18957== Last consistently used lock for 0x6020B2C was first observed
==18957== at 0x4C275BD: pthread_mutex_init (in /usr/lib64/valgrind/amd64-linux/vgpreload_helgrind.so)
==18957== by 0x506DA3D: cherokee_thread_new (thread.c:205)
==18957== by 0x50682A0: cherokee_server_initialize (server.c:775)
==18957== by 0x4017CF: main (main_worker.c:246)
fdpoll-epoll.c:113: epoll_ctl(14, EPOLL_CTL_ADD, 3): 'File exists'
fdpoll-epoll.c:113: epoll_ctl(14, EPOLL_CTL_ADD, 3): 'File exists'
==18957==


Borkason commented Mar 24, 2013

From ste...@konink.de on January 02, 2009 23:10:51
Has been tackled!

Borkason closed this Mar 24, 2013
