segfault eventmachine-1.0.8/lib/eventmachine.rb:193 #646

Closed
carols10cents opened this issue Oct 15, 2015 · 7 comments

Comments

@carols10cents

Hi, I've seen this a few times now while running our tests on our Jenkins Ubuntu servers, but not consistently. It might be the same as #511, but I'm not sure, since I don't know whether it's because we're opening a lot of connections. Here's what I get:

/usr/local/rvm/gems/ruby-2.1.7@apangea/gems/eventmachine-1.0.8/lib/eventmachine.rb:193: [BUG] Segmentation fault at 0x00000000000018
ruby 2.1.7p400 (2015-08-18 revision 51632) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0009 p:---- s:0041 e:000040 CFUNC  :run_machine
c:0008 p:0307 s:0038 e:000037 METHOD /usr/local/rvm/gems/ruby-2.1.7@apangea/gems/eventmachine-1.0.8/lib/eventmachine.rb:193
c:0007 p:0059 s:0031 E:000560 METHOD /usr/local/rvm/gems/ruby-2.1.7@apangea/gems/thin-1.6.4/lib/thin/backends/base.rb:73
c:0006 p:0111 s:0027 E:000498 METHOD /usr/local/rvm/gems/ruby-2.1.7@apangea/gems/thin-1.6.4/lib/thin/server.rb:162
c:0005 p:0175 s:0024 e:000023 METHOD /usr/local/rvm/gems/ruby-2.1.7@apangea/gems/rack-1.6.4/lib/rack/handler/thin.rb:19
c:0004 p:0049 s:0013 e:000012 BLOCK  /mnt/jenkins/workspace/Apangea-test/test/test_helper.rb:28 [FINISH]
c:0003 p:---- s:0009 e:000008 CFUNC  :call
c:0002 p:0021 s:0004 e:000003 BLOCK  /usr/local/rvm/gems/ruby-2.1.7@apangea/gems/capybara-2.4.4/lib/capybara/server.rb:70 [FINISH]
c:0001 p:---- s:0002 e:000001 TOP    [FINISH]

-- Ruby level backtrace information ----------------------------------------
/usr/local/rvm/gems/ruby-2.1.7@apangea/gems/capybara-2.4.4/lib/capybara/server.rb:70:in `block in boot'
/usr/local/rvm/gems/ruby-2.1.7@apangea/gems/capybara-2.4.4/lib/capybara/server.rb:70:in `call'
/mnt/jenkins/workspace/Apangea-test/test/test_helper.rb:28:in `block in <top (required)>'
/usr/local/rvm/gems/ruby-2.1.7@apangea/gems/rack-1.6.4/lib/rack/handler/thin.rb:19:in `run'
/usr/local/rvm/gems/ruby-2.1.7@apangea/gems/thin-1.6.4/lib/thin/server.rb:162:in `start'
/usr/local/rvm/gems/ruby-2.1.7@apangea/gems/thin-1.6.4/lib/thin/backends/base.rb:73:in `start'
/usr/local/rvm/gems/ruby-2.1.7@apangea/gems/eventmachine-1.0.8/lib/eventmachine.rb:193:in `run'
/usr/local/rvm/gems/ruby-2.1.7@apangea/gems/eventmachine-1.0.8/lib/eventmachine.rb:193:in `run_machine'

-- C level backtrace information -------------------------------------------
Build timed out (after 45 minutes). Marking the build as aborted.

Sooooo no C backtrace. Is there something I can do to enable more useful logging or get other useful information? Let me know and I'm happy to try :)

@khadzhinov

Hi, we're seeing the same problem now:

/home/deploy/apps/enroute/shared/bundle/ruby/2.1.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:193: [BUG] Segmentation fault at 0x00000000000018
ruby 2.1.7p400 (2015-08-18 revision 51632) [x86_64-linux-gnu]

-- Control frame information -----------------------------------------------
c:0011 p:---- s:0041 e:000040 CFUNC :run_machine
c:0010 p:0307 s:0038 e:000037 METHOD /home/deploy/apps/enroute/shared/bundle/ruby/2.1.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:193
c:0009 p:0059 s:0031 E:000af0 METHOD /home/deploy/apps/enroute/shared/bundle/ruby/2.1.0/gems/thin-1.6.3/lib/thin/backends/base.rb:73
c:0008 p:0111 s:0027 E:0007c8 METHOD /home/deploy/apps/enroute/shared/bundle/ruby/2.1.0/gems/thin-1.6.3/lib/thin/server.rb:162
c:0007 p:0451 s:0024 E:000640 METHOD /home/deploy/apps/enroute/shared/bundle/ruby/2.1.0/gems/thin-1.6.3/lib/thin/controllers/controller.rb:87
c:0006 p:0270 s:0020 e:000019 METHOD /home/deploy/apps/enroute/shared/bundle/ruby/2.1.0/gems/thin-1.6.3/lib/thin/runner.rb:200
c:0005 p:0021 s:0015 e:000014 METHOD /home/deploy/apps/enroute/shared/bundle/ruby/2.1.0/gems/thin-1.6.3/lib/thin/runner.rb:156
c:0004 p:0030 s:0012 e:000011 TOP /home/deploy/apps/enroute/shared/bundle/ruby/2.1.0/gems/thin-1.6.3/bin/thin:6 [FINISH]
c:0003 p:---- s:0010 e:000009 CFUNC :load
c:0002 p:0135 s:0006 E:000d48 EVAL /home/deploy/apps/enroute/shared/bundle/ruby/2.1.0/bin/thin:23 [FINISH]
c:0001 p:0000 s:0002 E:001748 TOP [FINISH]

-- Ruby level backtrace information ----------------------------------------
/home/deploy/apps/enroute/shared/bundle/ruby/2.1.0/bin/thin:23:in `<main>'
/home/deploy/apps/enroute/shared/bundle/ruby/2.1.0/bin/thin:23:in `load'
/home/deploy/apps/enroute/shared/bundle/ruby/2.1.0/gems/thin-1.6.3/bin/thin:6:in `<top (required)>'
/home/deploy/apps/enroute/shared/bundle/ruby/2.1.0/gems/thin-1.6.3/lib/thin/runner.rb:156:in `run!'
/home/deploy/apps/enroute/shared/bundle/ruby/2.1.0/gems/thin-1.6.3/lib/thin/runner.rb:200:in `run_command'
/home/deploy/apps/enroute/shared/bundle/ruby/2.1.0/gems/thin-1.6.3/lib/thin/controllers/controller.rb:87:in `start'
/home/deploy/apps/enroute/shared/bundle/ruby/2.1.0/gems/thin-1.6.3/lib/thin/server.rb:162:in `start'
/home/deploy/apps/enroute/shared/bundle/ruby/2.1.0/gems/thin-1.6.3/lib/thin/backends/base.rb:73:in `start'
/home/deploy/apps/enroute/shared/bundle/ruby/2.1.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:193:in `run'
/home/deploy/apps/enroute/shared/bundle/ruby/2.1.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:193:in `run_machine'

@sodabrew
Contributor

Can you run your server with ulimit -c unlimited to allow a core dump? You can then get a C backtrace from the core dump with gdb.
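If changing the shell that launches the server is awkward (for example under Jenkins), the same limit can be raised from inside the Ruby process before the reactor starts. This is only a rough sketch: `enable_core_dumps` is a made-up helper name, and it assumes a Unix-like system where `Process::RLIMIT_CORE` is defined. It raises the soft limit only as far as the hard limit, so it will not help if the hard limit is 0.

```ruby
# Hypothetical helper (not part of EventMachine or Thin).
# Raises the soft core-file size limit to the current hard limit, which is
# the most an unprivileged process is allowed to do. Returns [soft, hard].
def enable_core_dumps
  _soft, hard = Process.getrlimit(Process::RLIMIT_CORE)
  Process.setrlimit(Process::RLIMIT_CORE, hard, hard)
  Process.getrlimit(Process::RLIMIT_CORE)
end

soft, hard = enable_core_dumps
puts "core limit: soft=#{soft} hard=#{hard}"
```

Calling this at the top of something like test_helper.rb, before the server thread boots, should let the kernel write a core file on the next segfault, provided the system's core pattern allows it.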

@khadzhinov

No C level backtrace info...
We also have this:

Writing PID to /home/deploy/apps/enroute/shared/tmp/pids/thin.0.pid
Changing process privilege to deploy:deploy
Thin web server (v1.6.3 codename Protein Powder)
Maximum connections set to 2048
Listening on /home/deploy/apps/enroute/shared/sockets/faye.0.sock, CTRL+C to stop
*** Error in `thin server (/home/deploy/apps/enroute/shared/sockets/faye.0.sock)': realloc(): invalid next size: 0x0000000001087840 ***

We have had this problem since a Linode update; before that, everything worked correctly.

@cbeckr

cbeckr commented Dec 27, 2015

We also encounter this issue when load testing with 1.0.8 (running with Passenger 4.0.46 under nginx).
Here's the post mortem including a C level backtrace:

/srv/www/app/vendor/bundle/ruby/2.1.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:193: [BUG] Segmentation fault at 0x00000000000000
ruby 2.1.6p336 (2015-04-13 revision 50298) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0004 p:---- s:0014 e:000013 CFUNC  :run_machine
c:0003 p:0307 s:0011 e:000010 METHOD /srv/www/app/vendor/bundle/ruby/2.1.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:193
c:0002 p:0013 s:0004 e:000003 BLOCK  /srv/www/app/vendor/bundle/ruby/2.1.0/gems/faye-1.0.3/lib/faye/engines/proxy.rb:14 [FINISH]
c:0001 p:---- s:0002 e:000001 TOP    [FINISH]

-- Ruby level backtrace information ----------------------------------------
/srv/www/app/vendor/bundle/ruby/2.1.0/gems/faye-1.0.3/lib/faye/engines/proxy.rb:14:in `block in ensure_reactor_running!'
/srv/www/app/vendor/bundle/ruby/2.1.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:193:in `run'
/srv/www/app/vendor/bundle/ruby/2.1.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:193:in `run_machine'

-- C level backtrace information -------------------------------------------
/usr/local/lib/libruby.so.2.1(+0x1e2aec) [0x7f4e7646faec] vm_dump.c:690
/usr/local/lib/libruby.so.2.1(+0x77453) [0x7f4e76304453] error.c:312
/usr/local/lib/libruby.so.2.1(rb_bug+0xb3) [0x7f4e763050a3] error.c:339
/usr/local/lib/libruby.so.2.1(+0x15a283) [0x7f4e763e7283] signal.c:824
/lib/x86_64-linux-gnu/libc.so.6(+0x36d40) [0x7f4e75efed40] array.c:2911

Let me know if you require something more verbose.

EDIT:
I enabled core dumps and managed to reproduce the issue several times. The most common trace looks as follows (it may differ slightly from the details posted above, since those were from an earlier occurrence):

#0  0x00007f05cf2f6cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007f05cf2fa218 in __GI_abort () at abort.c:118
#2  0x00007f05cf6fd0a8 in rb_bug (fmt=fmt@entry=0x7f05cf8a765c "Segmentation fault at %p") at error.c:346
#3  0x00007f05cf7df283 in sigsegv (sig=<optimized out>, info=0x7f05adc05670, ctx=<optimized out>) at signal.c:824
#4  <signal handler called>
#5  0x00007f05cf33f12f in _int_free (av=0x7f05ac000020, p=<optimized out>, have_lock=0) at malloc.c:3996
#6  0x00007f05cf723be1 in objspace_xfree (old_size=<optimized out>, ptr=0x7f05ac00ccb0, objspace=0x6349a0) at gc.c:6150
#7  ruby_sized_xfree (size=0, x=0x7f05ac00ccb0) at gc.c:6237
#8  ruby_xfree (x=0x7f05ac00ccb0) at gc.c:6244
#9  0x00007f05cf877642 in rb_fd_term (fds=0x7f05ac00cbd8) at thread.c:3236
#10 0x00007f05ca133fd4 in SelectData_t::~SelectData_t() () from /srv/www/app/vendor/bundle/ruby/2.1.0/extensions/x86_64-linux/2.1.0/eventmachine-1.0.8/rubyeventmachine.so
#11 0x00007f05ca132624 in EventMachine_t::~EventMachine_t() () from /srv/www/app/vendor/bundle/ruby/2.1.0/extensions/x86_64-linux/2.1.0/eventmachine-1.0.8/rubyeventmachine.so
#12 0x00007f05ca132786 in EventMachine_t::~EventMachine_t() () from /srv/www/app/vendor/bundle/ruby/2.1.0/extensions/x86_64-linux/2.1.0/eventmachine-1.0.8/rubyeventmachine.so
#13 0x00007f05ca14dfcb in evma_release_library () from /srv/www/app/vendor/bundle/ruby/2.1.0/extensions/x86_64-linux/2.1.0/eventmachine-1.0.8/rubyeventmachine.so
#14 0x00007f05ca141d18 in t_release_machine(unsigned long) () from /srv/www/app/vendor/bundle/ruby/2.1.0/extensions/x86_64-linux/2.1.0/eventmachine-1.0.8/rubyeventmachine.so
#15 0x00007f05cf85e9d1 in vm_call_cfunc_with_frame (ci=<optimized out>, reg_cfp=0x7f05c3af9f20, th=0x7f05ac5979f0) at vm_insnhelper.c:1510
#16 vm_call_cfunc (ci=<optimized out>, reg_cfp=0x7f05c3af9f20, th=0x7f05ac5979f0) at vm_insnhelper.c:1600
#17 vm_call_method (th=0x7f05ac5979f0, cfp=0x7f05c3af9f20, ci=<optimized out>) at vm_insnhelper.c:1788
#18 0x00007f05cf8559b4 in vm_exec_core (th=th@entry=0x7f05ac5979f0, initial=initial@entry=0) at insns.def:1028
#19 0x00007f05cf85966f in vm_exec (th=th@entry=0x7f05ac5979f0) at vm.c:1398
#20 0x00007f05cf85cb2a in invoke_block_from_c (th=0x7f05ac5979f0, block=<optimized out>, self=<optimized out>, argc=<optimized out>, argv=<optimized out>, 
    blockptr=<optimized out>, cref=0x0, defined_class=75138240) at vm.c:817
#21 0x00007f05cf85d30b in vm_invoke_proc (th=th@entry=0x7f05ac5979f0, proc=proc@entry=0x7f05ad786040, self=75139400, defined_class=75138240, argc=0, argv=0x7f05ad0035d0, 
    blockptr=blockptr@entry=0x0) at vm.c:881
#22 0x00007f05cf85d3ba in rb_vm_invoke_proc (th=th@entry=0x7f05ac5979f0, proc=proc@entry=0x7f05ad786040, argc=<optimized out>, argv=<optimized out>, blockptr=blockptr@entry=0x0)
    at vm.c:900
#23 0x00007f05cf8732b0 in thread_start_func_2 (th=th@entry=0x7f05ac5979f0, stack_start=<optimized out>) at thread.c:535
#24 0x00007f05cf87371b in thread_start_func_1 (th_ptr=0x7f05ac5979f0) at thread_pthread.c:840
#25 0x00007f05cf0aa182 in start_thread (arg=0x7f05c39f9700) at pthread_create.c:312
#26 0x00007f05cf3ba47d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

The related EventMachine code looks sane so I have a feeling that the issue may lie within Ruby's garbage collection. I updated from Ruby 2.1.6 to 2.2.4 since there were several related changes and was able to provoke only one SEGFAULT so far, which didn't point to EventMachine (it was some stray string).
I'll report back if I run into any further issues with the updated Ruby version.

@cbeckr

cbeckr commented Jan 5, 2016

Further thoughts: When EventMachine detects that it was forked, e.g. when using Passenger, release_machine is called to clean up the reactor state within eventmachine.rb (added in #213).
This eventually invokes SelectData_t::~SelectData_t() (added in #586), as seen in the trace above.
rb_fd_term should not cause issues because it only frees memory on the (copied) heap.
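To make the fork-detection idea concrete, here is a toy sketch of the pattern: remember the PID that started the reactor, and treat a different current PID as a sign that we are in a forked child that must drop the inherited state. `ToyReactor` and all of its methods are hypothetical names for illustration only; this is not EventMachine's actual implementation.

```ruby
# Hypothetical stand-in for a reactor; NOT EventMachine's real class.
class ToyReactor
  attr_reader :owner_pid, :released

  def initialize
    @owner_pid = nil
    @released = false
  end

  # Remember which process started the reactor.
  def start(pid = Process.pid)
    @owner_pid = pid
    @released = false
    self
  end

  # True when the reactor was started by a process other than current_pid.
  def forked?(current_pid = Process.pid)
    !@owner_pid.nil? && @owner_pid != current_pid
  end

  # Drop inherited state when running in a forked child.
  # Returns true if state was released, false otherwise.
  def release_if_forked(current_pid = Process.pid)
    return false unless forked?(current_pid)
    @owner_pid = nil
    @released = true
  end
end

reactor = ToyReactor.new.start
puts reactor.forked?                   # checked from the process that started it
puts reactor.forked?(Process.pid + 1)  # simulated child with a different PID
```

The PID is passed in as a parameter here only so the child case can be simulated without actually calling fork. The cleanup step is where the trace above ends up, since releasing the machine runs the destructors on the copied heap.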

@sodabrew
Contributor

Please try the new EventMachine 1.2.3 release; it should resolve this crash!

@zjxpsetp

@sodabrew Hi, we use EventMachine 1.2.3 with Ruby 2.1.9, but we still see the segmentation fault.
