Random segfaults on Ruby 1.9 get worse with Ruby 2 #1746

Closed
bbergstrom opened this Issue Feb 9, 2016 · 1 comment

bbergstrom commented Feb 9, 2016

We run a Rails application and are in the process of upgrading it from Ruby 1.9 to Ruby 2. During this process we noticed segmentation faults in the Passenger log. We are not sure what is causing them and cannot reproduce them except by sending production traffic to the server. The traces that Passenger logs contain a lot of information, but we are having difficulty narrowing down the issue. The crashes occur across a variety of files, gems, extensions and calls without an obvious pattern.

We tried Ruby 2.1, 2.2 and 2.3, and all of them significantly increase the rate of segmentation faults compared with 1.9. We have upgraded all of our gems with C extensions, as well as the passenger gem, to ensure we have all patches. We have tried both smart and direct spawning. Despite all this, the large number of random segfaults persists.

We disabled garbage collection and this significantly reduced the segfaults, but we quickly ran out of memory and can't run in this configuration outside of a testing scenario.
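Since disabling GC reduces the crashes, one way to try reproducing them outside production is to go the other way and make GC run as often as possible around a suspect code path. The following is a minimal sketch, not something from the original report: `with_gc_stress` is a hypothetical helper name, and the workload inside the block is a placeholder to be swapped for a call into the extension under suspicion. `GC.stress` is part of core Ruby and forces a collection at every allocation opportunity, so it is very slow and only suitable for small test cases.

```ruby
# Run a block with GC.stress enabled and restore the previous setting afterwards.
# GC.stress triggers a collection at every allocation opportunity, which tends
# to surface objects that a C extension failed to protect from the collector.
def with_gc_stress
  previous = GC.stress
  GC.stress = true
  yield
ensure
  GC.stress = previous
end

# Placeholder workload (an assumption): replace with the code path under
# suspicion, for example JSON generation or a sort that calls <=>.
result = with_gc_stress do
  Array.new(50) { |i| "item-#{i}" }.sort.first
end
puts result
```

If a small script like this crashes reliably under `GC.stress`, it is much easier to bisect than production traffic.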

Our stack:

  • Rails 3.2
  • Passenger 5.0.24 open source edition
  • Nginx 1.8.0
  • Amazon Linux (CentOS/RHEL compatible)

Our C extension gems:

algorithms-0.5.0
bcrypt-3.1.10
bcrypt-ruby-3.1.2
ffi-1.9.10
hpricot-0.8.6
iconv-1.0.4
image_science-1.3.2.1.Asynchrony
json-1.8.3
kgio-2.10.0
libxml-ruby-2.8.0
mysql2-0.3.20
nokogiri-1.6.7.2
oj-2.14.3
redcarpet-3.3.4
unf_ext-0.0.7.2
yajl-ruby-1.2.1
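A list like the one above can be regenerated after each upgrade with a short script. This is a sketch under the assumption that the gems are visible to RubyGems in the current environment (under Bundler, only the bundled gems will be visible); `native_extension_gems` is a hypothetical helper name.

```ruby
require "rubygems"

# Return the full name ("name-version", plus platform for non-ruby platforms)
# of every gem visible to RubyGems that declares a native extension.
def native_extension_gems
  Gem::Specification
    .select { |spec| !spec.extensions.empty? }
    .map(&:full_name)
    .sort
end

puts native_extension_gems
```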

Sample of segfaults:

$ sudo grep Segmentation -A10 /var/log/nginx/error.log 
App 63995 stderr: Segmentation fault
App 63995 stderr: ruby 1.9.3p551 (2014-11-13 revision 48407) [x86_64-linux]
App 63995 stderr: 
App 63995 stderr: -- Control frame information -----------------------------------------------
App 63995 stderr: c:0133 p:---- s:0770 b:0770 l:000769 d:000769 CFUNC  :<=>
App 63995 stderr: c:0132 p:---- s:0768 b:0768 l:000767 d:000767 CFUNC  :<=>
App 63995 stderr: c:0131 p:0063 s:0764 b:0764 l:000763 d:000763 METHOD /srv/www/ngin/shared/bundle/ruby/1.9.1/gems/ri_cal-0.8.8/lib/ri_cal/fast_date_time.rb:74
App 63995 stderr: c:0130 p:0015 s:0760 b:0760 l:000759 d:000759 METHOD /srv/www/ngin/shared/bundle/ruby/1.9.1/gems/ri_cal-0.8.8/lib/ri_cal/property_value/date_time.rb:190
App 63995 stderr: c:0129 p:0015 s:0756 b:0756 l:000755 d:000755 METHOD /srv/www/ngin/shared/bundle/ruby/1.9.1/gems/ri_cal-0.8.8/lib/ri_cal/property_value/date_time.rb:186
App 63995 stderr: c:0128 p:---- s:0752 b:0752 l:000751 d:000751 FINISH
App 63995 stderr: c:0127 p:---- s:0750 b:0750 
--
App 63995 stderr: /srv/www/ngin/shared/bundle/ruby/1.9.1/gems/arel-3.0.3/lib/arel/visitors/to_sql.rb:131: [BUG] Segmentation fault
App 63995 stderr: 
App 63995 stderr: ruby 1.9.3p551 (2014-11-13 revision 48407) [x86_64-linux]
App 63995 stderr: 
App 63995 stderr: -- Control frame information -----------------------------------------------
App 63995 stderr: c:0261 p:0341 s:1441 b:1432 l:001431 d:001431 METHOD /srv/www/ngin/shared/bundle/ruby/1.9.1/gems/arel-3.0.3/lib/arel/visitors/to_sql.rb:131
App 63995 stderr: c:0260 p:0056 s:1428 b:1428 l:001427 d:001427 METHOD /srv/www/ngin/shared/bundle/ruby/1.9.1/gems/arel-3.0.3/lib/arel/visitors/mysql.rb:41
App 63995 stderr: c:0259 p:0012 s:1424 b:1424 l:001414 d:001423 BLOCK  /srv/www/ngin/shared/bundle/ruby/1.9.1/gems/arel-3.0.3/lib/arel/visitors/to_sql.rb:121
App 63995 stderr: c:0258 p:---- s:1421 b:1421 l:001420 d:001420 FINISH
App 63995 stderr: c:0257 p:---- s:1419 b:1419 l:001418 d:001418 CFUNC  :map
App 63995 stderr: c:0256 p:0048 s:1416 b:1415 l:001414 d:001414 METHOD /srv/www/ngin/shared/bundle/ruby/1.9.1/gems/arel-3.0.3/lib/arel/visitors/to_sql.rb:121

$ sudo grep Segmentation -A10 /var/log/nginx/error.log
App 64108 stderr: /srv/www/ngin/shared/bundle/ruby/1.9.1/gems/json_pure-1.8.3/lib/json/common.rb:223: [BUG] Segmentation fault
App 64108 stderr: ruby 1.9.3p551 (2014-11-13 revision 48407) [x86_64-linux]
App 64108 stderr: 
App 64108 stderr: -- Control frame information -----------------------------------------------
App 64108 stderr: c:0042 p:---- s:0166 b:0166 l:000165 d:000165 CFUNC  :encode
App 64108 stderr: c:0041 p:---- s:0164 b:0164 l:000163 d:000163 CFUNC  :generate
App 64108 stderr: c:0040 p:0170 s:0160 b:0160 l:000159 d:000159 METHOD /srv/www/ngin/shared/bundle/ruby/1.9.1/gems/json_pure-1.8.3/lib/json/common.rb:223
App 64108 stderr: c:0039 p:---- s:0154 b:0154 l:000153 d:000153 FINISH
App 64108 stderr: c:0038 p:---- s:0152 b:0152 l:000151 d:000151 CFUNC  :call
App 64108 stderr: c:0037 p:0017 s:0147 b:0147 l:002600 d:000146 BLOCK  /srv/www/ngin/shared/bundle/ruby/1.9.1/gems/newrelic_rpm-3.14.2.312/lib/new_relic/json_wrapper.rb:26
App 64108 stderr: c:0036 p:---- s:0144 b:0144 l:000143 d:000143 FINISH
--
App 64108 stderr: /srv/www/ngin/shared/bundle/ruby/1.9.1/gems/activesupport-3.2.22.1/lib/active_support/dependencies.rb:251: [BUG] Segmentation fault
App 64108 stderr: 
App 64108 stderr: ruby 1.9.3p551 (2014-11-13 revision 48407) [x86_64-linux]
App 64108 stderr: 
App 64108 stderr: -- Control frame information -----------------------------------------------
App 64108 stderr: c:0128 p:---- s:0763 b:0763 l:000762 d:000762 CFUNC  :require
App 64108 stderr: c:0127 p:0010 s:0759 b:0759 l:000751 d:000758 BLOCK  /srv/www/ngin/shared/bundle/ruby/1.9.1/gems/activesupport-3.2.22.1/lib/active_support/dependencies.rb:251
App 64108 stderr: c:0126 p:0071 s:0757 b:0757 l:000756 d:000756 METHOD /srv/www/ngin/shared/bundle/ruby/1.9.1/gems/activesupport-3.2.22.1/lib/active_support/dependencies.rb:236
App 64108 stderr: c:0125 p:0019 s:0752 b:0752 l:000751 d:000751 METHOD /srv/www/ngin/shared/bundle/ruby/1.9.1/gems/activesupport-3.2.22.1/lib/active_support/dependencies.rb:251
App 64108 stderr: c:0124 p:0013 s:0747 b:0747 l:000746 d:000746 METHOD /srv/www/ngin/shared/bundle/ruby/1.9.1/gems/lograge-0.3.1/lib/lograge/formatters/logstash.rb:13
App 64108 stderr: c:0123 p:0011 s:0744 b:0744 l:000743 d:000743 METHOD /srv/www/ngin/shared/bundle/ruby/1.9.1/gems/lograge-0.3.1/lib/lograge/formatters/logstash.rb:5
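To check whether there really is no pattern, the `[BUG] Segmentation fault` header lines can be tallied per gem. The sketch below assumes the log format shown in the sample above (`App <pid> stderr: <file>:<line>: [BUG] Segmentation fault`); `BUG_LINE` and `tally_segfaults` are hypothetical names, and crashes logged without a file path (like the first one in the sample) are not counted. Keep in mind that the top frame is only where the process touched bad memory, which is not necessarily where the memory was corrupted.

```ruby
# Matches the "[BUG] Segmentation fault" header line printed by Ruby, as it
# appears in the log with the "App <pid> stderr:" prefix.
BUG_LINE = %r{\AApp (\d+) stderr: (\S+):(\d+): \[BUG\] Segmentation fault}

# Count crashes per gem directory (the path component right after ".../gems/").
def tally_segfaults(lines)
  counts = Hash.new(0)
  lines.each do |line|
    match = BUG_LINE.match(line)
    next unless match
    gem_dir = match[2][%r{/gems/([^/]+)/}, 1] || "(not in a gem)"
    counts[gem_dir] += 1
  end
  counts
end

# Lines copied from the sample output above; the last one is not a crash header.
sample = [
  "App 63995 stderr: /srv/www/ngin/shared/bundle/ruby/1.9.1/gems/arel-3.0.3/lib/arel/visitors/to_sql.rb:131: [BUG] Segmentation fault",
  "App 64108 stderr: /srv/www/ngin/shared/bundle/ruby/1.9.1/gems/json_pure-1.8.3/lib/json/common.rb:223: [BUG] Segmentation fault",
  "App 64108 stderr: /srv/www/ngin/shared/bundle/ruby/1.9.1/gems/activesupport-3.2.22.1/lib/active_support/dependencies.rb:251: [BUG] Segmentation fault",
  "App 63995 stderr: ruby 1.9.3p551 (2014-11-13 revision 48407) [x86_64-linux]",
]

tally_segfaults(sample).sort_by { |_, count| -count }.each do |gem_dir, count|
  puts "#{count}  #{gem_dir}"
end
```

For a real log, pass `File.foreach("/var/log/nginx/error.log")` instead of `sample`.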

crash-watch output for an app process that segfaulted:

$ sudo crash-watch 12852
Found gdb at: /usr/bin/gdb
Monitoring PID 12852...
Process exited at 2016-02-08 12:42:04 -0800.
Backtrace:
    Thread 1 (Thread 0x7f957aea9740 (LWP 12852)):
    #0  0x00007f9579b5c150 in _exit () from /lib64/libc.so.6
    No symbol table info available.
    #1  0x00007f9579ad6e2b in __run_exit_handlers () from /lib64/libc.so.6
    No symbol table info available.
    #2  0x00007f9579ad6eb5 in exit () from /lib64/libc.so.6
    No symbol table info available.
    #3  0x00007f9579abfb1c in __libc_start_main () from /lib64/libc.so.6
    No symbol table info available.
    #4  0x0000000000400899 in _start ()
    No symbol table info available.

I'm not sure if this is a Passenger issue, but I wanted to seek advice on how to troubleshoot these crashes. Is there a way to get better information out of crash-watch? Are there parts of the segfault dumps we should focus on to narrow down the issue?

TIA

@bbergstrom bbergstrom changed the title Unreproducable segfaults on Ruby 1.9 get wrose with Ruby 2 Random segfaults on Ruby 1.9 get wrose with Ruby 2 Feb 9, 2016

@bbergstrom bbergstrom changed the title Random segfaults on Ruby 1.9 get wrose with Ruby 2 Random segfaults on Ruby 1.9 get worse with Ruby 2 Feb 9, 2016

Contributor

OnixGH commented Feb 9, 2016

This doesn't look like a Passenger issue. The crash output is coming from the gems/app; Passenger just forwards their stdout/stderr and is not crashing itself (so its crash protection and backtrace logging are not triggered).

You can use our support forum for these kinds of questions.

This post has some information on tracing similar issues. My best guess would be some kind of compiler type or flag issue.
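As a starting point for checking the compiler theory, the settings each Ruby was built with can be printed and compared across the 1.9 and 2.x installs. This is only a sketch: `build_settings` is a hypothetical helper, and the choice of keys is an assumption about which settings are worth comparing. `RbConfig::CONFIG` is part of the standard library and is what `mkmf` consults by default when compiling native extensions.

```ruby
require "rbconfig"

# Collect the compiler and flags recorded when this Ruby was built.
def build_settings
  keys = %w[CC CFLAGS LDFLAGS optflags]
  keys.each_with_object({}) { |key, out| out[key] = RbConfig::CONFIG[key] }
end

puts RUBY_DESCRIPTION
build_settings.each { |key, value| puts "#{key}: #{value}" }
```

Running this under each Ruby version, and comparing against the compiler actually installed on the server, shows whether the interpreter and the extensions could have been built with different toolchains or flags.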

@OnixGH OnixGH closed this Feb 9, 2016
