Segmentation faults and memory corruption from using libxml-ruby with nokogiri #116

Closed
bbergstrom opened this Issue Feb 16, 2016 · 5 comments

bbergstrom commented Feb 16, 2016

We have been experiencing memory corruption in our Rails application, which depends on nokogiri (1.6.7.2) and libxml-ruby (2.8.0). The corruption showed up as seemingly random segmentation faults, with stack traces pointing to nearly every part of the code in our application and its dependencies.

After upgrading or removing nearly every gem with a C extension, we verified that the crashes went away when we removed a gem that depended on libxml-ruby. While investigating libxml-ruby segmentation faults, we came across similar issues: sparklemotion/nokogiri#895, sparklemotion/nokogiri#881, sparklemotion/nokogiri#1364, and #62. That issue was patched some versions ago, but it appears that a similar issue still exists.

With the script below, I can reproduce the crash on Amazon Linux (a RHEL/CentOS-based distro) in roughly three out of four runs, but not on OS X.

$ cat nokogiri-libxml-segfault.rb 
#!/usr/bin/env ruby
gem 'nokogiri', '1.6.7.2'
gem 'libxml-ruby', '2.8.0'
require 'nokogiri'
require 'libxml'

message = "<h2>BOOM!</h2>"
20_000.times do
  node = Nokogiri::HTML::DocumentFragment.parse(message)
  node.add_previous_sibling(Nokogiri::XML::Text.new(' ', node.document))
  node.add_next_sibling(Nokogiri::XML::Text.new(' ', node.document))
end

Here is a sample of the stack traces that result.

$ ./nokogiri-libxml-segfault.rb
/usr/local/lib/ruby/gems/2.2.0/gems/nokogiri-1.6.7.2/lib/nokogiri/html/document_fragment.rb:10: [BUG] Segmentation fault at 0x00000000000070
ruby 2.2.4p230 (2015-12-16 revision 53155) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0006 p:---- s:0021 e:000020 CFUNC  :encoding=
c:0005 p:0062 s:0017 e:000016 METHOD /usr/local/lib/ruby/gems/2.2.0/gems/nokogiri-1.6.7.2/lib/nokogiri/html/document_fragment.rb:10
c:0004 p:0019 s:0011 e:000010 BLOCK  ./nokogiri-libxml-segfault.rb:12 [FINISH]
c:0003 p:---- s:0008 e:000007 CFUNC  :times
c:0002 p:0048 s:0005 E:002568 EVAL   ./nokogiri-libxml-segfault.rb:11 [FINISH]
c:0001 p:0000 s:0002 E:001ba0 TOP    [FINISH]

-- Ruby level backtrace information ----------------------------------------
./nokogiri-libxml-segfault.rb:11:in `<main>'
./nokogiri-libxml-segfault.rb:11:in `times'
./nokogiri-libxml-segfault.rb:12:in `block in <main>'
/usr/local/lib/ruby/gems/2.2.0/gems/nokogiri-1.6.7.2/lib/nokogiri/html/document_fragment.rb:10:in `parse'
/usr/local/lib/ruby/gems/2.2.0/gems/nokogiri-1.6.7.2/lib/nokogiri/html/document_fragment.rb:10:in `encoding='
$ ./nokogiri-libxml-segfault.rb
/usr/local/lib/ruby/gems/2.2.0/gems/nokogiri-1.6.7.2/lib/nokogiri/xml/node.rb:107: [BUG] Segmentation fault at 0x00000000000040
ruby 2.2.4p230 (2015-12-16 revision 53155) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0009 p:---- s:0033 e:000032 CFUNC  :document
c:0008 p:0007 s:0030 e:000029 METHOD /usr/local/lib/ruby/gems/2.2.0/gems/nokogiri-1.6.7.2/lib/nokogiri/xml/node.rb:107 [FINISH]
c:0007 p:---- s:0027 e:000026 CFUNC  :add_next_sibling_node
c:0006 p:0142 s:0023 e:000022 METHOD /usr/local/lib/ruby/gems/2.2.0/gems/nokogiri-1.6.7.2/lib/nokogiri/xml/node.rb:780
c:0005 p:0048 s:0015 e:000014 METHOD /usr/local/lib/ruby/gems/2.2.0/gems/nokogiri-1.6.7.2/lib/nokogiri/xml/node.rb:197
c:0004 p:0072 s:0011 e:000010 BLOCK  ./nokogiri-libxml-segfault.rb:14 [FINISH]
c:0003 p:---- s:0008 e:000007 CFUNC  :times
c:0002 p:0048 s:0005 E:000b78 EVAL   ./nokogiri-libxml-segfault.rb:11 [FINISH]
c:0001 p:0000 s:0002 E:000160 TOP    [FINISH]

-- Ruby level backtrace information ----------------------------------------
./nokogiri-libxml-segfault.rb:11:in `<main>'
./nokogiri-libxml-segfault.rb:11:in `times'
./nokogiri-libxml-segfault.rb:14:in `block in <main>'
/usr/local/lib/ruby/gems/2.2.0/gems/nokogiri-1.6.7.2/lib/nokogiri/xml/node.rb:197:in `add_next_sibling'
/usr/local/lib/ruby/gems/2.2.0/gems/nokogiri-1.6.7.2/lib/nokogiri/xml/node.rb:780:in `add_sibling'
/usr/local/lib/ruby/gems/2.2.0/gems/nokogiri-1.6.7.2/lib/nokogiri/xml/node.rb:780:in `add_next_sibling_node'
/usr/local/lib/ruby/gems/2.2.0/gems/nokogiri-1.6.7.2/lib/nokogiri/xml/node.rb:107:in `decorate!'
/usr/local/lib/ruby/gems/2.2.0/gems/nokogiri-1.6.7.2/lib/nokogiri/xml/node.rb:107:in `document'
$ ./nokogiri-libxml-segfault.rb
/usr/local/lib/ruby/gems/2.2.0/gems/nokogiri-1.6.7.2/lib/nokogiri/xml/node.rb:780: [BUG] Segmentation fault at 0x00000000000040
ruby 2.2.4p230 (2015-12-16 revision 53155) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0007 p:---- s:0027 e:000026 CFUNC  :add_next_sibling_node
c:0006 p:0142 s:0023 e:000022 METHOD /usr/local/lib/ruby/gems/2.2.0/gems/nokogiri-1.6.7.2/lib/nokogiri/xml/node.rb:780
c:0005 p:0048 s:0015 e:000014 METHOD /usr/local/lib/ruby/gems/2.2.0/gems/nokogiri-1.6.7.2/lib/nokogiri/xml/node.rb:197
c:0004 p:0072 s:0011 e:000010 BLOCK  ./nokogiri-libxml-segfault.rb:14 [FINISH]
c:0003 p:---- s:0008 e:000007 CFUNC  :times
c:0002 p:0048 s:0005 E:0014c8 EVAL   ./nokogiri-libxml-segfault.rb:11 [FINISH]
c:0001 p:0000 s:0002 E:0009c0 TOP    [FINISH]

-- Ruby level backtrace information ----------------------------------------
./nokogiri-libxml-segfault.rb:11:in `<main>'
./nokogiri-libxml-segfault.rb:11:in `times'
./nokogiri-libxml-segfault.rb:14:in `block in <main>'
/usr/local/lib/ruby/gems/2.2.0/gems/nokogiri-1.6.7.2/lib/nokogiri/xml/node.rb:197:in `add_next_sibling'
/usr/local/lib/ruby/gems/2.2.0/gems/nokogiri-1.6.7.2/lib/nokogiri/xml/node.rb:780:in `add_sibling'
/usr/local/lib/ruby/gems/2.2.0/gems/nokogiri-1.6.7.2/lib/nokogiri/xml/node.rb:780:in `add_next_sibling_node'
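
Because the crash is intermittent, it can help to measure how often it happens before and after changing gem versions. The following is a minimal sketch, not part of the original report: `crash_count` is a hypothetical helper name, and it assumes that any non-successful exit status (including death by signal, which is how a Ruby `[BUG]` abort normally ends) counts as a crash.

```ruby
require 'rbconfig'

# Run `command` (an argv array) `runs` times and count abnormal exits.
# Assumption: every run that does not exit successfully is treated as a
# crash, whether it exited non-zero or was terminated by a signal.
def crash_count(command, runs)
  runs.times.count do
    # Discard the child's output; only the exit status matters here.
    system(*command, out: File::NULL, err: File::NULL)
    # success? is nil for a signaled process, so negating covers both cases.
    !$?.success?
  end
end

# Example (script name taken from the report above):
#   crashes = crash_count([RbConfig.ruby, './nokogiri-libxml-segfault.rb'], 20)
#   puts "#{crashes}/20 runs crashed"
```

A larger number of runs gives a steadier estimate, since a single clean run does not prove the problem is gone.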

We hope to eventually remove our dependency on libxml-ruby as we use nokogiri in our codebase, but a required dependency currently forces libxml-ruby into our project as well. A patch would be great for compatibility and for anyone else that may encounter this convoluted issue. We had to spend a lot of time troubleshooting this issue as none of the segmentation faults that happened in our systems pointed to nokogiri or libxml-ruby.
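
For anyone trying to find which installed gem pulls libxml-ruby into a project, something along these lines may help. This is only a sketch using the standard RubyGems API; `runtime_dependents_of` is a hypothetical helper name, and it only sees gems visible to the current Ruby (under Bundler, that means the current bundle).

```ruby
require 'rubygems'

# Return the sorted names of gems in `specs` that declare a runtime
# dependency on `gem_name`. `specs` defaults to all gems RubyGems can see.
def runtime_dependents_of(gem_name, specs = Gem::Specification)
  specs.select { |spec| spec.runtime_dependencies.any? { |dep| dep.name == gem_name } }
       .map(&:name)
       .uniq
       .sort
end

# Example:
#   puts runtime_dependents_of('libxml-ruby')
```

Only direct dependencies are reported; a gem that reaches libxml-ruby through another gem would need a second lookup on the intermediate name.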

TIA

bbergstrom commented Feb 16, 2016

Related nokogiri issue: sparklemotion/nokogiri#1426

stoivo commented Apr 11, 2016

I had the same issue, and it took quite a journey to find a solution.

It started with our Passenger worker jamming and not responding. After a lot of logging and testing, we discovered that Passenger jams while we are parsing XML, returns a 502 status code, and leaves a segmentation fault in the logs. We contacted the Passenger development team, and with their help we concluded that libxml-ruby was the issue. We rewrote our code to use only nokogiri, and now it works.

Special thanks to the Passenger development team.

Note: I wrote this to help other people with a similar issue find this solution faster.

Keywords:

libxml-ruby, nokogiri, passenger worker jams, xml parsing

jacobbednarz commented May 18, 2016

@bbergstrom Just in case you're still looking for some reprieve: an application I work on is experiencing issues extremely similar to the ones you describe here (segmentation faults, nil:NilClass and false:FalseClass errors). We're currently evaluating a libxml-ruby patch in our production environment, and it has taken our segmentation faults from 25-ish an hour to zero. We are also using a combination of Nokogiri and libxml-ruby.

Once we finish evaluating the patch I'll be sure to push it upstream and cc you on the pull request to test it out.

jacobbednarz commented May 18, 2016

The patch we are evaluating (and it is looking pretty good so far!) is up at #118.

abrasive added a commit to abrasive/libxml-ruby that referenced this issue May 25, 2016

Store libxml -> Ruby object mappings in a hashtable
Maintain a hashtable of mappings from libxml objects to Ruby objects by
address. This replaces the mechanism of using the _private pointer in
the libxml objects.

As libxml-ruby registers a global callback with libxml for node
deletions, when a node's _private field was not NULL, libxml-ruby
would reach through the pointer and clobber fields.
This interacts badly with other consumers of the underlying libxml
library, such as nokogiri, which use the _private field for their own
objects.

Addresses Github issue xml4r#116.
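
The conflict this commit message describes can be illustrated with a toy model. The sketch below is plain Ruby, not the actual C code from libxml-ruby or nokogiri; every class and field name in it (`FakeNode`, `PrivatePointerBinding`, `RegistryBinding`, `ForeignWrapper`) is hypothetical and exists only to show why a shared `_private` slot is fragile and why an address-keyed table avoids the problem.

```ruby
# Stand-in for a libxml node: an address plus the single shared _private slot.
FakeNode = Struct.new(:address, :_private)

# Stand-in for another library's wrapper object stored in _private.
ForeignWrapper = Struct.new(:node)

# Old scheme, as the commit message describes it: on node deletion, follow
# _private whenever it is set, assuming it points at our own wrapper.
class PrivatePointerBinding
  Wrapper = Struct.new(:node)

  def wrap(node)
    node._private = Wrapper.new(node)
  end

  def on_node_free(node)
    return if node._private.nil?
    node._private.node = nil # clobbers whatever object is stored there
  end
end

# New scheme: keep a private address => wrapper table, never touch _private.
class RegistryBinding
  Wrapper = Struct.new(:node)

  def initialize
    @wrappers = {}
  end

  def wrap(node)
    @wrappers[node.address] = Wrapper.new(node)
  end

  def on_node_free(node)
    wrapper = @wrappers.delete(node.address)
    wrapper.node = nil if wrapper # only nodes we registered are affected
  end
end
```

In this model, freeing a node whose `_private` holds a `ForeignWrapper` silently resets that wrapper's field under the old scheme, while the registry-based scheme leaves it alone because the node's address was never registered.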

cfis (Member) commented May 28, 2016

Fixed in #119.
