Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intermittent segfault on Nokogiri 1.13.6 -> 1.13.7 when processing Tempfile #105

Closed
bf4 opened this issue Jul 19, 2022 · 10 comments
Closed

Comments

@bf4
Copy link

bf4 commented Jul 19, 2022

I've isolated this to creek (2.5.3, rails 7.0.3, cruby 3.0.3) but haven't figured out exactly why..

we store some xlsx files inline as bytea columns and load them into the Creek gem when processing

      fh = new_file_handler(file)
      book = load_file(fh)

where new_file_handler is

    fh = Tempfile.new(["creek-file", ".xlsx"], encoding: file.encoding)
    fh.write(file)
    fh.close
    fh

and load_file is

    path = fh.path
    Creek::Book.new(path, check_file_extension: false)

I thought the problem might be the fh.close before reading the file via path could change the behavior with the new nokogiri code, and the resolution would be to have an ensure block with book&.close when all done.

I tried that and I tried swapping out fh.write(file) with File.binwrite(fh.path, file) like Creek does for remote files, but that didn't resolve the seg faults. Only rolling back Nokogiri seems to have resolved them.

In any case, leaving this issue here in case any one else notices it.

@d1ceward
Copy link

I have identified the same problem on the Nokogiri::XML::Reader block from sheet.rb since the 1.13.7 update of Nokogiri.

I think it's from Nokogiri because ironically their last change note for 1.13.7 talks about solving a possible segfault on XML::Node with low-level ruby and garbage collector.

My current solution is the same, rollback nokogiri with :

gem 'nokogiri', '= 1.13.6'

Interesting parts of my segfault backtrace

./home/d1ceward/.asdf/installs/ruby/3.1.2/lib/ruby/gems/3.1.0/gems/nokogiri-1.13.7-x86_64-linux/lib/nokogiri/xml/reader.rb:90: [BUG] Segmentation fault at 0x0000000100000004
ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0073 p:---- s:0371 e:000370 CFUNC  :attribute_nodes
c:0072 p:0003 s:0367 e:000366 METHOD /home/d1ceward/.asdf/installs/ruby/3.1.2/lib/ruby/gems/3.1.0/gems/nokogiri-1.13.7-x86_64-linux/lib/nokogiri/xml/reader.rb:90
c:0071 p:0207 s:0361 e:000360 BLOCK  /home/d1ceward/.asdf/installs/ruby/3.1.2/lib/ruby/gems/3.1.0/bundler/gems/creek-d63c47fca129/lib/creek/sheet.rb:121
c:0070 p:0010 s:0356 e:000355 METHOD /home/d1ceward/.asdf/installs/ruby/3.1.2/lib/ruby/gems/3.1.0/gems/nokogiri-1.13.7-x86_64-linux/lib/nokogiri/xml/reader.rb:102
c:0069 p:0024 s:0351 e:000350 BLOCK  /home/d1ceward/.asdf/installs/ruby/3.1.2/lib/ruby/gems/3.1.0/bundler/gems/creek-d63c47fca129/lib/creek/sheet.rb:100
c:0068 p:0218 s:0347 e:000346 METHOD /home/d1ceward/.asdf/installs/ruby/3.1.2/lib/ruby/gems/3.1.0/gems/rubyzip-2.3.2/lib/zip/entry.rb:540
c:0067 p:0011 s:0339 e:000338 METHOD /home/d1ceward/.asdf/installs/ruby/3.1.2/lib/ruby/gems/3.1.0/gems/rubyzip-2.3.2/lib/zip/file.rb:259
c:0066 p:0014 s:0333 e:000332 METHOD /home/d1ceward/.asdf/installs/ruby/3.1.2/lib/ruby/gems/3.1.0/gems/rubyzip-2.3.2/lib/zip/filesystem.rb:580
c:0065 p:0085 s:0327 e:000326 METHOD /home/d1ceward/.asdf/installs/ruby/3.1.2/lib/ruby/gems/3.1.0/gems/rubyzip-2.3.2/lib/zip/filesystem.rb:245
c:0064 p:0032 s:0319 e:000318 BLOCK  /home/d1ceward/.asdf/installs/ruby/3.1.2/lib/ruby/gems/3.1.0/bundler/gems/creek-d63c47fca129/lib/creek/sheet.rb:99 [FINISH]
c:0063 p:---- s:0310 E:001130 CFUNC  :each
c:0062 p:---- s:0307 E:001ad0 CFUNC  :each_with_index

@flavorjones
Copy link

👀 Nokogiri maintainer here, I'll try to reproduce this. Any help you can provide would be great.

@flavorjones
Copy link

I could really use a reliable repro if either of you can provide one. Thanks.

@flavorjones
Copy link

Please open an issue upstream at https://github.com/sparklemotion/nokogiri with the information requested about your system and any kind of a repro you might have.

I know nothing about Creek so any help you can provide will help us diagnose what's going on faster.

@bf4
Copy link
Author

bf4 commented Jul 19, 2022

@flavorjones Thanks for looking ✨ I'll share what I can. I haven't been able to reproduce it outside of the test suite yet. I have a test which triggers is .. some times.. but at least it's a specific test

@flavorjones
Copy link

👍 even a "sometimes" repro is good enough with most memory-related issues, because I can always run it in a loop.

@flavorjones
Copy link

Please follow along upstream at sparklemotion/nokogiri#2598

@flavorjones
Copy link

flavorjones commented Jul 21, 2022

Side note: I have a PR up that adds creek to the set of downstream gems that Nokogiri integration-tests against in CI.

See https://github.com/sparklemotion/nokogiri/runs/7454887315?check_suite_focus=true for an example.

@flavorjones
Copy link

FWIW this is addressed in Nokogiri v1.13.8 and later, and can probably be closed.

@bf4
Copy link
Author

bf4 commented Oct 16, 2022

Closing as resolved by new release

@bf4 bf4 closed this as completed Oct 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants