| Description: | Nokogiri (鋸) is an HTML, XML, SAX, and Reader parser with XPath and CSS selector support. |
-
raise(RuntimeError, "Nokogiri requires JRuby 1.4.0RC1 or later on Windows") if JRUBY_VERSION < "1.4.0RC1"
fails if my version is 1.4.0
reported by bhauff
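The failure reported here is consistent with how Ruby compares plain strings: "1.4.0" is a prefix of "1.4.0RC1", so it sorts before it and the guard raises on the final release. A minimal sketch of a version-aware check follows, assuming Gem::Version from RubyGems is available; the helper name jruby_at_least? is hypothetical and is not taken from the Nokogiri source.

```ruby
require "rubygems"

# Plain String comparison is lexicographic, so the final release "1.4.0"
# sorts before its own release candidate "1.4.0RC1".
string_check_fails = "1.4.0" < "1.4.0RC1"
puts string_check_fails

# Hypothetical helper: Gem::Version treats a version containing letters as a
# prerelease, so 1.4.0RC1 is ordered before 1.4.0.
def jruby_at_least?(current, minimum)
  Gem::Version.new(current) >= Gem::Version.new(minimum)
end

puts jruby_at_least?("1.4.0", "1.4.0RC1")
puts jruby_at_least?("1.3.1", "1.4.0RC1")
```

With a check like this, the guard would raise only when jruby_at_least?(JRUBY_VERSION, "1.4.0RC1") is false.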
Comments
-
Ancestor search doesn't work with a css query.
1 comment Created 9 days ago by chriseppstein
Labels: 1.4.1, tenderlove
See this script for an example: http://gist.github.com/227429
Comments
tenderlove
Thu Nov 05 19:46:04 -0800 2009
Node#matches? works in nodes contained by a DocumentFragment. closed by d41db1a
-
Add the ":self" pseudo selector
1 comment Created 9 days ago by tenderlove
Labels: 1.4.1, tenderlove
Add the ":self" pseudo selector so that people can have CSS expressions like this:
":self > foo"
which would be equivalent to this:
"./foo"
Comments
tenderlove
Mon Nov 09 15:37:11 -0800 2009
Fixed in 55fbf25
-
Nokogiri::HTML::DocumentFragment#parse regression in how UTF8 is handled
2 comments Created 15 days ago by naofumi
I'm comparing 1.3.3 on RubyForge with commit 7fbf262 (Oct. 29 09:05:43 2009 -0700).
I'm on Ruby 1.8.7: ruby 1.8.7 (2009-06-12 patchlevel 174) [i686-darwin10]
LibXML2 is 2.7.3.
With 1.3.3, DocumentFragment works OK with UTF8 strings.
puts Nokogiri::HTML::DocumentFragment.parse(%Q(<body>こんにちは</body>)).to_html(:encoding => 'UTF-8')
=> <body>こんにちは</body>
However, with the latest GitHub version, the result is
=> <body>ããã«ã¡ã¯</body>
Interestingly, if I add a <meta charset> tag, the latest GitHub version recognizes it and processes UTF8 strings correctly. The following is an example.
puts Nokogiri::HTML::DocumentFragment.parse(%Q(<meta content="text/html; charset=UTF8" http-equiv="content-type">\n<body>こんにちは</body>)).to_html(:encoding => 'UTF-8')
=> <meta content="text/html; charset=UTF8" http-equiv="content-type"><body>こんにちは</body>
I looked into the source, but I couldn't get any closer to the cause.
Comments
tenderlove
Fri Oct 30 17:45:51 -0700 2009
Thank you for sending the bug report.
I've fixed it!
Thanks for using Nokogiri.
tenderlove
Fri Oct 30 17:46:44 -0700 2009
Fixed in 7a77846
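The garbled output in the report above looks like UTF-8 bytes being read as a single-byte encoding. This is an assumption based only on the shape of the output, not a confirmed description of the regression or of the fix in 7a77846. A minimal sketch using only Ruby's String encoding methods (Ruby 1.9 or later) reproduces the same visible characters:

```ruby
original = "こんにちは"

# Reinterpret the UTF-8 bytes as ISO-8859-1, then transcode to UTF-8 so the
# misread characters can be displayed.
garbled = original.dup.force_encoding("ISO-8859-1").encode("UTF-8")

# Most terminals and web pages drop the C1 control characters (U+0080..U+009F),
# which leaves only the printable part of the mojibake.
visible = garbled.gsub(/[\u0080-\u009f]/, "")
puts visible

# The damage is reversible as long as no bytes were lost along the way.
restored = garbled.encode("ISO-8859-1").force_encoding("UTF-8")
puts restored
```

If the printable part matches what you see, the input was most likely decoded with the wrong charset rather than corrupted.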
-
Segmentation fault when creating new node with 'Self' reference
1 comment Created 16 days ago by jhingo
I get a [BUG] Segmentation Fault with the following code:
class XMLDocument < Nokogiri::XML::Document
  def initialize()
    super()
    body_node = Nokogiri::XML::Node.new("body",self)
    body_node.content = "stuff"
    self.root = body_node
  end
end
If I execute the following 'xmldoc=XMLDocument.new()' the seg fault occurs on the line:
body_node = Nokogiri::XML::Node.new("body",self)
If I create a separate 'tempdoc=Nokogiri::XML::Document.new' in the initialize() and change the references from 'self' to 'tempdoc' in the rest of the code then there's no fault. So the fault is being triggered by the 'self' reference.
Running
Windows XP
ruby 1.8.6
nokogiri 1.3.3
libxml 2.7.3
I'm a newbie so apologies if I'm doing anything moronic.
Cheers!
Comments
tenderlove
Thu Oct 29 09:05:48 -0700 2009
- ext/nokogiri/xml_document.c (Nokogiri_wrap_xml_document) init called after the tuple is set up. closed by 7fbf262
-
Fragment nodes with namespaces should work properly
1 comment Created 18 days ago by flavorjones
Reported by Iñaki Baz Castillo on the mailing list.
Creating a fragment with a namespace makes the prefix part of the tag name, and (arbitrarily?) uses the namespace of the document root's first child.
Comments
flavorjones
Tue Oct 27 07:58:44 -0700 2009
Closed by 597195f
-
NodeSet.wrap does not preserve document structure
2 comments Created 25 days ago by flavorjones
Failing spec:
def test_wrap_preserves_document_structure
  assert_equal "employeeId", @xml.at_xpath("//employee").children.detect{|j| ! j.text? }.name
  @xml.xpath("//employeeId[text()='EMP0001']").wrap("<wrapper/>")
  assert_equal "wrapper", @xml.at_xpath("//employee").children.detect{|j| ! j.text? }.name
end
Comments
flavorjones
Mon Oct 19 20:10:58 -0700 2009
NodeSet.wrap now preserves document structure. closed by f7388be.
flavorjones
Mon Oct 19 20:12:12 -0700 2009
and 2d3db36
-
Would also be nice to see a :new_lines => false option at some point, so you get back to back elements, no extra spacing. Making the builder object searchable would be awesome too. e.g.
builder.at('//root').to_xml(:indent => 0, :new_lines => false)
or better yet:
builder.root.to_xml(:spacing => false)
Comments
tenderlove
Thu Oct 29 09:06:37 -0700 2009
This was merged in with fd6fc58
Right, but the :new_lines and :spacing options don't exist yet. I was meaning someone add code for these options so we can make the resulting xml completely compact (no indents, no new lines etc).
tenderlove
Thu Oct 29 13:47:11 -0700 2009
ah, oops. I thought this was the same thing.
The document on the builder is searchable. You can do:
builder.doc.at('//root')
In fact, Builder#doc just returns a Nokogiri::XML::Document that you can manipulate as you normally would.
Awesome. So all that's left is a way to remove new lines. At the moment we do
.gsub(/\n/, '')
But any content with new lines gets compacted too.
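Until an option like :new_lines exists, one narrower workaround is to remove only the whitespace runs that contain a newline and sit directly between two tags. The following is a minimal sketch in plain Ruby. The helper name compact_xml is hypothetical, and the regular expression does not understand CDATA sections, comments, or text that happens to end in ">", so treat it as a stopgap rather than a serializer.

```ruby
# Hypothetical helper: drop whitespace runs containing a newline when they sit
# between the end of one tag and the start of the next. Newlines inside text
# content are left alone.
def compact_xml(xml)
  xml.gsub(/>\s*\n\s*</, "><")
end

formatted = "<root>\n<test>hey\nthere</test>\n</root>"

# The blanket approach also joins the two words in the text node.
puts formatted.gsub(/\n/, "").inspect

# The narrower pattern keeps the newline inside <test>.
puts compact_xml(formatted).inspect
```</root>"
raise "compact_xml should keep text newlines" unless compact_xml(formatted) == "<root><test>hey\nthere</test></root>"
raise "compact_xml should leave compact input alone" unless compact_xml("<a><b>x</b></a>") == "<a><b>x</b></a>"
</test>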
>> builder = Nokogiri::XML::Builder.new
>> builder.root { |xml| xml.test('hey') }
>> builder.doc.root.to_xml(:indent => 0)
=> "<root>\n<test>hey</test>\n</root>"
Is there a way to do this?
>> builder = Nokogiri::XML::Builder.new
>> builder.root { |xml| xml.test('hey') }
>> builder.doc.root.to_xml(:indent => 0, :new_lines => false)
=> "<root><test>hey</test></root>"
-
Nokogiri::XML ignores my set encoding
2 comments Created about 1 month ago by paranormal
Labels: 1.4.1, tenderlove
Reproduce
Nokogiri::XML(open('http://www.cite-sciences.fr/rss/ressources/fr/faq_fr_20.xml').read, nil, 'UTF-8', 18543)
Nokogiri::XML::SyntaxError: Unsupported encoding UTF-85
This XML is not valid because it declares <?xml version="1.0" encoding="UTF-85">, but the encoding is set explicitly in the call.
Thanks for nokogiri ^-^.
Comments
paranormal
Mon Nov 02 00:50:22 -0800 2009
This is important for me, because my program is a web robot that detects the encoding and converts to UTF-8 before Nokogiri does its work.
I think Nokogiri should ignore the encoding in the XML declaration if one is passed at initialization.
tenderlove
Mon Nov 09 17:47:05 -0800 2009
-
Anchor tags tightly wrapping another element generate unwanted whitespace on #to_xhtml
3 comments Created about 1 month ago by dasil003
This is just weird. Observe the reduced test cases:
s = '<a><b>see</b></a>'
n = Nokogiri::HTML::DocumentFragment.parse(s)
n.to_xhtml
=> "<a>\n <b>see</b>\n</a>"
to_html works:
s = '<a><b>see</b></a>'
n = Nokogiri::HTML::DocumentFragment.parse(s)
n.to_html
=> "<a><b>see</b></a>"
as does adding a text node in the source:
s = '<a> <b>see</b></a>'
n = Nokogiri::HTML::DocumentFragment.parse(s)
n.to_xhtml
=> "<a> <b>see</b></a>"
Comments
BTW, I just discovered it also affects the OBJECT tag.
tenderlove
Wed Oct 14 15:06:55 -0700 2009
I don't think this is a bug. The default save options for to_xhtml say to format or "pretty print" the document. If your document contains space nodes, it will preserve them. If there are no blank nodes in the document, it will add them to make the output formatted.
If you don't want formatting, you can change the to_xhtml save options:
s = '<a><b>see</b></a>'
n = Nokogiri::HTML::DocumentFragment.parse(s)
puts n.to_xhtml(:save_with => Nokogiri::XML::Node::SaveOptions::AS_XHTML)
-
Segmentation Fault on modified re-raise
1 comment Created about 1 month ago by Empact
Labels: 1.4.0
xml = Nokogiri::XML('<xml />')
begin
  xml.xpath('http://')
rescue Nokogiri::XML::XPath::SyntaxError => e
  raise e, "howdy"
end
results in:
[BUG] Segmentation fault
ruby 1.8.7 (2009-06-12 patchlevel 174) [i686-darwin10]
Abort trap
for
---
warnings: []
libxml:
  loaded: 2.7.5
  binding: extension
  compiled: 2.7.5
nokogiri: 1.3.3
Comments
tenderlove
Tue Oct 13 20:49:06 -0700 2009
duplicating errors works. yay! closed by 33922d7
-
This segfaults, if you look at it funny out of the corner of your eye:
require "nokogiri"

class TextHandler < Nokogiri::XML::SAX::Document
  def initialize
    @chunks = []
  end

  attr_reader :chunks

  def cdata_block(string)
    characters(string)
  end

  def characters(string)
    @chunks << string.strip if string.strip != ""
  end
end

th = TextHandler.new
parser = Nokogiri::XML::SAX::Parser.new(th)
parser.parse(<<-XML)
<?xml version="1.0" encoding="utf-8"?>
<root>
  <stuff> one </stuff>
  <stuff> two </stuff>
</root>
XML
I was able to duplicate consistently for a while, but I uninstalled and reinstalled nokogiri a few times, and now it works. It would reach the end of the document before segfaulting. The end_document event would fire, and then it would segfault shortly thereafter.
Comments
sporkmonger
Sat Oct 10 21:10:51 -0700 2009
Looks like 2.7.3.
tenderlove
Mon Oct 12 17:54:21 -0700 2009
If you were on libxml2, 2.6.16, then I wouldn't be surprised. That version was very old and unstable.
I can't repro this (even with thousands of iterations), so I will assume it's a bug with 2.6.16. I am going to close this, but if you are able to repro with 2.7.3, please reopen this ticket. Thanks!
sporkmonger
Mon Oct 12 17:59:47 -0700 2009
Shouldn't require thousands of iterations, it's a happens-every-time kind of bug. However, I may have been wrong about the version of nokogiri I was using. It might have been edge.
-
Adding a Document to a Node causes segfault on program exit
1 comment Created about 1 month ago by david
$ ruby -rubygems -rnokogiri -e 'Nokogiri::XML("").root << Nokogiri::XML::Document.new'
: [BUG] Segmentation fault
ruby 1.9.1p243 (2009-07-16 revision 24175) [i486-linux]
-- control frame ----------
c:0001 p:0000 s:0002 b:0002 l:0011a4 d:0011a4 TOP
-- Ruby level backtrace information-----------------------------------------
-- C level backtrace information -------------------------------------------
0xb76cd6e9 /usr/lib/libruby-1.9.1.so.1.9(rb_vm_bugreport+0x69) [0xb76cd6e9]
0xb75e907f /usr/lib/libruby-1.9.1.so.1.9 [0xb75e907f]
0xb75e911a /usr/lib/libruby-1.9.1.so.1.9(rb_bug+0x3a) [0xb75e911a]
0xb7674fa4 /usr/lib/libruby-1.9.1.so.1.9 [0xb7674fa4]
0xb7768410 [0xb7768410]
0xb767b817 /usr/lib/libruby-1.9.1.so.1.9(st_foreach+0x17) [0xb767b817]
0xb72d0aa9 /var/lib/gems/1.9.1/gems/nokogiri-1.3.3/lib/nokogiri/nokogiri.so [0xb72d0aa9]
0xb75f954d /usr/lib/libruby-1.9.1.so.1.9 [0xb75f954d]
0xb75f96e4 /usr/lib/libruby-1.9.1.so.1.9 [0xb75f96e4]
0xb75f98dc /usr/lib/libruby-1.9.1.so.1.9(rb_gc_call_finalizer_at_exit+0x17c) [0xb75f98dc]
0xb75eb0ee /usr/lib/libruby-1.9.1.so.1.9 [0xb75eb0ee]
0xb75ec436 /usr/lib/libruby-1.9.1.so.1.9(ruby_cleanup+0x116) [0xb75ec436]
0xb75ec5ee /usr/lib/libruby-1.9.1.so.1.9(ruby_run_node+0x5e) [0xb75ec5ee]
0x80487e8 ruby(main+0x68) [0x80487e8]
0xb73dfb56 /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe6) [0xb73dfb56]
0x80486e1 ruby [0x80486e1]
[NOTE] You may encounter a bug of Ruby interpreter. Bug reports are welcome.
For details: http://www.ruby-lang.org/bugreport.html
^CAborted (core dumped)
-- SNIP --
This is ruby 1.9.1, but the same thing happened to me on 1.8.7.
The reason I think this is significant is that I was mistakenly adding a Document to a Node in my code, and that kept failing without me understanding why, so the same thing may happen to other people.
Thanks.
Comments
tenderlove
Tue Oct 13 11:17:11 -0700 2009
raising an exception if someone tries to reparent a *::Document. closed by c557764
-
Hi guys,
I get the following error message while downloading an XML file and opening it using nokogiri:
res = Net::HTTP.post_form(URI.parse(....), {...})
doc = Nokogiri::XML(Nokogiri::XML(res.body).xpath("//text()").to_s.gsub("&lt;", "<").gsub("&gt;", ">"))
I have installed the latest nightly on OS X 10.5.6.
/Library/Ruby/Gems/1.8/gems/nokogiri-1.3.3.20091004000018/lib/nokogiri/xml/document.rb:33: [BUG] Segmentation fault
ruby 1.8.7 (2008-08-11 patchlevel 72) [universal-darwin10.0]
Abort trap
I have also tried to split the constructor calls:
doc = Nokogiri::XML(res.body).xpath("//text()").to_s.gsub("&lt;", "<").gsub("&gt;", ">")
I have about 10 different XML files, and it crashes randomly on the different files, so I can't say that it's one specific file.
The XML files vary in size from 3mb to 150mb.
The files are very basic XML:
<?xml version="1.0" encoding="utf-8"?>
<string>....</string>
where the string element contains escaped XML. Unfortunately the XML files are data we receive from an external vendor, so not really anything we can do about that.
I can try to normalize the data using gsub and then use nokogiri on the xmlified data.
Regards
William
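For the normalization step mentioned above, a single-pass unescape avoids the ordering problems that chained gsub calls can run into with doubly escaped text. This is a minimal sketch that uses CGI.unescapeHTML from Ruby's standard library. The sample payload is made up for illustration, and it assumes the vendor data uses only the standard XML entities.

```ruby
require "cgi"

# Made-up payload standing in for the text content of the <string> element.
escaped_payload = "&lt;item id=&quot;1&quot;&gt;a &amp;amp; b&lt;/item&gt;"

# Each entity is decoded exactly once, so "&amp;amp;" becomes "&amp;" and
# stays valid inside the recovered XML.
inner_xml = CGI.unescapeHTML(escaped_payload)
puts inner_xml
```

The decoded string can then be handed to a second parse, which is the step the report says crashes intermittently.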
Comments
tenderlove
Tue Oct 06 08:49:35 -0700 2009
Can you run "nokogiri -v" and add the output to this ticket?
It seems that that did the trick. I have not had any segfaults since.
libxml:
  loaded: 2.7.3
  binding: extension
  compiled: 2.7.3
nokogiri: 1.3.3
I am still busy testing it, but so far so good. I will keep you posted and close this ticket when I am sure that it is not an issue anymore.
tenderlove
Tue Oct 06 09:05:55 -0700 2009
Okay, sounds good. You might also want to try upgrading libxml2. The latest libxml2 is 2.7.5 and I know they've packed in a bunch of bug fixes. If that doesn't do the trick, would you mind sending us a sample of the XML you're using to make it crash? It shouldn't SEGV under any circumstances. :-)
tenderlove
Tue Oct 13 13:49:22 -0700 2009
Any updates on this?
tenderlove
Thu Oct 15 09:16:59 -0700 2009
I'm closing this since there have been no updates. Please reopen if you're still having problems! Thanks!
-
test_empty_string_returns_empty_doc is empty
2 comments Created about 1 month ago by djsun
There is no test in the code below...
# test_document.rb
def test_empty_string_returns_empty_doc
  doc = Nokogiri::HTML('')
end
Comments
tenderlove
Sun Oct 04 20:32:52 -0700 2009
Thanks for the heads up! http://ihighfive.com/
You are very whalecome: http://ihighfive.com/whale-high-five.php
-
WARNING: Nokogiri was built against LibXML version 2.7.3, but has dynamically loaded 2.6.32
3 comments Created about 1 month ago by ponny
Any idea how I can make it load the correct version? I'm on Ubuntu. I've followed the directions here: http://wiki.github.com/tenderlove/nokogiri/use-libxml-from-source but it's just not flying.
Thanks in advance.
Comments
tenderlove
Sun Oct 04 20:41:02 -0700 2009
have you added '/usr/local/lib' to your ld.so.conf file?
tenderlove
Tue Oct 13 13:45:05 -0700 2009
hello?
tenderlove
Wed Oct 14 18:49:33 -0700 2009
Closing. If you're still having troubles with this, please send an email to the mailing list:
-
[PATCH] adding Builder#<< for appending raw strings
4 comments Created about 1 month ago by dudleyf
Here's a tiny patch implementing the functionality talked about here[0].
[0] http://rubyforge.org/pipermail/nokogiri-talk/2009-March/000224.html
diff --git a/lib/nokogiri/xml/builder.rb b/lib/nokogiri/xml/builder.rb
index 89cd63a..5cdcafd 100644
--- a/lib/nokogiri/xml/builder.rb
+++ b/lib/nokogiri/xml/builder.rb
@@ -277,6 +277,12 @@ module Nokogiri
         @doc.to_xml
       end
 
+      ###
+      # Append the given raw XML +string+ to the document
+      def << string
+        @doc.fragment(string).children.each { |x| insert(x) }
+      end
+
       def method_missing method, *args, &block # :nodoc:
         if @context && @context.respond_to?(method)
           @context.send(method, *args, &block)
diff --git a/test/xml/test_builder.rb b/test/xml/test_builder.rb
index d4a6e26..12b1f86 100644
--- a/test/xml/test_builder.rb
+++ b/test/xml/test_builder.rb
@@ -117,6 +117,26 @@ module Nokogiri
         assert_equal 'hello', builder.doc.at('baz').content
       end
 
+      def test_raw_append
+        builder = Nokogiri::XML::Builder.new do |xml|
+          xml.root do
+            xml << 'hello'
+          end
+        end
+
+        assert_equal 'hello', builder.doc.at('//root/foo').content
+      end
+
+      def test_raw_append_with_instance_eval
+        builder = Nokogiri::XML::Builder.new do
+          root do
+            self << 'hello'
+          end
+        end
+
+        assert_equal 'hello', builder.doc.at('//root/foo').content
+      end
+
       def test_cdata
         builder = Nokogiri::XML::Builder.new do
           root {
--
1.6.4.3
Comments
tenderlove
Sun Oct 04 20:40:05 -0700 2009
I've applied the patch, but next time please make sure the tests pass. After applying the patch, I got these errors:
1) Error:
test_raw_append(Nokogiri::XML::TestBuilder):
NoMethodError: undefined method `content' for nil:NilClass
    test/xml/test_builder.rb:127:in `test_raw_append'
2) Error:
test_raw_append_with_instance_eval(Nokogiri::XML::TestBuilder):
NoMethodError: undefined method `content' for nil:NilClass
    test/xml/test_builder.rb:137:in `test_raw_append_with_instance_eval'
tenderlove
Sun Oct 04 20:40:29 -0700 2009
XML Builder can append raw strings. closed by 98b10d2
tenderlove
Mon Oct 05 08:38:30 -0700 2009
No problem! :-)
-
Nokogiri::HTML(data)
src/tcmalloc.cc:186] Attempt to free invalid pointer: 0x20e030
Nokogiri::VERSION => "1.3.2"
Nokogiri::LIBXML_VERSION => "2.6.32"
ruby -v => ruby 1.8.6 (2008-08-11 patchlevel 287) [i686-darwin9.7.0] Ruby Enterprise Edition 20090610
Where 'data' is... http://gist.github.com/199496
Thanks for the help!
Comments
Still crashes for me using libxml2 2.7.5 and nokogiri 1.3.3 and ree-1.8.6-20090610. It does however work with normal MRI: ruby 1.8.6 (2009-08-04 patchlevel 383) [i686-darwin9.8.0]
tenderlove
Sun Oct 04 21:03:28 -0700 2009
This isn't crashing for me. I'm using:
[apatterson@higgins nokogiri (master)]$ ruby -v
ruby 1.8.6 (2008-08-11 patchlevel 287) [i686-darwin10.0.0]
Ruby Enterprise Edition 20090610
[apatterson@higgins nokogiri (master)]$ ruby -I lib bin/nokogiri -v
---
nokogiri: 1.3.3
warnings: []
libxml:
  compiled: 2.7.5
  loaded: 2.7.5
  binding: extension
[apatterson@higgins nokogiri (master)]$
Can you try with the nightly? To install the nightly build, do this:
$ sudo gem install nokogiri -s http://tenderlovemaking.com
tenderlove
Tue Oct 13 13:50:41 -0700 2009
I can't reproduce this. Please reopen if the problem persists. I need more details to fix this if there is a problem. Thanks!
-
extconf.rb have_func() always fails under Ruby Enterprise build system
7 comments Created about 1 month ago by flavorjones
Labels: REE
ruby-enterprise-1.8.6-20090610:
checking for xmlRelaxNGSetParserStructuredErrors()... no
checking for xmlRelaxNGSetParserStructuredErrors()... no
checking for xmlRelaxNGSetValidStructuredErrors()... no
checking for xmlSchemaSetValidStructuredErrors()... no
checking for xmlSchemaSetParserStructuredErrors()... no
Comments
flavorjones
Wed Sep 30 22:58:10 -0700 2009
root cause:
"gcc -o conftest -I/usr/include/libxml2 -I/usr/include -I. -I/home/mike/builds/ruby-enterprise-1.8.6-20090610-install/lib/ruby/1.8/i686-linux -I/home/mike/code/nokogiri/ext/nokogiri -I/usr/include/libxml2 -I/usr/include -I. -I/home/mike/builds/ruby-enterprise-1.8.6-20090610-install/lib/ruby/1.8/i686-linux -I/home/mike/code/nokogiri/ext/nokogiri -g -O2 -g -DXP_UNIX -O3 -Wall -Wcast-qual -Wwrite-strings -Wconversion -Wmissing-noreturn -Winline conftest.c -L/opt/local/lib -Wl,-R/opt/local/lib -L. -rdynamic -Wl,-export-dynamic -lexslt -lxslt -lxml2 -lruby-static -lexslt -lxslt -lxml2 -ldl -lcrypt -lm -lc"
conftest.c: In function ‘t’:
conftest.c:3: warning: implicit declaration of function ‘xmlRelaxNGSetParserStructuredErrors’
/usr/bin/ld: cannot find -lruby-static
collect2: ld returned 1 exit status
checked program was:
/* begin */
1: /*top*/
2: int main() { return 0; }
3: int t() { xmlRelaxNGSetParserStructuredErrors(); return 0; }
/* end */
suggested fix:
diff --git a/ext/nokogiri/extconf.rb b/ext/nokogiri/extconf.rb
index 7c21e7d..b77552d 100644
--- a/ext/nokogiri/extconf.rb
+++ b/ext/nokogiri/extconf.rb
@@ -129,7 +129,7 @@ unless find_library('exslt', 'exsltFuncRegister', *LIB_DIRS)
   abort "libxslt is missing. try 'port install libxslt' or 'yum install libxslt-devel'"
 end
 
-def nokogiri_link_command ldflags, opt='', libpath=$LIBPATH
+def nokogiri_link_command ldflags, opt='', libpath=$DEFLIBPATH|$LIBPATH
   old_link_command ldflags, opt, libpath
 end
flavorjones
Wed Sep 30 23:03:20 -0700 2009
proposed fix from Michael Reinsch:
- def nokogiri_link_command ldflags, opt='', libpath=$LIBPATH
+ def nokogiri_link_command ldflags, opt='', libpath=$DEFLIBPATH|$LIBPATH
which appears to work for me. Aaron, thoughts?
(which is what is used in mkmf.rb, link_command for ruby enterprise)
tenderlove
Sun Oct 04 20:53:33 -0700 2009
Ugh. This is a PITA, but basically we can't make everyone happy. If I add this patch, I may as well remove that "nokogiri_link_command" stuff altogether. Let me try to explain why:
Someone has ruby installed in /usr/lib, they also have libxml2 installed in /usr/lib. They've installed a newer version of libxml2 in /usr/local. We try to be nice and search /opt/local/lib in addition to /usr/local/lib before falling back to /usr/lib. Unfortunately, if the custom directory (/opt/local or /usr/lib) isn't supplied to dir_config(), it won't search that path. We can only supply one directory. If mkmf doesn't find it in that directory, then it falls back to /usr/lib.
That means we either get /opt/local/lib or /usr/lib, unless the user intervenes with a --with-xml-lib=/whatever --with-xml-include=/whatever, or we use my Super Hack® code. Unfortunately my Super Hack® screws over people with custom ruby installs because it will never find the ruby-static library.
I'm going to apply this fix (and by apply, I mean remove my custom code). I'd rather it "just work" for people with custom ruby installs. People with custom libxml2 installs can use the command line arguments.
tenderlove
Sun Oct 04 20:54:09 -0700 2009
removing my Super Hack® closed by fbe7217
flavorjones
Mon Oct 05 05:33:42 -0700 2009
Aaron, the --with-xml-lib and --with-xml-include options do not appear to affect have_func(), since it consistently uses the wrong header files.
-
Hi
I've got some code here which replaces some parts of an HTML document with new content. This works pretty well 99 times out of 100, but in certain situations nokogiri segfaults. Yesterday we were able to capture a 'crashing' situation, that is: a) the original document, b) a start and end dom-id, and c) the new content. Based on this I was able to create a simple example[0] which shows this bug.
I suspected that some char / element causes the crash, but during my first inspection it turned out that it must be a combination of some weird circumstances. For example, when I reduce the size of c) (new content), the crashes happen less often. The same happens when I reduce the range between the b) start- and end-id. It also appears to crash more often when rails is loaded as well (although I was now able to get it 'reliably' crashing without rails). I was only able to reproduce it on linux; it doesn't crash with the same ruby/libxml/nokogiri version on OS X.
I'm pretty much out of ideas, so I hope somebody can check out my example[0] and try to find the root cause. My current theory is that nokogiri/libxml corrupts memory somewhere, either on the stack or the heap, and that something later chokes on that corruption. I also think it is somehow related to the use of the French é, written as &eacute;; when I remove these chars the crash can't be reproduced.
You can see the output plus stacktrace of a run on my machine on:
https://gist.github.com/f37eda8131e39fac9dd4
Thanks for the help!
Cheers
Reto
Comments
flavorjones
Mon Sep 28 23:39:13 -0700 2009
will look at this ASAP.
flavorjones
Wed Sep 30 23:04:24 -0700 2009
i've got a repro case and a possible fix.
tenderlove
Wed Sep 30 23:05:10 -0700 2009
Can you try out the nightly build? I believe this may already be fixed.
To get the nightly, do this:
$ sudo gem install nokogiri -s http://tenderlovemaking.com
I tested it against nokogiri/master and the nightly, the crash still happens.
But I increased the 'input' length in the test script, the crash should now happen more often (almost every time now on three of my machines).
git://github.com/retoo/nokogiri-bug.git
Thanks!
flavorjones
Thu Oct 01 07:48:29 -0700 2009
retoo - the fix tenderlove is referring to does not apply to your/my situation, sorry about that. We have a repro case and a potential fix, and we are working on it. Thanks!
tenderlove
Sat Oct 03 22:05:16 -0700 2009
@retoo I think we've got a fix in. Can you pull the latest nightly and see if that fixes your crash?
It doesn't crash anymore here. Bisect says that it has been fixed in c753c8d.
Thank you guys!
Reto
Hi guys,
I might have the same issue. I get the following error message while downloading an XML file and opening it using nokogiri:
res = Net::HTTP.post_form(URI.parse(....), {...})
doc = Nokogiri::XML(Nokogiri::XML(res.body).xpath("//text()").to_s.gsub("&lt;", "<").gsub("&gt;", ">"))
I have installed the latest nightly on OS X 10.5.6.
/Library/Ruby/Gems/1.8/gems/nokogiri-1.3.3.20091004000018/lib/nokogiri/xml/document.rb:33: [BUG] Segmentation fault ruby 1.8.7 (2008-08-11 patchlevel 72) [universal-darwin10.0]
Abort trap
Regards
William
Hmm, probably not: 1. it didn't crash on OS X in my case, and 2. it crashed when I tried to replace() some parts of it, not during any xpath/constructor calls.
Nonetheless, I would report this (perhaps in a new, unused fresh ticket :D). Does this crash every time? If yes, separate the two constructor calls like:
step1 = XML(input).xpath.to_s.gsub
puts "I survived so far"
step2 = XML(step1)
puts "should never reach this point"
It would be very helpful if you could give the output of res.body. Or even better, the minimum of res.body which still triggers the crash.
Cheers.
Hey Reto, it errors out on
doc = Nokogiri::XML(res.body).xpath("//text()").to_s.gsub("&lt;", "<").gsub("&gt;", ">")
I have about 10 different XML files, and it crashes randomly on the different files, so I can't say that it's one specific file.
The XML files vary in size from 3mb to 150mb.
The files are very basic XML:
<?xml version="1.0" encoding="utf-8"?>
<string>....</string>
where the string element contains escaped XML. Unfortunately the XML files are data we receive from an external vendor, so not really anything we can do about that.
I can try to normalize the data using gsub and then use nokogiri on the xmlified data.
flavorjones
Tue Oct 06 05:58:36 -0700 2009
Hi, can you please open a new issue for this? It appears to be unrelated to this actual issue, which has been solved in the nightlies.
Created a new ticket: http://github.com/tenderlove/nokogiri/issues/#issue/144
tenderlove
Tue Oct 13 13:49:01 -0700 2009
Closing this because we fixed it. :-)
-
Right now, the JRuby gem does not ship with the DLLs.
To work around this, I tried copying the DLLs from the Windows MRI gem, but then Nokogiri gives the following error:
FFI::NotFoundError: Function 'calloc' not found in [exslt]
Comments
Same problems here which is unfortunate since the community really needs a cross-platform XML library. The developers of our project work on Windows, Linux as well as Solaris and deploy on Solaris. Nokogiri would be perfect in this regard if this bug was just fixed.
Please, this problem has existed for about a year and is a show-stopper for many projects. I know the developers' time is limited and they do this in their spare time, but if wide usage of the library is of any interest, addressing this bug is a quick win to expand the use of Nokogiri.
tenderlove
Sun Oct 04 20:59:03 -0700 2009
Hey everyone, this is a bug in JRuby. I've filed a ticket with them here:
http://jira.codehaus.org/browse/JRUBY-4052
Once they get it sorted out, I will close this ticket. :-)
I am confused about what it is we're not doing. If calloc is a (fairly) standard POSIX function, what is it we should be doing differently?
tenderlove
Tue Oct 06 11:49:50 -0700 2009
@headius I think the libc functions should be loaded by default. Wayne gives a workaround in the JIRA ticket, but I think it's unreasonable to require me to change my FFI code depending on the platform. The current code works on everything but windows.
Fix added to git://github.com/jojje/nokogiri.git for people with access to a Windows environment to try. Requires JRuby 1.4.0RC1 or higher due to some needed fixes regarding FFI.
tenderlove
Tue Oct 13 13:52:01 -0700 2009
I applied this:
I'm closing this ticket since it should be fixed for JRuby / Windows people with 1.4.0
-
inner_html= dropping some elements
10 comments Created 2 months ago by bfolkens
Labels: 1.4.0, tenderlove
I'm having some trouble on the following environment. The code below fails on a linux install but not on a macports install. Both environments are:
$ nokogiri -v
---
warnings: []
libxml:
  loaded: 2.7.3
  binding: extension
  compiled: 2.7.3
nokogiri: 1.3.3
But on the linux environment, the following code:
require 'test/unit'
require 'rubygems'
require 'nokogiri'
class BugTest < Test::Unit::TestCase
  def test_should_parse_inner_text
    text = '<base><one>1</one><two>2</two></base>'
    doc = Nokogiri::XML(text)
    doc.search('base').each do |base_tag|
      base_tag.name = 'span'
      base_tag.inner_html = "<sup>#{base_tag.at('one').inner_text}</sup>/<sub>#{base_tag.at('two').inner_text}</sub>"
    end
    assert_equal '<span><sup>1</sup>/<sub>2</sub></span>', doc.to_html.strip
  end
end
Fails with:
test_should_parse_inner_text(BugTest) [oo.rb:15]:
<"<span><sup>1</sup>/<sub>2</sub></span>"> expected but was
<"<span><sup>1</sup></span>">.
Am I missing something obvious, or is this a bug? The above code is an abstraction from a larger project I'm working on, so I've tried to reduce it to the base of the issue. It passes the test on the MacPorts install (same version of libxml2 and nokogiri as on the Linux install).
The Ruby -v on linux is:
ruby 1.8.7 (2008-08-11 patchlevel 72) [i686-linux]
And the MacPorts install is:
ruby 1.8.7 (2008-08-11 patchlevel 72) [i686-darwin9]
Comments
tenderlove
Sat Sep 12 11:35:53 -0700 2009
Strange. Are you sure nokogiri -v returns the same thing on the linux box? That seems crazy.
Yeah - I thought I was going crazy so I even did a diff. Weirdest thing I've seen... AFAIK libxml2 doesn't really depend on much does it? Or is there something else that Nokogiri depends on that might be causing this? I tried libxml2 2.7.2 just in case, but still had the same problem.
tenderlove
Sat Sep 12 11:53:41 -0700 2009
libxml2 only depends on iconv and zlib. Neither of those should cause this problem.
What linux are you running?
Gentoo (default/linux/x86/2008.0 profile) over the 2.6.18-xenU-ec2-v1.0 kernel
libxml2: 2.7.3-r2
libc: 2.8_p20080602-r1
zlib: 1.2.3-r1
tenderlove
Sat Sep 12 12:01:46 -0700 2009
Okay. I'll get a gentoo box up and running. Might be a little while before I get this one to repro. :-(
Thanks a ton! In the meantime I'm trying to upgrade glibc and anything else that might be out of date, just to try some different versions of things.
FWIW - The new glibc (2.9_p20081201-r2) didn't affect anything.
Here's another take on it, if it helps at all:
    require 'test/unit'
    require 'rubygems'
    require 'nokogiri'

    module Nokogiri::XML
      class Node
        include Test::Unit::Assertions

        def inner_html=(tags)
          children.each { |x| x.remove }
          assert_equal ['sup', 'sub'], document.fragment(tags).children.map {|n| n.name }
          document.fragment(tags).children.to_a.each do |node|
            add_child node
          end
          self
        end
      end
    end

    class BugTest < Test::Unit::TestCase
      def test_should_parse_inner_html
        text = '<base><one>1</one><two>2</two></base>'
        doc = Nokogiri::XML(text)
        base_tag = doc.at('base')
        base_tag.inner_html = "<sup>#{base_tag.at('one').inner_text}</sup><sub>#{base_tag.at('two').inner_text}</sub>"
        assert_equal ['sup', 'sub'], base_tag.children.map {|n| n.name }
      end
    end
Successful return on the MacPorts install, and on the Linux install:
    1) Failure: test_should_parse_inner_html(BugTest)
    [oo2.rb:12:in `inner_html='
    oo2.rb:27:in `test_should_parse_inner_html']:
    <["sup", "sub"]> expected but was <["sup"]>.
In fact, even just this code returns only the first element and not the other:
    Nokogiri::XML::DocumentFragment.parse("<one>1</one><two>2</two>")
Unless it's wrapped in another outer element like:
    <x><one>1</one><two>2</two></x>
...then it returns the whole thing. And then obviously on the MacPorts install it returns an accurate copy regardless of the surrounding element.
I think I narrowed this down finally. For whatever reason, my local copy of Nokogiri (even though the gem was labeled 1.3.3) showed this diff from the copy on the linux machine (which was recently installed):
    8,9c8,13
    <       @html_eh = node.kind_of? Nokogiri::HTML::DocumentFragment
    <
    ---
    >       @klass = if node.kind_of?(Nokogiri::HTML::DocumentFragment)
    >                  Nokogiri::HTML::DocumentFragment
    >                else
    >                  Nokogiri::XML::DocumentFragment
    >                end
    > #
    23,25c27,28
    <         regex = @html_eh ? %r{^\s*<#{Regexp.escape(name)}}i :
    <                            %r{^\s*<#{Regexp.escape(name)}}
    <
    ---
    >         regex = (@klass == Nokogiri::HTML::DocumentFragment) ? %r{^\s*<#{Regexp.escape(name)}}i \
    >                                                              : %r{^\s*<#{Regexp.escape(name)}}
So a fresh install on my MacPorts version now fails as well - lol - not quite the expected result. However, installing the gem from master works great - so it looks like this was already fixed ;)
tenderlove
Mon Sep 14 21:07:45 -0700 2009
| link
Ugh. You're right. It fails against 1.3.3. I was checking against master. :-(
I guess I can stop fighting with VirtualBox now. Thanks for letting me know!
-
I think we should remove Node#collect_namespaces. Since namespace names are not unique, I don't know that this method is very useful.
Comments
flavorjones
Mon Sep 14 15:36:38 -0700 2009
| link
+1
tenderlove
Mon Sep 14 22:08:31 -0700 2009
| link
You're supposed to use the upvote button! ;-)
Although, I like the +1 better because it's not anonymous.
tenderlove
Sun Oct 04 20:34:09 -0700 2009
| link
This was removed in c7eb4b2
-
Adding a node with a default namespace stores it as 'no-namespace' in the parent
2 comments Created 2 months ago by david
Labels: 1.4.0, tenderlove
This works:
    doc = Nokogiri::XML('<element><child xmlns="woop:de:doo" /></element>')
    doc.at("//xmlns:child", 'xmlns' => 'woop:de:doo') #=> <child xmlns="woop:de:doo" />
This doesn't:
    doc = Nokogiri::XML::Document.new
    e = Nokogiri::XML::Node.new('element', doc)
    c = Nokogiri::XML::Node.new('child', doc)
    c.add_namespace(nil, 'woop:de:doo')
    e.add_child(c)
    doc.add_child(c)
    doc.at("//xmlns:child", 'xmlns' => 'woop:de:doo') #=> nil
Comments
I'd also like to add that if you have a document like this:
    <element>
      <c1 xmlns="one" />
      <c2 xmlns="two" />
    </element>
then
    doc.root.collect_namespaces.inspect #=> {'xmlns' => 'two'}
tenderlove
Fri Sep 11 21:48:15 -0700 2009
| link
Yup. That is the danger of collect_namespaces. I think that method should be removed.
The first problem is fixed here: c6e5fa0
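The danger described above can be sketched without Nokogiri at all: any structure keyed by prefix can hold only one URI per prefix, so the last declaration seen wins. The `NamespaceDecl` struct and both helper names are hypothetical stand-ins for walking a real document; this is an illustration of the data-structure problem, not the library's implementation.

```ruby
# Hypothetical stand-in for a namespace declaration found on some node.
NamespaceDecl = Struct.new(:prefix, :href)

# Collapses declarations into a Hash keyed by prefix.
# A later declaration with the same prefix silently replaces an earlier one.
def collect_by_prefix(decls)
  decls.each_with_object({}) { |decl, seen| seen[decl.prefix] = decl.href }
end

# Keeps every URI seen for each prefix, so nothing is lost.
def group_by_prefix(decls)
  decls.each_with_object({}) { |decl, seen| (seen[decl.prefix] ||= []) << decl.href }
end

decls = [NamespaceDecl.new('xmlns', 'one'), NamespaceDecl.new('xmlns', 'two')]
p collect_by_prefix(decls)
p group_by_prefix(decls)
```

When both default namespaces matter, binding each URI to its own prefix explicitly in the query (as in the `doc.at("//xmlns:child", 'xmlns' => 'woop:de:doo')` call above) avoids relying on collected prefixes.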
-

I need original text like 'Пупкин'
ruby -v
ruby 1.8.7 (2008-08-11 patchlevel 72) [i486-linux]
Linux lenny5/stable
nokogiri -1.3.3.
Comments
tenderlove
Tue Sep 08 08:53:53 -0700 2009
| link
What encoding is the XML using?
romanvbabenko
Tue Sep 08 09:02:40 -0700 2009
| link
$ enca staff.xml Universal transformation format 8 bits; UTF-8
Mixed line terminators
tenderlove
Tue Sep 08 09:19:52 -0700 2009
| link
If you specify the encoding like this:
    Nokogiri::XML('....', nil, 'UTF-8')
What does it return?
romanvbabenko
Tue Sep 08 09:38:47 -0700 2009
| link
The text is the same as in the picture in the first post:
<?xml version="1.0"?>
without an encoding.
tenderlove
Tue Sep 08 10:01:03 -0700 2009
| link
This works well for me:
    require 'nokogiri'
    doc = Nokogiri::XML('<person last_name="Пупкин"></person>', nil, 'UTF-8')
    puts doc.at('person')['last_name']
How about you?
romanvbabenko
Tue Sep 08 10:16:36 -0700 2009
| link
See that.
I need to save readable text to a file.
tenderlove
Tue Sep 08 10:25:08 -0700 2009
| link
Try this:
    require 'nokogiri'
    doc = Nokogiri::XML('<person last_name="Пупкин"></person>', nil, 'UTF-8')
    doc.encoding = 'UTF-8'
    puts doc.to_xml
romanvbabenko
Tue Sep 08 10:48:48 -0700 2009
| link
Oh, that works fine. But I see the text in my file only after require 'multibyte'.
tenderlove
Tue Sep 08 10:55:52 -0700 2009
| link
multibyte should not make a difference. Can you paste a short script somewhere that shows the problem?
tenderlove
Fri Sep 11 21:09:12 -0700 2009
| link
I believe this is working as expected. Please reopen if I am incorrect.
-
Segfault when searching with #at (1.3.3)
4 comments Created 2 months ago by wtn
Labels: 1.4.0, tenderlove
/Library/Ruby/Gems/1.8/gems/nokogiri-1.3.3/lib/nokogiri/xml/node.rb:591: [BUG] Segmentation fault
ruby 1.8.7 (2008-08-11 patchlevel 72) [universal-darwin10.0]
I think it happens whether I'm using OS X 10.5, 10.6, ruby 1.9.1, or 1.8.7
Also, it doesn't happen every time, but about a third of the time (I'm using Mechanize, perhaps the input is changing somewhat)
Comments
tenderlove
Sun Sep 06 10:35:33 -0700 2009
| link
Could you possibly give me the HTML you're parsing? Or some sort of script so that I can reproduce it?
Thanks. I discovered that this crashes about half the time when I paste it in irb, but when I run it as a script on the command line it doesn't crash.
I emailed you the script.
tenderlove
Fri Sep 11 21:08:43 -0700 2009
| link
I can't seem to reproduce this with Nokogiri 1.3.3. Could you possibly get me the document that makes this crash?
I suspect that the webpage is dynamic, and only one of the dynamic pages causes it to crash.
tenderlove
Mon Sep 14 22:15:21 -0700 2009
| link
Figured this out. There was a bug in Node#inspect. This commit fixed it:
-
NodeSet#slice doesn't handle ranges beyond the end of the array
1 comment Created 2 months ago by zoozed
Labels: 1.4.0, tenderlove
Looks like NodeSet#slice isn't handling ranges beyond the end of the array...
    #!/usr/bin/env ruby
    require 'rubygems'
    require 'nokogiri'
    xml = Nokogiri::XML('<?xml version="1.0" encoding="utf-8"?>
    <rss version="2.0">
    <channel>
    <item><title>t1</title></item>
    <item><title>t2</title></item>
    </channel>
    </rss>
    ')
    items = (xml/:item)
    not_blowed_up = items[0, items.size]
    all_blowed_up = items[0, 100]
Comments
tenderlove
Fri Sep 11 21:07:34 -0700 2009
| link
fixing node slices where the slice is larger than the node set length. closed by 9f04464
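As a point of comparison, this is how a plain Ruby Array treats the same `[start, length]` calls. Whether NodeSet is meant to match Array in every edge case is an assumption here; the sketch only shows the Array behaviour that the report seems to expect, with a two-element array standing in for the two `<item>` nodes.

```ruby
# Hypothetical helper wrapping Array#[] with a start index and a length.
def take_items(items, start, length)
  items[start, length]
end

items = %w[t1 t2] # stand-in for the two <item> nodes in the report

p take_items(items, 0, items.size) # length equal to the collection size
p take_items(items, 0, 100)        # length past the end is clamped
p take_items(items, 5, 1)          # start past the end yields nil
```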
-
I think the reference to the racc tarball (v 1.4.5) should be removed from the gem compilation script because that version doesn't work on ruby 1.9. Instead of being directed to that tarball, the user should be told to install the racc gem (now 1.4.6 and compatible with 1.9).
Comments
tenderlove
Tue Sep 01 13:16:10 -0700 2009
| link
I can't seem to find the code you are talking about. Can you send me a link? Or be more specific?
Thanks
Oh sorry, here's what I got on the console before I installed the racc gem:
localhost:home user$ sudo gem install tenderlove-nokogiri
Building native extensions. This could take a while...
ERROR: Error installing tenderlove-nokogiri:
ERROR: Failed to build gem native extension.
/usr/local/bin/ruby19 -rubygems /usr/local/lib/ruby19/gems/1.9.1/gems/rake-0.8.7/bin/rake RUBYARCHDIR=/usr/local/lib/ruby19/gems/1.9.1/gems/tenderlove-nokogiri-0.0.0.20081021110113/lib RUBYLIBDIR=/usr/local/lib/ruby19/gems/1.9.1/gems/tenderlove-nokogiri-0.0.0.20081021110113/lib
Hoe.new {...} deprecated. Switch to Hoe.spec.
(in /usr/local/lib/ruby19/gems/1.9.1/gems/tenderlove-nokogiri-0.0.0.20081021110113)
WARNING: HOE DEPRECATION: Add '>= 0' to the 'rake' dependency.
/usr/local/bin/ruby19 extconf.rb
checking for xmlParseDoc() in -lxml2... yes
checking for xsltParseStylesheetDoc() in -lxslt... yes
checking for libxml/xmlversion.h in /usr/include/libxml2... yes
checking for libxslt/xslt.h in /usr/include... yes
checking for racc... no
need racc, get the tarball from http://i.loveruby.net/archive/racc/racc-1.4.5-all.tar.gz
extconf.rb failed
Could not create Makefile due to some reason, probably lack of
necessary libraries and/or headers. Check the mkmf.log file for more
details. You may need configuration options.
Provided configuration options:
  --with-opt-dir
  --without-opt-dir
  --with-opt-include
  --without-opt-include=${opt-dir}/include
  --with-opt-lib
  --without-opt-lib=${opt-dir}/lib
  --with-make-prog
  --without-make-prog
  --srcdir=.
  --curdir
  --ruby=/usr/local/bin/ruby19
  --with-xml2lib
  --without-xml2lib
  --with-xsltlib
  --without-xsltlib
rake aborted!
Command failed with status (1): [/usr/local/bin/ruby19 extconf.rb...]
/usr/local/lib/ruby19/gems/1.9.1/gems/tenderlove-nokogiri-0.0.0.20081021110113/Rakefile:58:in `block (2 levels) in '
(See full trace by running task with --trace)
Gem files will remain installed in /usr/local/lib/ruby19/gems/1.9.1/gems/tenderlove-nokogiri-0.0.0.20081021110113 for inspection.
Results logged to /usr/local/lib/ruby19/gems/1.9.1/gems/tenderlove-nokogiri-0.0.0.20081021110113/gem_make.out
tenderlove
Tue Sep 01 14:58:55 -0700 2009
| link
Never install nokogiri from github. Always install it from rubyforge.
    $ sudo gem install nokogiri
I will see about getting the gem removed from github. If it is hosted on github, that is a mistake.
-
1.3.3 breaks RELAXNG.new(rng).validate(xml)
10 comments Created 2 months ago by DrusTheAxe
Labels: 1.4.0, tenderlove
RELAXNG's validate() method changed in 1.3.3:
- In 1.3.2, it returns an array of errors (empty if successful).
- In 1.3.3, it always returns an empty array AND writes to stderr.
This is probably due to 1.3.3's registering a libxml global error-handler.
This is a critical breaking change on 2 counts:
- validate() no longer returns errors; it is impossible to programmatically branch based on whether the XML is valid or not
- stderr may not exist; run this under IIS (where stderr is not set) and you fault
NOTE: This was tested against Ruby 1.8.6 on Windows 7, but none of that should matter.
At the very least a documented and clean workaround to alter the default error handler back to 1.3.2 behavior is critical; the docs are...thin...on the subject, and the only thing close to on point on the web are some dated forum messages in Feb'09 (the included code doesn't even parse correctly against 1.3.2 or 1.3.3).
Below is a trivial test to repro the problem
    abort 'Usage: ruby relaxng_validate.rb <version>' if ARGV.empty?
    nokogiri_version = ARGV[0]

    require 'rubygems'
    gem 'nokogiri', nokogiri_version
    require 'nokogiri'
    puts "Nokogiri version #{Nokogiri::VERSION}"

    xml = <<EOXML
    <A/>
    EOXML

    schema = <<EOSCHEMA
    <?xml version="1.0" encoding="UTF-8"?>
    <grammar xmlns="http://relaxng.org/ns/structure/1.0">
      <start>
        <ref name="A"/>
      </start>
      <define name="A">
        <element name="A">
          <interleave>
            <attribute name="B"/>
            <element name="C">
              <text/>
            </element>
            <element name="D">
              <element name="E">
                <text/>
              </element>
            </element>
          </interleave>
        </element>
      </define>
    </grammar>
    EOSCHEMA

    puts 'Loading xml...'
    doc = Nokogiri::XML(xml)
    puts 'Loading schema...'
    relaxng = Nokogiri::XML::RelaxNG(schema)
    puts 'Validating xml against schema...'
    errors = relaxng.validate(doc)
    puts "Errors.size = #{errors.size}"
    errors.each { |error| puts " Error: #{error}" }
    puts 'Done.'
Comments
DrusTheAxe
Mon Aug 31 22:16:07 -0700 2009
| link
Oops. Run the command
    ruby test.rb 1.3.2
and the output is
    Nokogiri version 1.3.2
    Loading xml...
    Loading schema...
    Validating xml against schema...
    Errors.size = 2
     Error: Invalid sequence in interleave
     Error: Element A failed to validate content
    Done.
Run the command
    ruby test.rb 1.3.3
and the output is
    Nokogiri version 1.3.3
    Loading xml...
    Loading schema...
    Validating xml against schema...
    element A: Relax-NG validity error : Invalid sequence in interleave
    element A: Relax-NG validity error : Element A failed to validate content
    Errors.size = 0
    Done.
flavorjones
Mon Aug 31 23:12:11 -0700 2009
| link
I cannot reproduce this on Ubuntu Linux 9.04.
flavorjones
Mon Aug 31 23:17:45 -0700 2009
| link
Howard,
Can you please include some information about your platform, as well as the value of Nokogiri::VERSION_INFO (which is a hash) after loading Nokogiri 1.3.3 ?
Thanks much!
DrusTheAxe
Mon Aug 31 23:34:54 -0700 2009
| link
Sure. Changing the line
    puts "Nokogiri version #{Nokogiri::VERSION}"
to
    puts "Ruby #{RUBY_PLATFORM} #{RUBY_VERSION}"
    puts "Nokogiri version #{Nokogiri::VERSION}"
    puts "Nokogiri VERSION_INFO: #{Nokogiri::VERSION_INFO.inspect}"
and run with
    ruby test.rb 1.3.2
now shows at the top of the output
    Ruby i386-mswin32 1.8.6
    Nokogiri version 1.3.2
    Nokogiri VERSION_INFO: {"nokogiri"=>"1.3.2", "warnings"=>[], "libxml"=>{"compiled"=>"2.7.3", "loaded"=>"2.7.3", "binding"=>"extension"}}
and when run with
    ruby test.rb 1.3.3
outputs
    Ruby i386-mswin32 1.8.6
    Nokogiri version 1.3.3
    Nokogiri VERSION_INFO: {"nokogiri"=>"1.3.3", "warnings"=>[], "libxml"=>{"compiled"=>"2.7.3", "loaded"=>"2.7.3", "binding"=>"extension"}}
As I said, I'm running 1.8.6 on Windows (XP and 7, same results).
The only difference in Nokogiri::VERSION_INFO appears to be the "nokogiri"=>"1.3.2" vs. "...1.3.3".
Nokogiri on Windows with --platform mswin32 has a prebuilt libxml, right?
Then I'd suspect the problem is in 1.3.3's registered libxml global error-handler
--and/or-- in the prebuilt libxml binary included in the Nokogiri 1.3.3 mswin32 gem.
P.S. I got Nokogiri via
    gem install Nokogiri --platform mswin32 --version 1.3.2
    gem install Nokogiri --platform mswin32 --version 1.3.3
to explicitly pull both versions down to my machine to repro the issue.
Anything else I can do to help?
flavorjones
Tue Sep 01 00:45:30 -0700 2009
| link
Confirmed this issue on Windows XP with 1.3.3.
DrusTheAxe
Tue Sep 01 18:17:24 -0700 2009
| link
A few questions:
- Do you understand the source of the problem?
- Do you have an ETA on a fix?
- Do you have a workaround? (besides punting 1.3.3 and locking into 1.3.2)
flavorjones
Wed Sep 02 14:00:31 -0700 2009
| link
Howard -
Neither Aaron nor I will be able to spend the necessary time to debug this, since Windows is not a native platform for either of us. If you are under a time crunch, I'd advise backing down to 1.3.2.
And, if you have Windows expertise you'd like to lend to help us track the problem down, we'd love to have the help. Please remember, we're doing this in our spare time, for the benefit of humanity. That's you!
-m
DrusTheAxe
Thu Sep 03 22:53:07 -0700 2009
| link
Windows is native for me and I'm familiar with several XML libraries (Xerces, ElementTree, others), but new to Nokogiri and LibXML. I tried to hunt down how things were wired up, but didn't quite follow.
My suspicion is 1.3.3's wiring up of libxml's global error-handler, but I couldn't find such a thing. I did find what looks like 6-7 callbacks, but I'm not sure which are which, and they look pretty cryptic anyway. Looks like there's some VB-Declare / .NET-P/Invoke / etc like native-ish-from-Ruby wiring ('pointer' and such), but I've only been doing Ruby for 4 months and haven't seen that (in Ruby).
My suspicion is the newly registered global error-handler is dumping to stderr and not propagating the error messages up to Ruby like used to happen. That said, I'm not quite sure where to look. Not even sure it's Ruby code, could be native code, though the changelog comment said libxml-ruby (fwiw).
I'd be glad to help lend a hand, but at this point I've taken it as far as I can on my own. Looking for some pointers.
tenderlove
Sun Sep 06 14:47:31 -0700 2009
| link
Ugh. So, here is what I have learned so far. If I cross compile from the tag (REL_1.3.3), I cannot reproduce the problem. If I cross compile from HEAD, I cannot reproduce the problem. I can reproduce the problem with the released gem, and if I re-cross compile the released gem, I can reproduce the problem.
Also, I've noticed that the test suite does pick this up. Running the tests inside the released gem picks this up.
I am going to re-compile from the 1.3.3 release tag and replace the gems on rubyforge with the recompiled versions.
tenderlove
Sun Sep 06 15:02:32 -0700 2009
| link
I've uploaded the new gems, so if you uninstall, then reinstall, everything should work.
I haven't found the root cause of this problem, but I'm closing this ticket because our tests exercise this behavior, and it is not reproducible from HEAD.
I need to find a less painful way of getting the tests running on windows. :-(
-
After installing nokogiri on Snow Leopard (using ARCHFLAG or not), it blows up with:
dlopen(/usr/local/lib/ruby/gems/1.8/gems/nokogiri-1.3.3/lib/nokogiri/nokogiri.bundle, 9): no suitable image found. Did find: /usr/local/lib/ruby/gems/1.8/gems/nokogiri-1.3.3/lib/nokogiri/nokogiri.bundle: mach-o, but wrong architecture - /usr/local/lib/ruby/gems/1.8/gems/nokogiri-1.3.3/lib/nokogiri/nokogiri.bundle
Comments
tenderlove
Mon Aug 31 10:18:26 -0700 2009
| link
Try uninstalling nokogiri and re-installing it.
tenderlove
Mon Aug 31 10:43:51 -0700 2009
| link
Turns out that ruby needed to be recompiled.
Closing. :-D
-
implement Nokogiri::XML::DTD#external_id and system_id
1 comment Created 2 months ago by tenderlove
Labels: tenderlove
implement Nokogiri::XML::DTD#external_id and system_id
look at xmlDtdPtr
Comments
tenderlove
Sat Sep 12 14:06:51 -0700 2009
| link
adding DTD external id an system id. closed by 303b2b2
-
Implement Nokogiri::XML::ElementDecl#content
1 comment Created 2 months ago by tenderlove
Labels: 1.4.0, tenderlove
Implement Nokogiri::XML::ElementDecl#content
look at tree.h
struct _xmlElement, content member
Comments
tenderlove
Sat Sep 12 17:14:18 -0700 2009
| link
updating changelog. closed by 1f658f0
-
Strings returned by xpath expression "/text()" are badly formatted
1 comment Created 2 months ago by jney
Strings returned by Nokogiri::HTML(open(url)).xpath("//xpath_expression/text()").to_ary are formatted in such a way that comparison returns false on identical strings.
To avoid the problem I have to do this: Nokogiri::HTML(open(url)).xpath("//xpath_expression").collect(&:text)
Comments
tenderlove
Sun Aug 30 10:45:37 -0700 2009
| link
Right, because it returns a Nokogiri::XML::Text node. That is different than a string.
-
Nokogiri produces error "output error : unknown encoding" on certain pages
1 comment Created 2 months ago by mogman1
Every now and again Nokogiri will fail to process an HTML document, producing the error "output error : unknown encoding". Reference issue 122 as I strongly suspect that this is also related to the version of libxml2. This issue only comes up in my Windows XP environment and it works fine in my Linux environment (again, please see #122 as my environments have remained exactly the same). To see this, try the following:
    doc = open('http://businesslogos.com/resources_services.php')
    noko = Nokogiri::HTML(doc)
That second line will generate the error. Most URLs are just fine, but the one above is an example of a URL that produces the error. I skimmed through the document looking for any bizarre characters but did not find anything obvious. If you upgrade the version of libxml2 to fix issue 122, that may very well fix this problem as well, but I wanted to alert you to it. Also, I want to reiterate that this is a problem that only shows up in the Windows environment and does not happen in the Linux environment, the exact opposite of the problem in issue 122 :-/
Sorry to lob two complaints in so short a time. I absolutely love the gem and think it's the best out there for this sort of thing, I am just trying to help make it even better!
Comments
tenderlove
Thu Aug 27 18:33:51 -0700 2009
| link
I believe this is a problem with the version of ICONV that nokogiri on windows is using. I will make sure it is upgraded in the next release, but beyond that, this is a problem I can't fix.
-
DocumentFragment lacks detailed search
1 comment Created 2 months ago by samsm
Labels: 1.4.0, tenderlove
    fragment = '<p id="content">hi</p>'
    Nokogiri::HTML.fragment(fragment).search('#content').length # this returns zero
    Nokogiri::HTML(fragment).search('#content').length # this returns 1
Searching for an element ('p') does work, but using any CSS selector or XPath seems to always produce zero results. Non-fragment search works as I'd expect.
Comments
tenderlove
Fri Aug 28 22:23:02 -0700 2009
| link
delegating DocumentFragment#css to the fragment children. closed by ed10f01
-
Linux environment:
CentOS 5.2
ruby 1.8.6 (2008-08-11 patchlevel 287) [x86_64-linux]
rails 2.3.3
Nokogiri 1.3.3
Windows environment:
Windows XP
ruby 1.8.6 (2008-08-11 patchlevel 287) [i386-mswin32]
rails 2.3.3
Nokogiri 1.3.3
This is a peculiar issue I discovered when I moved my application from my Windows XP development box to a linux box for production. When I run Nokogiri::HTML(open('http://www.some-url.com')) on my Windows environment, the entire HTML document is returned. However, when I do that same thing in the Linux environment, curiously only the HTML comment tags are returned.
When I checked the temp file that open() generates, the entire HTML document was there, but for whatever reason Nokogiri only grabbed the comment tags. I checked and the same comment tags show inside the original document and in that order. Just for whatever reason, everything but comment tags are filtered out.
My guess is that this is some weird bug having to do with the Linux version of Ruby working with Nokogiri, but otherwise I am thoroughly perplexed. I also tried using a previous version of Nokogiri (1.3.1) to see if that would work, but I got the same result.
Comments
It should be noted that I can pass in a string and Nokogiri parses things just fine. So while Nokogiri::HTML(open('http://www.some-url.com')) produces the strange behaviour, Nokogiri::HTML(open('http://www.some-url.com').read) works exactly as expected. So at least there is a work-around if someone else comes across this.
tenderlove
Wed Aug 26 16:58:49 -0700 2009
| link
Can you run 'nokogiri -v' for me and add the output to the comments? I think this may be a bug in libxml2
    nokogiri: 1.3.3
    warnings: []
    libxml:
      compiled: 2.7.3
      loaded: 2.7.3
      binding: extension
tenderlove
Thu Aug 27 10:11:35 -0700 2009
| link
Is that from the linux environment, or the windows environment?
Sorry, completely spaced there.
Linux:
    nokogiri: 1.3.3
    warnings: []
    libxml:
      compiled: 2.6.26
      loaded: 2.6.26
      binding: extension
Windows:
    nokogiri: 1.3.3
    warnings: []
    libxml:
      compiled: 2.7.3
      loaded: 2.7.3
      binding: extension
tenderlove
Thu Aug 27 17:49:53 -0700 2009
| link
Okay. This is a bug in libxml2. If you upgrade your server to at least 2.6.32, the problem will go away.
-
tcmalloc error parsing "<a><b></a>" fragment with REE
2 comments Created 2 months ago by henrik
Labels: REE
Works fine with MRI, but not with REE:
    henrik@Nyx ~/Code$ which nokogiri
    /opt/ruby-enterprise-1.8.6-20090610/bin/nokogiri
    henrik@Nyx ~/Code$ nokogiri -v
    ---
    nokogiri: 1.3.3
    warnings: []
    libxml:
      compiled: 2.7.3
      loaded: 2.7.3
      binding: extension
    henrik@Nyx ~/Code$ which ruby
    /opt/ruby-enterprise-1.8.6-20090610/bin/ruby
    henrik@Nyx ~/Code$ ruby -rubygems -e 'require "nokogiri"; puts Nokogiri::HTML::DocumentFragment.parse("<a><b></a>")'
    src/tcmalloc.cc:186] Attempt to free invalid pointer: 0x201a90
    Abort trap
    henrik@Nyx ~/Code$
Expected output is
    <a><b></b></a>
without the error.
Comments
tenderlove
Mon Sep 14 21:41:54 -0700 2009
| link
I'm not sure what to do about this. It's working for me:
tenderlove
Sun Oct 04 21:19:21 -0700 2009
| link
I can't repro this and the ticket hasn't been updated for almost a month. I will assume this is fixed on master.
Please reopen and update the ticket if it's still breaking against master. Thanks.
-
nokogiri-1.3.3 is not working with libxml2, libxslt in 1.3.1
1 comment Created 2 months ago by mao
I am using Ubuntu 9.04 amd64, and the xml2, xslt and exslt libraries are installed. When I require nokogiri, I get the following error:
    irb(main):003:0> require 'nokogiri'
    LoadError: Could not open any of [xml2, xslt, exslt]
    from /home/dan/installed/jruby/lib/ruby/1.8/ffi/library.rb:18:in `ffi_lib'
    from /home/dan/installed/jruby-1.3.1-rails/lib/ruby/gems/1.8/gems/nokogiri-1.3.3-java/lib/nokogiri/ffi/libxml.rb:5
    from /home/dan/installed/jruby-1.3.1-rails/lib/ruby/gems/1.8/gems/nokogiri-1.3.3-java/lib/nokogiri/ffi/libxml.rb:31:in `require'
    from /home/dan/installed/jruby/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `require'
    from /home/dan/installed/jruby-1.3.1-rails/lib/ruby/gems/1.8/gems/nokogiri-1.3.3-java/lib/nokogiri.rb:10
    from /home/dan/installed/jruby-1.3.1-rails/lib/ruby/gems/1.8/gems/nokogiri-1.3.3-java/lib/nokogiri.rb:36:in `require'
    from /home/dan/installed/jruby/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:36:in `require'
    from (irb):4
I solved the problem and recorded the solution here:
http://maodan520.spaces.live.com/blog/cns!E0C8D36B1650926A!237.entry
But I think nokogiri is not working correctly with either libxml2.so or libxslt.so, so maybe it's an issue.
Comments
flavorjones
Tue Aug 18 23:17:33 -0700 2009
| link
this problem was solved, i think. see #90 for followup.
-
NodeSet#search problems and NodeSet#+ is not different from NodeSet#&
1 comment Created 3 months ago by Serabe
Hi there:
I've been tracking down some problems with my implementation for XPath and I think it is not my problem, but yours. Let me explain. In Nokogiri::XML::NodeSet#search you can see:
    each do |node|
      paths.each do |path|
        sub_set += send(path =~ /^(\.\/|\/)/ ? :xpath : :css, *(paths + [ns]))
      end
    end
OK, for each node you're iterating over paths and, for each path, you're calling either the xpath or the css method of NodeSet. And guess what, both of them iterate over each node... again. You can take a look at the search, xpath and css methods in Nokogiri::XML::NodeSet.
The easiest way to fix this is deleting lines 75 and 80.
Just to explain why this is happening, my implementation is based on RubyArray and I used the op_plus function in it. By the way, if your implementation of the plus operator removes duplicates I see no difference between + and &, so I would like to know what I should do in the Java impl.
Comments
tenderlove
Wed Aug 12 11:44:31 -0700 2009
| link
making NodeSet more consistent with Set, adding NodeSet#| closed by 541cbeb
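Since the closing comment says NodeSet was made "more consistent with Set", Ruby's own Set is a reasonable reference for what the operators mean. The sketch below uses the standard library only, with symbols standing in for nodes; that NodeSet mirrors these semantics exactly is an assumption, and the helper name `combine` is made up.

```ruby
require 'set'

# Hypothetical helper: applies the three operators discussed in the ticket
# to two sets and returns the results side by side.
def combine(left, right)
  {
    plus:         left + right, # for Set, + is the same operation as |
    union:        left | right, # every member of either set, no duplicates
    intersection: left & right  # only members present in both sets
  }
end

first  = Set[:node_a, :node_b]
second = Set[:node_b, :node_c]

combine(first, second).each { |name, result| puts "#{name}: #{result.to_a.inspect}" }
```

So even when `+` removes duplicates it still differs from `&`: the former keeps everything from both operands, the latter keeps only what they share.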
-
Convert meta_encoding and meta_encoding= to ruby
1 comment Created 3 months ago by tenderlove
Labels: 1.4.0, tenderlove
These methods need to be converted to Ruby. The current meta_encoding= method will call xmlFreeNode() on the old meta tag which will cause a segv:
    require 'nokogiri'
    doc = Nokogiri::HTML DATA.read
    node = doc.at('meta')
    puts node.name
    doc.meta_encoding = 'EUC-JP'
    p node
    __END__
    <html>
      <head>
        <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
        <title>Hello World</title>
      </head>
      <body>
        <h1>Hello Again</h1>
      </body>
    </html>
Comments
tenderlove
Tue Oct 13 21:38:38 -0700 2009
| link
meta_encoding and meta_encoding= are implemented in ruby. closed by ceffd26
-
Shouldn't inner_html convert to UTF8 the same way as inner_text?
4 comments Created 3 months ago by naofumi
Labels: 1.4.0, tenderlove
There seems to be an inconsistency between how encoding conversion is applied with the inner_text, inner_html and to_html methods.
With #inner_text, I think all output is automatically converted to UTF8.
With #inner_html, encoding conversions are not applied.
With #to_html, you can specify the desired encoding for the result with the :encoding option.
I would prefer that the output for both #inner_html and #to_html are converted to UTF8 by default, but that you can override this with the :encoding option.
At least, it would be nice to be able to pass the :encoding option to #inner_html.
Comments
In order to provide an :encoding option for #inner_html, maybe something like the following:
in nokogiri/xml/node.rb
    def inner_html(*args)
      children.map { |x| x.to_html(*args) }.join
    end
in nokogiri/xml/node_set
    def inner_html(*args)
      collect{|j| j.inner_html(*args)}.join('')
    end
tenderlove
Thu Aug 27 18:16:15 -0700 2009
| link
I've added the ability to pass encoding to #inner_html. I don't want to automatically convert all documents to UTF-8 when calling #to_html. I think that would be bad for people processing documents in something besides UTF-8, and they want the final output to remain the specified encoding.
If you always want the output to be UTF-8, just tell the document that it should be encoded with UTF-8 like so:
    doc = Nokogiri::HTML open('http://example.com/')
    doc.encoding = 'UTF-8' # Set the document encoding to UTF-8
After doing that inner_html and to_html will return UTF-8 documents.
tenderlove
Thu Aug 27 18:17:40 -0700 2009
| link
inner_html takes the same arguments as to_html. closed by ab9a8a0
-
nokogiri-1.3.3 introduces dependency on st.h -- error: st.h: No such file or directory
1 comment Created 3 months ago by TylerRick1.3.2 installs fine but I can't seem to build/install 1.3.3. I'm running Ubuntu 9.04.
What is st.h and how do I get it to find it?
Thanks!
> sudo gem1.9 install nokogiri -v 1.3.2 Building native extensions. This could take a while... Successfully installed nokogiri-1.3.2 1 gem installed Installing ri documentation for nokogiri-1.3.2... Installing RDoc documentation for nokogiri-1.3.2... > sudo gem1.9 install nokogiri -v 1.3.3 Building native extensions. This could take a while... ERROR: Error installing nokogiri: ERROR: Failed to build gem native extension. /usr/bin/ruby1.9 extconf.rb install nokogiri -v 1.3.3 checking for iconv.h in /opt/local/include/,/opt/local/include/libxml2,/opt/local/include,/opt/local/include,/opt/local/include/libxml2,/usr/local/include,/usr/local/include/libxml2,/usr/include,/usr/include/libxml2,/usr/include,/usr/include/libxml2... yes checking for libxml/parser.h in /opt/local/include/,/opt/local/include/libxml2,/opt/local/include,/opt/local/include,/opt/local/include/libxml2,/usr/local/include,/usr/local/include/libxml2,/usr/include,/usr/include/libxml2,/usr/include,/usr/include/libxml2... yes checking for libxslt/xslt.h in /opt/local/include/,/opt/local/include/libxml2,/opt/local/include,/opt/local/include,/opt/local/include/libxml2,/usr/local/include,/usr/local/include/libxml2,/usr/include,/usr/include/libxml2,/usr/include,/usr/include/libxml2... yes checking for libexslt/exslt.h in /opt/local/include/,/opt/local/include/libxml2,/opt/local/include,/opt/local/include,/opt/local/include/libxml2,/usr/local/include,/usr/local/include/libxml2,/usr/include,/usr/include/libxml2,/usr/include,/usr/include/libxml2... yes checking for xmlParseDoc() in -lxml2... yes checking for xsltParseStylesheetDoc() in -lxslt... yes checking for exsltFuncRegister() in -lexslt... yes checking for xmlRelaxNGSetParserStructuredErrors()... yes checking for xmlRelaxNGSetParserStructuredErrors()... yes checking for xmlRelaxNGSetValidStructuredErrors()... yes checking for xmlSchemaSetValidStructuredErrors()... yes checking for xmlSchemaSetParserStructuredErrors()... yes creating Makefile make cc -I. 
-I/usr/include/libxml2 -I/usr/include -I/usr/include/ruby-1.9.0/x86_64-linux -I/usr/include/ruby-1.9.0 -I. -DHAVE_XMLRELAXNGSETPARSERSTRUCTUREDERRORS -DHAVE_XMLRELAXNGSETPARSERSTRUCTUREDERRORS -DHAVE_XMLRELAXNGSETVALIDSTRUCTUREDERRORS -DHAVE_XMLSCHEMASETVALIDSTRUCTUREDERRORS -DHAVE_XMLSCHEMASETPARSERSTRUCTUREDERRORS -I/usr/include/libxml2 -I/usr/include -I/usr/include/ruby-1.9.0/x86_64-linux -I/usr/include/ruby-1.9.0 -I. -fPIC -fno-strict-aliasing -g -g -O2 -O2 -g -Wall -Wno-parentheses -fPIC -g -DXP_UNIX -O3 -Wall -Wcast-qual -Wwrite-strings -Wconversion -Wmissing-noreturn -Winline -o xml_reader.o -c xml_reader.c
In file included from /usr/include/ruby-1.9.0/ruby.h:15,
                 from ./nokogiri.h:6,
                 from ./xml_reader.h:4,
                 from xml_reader.c:1:
/usr/include/ruby-1.9.0/ruby/ruby.h: In function ‘rb_type’:
/usr/include/ruby-1.9.0/ruby/ruby.h:973: warning: conversion to ‘int’ from ‘VALUE’ may alter its value
In file included from ./nokogiri.h:81,
                 from ./xml_reader.h:4,
                 from xml_reader.c:1:
./xml_document.h:5:16: error: st.h: No such file or directory
xml_reader.c: In function ‘attribute_nodes’:
xml_reader.c:171: warning: cast discards qualifiers from pointer target type
xml_reader.c: In function ‘attribute_at’:
xml_reader.c:199: warning: conversion to ‘int’ from ‘long int’ may alter its value
xml_reader.c: In function ‘from_memory’:
xml_reader.c:466: warning: conversion to ‘int’ from ‘long int’ may alter its value
xml_reader.c:474: warning: conversion to ‘int’ from ‘long int’ may alter its value
xml_reader.c: In function ‘from_io’:
xml_reader.c:506: warning: conversion to ‘int’ from ‘long int’ may alter its value
make: *** [xml_reader.o] Error 1
Comments
tenderlove
Thu Aug 06 13:35:47 -0700 2009
| link
Ruby 1.9.0 is not supported. You should upgrade to 1.9.1-p129 or even 1.9.2; 1.9.0 is too broken to be supported. :-(
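A guard along these lines could make an unsupported interpreter fail fast with a clear message instead of dying partway through the native build. This is only a sketch: `unsupported_ruby?` is a hypothetical helper, not something Nokogiri's extconf.rb is known to contain, and the "1.9.0 only" range is taken from the comment above.

```ruby
require 'rubygems' # provides Gem::Version on older Rubies

# Hypothetical helper: true only for the 1.9.0 series, which the comment
# above describes as unsupported. 1.8.x and 1.9.1 or later pass.
def unsupported_ruby?(version = RUBY_VERSION)
  v = Gem::Version.new(version)
  v >= Gem::Version.new('1.9.0') && v < Gem::Version.new('1.9.1')
end

# Stop early with a readable message on an unsupported interpreter.
if unsupported_ruby?
  abort "Ruby #{RUBY_VERSION} is not supported; please upgrade to 1.9.1 or later."
end
```

Comparing `Gem::Version` objects rather than raw strings avoids lexical surprises: as plain strings, "1.10.0" sorts before "1.9.0".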
-
"ArgumentError: NULL pointer given" on calling doc.meta_encoding
1 comment Created 3 months ago by arndtjenssen. Labels: 1.4.0, tenderlove
Thrown on (malformed) documents with missing encoding info on v1.3.2.
Steps to reproduce:
require 'nokogiri'
require 'open-uri'
doc = Nokogiri::HTML(open('http://www.europapress.es/valencia/noticia-moby-dick-john-huston-obrira-cap-setmana-filmoteca-destiu-valencia-20090731183544.html'))
doc.meta_encoding
results in:
ArgumentError: NULL pointer given
from (irb):55:in `meta_encoding'
from (irb):55
from :0
Comments
tenderlove
Thu Aug 06 21:24:55 -0700 2009
| link
returns nil when an HTML document does not declare a meta encoding tag. closed by d0e9312
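With that change, callers need to handle a nil result themselves. A minimal sketch of one way to do so, assuming a fallback to UTF-8 is acceptable for the application; `encoding_or_default` is a hypothetical helper, and the stand-in object below exists only so the sketch runs without the gem installed:

```ruby
# Hypothetical helper: use the declared meta encoding when there is one,
# otherwise fall back to a default chosen by the caller.
def encoding_or_default(doc, default = 'UTF-8')
  doc.meta_encoding || default
end

# With Nokogiri installed, usage would look like:
#   doc = Nokogiri::HTML('<html><body>hi</body></html>')
#   encoding_or_default(doc)

# Stand-in for a parsed document (an assumption for illustration only).
fake_doc = Struct.new(:meta_encoding)
puts encoding_or_default(fake_doc.new(nil))
puts encoding_or_default(fake_doc.new('ISO-8859-1'))
```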
-
Error parsing provided css_path()
5 comments Created 3 months ago by alkanshel. Labels: 1.4.0, tenderlove
Nokogiri occasionally has issues parsing the css_path it provides. A sample case is the following path (retrieved by running Node.css_path()):
html > body > div > div:nth-of-type(2) > div > text():nth-of-type(2)
I'm reasonably certain the cause is the 'text():nth-of-type(2)', which is generated from the corresponding .path of /html/body/div/form/table[3]/tr[11]/td/div/div/div/div/table/tr[2]/td[2]/span[2]/text().
The error message is 'Unexpected ':' in #<Nokogiri::CSS::...'
Comments
tenderlove
Thu Aug 06 20:32:08 -0700 2009
| link
That XPath doesn't result in the CSS path you're showing. Can you give me a code example with reference HTML?
Hmm. Okay, I'll check my code/data set and see if I can replicate the issue with better documentation.
tenderlove
Fri Aug 07 09:13:55 -0700 2009
| link
Thank you, I'd really appreciate it!
Weird. I ran Nokogiri over the same data set and looked over the CSS paths generated, and I can't seem to replicate the issue. I'll have to chalk it up to solar radiation or PEBKAC, then. Sorry about that.
tenderlove
Fri Aug 07 19:19:11 -0700 2009
| link
Haha! No problem. If you run across it again, please let me know!
-
Fix platform detection code
1 comment Created 3 months ago by tenderlove. Labels: 1.4.0, ffi, tenderlove
We need to be able to tell when a user is running on Windows from the host OS, not just from the platform. Right now, the code looks at the platform when it needs to look at the OS. Switch to this (from Luis):
RUBY_PLATFORM + RbConfig::CONFIG['host_os']
A similar approach is used by mspec and RubySpec to determine which API behavior should be checked for Java on Windows.
Comments
tenderlove
Thu Aug 06 21:24:55 -0700 2009
| link
using host OS to figure out ENV["PATH"]. closed by 544b431
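For illustration, a check based on the host OS might look like the sketch below. `windows_host?` is a hypothetical helper, not the code from the commit, and the list of patterns (including treating Cygwin as Windows) is an assumption.

```ruby
require 'rbconfig'

# Hypothetical helper: decide "is this Windows?" from the host OS string
# rather than RUBY_PLATFORM, which is just "java" under JRuby.
def windows_host?(host_os = RbConfig::CONFIG['host_os'])
  # Explicit names only: a bare /win/ would also match "darwin".
  !!(host_os =~ /mswin|mingw|cygwin|windows/i)
end

puts "host_os=#{RbConfig::CONFIG['host_os']} windows=#{windows_host?}"
```

Passing the host OS as a parameter keeps the check easy to exercise with strings from other systems.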
-
Package libxml2 DLLs with jruby gem
2 comments Created 3 months ago by tenderlove. Labels: 1.4.0, ffi, tenderlove
Package the libxml2 DLLs with the JRuby gem so that it can work on Windows under JRuby.
Comments
tenderlove
Thu Jul 30 09:44:59 -0700 2009
| link
I forgot, people seem to get this error:
C:/Program Files/JRuby/jruby-1.3.1/bin/../lib/ruby/1.8/ffi/ffi.rb:114:in `create_invoker': Function 'calloc' not found in [exslt] (FFI::NotFoundError)
from C:/Program Files/JRuby/jruby-1.3.1/bin/../lib/ruby/1.8/ffi/library.rb:50:in `attach_function'
from C:/Program Files/JRuby/jruby-1.3.1/bin/../lib/ruby/1.8/ffi/library.rb:48:in `each'
from C:/Program Files/JRuby/jruby-1.3.1/bin/../lib/ruby/1.8/ffi/library.rb:48:in `attach_function'
from C:/Program Files/JRuby/jruby-1.3.1/lib/ruby/gems/1.8/gems/nokogiri-1.3.2-x86-mswin32/lib/nokogiri/ffi/libxml.rb:54
from C:/Program Files/JRuby/jruby-1.3.1/lib/ruby/gems/1.8/gems/nokogiri-1.3.2-x86-mswin32/lib/nokogiri/ffi/libxml.rb:31:in `require'
from C:/Program Files/JRuby/jruby-1.3.1/bin/../lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:31:in `require'
from C:/Program Files/JRuby/jruby-1.3.1/lib/ruby/gems/1.8/gems/nokogiri-1.3.2-x86-mswin32/lib/nokogiri.rb:10
from C:/Program Files/JRuby/jruby-1.3.1/lib/ruby/gems/1.8/gems/nokogiri-1.3.2-x86-mswin32/lib/nokogiri.rb:36:in `require'
from C:/Program Files/JRuby/jruby-1.3.1/bin/../lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:36:in `require'
from TestNokogiri.rb:3
tenderlove
Wed Aug 12 11:45:21 -0700 2009
| link
This was fixed here: 70ad006
-
Windows Nokogiri 1.3.3 - ALWAYS LoadError: no such file to load -- nokogiri/1.9/nokogiri
1 comment Created 3 months ago by ocoolio
No matter how you set it up, on every Windows system the newest Nokogiri (1.3.3) fails to load with the following error (on both Ruby 1.9 and 1.8):
irb(main):001:0> require 'rubygems'
=> false
irb(main):002:0> require 'nokogiri'
LoadError: no such file to load -- nokogiri/1.9/nokogiri
from c:/Ruby/lib/ruby/gems/1.9.1/gems/nokogiri-1.3.3-x86-mingw32/lib/nokogiri/nokogiri.rb:1:in `require'
from c:/Ruby/lib/ruby/gems/1.9.1/gems/nokogiri-1.3.3-x86-mingw32/lib/nokogiri/nokogiri.rb:1:in `'
from c:/Ruby/lib/ruby/gems/1.9.1/gems/nokogiri-1.3.3/lib/nokogiri.rb:12:in `require'
from c:/Ruby/lib/ruby/gems/1.9.1/gems/nokogiri-1.3.3/lib/nokogiri.rb:12:
1.3.2 works great
Comments
tenderlove
Tue Jul 28 13:20:50 -0700 2009
| link
Sorry about that. I uploaded the wrong gems. Please uninstall that version and try again.
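The path in that LoadError suggests the precompiled Windows gem looks for its native extension in a directory named after the Ruby major.minor version. A minimal sketch of that kind of lookup, assuming this layout; the helper name `native_extension_path` is hypothetical and the gem's real loader may differ:

```ruby
# Hypothetical helper: build the require path for a version-specific
# native extension from a Ruby version string.
def native_extension_path(ruby_version = RUBY_VERSION)
  major_minor = ruby_version[/\A\d+\.\d+/] # "1.9.1" -> "1.9"
  "nokogiri/#{major_minor}/nokogiri"
end

puts native_extension_path('1.9.1')
puts native_extension_path('1.8.6')
```

Under that assumption, a gem packaged without the matching directory would fail at `require` with exactly this kind of "no such file to load" error.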
-
document.rb:104: [BUG] object allocation during garbage collection phase
5 comments Created 3 months ago by darrylmdalessio. Labels: REE
require 'nokogiri'

GC_HACK = false
GC.disable if GC_HACK # will delay "error : Name is not from the document dictionnary"
gc_count = 0
cycles = 0
loop do
  cycles = cycles + 1
  if GC_HACK
    if gc_count > 10000
      GC.enable
      GC.start
      p "gc start cycles: #{cycles}"
      sleep 10
      gc_count = 0
      GC.disable
    end
    gc_count = gc_count + 1
  end
  p "cycles: #{cycles}" if cycles%1000 == 0
  doc = Nokogiri::XML::Document.parse("<bad>blinky</bad>")
  doc.xpath('/bad').each{ |t|
    new_node = Nokogiri::XML::Node.new('bad', doc)
    new_node.content = 'clyde'
    t.replace(new_node)
  }
end
# spits out:
# "element bad: error : Name is not from the document dictionnary 'bad'"
# between 2 and 20 thousand times then dies with:
#/opt/ruby-enterprise-1.8.6-20090610/lib/ruby/gems/1.8/gems/nokogiri-1.3.2/lib/nokogiri/xml/document.rb:104: [BUG] object allocation during garbage collection phase
#ruby 1.8.6 (2008-08-11) [i686-linux]
#

Thanks for the help on this.
fixed in 6510fe7
ah, for clarity, this is flavorjones. wrong login.
I concur. that was me.
closing. this will be in 1.4.1 sometime in the next week or so.