DocumentFragment.parse of certain attribute values mangles document on JRuby #747

Closed
jmcnevin opened this Issue · 2 comments

@jmcnevin

Nokogiri 1.5.5
JRuby 1.6.7.2

There seems to be an issue on JRuby with attribute values that begin with something that looks like a protocol (some characters followed by a colon). Such attributes are mangled when the document is parsed.

I created a small script to test this:

require 'nokogiri'

test_string = <<-EOF
<p>This is a sample document that has been created as an example of a link to a file that is not an .html document.</p>
<p>
  <img src="embedded:image1.png" alt="image1.png" />
</p>
EOF

doc = Nokogiri::HTML::DocumentFragment.parse(test_string)

puts doc.to_s

CRuby:

<p>This is a sample document that has been created as an example of a link to a file that is not an .html document.</p>
<p>
  <img src="embedded:image1.png" alt="image1.png"></p>

JRuby:

<p>This is a sample document that has been created as an example of a link to a file that is not an .html document.</p>
<p>
  <img image1.png="">
</p>
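
To find which attributes in a fragment match the pattern described above (characters, then a colon), a rough pure-Ruby sketch like the following might help. The helper name `scheme_like_attributes` and the `SCHEME_LIKE` pattern are hypothetical and not part of Nokogiri; the regex-based attribute scan is only an approximation that handles double-quoted values.

```ruby
# Hypothetical pattern: a letter followed by scheme-style characters and a colon,
# e.g. "embedded:", "http:", "data:". This is an assumption about what
# "looks like a protocol" means in this report.
SCHEME_LIKE = /\A[A-Za-z][A-Za-z0-9+.\-]*:/

# Hypothetical helper (not a Nokogiri API): returns [name, value] pairs for
# double-quoted attributes whose value starts with a scheme-like prefix.
def scheme_like_attributes(html)
  html.scan(/([\w:-]+)\s*=\s*"([^"]*)"/).select do |_name, value|
    value =~ SCHEME_LIKE
  end
end

sample = '<p><img src="embedded:image1.png" alt="image1.png" /></p>'
puts scheme_like_attributes(sample).inspect
```

Each reported pair could then be compared with what the parsed fragment returns, for example `doc.at_css('img')['src']`, to see whether the value survived parsing on a given Ruby implementation.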
@yokolet
Owner

Hello!

I got the output below on Nokogiri master and JRuby 1.7.0.preview2:

<p>This is a sample document that has been created as an example of a link to a file that is not an .html document.</p>
<p>
  <img alt="image1.png" src="embedded:image1.png">
</p>

So it looks like this bug has already been fixed as a side effect of another bug fix.

@jvshahid
Owner

Fixed.

@jvshahid jvshahid closed this