DocumentFragment.parse of certain attribute values mangles document on JRuby #747

Closed
jmcnevin opened this issue Aug 14, 2012 · 2 comments

@jmcnevin

Nokogiri 1.5.5
JRuby 1.6.7.2

On JRuby, attribute values that begin with something resembling a protocol (a run of characters followed by a colon, such as embedded:) appear to be parsed incorrectly, which mangles the resulting document.

I created a small script to test this:

require 'nokogiri'

test_string = <<-EOF
<p>This is a sample document that has been created as an example of a link to a file that is not an .html document.</p>
<p>
  <img src="embedded:image1.png" alt="image1.png" />
</p>
EOF

doc = Nokogiri::HTML::DocumentFragment.parse(test_string)

puts doc.to_s

CRuby:

<p>This is a sample document that has been created as an example of a link to a file that is not an .html document.</p>
<p>
  <img src="embedded:image1.png" alt="image1.png"></p>

JRuby:

<p>This is a sample document that has been created as an example of a link to a file that is not an .html document.</p>
<p>
  <img image1.png="">
</p>
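
To check other inputs for the same pattern, the "looks like a protocol" condition can be approximated with a small stdlib-only helper. This is a minimal sketch: scheme_like_attributes and both regular expressions are hypothetical names introduced here, not part of Nokogiri, and the prefix rule (a letter, then letters, digits, "+", "." or "-", then a colon) is an assumption borrowed from URI scheme syntax. The issue does not identify the exact trigger inside the JRuby parser.

```ruby
# Hypothetical helper, not part of Nokogiri: lists double-quoted attribute
# values that start with a URI-scheme-like prefix (characters, then a colon).
SCHEME_PREFIX = /\A[A-Za-z][A-Za-z0-9+.\-]*:/
ATTRIBUTE     = /([A-Za-z_][\w\-]*)\s*=\s*"([^"]*)"/

def scheme_like_attributes(html)
  # String#scan with two capture groups yields [name, value] pairs.
  html.scan(ATTRIBUTE).select { |_name, value| value =~ SCHEME_PREFIX }
end

sample = '<img src="embedded:image1.png" alt="image1.png" />'
p scheme_like_attributes(sample)
# prints [["src", "embedded:image1.png"]]
```

The helper only scans text, so it also flags ordinary values such as http: URLs; whether those were affected is not stated in this report.
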
@yokolet
Member

yokolet commented Aug 19, 2012

Hello!

I got the output below on Nokogiri master and JRuby 1.7.0.preview2:

<p>This is a sample document that has been created as an example of a link to a file that is not an .html document.</p>
<p>
  <img alt="image1.png" src="embedded:image1.png">
</p>

So it looks like the bug has already been fixed by another bug fix.

@jvshahid
Member

Fixed.
