Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Loading…

Add failing test case for Nokogiri::XML::Node#content inconsistency between Java and C #794

Closed
wants to merge 2 commits into from

2 participants

@benlangfeld

C implementation includes content of child nodes recursively, Java does not.

@benlangfeld

Any guidance on how I should approach fixing this would be most appreciated.

@jvshahid
Owner

sorry for the late response. I'll take a look and get back to you shortly.

@jvshahid
Owner

Can you explain what's wrong with the test in commit 0e43632, i mean what are you trying to test ?

@benlangfeld

Note that this is, I hope, the last thing blocking my Nokogiri-dependent stuff from all passing on JRuby. I would love to get this in to 1.5.6, assuming that is coming soon. It'll be beers all round at that point :)

@benlangfeld

On JRuby, node.content returns "I <3 nachos". On libxml it returns "I ". A similar thing can be seen in the rendered XML. It seems libxml is assuming all instances of "<" mean there's an XML element coming and doesn't escape correctly.

@jvshahid
Owner

That's what I thought. I think in this case the C implementation is doing the right thing, the << method on nodes accept markup, which explains why everything after the < was dropped. This is just the way libxml handles invalid markup.

@benlangfeld

So how then do I append a string containing "<"? I can't do node.content += "<". That would work, other than the issue in the other test here where on libxml #content walks all children (which I believe is correct, though perhaps this should be optional).

@jvshahid
Owner

You can use node.content += on the text node or insert a new text node if there isn't one. Element nodes don't have text content by definition. If that still doesn't make sense, can you please send an example of what you're trying to do.

@benlangfeld
@jvshahid
Owner

No worries. I agree that the behavior should be consistent. I'll merge the changes in 311be4b on master but I won't merge the other test.

I'll work on the fix some time later today.

@jvshahid jvshahid referenced this pull request from a commit in jvshahid/nokogiri
@jvshahid jvshahid Initial fix for #794. I think the solution can be drastically simplif…
…ied. Details below.

This patch changes the behavior of node.content to recursively
aggregate text from children nodes, unless this is a Text node in
which case the content is returned.

In order to support recursively appending contents of children
elements and to fix the broken append behavior demonstrated by the
test in the previous commit, I had to remove the caching (which I'm
not sure why it was there in the first place) Also since we
recursively append text, we'll have to invalidate the cache when
changes deeper in the document tree occur.

Also I removed the content() from XML::Text since we don't need it
anymore.
1928899
@jvshahid jvshahid referenced this pull request from a commit in jvshahid/nokogiri
@jvshahid jvshahid pushed the fix for #794 to the ruby land.
This solution is much cleaner, but may not be as efficient.
This can be reverted to the previous commit which was pure
Java if there are issues with this implementation.
96f91eb
@jvshahid jvshahid referenced this pull request
Closed

Fix for pull request #794 #797

@jvshahid
Owner

Sorry it took too long to push a fix. I had to wrap my head around the current implementation. I pushed another pull request with the first test and its fix. I'd like @yokolet to review the changes before it gets merged into master.

I'm going to close the issue, but feel free to comment on the new pull request if the problem isn't solved.

@jvshahid jvshahid closed this
@benlangfeld
@yokolet yokolet referenced this pull request from a commit
@yokolet yokolet Merges java version of fix of a pull request #797, which incluse a te…
…st case from pull request #794.
3b905f6
@jvshahid jvshahid referenced this pull request from a commit in jvshahid/nokogiri
@jvshahid jvshahid Revert "pushed the fix for #794 to the ruby land."
This reverts commit 96f91eb.
b4f3f47
@patcheng patcheng referenced this pull request from a commit
@yokolet yokolet Merges java version of fix of a pull request #797, which incluse a te…
…st case from pull request #794.
bd44ce7
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
This page is out of date. Refresh to see the latest.
Showing with 10 additions and 0 deletions.
  1. +10 −0 test/xml/test_node.rb
View
10 test/xml/test_node.rb
@@ -798,6 +798,10 @@ def test_content_equals
node.content = "1234 <-> 1234"
assert_equal "1234 <-> 1234", node.content
assert_equal "<form>1234 &lt;-&gt; 1234</form>", node.to_xml
+
+ node.content = '1234'
+ node.add_child '<foo>5678</foo>'
+ assert_equal '12345678', node.content
end
def test_set_content_should_unlink_existing_content
@@ -807,6 +811,12 @@ def test_set_content_should_unlink_existing_content
children.each { |child| assert_nil child.parent }
end
+ def test_append_content_escaping
+ node = Nokogiri::XML::Node.new('form', @xml)
+ node << "I <3 nachos"
+ assert_equal "I <3 nachos", node.content
+ end
+
def test_whitespace_nodes
doc = Nokogiri::XML.parse("<root><b>Foo</b>\n<i>Bar</i> <p>Bazz</p></root>")
children = doc.at('//root').children.collect{|j| j.to_s}
Something went wrong with that request. Please try again.