Update to 1.6.9 breaks HTML entities #62

cyruscollier · 2016-04-04T20:31:36Z

When I updated to 1.6.9, it appears the way HTML entities are treated in parsed documents has changed, and it incorrectly handles some characters. The only entity I see so far that converts incorrectly is &#x69; (lower case "i"). When requesting a node's text with this entity in it, it returns "\n5;" instead (newline, "5", semicolon). As soon as I reverted to 1.6.8, the problem went away.

Is it possible to disable all entity handling entirely, since I can easily do that myself with html_entity_decode() if I need to?
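For reference, decoding entities on the caller's side could look roughly like the sketch below. It assumes the node text comes back with its entities untouched; the `$raw` string is a made-up sample, not output from the parser.

```php
<?php
// Hypothetical sample text, standing in for a node's text with entities left intact.
$raw = 'Caf&eacute; &#105;n &#x69;talics &amp; more';

// html_entity_decode() handles named, decimal, and hexadecimal entities in one pass.
// ENT_HTML5 selects the HTML5 entity table; UTF-8 is the target encoding.
$decoded = html_entity_decode($raw, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $decoded, PHP_EOL; // Café in italics & more
```

Both &#105; and &#x69; decode to the same lower case "i", so either form is safe to pass through this call.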

paquettg · 2016-04-05T18:21:58Z

Interesting, we do not do any handling of entities in the module. We do use the &#10 entity to preserve new lines though, which might be what is causing this issue for you.

I am not able to duplicate this solely with the line you provided; it just returns the correct line. I have added a test to confirm this (I'll tag this comment in my new test).

Could you please change the test so that it breaks for you/causes the issue you are experiencing? After that is done I can get to fixing the issue.

thanks.

paquettg · 2016-04-05T18:22:57Z

Here is the commit that added the test.
01a9e2f

cyruscollier · 2016-04-05T18:39:03Z

Thanks, I'll take a look at the test and see if I can replicate it. The newline entity you mentioned may indeed have something to do with it. Is it possible that hexadecimal entities are converted to decimal ones somewhere in the process? The hex entity &#x69; becomes the dec entity &#105;, which matches my output if that entity somehow gets broken up into two parts.

paquettg · 2016-04-05T18:53:08Z

hey @clarinetlord,

I added a semicolon to the end of my conversion code, which really should have been there in the first place... this is probably why they have that.

This should solve the issue. Let me know if you have any other issue.

cyruscollier · 2016-04-05T19:25:07Z

Yep, looks like that did it! You could now change that new TextNode test if you want, to test something like &#105; instead of &#x69;, since any decimal entities starting with &#10 are what actually broke, not that particular hex entity I originally thought.
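The failure mode described in this thread can be reproduced in isolation. The sketch below is only an illustration: `restoreNewlinesBuggy()` and `restoreNewlinesFixed()` are hypothetical stand-ins, not the library's actual conversion code, and they assume the newline handling is a plain string replacement.

```php
<?php
// Hypothetical stand-in for a newline conversion that omits the semicolon.
function restoreNewlinesBuggy(string $text): string
{
    // Matches the bare prefix "&#10", so it also fires inside "&#105;".
    return str_replace('&#10', "\n", $text);
}

// Hypothetical stand-in for the corrected conversion.
function restoreNewlinesFixed(string $text): string
{
    // Matching the full entity, semicolon included, leaves "&#105;" alone.
    return str_replace('&#10;', "\n", $text);
}

// Made-up input containing both a decimal "i" entity and a newline entity.
$input = 'l&#105;ne one&#10;line two';

var_dump(restoreNewlinesBuggy($input)); // "&#105;" is split into a newline plus "5;"
var_dump(restoreNewlinesFixed($input)); // "&#105;" survives; only "&#10;" becomes a newline
```

Under these assumptions, any decimal entity whose digits begin with 10 (&#100; through &#109;, &#1000; and so on) would be corrupted by the prefix match, which is consistent with the "\n5;" output reported above.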

paquettg closed this as completed in d68e966 Apr 5, 2016
