Omittable closing tags #54

Open
wants to merge 2 commits into
from

6 participants

jankuca commented Aug 16, 2012

Hi,

there are several situations in which HTML5 lets you omit the closing tags of some elements:

<p>First paragraph.
<p>Second paragraph.

This is valid HTML5 code and should be parsed as

<p>First paragraph.</p>
<p>Second paragraph.</p>

Instead, the parser nests the second paragraph inside the first:

<p>First paragraph.
  <p>Second paragraph.</p>
</p>

I tweaked the 1.x version of the parser. It should now correctly add closing tags when they are intentionally omitted.

The rules are in the closingOpeningTags map. For each element whose closing tag is omittable, it lists the opening tags that implicitly close that element.

With this change, the actual result for the HTML code above is

<p>First paragraph.
</p><p>Second paragraph.</p>

(which is totally correct).
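
To make the idea of the closingOpeningTags rules concrete, here is a minimal, standalone sketch. It is not the code from this pull request: only the name closingOpeningTags comes from the description above, while the entries in the map, the voidElements set and the insertOmittedClosingTags function are assumptions made for illustration. It only looks at the innermost open element, so it is much simpler than a real HTML5 parser.

```javascript
// Hypothetical rule table. The name comes from the PR description; the
// entries are an assumed subset, not the actual contents of the patch.
// Key: element with an omittable closing tag.
// Value: opening tags that implicitly close it.
const closingOpeningTags = {
  p: ['p', 'div', 'ul', 'ol', 'table', 'h1', 'h2', 'h3'],
  li: ['li'],
  dt: ['dt', 'dd'],
  dd: ['dt', 'dd'],
  option: ['option', 'optgroup'],
};

// Assumed subset of void elements, which never get a closing tag.
const voidElements = new Set(['br', 'hr', 'img', 'input', 'link', 'meta']);

function isOmittable(name) {
  return Object.prototype.hasOwnProperty.call(closingOpeningTags, name);
}

// Returns the markup with omitted closing tags written out explicitly.
function insertOmittedClosingTags(html) {
  const tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>/g;
  const open = []; // stack of currently open element names
  let out = '';
  let last = 0;
  let match;

  while ((match = tagPattern.exec(html)) !== null) {
    out += html.slice(last, match.index); // copy text before the tag
    last = tagPattern.lastIndex;
    const name = match[2].toLowerCase();

    if (match[1] === '/') {
      // Explicit closing tag: first close open descendants whose
      // closing tags were omitted, then pop the element itself.
      const idx = open.lastIndexOf(name);
      if (idx !== -1) {
        while (open.length - 1 > idx) {
          const inner = open.pop();
          if (isOmittable(inner)) out += '</' + inner + '>';
        }
        open.pop();
      }
    } else {
      // Opening tag: does it implicitly close the innermost element?
      const top = open[open.length - 1];
      if (top !== undefined && isOmittable(top) &&
          closingOpeningTags[top].includes(name)) {
        out += '</' + open.pop() + '>';
      }
      if (!voidElements.has(name)) open.push(name);
    }
    out += match[0];
  }

  out += html.slice(last); // trailing text after the last tag
  // End of input closes whatever is left with an omittable closing tag.
  while (open.length > 0) {
    const name = open.pop();
    if (isOmittable(name)) out += '</' + name + '>';
  }
  return out;
}

console.log(insertOmittedClosingTags('<p>First paragraph.\n<p>Second paragraph.'));
```

For the two-paragraph input, this sketch places the inserted closing tag directly before the next opening tag, which is the same placement as in the result shown above. Because it only checks the innermost open element, an li opened inside a nested ul does not close the li of the outer list.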

I hope you can include this in the codebase. Cheers!

P.S. My editor also removed trailing white space from all the lines. Let me know if that's a problem for you.

@jankuca jankuca referenced this pull request in tmpvar/jsdom Aug 16, 2012

Closed

Incorrect HTML parsing, omitted closing tags #482

domenic commented Oct 10, 2012

+1!!

+1 please merge

catilac commented Jan 10, 2013

+1

fb55 commented Apr 7, 2013

<ul><li><ul><li><ul><li><ul></ul></li><li></li></ul>.

@AndreasMadsen AndreasMadsen referenced this pull request in AndreasMadsen/htmlparser2 Jul 31, 2013

@fb55 fb55 Merge pull request #54 from abarre/master
[tokenizer] fix perf regression
a842129