<source> tags are not parsed properly #50

secretrobotron opened this Issue Jun 22, 2012 · 0 comments


I might be reading the output wrong, but this is the second time I've run into this. It seems that <source> isn't treated as a void tag, so consecutive <source> tags become children of one another when listed inside a <video>:

var htmlparser = require('htmlparser');

var htmlContent = "<html><head></head><body><video><source src=\"foo.ogv\"><source src=\"lol.smaz\"></video><div></div></body></html>";

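var handler = new htmlparser.DefaultHandler(function (error, dom) {
  // Print each node's name, indented by its depth in the tree.
  function parse(dom, spacing){
    console.log(spacing, dom.name);
    // Guard: nodes without children may have no `children` array.
    var children = dom.children || [];
    for(var i=0; i<children.length; ++i){
      parse(children[i], spacing + ' ');
    }
  }
  parse(dom[0], '');
});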

new htmlparser.Parser(handler).parseComplete(htmlContent);
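
For reference, here is a rough sketch of the behaviour I would expect. It is not htmlparser's code: `VOID_TAGS`, `buildTree`, and the token shape are all made up for illustration. The idea is only that a void tag gets attached to its parent but is never pushed onto the open-element stack, so the next <source> ends up as a sibling rather than a child.

```javascript
// Hypothetical sketch, NOT htmlparser's implementation.
// Void elements as listed in the HTML spec: they never take children.
var VOID_TAGS = {
  area: true, base: true, br: true, col: true, embed: true, hr: true,
  img: true, input: true, link: true, meta: true, source: true,
  track: true, wbr: true
};

function isVoid(name) {
  return Object.prototype.hasOwnProperty.call(VOID_TAGS, name);
}

// Builds a tree from a flat token list. The token shape
// { type: 'open' | 'close', name: string } is an assumption for this sketch.
function buildTree(tokens) {
  var root = { name: 'root', children: [] };
  var stack = [root];
  tokens.forEach(function (token) {
    var top = stack[stack.length - 1];
    if (token.type === 'open') {
      var node = { name: token.name, children: [] };
      top.children.push(node);
      // Only non-void elements stay open and can receive children.
      if (!isVoid(token.name)) {
        stack.push(node);
      }
    } else if (stack.length > 1 && top.name === token.name) {
      stack.pop();
    }
  });
  return root;
}

// Same shape as the <video> from the snippet above.
var tree = buildTree([
  { type: 'open', name: 'video' },
  { type: 'open', name: 'source' },
  { type: 'open', name: 'source' },
  { type: 'close', name: 'video' }
]);
var video = tree.children[0];
console.log(video.children.length); // 2: both <source> tags are siblings
```

With this approach both <source> nodes sit directly under <video>, and neither has children of its own.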

AndreasMadsen referenced this issue in AndreasMadsen/htmlparser2 Jun 10, 2013

Merge pull request #50 from AndreasMadsen/long-cdata-ending
[Tokenizer] don't reset CDATA state in case of long endings