Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A bug of htmlparse #507

Closed
zyf0330 opened this issue Feb 25, 2016 · 12 comments
Closed

A bug of htmlparse #507

zyf0330 opened this issue Feb 25, 2016 · 12 comments

Comments

@zyf0330
Copy link

zyf0330 commented Feb 25, 2016

I am using version 1.1.1, and there is a bug. I have tested with 1.2.0 version in http://kangax.github.io/html-minifier/ , the same situation took place.
The problem appears at the comment // Start tag: of htmlparser.js, specific code is match = html.match( startTag );.
Please see here jsbin, but please copy it to someplace with a console.time

@alexlamsl
Copy link
Collaborator

Would you mind just pasting here an example HTML fragment which html-minifier chokes on?

I couldn't copy anything from your jsbin link because the tab just hangs there. And by "bug" do you mean a crash, an unexpected output, or something else?

@zyf0330
Copy link
Author

zyf0330 commented Feb 25, 2016

This is

var str = "objpicsurl_";
var unnormalAttr = 'ss"123';
var s = '<tag v-ref:vm_pv :imgs=" '+ str +' "></tag>';
var s1 = '  <tag v-ref:vm_pv :imgs=" '+ str +' " '+ unnormalAttr +'></tag>';
var s2 = '<tag v-ref:vm_pv :imgs=" '+ str +' " '+ unnormalAttr +'></tag>';
var startTag = /^<([\w:-]+)((?:\s*[\w:\.-]+(?:\s*(?:(?:=))\s*(?:(?:"[^"]*")|(?:'[^']*')|[^>\s]+))?)*)\s*(\/?)>/;

console.time('not exists unormal HTML tag');
s.match(startTag);
console.timeEnd('not exists unormal HTML tag');

console.log('exists unormal HTML tag and start with a whitespace');
for(var i in str){
    console.time(i);
    s1.replace(str, str.substring(0, str.length - i)).match(startTag);
    console.timeEnd(i);
}

console.log('exists unormal HTML tag and not start with a whitespace');
for(var i in str){
    console.time(i);
    s2.replace(str, str.substring(0, str.length - i)).match(startTag);
    console.timeEnd(i);
}
console.log('If I add a char into str, time increase to about double.');

@alexlamsl
Copy link
Collaborator

Can you paste the actual HTML content you'd pass to html-minifier?

@zyf0330
Copy link
Author

zyf0330 commented Feb 25, 2016

Ok. <tag v-ref:vm_pv :imgs=" objpicsurl_ " ss"123></tag>, this is the bad HTML tag.

@zyf0330
Copy link
Author

zyf0330 commented Feb 25, 2016

console.log('If I add a char into str, time increase to about double.');
This is the most important.

@alexlamsl
Copy link
Collaborator

It is invalid HTML/XML indeed. So I guess the problem here is you'd expect it to error out, but it is taking a long time before it'd do so?

@zyf0330
Copy link
Author

zyf0330 commented Feb 25, 2016

Yeah, you are right. It is a invalid HTML tag.
And regex match returns null as expect.
But it take so long time.

@zyf0330
Copy link
Author

zyf0330 commented Feb 25, 2016

11 chars taste 4300ms, 12 chars taste 8400ms about double.
But it is not only one problem about invalid HTML tag.
s1 is also an invalid HTML tag, but it starts with a tab, so regex match of it is so fast.
s2 starts without a tab or other whitespaces, regex match of it is so slow, and time increases double with char length increass of attr value.

@zyf0330
Copy link
Author

zyf0330 commented Feb 25, 2016

Sorry, I type "waste" wrongly to "taste".

@alexlamsl
Copy link
Collaborator

We are on the same page now 👍

I'll take a look at it - thanks for reporting!

@alexlamsl
Copy link
Collaborator

fix submitted in #510 - unfortunately it's on a branch other than gh-pages, so I can't give you an online version to test until it is merged 😢

@zyf0330
Copy link
Author

zyf0330 commented Feb 25, 2016

OK, will see later. Thanks

On Thu, Feb 25, 2016, 18:44 Alex Lam S.L. notifications@github.com wrote:

fix submitted in #510 #510

  • unfortunately it's on a branch other than gh-pages, so I can't give you
    an online version to test until it is merged [image: 😢]


Reply to this email directly or view it on GitHub
#507 (comment)
.

追随

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants