Fixed bug #61 (Parser hangs on some input) #62

Closed
wants to merge 3 commits into
from

@arnaud-lb

This fixes bug #61 (Parser hangs on some input).

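The bug is caused by _reTagName (/^\s*(\/?)\s*([^\s\/]+)/) backtracking excessively when the input string contains long runs of whitespace. The two \s* groups are separated only by an optional slash, so when no tag name follows, the regex engine retries many ways of splitting a single whitespace run between them before giving up.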
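
To illustrate the ambiguity, here is a minimal sketch that compares the pattern quoted above with one possible less ambiguous rewrite. The names reTagNameOriginal, reTagNameLinear and parseTagName are hypothetical, and the rewritten pattern is an assumption made for illustration, not necessarily the change made in this pull request's commits.

```javascript
// Pattern quoted in the description above.
const reTagNameOriginal = /^\s*(\/?)\s*([^\s\/]+)/;

// Hypothetical alternative (an assumption, not the patch from this PR):
// the second \s* is only reachable after a literal "/", so a run of
// whitespace can be consumed in exactly one way.
const reTagNameLinear = /^\s*(?:(\/)\s*)?([^\s\/]+)/;

// Returns { closing, name }, or null when no tag name is found.
// The slash group is "" or undefined when absent, so normalise it to a boolean.
function parseTagName(re, input) {
  const m = re.exec(input);
  if (m === null) return null;
  return { closing: m[1] === "/", name: m[2] };
}

// Print what both patterns return for a few small inputs.
const samples = ["div", "/div", "  / span class", "   ", " / ", "br/"];
for (const s of samples) {
  const a = parseTagName(reTagNameOriginal, s);
  const b = parseTagName(reTagNameLinear, s);
  console.log(JSON.stringify(s), JSON.stringify(a), JSON.stringify(b));
}
```

Both patterns accept the same strings. The difference is only in how much work a failed match costs: with the original, a long whitespace run that is not followed by a tag name can be divided between the two \s* groups at every position, while the sketch leaves the engine a single way to consume it.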

@kirbysayshi kirbysayshi pushed a commit to kirbysayshi/node-htmlparser that referenced this pull request Dec 19, 2013
@fb55 fb55 [tokenizer] re-added the carriage return as whitespace
fixes #62

apparently Google's gumbo-parser does behave this way:
https://github.com/google/gumbo-parser/blob/101726c50e172e45be6002c51b85e45f27f0c2c6/src/tokenizer.c#L322
163a4ce
@arnaud-lb arnaud-lb closed this Dec 17, 2015