htmlparser2 adapter uses no html-encoding in text nodes #8
jsdom's entity decoding isn't entirely spec-compliant & should be replaced with the parser-provided alternatives.
The more of that kind of thing we can move into the parser, the better. I am not exactly sure how to do that, but we'll look into it...
While I do agree that we should move that kind of thing into the parser as far as possible, this problem doesn't stem from jsdom. I'd expect the htmlparser2 tree adapter to produce the same output format as htmlparser2 itself, but apparently it doesn't.
Oh well, thanks @fb55, had more time to look into it just now. Seems like we can drop our custom HTMLDecode function and use htmlparser2/parse5 for that. Thanks!
Update DOCTYPE tokenization per spec
Compare this test script:
which produces this output:
Seems like text nodes already contain decoded data in parse5 and it's used as-is in the tree adapter?
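To illustrate why it matters whether text-node data is already decoded, here is a minimal, self-contained sketch. `decodeBasicEntities` and `escapeText` are hypothetical helpers written for this illustration only; they are not functions from parse5, htmlparser2, or jsdom, and the decoder covers just a handful of named entities. The sketch assumes a consumer that runs its own decoding step on text-node data, and shows that decoding already-decoded data a second time changes the text.

```javascript
// Hypothetical, deliberately tiny entity table (illustration only).
const ENTITIES = { amp: "&", lt: "<", gt: ">", quot: '"', nbsp: "\u00A0" };

// Decode a few named entities in a single left-to-right pass.
function decodeBasicEntities(text) {
  return text.replace(/&(amp|lt|gt|quot|nbsp);/g, (_, name) => ENTITIES[name]);
}

// Escape text-node data for serialization: "&" first, then the
// non-breaking space, "<" and ">".
function escapeText(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/\u00A0/g, "&nbsp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Markup source in which the author wants the literal text "&lt;" displayed.
const source = "&amp;lt;";

// A parser that decodes entities hands out "&lt;" as text-node data.
const decodedOnce = decodeBasicEntities(source);

// A consumer that decodes again turns the text into "<".
const decodedTwice = decodeBasicEntities(decodedOnce);

console.log(decodedOnce);             // &lt;
console.log(decodedTwice);            // <
console.log(escapeText(decodedOnce)); // &amp;lt;
```

If the adapter hands out decoded data, a consumer expecting encoded text would either skip its own decoding step or re-escape the data first, as `escapeText` does here.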