.html() is decoding entities when decodeEntities: false #1279

Closed
anagai opened this issue Feb 14, 2019 · 2 comments

@anagai
anagai commented Feb 14, 2019

I am using the 1.0.0-rc.2 release. My HTML contains an entity such as &gt;, and I do not want it decoded when I call .html(). I loaded the HTML this way:

let obj = cheerio.load('&gt;', {decodeEntities: false})

console.log(obj.html())

The output is decoded to >, which is not correct.

This does not occur with the 0.22.0 release. Could you please mark 0.22.0 as the latest stable release?

@AlynxZhou

This bug comes from the different behavior of parse5 and htmlparser2. If you set decodeEntities to false, dom-serializer won't encode entities and htmlparser2 won't decode them either. But when cheerio switches to parse5, entities are always decoded, which leads to this problem. If you let dom-serializer encode entities, other characters such as CJK are encoded too, which leads to other problems. It is also really confusing that cheerio sometimes chooses parse5 and sometimes chooses htmlparser2.
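
The mismatch described above can be illustrated with a small self-contained model. This is only a sketch: it does not call cheerio, parse5, htmlparser2, or dom-serializer, and every function name in it (decodeEntities, encodeEntities, roundTrip) is hypothetical. It assumes the serializer encodes non-ASCII characters as hexadecimal numeric references, which may not match what a given dom-serializer version emits.

```javascript
// Simplified model of parse-then-serialize. NOT the real library code;
// all names here are hypothetical and exist only for illustration.

const ENTITY_TO_CHAR = { "&gt;": ">", "&lt;": "<", "&amp;": "&" };
const CHAR_TO_ENTITY = { ">": "&gt;", "<": "&lt;", "&": "&amp;" };

// Parse-time step: turn a few named entities into literal characters.
function decodeEntities(text) {
  return text.replace(/&(?:gt|lt|amp);/g, (match) => ENTITY_TO_CHAR[match]);
}

// Serialize-time step: escape markup characters and every non-ASCII
// code point (assumption: hex numeric references for the latter).
function encodeEntities(text) {
  return text.replace(/[<>&]|[^\x00-\x7F]/gu, (ch) => {
    if (ch in CHAR_TO_ENTITY) return CHAR_TO_ENTITY[ch];
    return `&#x${ch.codePointAt(0).toString(16).toUpperCase()};`;
  });
}

// Model one load(...).html() round trip with the two steps toggled.
function roundTrip(input, { parserDecodes, serializerEncodes }) {
  const stored = parserDecodes ? decodeEntities(input) : input;
  return serializerEncodes ? encodeEntities(stored) : stored;
}

const input = "&gt; 中文";

// htmlparser2-like path with decodeEntities: false: nothing is touched.
const htmlparser2Like = roundTrip(input, { parserDecodes: false, serializerEncodes: false });

// parse5-like path: the parser always decodes, the serializer does not re-encode.
const parse5Like = roundTrip(input, { parserDecodes: true, serializerEncodes: false });

// Letting the serializer encode restores &gt; but also rewrites the CJK text.
const encodeAll = roundTrip(input, { parserDecodes: true, serializerEncodes: true });

console.log(htmlparser2Like); // &gt; 中文
console.log(parse5Like);      // > 中文
console.log(encodeAll);       // &gt; &#x4E2D;&#x6587;
```

In this model, only the first path returns the input unchanged. The second reproduces the reported output, and the third shows why encoding everything on serialization is not a clean fix.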

This is a long-standing problem, and I am disappointed that no cheerio maintainer has made a decision to solve it. It's really annoying. We should adopt the correct behavior instead of staying compatible with the old, incorrect behavior. I have submitted a solution here, but no one has replied to my PR. In short, encoding all entities is not a good choice: cheerio is used to parse a DOM, not to act as an encoder.

@fb55
Member

fb55 commented Jul 11, 2019

Thanks @AlynxZhou!

fb55 closed this as completed Jul 11, 2019