
Non-strict HTML entities #821

Closed
Mickael-van-der-Beek opened this issue Jul 7, 2014 · 1 comment

@Mickael-van-der-Beek

If you have an HTML entity in the text-content like this:

<a>&#x61;</a>

then:

window.document.getElementsByTagName('a')[0].textContent === "a"

will evaluate as true because &#x61; is the HTML entity for the character a.

Now, if you use a non-strict HTML entity (by "non-strict" I mean an entity that is not terminated by a semicolon ;), it won't get decoded to its character counterpart. For example:

&#x61 should be decoded to the character a too

Most modern browsers don't require semicolon termination for HTML entities, so I wanted to ask whether this is a bug or intended behaviour.
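To make the expected behaviour concrete, here is a minimal sketch in plain JavaScript of lenient decoding for numeric character references. It is not the parser's actual implementation; `decodeNumericEntities` is a hypothetical helper written only for illustration, it ignores named entities such as `&amp;`, and it skips the special cases a real HTML parser handles (for example, replacing code point 0).

```javascript
// Hypothetical helper (not part of any library): decodes numeric
// character references whether or not they end with a semicolon.
function decodeNumericEntities(text) {
  // Matches "&#x" + hex digits or "&#" + decimal digits,
  // followed by an optional ";".
  const pattern = /&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));?/g;
  return text.replace(pattern, (match, hex, dec) => {
    const codePoint = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);
    // Leave values beyond the Unicode range untouched instead of throwing.
    if (codePoint > 0x10ffff) return match;
    return String.fromCodePoint(codePoint);
  });
}

console.log(decodeNumericEntities('<a>&#x61;</a>')); // strict form
console.log(decodeNumericEntities('<a>&#x61</a>'));  // non-strict form
```

Under this sketch both inputs produce the same string, which is the result the `textContent` comparison above would need in order to hold for the non-strict form.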

@Sebmaster
Member

This should be fixed with this PR with the switch to parse5. Try it with that.
