The semicolon is optional only for HTML 3.2 entities. It is required for entities added afterward.
The regex is also broken. The decoder fails on input such as "&something".
Html4Entities doesn't decode "&quot&gt&lt", but Html5Entities does. We should accept those special cases in quirk mode.
The decoder doesn't accept numeric character references starting with "&#X", even though both "&#X" and "&#x" are valid in the HTML 4 and HTML 5 specifications.
I am doing a complete rewrite of the decoder. The new decoder will use an incremental parser instead of a regex, and it will have a quirk mode and a strict mode.
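To illustrate the "&#X" point, here is a minimal sketch of numeric reference decoding that accepts both prefixes. `decodeNumericReference` is a hypothetical helper, not part of the library, and it uses a small regex only for brevity; the rewrite described above would do this inside the incremental parser.

```javascript
// Hypothetical helper (an assumption for illustration, not the library's API).
// `body` is the text between "&" and ";", for example "#65", "#x41" or "#X41".
function decodeNumericReference(body) {
  // Group 1 captures hex digits after "#x" or "#X"; group 2 captures decimal digits.
  const match = /^#(?:[xX]([0-9a-fA-F]+)|([0-9]+))$/.exec(body);
  if (!match) return null;
  const codePoint = match[1] !== undefined
    ? parseInt(match[1], 16)
    : parseInt(match[2], 10);
  // Values above U+10FFFF are not valid Unicode code points.
  if (codePoint > 0x10FFFF) return null;
  return String.fromCodePoint(codePoint);
}

console.log(decodeNumericReference('#x41')); // prints "A"
console.log(decodeNumericReference('#X41')); // prints "A"
console.log(decodeNumericReference('#65'));  // prints "A"
```

The sketch returns null for anything malformed, so the caller can decide whether to leave the original text untouched.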
For non-strict decoding, the new HTML5 entity decoder is the slowest of all. This is because once the semicolon is optional, the decoder has to try the longest named entities first, and there are a lot of entities to cover.
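The cost of longest-match-first can be seen in a rough sketch like the one below. The table `LEGACY_ENTITIES` and the function `matchLongestEntity` are made up for illustration, and the sketch assumes the semicolon is optional for every name, which is not how the actual decoder necessarily works.

```javascript
// Tiny illustrative table; a real HTML5 table has far more names.
const LEGACY_ENTITIES = {
  lt: '<',
  gt: '>',
  amp: '&',
  not: '\u00AC',   // NOT SIGN
  notin: '\u2209', // NOT AN ELEMENT OF
};
const MAX_NAME_LENGTH = Math.max(
  ...Object.keys(LEGACY_ENTITIES).map((name) => name.length)
);

// `text` is whatever follows "&". Tries the longest candidate name first and
// shrinks one character at a time, so "notin" wins over "not".
function matchLongestEntity(text) {
  const limit = Math.min(MAX_NAME_LENGTH, text.length);
  for (let len = limit; len >= 1; len--) {
    const name = text.slice(0, len);
    if (Object.prototype.hasOwnProperty.call(LEGACY_ENTITIES, name)) {
      return { char: LEGACY_ENTITIES[name], length: len };
    }
  }
  return null;
}
```

Every "&" in the input can trigger up to `MAX_NAME_LENGTH` lookups here, whereas strict mode only needs one lookup for the text up to the next ";".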
For strict decoding, the new decoder is lightning fast :-)
The code is incomplete (it doesn't do numeric decoding yet).
The code should be changed to do String.split('&') first. That will speed things up considerably.
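A minimal sketch of the split-first idea for strict mode, assuming a small made-up table `STRICT_ENTITIES` and a hypothetical function `decodeStrict` (neither is the library's actual code):

```javascript
// Small illustrative table, not the full entity list.
const STRICT_ENTITIES = { lt: '<', gt: '>', amp: '&', quot: '"' };

function decodeStrict(input) {
  // Everything before the first "&" is plain text; each later part starts
  // right after an "&" and may begin with "name;".
  const parts = input.split('&');
  let out = parts[0];
  for (let i = 1; i < parts.length; i++) {
    const part = parts[i];
    const end = part.indexOf(';');
    const name = end > 0 ? part.slice(0, end) : '';
    if (name && Object.prototype.hasOwnProperty.call(STRICT_ENTITIES, name)) {
      out += STRICT_ENTITIES[name] + part.slice(end + 1);
    } else {
      // Not a known entity: put the "&" back and keep the text as-is.
      out += '&' + part;
    }
  }
  return out;
}

console.log(decodeStrict('a &lt; b &amp;&amp; c &something'));
// prints "a < b && c &something"
```

Splitting once up front means the decoder only inspects the positions where an entity can start, instead of scanning every character.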
The decoder has multiple issues that must be addressed