HtmlUnit - NekoHtml Parser 5.0.0
- Java 17 & modules
- improved entity error handling
- introduce hgroup support
- fix null char handling in script content
- fix translation of 0x98 to \u02DC
- fix surrogate-character-reference parse error handling
- fix consecutive ampersands before named entity parsing
- include tests from https://github.com/html5lib/html5lib-tests.git
- more fuzzer tests and fixes
- HTMLScanner.scanName: ASCII fast-path for the per-char inner loop
- javadoc fixes
- always handle empty attributes the same way
- more robust collection handling, missing forcedEndElement_ reset fixed
- add missing methods to EmptyXMLAttributesImpl to make it more robust
- introduce and use immutable EmptyXMLAttributesImpl
- reuse the plaintext scanner (like the other scanners) and avoid toLowerCase if not needed
- fix: use the correct property (http://cyberneko.org/html/properties/names/attrs) in case of the error handling for strange attributes
- also adjust the column number when skipping
- modernize by using Arrays.mismatch()
- reuse the HTMLUnicodeEntitiesParser (like we do for many other things) and simplify the code a bit
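
For readers unfamiliar with Arrays.mismatch() (available since Java 9), here is a minimal sketch of how it can replace a hand-written comparison loop over a character buffer. The class MismatchSketch and the helper regionMatches are hypothetical names chosen for illustration; they are not taken from the NekoHtml sources.

```java
import java.util.Arrays;

public class MismatchSketch {

    // Hypothetical helper (not part of NekoHtml): returns true if the
    // characters of 'buffer' starting at 'offset' are exactly 'expected'.
    static boolean regionMatches(char[] buffer, int offset, char[] expected) {
        // Guard against ranges that would run past the end of the buffer.
        if (offset < 0 || buffer.length - offset < expected.length) {
            return false;
        }
        // Arrays.mismatch returns -1 when both ranges are identical,
        // otherwise the relative index of the first differing element.
        return Arrays.mismatch(buffer, offset, offset + expected.length,
                               expected, 0, expected.length) == -1;
    }

    public static void main(String[] args) {
        char[] buffer = "<!DOCTYPE html>".toCharArray();
        System.out.println(regionMatches(buffer, 2, "DOCTYPE".toCharArray())); // true
        System.out.println(regionMatches(buffer, 2, "doctype".toCharArray())); // false
    }
}
```

The range-based overload avoids copying the buffer slice and lets the JDK use its optimized comparison routines instead of a manual per-character loop.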
see HtmlUnit release notes for more details
Full Changelog: 4.21.0...5.0.0
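
As background for the 0x98 fix: the HTML Standard remaps numeric character references in the 0x80 to 0x9F range to the characters Windows-1252 places there, and 0x98 becomes U+02DC (SMALL TILDE). The sketch below assumes the fix concerns this remapping, which the notes above do not state explicitly. The class C1RemapSketch and its method remap are hypothetical, and the table is only a three-entry excerpt.

```java
import java.util.Map;

public class C1RemapSketch {

    // Excerpt of the numeric character reference remapping table from the
    // HTML Standard; the full table has more entries.
    private static final Map<Integer, Integer> REMAP = Map.of(
            0x80, 0x20AC,  // EURO SIGN
            0x98, 0x02DC,  // SMALL TILDE
            0x99, 0x2122); // TRADE MARK SIGN

    // Hypothetical helper: returns the remapped code point, or the
    // input unchanged if it has no entry in the excerpt above.
    static int remap(int codePoint) {
        return REMAP.getOrDefault(codePoint, codePoint);
    }

    public static void main(String[] args) {
        // "&#x98;" should resolve to U+02DC under this remapping.
        System.out.printf("U+%04X%n", remap(0x98)); // U+02DC
    }
}
```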