HtmlUnit - NekoHtml Parser 5.0.0

@rbri rbri released this 23 May 14:48
· 2 commits to master since this release
548d1a0
  • java17 & modules
  • improved entity error handling
  • introduce hgroup support
  • fix null char handling in script content
  • fix translation of 0x98 to \u02DC
  • fix surrogate-character-reference parse error handling
  • fix consecutive ampersands before named entity parsing
  • include tests from https://github.com/html5lib/html5lib-tests.git
  • more fuzzer tests and fixes
  • HTMLScanner.scanName: ASCII fast-path for the per-char inner loop
  • javadoc fixes
  • always handle empty attributes the same way
  • more robust collection handling, missing forcedEndElement_ reset fixed
  • add missing methods to EmptyXMLAttributesImpl to make it more robust
  • introduce and use immutable EmptyXMLAttributesImpl
  • reuse the plaintext scanner (like the other scanners) and avoid toLowerCase if not needed
  • fix: use the correct property (http://cyberneko.org/html/properties/names/attrs) in case of the error handling for strange attributes
  • also adjust the column number when skipping
  • modernize by using Arrays.mismatch()
  • reuse the HTMLUnicodeEntitiesParser (like we do for many other things) and simplify the code a bit
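
For readers unfamiliar with `Arrays.mismatch()` (part of `java.util.Arrays` since Java 9), here is a minimal sketch of the kind of hand-written comparison loop it can replace. The class `MismatchSketch` and the helper `commonPrefixLength` are hypothetical names made up for this illustration; this is not the parser's actual code.

```java
import java.util.Arrays;

// Hypothetical illustration only; not taken from the NekoHtml sources.
class MismatchSketch {

    // Returns the number of leading chars that both arrays share.
    static int commonPrefixLength(final char[] a, final char[] b) {
        // Arrays.mismatch returns the index of the first difference,
        // the length of the shorter array if one is a proper prefix
        // of the other, or -1 if both arrays are equal.
        final int idx = Arrays.mismatch(a, b);
        return idx < 0 ? a.length : idx;
    }

    public static void main(final String[] args) {
        final char[] first = "script".toCharArray();
        final char[] second = "scream".toCharArray();
        System.out.println(commonPrefixLength(first, second)); // prints 3
    }
}
```

A single call replaces the explicit index loop and lets the JDK use its own optimized comparison.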

see HtmlUnit release notes for more details

Full Changelog: 4.21.0...5.0.0