Step 1c: HTML tokenizer — html5lib conformance suite #56

@thomasnemer

Parent: #20

Goal

Vendor the html5lib tokenizer test suite and write a conformance harness. Fix any failing edge cases in the tokenizer to reach a pass rate above 98%.

Prerequisites

File Changes

  • crates/ie-html/tests/html5lib-tokenizer/*.json — vendored test fixtures
  • crates/ie-html/tests/tokenizer_conformance.rs — conformance test harness

Implementation

  • Vendor html5lib-tests tokenizer JSON files
  • Conformance harness: parse each test, run the tokenizer on its input, and compare the emitted tokens with the expected output
  • Coalesce adjacent Character tokens before comparison
  • Report pass/fail with descriptive names
  • Fix any failing edge cases in tokenizer
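
The coalescing step above could be sketched roughly as follows. The `Token` enum and `coalesce_characters` are hypothetical names used only for illustration; the real token type in `ie-html` may look different. The point is that a tokenizer may emit text one character at a time while the fixtures list it as a single Character token, so both sides should be normalized before comparing.

```rust
/// Hypothetical token type for illustration; the tokenizer's real type may differ.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Token {
    Character(String),
    StartTag { name: String },
    EndTag { name: String },
    Comment(String),
}

/// Merge runs of adjacent Character tokens into one token,
/// leaving every other token untouched and in order.
pub fn coalesce_characters(tokens: Vec<Token>) -> Vec<Token> {
    let mut out: Vec<Token> = Vec::with_capacity(tokens.len());
    for token in tokens {
        if let Token::Character(next) = &token {
            // Extend the previous Character token instead of pushing a new one.
            if let Some(Token::Character(prev)) = out.last_mut() {
                prev.push_str(next);
                continue;
            }
        }
        out.push(token);
    }
    out
}
```

With this in place, the harness would compare `coalesce_characters(actual)` against `coalesce_characters(expected)` rather than the raw token streams.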

Acceptance Criteria

  • >98% html5lib tokenizer conformance pass rate
  • All unit tests still pass
  • Clippy clean
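
One possible way to enforce the pass-rate criterion inside the harness is a small tally checked once all fixtures have run. `Tally` and its methods are hypothetical names, and the strict comparison follows the ">98%" wording in the Goal section; this is a sketch, not the project's actual harness.

```rust
/// Hypothetical running count of conformance results.
#[derive(Debug, Default, Clone, Copy)]
pub struct Tally {
    pub passed: usize,
    pub failed: usize,
}

impl Tally {
    /// Record the outcome of a single test case.
    pub fn record(&mut self, ok: bool) {
        if ok {
            self.passed += 1;
        } else {
            self.failed += 1;
        }
    }

    /// True when the pass rate is strictly above `percent`.
    /// Integer arithmetic avoids floating-point rounding at the boundary.
    pub fn exceeds_percent(&self, percent: usize) -> bool {
        let total = self.passed + self.failed;
        total > 0 && self.passed * 100 > total * percent
    }
}
```

The harness could finish with `assert!(tally.exceeds_percent(98))`, so a regression below the threshold fails the test run.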
