refactor: tighten Rust idioms (saturating casts, parse_tag split, or-named helper) #11

Merged: zrosenbauer merged 1 commit into main from fix/rust-idioms on May 15, 2026

@zrosenbauer
Member

Summary

  • Replaces the saturating `unwrap_or(usize::MAX)` / `unwrap_or(u32::MAX)` pattern (17 sites) with `SourcePosition::offset_usize()` or `.expect()`. Input length is checked against `MAX_INPUT_BYTES` at parse entry, so the conversions are infallible; silently saturating was hiding bugs.
  • Splits the 145-line `parse_tag` (previously `#[allow(clippy::too_many_lines)]`) into `parse_end_tag` + `parse_opening_tag` + `parse_attribute_list`, with the attribute-list parser returning a named `AttributeList` struct. Drops the allow.
  • Splits the or-named `skip_comment_or_cdata` into `try_skip_comment` + `try_skip_cdata` over a shared `scan_to_terminator`.
  • `children.clone().next()` peek → `as_slice().first()`; qualified `std::collections::HashSet<…>` paths in `mutate.rs` → single `use` import.
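
For readers without the diff open, the first bullet can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the `u32` field layout of `SourcePosition`, the `from_usize` constructor, the `check_input_len` gate, and the 64 MiB value of `MAX_INPUT_BYTES` are all hypothetical. Only the names `SourcePosition::offset_usize()` and `MAX_INPUT_BYTES` come from the description above.

```rust
/// Assumed cap for illustration; the PR names `MAX_INPUT_BYTES` but not its value.
const MAX_INPUT_BYTES: usize = 64 * 1024 * 1024;

/// Stand-in for the crate's `SourcePosition`; the single `u32` field is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SourcePosition {
    offset: u32,
}

impl SourcePosition {
    /// Hypothetical constructor. Panics loudly instead of saturating to
    /// `u32::MAX` if an oversized offset ever gets past the entry gate.
    pub fn from_usize(offset: usize) -> Self {
        let offset = u32::try_from(offset)
            .expect("offset exceeds u32::MAX; input should have been rejected at parse entry");
        Self { offset }
    }

    /// Widens the stored offset back to `usize` in one place, so call sites
    /// no longer repeat `usize::try_from(..).unwrap_or(usize::MAX)`.
    pub fn offset_usize(self) -> usize {
        usize::try_from(self.offset).expect("u32 offset must fit in usize on supported targets")
    }
}

/// Hypothetical entry-point gate: rejecting oversized input once is what
/// makes the conversions above infallible for every later offset.
pub fn check_input_len(input: &str) -> Result<(), String> {
    if input.len() > MAX_INPUT_BYTES {
        return Err(format!(
            "input is {} bytes; limit is {}",
            input.len(),
            MAX_INPUT_BYTES
        ));
    }
    Ok(())
}
```

The design point is that a violated invariant now panics with a message at the conversion site, where the old `unwrap_or(usize::MAX)` form would have carried on with a bogus offset.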

Test plan

  • cargo fmt --all -- --check
  • cargo clippy --all-targets --all-features -- -D warnings
  • cargo test --all-features --workspace (198 tests pass)
  • bindings/node: pnpm build:debug && pnpm test (23 pass)
  • CI green across ubuntu/macos/windows

Replaces TS-brain reflexes with the conventional Rust shape:

- Saturating conversions (~17 sites) → `SourcePosition::offset_usize()`
  helper or `.expect()`. `MAX_INPUT_BYTES` is checked at parse entry so
  every u32↔usize round-trip is infallible; saturating silently to
  `usize::MAX` was hiding bugs instead of letting them surface.
- `parse_tag` (145-line monolith with `#[allow(clippy::too_many_lines)]`) →
  thin dispatcher over `parse_end_tag` and `parse_opening_tag`, with
  attribute-list parsing extracted into its own function returning a
  named `AttributeList` struct. Drops the allow and the comment churn.
- `skip_comment_or_cdata` → `try_skip_comment` + `try_skip_cdata`,
  sharing a `scan_to_terminator` primitive. Eliminates the
  multi-purpose `or`-named helper.
- Tokenizer attribute-dedup logic is now two single-purpose helpers
  (`seen_attribute` + `record_seen_attr`) instead of one fused block.
- `children.clone().next()` peek (TextSegments) →
  `as_slice().first()`. Same Big-O, reads as "peek" instead of
  "clone iterator".
- Qualified `std::collections::HashSet<…>` paths in mutate.rs →
  single `use` import.
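
A rough sketch of the `skip_comment_or_cdata` split described above, for illustration only. The function names come from this PR, but the signatures, the `&str`-plus-byte-offset representation, and the choice to return `None` for an unterminated construct are assumptions; the real tokenizer may report an error there instead.

```rust
/// Shared primitive: returns the byte index just past the first occurrence
/// of `terminator` at or after `from`, or `None` if it never appears.
fn scan_to_terminator(input: &str, from: usize, terminator: &str) -> Option<usize> {
    input
        .get(from..)?
        .find(terminator)
        .map(|i| from + i + terminator.len())
}

/// Skips `<!-- ... -->` starting at `pos`. Returns `None` when `pos` does not
/// begin a comment or (in this sketch) when the comment is unterminated.
fn try_skip_comment(input: &str, pos: usize) -> Option<usize> {
    const OPEN: &str = "<!--";
    if !input.get(pos..)?.starts_with(OPEN) {
        return None;
    }
    scan_to_terminator(input, pos + OPEN.len(), "-->")
}

/// Skips `<![CDATA[ ... ]]>` starting at `pos`, with the same conventions
/// as `try_skip_comment`.
fn try_skip_cdata(input: &str, pos: usize) -> Option<usize> {
    const OPEN: &str = "<![CDATA[";
    if !input.get(pos..)?.starts_with(OPEN) {
        return None;
    }
    scan_to_terminator(input, pos + OPEN.len(), "]]>")
}
```

A caller can then chain the two, for example `try_skip_comment(src, pos).or_else(|| try_skip_cdata(src, pos))`, which keeps each helper single-purpose while preserving the old combined behavior at the call site.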
@zrosenbauer merged commit bab1c34 into main on May 15, 2026
9 checks passed
@zrosenbauer deleted the fix/rust-idioms branch on May 15, 2026 21:37