fix: address Codex review findings (parser/mutator/serializer hardening) #10

Merged
zrosenbauer merged 2 commits into main from fix/codex-review-findings on May 15, 2026

@zrosenbauer
Member

Summary

  • Parser hardened against adversarial input: MAX_DEPTH=1024 and MAX_INPUT_BYTES=u32::MAX bounds; duplicate-id detection scoped to siblings-under-same-parent rather than global.
  • Tokenizer now emits a TokenStream carrying trivia so round-trips preserve whitespace/comments.
  • New escape module consolidates escape_attr / escape_text / is_valid_name, eliminating double-escape paths.
  • Mutate, serialize (pretty-print), schema/validate (anchored regex), and selector parsing tightened per review.
  • Adds proptest regression fixtures; ignores local .claude/, .agents/, skills-lock.json.
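
To make the first bullet concrete, here is a minimal sketch of a depth bound combined with sibling-scoped duplicate-id detection. It is not the crate's implementation: `Node`, `CheckError`, `check`, and the `node` helper are hypothetical names invented for illustration, and the sketch walks an already-built tree, whereas the real parser presumably enforces the limits while parsing. Only the `MAX_DEPTH` value of 1024 comes from the summary above; treating the root as depth 0 is an assumption.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a parsed element; not the crate's real node type.
pub struct Node {
    pub id: Option<String>,
    pub children: Vec<Node>,
}

/// Depth bound quoted in the PR summary.
pub const MAX_DEPTH: usize = 1024;

#[derive(Debug, PartialEq)]
pub enum CheckError {
    TooDeep,
    DuplicateSiblingId(String),
}

/// Rejects nesting deeper than `max_depth` and ids repeated among the
/// children of a single parent. The same id under two different parents
/// is accepted, which is the "sibling-scoped" behavior.
pub fn check(node: &Node, depth: usize, max_depth: usize) -> Result<(), CheckError> {
    if depth > max_depth {
        return Err(CheckError::TooDeep);
    }
    // A fresh set per parent: ids are only compared against siblings.
    let mut seen: HashSet<&str> = HashSet::new();
    for child in &node.children {
        if let Some(id) = child.id.as_deref() {
            if !seen.insert(id) {
                return Err(CheckError::DuplicateSiblingId(id.to_string()));
            }
        }
        check(child, depth + 1, max_depth)?;
    }
    Ok(())
}

/// Hypothetical convenience constructor used to build small trees.
pub fn node(id: Option<&str>, children: Vec<Node>) -> Node {
    Node {
        id: id.map(str::to_string),
        children,
    }
}
```

Calling `check(&root, 0, MAX_DEPTH)` on a tree where two different parents each have a child with id `x` returns `Ok(())`, while two children of the same parent sharing an id return `DuplicateSiblingId`. A global check would instead keep one set for the whole document.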

Test plan

  • cargo fmt --all -- --check
  • cargo clippy --all-targets --all-features -- -D warnings
  • cargo test --all-features --workspace (all 198 tests pass locally)
  • CI green on ubuntu/macos/windows
  • Node binding build + vitest green

zrosenbauer and others added 2 commits May 15, 2026 14:57
…izer

Hardens the marxml crate against adversarial inputs and tightens
semantics flagged in review:

- parse: enforce MAX_DEPTH (1024) and MAX_INPUT_BYTES (u32::MAX); scope
  duplicate-id detection to sibling-under-same-parent (was global)
- tokenizer: emit TokenStream with trivia preserved for round-trips
- escape: extract escape_attr/escape_text/is_valid_name into dedicated
  module; consistent entity handling without double-escaping
- mutate: refactored update / replace_content / replace_in paths;
  additional safety around invalid attr names
- serialize: pretty-print fixes for mixed content, self-close handling,
  and multi-root separation
- schema/validate: regex constraints fully anchored; clearer error
  messages
- selector: parser/matcher tightening for edge cases

Also adds proptest regression file and ignores local agent/skills
tooling directories.
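
As an illustration of the escape consolidation described in this commit message, the sketch below shows one way such a module could look. The function names `escape_attr`, `escape_text`, and `is_valid_name` come from the commit message, but the signatures, the exact set of escaped characters, and the ASCII-only name rule are assumptions made for this example, not the crate's actual behavior.

```rust
/// Escapes character data for element content.
/// Assumed signature; the real function may differ.
pub fn escape_text(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    for ch in raw.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            _ => out.push(ch),
        }
    }
    out
}

/// Escapes a value for use inside a double-quoted attribute.
/// Assumed signature; the real function may differ.
pub fn escape_attr(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    for ch in raw.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            _ => out.push(ch),
        }
    }
    out
}

/// Simplified name check (assumption): ASCII letter or underscore first,
/// then ASCII letters, digits, '_', '-', or '.'. Real XML names permit
/// more characters than this.
pub fn is_valid_name(name: &str) -> bool {
    let mut chars = name.chars();
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() || c == '_' => {}
        _ => return false,
    }
    chars.all(|c| c.is_ascii_alphanumeric() || matches!(c, '_' | '-' | '.'))
}
```

With these definitions, `escape_text("a < b & c")` yields `a &lt; b &amp; c`. Applying the function twice turns `&` into `&amp;amp;`, which is the kind of double-escape a single shared module helps avoid: raw strings are kept unescaped internally and escaped exactly once at serialization time.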

Co-Authored-By: Claude <noreply@anthropic.com>
…quired semantics

The compliant-doc fixture relied on a child element satisfying
contentRequired. After tightening that rule to mean "direct text
content" (see structural_only_body_fails_content_required in the
Rust suite), the fixture needs a text body too.
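
The tightened rule can be sketched as follows. `Element`, `Child`, and `satisfies_content_required` are hypothetical names for illustration, and ignoring whitespace-only text is an assumption; the source only states that contentRequired now means "direct text content".

```rust
/// Hypothetical content model: an element holds text and child elements.
pub enum Child {
    Text(String),
    Element(Element),
}

pub struct Element {
    pub name: String,
    pub children: Vec<Child>,
}

/// Returns true only when the element itself has a text child.
/// Text nested inside child elements does not count, and whitespace-only
/// text is ignored (an assumption made for this sketch).
pub fn satisfies_content_required(el: &Element) -> bool {
    el.children
        .iter()
        .any(|c| matches!(c, Child::Text(t) if !t.trim().is_empty()))
}
```

Under this reading, an element whose body consists only of child elements fails the rule even if those children contain text, which is why the fixture needed a text body of its own.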

Co-Authored-By: Claude <noreply@anthropic.com>
@zrosenbauer zrosenbauer merged commit f7f59e8 into main May 15, 2026
9 checks passed
@zrosenbauer zrosenbauer deleted the fix/codex-review-findings branch May 15, 2026 19:02