Better xml parsing #157

Merged
CorentinB merged 4 commits into internetarchive:main from saveweb:better-xml-parse on Nov 12, 2024
@yzqzss (Collaborator) commented Nov 9, 2024

This PR is based on #155


  • Introduce a lax parsing mode so that URLs can still be extracted from unbalanced or malformed XML.
  • Return the URLs parsed so far, instead of nil, when an error occurs.
  • Replace the TestXMLBodyReadError test with TestXMLBodySyntaxEOFError.
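
For readers unfamiliar with the idea, here is a minimal sketch of what lax parsing with partial results can look like using Go's standard `encoding/xml` decoder. It is not the code from this PR: `extractURLs` is a hypothetical helper, the sample document is made up, and treating any character data that starts with `http://` or `https://` as a URL is an assumption made only to keep the example short.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// extractURLs is a hypothetical helper (not the function changed in this PR).
// It walks the token stream and collects character data that looks like a URL.
// On a parse error it returns whatever was collected so far together with the
// error, rather than discarding the partial result.
func extractURLs(body string, strict bool) ([]string, error) {
	dec := xml.NewDecoder(strings.NewReader(body))
	// Strict=false makes the decoder tolerate mismatched end tags and
	// unknown or unterminated character entities such as a bare '&'.
	dec.Strict = strict

	var urls []string
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return urls, nil // clean end of input
		}
		if err != nil {
			return urls, err // keep the URLs parsed before the error
		}
		if cd, ok := tok.(xml.CharData); ok {
			text := strings.TrimSpace(string(cd))
			if strings.HasPrefix(text, "http://") || strings.HasPrefix(text, "https://") {
				urls = append(urls, text)
			}
		}
	}
}

func main() {
	// Made-up input: the first <url> is never closed and the second URL
	// contains an unescaped '&'.
	malformed := `<urlset><url><loc>https://example.com/a</loc>` +
		`<url><loc>https://example.com/?x=1&y=2</loc></url></urlset>`

	for _, strict := range []bool{true, false} {
		urls, err := extractURLs(malformed, strict)
		fmt.Printf("strict=%v urls=%q err=%v\n", strict, urls, err)
	}
}
```

In strict mode the decoder stops at the unescaped `&`, yet the first URL is still returned alongside the error. In lax mode both URLs are recovered. Input that is cut off while elements are still open produces an unexpected-EOF syntax error in either mode, which is why returning the partial slice is useful even when lax parsing is enabled.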

@yzqzss force-pushed the better-xml-parse branch 2 times, most recently from 163e2f0 to 0482f25, November 9, 2024 18:52
@yzqzss marked this pull request as draft November 9, 2024 18:54
@yzqzss marked this pull request as ready for review November 12, 2024 03:21
@CorentinB (Collaborator) left a comment

LGTM

@CorentinB merged commit 5719f94 into internetarchive:main Nov 12, 2024
@yzqzss deleted the better-xml-parse branch March 11, 2025 07:45

2 participants