Xml prefixed attributes do not appropriately find namespace #20

Open
jbrayfaithlife opened this issue Apr 14, 2023 · 1 comment
Labels
bug Something isn't working

Comments

@jbrayfaithlife
Contributor

Bug Report

During XML parsing, attributes with the xml prefix ought to be associated with the XML namespace, even if that namespace is not explicitly declared. According to https://www.w3.org/TR/REC-xml-names/#ns-decl:

The prefix xml is by definition bound to the namespace name http://www.w3.org/XML/1998/namespace. It MAY, but need not, be declared, and MUST NOT be bound to any other namespace name. Other prefixes MUST NOT be bound to this namespace name, and it MUST NOT be declared as the default namespace.
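
As a point of comparison, the rule quoted above can be observed in .NET's built-in System.Xml.Linq parser, which binds the xml prefix without any declaration. The following is only a minimal sketch for cross-checking; the class and method names (XmlPrefixCheck, FirstAttributeNamespace) are hypothetical and not part of AngleSharp or this issue.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for illustration; not part of AngleSharp.
public static class XmlPrefixCheck
{
    // Parses the markup with System.Xml.Linq and returns the namespace URI
    // of the first attribute on the root element.
    public static string FirstAttributeNamespace(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        XAttribute attr = doc.Root.Attributes().First();
        return attr.Name.NamespaceName;
    }
}
```

Calling XmlPrefixCheck.FirstAttributeNamespace("<xml xml:lang=\"en\">Test</xml>") yields the same value as XNamespace.Xml.NamespaceName, that is, http://www.w3.org/XML/1998/namespace, even though the input never declares the prefix.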

Prerequisites

  • [/] Can you reproduce the problem in a MWE?
  • [/] Are you running the latest version of AngleSharp?
  • [/] Did you check the FAQs to see if that helps you?
  • [/] Are you reporting to the correct repository? (there are multiple AngleSharp libraries, e.g., AngleSharp.Css for CSS support)
  • [/] Did you perform a search in the issues?

For more information, see the CONTRIBUTING guide.

Description

Attributes with the xml prefix ought to be associated with the XML namespace even if it has not been explicitly declared. Currently the prefix is dropped when the document is serialized.

Steps to Reproduce

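// LINQPad script: Dump() is LINQPad's output helper.
var xmlParser = new XmlParser();
// The xml prefix is used without an explicit xmlns:xml declaration.
var doc = xmlParser.ParseDocument("<xml xml:lang=\"en\">Test</xml>");
using (var stringWriter = new StringWriter())
{
	// Serialize the parsed document back to markup.
	doc.ToHtml(stringWriter, new XhtmlMarkupFormatter());
	stringWriter.ToString().Dump();
}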

Expected behavior: Output should be <xml xml:lang="en">Test</xml>

Actual behavior: Output is <xml lang="en">Test</xml>

Compare with the output of the following LINQPad script, which declares the xml namespace explicitly:

var xmlParser = new XmlParser();
var doc = xmlParser.ParseDocument("<xml xmlns:xml=\"http://www.w3.org/XML/1998/namespace\" xml:lang=\"en\">Test</xml>");
using (var stringWriter = new StringWriter())
{
	doc.ToHtml(stringWriter, new XhtmlMarkupFormatter());
	stringWriter.ToString().Dump();
}

Output: <xml xmlns:="http://www.w3.org/XML/1998/namespace" xml:lang="en">Test</xml>

Environment details: Win 10; .NET 6.0.15

Possible Solution

In the XmlDomBuilder, the existing namespace lookup for prefixed attributes needs to be replaced with this:

if (prefix.Is(NamespaceNames.XmlPrefix))
{
    ns = NamespaceNames.XmlUri;
}
else if (!prefix.Is(NamespaceNames.XmlNsPrefix))
{
    ns = CurrentNode.LookupNamespaceUri(prefix);
}

I can open a PR to this effect, including the test with which I confirmed both the bug and the fix.

@FlorianRappl
Contributor

Sounds fine to me!

FlorianRappl added a commit that referenced this issue Aug 24, 2023
Auto-identify the XML namespace Issue 20