encoding/xml: Encoder duplicates namespace tags #7535

Open
gopherbot opened this issue Mar 13, 2014 · 12 comments

@gopherbot

commented Mar 13, 2014

by seanerussell:

== What does 'go version' print?

go version go1.2.1 darwin/amd64

== What steps reproduce the problem?

http://play.golang.org/p/3_oUruPYhq

== What happened?

Encoder.EncodeToken duplicates namespace attributes.

== What should have happened instead?

The encoded document should have had a single namespace attribute.

== Please provide any additional information below.

Attribute names on an element must be unique; this is a well-formedness constraint per
the XML 1.0 specification (http://www.w3.org/TR/xml/#uniqattspec). Per the
specification, both validating and non-validating parsers must report well-formedness
violations (http://www.w3.org/TR/xml/#sec-conformance).

Decoding an XML document and encoding the result should be idempotent and produce an
equivalent document.  With this issue, decoding and then encoding the result not only
produces a non-equivalent document, but the document it generates is not
well-formed.

This issue only occurs with namespaces.  Normal attributes are handled correctly.
@ianlancetaylor

Contributor

commented May 9, 2014

Comment 1:

Labels changed: added repo-main, release-none.

@rogpeppe

Contributor

commented Nov 21, 2014

Comment 2:

Here's another example: http://play.golang.org/p/GTjuLNxE-d
This lack of encode/decode idempotency, on top of the output not being well-formed,
makes things awkward when trying to test for expected output.
@gopherbot

Author

commented Dec 2, 2014

Comment 3:

CL https://golang.org/cl/179540043 mentions this issue.

@gopherbot gopherbot added new labels Dec 2, 2014

@bradfitz bradfitz removed the new label Dec 18, 2014

@rogpeppe rogpeppe closed this in 3be158d Feb 13, 2015

@mikioh mikioh removed the release-none label Feb 15, 2015

@mikioh mikioh added this to the Go1.5 milestone Feb 15, 2015

@mikioh mikioh modified the milestones: Go1.6, Go1.5 Jul 28, 2015

@mikioh mikioh reopened this Jul 28, 2015

@mikioh

Contributor

commented Jul 28, 2015

See #11841

@rsc

Contributor

commented Nov 25, 2015

Blocked on #13400.

@rsc rsc modified the milestones: Go1.7, Go1.6 Nov 25, 2015

@pdw-mb


commented Feb 16, 2016

I would expect Token to strip xmlns attributes: if you want them, use RawToken.

The change below does that and appears to fix both of the above examples:

https://code.blinkace.com/go/xml/commit/bded824c18c5a2595e750c920ea5e7437607900c

The code base above also exposes the current set of namespace bindings on Decoder, which is generally more useful than having the xmlns attributes themselves (see #12406).

@rsc rsc modified the milestones: Go1.8, Go1.7 May 18, 2016

@quentinmit quentinmit added the NeedsFix label Oct 7, 2016

@rsc rsc modified the milestones: Go1.9Early, Go1.8 Oct 26, 2016

@alexellis


commented Feb 8, 2017

I wanted to know if there is a workaround for this yet (duplication of namespaces)?

I want to use a decoder/encoder combination with Token() to selectively reconstitute an XML document (.NET csproj format).

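	for {
		token, err := decoder.Token()
		if err != nil {
			break // io.EOF marks the end of the input
		}
		// Write each token back out unchanged.
		if err := encoder.EncodeToken(token); err != nil {
			break
		}
	}
	encoder.Flush() // EncodeToken buffers its output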

Creating and maintaining all the structs needed to unmarshal the document into an object is not a suitable solution.

Input:

<?xml version="1.0" encoding="utf-8"?>
<Project ToolsVersion="4.0" DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
    <Configuration Condition=" '$(Configuration)' == '' ">Debug</Configuration>
    <Platform Condition=" '$(Platform)' == '' ">AnyCPU</Platform>
    <ProductVersion>9.0.30729</ProductVersion>
    <SchemaVersion>2.0</SchemaVersion>
    <ProjectGuid>{153CB7F7-EB7B-44F2-B53E-F157288E3F19}</ProjectGuid>
    <OutputType>Library</OutputType>
    <AppDesignerFolder>Properties</AppDesignerFolder>

Output:

<?xml version="1.0" encoding="utf-8"?>
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003" ToolsVersion="4.0" DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
    <Configuration xmlns="http://schemas.microsoft.com/developer/msbuild/2003" Condition=" &#39;$(Configuration)&#39; == &#39;&#39; ">Debug</Configuration>
    <Platform xmlns="http://schemas.microsoft.com/developer/msbuild/2003" Condition=" &#39;$(Platform)&#39; == &#39;&#39; ">AnyCPU</Platform>
    <ProductVersion xmlns="http://schemas.microsoft.com/developer/msbuild/2003">9.0.30729</ProductVersion>
    <SchemaVersion xmlns="http://schemas.microsoft.com/developer/msbuild/2003">2.0</SchemaVersion>
    <ProjectGuid xmlns="http://schemas.microsoft.com/developer/msbuild/2003">{153CB7F7-EB7B-44F2-B53E-F157288E3F19}</ProjectGuid>
    <OutputType xmlns="http://schemas.microsoft.com/developer/msbuild/2003">Library</OutputType>
    <AppDesignerFolder xmlns="http://schemas.microsoft.com/developer/msbuild/2003">Properties</AppDesignerFolder>

aubm added a commit to aubm/gomega that referenced this issue Apr 29, 2017

Adds support for MatchXML
Actual and expected cannot be pretty printed in the test output because
there are issues in the encoding/xml package related to duplicated
namespaces. See golang/go#7535

aubm added a commit to aubm/gomega that referenced this issue Apr 30, 2017

Adds support for MatchXML
Actual and expected cannot be pretty printed in the test output because
there are issues in the encoding/xml package related to duplicated
namespaces. See golang/go#7535

Because Go has no support for collecting all XML attributes before 1.8
(see [here](golang/go@c1a1328)),
the following two XML documents compare as equal on Go 1.7 and earlier:

```
<person gender="female">
```

```
<person gender="male">
```

@bradfitz bradfitz modified the milestones: Go1.9Early, Go1.10Early May 3, 2017

aubm added a commit to aubm/gomega-matchers that referenced this issue May 4, 2017

feat: adds support for MatchXML
Actual and expected cannot be pretty printed in the test output because
there are issues in the encoding/xml package related to duplicated
namespaces. See golang/go#7535

@bradfitz bradfitz modified the milestones: Go1.10Early, Go1.10 Jun 14, 2017

@SamWhited

Member

commented Jul 2, 2017

I ran into this today for the first time (surprisingly).

I wouldn't mind working on this once some of my other XML patches are merged if a decision can be made on how to handle it. Would stripping xmlns attributes from Token() (but leaving them for RawToken()) violate the compatibility guarantee? That seems sensible to me, but I suspect it's not possible this late in the game. Alternatively, maybe we could just not write a second XMLNS tag if one already exists.

UPDATE: There appear to be tests that specifically check for this behavior, but I have no idea why as it seems categorically wrong. Maybe my naive understanding of XML is wrong (as it so often is)?

@gopherbot

Author

commented Jul 2, 2017

CL https://golang.org/cl/47357 mentions this issue.

@rsc rsc modified the milestones: Go1.10, Go1.11 Nov 22, 2017

@gopherbot

Author

commented Apr 18, 2018

Change https://golang.org/cl/107755 mentions this issue: encoding/xml : fix duplication of namespace tags by encoder

@iWdGo

Contributor

commented Apr 18, 2018

A tag prefix identifies the namespace of the tag (https://www.w3.org/TR/xml/#sec-starttags),
not the default namespace declared with xmlns="...". Writing the prefix is incorrect when it is
already bound to a namespace using the standard xmlns:prefix="..." attribute.
This fix skips that output, so the duplication is avoided, in line with the namespace standard referenced above. It fixes this issue, and well-formed XML is always produced.
To keep the previous behavior, the prefix is still printed in all other cases.

Some logic was added to handle exceptions. The produced tag used to include attribute strings like xmlns="space" xmlns:_xmlns="xmlns" _prefix="..."
Without the duplication, these strings no longer appear, and they have been removed from the want values of all tests.

Only an explicit namespace combined with a colliding prefix can still produce XML that is not well-formed, because of attributes
like xmlns:x="x" that are added by the exception handling described above.

@gopherbot

Author

commented Apr 27, 2018

Change https://golang.org/cl/109855 mentions this issue: encoding/xml : Fixes to enforce XML namespace standard

@bradfitz bradfitz modified the milestones: Go1.11, Go1.12 May 18, 2018

@gopherbot gopherbot modified the milestones: Go1.12, Unplanned May 23, 2018
