Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/xml: incompatible changes in the Go 1.21.0 #61881

Closed
xuri opened this issue Aug 9, 2023 · 15 comments
Closed

encoding/xml: incompatible changes in the Go 1.21.0 #61881

xuri opened this issue Aug 9, 2023 · 15 comments
Assignees
Labels
NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@xuri
Copy link
Contributor

xuri commented Aug 9, 2023

What version of Go are you using (go version)?

Go1.21.0

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

darwin/amd64

What did you do?

There are some incompatible changes in the Go 1.21.0 encoding/xml library. Please take a look at following example:

package main

import (
    "encoding/xml"
    "fmt"
)

func main() {
    type B struct {
        XMLName xml.Name `xml:"b"`
        C       bool     `xml:"c"`
    }

    type A struct {
        XMLName xml.Name `xml:"http://example.com a"`
        B       B
    }

    var a A

    input := `<a xmlns="http://example.com"><b><c>true</c></b></a>`

    if err := xml.Unmarshal([]byte(input), &a); err != nil {
        fmt.Print(err)
    }

    output, err := xml.Marshal(&a)
    if err != nil {
        fmt.Println(err)
    }

    fmt.Printf("input:  %s\r\noutput: %s", input, string(output))
}

What did you expect to see?

Output by Go 1.20.7 and previous Go released version

input:  <a xmlns="http://example.com"><b><c>true</c></b></a>
output: <a xmlns="http://example.com"><b><c>true</c></b></a>

What did you see instead?

Output by Go 1.21.0

input:  <a xmlns="http://example.com"><b>true</b></a>
output: <a xmlns="http://example.com"><b xmlns=""><c>true</c></b></a>

There are new empty XML namespace xmlns attributes in the serialized output, which made the input and output XML content inconsistent without any modify.

I created a patch for it and still waiting for a reply, follow up to https://go.dev/cl/466295. Relates to #58401, and external Excelize project issues #1465, #1595 and #1603.

Thanks.

@gopherbot
Copy link

Change https://go.dev/cl/466295 mentions this issue: encoding/xml: overriding by empty namespace when no new name declaration

@mknyszek
Copy link
Contributor

mknyszek commented Aug 9, 2023

CC @rsc via https://dev.golang.org/owners

@mknyszek mknyszek added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Aug 9, 2023
@mknyszek mknyszek added this to the Backlog milestone Aug 9, 2023
@bcmills bcmills modified the milestones: Backlog, Go1.22 Aug 9, 2023
@bcmills
Copy link
Contributor

bcmills commented Aug 9, 2023

Seems like this was probably introduced by https://go.dev/cl/108796 (for #7113)?

(attn @iwdgo @ianlancetaylor)

@bcmills
Copy link
Contributor

bcmills commented Aug 9, 2023

Per #56986, though, it seems like this change it behavior should at least have been guarded by a new GODEBUG value. 🤔

@Hr0bar
Copy link

Hr0bar commented Aug 11, 2023

Is the milestone Go 1.22 going to be kept ? Shouldnt this be fixed in 1.21 minor version ?

@jfesler
Copy link

jfesler commented Aug 15, 2023

This has me in a bind. :(. I'm having to advise people using my code to copy go1.20's encoding/xml on top of go1.21 as a stopgap.

Extra annoying this wasn't in the release notes.

@bcmills
Copy link
Contributor

bcmills commented Aug 15, 2023

@gopherbot, please backport to Go 1.21. Per #56986 this should at least have a release note and a GODEBUG guard.

@gopherbot
Copy link

Backport issue(s) opened: #62051 (for 1.21).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://go.dev/wiki/MinorReleases.

@gopherbot
Copy link

Change https://go.dev/cl/522316 mentions this issue: [release-branch.go1.21] encoding/xml: overriding by empty namespace when no new name declaration

@bronger
Copy link

bronger commented Aug 25, 2023

Is the new behaviour considered incorrect, or should it be simply be made conditional with GODEBUG?

And, is it possible to get rid of the explicit namespace at <b>? One can set it easily to the same as <a>, but can one avoid it? (Yes, I know that on the XML level, both variants are equivalent. Still …)

@ianlancetaylor
Copy link
Contributor

The new behavior is considered incorrect, and we will fix it in the next minor release.

@fithisux
Copy link

fithisux commented Aug 26, 2023

The new behavior is considered incorrect, and we will fix it in the next minor release.

"Considered" or "is"?

My understanding though is that the two "different" outputs from the point of view of standard can be the same depending on the requirements.

https://stackoverflow.com/questions/1587891/is-xmlns-a-valid-xml-namespace

@mpwalkerdine
Copy link

mpwalkerdine commented Aug 26, 2023

The new behavior is considered incorrect, and we will fix it in the next minor release.

"Considered" or "is"?

My understanding though is that the two "different" outputs from the point of view of standard can be the same depending on the requirements.

https://stackoverflow.com/questions/1587891/is-xmlns-a-valid-xml-namespace

As I understand it, including an empty namespace attribute is not the same as omitting it. The accepted answer in the link says as much: "It is legal, and this is the way to bring an element into the global namespace."

The new behaviour which is considered incorrect is the change to the output when encoding Go data structures into XML. It is no longer possible to define what the element name will be without also overriding the default namespace, i.e. the one from the parent element.

For example, this struct defines a name without a namespace:

type Item struct {
	XMLName xml.Name `xml:"item"`
	Value   string `xml:"value,attr"`
}

It used to be possible to include this in another struct, which has a namespace definition, and have it inherit that one.

type Names struct {
	XMLName xml.Name `xml:"http://some.ns.org/names names"`
	Values  []Item
}

var names = Names{Values: []Item{{Value: "a"}, {Value: "b"}}}

I would expect names to marshal as:

<names xmlns="http://some.ns.org/names">
  <item value="a"></item>
  <item value="b"></item>
</names>

Note there is no xmlns attribute on each item, so they're considered to be in http://some.ns.org/names by default. But instead what we now see in Go 1.21 is:

<names xmlns="http://some.ns.org/names">
  <item value="a" xmlns=""></item>
  <item value="b" xmlns=""></item>
</names>

Which means those 2 items are no longer in http://some.ns.org/names, they're in the global namespace.

You can see this in the playground if you switch between Go 1.20 and 1.21: https://go.dev/play/p/i1pHuben06z

So it seems this isn't so much about XML standards, but about being able to write Go code which produces whichever XML representation is needed in any given context. Before the bug fix for #7113, it wasn't possible to override the default namespace by including an empty namespace, which is a valid requirement. Now it's no longer possible not to, which is also valid.

@bronger
Copy link

bronger commented Aug 26, 2023

You can set the NS of the children explicitly to the parent’s one. But this is awkward, and it leads to a verbose XML serialisation because the marshaller does not detect that.

@fithisux
Copy link

Thank you for the deatiled explanation.

gopherbot pushed a commit that referenced this issue Aug 30, 2023
…hen no new name declaration

The unmarshal and marshal XML text should be consistent if not modified deserialize variable.

For #61881
Fixes #62051

Change-Id: I475f7b05211b618685597d3ff20b97e3bbeaf8f8
GitHub-Last-Rev: 6831c77
GitHub-Pull-Request: #58401
Reviewed-on: https://go-review.googlesource.com/c/go/+/522316
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Commit-Queue: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Aug 30, 2023
nics added a commit to ugent-library/oai-service that referenced this issue Aug 31, 2023
nicolasfranck added a commit to ugent-library/old-people-service that referenced this issue Sep 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

Successfully merging a pull request may close this issue.