Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/xml: Unmarshalling empty XML attribute to int throws error #19333

Closed
neocortical opened this issue Mar 1, 2017 · 18 comments
Closed
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Milestone

Comments

@neocortical
Copy link

neocortical commented Mar 1, 2017

Go version:
1.8 (regression from 1.7.x --> 1.8.0)

Environment:
Dockerized Alpine (golang:1.8-alpine and golang:1.7-alpine)

Description:
In Go 1.7, an XML entity like <Object foo=""></Object> could be unmarshalled into a struct where Foo is an int value without throwing an error. When upgrading to Go 1.8.0, the same code throws an error due to strconv.ParseInt being called with the empty string as an argument.

This seems to be a behavioral regression between Go 1.7.x and Go 1.8.0.

Example code:
https://play.golang.org/p/GIsONzXQQQ

Expected behavior:
Example code should return without error, printing Foo: 0

Actual behavior:
An error is thrown, indicating that ParseInt was called with an empty string.

Compiling this same code with Go 1.7.x results in the expected behavior.

@tobbbles
Copy link

tobbbles commented Mar 6, 2017

Recently tried upgrading to 1.8 and ran into this.

Many of our XML-based 3rd parties leave attributes empty for prices, price="". I think unmarshalling to the type default is expected here.

@SamWhited
Copy link
Member

SamWhited commented Mar 9, 2017

This appears to have been introduced by 2c58cb3 which returns an error that was previously ignored.

This is a wider change that may have affected several types; I'm unsure if it breaks the compatibility promise, or was fixing a bug. I'm inclined to think that it doesn't make sense for ints to unmarshal to their zero value in XML, but this is one of those weird cases where Go types/XML aren't really compatible and any mapping between them will have to do some funky stuff, and if it's always been doing this maybe we should special case it and ignore the err result of strconv.

/cc @ericlagergren (author), @bradfitz (reviewer)

@bradfitz
Copy link
Contributor

bradfitz commented Mar 9, 2017

@rsc, do you have opinions here?

@bradfitz bradfitz added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Mar 9, 2017
@SamWhited
Copy link
Member

SamWhited commented Mar 9, 2017

Also worth noting: before that commit unmarshaling values did return an strconv error, it was only unmarshaling attributes that ignored it. Test:

// Issue 19333. Unmarshaling empty attr or element into int must not error.
func TestUnmarsdhalInt(t *testing.T) {
	t.Run("Attr", func(t *testing.T) {
		v := &struct {
			XMLName Name `xml:"int"`
			Foo     int  `xml:"foo,attr"`
		}{}
		if err := Unmarshal([]byte(`<int foo=""></int>`), v); err != nil {
			t.Errorf("did not expect error, but got `%s`", err)
		}
		if v.Foo != 0 {
			t.Errorf("want 0, have %d", v.Foo)
		}
	})
	t.Run("Value", func(t *testing.T) {
		v := &struct {
			XMLName Name `xml:"int"`
			Foo     int  `xml:"foo"`
		}{}
		if err := Unmarshal([]byte(`<int><foo></foo></int>`), v); err != nil {
			t.Errorf("did not expect error, but got `%s`", err)
		}
		if v.Foo != 0 {
			t.Errorf("want 0, have %d", v.Foo)
		}
	})
}

@ericlagergren
Copy link
Contributor

ericlagergren commented Mar 9, 2017

Like @SamWhited said, it makes the most sense to special case attr="" as the Go type's zero value.

I created the original issue because I tried to marshal a single letter into a byte and instead of returning an error saying, "you can't do that" it just silently left the field as 0.

The alternative (reverting 2c58cb3) means if the client sends an invalid XML request (e.g., attr="foobar" into an int64) the server has no way to determine whether it was a bad request or the field was meant to be 0.

I believe that behavior is definitely a bug in the XML package.

@bradfitz bradfitz added this to the Go1.8.1 milestone Mar 21, 2017
@s-mang
Copy link
Contributor

s-mang commented Mar 22, 2017

related to #13417
and my open CL https://go-review.googlesource.com/c/38386

@s-mang
Copy link
Contributor

s-mang commented Mar 23, 2017

Fixed via 0a0186f

@s-mang s-mang closed this as completed Mar 23, 2017
@rsc
Copy link
Contributor

rsc commented Apr 5, 2017

Reopening for backport I guess.

@rsc rsc reopened this Apr 5, 2017
@rsc
Copy link
Contributor

rsc commented Apr 5, 2017

I can't figure out what is going on here. The linked commit 0a0186f changes the behavior of the XML package beyond this bug fix. It is not appropriate for Go 1.8.1. Leaving this open because we need to decide whether to live with the bug or prepare a different release-specific fix.

@SamWhited
Copy link
Member

SamWhited commented Apr 5, 2017

Could we revert 2c58cb3 for 1.8.1 only?

@ericlagergren
Copy link
Contributor

ericlagergren commented Apr 5, 2017 via email

@rsc
Copy link
Contributor

rsc commented Apr 5, 2017

OK but the bug was marked Go 1.8.1 and closed with a link to that commit. That commit is an inappropriate fix for Go 1.8.1 (and not cherry-picked yet anyway), so reopening.

@rsc
Copy link
Contributor

rsc commented Apr 5, 2017

@SamWhited, yes, that seems to be the solution. I will send a CL.

@s-mang
Copy link
Contributor

s-mang commented Apr 5, 2017

My apologies for closing.

@rsc
Copy link
Contributor

rsc commented Apr 5, 2017

No worries @adams-sarah, it's all very confusing around point release milestones. We hope to have a bot help with this soon.

@rsc
Copy link
Contributor

rsc commented Apr 5, 2017

Here is a test program. It covers the cases in this issue (attributes) and the ones in #13417 (elements):

package main

import (
	"encoding/xml"
	"fmt"
)

func main() {
	type X struct {
		XMLName xml.Name `xml:"X"`
		A       int      `xml:",attr"`
		C       int
	}

	var tests = []struct {
		input string
		ok    bool
	}{
		{`<X></X>`, true},
		{`<X A=""></X>`, true},
		{`<X A="bad"></X>`, false},
		{`<X></X>`, true},
		{`<X><C></C></X>`, true},
		{`<X><C/></X>`, true},
		{`<X><C>bad</C></X>`, false},
	}

	for _, tt := range tests {
		err := xml.Unmarshal([]byte(tt.input), new(X))
		if err != nil {
			fmt.Printf("%-20s ERROR %v\n", tt.input, err)
		} else {
			fmt.Printf("%-20s ok\n", tt.input)
		}
	}
}

Go 1.2 through Go 1.7 were consistent: attributes unchecked, children strictly checked:

$ go1.7 run /tmp/x.go
<X></X>              ok
<X A=""></X>         ok
<X A="bad"></X>      ok
<X></X>              ok
<X><C></C></X>       ERROR strconv.ParseInt: parsing "": invalid syntax
<X><C/></X>          ERROR strconv.ParseInt: parsing "": invalid syntax
<X><C>bad</C></X>    ERROR strconv.ParseInt: parsing "bad": invalid syntax
$

Go 1.8 made attributes strictly checked, matching children:

$ go1.8 run /tmp/x.go
<X></X>              ok
<X A=""></X>         ERROR strconv.ParseInt: parsing "": invalid syntax
<X A="bad"></X>      ERROR strconv.ParseInt: parsing "bad": invalid syntax
<X></X>              ok
<X><C></C></X>       ERROR strconv.ParseInt: parsing "": invalid syntax
<X><C/></X>          ERROR strconv.ParseInt: parsing "": invalid syntax
<X><C>bad</C></X>    ERROR strconv.ParseInt: parsing "bad": invalid syntax
$ 

Go 1.9 will (at least right now) relax things so that only non-empty bad inputs are checked:

$ go run /tmp/x.go  # Go 1.9 development
<X></X>              ok
<X A=""></X>         ok
<X A="bad"></X>      ERROR strconv.ParseInt: parsing "bad": invalid syntax
<X></X>              ok
<X><C></C></X>       ok
<X><C/></X>          ok
<X><C>bad</C></X>    ERROR strconv.ParseInt: parsing "bad": invalid syntax
$ 

For Go 1.8.1, I will revert the checking behavior back to Go 1.7 for now. We can try another attempt at attribute checking, with @adams-sarah's strictness relaxation applied both to attributes and children, in Go 1.9.

$ go run /tmp/x.go  # Go 1.8.1 development
<X></X>              ok
<X A=""></X>         ok
<X A="bad"></X>      ok
<X></X>              ok
<X><C></C></X>       ERROR strconv.ParseInt: parsing "": invalid syntax
<X><C/></X>          ERROR strconv.ParseInt: parsing "": invalid syntax
<X><C>bad</C></X>    ERROR strconv.ParseInt: parsing "bad": invalid syntax

@gopherbot
Copy link

CL https://golang.org/cl/39607 mentions this issue.

@rsc
Copy link
Contributor

rsc commented Apr 5, 2017

Cherry-picked.

@rsc rsc closed this as completed Apr 5, 2017
gopherbot pushed a commit that referenced this issue Apr 5, 2017
…ntax, like Go 1.7

Consider this struct, which expects an attribute A and a child C both ints:

    type X struct {
        XMLName xml.Name `xml:"X"`
        A       int      `xml:",attr"`
        C       int
    }

Go 1.2 through Go 1.7 were consistent: attributes unchecked,
children strictly checked:

    $ go1.7 run /tmp/x.go
    <X></X>              ok
    <X A=""></X>         ok
    <X A="bad"></X>      ok
    <X></X>              ok
    <X><C></C></X>       ERROR strconv.ParseInt: parsing "": invalid syntax
    <X><C/></X>          ERROR strconv.ParseInt: parsing "": invalid syntax
    <X><C>bad</C></X>    ERROR strconv.ParseInt: parsing "bad": invalid syntax
    $

Go 1.8 made attributes strictly checked, matching children:

    $ go1.8 run /tmp/x.go
    <X></X>              ok
    <X A=""></X>         ERROR strconv.ParseInt: parsing "": invalid syntax
    <X A="bad"></X>      ERROR strconv.ParseInt: parsing "bad": invalid syntax
    <X></X>              ok
    <X><C></C></X>       ERROR strconv.ParseInt: parsing "": invalid syntax
    <X><C/></X>          ERROR strconv.ParseInt: parsing "": invalid syntax
    <X><C>bad</C></X>    ERROR strconv.ParseInt: parsing "bad": invalid syntax
    $

but this broke XML code that had empty attributes (#19333).

In Go 1.9 we plan to start allowing empty children (#13417).
The fix for that will also make empty attributes work again:

    $ go run /tmp/x.go  # Go 1.9 development
    <X></X>              ok
    <X A=""></X>         ok
    <X A="bad"></X>      ERROR strconv.ParseInt: parsing "bad": invalid syntax
    <X></X>              ok
    <X><C></C></X>       ok
    <X><C/></X>          ok
    <X><C>bad</C></X>    ERROR strconv.ParseInt: parsing "bad": invalid syntax
    $

For Go 1.8.1, we want to restore the empty attribute behavior
to match Go 1.7 but not yet change the child behavior as planned for Go 1.9,
since that change hasn't been through release testing.

Instead, restore the more lax Go 1.7 behavior, so that XML files
with empty attributes will not be broken until Go 1.9:

    $ go run /tmp/x.go  # after this CL
    <X></X>              ok
    <X A=""></X>         ok
    <X A="bad"></X>      ok
    <X></X>              ok
    <X><C></C></X>       ERROR strconv.ParseInt: parsing "": invalid syntax
    <X><C/></X>          ERROR strconv.ParseInt: parsing "": invalid syntax
    <X><C>bad</C></X>    ERROR strconv.ParseInt: parsing "bad": invalid syntax
    $

Fixes #19333.

Change-Id: I3d38ebd2509f5b6ea3fd4856327f887f9a1a8085
Reviewed-on: https://go-review.googlesource.com/39607
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Sarah Adams <shadams@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
@golang golang locked and limited conversation to collaborators Apr 5, 2018
@rsc rsc removed their assignment Jun 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Projects
None yet
Development

No branches or pull requests

8 participants