Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/xml: Unmarshalling empty XML attribute to int throws error #19333

Closed
neocortical opened this Issue Mar 1, 2017 · 18 comments

Comments

Projects
None yet
8 participants
@neocortical
Copy link

neocortical commented Mar 1, 2017

Go version:
1.8 (regression from 1.7.x --> 1.8.0)

Environment:
Dockerized Alpine (golang:1.8-alpine and golang:1.7-alpine)

Description:
In Go 1.7, an XML entity like <Object foo=""></Object> could be unmarshalled into a struct where Foo is an int value without throwing an error. When upgrading to Go 1.8.0, the same code throws an error due to strconv.ParseInt being called with the empty string as an argument.

This seems to be a behavioral regression between Go 1.7.x and Go 1.8.0.

Example code:
https://play.golang.org/p/GIsONzXQQQ

Expected behavior:
Example code should return without error, printing Foo: 0

Actual behavior:
An error is thrown, indicating that ParseInt was called with an empty string.

Compiling this same code with Go 1.7.x results in the expected behavior.

@tobbbles

This comment has been minimized.

Copy link

tobbbles commented Mar 6, 2017

Recently tried upgrading to 1.8 and ran into this.

Many of our XML-based 3rd parties leave attributes empty for prices, price="". I think unmarshalling to the type default is expected here.

@SamWhited

This comment has been minimized.

Copy link
Member

SamWhited commented Mar 9, 2017

This appears to have been introduced by 2c58cb3 which returns an error that was previously ignored.

This is a wider change that may have affected several types; I'm unsure if it breaks the compatibility promise, or was fixing a bug. I'm inclined to think that it doesn't make sense for ints to unmarshal to their zero value in XML, but this is one of those weird cases where Go types/XML aren't really compatible and any mapping between them will have to do some funky stuff, and if it's always been doing this maybe we should special case it and ignore the err result of strconv.

/cc @ericlagergren (author), @bradfitz (reviewer)

@bradfitz

This comment has been minimized.

Copy link
Member

bradfitz commented Mar 9, 2017

@rsc, do you have opinions here?

@SamWhited

This comment has been minimized.

Copy link
Member

SamWhited commented Mar 9, 2017

Also worth noting: before that commit unmarshaling values did return an strconv error, it was only unmarshaling attributes that ignored it. Test:

// Issue 19333. Unmarshaling empty attr or element into int must not error.
func TestUnmarsdhalInt(t *testing.T) {
	t.Run("Attr", func(t *testing.T) {
		v := &struct {
			XMLName Name `xml:"int"`
			Foo     int  `xml:"foo,attr"`
		}{}
		if err := Unmarshal([]byte(`<int foo=""></int>`), v); err != nil {
			t.Errorf("did not expect error, but got `%s`", err)
		}
		if v.Foo != 0 {
			t.Errorf("want 0, have %d", v.Foo)
		}
	})
	t.Run("Value", func(t *testing.T) {
		v := &struct {
			XMLName Name `xml:"int"`
			Foo     int  `xml:"foo"`
		}{}
		if err := Unmarshal([]byte(`<int><foo></foo></int>`), v); err != nil {
			t.Errorf("did not expect error, but got `%s`", err)
		}
		if v.Foo != 0 {
			t.Errorf("want 0, have %d", v.Foo)
		}
	})
}
@ericlagergren

This comment has been minimized.

Copy link
Contributor

ericlagergren commented Mar 9, 2017

Like @SamWhited said, it makes the most sense to special case attr="" as the Go type's zero value.

I created the original issue because I tried to marshal a single letter into a byte and instead of returning an error saying, "you can't do that" it just silently left the field as 0.

The alternative (reverting 2c58cb3) means if the client sends an invalid XML request (e.g., attr="foobar" into an int64) the server has no way to determine whether it was a bad request or the field was meant to be 0.

I believe that behavior is definitely a bug in the XML package.

@bradfitz bradfitz added this to the Go1.8.1 milestone Mar 21, 2017

@adams-sarah

This comment has been minimized.

Copy link
Contributor

adams-sarah commented Mar 22, 2017

@adams-sarah

This comment has been minimized.

Copy link
Contributor

adams-sarah commented Mar 23, 2017

Fixed via 0a0186f

@rsc

This comment has been minimized.

Copy link
Contributor

rsc commented Apr 5, 2017

Reopening for backport I guess.

@rsc rsc reopened this Apr 5, 2017

@rsc

This comment has been minimized.

Copy link
Contributor

rsc commented Apr 5, 2017

I can't figure out what is going on here. The linked commit 0a0186f changes the behavior of the XML package beyond this bug fix. It is not appropriate for Go 1.8.1. Leaving this open because we need to decide whether to live with the bug or prepare a different release-specific fix.

@SamWhited

This comment has been minimized.

Copy link
Member

SamWhited commented Apr 5, 2017

Could we revert 2c58cb3 for 1.8.1 only?

@ericlagergren

This comment has been minimized.

Copy link
Contributor

ericlagergren commented Apr 5, 2017

@rsc

This comment has been minimized.

Copy link
Contributor

rsc commented Apr 5, 2017

OK but the bug was marked Go 1.8.1 and closed with a link to that commit. That commit is an inappropriate fix for Go 1.8.1 (and not cherry-picked yet anyway), so reopening.

@rsc

This comment has been minimized.

Copy link
Contributor

rsc commented Apr 5, 2017

@SamWhited, yes, that seems to be the solution. I will send a CL.

@adams-sarah

This comment has been minimized.

Copy link
Contributor

adams-sarah commented Apr 5, 2017

My apologies for closing.

@rsc

This comment has been minimized.

Copy link
Contributor

rsc commented Apr 5, 2017

No worries @adams-sarah, it's all very confusing around point release milestones. We hope to have a bot help with this soon.

@rsc

This comment has been minimized.

Copy link
Contributor

rsc commented Apr 5, 2017

Here is a test program. It covers the cases in this issue (attributes) and the ones in #13417 (elements):

package main

import (
	"encoding/xml"
	"fmt"
)

func main() {
	type X struct {
		XMLName xml.Name `xml:"X"`
		A       int      `xml:",attr"`
		C       int
	}

	var tests = []struct {
		input string
		ok    bool
	}{
		{`<X></X>`, true},
		{`<X A=""></X>`, true},
		{`<X A="bad"></X>`, false},
		{`<X></X>`, true},
		{`<X><C></C></X>`, true},
		{`<X><C/></X>`, true},
		{`<X><C>bad</C></X>`, false},
	}

	for _, tt := range tests {
		err := xml.Unmarshal([]byte(tt.input), new(X))
		if err != nil {
			fmt.Printf("%-20s ERROR %v\n", tt.input, err)
		} else {
			fmt.Printf("%-20s ok\n", tt.input)
		}
	}
}

Go 1.2 through Go 1.7 were consistent: attributes unchecked, children strictly checked:

$ go1.7 run /tmp/x.go
<X></X>              ok
<X A=""></X>         ok
<X A="bad"></X>      ok
<X></X>              ok
<X><C></C></X>       ERROR strconv.ParseInt: parsing "": invalid syntax
<X><C/></X>          ERROR strconv.ParseInt: parsing "": invalid syntax
<X><C>bad</C></X>    ERROR strconv.ParseInt: parsing "bad": invalid syntax
$

Go 1.8 made attributes strictly checked, matching children:

$ go1.8 run /tmp/x.go
<X></X>              ok
<X A=""></X>         ERROR strconv.ParseInt: parsing "": invalid syntax
<X A="bad"></X>      ERROR strconv.ParseInt: parsing "bad": invalid syntax
<X></X>              ok
<X><C></C></X>       ERROR strconv.ParseInt: parsing "": invalid syntax
<X><C/></X>          ERROR strconv.ParseInt: parsing "": invalid syntax
<X><C>bad</C></X>    ERROR strconv.ParseInt: parsing "bad": invalid syntax
$ 

Go 1.9 will (at least right now) relax things so that only non-empty bad inputs are checked:

$ go run /tmp/x.go  # Go 1.9 development
<X></X>              ok
<X A=""></X>         ok
<X A="bad"></X>      ERROR strconv.ParseInt: parsing "bad": invalid syntax
<X></X>              ok
<X><C></C></X>       ok
<X><C/></X>          ok
<X><C>bad</C></X>    ERROR strconv.ParseInt: parsing "bad": invalid syntax
$ 

For Go 1.8.1, I will revert the checking behavior back to Go 1.7 for now. We can try another attempt at attribute checking, with @adams-sarah's strictness relaxation applied both to attributes and children, in Go 1.9.

$ go run /tmp/x.go  # Go 1.8.1 development
<X></X>              ok
<X A=""></X>         ok
<X A="bad"></X>      ok
<X></X>              ok
<X><C></C></X>       ERROR strconv.ParseInt: parsing "": invalid syntax
<X><C/></X>          ERROR strconv.ParseInt: parsing "": invalid syntax
<X><C>bad</C></X>    ERROR strconv.ParseInt: parsing "bad": invalid syntax
@gopherbot

This comment has been minimized.

Copy link

gopherbot commented Apr 5, 2017

CL https://golang.org/cl/39607 mentions this issue.

@rsc

This comment has been minimized.

Copy link
Contributor

rsc commented Apr 5, 2017

Cherry-picked.

@rsc rsc closed this Apr 5, 2017

gopherbot pushed a commit that referenced this issue Apr 5, 2017

[release-branch.go1.8] encoding/xml: disable checking of attribute sy…
…ntax, like Go 1.7

Consider this struct, which expects an attribute A and a child C both ints:

    type X struct {
        XMLName xml.Name `xml:"X"`
        A       int      `xml:",attr"`
        C       int
    }

Go 1.2 through Go 1.7 were consistent: attributes unchecked,
children strictly checked:

    $ go1.7 run /tmp/x.go
    <X></X>              ok
    <X A=""></X>         ok
    <X A="bad"></X>      ok
    <X></X>              ok
    <X><C></C></X>       ERROR strconv.ParseInt: parsing "": invalid syntax
    <X><C/></X>          ERROR strconv.ParseInt: parsing "": invalid syntax
    <X><C>bad</C></X>    ERROR strconv.ParseInt: parsing "bad": invalid syntax
    $

Go 1.8 made attributes strictly checked, matching children:

    $ go1.8 run /tmp/x.go
    <X></X>              ok
    <X A=""></X>         ERROR strconv.ParseInt: parsing "": invalid syntax
    <X A="bad"></X>      ERROR strconv.ParseInt: parsing "bad": invalid syntax
    <X></X>              ok
    <X><C></C></X>       ERROR strconv.ParseInt: parsing "": invalid syntax
    <X><C/></X>          ERROR strconv.ParseInt: parsing "": invalid syntax
    <X><C>bad</C></X>    ERROR strconv.ParseInt: parsing "bad": invalid syntax
    $

but this broke XML code that had empty attributes (#19333).

In Go 1.9 we plan to start allowing empty children (#13417).
The fix for that will also make empty attributes work again:

    $ go run /tmp/x.go  # Go 1.9 development
    <X></X>              ok
    <X A=""></X>         ok
    <X A="bad"></X>      ERROR strconv.ParseInt: parsing "bad": invalid syntax
    <X></X>              ok
    <X><C></C></X>       ok
    <X><C/></X>          ok
    <X><C>bad</C></X>    ERROR strconv.ParseInt: parsing "bad": invalid syntax
    $

For Go 1.8.1, we want to restore the empty attribute behavior
to match Go 1.7 but not yet change the child behavior as planned for Go 1.9,
since that change hasn't been through release testing.

Instead, restore the more lax Go 1.7 behavior, so that XML files
with empty attributes will not be broken until Go 1.9:

    $ go run /tmp/x.go  # after this CL
    <X></X>              ok
    <X A=""></X>         ok
    <X A="bad"></X>      ok
    <X></X>              ok
    <X><C></C></X>       ERROR strconv.ParseInt: parsing "": invalid syntax
    <X><C/></X>          ERROR strconv.ParseInt: parsing "": invalid syntax
    <X><C>bad</C></X>    ERROR strconv.ParseInt: parsing "bad": invalid syntax
    $

Fixes #19333.

Change-Id: I3d38ebd2509f5b6ea3fd4856327f887f9a1a8085
Reviewed-on: https://go-review.googlesource.com/39607
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Sarah Adams <shadams@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

@golang golang locked and limited conversation to collaborators Apr 5, 2018

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
You can’t perform that action at this time.