New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/json: confusing errors when unmarshaling custom types #28189

Open
adnsv opened this Issue Oct 13, 2018 · 10 comments

Comments

Projects
None yet
4 participants
@adnsv

adnsv commented Oct 13, 2018

In go1.11

There is a minor issue with custom type unmarshaling using "encoding/json". As far as I understand the documentation (and also looking through the internal code), the requirements are:

  • when processing map keys, the custom type needs to support UnmarshalText
  • when processing values, the custom type needs to support UnmarshalJSON

When both of these interfaces are supported, everything is fine. Otherwise, the produced error messages are a bit cryptic and in some cases really confusing. I believe, adding a test for Implements(jsonUnmarshallerType) inside encoding/json/decode.go: func (d *decodeState) object(v reflect.Value) will make things more consistent.

Here is the code (try commenting out UnmarshalText and/or/xor UnmarshalJSON methods):

package main

import (
	"encoding/json"
	"fmt"
)

type Enum int

const (
	Enum1 = Enum(iota + 1)
	Enum2
)

func (enum Enum) String() string {
	switch enum {
	case Enum1: return "Enum1"
	case Enum2: return "Enum2"
	default: return "<INVALID ENUM>"
	}
}

func (enum *Enum) unmarshal(b []byte) error {
	var s string
	err := json.Unmarshal(b, &s)
	if err != nil { return err }
	switch s {
	case "ONE": *enum = Enum1
	case "TWO": *enum = Enum2
	default: return fmt.Errorf("Invalid Enum value '%s'", s)
	}
	return nil
}

func (enum *Enum) UnmarshalText(b []byte) error {
	return enum.unmarshal(b)
}

func (enum *Enum) UnmarshalJSON(b []byte) error {
	return enum.unmarshal(b)
}

func main() {
	data := []byte(`{"ONE":"ONE", "TWO":"TWO"}`)

	var ss map[string]string
	err := json.Unmarshal(data, &ss)
	if err != nil { fmt.Println("ss failure:", err) } else { fmt.Println("ss success:", ss) }

	var se map[string]Enum
	err = json.Unmarshal(data, &se)
	if err != nil { fmt.Println("se failure:", err) } else { fmt.Println("se success:", se) }

	var es map[Enum]string
	err = json.Unmarshal(data, &es)
	if err != nil { fmt.Println("es failure:", err) } else { fmt.Println("es success:", es) }

	var ee map[Enum]Enum
	err = json.Unmarshal(data, &ee)
	if err != nil { fmt.Println("ee failure:", err) } else { fmt.Println("ee success:", ee) }

	// Output when both UnmarshalText and UnmarshalJSON are defined:
	// ss success: map[ONE:ONE TWO:TWO]
	// se success: map[ONE:Enum1 TWO:Enum2]
	// es success: map[Enum1:ONE Enum2:TWO]
	// ee success: map[Enum1:Enum1 Enum2:Enum2]

	// Output when UnmarshalJSON is commented out:
	// ss success: map[ONE:ONE TWO:TWO]
	// se failure: invalid character 'T' looking for beginning of value
	// es failure: invalid character 'O' looking for beginning of value
	// ee failure: invalid character 'T' looking for beginning of value

	// Output when UnmarshalText is commented out:
	// ss success: map[ONE:ONE TWO:TWO]
	// se success: map[ONE:Enum1 TWO:Enum2]
	// es failure: json: cannot unmarshal number ONE into Go value of type main.Enum
	// ee failure: json: cannot unmarshal number ONE into Go value of type main.Enum

	// In more complex cases, having UnmarshalText undefined also produced this
	// error message: JSON decoder out of sync - data changing underfoot?
}

@FiloSottile FiloSottile changed the title from Asymmetrical json requirements for unmarshaling custom types to encoding/json: confusing errors when unmarshaling custom types Oct 15, 2018

@FiloSottile FiloSottile added this to the Go1.12 milestone Oct 15, 2018

@FiloSottile

This comment has been minimized.

Member

FiloSottile commented Oct 15, 2018

Sounds like the error messages for all those cases should be improved. Also, if you could provide a reproducer for "JSON decoder out of sync - data changing underfoot?", I believe that should never happen.

/cc @rsc @dsnet @bradfitz @mvdan per owners.

@adnsv

This comment has been minimized.

adnsv commented Oct 15, 2018

Here is the example that produces the ...underfoot message:

package main

import (
	"encoding/json"
	"fmt"
)

type Lang int

const (
	LangC = Lang(iota)
	LangCPP
)

type Project struct {
	OptionMap map[Lang]*Option `json:"options"`
	Typedefs  []string         `json:"typedefs"`
}

type Option struct {
	Active bool `json:"active"`
}

func main() {
	var pr *Project
	err := json.Unmarshal([]byte(data), &pr)
	if err != nil {
		fmt.Println(err)
		// Output: JSON decoder out of sync - data changing underfoot?
		// Uncommenting UnmarshalText below gets rid of this problem
	} else {
		fmt.Println("ok")
	}
}

func (lang *Lang) unmarshal(b []byte) error {
	var s string
	err := json.Unmarshal(b, &s)
	if err != nil {
		return err
	}
	switch s {
	case "C":
		*lang = LangC
	case "C++":
		*lang = LangCPP
	default:
		return fmt.Errorf("Unsupported Lang '%s'", s)
	}
	return nil
}

func (lang *Lang) UnmarshalJSON(b []byte) error {
	return lang.unmarshal(b)
}

/*
func (lang *Lang) UnmarshalText(b []byte) error {
	return lang.unmarshal(b)
}
/**/

const data = `
{
    "options": {
        "C": { "active": true },
        "C++": { "active": false }
    },
    "typedefs": [ "hello" ]
}`
@mvdan

This comment has been minimized.

Member

mvdan commented Oct 16, 2018

Thanks for raising this issue. The "underfoot" error is definitely wrong, so we can start with that one.

@mvdan

This comment has been minimized.

Member

mvdan commented Oct 16, 2018

Manually minimized reproducer below - will continue digging.

package main

import "encoding/json"

func main() {
        json.Unmarshal([]byte(data), new(T))
}

type T struct {
        F map[int]int
}

const data = `{
        "F": {
                "a": 2,
                "b": 3
        }
}`
@gopherbot

This comment has been minimized.

gopherbot commented Oct 16, 2018

Change https://golang.org/cl/142518 mentions this issue: encoding/json: fix "data changed underfoot?" panic

gopherbot pushed a commit that referenced this issue Oct 16, 2018

encoding/json: fix "data changed underfoot?" panic
Given a program as follows:

	data := []byte(`{"F": {
		"a": 2,
		"3": 4
	}}`)
	json.Unmarshal(data, &map[string]map[int]int{})

The JSON package should error, as "a" is not a valid integer. However,
we'd encounter a panic:

	panic: JSON decoder out of sync - data changing underfoot?

The reason was that decodeState.object would return a nil error on
encountering the invalid map key string, while saving the key type error
for later. This broke if we were inside another object, as we would
abruptly end parsing the nested object, leaving the decoder in an
unexpected state.

To fix this, simply avoid storing the map element and continue decoding
the object, to leave the decoder state exactly as if we hadn't seen an
invalid key type.

This affected both signed and unsigned integer keys, so fix both and add
two test cases.

Updates #28189.

Change-Id: I8a6204cc3ff9fb04ed769df7a20a824c8b94faff
Reviewed-on: https://go-review.googlesource.com/c/142518
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@mvdan

This comment has been minimized.

Member

mvdan commented Oct 18, 2018

The underfoot panic is now fixed - the example above prints:

json: cannot unmarshal number C into Go struct field Option.active of type main.Lang

I'll try to look at the other cases before the cycle is over.

@mvdan mvdan self-assigned this Oct 18, 2018

@mvdan mvdan removed the help wanted label Nov 6, 2018

@mvdan

This comment has been minimized.

Member

mvdan commented Nov 16, 2018

Output when UnmarshalJSON is commented out

I think this block is just the user's bug. If you add fmt.Println(string(b)) to the beginning of both UnmarshalText and UnmarshalJSON, you'll find that they get strings like ONE and "ONE", respectively.

When you comment out UnmarshalJSON, the JSON package falls back to using UnmarshalText. Which then essentially does:

var s string
err := json.Unmarshal([]byte("ONE"), &s)

And this, as you'd expect, errors with invalid character 'O' looking for beginning of value. The bug here is that you should generally not use json.Unmarshal within a TextUnmarshaler implementation, because it will receive the string's contents, not the string object's input bytes.

@mvdan

This comment has been minimized.

Member

mvdan commented Nov 16, 2018

To clarify my comment above: I don't think we can improve those errors, because the nested call to json.Unmarshal has no context about where it's being used. The user code is using it incorrectly, so it's not surprising that the returned error is confusing.

As for using Enum as a map key when it only implements UnmarshalJSON, it makes sense because of this piece of documentation:

The map's key type must either be a string, an integer, or implement encoding.TextUnmarshaler.

However, the error could be clearer. Instead of cannot unmarshal number ONE into Go value of type main.Enum, we could have an error like main.Enum is not a valid map key type. Will look into a CL for that, and close this issue.

@mvdan

This comment has been minimized.

Member

mvdan commented Nov 16, 2018

Smaller repro for the last case:

type T int

func (*T) UnmarshalJSON(b []byte) error {
        return nil
}

func main() {
        data := []byte(`{"foo":"bar"}`)
        var m map[T]string
        fmt.Println(json.Unmarshal(data, &m))
}

Not having carefully read the encoding/json docs, it's understandable if a user sees the error json: cannot unmarshal number foo into Go value of type main.T and thinks "that doesn't make any sense, I implement UnmarshalJSON".

@mvdan mvdan added NeedsDecision and removed NeedsFix labels Nov 28, 2018

@mvdan mvdan modified the milestones: Go1.12, Go1.13 Nov 28, 2018

@mvdan

This comment has been minimized.

Member

mvdan commented Nov 28, 2018

It's not clear that the last remaining case needs a fix, so I'm moving this to 1.13 and marking as NeedsDecision.

We could modify UnmarshalTypeError to have a hint that a json.Unmarshaler type was found but was ignored because it was a map key, but that seems like a bit of a hacky solution.

Perhaps @rsc has some thoughts.

@mvdan mvdan removed their assignment Dec 2, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment