New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/json: unmarshal into slice reuses element data between len and cap #21092

Open
trotha01 opened this Issue Jul 19, 2017 · 13 comments

Comments

Projects
None yet
10 participants
@trotha01

trotha01 commented Jul 19, 2017

What version of Go are you using (go version)?

go1.8

What operating system and processor architecture are you using (go env)?

GOOS=nacl
GOARCH=amd64p32

What did you do?

https://play.golang.org/p/lbYUhgOe--

What did you expect to see?

According to the json Unmarshal docs, Unmarshal resets the slice length to zero and then appends each element to the slice.

What did you see instead?

Instead Unmarshal resets the slice length to zero and then modifies the elements of the underlying slice.

@bradfitz

This comment has been minimized.

Member

bradfitz commented Jul 19, 2017

This is working as intended. See https://blog.golang.org/slices for how slices work.

"resets the slice length to zero and then appends each element to the slice" and "resets the slice length to zero and then modifies the elements of the underlying slice" are the same thing.

@bradfitz bradfitz closed this Jul 19, 2017

@trotha01

This comment has been minimized.

trotha01 commented Jul 19, 2017

https://golang.org/pkg/encoding/json/#Unmarshal

To unmarshal a JSON array into a slice, Unmarshal resets the slice length to zero and then appends each element to the slice

@bradfitz This may be working as intended, but not as described in the docs. The docs describe the slice length being set to zero (which happens), then appending (which is not what is happening).

The builtin append will replace the struct in a slice wholly. Unmarshal does not. Since the builtin append does one thing, and Unmarshal does a second, we should not say that Unmarshal is appending.

@bradfitz bradfitz reopened this Jul 20, 2017

@bradfitz bradfitz changed the title from json unmarshal into slice does not append to encoding/json: unmarshal into slice does not append Jul 20, 2017

@bradfitz bradfitz added this to the Go1.10 milestone Jul 20, 2017

@thoeni

This comment has been minimized.

Contributor

thoeni commented Jul 20, 2017

TL;DR
Probably update documentation explaining the actual behaviour of Unmarshal as the current doc is misleading.

Extended version:
I had a look at this issue, and here are my findings:

On the encoding/json Unmarshal documentation this is mentioned:

If a JSON value is not appropriate for a given target type, or if a JSON number overflows the target type, Unmarshal skips that field and completes the unmarshaling as best it can.

Now, we could argue that in the use case that is described in this issue, a json object with just Name and no Age is a value which is not "appropriate" for the struct target type which expects both, and therefore the code is completing the unmarshaling "as best it can".

In the counterexample of the manual SliceAppend, the creation of a new Person initialises the age to the zero value, hence the append overrides everything, while during unmarshalling this doesn't happen.

An option is to set the element value to the target type "zero value" before updating it by assigning the decoded value from the json: in such a way everything would be assigned to its zero value (as the new Person{} example) and the unmarshal would do its job "as best it can" for the other data, for example here:

if i < v.Len() {
   // Calculate zero value for target type 
   z := reflect.Zero(v.Type().Elem())
   // Assign the zero value to the element
   v.Index(i).Set(z)
   // Decode into element.
   d.value(v.Index(i))
}

In general, I miss the use case of passing an initialised slice to unmarshal new values that override the old values, where I could just pass a new empty slice (and I agree that the description in this case is misleading).

I have a green test, but I'm not sure of the performance implications of adding those steps (i.e. setting the value twice, zero and then decoded one).
If it makes sense to proceed, I can write more tests to cover types with different zero values, as at the moment I'm using strings and ints).

At a first sight, I would probably suggest to update the documentation as I expect someone is relying on that behaviour.

[update]
If I run all tests with this solution, jsonrpc tests break, therefore the solution indicated might need further investigation should we decide to proceed in that direction.

@dvyukov

This comment has been minimized.

Member

dvyukov commented Jul 25, 2017

This does not look like working as intended and more like data corruption. Appending must overwrite the old value. This must print 1/1, but it prints 1/3:
https://play.golang.org/p/myimTJdn42

@dvyukov

This comment has been minimized.

Member

dvyukov commented Jul 25, 2017

The code does not seem to regard even slice len. So if one uses this to get some kind of merging, she can get unexpected garbage with whatever was there between len and cap.

dvyukov added a commit to google/syzkaller that referenced this issue Jul 25, 2017

dashboard/dashapi: always zero reply
json decoding behavior is somewhat surprising
(see // golang/go#21092).
This behavior is especially easy to hit in tests
that reuse reply objects.
To avoid any surprises, we zero the reply.
@rsc

This comment has been minimized.

Contributor

rsc commented Nov 22, 2017

I agree that when appending to a slice, unmarshal should start with fresh zeroed slice elements and not the old slice elements that happen to sit between len and cap. Too late for Go 1.10 though.

Dmitry's example is clearly incorrect behavior and should be fixed. However, the fix should preserve the current behavior in this variant: https://play.golang.org/p/H0kRWEEiEW. (Resetting the length to 0 does not mean resetting the capacity to 0.)

@rsc rsc modified the milestones: Go1.10, Go1.11 Nov 22, 2017

@rsc rsc added NeedsFix and removed NeedsInvestigation labels Nov 22, 2017

@rsc rsc changed the title from encoding/json: unmarshal into slice does not append to encoding/json: unmarshal into slice reuses element data between len and cap Nov 22, 2017

@bradfitz

This comment has been minimized.

Member

bradfitz commented Feb 27, 2018

Duplicate (closed) bug with other repros: #24155

JeremyLoy added a commit to JeremyLoy/go that referenced this issue Mar 4, 2018

@gopherbot

This comment has been minimized.

gopherbot commented Mar 4, 2018

Change https://golang.org/cl/98516 mentions this issue: encoding/json: Fix for #21092 - Zero out target before decoding

@OneOfOne

This comment has been minimized.

Contributor

OneOfOne commented Mar 17, 2018

As someone who actually depends on the fact that Unmarshal does NOT zero out maps/slices, this would break a lot of my code.

@JeremyLoy

This comment has been minimized.

JeremyLoy commented Mar 18, 2018

@OneOfOne out of curiosity, what is your usecase that requires data to be retained for maps/slices?

@OneOfOne

This comment has been minimized.

Contributor

OneOfOne commented Mar 19, 2018

@JeremyLoy in one case to override the default values in a map and in a few other, filling the map from multiple input files.

@OneOfOne

This comment has been minimized.

Contributor

OneOfOne commented Mar 19, 2018

Maybe you can add an extra field to the json decoder to make this optional.

@robmccoll

This comment has been minimized.

robmccoll commented Apr 4, 2018

@OneOfOne Based on reading @rsc's comment, I don't think your case will be affected.

Please excuse my rambling explanation: Unmarshalling onto existing slices would continue to merge data for elements up to len() and only zero elements between len() and cap(). If you read in a file with 3 slice elements and then read a second file with 4 slice elements, the first 3 elements will be merged (in that values in fields contained in the JSON objects in the second file will overwrite values from the JSON objects in the first but leave any values contained in the objects in the first file that were not also in the second file). The fourth element would just be exactly what was contained in the fourth element of the second file.

If you are for some reason reading in the first file, then setting your slice header back a = a[:0] before reading in the second file, this will affect you. But also, I don't see why you would be doing that.

@bradfitz bradfitz modified the milestones: Go1.11, Go1.12 Jun 13, 2018

@ianlancetaylor ianlancetaylor removed this from the Go1.12 milestone Dec 11, 2018

@ianlancetaylor ianlancetaylor added this to the Go1.13 milestone Dec 11, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment