
encoding/json: decoding performance regression in 1.9beta1 #20693

Closed
cespare opened this issue Jun 16, 2017 · 14 comments

@cespare (Contributor) commented Jun 16, 2017

This is with these Go versions:

go version go1.8 linux/amd64
go version go1.9beta1 linux/amd64

The following benchmark unmarshals a JSON object that contains a large irrelevant sub-object (its key has no corresponding field in the destination struct):

package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

type T0 struct {
	Foo map[string]string
}

type T1 struct {
	Bar string
}

func Benchmark19Regression(b *testing.B) {
	junk := make(map[string]string)
	for i := 0; i < 100; i++ {
		junk[fmt.Sprintf("key%d", i)] = fmt.Sprintf("value%d", i)
	}
	input, err := json.Marshal(T0{Foo: junk})
	if err != nil {
		b.Fatal(err)
	}
	b.ResetTimer()

	for i := 0; i < b.N; i++ {
		var t T1
		if err := json.Unmarshal(input, &t); err != nil {
			b.Fatal(err)
		}
	}
}

It gets about 30% slower for me with Go 1.9beta1:

name            old time/op    new time/op    delta
19Regression-4    20.8µs ± 3%    27.2µs ± 4%  +30.90%  (p=0.001 n=7+7)

name            old alloc/op   new alloc/op   delta
19Regression-4      336B ± 0%      328B ± 0%   -2.38%  (p=0.001 n=7+7)

name            old allocs/op  new allocs/op  delta
19Regression-4      5.00 ± 0%      4.00 ± 0%  -20.00%  (p=0.001 n=7+7)

You can also see that the slowdown is not due to increased allocations; allocations actually went down in 1.9.

@cespare (Contributor, author) commented Jun 16, 2017

I can investigate this later if nobody else gets to it first.

@cespare (Contributor, author) commented Jun 16, 2017

(Added to the 1.9 milestone since it was discovered while testing the 1.9 beta. Please fix it if that's not right.)

@cespare cespare added this to the Go1.9 milestone Jun 16, 2017

@vcabbage (Member) commented Jun 16, 2017

This appears to be caused by df68afd, the commit that removed an allocation; that change is what makes this benchmark slower.

@bradfitz (Member) commented Jun 16, 2017

Thanks for the report @cespare and investigation @vcabbage.

I suppose we could just revert df68afd, but then we'd lose some performance in some cases too. It would at least be equivalent to Go 1.8, though. @cespare, could you do a manual revert locally and see what happens performance-wise?

@bradfitz (Member) commented Jun 16, 2017

@pascaldekloe, could you investigate too?

@pascaldekloe (Contributor) commented Jun 16, 2017

Sorry guys, my holidays last until the 26th and there are no computers around. If the issue can wait, I'm happy to resolve it once I'm home.

@cespare (Contributor, author) commented Jun 17, 2017

Yeah, reverting df68afd makes the performance hit go away and then some (though it also brings the allocation count back up to the Go 1.8 level).

This benchmark compares 1.8 vs. [tip with df68afd reverted].

name            old time/op    new time/op    delta
19Regression-4    20.7µs ± 3%    19.0µs ± 5%  -8.47%  (p=0.008 n=5+5)

name            old alloc/op   new alloc/op   delta
19Regression-4      336B ± 0%      336B ± 0%    ~     (all equal)

name            old allocs/op  new allocs/op  delta
19Regression-4      5.00 ± 0%      5.00 ± 0%    ~     (all equal)

I wonder if there's a fix for the CPU time that can maintain the allocation improvement.

@cespare (Contributor, author) commented Jun 21, 2017

By the way, if it comes down to keeping or reverting df68afd for 1.9, I'd vote to keep it. This benchmark isn't critical for us and the benefit of reduced allocs helps us as well.

@pascaldekloe (Contributor) commented Jun 26, 2017

The root of the problem is that the function json.decodeState.value leaves the scanner state as it was at the last applicable character. That behavior requires read-ahead for some literals. As a result we have the workaround json.decodeState.nextscan, along with the functions next and nextValue (the original), in place. I'll try to eliminate those workarounds, including the function json.scanner.undo.

@bradfitz (Member) commented Jun 26, 2017

@pascaldekloe, thanks.

@pascaldekloe (Contributor) commented Jun 28, 2017

It's starting to work now. Probably too big a change for the 1.9 release, though.

benchmark                   old ns/op     new ns/op     delta
Benchmark19Regression-4     29291         21890         -25.27%
benchmark                          old ns/op     new ns/op     delta
BenchmarkCodeEncoder-4             8321258       8571049       +3.00%
BenchmarkCodeMarshal-4             9539306       10014633      +4.98%
BenchmarkCodeDecoder-4             40581703      40218844      -0.89%
BenchmarkDecoderStream-4           426           450           +5.63%
BenchmarkCodeUnmarshal-4           42960681      43142710      +0.42%
BenchmarkCodeUnmarshalReuse-4      40541573      40019672      -1.29%
BenchmarkUnmarshalString-4         432           351           -18.75%
BenchmarkUnmarshalFloat64-4        368           316           -14.13%
BenchmarkUnmarshalInt64-4          337           279           -17.21%
BenchmarkIssue10335-4              489           435           -11.04%
BenchmarkUnmapped-4                1456          1178          -19.09%
BenchmarkNumberIsValid-4           32.4          32.1          -0.93%
BenchmarkNumberIsValidRegexp-4     884           920           +4.07%
BenchmarkEncoderEncode-4           269           265           -1.49%

benchmark                    old MB/s     new MB/s     speedup
BenchmarkCodeEncoder-4       233.19       226.40       0.97x
BenchmarkCodeMarshal-4       203.42       193.76       0.95x
BenchmarkCodeDecoder-4       47.82        48.25        1.01x
BenchmarkCodeUnmarshal-4     45.17        44.98        1.00x

benchmark                    old allocs     new allocs     delta
BenchmarkIssue10335-4        3              3              +0.00%
BenchmarkUnmapped-4          4              4              +0.00%
BenchmarkEncoderEncode-4     0              0              +0.00%

benchmark                    old bytes     new bytes     delta
BenchmarkIssue10335-4        312           184           -41.03%
BenchmarkUnmapped-4          344           216           -37.21%
BenchmarkEncoderEncode-4     0             0             +0.00%
 src/encoding/json/decode.go       | 314 ++++++++++++++++++++++++++++++++++++++++++++++++++++--------------------------------------------------
 src/encoding/json/scanner.go      |  57 +------------------
 src/encoding/json/scanner_test.go |  37 ------------
 3 files changed, 160 insertions(+), 248 deletions(-)
@gopherbot commented Jun 29, 2017

CL https://golang.org/cl/47152 mentions this issue.

@rsc (Contributor) commented Jun 29, 2017

Please revert the change in question. Reducing allocated bytes by 2% (8 out of 336) is not worth a 30% CPU performance regression. Thanks.

@gopherbot commented Jun 29, 2017

CL https://golang.org/cl/47211 mentions this issue.

gopherbot closed this in 912bb81 on Jun 29, 2017

gopherbot pushed a commit that referenced this issue Mar 1, 2018

encoding/json: read ahead after value consumption
Eliminates the need for an extra scanner, read undo and some other tricks.

name                    old time/op    new time/op    delta
CodeEncoder-12            1.92ms ± 0%    1.91ms ± 1%   -0.65%  (p=0.000 n=17+20)
CodeMarshal-12            2.13ms ± 2%    2.12ms ± 1%   -0.49%  (p=0.038 n=18+17)
CodeDecoder-12            8.55ms ± 2%    8.49ms ± 1%     ~     (p=0.119 n=20+18)
UnicodeDecoder-12          411ns ± 0%     422ns ± 0%   +2.77%  (p=0.000 n=19+15)
DecoderStream-12           320ns ± 1%     307ns ± 1%   -3.80%  (p=0.000 n=18+20)
CodeUnmarshal-12          9.65ms ± 3%    9.58ms ± 3%     ~     (p=0.157 n=20+20)
CodeUnmarshalReuse-12     8.54ms ± 3%    8.56ms ± 2%     ~     (p=0.602 n=20+20)
UnmarshalString-12         110ns ± 1%      87ns ± 2%  -21.53%  (p=0.000 n=16+20)
UnmarshalFloat64-12        101ns ± 1%      77ns ± 2%  -23.08%  (p=0.000 n=19+20)
UnmarshalInt64-12         94.5ns ± 2%    68.4ns ± 1%  -27.60%  (p=0.000 n=20+20)
Issue10335-12              128ns ± 1%     100ns ± 1%  -21.89%  (p=0.000 n=19+18)
Unmapped-12                427ns ± 3%     247ns ± 4%  -42.17%  (p=0.000 n=20+20)
NumberIsValid-12          23.0ns ± 0%    21.7ns ± 0%   -5.73%  (p=0.000 n=20+20)
NumberIsValidRegexp-12     641ns ± 0%     642ns ± 0%   +0.15%  (p=0.003 n=19+19)
EncoderEncode-12          56.9ns ± 0%    55.0ns ± 1%   -3.32%  (p=0.012 n=2+17)

name                    old speed      new speed      delta
CodeEncoder-12          1.01GB/s ± 1%  1.02GB/s ± 1%   +0.71%  (p=0.000 n=18+20)
CodeMarshal-12           913MB/s ± 2%   917MB/s ± 1%   +0.49%  (p=0.038 n=18+17)
CodeDecoder-12           227MB/s ± 2%   229MB/s ± 1%     ~     (p=0.110 n=20+18)
UnicodeDecoder-12       34.1MB/s ± 0%  33.1MB/s ± 0%   -2.73%  (p=0.000 n=19+19)
CodeUnmarshal-12         201MB/s ± 3%   203MB/s ± 3%     ~     (p=0.151 n=20+20)

name                    old alloc/op   new alloc/op   delta
Issue10335-12               320B ± 0%      184B ± 0%  -42.50%  (p=0.000 n=20+20)
Unmapped-12                 568B ± 0%      216B ± 0%  -61.97%  (p=0.000 n=20+20)
EncoderEncode-12           0.00B          0.00B          ~     (all equal)

name                    old allocs/op  new allocs/op  delta
Issue10335-12               4.00 ± 0%      3.00 ± 0%  -25.00%  (p=0.000 n=20+20)
Unmapped-12                 18.0 ± 0%       4.0 ± 0%  -77.78%  (p=0.000 n=20+20)
EncoderEncode-12            0.00           0.00          ~     (all equal)

Fixes #17914
Updates #20693
Updates #10335

Change-Id: I0459a52febb8b79c9a2991e69ed2614cf8740429
Reviewed-on: https://go-review.googlesource.com/47152
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

@golang golang locked and limited conversation to collaborators Jun 29, 2018
